
Dr Weijia Zhang
Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences (Data Science and Statistics)
- Email: weijia.zhang@newcastle.edu.au
- Phone: 0240550921
Career Summary
Biography
Dr Weijia Zhang’s research aims to bridge the gap between causal inference and machine learning, enabling statistical and computational algorithms to transform decision-making in government, healthcare, and business.
Dr Weijia "Weja" Zhang received his PhD from the University of South Australia in 2018. He has conducted research and taught a variety of courses in data science and machine learning, first as a Research Fellow at his alma mater, then as an Associate Professor at Southeast University, China, before taking up his current position as Lecturer at the University of Newcastle.
As a Lecturer in Data Science and Statistics at the University of Newcastle, Weijia is experienced in statistical causal inference and weakly supervised machine learning. His research has been featured in many prestigious international conferences and journals, and has been successfully adapted to various industries, including some major companies in the automobile, oil and gas, and telecommunications sectors in the Asia Pacific region.
The power of causal inference
Causal inference is a field of study that deals with the identification of cause-and-effect relationships between variables. It has had a significant impact in various fields, including medicine, public health, economics, and social sciences, among others. Weijia’s work in causality focuses on developing methods to estimate the causal effect from observational data or even data with censoring, which allows researchers to determine the relationship between an exposure or intervention and an outcome of interest, without the need for running costly and time-consuming randomized controlled trials.
Machine learning from ambiguously labelled data
Although supervised machine learning has demonstrated success in various domains, it has a notable limitation in its dependence on large volumes of labelled data. Weakly supervised learning, which involves training algorithms using data with incomplete, ambiguous or noisy labels, is a technique employed in machine learning to overcome this limitation. Weijia's research in weakly supervised learning has sparked a new learning paradigm, known as Multi-Instance Partial Label learning, which enables algorithms to learn from low-quality and ambiguous labels provided by non-expert crowdsourced annotators. This paradigm offers a promising solution to the challenge of limited expert annotations in crucial domains, such as medical imaging and diagnostics.
Bridging causal inference with machine learning
It comes as no surprise that Weijia's research in causal inference and machine learning has led him to the emerging field of causal machine learning. This area has the potential to improve the dependability and interpretability of machine learning models, the lack of which has impeded the advancement of machine learning applications since the field's inception. Building on his expertise in weakly supervised learning, Weijia has shown in recent work that causality-based models can be obtained more readily when supervision is ambiguous than in traditional supervised learning.
Making real-world impact
In addition to his academic publications, Weijia has extensive experience in translating his research into practical applications. He has collaborated with customer representatives in the telecommunications sector to enhance customer retention, and with production engineers in the oil and gas industry to develop predictive maintenance strategies, among other projects. In a recent collaboration with Chang'an Automobile, a major car manufacturer in the Asia Pacific, Weijia's algorithm was implemented to extend the lifespan of hundreds of thousands of vehicle batteries.
Qualifications
- DOCTOR OF PHILOSOPHY, University of South Australia
Keywords
- Causal Inference
- Data Mining
- Machine Learning
Languages
- English (Fluent)
- Mandarin (Mother)
Fields of Research
| Code | Description | Percentage | 
|---|---|---|
| 490508 | Statistical data science | 60 | 
| 461199 | Machine learning not elsewhere classified | 40 | 
Professional Experience
UON Appointment
| Title | Organisation / Department | 
|---|---|
| Lecturer in Data Science and Applied Statistics | University of Newcastle, School of Information and Physical Sciences, Australia | 
Academic appointment
| Dates | Title | Organisation / Department | 
|---|---|---|
| 1/7/2021 - 31/1/2023 | Associate Professor | Southeast University, School of Computer Science and Engineering, China | 
| 28/5/2018 - 17/7/2020 | Research Fellow | The University of South Australia, School of Information and Mathematical Science, Australia | 
Teaching
| Code | Course | Role | Duration | 
|---|---|---|---|
| INFS 5102 | Unsupervised Methods in Analytics, The University of South Australia | Course Coordinator | 18/2/2019 - 30/6/2019 | 
| STAT6160 | Data Analytics for Business Intelligence, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023 | 
| CSE 5075 | Pattern Recognition, Southeast University | Course Coordinator | 1/3/2022 - 30/7/2022 | 
| INFS 5100 | Predictive Analytics, The University of South Australia | Course Coordinator | 17/2/2020 - 17/7/2020 | 
| STAT1060 | Business Decision Making, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023 | 
Publications
For publications that are currently unpublished or in-press, details are shown in italics.
Conference (16 outputs)
| Year | Citation | Altmetrics | Link |
|---|---|---|---|
| 2025 | Liu X, Zhang W, Zhang ML, 'HACSurv: A Hierarchical Copula-Based Approach for Survival Analysis with Dependent Competing Risks', Proceedings of Machine Learning Research, 258, 3079-3087 (2025) | | |
| 2025 | Wang Y, Zhang W, Zhang ML, 'Partial Label Causal Representation Learning for Instance-Dependent Supervision and Domain Generalization', Proceedings of the AAAI Conference on Artificial Intelligence, 39, 21366-21374 (2025) [E1] Partial label learning (PLL) addresses situations where each training example is associated with a set of candidate labels, among which only one corresponds to the true class label. As the candidate labels often come from crowdsourced workers, their generation is inherently dependent on the features of the instance. Existing PLL methods primarily aim to resolve these ambiguous labels to enhance classification accuracy, overlooking the opportunity to use this feature dependency for causal representation learning. This focus on accuracy can make PLL systems vulnerable to stylistic variations and shifts in domain. In this paper, we explore the learning of causal representations within an instance-dependent PLL framework, introducing a new approach that uncovers identifiable latent representations. By separating content from style in the identified causal representation, we introduce CausalPLL+, an algorithm for instance-dependent PLL based on causal representation. Our algorithm performs exceptionally well in terms of both classification accuracy and generalization robustness. Qualitative and quantitative experiments on instance-dependent PLL benchmarks and domain generalization tasks verify the effectiveness of our approach. | | |
| 2024 | Zhang W, Ling CK, Zhang X, 'Deep Copula-Based Survival Analysis for Dependent Censoring with Identifiability Guarantees', Proceedings of the AAAI Conference on Artificial Intelligence, 38, 20613-20621 (2024) [E1] | | Open Research Newcastle |
| 2024 | Tang W, Zhang W, Zhang ML, 'Exploiting Conjugate Label Information for Multi-Instance Partial-Label Learning', IJCAI International Joint Conference on Artificial Intelligence, 4973-4981 (2024) [E1] | | |
| 2024 | Tang W, Yang YF, Wang Z, Zhang W, Zhang ML, 'Multi-Instance Partial-Label Learning with Margin Adjustment', Advances in Neural Information Processing Systems, 37, 1-24 (2024) [E1] | | |
| 2024 | Wang Z, Zhang W, Zhang ML, 'Proposal Feature Learning Using Proposal Relations for Weakly Supervised Object Detection', Proceedings - IEEE International Conference on Multimedia and Expo (2024) [E1] | | |
| 2023 | Tang W, Zhang W, Zhang M-L, 'Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning', Advances in Neural Information Processing Systems 36 (NeurIPS 2023) (2023) [E1] | | Open Research Newcastle |
| 2022 | Zhang W, Zhang X, Deng H-W, Zhang M-L, 'Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization', Advances in Neural Information Processing Systems (2022) [E1] | | |
| 2021 | Zhang W, 'Non-I.I.D. Multi-Instance Learning for Predicting Instance and Bag Labels with Variational Auto-Encoder', Proceedings of the 30th International Joint Conference on Artificial Intelligence (2021) [E1] | | |
| 2021 | Zhang W, Liu L, Li J, 'Treatment Effect Estimation with Disentangled Latent Factors', Proceedings of the 35th AAAI Conference on Artificial Intelligence, 35, 10923-10930 (2021) [E1] | | |
| 2020 | Zhang W, Liu L, Li J, 'Robust Multi-Instance Learning with Stable Instances', ECAI 2020: the 24th European Conference on Artificial Intelligence, 325, 1682-1689 (2020) [E1] | | |
| 2018 | Ng WT, Yu J, Wang M, Li R, Zhang W, 'Design Trends in Smart Gate Driver ICs for Power GaN HEMTs', 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Qingdao, People's Republic of China (2018) | | |
| 2018 | Zhang W, Thuc DL, Liu L, Li J, 'Estimating heterogeneous treatment effect by balancing heterogeneity and fitness', BMC Bioinformatics, 19 (2018) [E1] | | |
Journal article (17 outputs)
| Year | Citation | Altmetrics | Link |
|---|---|---|---|
| 2025 | Deng HW, Zhang W, Zhang ML, 'Instance-dependent label noise learning via separating style from content', Pattern Recognition Letters, 196, 9-15 (2025) [C1] | | |
| 2025 | Cheng D, Li J, Liu L, Xu Z, Zhang W, Liu J, Le TD, 'Disentangled Representation Learning for Causal Inference With Instruments', IEEE Transactions on Neural Networks and Learning Systems [C1] Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV-based estimators need a known IV or other strong assumptions, such as the existence of two or more IVs in the system, which limits the application of the IV approach. In this article, we consider a relaxed requirement, which assumes there is an IV proxy in the system without knowing which variable is the proxy. We propose a variational autoencoder (VAE)-based disentangled representation learning method to learn an IV representation from a dataset with latent confounders and then utilize the IV representation to obtain an unbiased estimation of the causal effect from the data. Extensive experiments on synthetic and real-world data have demonstrated that the proposed algorithm outperforms the existing IV-based estimators and VAE-based estimators. | | |
| 2024 | Boo YL, Gupta M, Zhang W, Fournier-Viger P, 'Special Issue Editorial on "The Innovative Use of Data Science to Transform How We Work and Live"', Data Science and Engineering, 9, 3-4 (2024) | | |
| 2024 | Pu Z, Sun H, Zhang X, Zhang W, 'Exploring the Impact of Nudges on Volunteer Task Choices in Cultural Heritage Crowdsourcing: An Eye-Tracking Experiment', Proceedings of the Association for Information Science and Technology, 61, 1074-1076 (2024) [C1] In cultural heritage crowdsourcing, imbalanced volunteer task choices can impact project completion rate and sustainability. Nudges offer a way to alleviate this imbalance. This paper proposes three nudges: task order, visual saliency, and feedback based on dual-process theory. We conducted a 2 × 2 × 2 factorial experiment incorporating eye-tracking research methods to investigate how these nudges affect volunteer task choices. The results demonstrate a significant positive impact of visual saliency. While task order and feedback did not demonstrate significant effects, eye-tracking results reveal that they effectively captured participants' attention. Our findings suggest that designing platforms with nudges could address the issue of imbalanced task choices in cultural heritage crowdsourcing projects. | | |
| 2024 | Sun H, Pu Z, Zhang X, Zhang W, 'The Impact of Ancient Book Image Quality on Cultural Heritage Crowdsourcing Correction Tasks: A Human Error Theory Perspective', Proceedings of the Association for Information Science and Technology, 61, 1090-1092 (2024) [C1] This study investigates the impact of ancient book image quality on the accuracy of cultural heritage crowdsourcing correction tasks. The study processed 27 pictures from an ancient book to adjust their quality in terms of clarity, seal coverage and background color. Based on human error theory, the study divided common errors made by participants in correction tasks into slips and mistakes, and analyzed the impact of different image qualities on these errors. The experiment was conducted using an online experiment platform, with 25 participants each completing 6 tasks related to clarity, seal coverage, and background color. The results show that the clarity of ancient book images significantly affects the number of rectifications and mistakes in the tasks. Different degrees of seal coverage significantly impact the number of rectifications and slips, while the background color of the images mainly affects the number of mistakes. | | |
| 2024 | Tang W, Zhang W, Zhang M-L, 'Multi-instance partial-label learning: towards exploiting dual inexact supervision', Science China-Information Sciences, 67 (2024) [C1] Weakly supervised machine learning algorithms are able to learn from ambiguous samples or labels, e.g., multi-instance learning or partial-label learning. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems since the former fails to disambiguate a candidate label set, and the latter cannot handle a multi-instance bag. To address these issues, a tailored algorithm named MiplGp, i.e., multi-instance partial-label learning with Gaussian processes, is proposed. MiplGp first assigns each instance with a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield the disambiguated and continuous labels via an exclusive disambiguation strategy, and last induces a model based on the Gaussian processes. Experimental results on various datasets validate that MiplGp is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems. | | Open Research Newcastle |
| 2022 | Zhang X, Zhang W, Zhao YC, Zhu Q, 'Imbalanced volunteer engagement in cultural heritage crowdsourcing: a task-related exploration based on causal inference', Information Processing and Management, 59, 103027-103027 (2022) [C1] | | |
| 2022 | Yang S, Zhang Y, Jia Y, Zhang W, 'Local Low-Rank Approximation With Superpixel-Guided Locality Preserving Graph for Hyperspectral Image Classification', IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 7741-7754 (2022) [C1] | | |
| 2021 | Li J, Zhang W, Liu L, Yu K, Le TD, Liu J, 'A general framework for causal classification', International Journal of Data Science and Analytics, 11, 127-139 (2021) [C1] | | |
| 2021 | Zhang W, Li J, Liu L, 'A Unified Survey of Treatment Effect Heterogeneity Modelling and Uplift Modelling', ACM Computing Surveys, 54 (2021) [C1] | | |
| 2020 | Tomasoni M, Gomez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, 'MONET: a toolbox integrating top-performing methods for network modularization', Bioinformatics, 36, 3920-3921 (2020) [C1] | | |
| 2019 | Brown P, RELISH Consortium, Zhou Y, 'Large expert-curated database for benchmarking document similarity detection in biomedical literature search', Database, 2019 (2019) [C1] | | Open Research Newcastle |
| 2019 | Choobdar S, Ahsen ME, Crawford J, Tomasoni M, Fang T, Lamparter D, Lin J, Hescott B, Hu X, Mercer J, Natoli T, Narayan R, Aicheler F, Amoroso N, Arenas A, Azhagesan K, Baker A, Banf M, Batzoglou S, Baudot A, Bellotti R, Bergmann S, Boroevich KA, Brun C, Cai S, Caldera M, Calderone A, Cesareni G, Chen W, Chichester C, Cowen L, Cui H, Phuong D, De Domenico M, Dhroso A, Didier G, Divine M, del Sol A, Feng X, Flores-Canales JC, Fortunato S, Gitter A, Gorska A, Guan Y, Guenoche A, Gomez S, Hamza H, Hartmann A, He S, Heijs A, Heinrich J, Hu Y, Huang X, Hughitt VK, Jeon M, Jeub L, Johnson NT, Joo K, Joung I, Jung S, Kalko SG, Kamola PJ, Kang J, Kaveelerdpotjana B, Kim M, Kim Y-A, Kohlbacher O, Korkin D, Krzysztof K, Kunji K, Kutalik Z, Lage K, Lang-Brown S, Thuc DL, Lee J, Lee S, Lee J, Li D, Li J, Liu L, Loizou A, Luo Z, Lysenko A, Ma T, Mall R, Marbach D, Mattia T, Medvedovic M, Menche J, Micarelli E, Monaco A, Mueller F, Narykov O, Norman T, Park S, Perfetto L, Perrin D, Pirro S, Przytycka TM, Qian X, Raman K, Ramazzotti D, Ramsahai E, Ravindran B, Rennert P, Saez-Rodriguez J, Scharfe C, Sharan R, Shi N, Shin W, Shu H, Sinha H, Slonim DK, Spinelli L, Srinivasan S, Subramanian A, Suver C, Szklarczyk D, Tangaro S, Thiagarajan S, Tichit L, Tiede T, Tripathi B, Tsherniak A, Tsunoda T, Turei D, Ullah E, Vahedi G, Valdeolivas A, Vivek J, von Mering C, Waagmeester A, Wang B, Wang Y, Weir BA, White S, Winkler S, Xu K, Xu T, Yan C, Yang L, Yu K, Yu X, Zaffaroni G, Zaslavskiy M, Zeng T, Zhang JD, Zhang L, Zhang W, Zhang L, Zhang X, Zhang J, Zhou X, Zhou J, Zhu H, Zhu J, Zuccon G, Stolovitzky G, Kutalik Z, Lage K, Slonim DK, Saez-Rodriguez J, Cowen LJ, Bergmann S, Marbach D, 'Assessment of network module identification across complex diseases', Nature Methods, 16, 843-+ (2019) [C1] | | |
| 2018 | Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, Gui J, Yu K, Li J, Le TD, 'miRBaseConverter: an R/Bioconductor package for converting and retrieving miRNA name, accession, sequence and family information in different versions of miRBase', BMC Bioinformatics, 19 (2018) [C1] | | |
| 2017 | Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Mining heterogeneous causal effects for personalized cancer treatment', Bioinformatics, 33, 2372-2378 (2017) [C1] | | |
| 2016 | Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Predicting miRNA Targets by Integrating Gene Regulatory Knowledge with Expression Profiles', PLOS ONE, 11 (2016) [C1] | | |
Preprint (4 outputs)
| Year | Citation | Altmetrics | Link |
|---|---|---|---|
| 2019 | Tomasoni M, Gómez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, 'MONET: a toolbox integrating top-performing methods for network modularisation' (2019) | | |
| 2018 | Choobdar S, Ahsen M, Crawford J, Tomasoni M, Lamparter D, Lin J, et al., 'Open Community Challenge Reveals Molecular Network Modules with Key Roles in Diseases' | | |
| 2018 | Zhang W, Le T, Liu L, Li J, 'Estimating heterogeneous treatment effects by balancing heterogeneity and fitness' (2018) | | |
Grants and Funding
Summary
| Number of grants | 6 | 
|---|---|
| Total funding | $218,500 | 
2024: 3 grants / $20,000
Global Experience Support Funding: $9,000
| Funding body | The University of Newcastle, NSW | 
|---|---|
| Project Team | M A Hakim Newton, Marcella Papini, Weijia Zhang, Saiful Islam, Alexandre Mendes, Karen Blackmore | 
| Scheme | Global Experience Support Fund | 
| Role | Investigator | 
| Funding Start | 2024 | 
| Funding Finish | 2024 | 
| GNo | |
| Type Of Funding | Internal | 
| Category | INTE | 
| UON | N | 
CESE Fellowship Accelerator: $8,000
| Funding body | College of Engineering Science and Environment, the University of Newcastle, Australia | 
|---|---|
| Scheme | CESE Excellence Strategic Investment Scheme | 
| Role | Lead | 
| Funding Start | 2024 | 
| Funding Finish | 2024 | 
| GNo | |
| Type Of Funding | Internal | 
| Category | INTE | 
| UON | N | 
CESE Conference Travel Scheme: $3,000
| Funding body | College of Engineering, Science, & Environment (CESE), The University of Newcastle | 
|---|---|
| Scheme | Internal Competitive Schemes | 
| Role | Lead | 
| Funding Start | 2024 | 
| Funding Finish | 2024 | 
| GNo | |
| Type Of Funding | Internal | 
| Category | INTE | 
| UON | N | 
2023: 1 grant / $10,000
UON Start-up support funding: $10,000
| Funding body | College of Engineering, Science & Environment (CESE) Start-up Funding | 
|---|---|
| Scheme | College of Engineering, Science & Environment (CESE) Start-up Funding | 
| Role | Lead | 
| Funding Start | 2023 | 
| Funding Finish | 2023 | 
| GNo | |
| Type Of Funding | Internal | 
| Category | INTE | 
| UON | N | 
2022: 1 grant / $63,500
Causality-based weakly supervised learning from inexact supervision: $63,500
| Funding body | National Natural Science Foundation of China | 
|---|---|
| Scheme | National Natural Science Foundation of China | 
| Role | Lead | 
| Funding Start | 2022 | 
| Funding Finish | 2022 | 
| GNo | |
| Type Of Funding | External | 
| Category | EXTE | 
| UON | N | 
2021: 1 grant / $125,000
Weakly supervised survival analysis for vehicle start-up battery failure prediction: $125,000
| Funding body | Changan Automobile | 
|---|---|
| Scheme | Industry Funding | 
| Role | Lead | 
| Funding Start | 2021 | 
| Funding Finish | 2022 | 
| GNo | |
| Type Of Funding | External | 
| Category | EXTE | 
| UON | N | 
Research Supervision
Current Supervision
| Commenced | Level of Study | Research Title | Program | Supervisor Type | 
|---|---|---|---|---|
| 2024 | PhD | Complex Time series Forecasting Based on Machine Learning | PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle | Co-Supervisor | 
| 2024 | PhD | Statistics and Machine Learning Methods for Response Prediction and Evaluation | PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle | Principal Supervisor | 
| 2023 | PhD | Deep Transformation of Tabular Categorical Data for Enhanced Predictive Analytics in Health Informatics | PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle | Co-Supervisor | 
| 2022 | PhD | Alzheimer’s Disease Detection Using Deep Learning | PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle | Co-Supervisor | 
| 2022 | PhD | Speech Depression Detection Using Deep Learning | PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle | Principal Supervisor | 
| 2021 | PhD | Research on multi-instance partial label learning algorithms | Computer Science, Southeast University | Co-Supervisor | 
Past Supervision
| Year | Level of Study | Research Title | Program | Supervisor Type | 
|---|---|---|---|---|
| 2025 | Masters | Weakly supervised survival analysis for histopathology images | Information Technology, Southeast University-Monash University joint graduate school | Co-Supervisor | 
| 2025 | Masters | Interpretable deep generative models | Computer Science, Southeast University | Co-Supervisor | 
| 2025 | Masters | Attention-based weakly supervised object detection | Computer Science, Southeast University | Co-Supervisor | 
| 2025 | Masters | Survival analysis with competing risks | Computer Science, Southeast University | Co-Supervisor | 
| 2024 | Masters | Causal generative modeling of noisy data | Computer Science, Southeast University | Co-Supervisor | 
| 2024 | Masters | Research on interpretable partial label learning algorithms | Computer Science, Southeast University-Monash University joint graduate school | Co-Supervisor | 
| 2023 | Honours | Adversarial robustness of identifiable variational autoencoder | Artificial Intelligence, Southeast University | Sole Supervisor | 
| 2022 | Honours | Multi-instance learning for histopathology images | Computer Science, Southeast University | Sole Supervisor | 
Research Opportunities
Causal Representation Learning under Weak Supervision
Causal representation learning under weak supervision is a burgeoning area of research that aims to extract meaningful causal structures from data with limited or imprecise guidance. Unlike traditional supervised learning, which relies on a wealth of labeled data, weak supervision involves leveraging noisy, incomplete, or indirectly related signals to infer causal relationships. This approach is particularly valuable in domains where obtaining high-quality labeled data is challenging or expensive, such as in medical diagnosis or climate modeling. By integrating methods from causal inference and representation learning, researchers strive to uncover latent causal factors that drive observed data. These factors can then be used to build more robust and interpretable models that generalize well to new, unseen environments. The key challenge lies in devising algorithms that can effectively disentangle causal influences and handle the inherent uncertainty and noise in weakly supervised data, thereby improving the reliability and applicability of causal insights derived from complex, real-world datasets.
PhD
School of Information and Physical Sciences
1/1/2025 - 31/12/2028
Contact
Doctor Weijia Zhang
University of Newcastle
School of Information and Physical Sciences
weijia.zhang@newcastle.edu.au
Time-to-event modelling under Dependent Censoring
Censoring is the central problem in time-to-event modelling, where either the time-to-event (for instance, death or equipment failure) or the time-to-censoring (such as loss to follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates, an assumption that cannot be verified because only the marginal distributions are available from the data. The existence of dependent censoring, along with the inherent bias in current estimators, has been demonstrated in a variety of applications, accentuating the need for a more nuanced approach. However, existing methods that adjust for dependent censoring require practitioners to specify the ground-truth copula. This requirement poses a significant challenge for practical applications, as model misspecification can lead to substantial bias.
PhD
School of Information and Physical Sciences
1/1/2025 - 31/12/2028
Contact
Doctor Weijia Zhang
University of Newcastle
School of Information and Physical Sciences
weijia.zhang@newcastle.edu.au
Research Collaborations
The map is a representation of a researcher's co-authorship with collaborators across the globe. The map displays the number of publications against a country, where there is at least one co-author based in that country. Data is sourced from the University of Newcastle research publication management system (NURO) and may not fully represent the author's complete body of work.
| Country | Count of Publications |
|---|---|
| China | 25 |
| Australia | 22 |
| United States | 5 |
| Switzerland | 3 |
| Spain | 3 |
Dr Weijia Zhang
Position
Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences
College of Engineering, Science and Environment
Focus area
Data Science and Statistics
Contact Details
| Email | weijia.zhang@newcastle.edu.au | 
|---|---|
| Phone | 0240550921 | 
| Links | Personal webpage | 
Office
| Room | SR114 | 
|---|---|
| Building | Social Science | 
| Location | Callaghan Campus University Drive Callaghan, NSW 2308 Australia | 

