Dr Weijia Zhang

Lecturer in Data Science and Applied Statistics

School of Information and Physical Sciences (Data Science and Statistics)

Career Summary

Biography

Dr Weijia Zhang’s research aims to bridge the gap between causal inference and machine learning, enabling statistical and computational algorithms to transform decision-making in government, healthcare, and business.

Dr Weijia "Weja" Zhang received his PhD from the University of South Australia in 2018. He has conducted research and taught a variety of courses in data science and machine learning, first as a Research Fellow at his alma mater, then as an Associate Professor at Southeast University, China, before joining the University of Newcastle as a Lecturer.

As a Lecturer in Data Science and Statistics at the University of Newcastle, Weijia is experienced in statistical causal inference and weakly supervised machine learning. His research has been featured in many prestigious international conferences and journals, and has been successfully adapted to various industries, including some major companies in the automobile, oil and gas, and telecommunications sectors in the Asia Pacific region.

The power of causal inference

Causal inference is a field of study that deals with the identification of cause-and-effect relationships between variables. It has had a significant impact in various fields, including medicine, public health, economics, and social sciences, among others. Weijia’s work in causality focuses on developing methods to estimate the causal effect from observational data or even data with censoring, which allows researchers to determine the relationship between an exposure or intervention and an outcome of interest, without the need for running costly and time-consuming randomized controlled trials.

Machine learning from ambiguously labelled data

Although supervised machine learning has demonstrated success in various domains, it has a notable limitation in its dependence on large volumes of labelled data. Weakly supervised learning, which involves training algorithms using data with incomplete, ambiguous or noisy labels, is a technique employed in machine learning to overcome this limitation. Weijia's research in weakly supervised learning has sparked a new learning paradigm, known as Multi-Instance Partial Label learning, which enables algorithms to learn from low-quality and ambiguous labels provided by non-expert crowdsourced annotators. This paradigm offers a promising solution to the challenge of limited expert annotations in crucial domains, such as medical imaging and diagnostics.

Bridging causal inference with machine learning

It comes as no surprise that Weijia's research in causal inference and machine learning has led him to the emerging field of causal machine learning. This area has the potential to enhance the dependability and interpretability of machine learning models, the lack of which has impeded the advancement of machine learning applications since the field's inception. Drawing on his expertise in weakly supervised learning, one of Weijia's recent works has demonstrated that causality-based models can be more readily achieved when supervision is ambiguous than in traditional supervised learning.

Making real-world impact

In addition to his academic publications, Weijia has extensive experience in translating his research into practical applications. He has collaborated with customer representatives in the telecommunications sector to enhance customer retention, and with production engineers in the oil and gas industry to develop predictive maintenance strategies, among other projects. In a recent collaboration with Chang'an Automobile, a major car manufacturer in the Asia Pacific, Weijia's algorithm was implemented to extend the lifespan of hundreds of thousands of vehicle batteries.


Qualifications

  • DOCTOR OF PHILOSOPHY, University of South Australia

Keywords

  • Causal Inference
  • Data Mining
  • Machine Learning

Languages

  • English (Fluent)
  • Mandarin (Mother tongue)

Fields of Research

Code Description Percentage
490508 Statistical data science 60
461199 Machine learning not elsewhere classified 40

Professional Experience

UON Appointment

Title | Organisation / Department
Lecturer in Data Science and Applied Statistics | University of Newcastle, School of Information and Physical Sciences, Australia

Academic appointment

Dates | Title | Organisation / Department
1/7/2021 - 31/1/2023 | Associate Professor | Southeast University, School of Computer Science and Engineering, China
28/5/2018 - 17/7/2020 | Research Fellow | The University of South Australia, School of Information and Mathematical Science, Australia

Teaching

Code | Course | Role | Duration
INFS 5102 | Unsupervised Methods in Analytics, The University of South Australia | Course Coordinator | 18/2/2019 - 30/6/2019
STAT6160 | Data Analytics for Business Intelligence, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023
CSE 5075 | Pattern Recognition, Southeast University | Course Coordinator | 1/3/2022 - 30/7/2022
INFS 5100 | Predictive Analytics, The University of South Australia | Course Coordinator | 17/2/2020 - 17/7/2020
STAT1060 | Business Decision Making, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023

Publications


Journal article (16 outputs)

Year Citation Altmetrics Link
2024 Boo YL, Gupta M, Zhang W, Fournier-Viger P, 'Special Issue Editorial on "The Innovative Use of Data Science to Transform How We Work and Live"', DATA SCIENCE AND ENGINEERING, 9 3-4 (2024)
DOI 10.1007/s41019-024-00247-w
2024 Pu Z, Sun H, Zhang X, Zhang W, 'Exploring the Impact of Nudges on Volunteer Task Choices in Cultural Heritage Crowdsourcing: An Eye-Tracking Experiment', Proceedings of the Association for Information Science and Technology, 61 1074-1076 (2024) [C1]

In cultural heritage crowdsourcing, imbalanced volunteer task choices can impact project completion rate and sustainability. Nudges offer a way to alleviate this imbalance. This paper proposes three nudges: task order, visual saliency, and feedback based on dual-process theory. We conducted a 2 × 2 × 2 factorial experiment incorporating eye-tracking research methods to investigate how these nudges affect volunteer task choices. The results demonstrate a significant positive impact of visual saliency. While task order and feedback did not demonstrate significant effects, eye-tracking results reveal that they effectively captured participants' attention. Our findings suggest that designing platforms with nudges could address the issue of imbalanced task choices in cultural heritage crowdsourcing projects.

DOI 10.1002/pra2.1189
2024 Sun H, Pu Z, Zhang X, Zhang W, 'The Impact of Ancient Book Image Quality on Cultural Heritage Crowdsourcing Correction Tasks: A Human Error Theory Perspective', Proceedings of the Association for Information Science and Technology, 61 1090-1092 (2024) [C1]

This study investigates the impact of ancient book image quality on the accuracy of cultural heritage crowdsourcing correction tasks. The study processed 27 pictures from an ancient book to adjust their quality in terms of clarity, seal coverage and background color. Based on human error theory, the study divided common errors made by participants in correction tasks into slips and mistakes, and analyzed the impact of different image qualities on these errors. The experiment was conducted using an online experiment platform, with 25 participants each completing 6 tasks related to clarity, seal coverage, and background color. The results show that the clarity of ancient book images significantly affects the number of rectifications and mistakes in the tasks. Different degrees of seal coverage significantly impact the number of rectifications and slips, while the background color of the images mainly affects the number of mistakes.

DOI 10.1002/pra2.1194
2024 Tang W, Zhang W, Zhang ML, 'Multi-instance partial-label learning: towards exploiting dual inexact supervision', Science China Information Sciences, 67 (2024) [C1]

Weakly supervised machine learning algorithms are able to learn from ambiguous samples or labels, e.g., multi-instance learning or partial-label learning. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems since the former fails to disambiguate a candidate label set, and the latter cannot handle a multi-instance bag. To address these issues, a tailored algorithm named MiplGp, i.e., multi-instance partial-label learning with Gaussian processes, is proposed. MiplGp first assigns each instance with a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield the disambiguated and continuous labels via an exclusive disambiguation strategy, and last induces a model based on the Gaussian processes. Experimental results on various datasets validate that MiplGp is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems.

DOI 10.1007/s11432-023-3771-6
Citations Scopus - 7, Web of Science - 3
2024 Cheng D, Li J, Liu L, Xu Z, Zhang W, Liu J, Le TD, 'Disentangled Representation Learning for Causal Inference With Instruments', IEEE Transactions on Neural Networks and Learning Systems, (2024) [C1]

Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV-based estimators need a known IV or other strong assumptions, such as the existence of two or more IVs in the system, which limits the application of the IV approach. In this article, we consider a relaxed requirement, which assumes there is an IV proxy in the system without knowing which variable is the proxy. We propose a variational autoencoder (VAE)-based disentangled representation learning method to learn an IV representation from a dataset with latent confounders and then utilize the IV representation to obtain an unbiased estimation of the causal effect from the data. Extensive experiments on synthetic and real-world data have demonstrated that the proposed algorithm outperforms the existing IV-based estimators and VAE-based estimators.

DOI 10.1109/TNNLS.2024.3512790
Citations Scopus - 1
2022 Zhang W, Li J, Liu L, 'A Unified Survey of Treatment Effect Heterogeneity Modelling and Uplift Modelling', ACM Computing Surveys, 54 1-36 (2022) [C1]
DOI 10.1145/3466818
Citations Scopus - 36, Web of Science - 33
2022 Zhang X, Zhang W, Zhao YC, Zhu Q, 'Imbalanced volunteer engagement in cultural heritage crowdsourcing: a task-related exploration based on causal inference', INFORMATION PROCESSING & MANAGEMENT, 59 (2022) [C1]
DOI 10.1016/j.ipm.2022.103027
Citations Scopus - 17, Web of Science - 9
2022 Yang S, Zhang Y, Jia Y, Zhang W, 'Local Low-Rank Approximation With Superpixel-Guided Locality Preserving Graph for Hyperspectral Image Classification', IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 15 7741-7754 (2022) [C1]
DOI 10.1109/JSTARS.2022.3199885
Citations Scopus - 6, Web of Science - 5
2021 Li J, Zhang W, Liu L, Yu K, Le TD, Liu J, 'A general framework for causal classification', INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 11 127-139 (2021) [C1]
DOI 10.1007/s41060-021-00249-1
Citations Scopus - 6, Web of Science - 6
2020 Tomasoni M, Gomez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, 'MONET: a toolbox integrating top-performing methods for network modularization', BIOINFORMATICS, 36 3920-3921 (2020) [C1]
DOI 10.1093/bioinformatics/btaa236
Citations Scopus - 15, Web of Science - 14
2019 Brown P, RELISH Consortium, Zhou Y, 'Large expert-curated database for benchmarking document similarity detection in biomedical literature search', Database, 2019 (2019) [C1]
DOI 10.1093/database/baz085
Citations Scopus - 23, Web of Science - 30
Co-authors Elizabeth Bromfield, Brett Nixon
2019 Choobdar S, Ahsen ME, Crawford J, Tomasoni M, Fang T, Lamparter D, et al., 'Assessment of network module identification across complex diseases', NATURE METHODS, 16 843-+ (2019) [C1]
DOI 10.1038/s41592-019-0509-5
Citations Scopus - 191, Web of Science - 174
2018 Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, et al., 'miRBaseConverter: an R/Bioconductor package for converting and retrieving miRNA name, accession, sequence and family information in different versions of miRBase', BMC BIOINFORMATICS, 19 (2018) [C1]
DOI 10.1186/s12859-018-2531-5
Citations Scopus - 56, Web of Science - 47
2017 Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Mining heterogeneous causal effects for personalized cancer treatment', BIOINFORMATICS, 33 2372-2378 (2017) [C1]
DOI 10.1093/bioinformatics/btx174
Citations Scopus - 33, Web of Science - 30
2016 Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Predicting miRNA Targets by Integrating Gene Regulatory Knowledge with Expression Profiles', PLOS ONE, 11 (2016) [C1]
DOI 10.1371/journal.pone.0152860
Citations Scopus - 16, Web of Science - 19
2010 Wang X, Wu X, Wang C, Zhang W, Ouyang Y, Yu Y, He Z, 'Transcriptional suppression of breast cancer resistance protein (BCRP) by wild-type p53 through the NF-κB pathway in MCF-7 cells', FEBS LETTERS, 584 3392-3397 (2010)
DOI 10.1016/j.febslet.2010.06.033
Citations Web of Science - 36

Conference (14 outputs)

Year Citation Altmetrics Link
2024 Zhang W, Ling CK, Zhang X, 'Deep Copula-Based Survival Analysis for Dependent Censoring with Identifiability Guarantees', Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, Canada (2024) [E1]
DOI 10.1609/aaai.v38i18.30047
Citations Scopus - 4
2024 Tang W, Zhang W, Zhang ML, 'Exploiting Conjugate Label Information for Multi-Instance Partial-Label Learning', IJCAI International Joint Conference on Artificial Intelligence, Jeju, Korea (2024) [E1]
Citations Scopus - 2
2024 Tang W, Yang YF, Wang Z, Zhang W, Zhang ML, 'Multi-Instance Partial-Label Learning with Margin Adjustment', Advances in Neural Information Processing Systems (2024)

Multi-instance partial-label learning (MIPL) is an emerging learning framework where each training sample is represented as a multi-instance bag associated with a candidate label set. Existing MIPL algorithms often overlook the margins for attention scores and predicted probabilities, leading to suboptimal generalization performance. A critical issue with these algorithms is that the highest prediction probability of the classifier may appear on a non-candidate label. In this paper, we propose an algorithm named MIPLMA, i.e., Multi-Instance Partial-Label learning with Margin Adjustment, which adjusts the margins for attention scores and predicted probabilities. We introduce a margin-aware attention mechanism to dynamically adjust the margins for attention scores and propose a margin distribution loss to constrain the margins between the predicted probabilities on candidate and non-candidate label sets. Experimental results demonstrate the superior performance of MIPLMA over existing MIPL algorithms, as well as other well-established multi-instance learning algorithms and partial-label learning algorithms.

2024 Wang Z, Zhang W, Zhang ML, 'Proposal Feature Learning Using Proposal Relations for Weakly Supervised Object Detection', Proceedings - IEEE International Conference on Multimedia and Expo, Niagara Falls, ON (2024) [E1]
DOI 10.1109/ICME57554.2024.10688185
2023 Tang W, Zhang W, Zhang M-L, 'Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning', ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), LA, New Orleans (2023) [E1]
Citations Scopus - 6
2022 Zhang W, Zhang X, Deng H-W, Zhang M-L, 'Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization', Advances in Neural Information Processing Systems, New Orleans, USA (2022) [E1]
Citations Scopus - 19
2021 Zhang W, 'Non-I.I.D. Multi-Instance Learning for Predicting Instance and Bag Labels with Variational Auto-Encoder', Proceedings of the 30th International Joint Conference on Artificial Intelligence, Virtual Conference (2021) [E1]
DOI 10.24963/ijcai.2021/465
Citations Scopus - 12, Web of Science - 9
2021 Zhang W, Liu L, Li J, 'Treatment Effect Estimation with Disentangled Latent Factors', Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual Conference (2021) [E1]
DOI 10.1609/aaai.v35i12.17304
Citations Scopus - 53, Web of Science - 31
2020 Zhang W, Liu L, Li J, 'Robust Multi-Instance Learning with Stable Instances', ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, ELECTR NETWORK, European Assoc Artificial Intelligence (2020) [E1]
DOI 10.3233/FAIA200280
Citations Scopus - 13, Web of Science - 13
2018 Ng WT, Yu J, Wang M, Li R, Zhang W, 'Design Trends in Smart Gate Driver ICs for Power GaN HEMTs', 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), PEOPLES R CHINA, Qingdao (2018)
Citations Web of Science - 4
2018 Zhang W, Thuc DL, Liu L, Li J, 'Estimating heterogeneous treatment effect by balancing heterogeneity and fitness', BMC BIOINFORMATICS, PEOPLES R CHINA, Yunnan (2018) [E1]
DOI 10.1186/s12859-018-2521-7
Citations Scopus - 5, Web of Science - 4
2015 Li W, Pang J, Niu Q, Zhang W, 'Application of improved support vector machine based on shuffled frog leaping algorithm in wind-photovoltaic-battery power forecasting', 2015 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS IHMSC 2015, VOL II, PEOPLES R CHINA, Hangzhou (2015)
DOI 10.1109/IHMSC.2015.248
Citations Web of Science - 1
2015 Li W, Niu Q, Zhang W, Pang J, 'The Application of Spark in the Power Grid Intelligent Decision Analysis Platform', 2015 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS IHMSC 2015, VOL II, PEOPLES R CHINA, Hangzhou (2015)
DOI 10.1109/IHMSC.2015.200
2014 Zhang W-J, Zhou Z-H, 'Multi-Instance Learning with Distribution Change', Proceedings of the 28th AAAI Conference on Artificial Intelligence, Quebec City, CANADA (2014)
Citations Scopus - 14, Web of Science - 11

Preprint (4 outputs)

Year Citation Altmetrics Link
2019 Tomasoni M, Gómez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, '
DOI 10.1101/611418
2018 Choobdar S, Ahsen M, Crawford J, Tomasoni M, Lamparter D, Lin J, et al., 'Open Community Challenge Reveals Molecular Network Modules with Key Roles in Diseases'
DOI 10.2139/ssrn.3188379
2018 Zhang W, Le T, Liu L, Li J, 'Estimating heterogeneous treatment effects by balancing heterogeneity and fitness' (2018)
DOI 10.1101/333278
2018 Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, et al., 'miRBaseConverter: An R/Bioconductor Package for Converting and Retrieving miRNA Name, Accession, Sequence and Family Information in Different Versions of miRBase' (2018)
DOI 10.1101/407148

Grants and Funding

Summary

Number of grants 6
Total funding $218,500


2024: 3 grants / $20,000

Global Experience Support Funding: $9,000

Funding body The University of Newcastle, NSW
Project Team

M A Hakim Newton, Marcella Papini, Weijia Zhang, Saiful Islam, Alexandre Mendes, Karen Blackmore

Scheme Global Experience Support Fund
Role Investigator
Funding Start 2024
Funding Finish 2024
GNo
Type Of Funding Internal
Category INTE
UON N

CESE Fellowship Accelerator: $8,000

Funding body College of Engineering Science and Environment | the University of Newcastle | Australia
Scheme CESE Excellence Strategic Investment Scheme
Role Lead
Funding Start 2024
Funding Finish 2024
GNo
Type Of Funding Internal
Category INTE
UON N

CESE Conference Travel Scheme: $3,000

Funding body College of Engineering, Science, & Environment (CESE), The University of Newcastle
Scheme Internal Competitive Schemes
Role Lead
Funding Start 2024
Funding Finish 2024
GNo
Type Of Funding Internal
Category INTE
UON N

2023: 1 grant / $10,000

UON Start-up support funding: $10,000

Funding body College of Engineering, Science & Environment (CESE) Start-up Funding
Scheme College of Engineering, Science & Environment (CESE) Start-up Funding
Role Lead
Funding Start 2023
Funding Finish 2023
GNo
Type Of Funding Internal
Category INTE
UON N

2022: 1 grant / $63,500

Causality-based weakly supervised learning from inexact supervision: $63,500

Funding body National Natural Science Foundation of China
Scheme National Natural Science Foundation of China
Role Lead
Funding Start 2022
Funding Finish 2022
GNo
Type Of Funding External
Category EXTE
UON N

2021: 1 grant / $125,000

Weakly supervised survival analysis for vehicle start-up battery failure prediction: $125,000

Funding body Changan Automobile
Scheme Industry Funding
Role Lead
Funding Start 2021
Funding Finish 2022
GNo
Type Of Funding External
Category EXTE
UON N

Research Supervision

Number of supervisions

Completed 4
Current 10

Current Supervision

Commenced Level of Study Research Title Program Supervisor Type
2024 PhD Statistics and Machine Learning Methods for Response Prediction and Evaluation PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle Principal Supervisor
2024 PhD Pricing of Life Insurance Premiums Based on Machine Learning Personalized Risk Rating PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle Co-Supervisor
2023 PhD Deep Transformation of Tabular Categorical Data for Enhanced Predictive Analytics in Health Informatics PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle Co-Supervisor
2022 PhD Speech Depression Recognition Using Deep Learning PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle Principal Supervisor
2022 Masters Attention-based weakly supervised object detection Computer Science, Southeast University Co-Supervisor
2022 Masters Survival analysis with competing risks Computer Science, Southeast University Co-Supervisor
2022 PhD Alzheimer’s Disease Detection Using Deep Learning PhD (Computer Science), College of Engineering, Science and Environment, The University of Newcastle Co-Supervisor
2022 Masters Interpretable deep generative models Computer Science, Southeast University Co-Supervisor
2021 Masters Weakly supervised survival analysis for histopathology images Information Technology, Southeast University-Monash University joint graduate school Co-Supervisor
2021 PhD Research on multi-instance partial label learning algorithms Computer Science, Southeast University Co-Supervisor

Past Supervision

Year Level of Study Research Title Program Supervisor Type
2024 Masters Causal generative modeling of noisy data Computer Science, Southeast University Co-Supervisor
2024 Masters Research on interpretable partial label learning algorithms Computer Science, Southeast University-Monash University joint graduate school Co-Supervisor
2023 Honours Adversarial robustness of identifiable variational autoencoder Artificial Intelligence, Southeast University Sole Supervisor
2022 Honours Multi-instance learning for histopathology images Computer Science, Southeast University Sole Supervisor

Research Opportunities

Causal Representation Learning under Weak Supervision

Causal representation learning under weak supervision is a burgeoning area of research that aims to extract meaningful causal structures from data with limited or imprecise guidance. Unlike traditional supervised learning, which relies on a wealth of labeled data, weak supervision involves leveraging noisy, incomplete, or indirectly related signals to infer causal relationships. This approach is particularly valuable in domains where obtaining high-quality labeled data is challenging or expensive, such as in medical diagnosis or climate modeling. By integrating methods from causal inference and representation learning, researchers strive to uncover latent causal factors that drive observed data. These factors can then be used to build more robust and interpretable models that generalize well to new, unseen environments. The key challenge lies in devising algorithms that can effectively disentangle causal influences and handle the inherent uncertainty and noise in weakly supervised data, thereby improving the reliability and applicability of causal insights derived from complex, real-world datasets.

PHD

School of Information and Physical Sciences

1/1/2025 - 31/12/2028

https://www.weijiazhangxh.com/

Contact

Doctor Weijia Zhang
University of Newcastle
School of Information and Physical Sciences
weijia.zhang@newcastle.edu.au

Time-to-event modelling under Dependent Censoring

Censoring is the central problem in time-to-event modelling, where either the time-to-event (for instance, death or equipment failure) or the time-to-censoring (such as loss to follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates, an assumption that cannot be verified since only marginal distributions are available from the data. The existence of dependent censoring, along with the inherent bias in current estimators, has been demonstrated in a variety of applications, accentuating the need for a more nuanced approach. However, existing methods that adjust for dependent censoring require practitioners to specify the ground truth copula. This requirement poses a significant challenge for practical applications, as model misspecification can lead to substantial bias.

PHD

School of Information and Physical Sciences

1/1/2025 - 31/12/2028

https://www.weijiazhangxh.com/

Contact

Doctor Weijia Zhang
University of Newcastle
School of Information and Physical Sciences
weijia.zhang@newcastle.edu.au


Research Collaborations

The map is a representation of a researcher's co-authorship with collaborators across the globe. The map displays the number of publications against a country, where there is at least one co-author based in that country. Data is sourced from the University of Newcastle research publication management system (NURO) and may not fully represent the author's complete body of work.

Country Count of Publications
China 24
Australia 21
United States 5
Switzerland 3
Spain 3

Dr Weijia Zhang

Position

Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences
College of Engineering, Science and Environment

Focus area

Data Science and Statistics

Contact Details

Email weijia.zhang@newcastle.edu.au
Phone (02) 4055 0921
Links Personal webpage
Twitter

Office

Room SR114
Building Social Science Building
Location Callaghan
University Drive
Callaghan, NSW 2308
Australia