Dr Weijia Zhang
Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences (Data Science and Statistics)
- Email: weijia.zhang@newcastle.edu.au
- Phone: (02) 4055 0921
Career Summary
Biography
Dr Weijia Zhang’s research aims to bridge the gap between causal inference and machine learning, which enables statistical and computational algorithms to transform the decision-making process of our government, healthcare, and business.
Dr Weijia "Weja" Zhang graduated with a PhD from the University of South Australia in 2018. He has researched and taught a variety of courses in data science and machine learning, first as a Research Fellow at his alma mater, and then as an Associate Professor at Southeast University, China, before taking up a position as Lecturer at the University of Newcastle.
As a Lecturer in Data Science and Statistics at the University of Newcastle, Weijia is experienced in statistical causal inference and weakly supervised machine learning. His research has been featured in many prestigious international conferences and journals, and has been successfully adapted to various industries, including some major companies in the automobile, oil and gas, and telecommunications sectors in the Asia Pacific region.
The power of causal inference
Causal inference is a field of study that deals with the identification of cause-and-effect relationships between variables. It has had a significant impact in various fields, including medicine, public health, economics, and social sciences, among others. Weijia’s work in causality focuses on developing methods to estimate the causal effect from observational data or even data with censoring, which allows researchers to determine the relationship between an exposure or intervention and an outcome of interest, without the need for running costly and time-consuming randomized controlled trials.
Machine learning from ambiguously labelled data
Although supervised machine learning has demonstrated success in various domains, it has a notable limitation in its dependence on large volumes of labelled data. Weakly supervised learning, which involves training algorithms using data with incomplete, ambiguous or noisy labels, is a technique employed in machine learning to overcome this limitation. Weijia's research in weakly supervised learning has sparked a new learning paradigm, known as Multi-Instance Partial Label learning, which enables algorithms to learn from low-quality and ambiguous labels provided by non-expert crowdsourced annotators. This paradigm offers a promising solution to the challenge of limited expert annotations in crucial domains, such as medical imaging and diagnostics.
Bridging causal inference with machine learning
It comes as no surprise that Weijia's research in causal inference and machine learning has led him to the emerging field of causal machine learning. This area has the potential to enhance the dependability and interpretability of machine learning models, the lack of which has impeded the advancement of machine learning applications since the field's inception. Thanks to Weijia's expertise in weakly supervised learning, one of his recent works has demonstrated that causality-based models can be more readily achieved when supervision is ambiguous than in traditional supervised learning.
Making real-world impact
In addition to his academic publications, Weijia has extensive experience in translating his research into practical applications. He has collaborated with customer representatives in the telecommunications sector to enhance customer retention, and with production engineers in the oil and gas industry to develop predictive maintenance strategies, among other projects. In a recent collaboration with Chang'an Automobile, a major car manufacturer in the Asia Pacific, Weijia's algorithm was implemented to extend the lifespan of hundreds of thousands of vehicle batteries.
Qualifications
- DOCTOR OF PHILOSOPHY, University of South Australia
Keywords
- Causal Inference
- Data Mining
- Machine Learning
Languages
- English (Fluent)
- Mandarin (Mother)
Fields of Research
Code | Description | Percentage |
---|---|---|
461106 | Semi- and unsupervised learning | 40 |
490508 | Statistical data science | 40 |
461103 | Deep learning | 20 |
Professional Experience
UON Appointment
Title | Organisation / Department |
---|---|
Lecturer in Data Science and Applied Statistics | University of Newcastle, School of Information and Physical Sciences, Australia |
Academic appointment
Dates | Title | Organisation / Department |
---|---|---|
1/7/2021 - 31/1/2023 | Associate Professor | Southeast University, School of Computer Science and Engineering, China |
28/5/2018 - 17/7/2020 | Research Fellow | The University of South Australia, School of Information and Mathematical Science, Australia |
Teaching
Code | Course | Role | Duration |
---|---|---|---|
INFS 5102 | Unsupervised Methods in Analytics, The University of South Australia | Course Coordinator | 18/2/2019 - 30/6/2019 |
STAT6160 | Data Analytics for Business Intelligence, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023 |
CSE 5075 | Pattern Recognition, Southeast University | Course Coordinator | 1/3/2022 - 30/7/2022 |
INFS 5100 | Predictive Analytics, The University of South Australia | Course Coordinator | 17/2/2020 - 17/7/2020 |
STAT1060 | Business Decision Making, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023 |
Publications
For publications that are currently unpublished or in-press, details are shown in italics.
Journal article (11 outputs)
Year | Citation | Altmetrics | Link |
---|---|---|---|
2024 | Tang W, Zhang W, Zhang ML, 'Multi-instance partial-label learning: towards exploiting dual inexact supervision', Science China Information Sciences, 67 (2024) [C1] Weakly supervised machine learning algorithms are able to learn from ambiguous samples or labels, e.g., multi-instance learning or partial-label learning. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems since the former fails to disambiguate a candidate label set, and the latter cannot handle a multi-instance bag. To address these issues, a tailored algorithm named MiplGp, i.e., multi-instance partial-label learning with Gaussian processes, is proposed. MiplGp first assigns each instance with a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield the disambiguated and continuous labels via an exclusive disambiguation strategy, and last induces a model based on the Gaussian processes. Experimental results on various datasets validate that MiplGp is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems. | | Nova |
2022 | Zhang W, Li J, Liu L, 'A Unified Survey of Treatment Effect Heterogeneity Modelling and Uplift Modelling', ACM Computing Surveys, 54 1-36 (2022) [C1] | | |
2022 | Zhang X, Zhang W, Zhao YC, Zhu Q, 'Imbalanced volunteer engagement in cultural heritage crowdsourcing: a task-related exploration based on causal inference', INFORMATION PROCESSING & MANAGEMENT, 59 (2022) [C1] | | |
2022 | Yang S, Zhang Y, Jia Y, Zhang W, 'Local Low-Rank Approximation With Superpixel-Guided Locality Preserving Graph for Hyperspectral Image Classification', IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 15 7741-7754 (2022) [C1] | | |
2021 | Li J, Zhang W, Liu L, Yu K, Le TD, Liu J, 'A general framework for causal classification', INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 11 127-139 (2021) [C1] | | |
2020 | Tomasoni M, Gomez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, 'MONET: a toolbox integrating top-performing methods for network modularization', BIOINFORMATICS, 36 3920-3921 (2020) [C1] | | |
2019 | Brown P, RELISH Consortium, Zhou Y, 'Large expert-curated database for benchmarking document similarity detection in biomedical literature search', Database, 2019 (2019) [C1] | | Nova |
2019 | Choobdar S, Ahsen ME, Crawford J, Tomasoni M, Fang T, Lamparter D, et al., 'Assessment of network module identification across complex diseases', NATURE METHODS, 16 843-+ (2019) [C1] | | |
2018 | Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, et al., 'miRBaseConverter: an R/Bioconductor package for converting and retrieving miRNA name, accession, sequence and family information in different versions of miRBase', BMC BIOINFORMATICS, 19 (2018) [C1] | | |
2017 | Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Mining heterogeneous causal effects for personalized cancer treatment', BIOINFORMATICS, 33 2372-2378 (2017) [C1] | | |
2016 | Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Predicting miRNA Targets by Integrating Gene Regulatory Knowledge with Expression Profiles', PLOS ONE, 11 (2016) [C1] | | |
Conference (13 outputs)
Year | Citation | Altmetrics | Link |
---|---|---|---|
2022 | Zhang W, Zhang X, Deng H-W, Zhang M-L, 'Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization', Advances in Neural Information Processing Systems, New Orleans, USA (2022) [E1] | | |
2021 | Zhang W, 'Non-I.I.D. Multi-Instance Learning for Predicting Instance and Bag Labels with Variational Auto-Encoder', Proceedings of the 30th International Joint Conference on Artificial Intelligence, Virtual Conference (2021) [E1] | | |
2021 | Zhang W, Liu L, Li J, 'Treatment Effect Estimation with Disentangled Latent Factors', Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual Conference (2021) [E1] | | |
2020 | Zhang W, Liu L, Li J, 'Robust Multi-Instance Learning with Stable Instances', ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, European Assoc Artificial Intelligence, ELECTR NETWORK (2020) [E1] | | |
2019 | Zhou J, Yao K, Huang X, Sun G, Zhang W, Ashtaq A, et al., 'Temperature Calculation and Measurement on Power Cable Conductor Based on Equivalent Thermal Circuit and BOTDA', 2019 IEEE INTERNATIONAL CONFERENCE ON ENVIRONMENT AND ELECTRICAL ENGINEERING AND 2019 IEEE INDUSTRIAL AND COMMERCIAL POWER SYSTEMS EUROPE (EEEIC / I&CPS EUROPE), ITALY, Univ Genoa, Genova (2019) | | |
2019 | Huang X, Li Y, Wu C, Lin D, Liu Z, Chen Y, et al., 'Temperature Online Monitoring of Submarine Cable Based on BOTDA and FOCT', PROCEEDINGS OF 2019 IEEE 3RD INTERNATIONAL ELECTRICAL AND ENERGY CONFERENCE (CIEEC), PEOPLES R CHINA, Beijing (2019) | | |
2019 | Chen Y, Cai C, Wu C, Zhang W, Huang X, Cen Z, Guo Q, 'Detection and Research of 500kV Submarine Cable Routing', PROCEEDINGS OF 2019 IEEE 3RD INTERNATIONAL ELECTRICAL AND ENERGY CONFERENCE (CIEEC), PEOPLES R CHINA, Beijing (2019) | | |
2018 | Ng WT, Yu J, Wang M, Li R, Zhang W, 'Design Trends in Smart Gate Driver ICs for Power GaN HEMTs', 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), PEOPLES R CHINA, Qingdao (2018) | | |
2018 | Zhang W, Thuc DL, Liu L, Li J, 'Estimating heterogeneous treatment effect by balancing heterogeneity and fitness', BMC BIOINFORMATICS, PEOPLES R CHINA, Yunnan (2018) [E1] | | |
Preprint (4 outputs)
Year | Citation | Altmetrics | Link |
---|---|---|---|
2019 | Tomasoni M, Gómez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, ' | | |
2018 | Choobdar S, Ahsen M, Crawford J, Tomasoni M, Lamparter D, Lin J, et al., 'Open Community Challenge Reveals Molecular Network Modules with Key Roles in Diseases' | | |
2018 | Zhang W, Le T, Liu L, Li J, 'Estimating heterogeneous treatment effects by balancing heterogeneity and fitness' (2018) | | |
Grants and Funding
Summary
Number of grants | 5 |
---|---|
Total funding | $209,500 |
2024: 2 grants / $11,000
CESE Fellowship Accelerator: $8,000
Funding body | College of Engineering Science and Environment | the University of Newcastle | Australia |
---|---|
Scheme | CESE Excellence Strategic Investment Scheme |
Role | Lead |
Funding Start | 2024 |
Funding Finish | 2024 |
GNo | |
Type Of Funding | Internal |
Category | INTE |
UON | N |
CESE Conference Travel Scheme: $3,000
Funding body | College of Engineering, Science, & Environment (CESE), The University of Newcastle |
---|---|
Scheme | Internal Competitive Schemes |
Role | Lead |
Funding Start | 2024 |
Funding Finish | 2024 |
GNo | |
Type Of Funding | Internal |
Category | INTE |
UON | N |
2023: 1 grant / $10,000
UON Start-up support funding: $10,000
Funding body | College of Engineering, Science & Environment (CESE) Start-up Funding |
---|---|
Scheme | College of Engineering, Science & Environment (CESE) Start-up Funding |
Role | Lead |
Funding Start | 2023 |
Funding Finish | 2023 |
GNo | |
Type Of Funding | Internal |
Category | INTE |
UON | N |
2022: 1 grant / $63,500
Causality-based weakly supervised learning from inexact supervision: $63,500
Funding body | National Natural Science Foundation of China |
---|---|
Scheme | National Natural Science Foundation of China |
Role | Lead |
Funding Start | 2022 |
Funding Finish | 2022 |
GNo | |
Type Of Funding | External |
Category | EXTE |
UON | N |
2021: 1 grant / $125,000
Weakly supervised survival analysis for vehicle start-up battery failure prediction: $125,000
Funding body | Changan Automobile |
---|---|
Scheme | Industry Funding |
Role | Lead |
Funding Start | 2021 |
Funding Finish | 2022 |
GNo | |
Type Of Funding | External |
Category | EXTE |
UON | N |
Research Supervision
Current Supervision
Commenced | Level of Study | Research Title | Program | Supervisor Type |
---|---|---|---|---|
2024 | PhD | Statistics and Machine Learning Methods for Response Prediction and Evaluation | PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle | Principal Supervisor |
2022 | Masters | Attention-based weakly supervised object detection | Computer Science, Southeast University | Co-Supervisor |
2022 | Masters | Survival analysis with competing risks | Computer Science, Southeast University | Co-Supervisor |
2022 | Masters | Interpretable deep generative models | Computer Science, Southeast University | Co-Supervisor |
2021 | Masters | Causal generative modeling of noisy data | Computer Science, Southeast University | Co-Supervisor |
2021 | Masters | Weakly supervised survival analysis for histopathology images | Information Technology, Southeast University-Monash University joint graduate school | Co-Supervisor |
2021 | Masters | Research on interpretable partial label learning algorithms | Computer Science, Southeast University-Monash University joint graduate school | Co-Supervisor |
2021 | PhD | Research on multi-instance partial label learning algorithms | Computer Science, Southeast University | Co-Supervisor |
Past Supervision
Year | Level of Study | Research Title | Program | Supervisor Type |
---|---|---|---|---|
2023 | Honours | Adversarial robustness of identifiable variational autoencoder | Artificial Intelligence, Southeast University | Sole Supervisor |
2022 | Honours | Multi-instance learning for histopathology images | Computer Science, Southeast University | Sole Supervisor |
Research Collaborations
The map is a representation of a researcher's co-authorship with collaborators across the globe. The map displays the number of publications against a country, where there is at least one co-author based in that country. Data is sourced from the University of Newcastle research publication management system (NURO) and may not fully represent the author's complete body of work.
Country | Count of Publications |
---|---|
China | 19 |
Australia | 12 |
United States | 4 |
Switzerland | 3 |
Spain | 3 |
Dr Weijia Zhang
Position
Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences
College of Engineering, Science and Environment
Focus area
Data Science and Statistics
Contact Details
Email | weijia.zhang@newcastle.edu.au |
---|---|
Phone | (02) 4055 0921 |
Links | Personal webpage |
Office
Room | SR114 |
---|---|
Building | Social Science Building |
Location | Callaghan University Drive Callaghan, NSW 2308 Australia |