Dr Weijia Zhang

Lecturer in Data Science and Applied Statistics

School of Information and Physical Sciences (Data Science and Statistics)

Career Summary

Biography

Dr Weijia Zhang’s research aims to bridge the gap between causal inference and machine learning, so that statistical and computational algorithms can transform decision-making in government, healthcare, and business.

Dr Weijia "Weja" Zhang graduated with a PhD from the University of South Australia in 2018. He has conducted research and taught a variety of courses in data science and machine learning, first as a Research Fellow at his alma mater and then as an Associate Professor at Southeast University, China, before taking up a position as Lecturer at the University of Newcastle.

As a Lecturer in Data Science and Statistics at the University of Newcastle, Weijia is experienced in statistical causal inference and weakly supervised machine learning. His research has been featured in many prestigious international conferences and journals, and has been successfully adapted to various industries, including some major companies in the automobile, oil and gas, and telecommunications sectors in the Asia Pacific region.

The power of causal inference

Causal inference is a field of study that deals with the identification of cause-and-effect relationships between variables. It has had a significant impact in various fields, including medicine, public health, economics, and social sciences, among others. Weijia’s work in causality focuses on developing methods to estimate the causal effect from observational data or even data with censoring, which allows researchers to determine the relationship between an exposure or intervention and an outcome of interest, without the need for running costly and time-consuming randomized controlled trials.

Machine learning from ambiguously labelled data

Although supervised machine learning has demonstrated success in various domains, it has a notable limitation in its dependence on large volumes of labelled data. Weakly supervised learning, which involves training algorithms using data with incomplete, ambiguous or noisy labels, is a technique employed in machine learning to overcome this limitation. Weijia's research in weakly supervised learning has sparked a new learning paradigm, known as Multi-Instance Partial Label learning, which enables algorithms to learn from low-quality and ambiguous labels provided by non-expert crowdsourced annotators. This paradigm offers a promising solution to the challenge of limited expert annotations in crucial domains, such as medical imaging and diagnostics.

Bridging causal inference with machine learning

It comes as no surprise that Weijia's research in causal inference and machine learning has led him to the emerging field of causal machine learning. This area has the potential to enhance the dependability and interpretability of machine learning models, the lack of which has impeded the advancement of machine learning applications since the field's inception. Drawing on Weijia's expertise in weakly supervised learning, one of his recent works has demonstrated that causality-based models can be more readily achieved when supervision is ambiguous than in traditional supervised learning.

Making real-world impact

In addition to his academic publications, Weijia has extensive experience in translating his research into practical applications. He has collaborated with customer representatives in the telecommunications sector to enhance customer retention, and with production engineers in the oil and gas industry to develop predictive maintenance strategies, among other projects. In a recent collaboration with Chang'an Automobile, a major car manufacturer in the Asia Pacific, Weijia's algorithm was implemented to extend the lifespan of hundreds of thousands of vehicle batteries.


Qualifications

  • DOCTOR OF PHILOSOPHY, University of South Australia

Keywords

  • Causal Inference
  • Data Mining
  • Machine Learning

Languages

  • English (Fluent)
  • Mandarin (Mother tongue)

Fields of Research

Code Description Percentage
461106 Semi- and unsupervised learning 40
490508 Statistical data science 40
461103 Deep learning 20

Professional Experience

UON Appointment

Title | Organisation / Department
Lecturer in Data Science and Applied Statistics | University of Newcastle, School of Information and Physical Sciences, Australia

Academic appointment

Dates | Title | Organisation / Department
1/7/2021 - 31/1/2023 | Associate Professor | Southeast University, School of Computer Science and Engineering, China
28/5/2018 - 17/7/2020 | Research Fellow | The University of South Australia, School of Information and Mathematical Science, Australia

Teaching

Code | Course | Role | Duration
INFS 5102 | Unsupervised Methods in Analytics, The University of South Australia | Course Coordinator | 18/2/2019 - 30/6/2019
STAT6160 | Data Analytics for Business Intelligence, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023
CSE 5075 | Pattern Recognition, Southeast University | Course Coordinator | 1/3/2022 - 30/7/2022
INFS 5100 | Predictive Analytics, The University of South Australia | Course Coordinator | 17/2/2020 - 17/7/2020
STAT1060 | Business Decision Making, College of Engineering, Science and Environment, University of Newcastle | Course Coordinator | 20/2/2023 - 30/6/2023

Publications


Journal article (11 outputs)

Year Citation Altmetrics Link
2024 Tang W, Zhang W, Zhang ML, 'Multi-instance partial-label learning: towards exploiting dual inexact supervision', Science China Information Sciences, 67 (2024) [C1]

Weakly supervised machine learning algorithms are able to learn from ambiguous samples or labels, e.g., multi-instance learning or partial-label learning. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems since the former fails to disambiguate a candidate label set, and the latter cannot handle a multi-instance bag. To address these issues, a tailored algorithm named MiplGp, i.e., multi-instance partial-label learning with Gaussian processes, is proposed. MiplGp first assigns each instance with a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield the disambiguated and continuous labels via an exclusive disambiguation strategy, and last induces a model based on the Gaussian processes. Experimental results on various datasets validate that MiplGp is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems.

DOI 10.1007/s11432-023-3771-6
2022 Zhang W, Li J, Liu L, 'A Unified Survey of Treatment Effect Heterogeneity Modelling and Uplift Modelling', ACM Computing Surveys, 54 1-36 (2022) [C1]
DOI 10.1145/3466818
Citations Scopus - 22, Web of Science - 13
2022 Zhang X, Zhang W, Zhao YC, Zhu Q, 'Imbalanced volunteer engagement in cultural heritage crowdsourcing: a task-related exploration based on causal inference', Information Processing & Management, 59 (2022) [C1]
DOI 10.1016/j.ipm.2022.103027
Citations Scopus - 8, Web of Science - 1
2022 Yang S, Zhang Y, Jia Y, Zhang W, 'Local Low-Rank Approximation With Superpixel-Guided Locality Preserving Graph for Hyperspectral Image Classification', IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15 7741-7754 (2022) [C1]
DOI 10.1109/JSTARS.2022.3199885
Citations Scopus - 2
2021 Li J, Zhang W, Liu L, Yu K, Le TD, Liu J, 'A general framework for causal classification', International Journal of Data Science and Analytics, 11 127-139 (2021) [C1]
DOI 10.1007/s41060-021-00249-1
Citations Scopus - 5, Web of Science - 4
2020 Tomasoni M, Gomez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, 'MONET: a toolbox integrating top-performing methods for network modularization', Bioinformatics, 36 3920-3921 (2020) [C1]
DOI 10.1093/bioinformatics/btaa236
Citations Scopus - 10, Web of Science - 6
2019 Brown P, RELISH Consortium, Zhou Y, 'Large expert-curated database for benchmarking document similarity detection in biomedical literature search', Database, 2019 (2019) [C1]
DOI 10.1093/database/baz085
Citations Scopus - 17, Web of Science - 12
Co-authors Brett Nixon, Elizabeth Bromfield
2019 Choobdar S, Ahsen ME, Crawford J, Tomasoni M, Fang T, Lamparter D, et al., 'Assessment of network module identification across complex diseases', Nature Methods, 16 843-+ (2019) [C1]
DOI 10.1038/s41592-019-0509-5
Citations Scopus - 154, Web of Science - 107
2018 Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, et al., 'miRBaseConverter: an R/Bioconductor package for converting and retrieving miRNA name, accession, sequence and family information in different versions of miRBase', BMC Bioinformatics, 19 (2018) [C1]
DOI 10.1186/s12859-018-2531-5
Citations Scopus - 45, Web of Science - 37
2017 Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Mining heterogeneous causal effects for personalized cancer treatment', Bioinformatics, 33 2372-2378 (2017) [C1]
DOI 10.1093/bioinformatics/btx174
Citations Scopus - 28, Web of Science - 18
2016 Zhang W, Thuc DL, Liu L, Zhou Z-H, Li J, 'Predicting miRNA Targets by Integrating Gene Regulatory Knowledge with Expression Profiles', PLOS ONE, 11 (2016) [C1]
DOI 10.1371/journal.pone.0152860
Citations Scopus - 15, Web of Science - 16

Conference (13 outputs)

Year Citation Altmetrics Link
2022 Zhang W, Zhang X, Deng H-W, Zhang M-L, 'Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization', Advances in Neural Information Processing Systems, New Orleans, USA (2022) [E1]
Citations Scopus - 5
2021 Zhang W, 'Non-I.I.D. Multi-Instance Learning for Predicting Instance and Bag Labels with Variational Auto-Encoder', Proceedings of the 30th International Joint Conference on Artificial Intelligence, Virtual Conference (2021) [E1]
DOI 10.24963/ijcai.2021/465
Citations Scopus - 3
2021 Zhang W, Liu L, Li J, 'Treatment Effect Estimation with Disentangled Latent Factors', Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual Conference (2021) [E1]
DOI 10.1609/aaai.v35i12.17304
Citations Scopus - 20, Web of Science - 4
2020 Zhang W, Liu L, Li J, 'Robust Multi-Instance Learning with Stable Instances', ECAI 2020: 24th European Conference on Artificial Intelligence, European Assoc Artificial Intelligence, Electronic Network (2020) [E1]
DOI 10.3233/FAIA200280
Citations Scopus - 7, Web of Science - 6
2019 Zhou J, Yao K, Huang X, Sun G, Zhang W, Ashtaq A, et al., 'Temperature Calculation and Measurement on Power Cable Conductor Based on Equivalent Thermal Circuit and BOTDA', 2019 IEEE International Conference on Environment and Electrical Engineering and 2019 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe), Italy, Univ Genoa, Genova (2019)
Citations Web of Science - 1
2019 Huang X, Li Y, Wu C, Lin D, Liu Z, Chen Y, et al., 'Temperature Online Monitoring of Submarine Cable Based on BOTDA and FOCT', Proceedings of 2019 IEEE 3rd International Electrical and Energy Conference (CIEEC), People's Republic of China, Beijing (2019)
DOI 10.1109/CIEEC47146.2019.CIEEC-2019453
Citations Web of Science - 1
2019 Chen Y, Cai C, Wu C, Zhang W, Huang X, Cen Z, Guo Q, 'Detection and Research of 500kV Submarine Cable Routing', Proceedings of 2019 IEEE 3rd International Electrical and Energy Conference (CIEEC), People's Republic of China, Beijing (2019)
DOI 10.1109/CIEEC47146.2019.CIEEC-2019725
2018 Ng WT, Yu J, Wang M, Li R, Zhang W, 'Design Trends in Smart Gate Driver ICs for Power GaN HEMTs', 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), People's Republic of China, Qingdao (2018)
2018 Zhang W, Thuc DL, Liu L, Li J, 'Estimating heterogeneous treatment effect by balancing heterogeneity and fitness', BMC Bioinformatics, People's Republic of China, Yunnan (2018) [E1]
DOI 10.1186/s12859-018-2521-7
Citations Scopus - 5, Web of Science - 2
2015 Li W, Pang J, Niu Q, Zhang W, 'Application of improved support vector machine based on shuffled frog leaping algorithm in wind-photovoltaic-battery power forecasting', 2015 7th International Conference on Intelligent Human-Machine Systems and Cybernetics IHMSC 2015, Vol II, People's Republic of China, Hangzhou (2015)
DOI 10.1109/IHMSC.2015.248
Citations Web of Science - 1
2015 Li W, Niu Q, Zhang W, Pang J, 'The Application of Spark in the Power Grid Intelligent Decision Analysis Platform', 2015 7th International Conference on Intelligent Human-Machine Systems and Cybernetics IHMSC 2015, Vol II, People's Republic of China, Hangzhou (2015)
DOI 10.1109/IHMSC.2015.200
2015 Li W, Zhang W, Pang J, Niu Q, 'Optimal Capacity Allocation of Large-scale Wind-PV-Battery Hybrid System', 2015 7th International Conference on Intelligent Human-Machine Systems and Cybernetics IHMSC 2015, Vol II, People's Republic of China, Hangzhou (2015)
DOI 10.1109/IHMSC.2015.251
Citations Web of Science - 1
2014 Zhang W-J, Zhou Z-H, 'Multi-Instance Learning with Distribution Change', Proceedings of the 28th AAAI Conference on Artificial Intelligence, Quebec City, Canada (2014)
Citations Scopus - 14, Web of Science - 9

Preprint (4 outputs)

Year Citation Altmetrics Link
2019 Tomasoni M, Gómez S, Crawford J, Zhang W, Choobdar S, Marbach D, Bergmann S, '
DOI 10.1101/611418
2018 Choobdar S, Ahsen M, Crawford J, Tomasoni M, Lamparter D, Lin J, et al., 'Open Community Challenge Reveals Molecular Network Modules with Key Roles in Diseases'
DOI 10.2139/ssrn.3188379
2018 Zhang W, Le T, Liu L, Li J, 'Estimating heterogeneous treatment effects by balancing heterogeneity and fitness' (2018)
DOI 10.1101/333278
2018 Xu T, Su N, Liu L, Zhang J, Wang H, Zhang W, et al., 'miRBaseConverter: An R/Bioconductor Package for Converting and Retrieving miRNA Name, Accession, Sequence and Family Information in Different Versions of miRBase' (2018)
DOI 10.1101/407148

Grants and Funding

Summary

Number of grants 5
Total funding $209,500


2024: 2 grants / $11,000

CESE Fellowship Accelerator $8,000

Funding body College of Engineering Science and Environment | the University of Newcastle | Australia
Scheme CESE Excellence Strategic Investment Scheme
Role Lead
Funding Start 2024
Funding Finish 2024
GNo
Type Of Funding Internal
Category INTE
UON N

CESE Conference Travel Scheme $3,000

Funding body College of Engineering, Science, & Environment (CESE), The University of Newcastle
Scheme Internal Competitive Schemes
Role Lead
Funding Start 2024
Funding Finish 2024
GNo
Type Of Funding Internal
Category INTE
UON N

2023: 1 grant / $10,000

UON Start-up support funding $10,000

Funding body College of Engineering, Science & Environment (CESE) Start-up Funding
Scheme College of Engineering, Science & Environment (CESE) Start-up Funding
Role Lead
Funding Start 2023
Funding Finish 2023
GNo
Type Of Funding Internal
Category INTE
UON N

2022: 1 grant / $63,500

Causality-based weakly supervised learning from inexact supervision $63,500

Funding body National Natural Science Foundation of China
Scheme National Natural Science Foundation of China
Role Lead
Funding Start 2022
Funding Finish 2022
GNo
Type Of Funding External
Category EXTE
UON N

2021: 1 grant / $125,000

Weakly supervised survival analysis for vehicle start-up battery failure prediction $125,000

Funding body Changan Automobile
Scheme Industry Funding
Role Lead
Funding Start 2021
Funding Finish 2022
GNo
Type Of Funding External
Category EXTE
UON N

Research Supervision

Number of supervisions

Completed 2
Current 8

Current Supervision

Commenced | Level of Study | Research Title | Program | Supervisor Type
2024 | PhD | Statistics and Machine Learning Methods for Response Prediction and Evaluation | PhD (Statistics), College of Engineering, Science and Environment, The University of Newcastle | Principal Supervisor
2022 | Masters | Attention-based weakly supervised object detection | Computer Science, Southeast University | Co-Supervisor
2022 | Masters | Survival analysis with competing risks | Computer Science, Southeast University | Co-Supervisor
2022 | Masters | Interpretable deep generative models | Computer Science, Southeast University | Co-Supervisor
2021 | Masters | Causal generative modeling of noisy data | Computer Science, Southeast University | Co-Supervisor
2021 | Masters | Weakly supervised survival analysis for histopathology images | Information Technology, Southeast University-Monash University joint graduate school | Co-Supervisor
2021 | Masters | Research on interpretable partial label learning algorithms | Computer Science, Southeast University-Monash University joint graduate school | Co-Supervisor
2021 | PhD | Research on multi-instance partial label learning algorithms | Computer Science, Southeast University | Co-Supervisor

Past Supervision

Year | Level of Study | Research Title | Program | Supervisor Type
2023 | Honours | Adversarial robustness of identifiable variational autoencoder | Artificial Intelligence, Southeast University | Sole Supervisor
2022 | Honours | Multi-instance learning for histopathology images | Computer Science, Southeast University | Sole Supervisor

Research Collaborations

The map is a representation of a researcher's co-authorship with collaborators across the globe. The map displays the number of publications against a country, where there is at least one co-author based in that country. Data is sourced from the University of Newcastle research publication management system (NURO) and may not fully represent the author's complete body of work.

Country Count of Publications
China 19
Australia 12
United States 4
Switzerland 3
Spain 3

Dr Weijia Zhang

Position

Lecturer in Data Science and Applied Statistics
School of Information and Physical Sciences
College of Engineering, Science and Environment

Focus area

Data Science and Statistics

Contact Details

Email weijia.zhang@newcastle.edu.au
Phone (02) 4055 0921

Office

Room SR114
Building Social Science Building
Location Callaghan
University Drive
Callaghan, NSW 2308
Australia