By mining a vast amount of software data, Associate Professor Hongyu Zhang is developing intelligent methods and tools that improve software quality and development productivity.

Hongyu Zhang

“Currently, software development is largely a manual, time-consuming, and error-prone process”, says Hongyu. “We can improve this process by learning from software that was written before.”

“Over the years, a large number of software systems have been developed. These software systems are associated with a variety of data such as source code, bugs, logs, incident reports, metric data, etc. The availability of vast amounts of software data opens the opportunity for us to improve software quality and productivity.”

Together with his collaborators and students, Hongyu has proposed many methods based on data mining, machine learning (including deep learning), and information retrieval to extract knowledge from software data and solve software engineering problems. Some of his work is described below:

Intelligent programming

To help programmers write code, Hongyu has proposed many innovative methods that learn from large amounts of source code for effective code search, code summarization, code generation, and code pattern mining. For example, he proposed one of the first deep-learning-based methods for source code search and API recommendation (FSE’16, ICSE’18), which can help programmers write new programs by searching for and reusing existing code. He also proposed neural programming by example (AAAI’17), which targets the challenging problem of automatically generating a program from input/output examples using a deep neural network.

Intelligent quality prediction

The quality of software is important. Hongyu has proposed many machine-learning-based methods for predicting defect-prone software modules. He has also worked on cloud failure prediction, which predicts future failures of a computing node or a hard disk in a large-scale cloud system based on historical system metrics and failure data (FSE’18). Hongyu also proposed DeepPerf (ICSE’19), which uses a deep feedforward neural network to predict the runtime performance of a highly configurable software system. It was the first time that a deep neural network had been successfully applied to software performance prediction.

Intelligent fault detection and diagnosis

Software systems always contain faults (bugs). Hongyu has proposed many innovative methods for log-based fault detection, crash-based fault localization, bug report analytics, and incident management. For example, he proposed BugLocator (ICSE’12), which automatically locates buggy source code files based on a bug report. He also works on data-driven methods for compiler testing, with the aim of making such testing more efficient.

Making real impact

Apart from scholarly publications, Hongyu is also keen to see research make an impact on practice. When Hongyu was working at Microsoft, he worked closely with the Bing and Visual Studio teams on the Bing Developer Assistant (BDA) project. BDA is a Visual Studio extension that allows developers to search for reusable code snippets based on queries. The BDA tool received more than 450K downloads in 2016. Hongyu has also been collaborating with Microsoft Research teams and has published many innovative techniques, which have been successfully deployed to real-world online service systems at Microsoft.

An independent 2019 Elsevier Bibliometric Assessment of Software Engineering Scholars ranks Hongyu among the world’s top 20 most prolific software engineering researchers of the past decade. He has also been recognised in The Australian’s Top Researchers special edition publication (09/2020) as the leading researcher in the field of Software Systems.
