2025 |
Gui Y, Li Z, Wan Y, Shi Y, Zhang H, Chen B, et al., 'WebCode2M: A Real-World Dataset for Code Generation from Webpage Designs', Proceedings of the ACM on Web Conference 2025 (2025)
|
|
|
2025 |
Gui Y, Wan Y, Li Z, Zhang Z, Chen D, Zhang H, et al., 'UIC
|
|
|
2025 |
Xie Y, Zhang H, Babar MA, 'Multivariate Time Series Anomaly Detection by Capturing Coarse-Grained Intra- and Inter-Variate Dependencies', Proceedings of the ACM on Web Conference 2025 (2025)
|
|
|
2024 |
Liu C, Zhang X, Zhang H, Wan Z, Huang Z, Yan M, 'An Empirical Study of Code Search in Intelligent Coding Assistant: Perceptions, Expectations, and Directions', Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (2024) [E1]
|
|
|
2024 |
Liu Y, Zhang H, Le V-H, Miao Y, Li Z, 'Local Search-based Approach for Cost-effective Job Assignment on Large Language Models', Proceedings of the Genetic and Evolutionary Computation Conference Companion (2024) [E1]
|
|
Nova |
2024 |
Li X, Zhang H, Le V-H, Chen P, 'LogShrink: Effective Log Compression by Leveraging Commonality and Variability of Log Data', Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (2024) [E1]
|
|
Nova |
2024 |
Qi B, Sun H, Zhang H, Zhao R, Gao X, 'Modularizing while Training: A New Paradigm for Modularizing DNN Models', Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (2024) [E1]
|
|
|
2024 |
Xu Z, Qiang S, Song D, Zhou M, Wan H, Zhao X, et al., 'DSFM: Enhancing Functional Code Clone Detection with Deep Subtree Interactions', Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (2024) [E1]
|
|
|
2024 |
Gao Y, He Y, Li X, Zhao B, Lin H, Liang Y, et al., 'An Empirical Study on Low GPU Utilization of Deep Learning Jobs', Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (2024) [E1]
|
|
|
2024 |
Liu Y, Zhang H, Li Z, Miao Y, 'Optimizing the Utilization of Large Language Models via Schedule Optimization: An Exploratory Study', International Symposium on Empirical Software Engineering and Measurement, Barcelona, Spain (2024)
|
|
Nova |
2024 |
Xiao Y, Le VH, Zhang H, 'Demonstration-Free: Towards More Practical Log Parsing with Large Language Models', Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, California, USA (2024) [E1]
|
|
|
2024 |
Sun Z, Wan Y, Li J, Zhang H, Jin Z, Li G, Lyu C, 'Sifting through the Chaff: On Utilizing Execution Feedback for Ranking the Generated Code Candidates', Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, Sacramento, CA (2024) [E1]
|
|
|
2024 |
Pham L, Ha H, Zhang H, 'Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?', Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, Sacramento, CA (2024) [E1]
|
|
|
2024 |
Liu Y, Zhang H, Miao Y, Le VH, Li Z, 'OptLLM: Optimal Assignment of Queries to Large Language Models', Proceedings of the IEEE International Conference on Web Services, ICWS, Shenzhen, China (2024) [E1]
|
|
|
2024 |
Liao Y, Xu M, Lin Y, Teoh X, Xie X, Feng R, et al., 'Detecting and Explaining Anomalies Caused by Web Tamper Attacks via Building Consistency-based Normality', Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, Sacramento, CA (2024) [E1]
|
|
|
2024 |
Luo C, Lyu S, Zhao Q, Wu W, Zhang H, Hu C, 'Beyond Pairwise Testing: Advancing 3-wise Combinatorial Interaction Testing for Highly Configurable Systems', Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria (2024) [E1]
|
|
|
2024 |
Guo L, Wang Y, Shi E, Zhong W, Zhang H, Chen J, et al., 'When to Stop? Towards Efficient Code Generation in LLMs with Excess Token Prevention', Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria (2024) [E1]
|
|
|
2024 |
Chen Y, Gao C, Yang Z, Zhang H, Liao Q, 'Bridge and Hint: Extending Pre-trained Language Models for Long-Range Code', Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria (2024) [E1]
|
|
|
2024 |
Chu Z, Wan Y, Li Q, Wu Y, Zhang H, Sui Y, et al., 'Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation', Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria (2024) [E1]
|
|
|
2024 |
Nguyen HT, Nguyen LV, Le VH, Zhang H, Le MT, 'Efficient Log-based Anomaly Detection with Knowledge Distillation', Proceedings of the IEEE International Conference on Web Services, ICWS, Shenzhen, China (2024) [E1]
|
|
|
2023 |
Gao Y, Shi X, Lin H, Zhang H, Wu H, Li R, Yang M, 'An Empirical Study on Quality Issues of Deep Learning Platform', 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP, Melbourne, Australia (2023) [E1]
|
|
|
2023 |
Le V-H, Zhang H, 'Log Parsing: How Far Can ChatGPT Go?', 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE, Echternach, Luxembourg (2023) [E1]
|
|
Nova |
2023 |
Shi E, Wang Y, Zhang H, Du L, Han S, Zhang D, Sun H, 'Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond', Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2023, Seattle, WA (2023)
|
|
|
2023 |
Liu J, He S, Chen Z, Li L, Kang Y, Zhang X, et al., 'Incident-aware Duplicate Ticket Aggregation for Cloud Systems', 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE, Melbourne, Australia (2023) [E1]
|
|
|
2023 |
Zhao Q, Luo C, Cai S, Wu W, Lin J, Zhang H, Hu C, 'CAmpactor: A Novel and Effective Local Search Algorithm for Optimizing Pairwise Covering Arrays', ESEC/FSE 2023 - Proceedings of the 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA (2023) [E1]
|
|
|
2023 |
Lin Q, Li T, Zhao P, Liu Y, Ma M, Zheng L, et al., 'EDITS: An Easy-to-difficult Training Strategy for Cloud Failure Prediction', ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023, Austin, Texas (2023) [E1]
|
|
|
2023 |
Shi E, Wang Y, Gu W, Du L, Zhang H, Han S, et al., 'CoCoSoDa: Effective Contrastive Learning for Code Search', 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE, Melbourne, Australia (2023) [E1]
|
|
|
2023 |
Le V-H, Zhang H, 'Log Parsing with Prompt-based Few-shot Learning', 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE, Melbourne, Australia (2023) [E1]
|
|
Nova |
2023 |
Li L, Zhang X, He S, Kang Y, Zhang H, Ma M, et al., 'CONAN: Diagnosing Batch Failures for Cloud Systems', 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP, Melbourne, Australia (2023) [E1]
|
|
Nova |
2023 |
Xu Z, Zhou M, Zhao X, Chen Y, Cheng X, Zhang H, 'xASTNN: Improved Code Representations for Industrial Practice', ESEC/FSE 2023 - Proceedings of the 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA (2023) [E1]
|
|
|
2023 |
Gao Y, Gu X, Zhang H, Lin H, Yang M, 'Runtime Performance Prediction for Deep Learning Models with Graph Neural Network', 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP, Melbourne, Australia (2023)
|
|
|
2023 |
Wang Y, Wang J, Zhang H, Wang K, Wang Q, 'What are Pros and Cons? Stance Detection and Summarization on Feature Request', International Symposium on Empirical Software Engineering and Measurement, New Orleans, LA (2023) [E1]
|
|
|
2023 |
Feng Q, Sui Y, Zhang H, 'Uncovering Limitations in Text-to-Image Generation: A Contrastive Approach with Structured Semantic Alignment', Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore (2023) [E1]
|
|
2023 |
Qiao S, Zhou W, Wen J, Zhang H, Gao M, 'Bi-channel Multiple Sparse Graph Attention Networks for Session-based Recommendation', International Conference on Information and Knowledge Management, Proceedings, Birmingham, UK (2023) [E1]
|
|
|
2023 |
Hu F, Wang Y, Du L, Li X, Zhang H, Han S, Zhang D, 'Revisiting Code Search in a Two-Stage Paradigm', WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining, Singapore (2023) [E1]
|
|
Nova |
2022 |
Song X, Yan J, Huang Y, Sun H, Zhang H, 'A Collaboration-Aware Approach to Profiling Developer Expertise with Cross-Community Data', 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security, QRS, Guangzhou, China (2022) [E1]
|
|
Nova |
2022 |
Liu Y, Zhang X, He S, Zhang H, Li L, Kang Y, et al., 'UniParser: A Unified Log Parser for Heterogeneous Log Data', WWW 2022: Proceedings of the ACM Web Conference 2022, Lyon, France (2022) [E1]
|
|
Nova |
2022 |
Wang X, Wu Q, Zhang H, Lyu C, Jiang X, Zheng Z, et al., 'HELoC: Hierarchical Contrastive Learning of Source Code Representation', IEEE International Conference on Program Comprehension, Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Tang W, Wang Y, Zhang H, Han S, Luo P, Zhang D, 'LibDB: An Effective and Efficient Framework for Detecting Third-Party Libraries in Binaries', Proceedings: 2022 Mining Software Repositories Conference (MSR 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Gui Y, Wan Y, Zhang H, Huang H, Sui Y, Xu G, et al., 'Cross-Language Binary-Source Code Matching with Intermediate Representations', Proceedings: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022), Honolulu, HI (2022) [E1]
|
|
Nova |
2022 |
Chen Z, Liu J, Su Y, Zhang H, Ling X, Yang Y, Lyu MR, 'Adaptive Performance Anomaly Detection for Online Service Systems via Pattern Sketching', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Chai Y, Zhang H, Shen B, Gu X, 'Cross-Domain Deep Code Search with Meta Learning', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Meng X, Wang X, Zhang H, Sun H, Liu X, 'Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Le V-H, Zhang H, 'Log-based Anomaly Detection with Deep Learning: How Far Are We?', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Shi E, Wang Y, Du L, Chen J, Han S, Zhang H, et al., 'On the Evaluation of Neural Code Summarization', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Gao Y, Li Z, Lin H, Zhang H, Wu M, Yang M, 'REFTY: Refinement Types for Valid Deep Learning Models', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Wan Y, Zhao W, Zhang H, Sui Y, Xu G, Jin H, 'What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Wan Y, He Y, Bi Z, Zhang J, Sui Y, Zhang H, et al., 'NATURALCC: An Open-Source Toolkit for Code Intelligence', 2022 ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion 2022), Pittsburgh, PA (2022) [E1]
|
|
|
2022 |
Gu W, Wang Y, Du L, Zhang H, Han S, Zhang D, Lyu MR, 'Accelerating Code Search with Deep Hashing and Code Classification', Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Volume 1: Long Papers, Dublin, Ireland (2022) [E1]
|
|
Nova |
2022 |
Xie Y, Zhang H, Babar MA, 'LogGD: Detecting Anomalies from System Logs with Graph Neural Networks', 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security, QRS, Guangzhou, China (2022) [E1]
|
|
Nova |
2022 |
Liu Y, Yang H, Zhao P, Ma M, Wen C, Zhang H, et al., 'Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC (2022) [E1]
|
|
Nova |
2022 |
Ma M, Liu Y, Tong Y, Li H, Zhao P, Xu Y, et al., 'An empirical investigation of missing data handling in cloud node failure prediction', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Wang C, Yang Y, Gao C, Peng Y, Zhang H, Lyu MR, 'No more fine-tuning? an experimental evaluation of prompt tuning in code intelligence', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Wan Y, Zhang S, Zhang H, Sui Y, Xu G, Yao D, et al., 'You see what I want you to see: poisoning vulnerabilities in neural code search', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Zhang Z, Zhang H, Shen B, Gu X, 'Diet code is healthy: simplifying programs for pre-trained models of code', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Luo C, Zhao Q, Cai S, Zhang H, Hu C, 'SamplingCA: effective and efficient sampling-based pairwise testing for highly configurable software systems', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Wang X, Zhang X, Li L, He S, Zhang H, Liu Y, et al., 'SPINE: a scalable log parser with feedback guidance', ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore (2022) [E1]
|
|
Nova |
2022 |
Li H, Miao C, Leung C, Huang Y, Huang Y, Zhang H, Wang Y, 'Exploring Representation-Level Augmentation for Code Search', Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates (2022) [E1]
|
|
Nova |
2022 |
Shi E, Wang Y, Tao W, Du L, Zhang H, Han S, et al., 'RACE: Retrieval-Augmented Commit Message Generation', Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates (2022) [E1]
|
|
Nova |
2022 |
Wang L, Zhao P, Du C, Luo C, Su M, Yang F, et al., 'NENYA: Cascade Reinforcement Learning for Cost-Aware Failure Mitigation at Microsoft 365', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, USA (2022) [E1]
|
|
Nova |
2022 |
Wang Y, Wang J, Zhang H, Ming X, Shi L, Wang Q, 'Where is Your App Frustrating Users?', 2022 ACM/IEEE 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA (2022) [E1]
|
|
Nova |
2022 |
Qi B, Sun H, Gao X, Zhang H, 'Patching Weak Convolutional Neural Network Models through Modularization and Composition', ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, Michigan, USA (2022) [E1]
|
|
Nova |
2021 |
Xie Y, Zhang H, Zhang B, Babar MA, Lu S, 'LogDP: Combining Dependency and Proximity for Log-Based Anomaly Detection', Service-Oriented Computing: 19th International Conference, ICSOC 2021, Virtual Event, November 22-25, 2021, Proceedings, Virtual (2021) [E1]
|
|
Nova |
2021 |
Luo C, Qiao B, Xing W, Chen X, Zhao P, Chao D, et al., 'Correlation-Aware Heuristic Search for Intelligent Virtual Machine Provisioning in Cloud Systems', Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence AAAI 2021, Virtual (2021) [E1]
|
|
Nova |
2021 |
Gao Y, Zhu Y, Zhang H, Lin H, Yang M, 'Resource-Guided Configuration Space Reduction for Deep Learning Models', 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE) (2021) [E1]
|
|
Nova |
2021 |
Luo C, Zhao P, Chen C, Qiao B, Du C, Zhang H, et al., 'PULNS: Positive-Unlabeled Learning with Effective Negative Sample Selector', Proceedings of the AAAI Conference on Artificial Intelligence, Virtual (2021) [E1]
|
|
Nova |
2021 |
Le VH, Zhang H, 'Log-based Anomaly Detection Without Log Parsing', Proceedings - 2021 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, Melbourne, Australia (2021) [E1]
|
|
Nova |
2021 |
Tao W, Wang Y, Shi E, Du L, Han S, Zhang H, et al., 'On the Evaluation of Commit Message Generation Models: An Experimental Study', Proceedings - 2021 IEEE International Conference on Software Maintenance and Evolution, ICSME 2021, Luxembourg (2021) [E1]
|
|
Nova |
2021 |
Zhang X, Du C, Li Y, Xu Y, Zhang H, Qin S, et al., 'HALO: Hierarchy-aware Fault Localization for Cloud Systems', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Virtual, Singapore (2021) [E1]
|
|
Nova |
2021 |
Zhang X, Xu Y, Qin S, He S, Qiao B, Li Z, et al., 'Onion: Identifying incident-indicating logs for cloud systems', ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece (2021) [E1]
|
|
Nova |
2021 |
Qiao B, Yang F, Luo C, Wang Y, Li J, Lin Q, et al., 'Intelligent container reallocation at Microsoft 365', ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece (2021) [E1]
|
|
Nova |
2021 |
Luo C, Sun B, Qiao B, Chen J, Zhang H, Lin J, et al., 'LS-sampling: An effective local search based sampling approach for achieving high t-wise coverage', ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece (2021) [E1]
|
|
Nova |
2021 |
Dong H, Qin S, Xu Y, Qiao B, Zhou S, Yang X, et al., 'Effective low capacity status prediction for cloud systems', ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece (2021) [E1]
|
|
Nova |
2021 |
Wu D, Jing XY, Zhang H, Zhou Y, Xu B, 'Leveraging Stack Overflow to Detect Relevant Tutorial Fragments of APIs', Proceedings - 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2021, Honolulu, HI (2021) [E1]
|
|
Nova |
2021 |
Li L, Zhang X, Zhao X, Zhang H, Kang Y, Zhao P, et al., 'Fighting the Fog of War: Automated Incident Detection for Cloud Systems', Proceedings of the 2021 USENIX Annual Technical Conference, Virtual (2021) [E1]
|
|
Nova |
2021 |
Gu X, Han YS, Kim S, Zhang H, 'Do bugs propagate? an empirical analysis of temporal correlations among software bugs', 35th European Conference on Object-Oriented Programming. Leibniz International Proceedings in Informatics, Aarhus, Denmark (2021) [E1]
|
|
Nova |
2021 |
Luo C, Qiao B, Chen X, Zhao P, Yao R, Zhang H, et al., 'Intelligent Virtual Machine Provisioning in Cloud Computing', Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020, Yokohama, Japan (2021) [E1]
|
|
Nova |
2021 |
Luo C, Zhao P, Qiao B, Wu Y, Zhang H, Wu W, et al., 'NTAM: Neighborhood-temporal attention model for disk failure prediction in cloud platforms', The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021, Ljubljana, Slovenia (2021) [E1]
|
|
Nova |
2021 |
Chen Z, Liu J, Su Y, Zhang H, Wen X, Ling X, et al., 'Graph-based Incident Aggregation for Large-Scale Online Service Systems', Proceedings - 2021 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, Melbourne, Australia (2021) [E1]
|
|
Nova |
2021 |
Wang W, Chen J, Yang L, Zhang H, Zhao P, Qiao B, et al., 'How Long Will it Take to Mitigate this Incident for Online Service Systems?', Proceedings - International Symposium on Software Reliability Engineering, ISSRE, Wuhan, China (2021) [E1]
|
|
Nova |
2021 |
Wang Y, Li G, Wang Z, Kang Y, Zhou Y, Zhang H, et al., 'Fast Outage Analysis of Large-Scale Production Clouds with Service Correlation Mining', 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, ES (2021) [E1]
|
|
Nova |
2021 |
Luo C, Lin J, Cai S, Chen X, He B, Qiao B, et al., 'AutoCCAG: An Automated Approach to Constrained Covering Array Generation', 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, ES (2021) [E1]
|
|
Nova |
2021 |
Chen J, Xu N, Chen P, Zhang H, 'Efficient Compiler Autotuning via Bayesian Optimization', 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, ES (2021) [E1]
|
|
Nova |
2021 |
Shi E, Wang Y, Du L, Zhang H, Han S, Zhang D, Sun H, 'CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees', EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, Online and Punta Cana, Dominican Republic (2021) [E1]
|
|
Nova |
2021 |
Kang Y, Wang Z, Zhang H, Chen J, You H, 'APIRecX: Cross-Library API Recommendation via Pre-Trained Language Model', EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, Punta Cana, Dominican Republic (2021) [E1]
|
|
Nova |
2020 |
Mirjalili S, Zhang H, Mirjalili S, Chalup S, Noman N, 'A Novel U-Shaped Transfer Function for Binary Particle Swarm Optimisation', Soft Computing for Problem Solving 2019. Proceedings of SocProS 2019, Liverpool, UK (2020) [E1]
|
|
Nova |
2020 |
Zhou J, Li F, Dong J, Zhang H, Hao D, 'Cost-Effective Testing of a Deep Learning Model through Input Reduction', 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), Coimbra, Portugal (2020) [E1]
|
|
Nova |
2020 |
Xu Y, Sui K, Yao R, Zhang H, Lin Q, Dang Y, et al., 'Improving service availability of cloud systems by predicting disk error', Proceedings of the 2018 USENIX Annual Technical Conference, USENIX ATC 2018 (2020)
High service availability is crucial for cloud systems. A typical cloud system uses a large number of physical hard disk drives. Disk errors are one of the most important reasons that lead to service unavailability. Disk error (such as sector error and latency error) can be seen as a form of gray failure, which are fairly subtle failures that are hard to be detected, even when applications are afflicted by them. In this paper, we propose to predict disk errors proactively before they cause more severe damage to the cloud system. The ability to predict faulty disks enables the live migration of existing virtual machines and allocation of new virtual machines to the healthy disks, therefore improving service availability. To build an accurate online prediction model, we utilize both disk-level sensor (SMART) data as well as system-level signals. We develop a cost-sensitive ranking-based machine learning model that can learn the characteristics of faulty disks in the past and rank the disks based on their error-proneness in the near future. We evaluate our approach using real-world data collected from a production cloud system. The results confirm that the proposed approach is effective and outperforms related methods. Furthermore, we have successfully applied the proposed approach to improve service availability of Microsoft Azure.
|
|
|
2020 |
Zhang B, Zhang H, Moscato P, Zhang A, 'Anomaly Detection via Mining Numerical Workflow Relations from Logs', 2020 International Symposium on Reliable Distributed Systems (SRDS), online (2020) [E1]
|
|
Nova |
2020 |
Shu Y, Sui Y, Zhang H, Xu G, 'Perf-AL: Performance Prediction for Configurable Software through Adversarial Learning', Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Online (2020) [E1]
|
|
Nova |
2020 |
Zhang J, Wang X, Zhang H, Sun H, Pu Y, Liu X, 'Learning to handle exceptions', Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (2020) [E1]
|
|
Nova |
2020 |
Zhang R, Xiao W, Zhang H, Liu Y, Lin H, Yang M, 'An Empirical Study on Program Failures of Deep Learning Jobs', Proceedings of the 2020 ACM/IEEE 42nd International Conference on Software Engineering (ICSE), Seoul, South Korea (2020) [E1]
|
|
Nova |
2020 |
Zhang J, Wang X, Zhang H, Sun H, Liu X, 'Retrieval-Based Neural Source Code Summarization', Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, Seoul, South Korea (2020) [E1]
|
|
Nova |
2020 |
Chen Y, Yang X, Dong H, He X, Zhang H, Lin Q, et al., 'Identifying Linked Incidents in Large-Scale Online Service Systems', Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, online (2020) [E1]
|
|
Nova |
2020 |
Gao Y, Liu Y, Zhang H, Li Z, Zhu Y, Lin H, Yang M, 'Estimating GPU Memory Consumption of Deep Learning Models', Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, online (2020) [E1]
|
|
Nova |
2020 |
Chen Z, Kang Y, Li L, Zhang X, Zhang H, Xu H, et al., 'Towards Intelligent Incident Management: Why We Need It and How We Make It', Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, online (2020) [E1]
|
|
Nova |
2020 |
Jiang J, Lu W, Chen J, Lin Q, Zhao P, Kang Y, et al., 'How to Mitigate the Incident? An Effective Troubleshooting Guide Recommendation Technique for Online Service Systems', Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, online (2020) [E1]
|
|
Nova |
2020 |
Gu J, Luo C, Qin S, Qiao B, Lin Q, Zhang H, et al., 'Efficient Incident Identification from Multi-Dimensional Issue Reports via Meta-Heuristic Search', Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, online (2020) [E1]
|
|
Nova |
2020 |
Chen J, Zhang S, He X, Lin Q, Zhang H, Hao D, et al., 'How Incidental are the Incidents? Characterizing and Prioritizing Incidents for Large-Scale Online Service Systems', ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, online (2020) [E1]
|
|
Nova |
2019 |
Zhang X, Xu Y, Lin Q, Qiao B, Zhang H, Dang Y, et al., 'Robust Log-based Anomaly Detection on Unstable Log Data', Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia (2019) [E1]
|
|
Nova |
2019 |
Chen J, He X, Lin Q, Zhang H, Hao D, Gao F, et al., 'Continuous Incident Triage for Large-Scale Online Service Systems', 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA (2019) [E1]
|
|
Nova |
2019 |
Gu X, Zhang H, Kim S, 'CodeKernel: A Graph Kernel Based Approach to the Selection of API Usage Examples', 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA (2019) [E1]
|
|
Nova |
2019 |
Chen J, Wang G, Hao D, Xiong Y, Zhang H, Zhang L, 'History-guided configuration diversification for compiler test-program generation', Proceedings of the 34th International Conference on Automated Software Engineering, San Diego, CA (2019) [E1]
|
|
Nova |
2019 |
Lin J, Cai S, Luo C, Lin Q, Zhang H, 'Towards More Efficient Meta-heuristic Algorithms for Combinatorial Test Generation', Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia (2019) [E1]
|
|
Nova |
2019 |
Zhang X, Lin Q, Xu Y, Qin S, Zhang H, Qiao B, et al., 'Cross-dataset Time Series Anomaly Detection for Cloud Systems', Proceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference, Renton, WA (2019) [E1]
|
|
Nova |
2019 |
Chen J, He X, Lin Q, Xu Y, Zhang H, Hao D, et al., 'An Empirical Investigation of Incident Triage for Online Service Systems', 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Montreal, Canada (2019) [E1]
|
|
Nova |
2019 |
Luo C, Hoos HH, Cai S, Lin Q, Zhang H, Zhang D, 'Local Search with Efficient Automatic Configuration for Minimum Vertex Cover', Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, Macau, China (2019) [E1]
|
|
Nova |
2019 |
Zhang J, Wang X, Zhang H, Sun H, Wang K, Liu X, 'A Novel Neural Source Code Representation Based on Abstract Syntax Tree', Proceedings of the 41st International Conference on Software Engineering, Montreal, Canada (2019) [E1]
|
|
Nova |
2019 |
Ha H, Zhang H, 'DeepPerf: performance prediction for configurable software with deep sparse neural network', Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, Canada (2019) [E1]
|
|
Nova |
2019 |
Chen X, Qiao B, Zhang W, Wu W, Chintalapati M, Zhang D, et al., 'Neural feature search: A neural architecture for automated feature engineering', Proceedings - IEEE International Conference on Data Mining, ICDM, Beijing, China (2019) [E1]
|
|
Nova |
2019 |
Zhang B, Zhang H, Chen J, Hao D, Moscato P, 'Automatic Discovery and Cleansing of Numerical Metamorphic Relations', 2019 IEEE International Conference on Software Maintenance and Evolution, ICSME 2019, Cleveland, OH (2019) [E1]
|
|
Nova |
2019 |
Li C, Zhou M, Gu Z, Gu M, Zhang H, 'Ares: Inferring error specifications through static analysis', Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA (2019) [E1]
|
|
Nova |
2019 |
Ha H, Zhang H, 'Performance-Influence Model for Highly Configurable Software with Fourier Learning and Lasso Regression', Proceedings - 2019 IEEE International Conference on Software Maintenance and Evolution, ICSME 2019, Cleveland, OH (2019) [E1]
|
|
Nova |
2019 |
Chen Y, Zhang H, Yang X, Lin Q, Zhang D, Dong H, et al., 'Outage Prediction and Diagnosis for Cloud Service Systems', The Web Conference. Proceedings of The World Wide Web Conference WWW 2019, San Francisco, CA (2019) [E1]
|
|
Nova |
2019 |
Zhang B, Zhang H, Chen J, Hao D, Moscato P, 'AutoMR: Automatic Discovery and Cleansing of Numerical Metamorphic Relations', 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME 2019), Cleveland, OH (2019)
|
|
|
2018 |
Lin Q, Hsieh K, Dang Y, Zhang H, Sui K, Xu Y, et al., 'Predicting node failure in cloud service systems', ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Lake Buena Vista, FL (2018) [E1]
|
|
Nova |
2018 |
He S, Lin Q, Lou J-G, Zhang H, Lyu MR, Zhang D, 'Identifying impactful service system problems via log analysis', ESEC/FSE 2018 - Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Lake Buena Vista, FL, USA (2018) [E1]
|
|
Nova |
2018 |
Jiang J, Xiong Y, Zhang H, Gao Q, Chen X, 'Shaping program repair space with existing patches and similar code', ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, Amsterdam, Netherlands (2018) [E1]
|
|
Nova |
2018 |
Tonelli R, Ducasse S, Fenu G, Bracciali A, Amaral V, Arcelli F, et al., 'Message from the chairs', 2018 IEEE 1st International Workshop on Blockchain Oriented Software Engineering, IWBOSE 2018 - Proceedings (2018)
|
|
|
2018 |
Abreu R, Zhang H, 'Message from the QRS 2018 program chairs', Proceedings - 2018 IEEE 18th International Conference on Software Quality, Reliability, and Security, QRS 2018 (2018)
|
|
|
2018 |
Abreu R, Zhang H, 'Message from the QRS 2018 Program Chairs', Proceedings - 2018 IEEE 18th International Conference on Software Quality, Reliability, and Security Companion, QRS-C 2018 (2018)
|
|
|
2018 |
Galster M, Zhang H, 'Message from the ASWEC 2018: Short research paper program committee chairs', Proceedings - 25th Australasian Software Engineering Conference, ASWEC 2018 (2018)
|
|
|
2018 |
Washizaki H, Zhang H, 'Message from the APSEC 2018 Program Co-Chairs', Proceedings - Asia-Pacific Software Engineering Conference, APSEC (2018)
|
|
|
2018 |
Lin Q, Ke W, Lou JG, Zhang H, Sui K, Xu Y, et al., 'BigIN4: Instant, interactive insight identification for multi-dimensional big data', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK (2018) [E1]
|
|
Nova |
2018 |
Xu Y, Sui K, Yao R, Zhang H, Lin Q, Dang Y, et al., 'Improving Service Availability of Cloud Systems by Predicting Disk Error', Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC 18), Boston, MA (2018) [E1]
|
|
Nova |
2018 |
Barbar M, Sui Y, Zhang H, Chen S, Xue J, 'Live Path CFI Against Control Flow Hijacking Attacks', Information Security and Privacy: 23rd Australasian Conference, ACISP 2018, Wollongong, NSW (2018) [E1]
|
|
Nova |
2018 |
Barbar M, Sui Y, Zhang H, Chen S, Xue J, 'Poster: Live Path Control Flow Integrity', PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING - COMPANION (ICSE-COMPANION), Gothenburg, SWEDEN (2018)
|
|
|
2018 |
Wu R, Wen M, Cheung S-C, Zhang H, 'ChangeLocator: Locate Crash-Inducing Changes Based on Crash Reports', PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Gothenburg, SWEDEN (2018)
|
|
|
2018 |
Gu X, Zhang H, Kim S, 'Deep code search', ICSE '18 Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden (2018) [E1]
|
|
Nova |
2017 |
Li Z, Jing X, Zhu X, Zhang H, 'Heterogeneous Defect Prediction Through Multiple Kernel Learning and Ensemble Learning', 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), Shanghai, China (2017) [E1]
|
|
Nova |
2017 |
Gu X, Zhang H, Zhang D, Kim S, 'DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning', Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, August 19-25, 2017, Melbourne, Australia (2017) [E1]
|
|
Nova |
2017 |
Chen J, Bai Y, Hao D, Xiong Y, Zhang H, Xie B, 'Learning to prioritize test programs for compiler testing', ICSE'17 Proceedings of the 39th International Conference on Software Engineering, Buenos Aires, Argentina (2017) [E1]
|
|
Nova |
2017 |
Shu C, Zhang H, 'Neural Programming by Example', Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA (2017) [E1]
|
|
Nova |
2016 |
Chen J, Hu W, Hao D, Xiong Y, Zhang H, Zhang L, Xie B, 'An empirical comparison of compiler testing techniques', Proceedings of the 38th International Conference on Software Engineering, Austin, TX (2016) [E1]
|
|
Nova |
2016 |
Lin Q, Lou J-G, Zhang H, Zhang D, 'iDice: Problem Identification for Emerging Issues', 2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Austin, TX (2016) [E1]
|
|
Nova |
2016 |
Wu R, Xiao X, Cheung S-C, Zhang H, Zhang C, 'Casper: An Efficient Approach to Call Trace Collection', ACM SIGPLAN NOTICES, St Petersburg, FL (2016) [E1]
|
|
Nova |
2016 |
Zhou M, Cheng X, Guo X, Gu M, Zhang H, Song X, 'Improving Failure Detection by Automatically Generating Test Cases Near the Boundaries', PROCEEDINGS 2016 IEEE 40TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE WORKSHOPS, VOL 1, Atlanta, GA (2016) [E1]
|
|
|
2016 |
Chen J, Bai Y, Hao D, Xiong Y, Zhang H, Zhang L, Xie B, 'Test Case Prioritization for Compilers: A Text-Vector Based Approach', 2016 9TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST), Chicago, IL (2016) [E1]
|
|
|
2016 |
Lin Q, Zhang H, Lou J-G, Zhang Y, Chen X, 'Log Clustering based Problem Identification for Online Service Systems', 2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING COMPANION (ICSE-C), Austin, TX (2016) [E1]
|
|
Nova |
2016 |
Gu X, Zhang H, Zhang D, Kim S, 'Deep API Learning', FSE'16: PROCEEDINGS OF THE 2016 24TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON FOUNDATIONS OF SOFTWARE ENGINEERING, Seattle, WA (2016) [E1]
|
|
Nova |
2016 |
Zhang H, Jain A, Khandelwal G, Kaushik C, Ge S, Hu W, 'Bing Developer Assistant: Improving Developer Productivity by Recommending Sample Code', FSE'16: PROCEEDINGS OF THE 2016 24TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON FOUNDATIONS OF SOFTWARE ENGINEERING, Seattle, WA (2016) [E1]
|
|
Nova |
2015 |
Ding S, Tan HBK, Zhang H, 'ABOR: An Automatic Framework for Buffer Overflow Removal in C/C++ Programs', ENTERPRISE INFORMATION SYSTEMS, ICEIS 2014, Lisbon, PORTUGAL (2015) [E1]
|
|
|
2015 |
Zhu J, He P, Fu Q, Zhang H, Lyu MR, Zhang D, 'Learning to Log: Helping Developers Make Informed Logging Decisions', 2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 1, Florence, ITALY (2015) [E1]
|
|
Nova |
2015 |
Zhou H, Lou J-G, Zhang H, Lin H, Lin H, Qin T, 'An Empirical Study on Quality Issues of Production Big Data Platform', 2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 2, Florence, ITALY (2015) [E1]
|
|
Nova |
2015 |
Ding R, Zhou H, Lou JG, Zhang H, Lin Q, Fu Q, et al., 'Log2: A cost-aware logging mechanism for performance diagnosis', Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015 (2015) [E1]
Logging has been a common practice for monitoring and diagnosing performance issues. However, logging comes at a cost, especially for large-scale online service systems. First, the overhead incurred by intensive logging is non-negligible. Second, it is costly to diagnose a performance issue if there are a tremendous amount of redundant logs. Therefore, we believe that it is important to limit the overhead incurred by logging, without sacrificing the logging effectiveness. In this paper we propose Log2, a cost-aware logging mechanism. Given a "budget" (defined as the maximum volume of logs allowed to be output in a time interval), Log2 makes the "whether to log" decision through a two-phase filtering mechanism. In the first phase, a large number of irrelevant logs are discarded efficiently. In the second phase, useful logs are cached and output while complying with logging budget. In this way, Log2 keeps the useful logs and discards the less useful ones. We have implemented Log2 and evaluated it on an open source system as well as a real-world online service system from Microsoft. The experimental results show that Log2 can control logging overhead while preserving logging effectiveness.
|
|
Nova |
2015 |
Lv F, Zhang H, Lou J-G, Wang S, Zhang D, Zhao J, 'CodeHow: Effective Code Search based on API Understanding and Extended Boolean Model', 2015 30TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), Lincoln, NE (2015) [E1]
|
|
Nova |
2014 |
'Proceedings of the 9th International Workshop on Advanced Modularization Techniques, AOAsia 2014, Hong Kong, China, November 16, 2014', AOAsia@SIGSOFT FSE (2014) |
|
|
2014 |
Liu K, Tan HBK, Zhang H, 'Mining key and referential constraints enforcement patterns.', SAC (2014) |
|
|
2014 |
'Proceedings of the 5th International Workshop on Emerging Trends in Software Metrics, WETSoM 2014, Hyderabad, India, June 3, 2014', WETSoM (2014) |
|
|
2014 |
Ding S, Tan HBK, Zhang H, 'Automatic Removal of Buffer Overflow Vulnerabilities in C/C++ Programs.', ICEIS (2) (2014) |
|
|
2014 |
Wong C-P, Xiong Y, Zhang H, Hao D, Zhang L, Mei H, 'Boosting Bug-Report-Oriented Fault Localization with Segmentation and Stack-Trace Analysis', 2014 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), Victoria, CANADA (2014) [E1]
|
|
Nova |
2014 |
Ding S, Zhang H, Tan HBK, 'Detecting Infeasible Branches Based on Code Patterns', 2014 SOFTWARE EVOLUTION WEEK - IEEE CONFERENCE ON SOFTWARE MAINTENANCE, REENGINEERING, AND REVERSE ENGINEERING (CSMR-WCRE), Antwerp, BELGIUM (2014) [E1]
|
|
Nova |
2014 |
Counsell S, Marchesi M, Venkatasubramanyam R, Visaggio A, Zhang H, 'Message from the Chairs', 5th International Workshop on Emerging Trends in Software Metrics, WETSoM 2014 - Proceedings (2014) |
|
|
2014 |
Wu R, Zhang H, Cheung SC, Kim S, 'Crashlocator: Locating crashing faults based on crash stacks', 2014 International Symposium on Software Testing and Analysis, ISSTA 2014 - Proceedings (2014) [E1]
Software crash is common. When a crash occurs, software developers can receive a report upon user permission. A crash report typically includes a call stack at the time of crash. An important step of debugging a crash is to identify faulty functions, which is often a tedious and labor-intensive task. In this paper, we propose CrashLocator, a method to locate faulty functions using the crash stack information in crash reports. It deduces possible crash traces (the failing execution traces that lead to crash) by expanding the crash stack with functions in static call graph. It then calculates the suspiciousness of each function in the approximate crash traces. The functions are then ranked by their suspiciousness scores and are recommended to developers for further investigation. We evaluate our approach using real-world Mozilla crash data. The results show that our approach is effective: We can locate 50.6%, 63.7% and 67.5% of crashing faults by examining top 1, 5 and 10 functions recommended by CrashLocator, respectively. Our approach outperforms the conventional stack-only methods significantly.
|
|
Nova |
2014 |
Cao Y, Zhang H, Ding S, 'Symcrash: Selective recording for reproducing crashes', ASE 2014 - Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering (2014)
Software often crashes despite tremendous effort on software quality assurance. Once developers receive a crash report, they need to reproduce the crash in order to understand the problem and locate the fault. However, limited information from crash reports often makes crash reproduction difficult. Many "capture-and-replay" techniques have been proposed to automatically capture program execution data from the failing code, and help developers replay the crash scenarios based on the captured data. However, such techniques often suffer from heavy overhead and introduce privacy concerns. Recently, methods such as BugRedux were proposed to generate test input that leads to crash through symbolic execution. However, such methods have inherent limitations because they rely on conventional symbolic execution techniques. In this paper, we propose a dynamic symbolic execution method called SymCon, which addresses the limitation of conventional symbolic execution by selecting functions that are hard to be resolved by a constraint solver and using their concrete runtime values to replace the symbols. We then propose SymCrash, a selective recording approach that only instruments and monitors the hard-to-solve functions. SymCrash can generate test input for crashes through SymCon. We have applied our approach to successfully reproduce 13 failures of 6 real-world programs. Our results confirm that the proposed approach is suitable for reproducing crashes, in terms of effectiveness, overhead, and privacy. It also outperforms the related methods.
|
|
|
2014 |
Sun C, Zhang H, Lou JG, Zhang H, Wang Q, Zhang D, Khoo SC, 'Querying sequential software engineering data', Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering (2014) [E1]
We propose a pattern-based approach to effectively and efficiently analyzing sequential software engineering (SE) data. Different from other types of SE data, sequential SE data preserves unique temporal properties, which cannot be easily analyzed without much programming effort. In order to facilitate the analysis of sequential SE data, we design a sequential pattern query language (SPQL), which specifies the temporal properties based on regular expressions, and is enhanced with variables and statements to store and manipulate matching states. We also propose a query engine to effectively process the SPQL queries. We have applied our approach to analyze two types of SE data, namely bug report history and source code change history. We experiment with 181,213 Eclipse bug reports and 323,989 code revisions of Android. SPQL enables us to explore interesting temporal properties underneath these sequential data with a few lines of query code and low matching overhead. The analysis results can help better understand a software process and identify process violations.
|
|
Nova |
2014 |
Hu H, Zhang H, Xuan J, Sun W, 'Effective bug triage based on historical bug-fix information', Proceedings - International Symposium on Software Reliability Engineering, ISSRE (2014) [E1]
For complex and popular software, project teams could receive a large number of bug reports. It is often tedious and costly to manually assign these bug reports to developers who have the expertise to fix the bugs. Many bug triage techniques have been proposed to automate this process. In this paper, we describe our study on applying conventional bug triage techniques to projects of different sizes. We find that the effectiveness of a bug triage technique largely depends on the size of a project team (measured in terms of the number of developers). The conventional bug triage methods become less effective when the number of developers increases. To further improve the effectiveness of bug triage for large projects, we propose a novel recommendation method called Bug Fixer, which recommends developers for a new bug report based on historical bug-fix information. Bug Fixer constructs a Developer-Component-Bug (DCB) network, which models the relationship between developers and source code components, as well as the relationship between the components and their associated bugs. A DCB network captures the knowledge of 'who fixed what, where'. For a new bug report, Bug Fixer uses a DCB network to recommend to triager a list of suitable developers who could fix this bug. We evaluate Bug Fixer on three large-scale open source projects and two smaller industrial projects. The experimental results show that the proposed method outperforms the existing methods for large projects and achieves comparable performance for small projects.
|
|
Nova |
2014 |
Lim MH, Lou JG, Zhang H, Fu Q, Teoh ABJ, Lin Q, et al., 'Identifying Recurrent and Unknown Performance Issues', Proceedings - IEEE International Conference on Data Mining, ICDM (2014) [E1]
For a large-scale software system, especially an online service system, when a performance issue occurs, it is desirable to check whether this issue has occurred before. If there are past similar issues, a known remedy could be applied. Otherwise, a new troubleshooting process may have to be initiated. The symptom of a performance issue can be characterized by a set of metrics. Due to the sophisticated nature of software systems, manual diagnosis of performance issues based on metric data is typically expensive and laborious. In this paper, we propose a Hidden Markov Random Field (HMRF) based approach to automatic identification of recurrent and unknown performance issues. We formulate the problem of issue identification as a HMRF-based clustering problem. Our approach incorporates the learning of metric discretization thresholds and the optimization of issue clustering. Based on the learned thresholds and cluster centroids, we can achieve accurate identification of recurrent issues and unknown issues. Experimental evaluations on an open benchmark and a large-scale industrial production system show that our approach is effective and outperforms the related state-of-the-art approaches.
|
|
Nova |
2013 |
Hao D, Lan T, Zhang H, Guo C, Zhang L, 'Is This a Bug or an Obsolete Test?', ECOOP 2013 - OBJECT-ORIENTED PROGRAMMING, FRANCE, Sao Paulo (2013) [E1]
|
|
Nova |
2013 |
Liu K, Tan HBK, Zhang H, 'Has This Bug Been Reported?', 2013 20TH WORKING CONFERENCE ON REVERSE ENGINEERING (WCRE), GERMANY, Univ Koblenz, Koblenz (2013) [E1]
|
|
|
2013 |
Zhang H, Gong L, Versteeg S, 'Predicting Bug-Fixing Time: An Empirical Study of Commercial Software Projects', PROCEEDINGS OF THE 35TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2013), San Francisco, CA (2013) [E1]
|
|
Nova |
2013 |
Zhang H, Cheung SC, 'A cost-effectiveness criterion for applying software defect prediction models', 2013 9th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2013 - Proceedings (2013)
Ideally, software defect prediction models should help organize software quality assurance (SQA) resources and reduce cost of finding defects by allowing the modules most likely to contain defects to be inspected first. In this paper, we study the cost-effectiveness of applying defect prediction models in SQA and propose a basic cost-effectiveness criterion. The criterion implies that defect prediction models should be applied with caution. We also propose a new metric FN/(FN+TN) to measure the cost-effectiveness of a defect prediction model. Copyright 2013 ACM.
|
|
|
2013 |
Gong J, Zhang H, 'BugMap: A topographic map of bugs', 2013 9th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2013 - Proceedings (2013)
A large and complex software system could contain a large number of bugs. It is desirable for developers to understand how these bugs are distributed across the system, so they could have a better overview of software quality. In this paper, we describe BugMap, a tool we developed for visualizing large-scale bug location information. Taking source code and bug data as input, BugMap can display bug localizations on a topographic map. By examining the topographic map, developers can understand how the components and files are affected by bugs. We apply this tool to visualize the distribution of Eclipse bugs across components/files. The results show that our tool is effective for understanding the overall quality status of a large-scale system and for identifying the problematic areas of the system. Copyright 2013 ACM.
|
|
|
2013 |
Hao D, Lan T, Zhang H, Guo C, Zhang L, 'Is this a bug or an obsolete test?', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013)
In software evolution, developers typically need to identify whether the failure of a test is due to a bug in the source code under test or the obsoleteness of the test code when they execute a test suite. Only after finding the cause of a failure can developers determine whether to fix the bug or repair the obsolete test. Researchers have proposed several techniques to automate test repair. However, test-repair techniques typically assume that test failures are always due to obsolete tests. Thus, such techniques may not be applicable in real world software evolution when developers do not know whether the failure is due to a bug or an obsolete test. To know whether the cause of a test failure lies in the source code under test or in the test code, we view this problem as a classification problem and propose an automatic approach based on machine learning. Specifically, we target Java software using the JUnit testing framework and collect a set of features that may be related to failures of tests. Using this set of features, we adopt the Best-first Decision Tree Learning algorithm to train a classifier with some existing regression test failures as training instances. Then, we use the classifier to classify future failed tests. Furthermore, we evaluated our approach using two Java programs in three scenarios (within the same version, within different versions of a program, and between different programs), and found that our approach can effectively classify the causes of failed tests. © 2013 Springer-Verlag Berlin Heidelberg.
|
|
|
2013 |
Wang J, Dang Y, Zhang H, Chen K, Xie T, Zhang D, 'Mining succinct and high-coverage API usage patterns from source code', IEEE International Working Conference on Mining Software Repositories (2013) [E1]
During software development, a developer often needs to discover specific usage patterns of Application Programming Interface (API) methods. However, these usage patterns are often not well documented. To help developers to get such usage patterns, there are approaches proposed to mine client code of the API methods. However, they lack metrics to measure the quality of the mined usage patterns, and the API usage patterns mined by the existing approaches tend to be many and redundant, posing significant barriers to practical adoption. To address these issues, in this paper, we propose two quality metrics (succinctness and coverage) for mined usage patterns, and further propose a novel approach called Usage Pattern Miner (UP-Miner) that mines succinct and high-coverage usage patterns of API methods from source code. We have evaluated our approach on a large-scale Microsoft codebase. The results show that our approach is effective and outperforms an existing representative approach MAPO. The user studies conducted with Microsoft developers confirm the usefulness of the proposed approach in practice. © 2013 IEEE.
|
|
Nova |
2012 |
Zhou J, Zhang H, 'Learning to rank duplicate bug reports', ACM International Conference Proceeding Series (2012) [E1]
For a large and complex software system, the project team could receive a large number of bug reports. Some bug reports could be duplicates as they essentially report the same problem. It is often tedious and costly to manually check if a newly reported bug is a duplicate of an already reported bug. In this paper, we propose BugSim, a method that can automatically retrieve duplicate bug reports given a new bug report. BugSim is based on learning to rank concepts. We identify textual and statistical features of bug reports and propose a similarity function for bug reports based on the features. We then construct a training set by assembling pairs of duplicate and non-duplicate bug reports. We train the weights of features by applying the stochastic gradient descent algorithm over the training set. For a new bug report, we retrieve candidate duplicate reports using the trained model. We evaluate BugSim using more than 45,100 real bug reports of twelve Eclipse projects. The evaluation results show that the proposed method is effective. On average, the recall rate for the top 10 retrieved reports is 76.11%. Furthermore, BugSim outperforms the previous state-of-art methods that are implemented using SVM and BM25F ext. © 2012 ACM.
|
|
Nova |
2012 |
'Proceedings of the 3rd International Workshop on Emerging Trends in Software Metrics, WETSoM 2012, Zurich, Switzerland, June 3, 2012', WETSoM (2012) |
|
|
2012 |
Anderson DJ, Concas G, Lunesu MI, Marchesi M, Zhang H, 'A Comparative Study of Scrum and Kanban Approaches on a Real Case Study Using Simulation', AGILE PROCESSES IN SOFTWARE ENGINEERING AND EXTREME PROGRAMMING, XP 2012, Malmo, SWEDEN (2012) [E1]
|
|
|
2012 |
Zhou J, Zhang H, Lo D, 'Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports', 2012 34TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Zurich, SWITZERLAND (2012) [E1]
|
|
Nova |
2012 |
Dang Y, Wu R, Zhang H, Zhang D, Nobel P, 'ReBucket: A Method for Clustering Duplicate Crash Reports Based on Call Stack Similarity', 2012 34TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Zurich, SWITZERLAND (2012) [E1]
|
|
Nova |
2012 |
Gong L, Lo D, Jiang L, Zhang H, 'Diversity Maximization Speedup for Fault Localization', 2012 PROCEEDINGS OF THE 27TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), Essen, GERMANY (2012) [E1]
|
|
Nova |
2012 |
Tran MH, Colman A, Han J, Zhang H, 'Modeling and Verification of Context-aware Systems', 2012 19TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC), VOL 1, PEOPLES R CHINA, Hong Kong (2012) [E1]
|
|
|
2012 |
Wang J, Zhang H, 'Predicting Defect Numbers Based on Defect State Transition Models', PROCEEDINGS OF THE ACM-IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM'12), Lund, SWEDEN (2012) [E1]
|
|
Nova |
2012 |
Gong L, Lo D, Jiang L, Zhang H, 'Interactive Fault Localization Leveraging Simple User Feedback', 2012 28TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE (ICSM), ITALY, Riva del Garda (2012) [E1]
|
|
Nova |
2012 |
Ding S, Tan HBK, Liu K, Chandramohan M, Zhang H, 'Detection of buffer overflow vulnerabilities in C/C++ with pattern based limited symbolic evaluation', Proceedings - International Computer Software and Applications Conference (2012)
Buffer overflow vulnerability is one of the major security threats for applications written in C/C++. Among the existing approaches for detecting buffer overflow vulnerability, flow-sensitive approaches offer higher precision, but they are limited by heavy overhead and the fact that many constraints are unsolvable. We propose a novel method to efficiently detect vulnerable buffer overflows in any given control flow graph through recognizing two patterns. The proposed approach first uses syntax analysis to filter away those branches that cannot possibly comply with any of the two patterns before applying a limited symbolic evaluation for a precise matching against the patterns. The proposed approach only needs to evaluate a limited set of selected branch predicates according to the patterns and avoids the need to deal with a large number of general branch predicates. This significantly improves the scalability while not sacrificing the detection precision. Our experiments demonstrate the scalability and efficiency of the proposed method, which demonstrates its applicability. © 2012 IEEE.
|
|
|
2012 |
Grieskamp W, Zhang H, 'Message from the QSIC 2012 Industry Track Chairs', Proceedings - International Conference on Quality Software (2012)
|
|
|
2012 |
Concas G, Canfora G, Tempero E, Zhang H, 'Welcome to 3rd International Workshop on Emerging Trends in Software Metrics (WETSoM 2012)', 2012 3rd International Workshop on Emerging Trends in Software Metrics, WETSoM 2012 - Proceedings (2012)
Welcome to WETSoM2012, the 3rd International Workshop on Emerging Trends in Software Metrics. Since its start, WETSoM attracted a blend of academic and industrial researchers, creating a stimulating atmosphere to discuss the progresses of software metrics. A key motivation for this workshop is to help overcome the low impact that software metrics has on current software development. This is pursued by critically examining the evidence for the effectiveness of existing metrics and identifying new directions for metrics. Evidence for existing metrics includes how the metrics have been used in practice and studies showing their effectiveness. Identifying new directions includes use of new theories, such as complex network theory, on which to base metrics. We are pleased that this year WETSoM features 12 technical papers and an exciting keynote on mining developers' communication to assess software quality by Massimiliano di Penta. The program of WETSoM2012 is the result of hard work by many dedicated people; we especially thank the authors of submitted papers and the members of the program committee. Above all, the greatest richness of this workshop is its participants, who shape the discussion and points into new directions for software metrics research and practice. We hope you will have a great time and an unforgettable experience at WETSoM2012. © 2012 IEEE.
|
|
|
2012 |
Anderson DJ, Concas G, Lunesu MI, Marchesi M, Zhang H, 'A comparative study of scrum and kanban approaches on a real case study using simulation', Lecture Notes in Business Information Processing (2012)
We present the application of software process modeling and simulation using an agent-based approach to a real case study of software maintenance. The original process used PSP/TSP; it spent a large amount of time estimating maintenance requests in advance, and needed to be greatly improved. To this purpose, a Kanban system was successfully implemented, which proved able to substantially improve the process without giving up PSP/TSP. We customized the simulator and, using input data with the same characteristics as the real data, we were able to obtain results very similar to those of the processes of the case study, in particular of the original process. We also simulated, using the same input data, the possible application of the Scrum process, showing results comparable to the Kanban process. © 2012 Springer-Verlag Berlin Heidelberg.
|
|
|
2011 |
Li YF, Zhang H, 'Integrating software engineering data using semantic web technologies', Proceedings - International Conference on Software Engineering (2011)
A plethora of software engineering data have been produced by different organizations and tools over time. These data may come from different sources, and are often disparate and distributed. The integration of these data may open up the possibility of conducting systemic, holistic study of software projects in ways previously unexplored. Semantic Web technologies have been used successfully in a wide array of domains such as health care and life sciences as a platform for information integration and knowledge management. The success is largely due to the open and extensible nature of ontology languages as well as growing tool support. We believe that Semantic Web technologies represent an ideal platform for the integration of software engineering data in a semantic repository. By querying and analyzing such a repository, researchers and practitioners can better understand and control software engineering activities and processes. In this paper, we describe how we apply Semantic Web techniques to integrate object-oriented software engineering data from different sources. We also show how the integrated data can help us answer complex queries about large-scale software projects through a case study on the Eclipse system. © 2011 ACM.
|
|
|
2011 |
Wu R, Zhang H, Kim S, Cheung SC, 'ReLink: Recovering links between bugs and changes', SIGSOFT/FSE 2011 - Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering (2011) [E1]
Software defect information, including links between bugs and committed changes, plays an important role in software maintenance such as measuring quality and predicting defects. Usually, the links are automatically mined from change logs and bug reports using heuristics such as searching for specific keywords and bug IDs in change logs. However, the accuracy of these heuristics depends on the quality of change logs. Bird et al. found that there are many missing links due to the absence of bug references in change logs. They also found that the missing links lead to biased defect information, which affects defect prediction performance. We manually inspected the explicit links, which have explicit bug IDs in change logs, and observed that the links exhibit certain features. Based on our observation, we developed an automatic link recovery algorithm, ReLink, which automatically learns criteria of features from explicit links to recover missing links. We applied ReLink to three open source projects. ReLink reliably identified links with 89% precision and 78% recall on average, while the traditional heuristics alone achieve 91% precision and 64% recall. We also evaluated the impact of recovered links on software maintainability measurement and defect prediction, and found that the results of ReLink yield significantly better accuracy than those of traditional heuristics. © 2011 ACM.
|
|
Nova |
2011 |
'Proceedings of the 2nd International Workshop on Emerging Trends in Software Metrics, WETSoM 2011, Waikiki, Honolulu, HI, USA, May 24, 2011', WETSoM (2011) |
|
|
2011 |
Kim S, Zhang H, Wu R, Gong L, 'Dealing with Noise in Defect Prediction', 2011 33RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Honolulu, HI (2011) [E1]
|
|
Nova |
2011 |
Concas G, Di Penta M, Tempero E, Zhang H, 'Workshop on Emerging Trends in Software Metrics (WETSoM 2011)', 2011 33RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), Honolulu, HI (2011) [E3] |
|
|
2011 |
'Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering' (2011)
|
|
|
2011 |
Liu K, Tan HBK, Chen X, Zhang H, Padmanabhuni BM, 'Automated extraction of data lifecycle support from database applications', SEKE 2011 - Proceedings of the 23rd International Conference on Software Engineering and Knowledge Engineering (2011)
Database applications are one of the most common types of systems. Grounded on the simple concept of the data lifecycle (any data in a database is created by insertion, used via selection and modification, and terminated at deletion), this paper proposes a novel approach to reverse engineer the data lifecycle automatically from the source code of database applications. The extracted information can be used for the selection of open-source database applications for adaptation. It can also be used for maintenance and verification of database applications. A tool has been developed to implement the proposed approach for PHP-based database applications. Case studies have also been conducted to evaluate the use of the proposed approach.
|
|
|
2011 |
Concas G, Tempero E, Zhang H, Di Penta M, 'Workshop on Emerging Trends in Software Metrics (WETSoM 2011)', Proceedings - International Conference on Software Engineering (2011) |
|
|
2011 |
Jarzabek S, Pettersson U, Zhang H, 'University-industry collaboration journey towards product lines', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011)
Product Lines for mission-critical Command and Control systems was a starting point for a long-lasting research collaboration between National University of Singapore (NUS) and ST Electronics (Info-Software Systems) Pte Ltd (STEE-InfoSoft). Collaboration was intensified by a joint research project, also involving University of Waterloo and Netron Inc., that led to the development of a reuse technology called XVCL. The contribution of this paper is twofold: First, we describe collaboration modes, factors that were critical to sustain collaboration, and benefits for university and industry gained over the years. Among the main benefits, STEE-InfoSoft advanced its reuse practice by applying XVCL in several software Product Line projects, while the NUS team received early feedback from STEE-InfoSoft which helped refine XVCL reuse methods and keep academic research in sync with industrial realities. Academic findings and industrial pilots have opened new unexpected research directions. Second, we draw lessons learned from many projects, to explain the general nature and significance of problems addressed with the XVCL approach. © 2011 Springer-Verlag.
|
|
|
2010 |
'Proceedings of the 2010 ICSE Workshop on Emerging Trends in Software Metrics, WETSoM 2010, Cape Town, South Africa, May 4, 2010', WETSoM (2010) |
|
|
2010 |
Zhang H, Jarzabek S, 'A Hybrid Approach to Feature-Oriented Programming in XVCL', SOFTWARE PRODUCT LINES: GOING BEYOND, SOUTH KOREA, Jeju Island (2010)
|
|
|
2010 |
Zhang H, Shi B, Zhang L, 'Automatic Checking of License Compliance', 2010 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, Timisoara, ROMANIA (2010)
|
|
|
2010 |
Zhang H, Wu R, 'Sampling Program Quality', 2010 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, Timisoara, ROMANIA (2010)
|
|
|
2010 |
Zhang H, Nelson A, Menzies T, 'On the value of learning from defect dense components for software defect prediction', ACM International Conference Proceeding Series (2010)
BACKGROUND: Defect predictors learned from static code measures can isolate code modules with a higher than usual probability of defects. AIMS: To improve those learners by focusing on the defect-rich portions of the training sets. METHOD: Defect data CM1, KC1, MC1, PC1, PC3 was separated into components. A subset of the projects (selected at random) was set aside for testing. Training sets were generated for a NaiveBayes classifier in two ways. In the dense treatment, the components with higher than the median number of defective modules were used for training. In the standard treatment, modules from any component were used for training. Both samples were run against the test set and evaluated using recall, probability of false alarm, and precision. In addition, under-sampling and over-sampling were performed on the defect data. Each method was repeated in a 10-by-10 cross-validation experiment. RESULTS: Prediction models learned from defect dense components outperformed the standard method, under-sampling, as well as over-sampling. In statistical rankings based on recall, probability of false alarm, and precision, models learned from dense components won 4-5 times more often than any other method, and also lost the fewest times. CONCLUSIONS: Given training data where most of the defects exist in small numbers of components, better defect predictors can be trained from the defect dense components.
|
|
|
2010 |
Canfora G, Concas G, Marchesi M, Tempero E, Zhang H, 'Workshop on Emerging Trends in Software Metrics (WETSoM 2010)', Proceedings - International Conference on Software Engineering (2010)
The Workshop on Emerging Trends in Software Metrics aims at bringing together researchers and practitioners to discuss the progress of software metrics. The motivation for this workshop is the low impact that software metrics has on current software development. The goals of this workshop are to critically examine the evidence for the effectiveness of existing metrics and to identify new directions for development of software metrics. © 2010 ACM.
|
|
|
2010 |
Canfora G, Concas G, Marchesi M, Tempero E, Zhang H, 'Proceedings - International Conference on Software Engineering: Foreword', Proceedings - International Conference on Software Engineering (2010) |
|
|
2009 |
Liu L, Zhang H, Ma W, Shan Y, Xu J, Peng F, Burda T, 'Understanding Chinese Characteristics of Requirements Engineering', PROCEEDINGS OF THE 2009 17TH IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE, Atlanta, GA (2009)
|
|
|
2009 |
Jarzabek S, Xue Y, Zhang H, Lee Y, 'Avoiding Some Common Preprocessing Pitfalls with Feature Queries', APSEC 09: SIXTEENTH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, PROCEEDINGS, MALAYSIA, Bat Ferringhi (2009)
|
|
|
2009 |
Zhang H, 'An Investigation of the Relationships between Lines of Code and Defects', 2009 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, CONFERENCE PROCEEDINGS, Edmonton, CANADA (2009)
|
|
|
2009 |
Jarzabek S, Zhang H, Lee Y, Xue Y, Shaikh N, 'Increasing Usability of Preprocessing for Feature Management in Product Lines with Queries', 2009 31ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, COMPANION VOLUME, Vancouver, CANADA (2009)
|
|
|
2008 |
Zhang H, 'Exploring Regularity in Source Code: Software Science and Zipf's Law', FIFTEENTH WORKING CONFERENCE ON REVERSE ENGINEERING, PROCEEDINGS, BELGIUM, Antwerp (2008)
|
|
|
2008 |
Zhang H, 'An initial study of the growth of Eclipse defects', Proceedings - International Conference on Software Engineering (2008)
We analyze the Eclipse defect data from June 2004 to November 2007, and find that the growth of the number of defects can be well modeled by polynomial functions. Furthermore, we can predict the number of future Eclipse defects based on the nature of defect growth. Copyright 2008 ACM.
|
|
|
2008 |
Zhang H, 'The scale-free nature of semantic web ontology', Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08 (2008)
Semantic web ontology languages, such as OWL, have been widely used for knowledge representation. Through empirical analysis of real-world ontologies we discover that, like many natural and social phenomena, the semantic web ontology is also "scale-free".
|
|
|
2007 |
Zhang H, Zhang X, Gu M, 'Predicting defective software components from code complexity measures', 13TH PACIFIC RIM INTERNATIONAL SYMPOSIUM ON DEPENDABLE COMPUTING, PROCEEDINGS, Melbourne, AUSTRALIA (2007)
|
|
|
2007 |
Zhang H, Tan HBK, 'An empirical study of class sizes for large Java systems', 14TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, PROCEEDINGS, Nagoya, JAPAN (2007)
|
|
|
2007 |
Zhang H, Tan HBK, 'An empirical study of class sizes for large java systems', Proceedings - Asia-Pacific Software Engineering Conference, APSEC (2007)
We perform an empirical study of class sizes (in terms of Lines of Code) on a number of large Java software systems, and discover an interesting pattern: many classes have only small sizes whereas a few classes have large sizes. We call this phenomenon the small class phenomenon. Further analysis shows that the class sizes follow the lognormal distribution. Having understood the distribution of class sizes, we then derive a general size estimation model, which reveals the relationship between the size of a large Java system and the number of classes the system has. In this paper, we also show that the adoption of object-orientation is a possible cause of the small class phenomenon. We believe our study reveals the regularity that emerges from large-scale object-oriented software construction, and hope our research can contribute to a deep understanding of computer programming. © 2007 IEEE.
|
|
|
2007 |
Zhang H, Zhang X, Gu M, 'Predicting defective software components from code complexity measures', Proceedings - 13th Pacific Rim International Symposium on Dependable Computing, PRDC 2007 (2007)
The ability to predict defective modules can help us allocate limited quality assurance resources effectively and efficiently. In this paper, we propose a complexity-based method for predicting defect-prone components. Our method takes three code-level complexity measures as input, namely Lines of Code, McCabe's Cyclomatic Complexity and Halstead's Volume, and classifies components as either defective or non-defective. We perform an extensive study of twelve classification models using the public NASA datasets. Cross-validation results show that our method can achieve good prediction accuracy. This study confirms that static code complexity measures can be useful indicators of component quality. © 2007 IEEE.
|
|
|
2007 |
Peng D, Jarzabek S, Rajapakse DC, Zhang H, 'Reuse of database access layer components in JEE product lines: Limitations and a possible solution (Case Study)', 19th International Conference on Software Engineering and Knowledge Engineering, SEKE 2007 (2007)
We set up an experiment to evaluate JEE as a platform for product line development. While JEE provides many useful mechanisms for reuse of common services/components, still we found that systematic across-the-board reuse in application domain-specific areas was hard. The main difficulty was the lack of a mechanism to represent groups of similar components in a generic, adaptable form. Such similar components arise as the number of variant features of a product line grows, and we need to accommodate legal combinations of variant features in components of a product line architecture. Such uncontrolled growth of similar component versions hinders productivity of reuse-based development and raises maintenance costs. In the paper, we study the manifestation of this problem in the JEE database access layer. Interactive Development Environments such as NetBeans or JBuilder speed up the development process, but they do not address the source of the problem, which is the lack of mechanisms to design generic components capable of accommodating variant features in various combinations. We filled this gap with a "mixed strategy" solution based on generative programming technique of XVCL applied on top of JEE. In the paper, we highlight the nature of the problems we encountered and our solution. Copyright © (2007) by Knowledge Systems Institute (KSI).
|
|
|
2006 |
Tan HBK, Zhao Y, Zhang H, 'Estimating LOC for information systems from their conceptual data models', Proceedings - International Conference on Software Engineering (2006)
Effort and cost estimation is crucial in software management. Estimation of software size plays a key role in the estimation. Lines of Code (LOC) is still a commonly used software size measure. Although software sizing has been well recognized as an important problem for more than two decades, existing methods still have many problems. The conceptual data model is widely used in the requirements analysis for information systems. It is also not difficult to construct conceptual data models in the early stage of developing information systems. Many characteristics of an information system are actually reflected in its conceptual data model. We explore the use of the conceptual data model for estimating LOC. This paper proposes a novel method for estimating LOC for an information system from its conceptual data model through the use of a multiple linear regression model. We have validated the method through collecting samples from both the industry and open-source systems. Copyright 2006 ACM.
|
|
|
2006 |
Jarzabek S, Zhang H, Shen RU, Lam VT, Zhenxin S, 'Analysis of meta-programs: An example', International Journal of Software Engineering and Knowledge Engineering (2006)
Meta-programs are generic, incomplete, adaptable programs that are instantiated at construction time to meet specific requirements. Templates and generative techniques are examples of meta-programming techniques. Understanding of meta-programs is more difficult than understanding of concrete, executable programs. Static and dynamic analysis methods have been applied to ease understanding of programs - can similar methods be used for meta-programs? In our projects, we build meta-programs with a meta-programming technique called XVCL. Meta-programs in XVCL are organized into a hierarchy of meta-components from which the XVCL processor generates concrete, executable programs that meet specific requirements. We developed an automated system that analyzes XVCL meta-programs, and presents developers with information that helps them work with meta-programs more effectively. Our system conducts both static and dynamic analysis of a meta-program. An integral part of our solution is a query language, FQL, in which we formulate questions about meta-program properties. An FQL query processor automatically answers a class of queries. The analysis method described in the paper is specific to XVCL. However, the principle of our approach can be applied to other meta-programming systems. We believe readers interested in meta-programming in general will find some of the lessons from our experiment interesting and useful. © World Scientific Publishing Company.
|
|
|
2005 |
Sun J, Zhang H, Li YF, Wang H, 'Formal semantics and verification for feature modeling', Proceedings of the IEEE International Conference on Engineering of Complex Computer Systems, ICECCS (2005)
Research on features has received much attention in the domain engineering community. Feature modeling plays an important role in the design and implementation of complex software systems. However, the presentation and analysis of feature models are still largely informal. There is also an increasing need for methods and tools that can support automated feature model analysis. This paper presents a formal engineering approach to the specification and verification of feature models. A formal semantics for the feature modeling language is defined using first-order logic. It provides a precise and rigorous formal interpretation for the graphical notation. In addition, further validation of the semantics using the Z/EVES theorem prover is presented. Finally, we demonstrate that the consistency of a feature model and its configurations can be automatically verified by encoding the semantics into the Alloy Analyzer. A case study of the Key Word in Context (KWIC) index systems feature model is presented to illustrate the verification process. © 2005 IEEE.
|
|
|
2005 |
Zhang HY, Bradbury JS, Cordy JR, Dingel J, 'Implementation and verification of implicit-invocation systems using source transformation', FIFTH IEEE INTERNATIONAL WORKSHOP ON SOURCE CODE ANALYSIS AND MANIPULATION, PROCEEDINGS, Budapest, HUNGARY (2005)
|
|
|
2003 |
Zhang H, Jarzabek S, 'An XVCL approach to handling variants: A KWIC product line example', Proceedings - Asia-Pacific Software Engineering Conference, APSEC (2003)
We developed XVCL (XML-based Variant Configuration Language), a method and tool for product lines, to facilitate handling variants in reusable software assets (such as architecture, code components or UML models). XVCL is a newer version of Bassett's frames [1], a technology that has achieved substantial productivity improvements in large data processing product lines written in COBOL. Despite its simplicity, XVCL can effectively manage a wide range of product line variants from a compact base of meta-components, structured for effective reuse. We applied XVCL in two medium-size product line projects and a number of smaller case studies. In this paper, we communicate XVCL's capabilities to support product lines by means of a simple, but still interesting, example of the KWIC system introduced by Parnas in the 1970s. We show how we can handle functional variants, variant design decisions and implementation-level variants in a generic KWIC system.
|
|
|
2003 |
Jarzabek S, Ong WC, Zhang H, 'Handling variant requirements in domain modeling', Journal of Systems and Software (2003)
Domain models describe common and variant requirements for a family of similar systems. Although most of the notations, such as UML, are meant for modeling a single system, they can be extended to model variants. We have done that and applied such extended notations in our projects. We soon found that our models with variants were becoming overly complicated, undermining the major role of domain analysis, which is understanding. One variant was often reflected in many models and any given model was affected by many variants. The number of possible variant combinations was growing rapidly and mutual dependencies among variants even further complicated the domain model. We realized that our purely descriptive domain model was only useful for small examples but it did not scale up. In this paper, we describe a modeling method and a Flexible Variant Configuration tool (FVC for short) that alleviate the above-mentioned problems. In our approach, we start by modeling so-called domain defaults, i.e., requirements that characterize a typical system in a domain. Then, we describe variants as deltas with respect to domain defaults. The FVC interprets variants to produce customized domain model views for a system that meets specific requirements. We implemented the above concepts using the commercial tools Netron Fusion and Rational Rose. In the paper, we illustrate our domain modeling method and tool with examples from the Facility Reservation System domain. © 2003 Elsevier Inc. All rights reserved.
|
|
|
2003 |
Jarzabek S, Bassett P, Zhang H, Zhang W, 'XVCL: XML-based variant configuration language', Proceedings - International Conference on Software Engineering (2003)
XML-based Variant Configuration Language (XVCL) is a meta-programming technique and tool that provides effective reuse mechanisms. It includes a methodology and a tool, the XVCL processor. The methodology shows how to discover the structure of the solution for the application domain and for the types of variants one wants to address. The XVCL processor automates the routine yet error-prone program construction tasks, allowing developers to focus on the novel aspects of the problem domain that require creativity.
|
|
|
2002 |
Swe SM, Zhang H, Jarzabek S, 'XVCL: A tutorial', ACM International Conference Proceeding Series (2002)
XVCL (XML-based Variant Configuration Language) is a general-purpose mark-up language for configuring variants in programs and other types of documents. We can apply XVCL to configure variants in a variety of software assets such as software architecture, program code, test cases, technical and user-level program documentation or requirement specifications. The principles of the XVCL have been thoroughly tested in practice. XVCL is based on the same concepts as the frame technology [1]. Frame technology has been extensively applied in industry to manage variants and evolve multi-million-line, COBOL-based, information systems. An independent analysis showed that frame technology has reduced large software project costs by over 84% and their times-to-market by 70%, when compared to industry norms [1, 2]. At the same time, we found that the principles of XVCL are not easy to communicate. In this paper, we describe a subset of XVCL. We trust this subset of XVCL is easy to understand and still effectively communicates essential XVCL concepts. To illustrate the XVCL method, we further describe an XVCL solution to handling variants in a Notepad system. Copyright 2002 ACM.
|
|
|
2001 |
Durrani TS, Leyman AR, 'Message from the chairmen', IEEE Workshop on Statistical Signal Processing Proceedings (2001) |
|
|
2001 |
Wong TW, Jarzabek S, Swe SM, Shen R, Zhang H, 'XML implementation of frame processor', Proceedings of SSR'01 2001 Symposium on Software Reusability (2001)
A quantitative study has shown that frame technology [1] supported by the Fusion toolset can lead to reduction in time-to-market (70%) and project costs (84%). Frame technology has been developed to handle large COBOL-based business software product families. We wished to investigate how the principles of the frame approach can be applied to support product families in other application domains, in particular to build distributed component-based systems written in Object-Oriented languages. As Fusion is tightly coupled with COBOL, we implemented our own tools based on frame concepts using the XML technology. In our solution, a generic architecture for a product family is a hierarchy of XML documents. Each such document contains a reusable program fragment instrumented for change with XML tags. We use a tool built on top of the XML parsing framework JAXP to process documents in order to produce a custom member of a product family. Our solution is cost-effective and extensible. In the paper, we describe our solution, illustrating its use with examples. We intend to make our solution available to the public in order to encourage investigation of frame concepts in other application domains, implementation languages and platforms.
|
|
|
2001 |
Zhang H, Jarzabek S, Swe SM, 'XVCL approach to separating concerns in product family assets', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001)
In this paper, we describe an XML-based language, called XVCL, for managing variants in component-based product families. Using XVCL, we can organize product family assets and instrument them to accommodate variants. A tool that interprets XVCL and provides semi-automatic support for asset customization is also introduced. In our projects, we applied XVCL to manage variants in UML domain models and in generic architectures for product families. We have achieved simple forms of separation of concerns (in both models and architectures) and we are investigating advanced forms in current work. We plan to compare XVCL to other emerging techniques that lead to separation of concerns in software models, documents, architectures and code.
|
|
|
2001 |
Jarzabek S, Zhang H, 'XML-based method and tool for handling variant requirements in domain models', Proceedings of the IEEE International Conference on Requirements Engineering (2001)
A domain model describes common and variant requirements for a system family. UML notations used in requirements analysis and software modeling can be extended with "variation points" to cater for variant requirements. However, UML models for a large single system are already complicated enough. With variants, UML domain models soon become too complicated to be useful. The main reasons are the explosion of possible variant combinations, complex dependencies among variants and the inability to trace variants from a domain model down to the requirements for a specific system that is a member of a family. We believe that the above-mentioned problems cannot be solved at the domain model description level alone. In the paper, we propose a novel solution based on a tool that interprets and manipulates domain models to provide analysts with customized, simple domain views. We describe a variant configuration language that allows us to instrument domain models with variation points and record variant dependencies. An interpreter of this language produces customized views of a domain model, helping analysts understand and reuse software models. We describe the concept of our approach and its simple implementation based on XML and XMI technologies.
|
|
|