About Me
I am a Ph.D. candidate at S-Lab, College of Computing and Data Science, Nanyang Technological University, Singapore, supervised by Prof. Tianwei Zhang. Before that, I received my M.Sc. degree in Electrical Engineering from the National University of Singapore in 2022 and my B.Eng. degree in Information Engineering from Zhejiang University in 2020.
Research Interests
- Distributed Training
- Systems for Graph Learning
- Machine Learning for Systems
Publications
TorchGT: A Holistic System for Large-scale Graph Transformer Training
Meng Zhang*, Jie Sun*, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
[Paper]
Sylvie: 3D-adaptive and Universal System for Large-scale Graph Neural Network Training
Meng Zhang, Qinghao Hu, Cheng Wan, Haozhao Wang, Peng Sun, Yonggang Wen, Tianwei Zhang
IEEE International Conference on Data Engineering (ICDE), 2024
[Paper] [Code]
Characterization of Large Language Model Development in the Datacenter
Qinghao Hu*, Zhisheng Ye*, Zerui Wang*, Guoteng Wang, Meng Zhang, Qiaoling Chen, Peng Sun, Dahua Lin, Xiaolin Wang, Yingwei Luo, Yonggang Wen, Tianwei Zhang
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2024
[Paper]
FedDSE: Distribution-aware Sub-model Extraction for Federated Learning over Resource-constrained Devices
Haozhao Wang, Yabo Jia, Meng Zhang, Qinghao Hu, Hao Ren, Peng Sun, Yonggang Wen, Tianwei Zhang
The Web Conference (WWW), 2024
[Paper]
Lucid: A Non-intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs
Qinghao Hu*, Meng Zhang*, Peng Sun, Yonggang Wen, Tianwei Zhang
ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
Distinguished Paper Award
[Paper] [Code]
Hydro: Surrogate-based Hyperparameter Tuning Service in Datacenters
Qinghao Hu, Zhisheng Ye, Meng Zhang, Qiaoling Chen, Peng Sun, Yonggang Wen, Tianwei Zhang
USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2023
[Paper] [Code]
Preprint
Boosting Distributed Full-graph GNN Training with Asynchronous One-bit Communication
Meng Zhang, Qinghao Hu, Peng Sun, Yonggang Wen, Tianwei Zhang
arXiv, 2023
[Paper]
* denotes equal contribution
Experience

System Research Intern | NDS Group @ Shanghai AI Lab
Jun 2023 - present

Research Intern | Tencent JARVIS Lab
Oct 2020 - Feb 2021

Research Intern | Singapore University of Technology and Design
Advisor: Prof. Simon Perrault
Jul 2019 - Sep 2021
Professional Services
[EuroSys 2023] Shadow Committee Member
[MLSys 2023] AE Committee Member
[OSDI 2023] Presenter & AE Committee Member