Welcome to Da Zheng’s personal website

I’m a senior applied scientist at AWS AI, where I build frameworks and algorithms to bring graph ML technologies into production. This includes DGL for graph neural networks (GNN), DGL-KE for knowledge graph embeddings, DistDGL for scaling GNN training to billion-scale graphs, and TGL for temporal graph neural networks. My research interests cover a wide range of areas, including high-performance computing, large-scale data analysis systems, data mining, and machine learning. I received a PhD from the Department of Computer Science at Johns Hopkins University. During my PhD, I worked on FlashGraph and FlashR, frameworks for large-scale graph analysis and data analysis on solid-state drives (SSDs).

Selected Publications

  • Zonghan Wu, Da Zheng, Shirui Pan, Quan Gan, Guodong Long, George Karypis, TraverseNet: Unifying Space and Time in Message Passing for Traffic Forecasting, IEEE Transactions on Neural Networks and Learning Systems 2022 pdf

  • Vassilis N Ioannidis, Xiang Song, Da Zheng, Houyu Zhang, Jun Ma, Yi Xu, Belinda Zeng, Trishul Chilimbi, George Karypis, Efficient and effective training of language and graph neural network models, arXiv:2206.10781 2022 pdf

  • Chunxing Yin, Da Zheng, Israt Nisa, Christos Faloutsos, George Karypis, Richard Vuduc, Nimble GNN Embedding with Tensor-Train Decomposition, KDD 2022 pdf

  • Hongkuan Zhou, Da Zheng, Israt Nisa, Vasileios Ioannidis, Xiang Song, George Karypis, TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs, VLDB 2022 pdf

  • Da Zheng, Xiang Song, Chengru Yang, Dominique LaSalle, George Karypis, Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Heterogeneous Graphs, KDD 2022 pdf

  • Anil Gaihre, Da Zheng, Scott Weitze, Lingda Li, Shuaiwen Leon Song, Caiwen Ding, Xiaoye S Li, Hang Liu, Dr. Top-k: delegate-centric Top-k on GPUs, SC 2021 pdf

  • Zonghan Wu, Da Zheng, Shirui Pan, Quan Gan, Guodong Long, George Karypis, TraverseNet: Unifying Space and Time in Message Passing, DLG-KDD 2021 pdf

  • Jialin Dong*, Da Zheng*, Lin F Yang, George Karypis, Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs, KDD 2021 pdf (* indicates equal contribution)

  • Joshua T Vogelstein, Eric W Bridgeford, Minh Tang, Da Zheng, Christopher Douville, Randal Burns, Mauro Maggioni, Supervised dimensionality reduction for big data, Nature Communications 2021 html

  • Saurav Manchanda, Da Zheng, George Karypis, Schema-Aware Deep Graph Convolutional Networks for Heterogeneous Graphs, IEEE Big Data 2021 pdf

  • Balasubramaniam Srinivasan, Da Zheng, George Karypis, Learning over Families of Sets – Hypergraph Representation Learning for Higher Order Tasks, SDM 2021 pdf

  • Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang, George Karypis, DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs, arXiv:2010.05337, 2020 pdf

  • Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang, FeatGraph: A flexible and efficient backend for graph neural network systems, in SC 2020 pdf

  • Vassilis N Ioannidis, Da Zheng, George Karypis, PanRep: Universal node embeddings for heterogeneous graphs, arXiv:2007.10445, 2020 pdf

  • Da Zheng, Xiang Song, Chao Ma, Zeyuan Tan, Zihao Ye, Jin Dong, Hao Xiong, Zheng Zhang, George Karypis, DGL-KE: Training knowledge graph embeddings at scale, in SIGIR 2020 pdf

  • Qi Zhu, Hao Wei, Bunyamin Sisman, Da Zheng, Christos Faloutsos, Xin Luna Dong, Jiawei Han, Collective Multi-type Entity Alignment Between Knowledge Graphs, in the Web Conference 2020 pdf

  • Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, Ziyue Huang, Qipeng Guo, Hao Zhang, Haibin Lin, Junbo Zhao, Jinyang Li, Alexander Smola, Zheng Zhang, Deep graph library: Towards efficient and scalable deep learning on graphs, arXiv:1909.01315, 2019 pdf

  • Da Zheng, Disa Mhembere, Joshua Vogelstein, Carey E. Priebe, Randal Burns, FlashR: parallelize and scale R for machine learning using SSDs, in PPoPP 2018 pdf

  • Disa Mhembere, Da Zheng, Carey E Priebe, Joshua T Vogelstein, Randal Burns, knor: A NUMA-optimized in-memory, distributed and semi-external-memory k-means library, in HPDC 2017 pdf

  • Da Zheng, Disa Mhembere, Vince Lyzinski, Joshua Vogelstein, Carey E. Priebe, Randal Burns, Semi-External Memory Sparse Matrix Multiplication on Billion-node Graphs in a Multicore Architecture, in Transactions on Parallel and Distributed Systems, 2017 pdf

  • Heng Wang, Da Zheng, Randal Burns, Carey Priebe, Active Community Detection in Massive Graphs, in SDM-Networks 2015 pdf

  • Da Zheng, Disa Mhembere, Randal Burns, Joshua Vogelstein, Carey Priebe, Alexander S. Szalay, FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs, in FAST 2015 pdf

  • Da Zheng, Randal Burns, Alexander S. Szalay, Toward Millions of File System IOPS on Low-Cost, Commodity Hardware, in Supercomputing 2013 pdf

  • Da Zheng, Randal Burns, Alexander S. Szalay, A Parallel Page Cache: IOPS and Caching for Multicore Systems, in HotStorage 2012 pdf