Jie Ye

Hi! I am a PhD student in the Department of Computer Science at the Illinois Institute of Technology (IIT) and a member of the Gnosis Research Center, advised by Prof. Xian-He Sun and Dr. Anthony Kougkas from IIT, and co-advised by Dr. Bogdan Nicolae from Argonne National Laboratory (ANL). My research focuses on accelerating and optimizing DNN/LLM inference serving; data transfer techniques (e.g., GPU-to-GPU and GPUDirect data transfer); KV cache management (compression and eviction); scheduling; model offloading; KV cache offloading and transfer; and optimizing LLM training (e.g., CPU-GPU hybrid computation).

Publications

Conferences

  1. Jie Ye, Jaime Cernuda, Avinash Maurya, Xian-He Sun, Anthony Kougkas, and Bogdan Nicolae. "Characterizing the Behavior and Impact of KV Caching on Transformer Inferences under Concurrency". IPDPS’25: The 39th IEEE International Parallel and Distributed Processing Symposium (Milan, Italy, 2025). [Paper]
  2. Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, and Bogdan Nicolae. "Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading". Middleware'24: The 25th International Middleware Conference (Hong Kong, 2024). [Paper]
  3. Meng Tang, Jaime Cernuda, Jie Ye, Luanzheng Guo, Nathan R. Tallent, Anthony Kougkas, and Xian-He Sun. "DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics". Cluster'24: 2024 IEEE International Conference on Cluster Computing (Kobe, Japan, 2024). [Paper]
  4. Jie Ye, Jaime Cernuda, Neeraj Rajesh, Keith Bateman, Orcun Yildiz, Tom Peterka, Arnur Nigmetov, Dmitriy Morozov, Xian-He Sun, Anthony Kougkas, and Bogdan Nicolae. "Viper: A High-Performance I/O Framework for Transparently Updating, Storing, and Transferring Deep Neural Network Models". ICPP’24: The 53rd International Conference on Parallel Processing (Gotland, Sweden, 2024). [Paper]
  5. Jaime Cernuda, Jie Ye, Anthony Kougkas, and Xian-He Sun. "HStream: A hierarchical data streaming engine for high-throughput scientific applications". ICPP’24: The 53rd International Conference on Parallel Processing (Gotland, Sweden, 2024). [Paper]
  6. Keith Bateman, Neeraj Rajesh, Jaime Cernuda, Luke Logan, Jie Ye, Stephen Herbein, Anthony Kougkas, and Xian-He Sun. "LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services". HiPC’22: The 29th edition of the IEEE International Conference on High Performance Computing, Data, and Analytics (Bengaluru, India, 2022). [Paper]
  7. Jaime Cernuda, Hariharan Devarajan, Luke Logan, Keith Bateman, Neeraj Rajesh, Jie Ye, Anthony Kougkas, and Xian-He Sun. "HFlow: A Dynamic and Elastic Multi-Layered Data Forwarder". Cluster’21: The 2021 IEEE International Conference on Cluster Computing (Portland, OR, 2021). [Paper]
  8. Neeraj Rajesh, Hariharan Devarajan, Jaime Cernuda, Keith Bateman, Luke Logan, Jie Ye, Anthony Kougkas, and Xian-He Sun. "Apollo: An ML-assisted Real-Time Storage Resource Observer". HPDC’21: The 30th ACM International Symposium on High-Performance Parallel and Distributed Computing (Sweden, 2021). [Paper]

Workshops

  1. Krishna Teja Chitty-Venkata, Jie Ye, and Murali Emani. "MoPEQ: Mixture of Mixed Precision Quantized Experts". BiVision'25 ICCV-workshop: The 3rd Workshop on Binary and Extreme Quantization for Computer Vision, colocated with ICCV'25. [Paper]
  2. Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, and Bogdan Nicolae. "Breaking the Memory Wall: A Study of I/O Patterns and GPU Memory Utilization for Hybrid CPU-GPU Offloaded Optimizers". FlexScience'24 HPDC-workshop: The 14th Workshop on AI and Scientific Computing at Scale using Flexible Computing Infrastructures, colocated with HPDC'24 (Pisa, Italy, 2024). [Paper]

Posters

  1. Ismail Muradli, Jie Ye, Luke Logan, Anthony Kougkas, and Xian-He Sun. "Insights into GPUDirect Data Transfer through NIXL Benchmarking". eScience'25: The 21st IEEE International eScience Conference (Chicago, IL, 2025). [Poster][Extended Abstract]
  2. Jie Ye, Bogdan Nicolae, Xian-He Sun, and Anthony Kougkas. "Accelerate LLM Inference with Asynchronous Model Offloading". SSDBM'25: The 37th International Conference on Scalable Scientific Data Management (Columbus, Ohio, 2025). [Poster]
  3. Jie Ye, Bogdan Nicolae, Anthony Kougkas, and Xian-He Sun. "Uncover the Overhead and Resource Usage for Handling KV Cache Overflow in LLM Inference". SC'24: The International Conference for High Performance Computing, Networking, Storage, and Analysis (Atlanta, GA, 2024). Nominated for the Best Research Poster Award! [Poster][Extended Abstract]
  4. Jie Ye, Jaime Cernuda, Bogdan Nicolae, Anthony Kougkas, and Xian-He Sun. "A High-Performance I/O Framework for Accelerating DNN Model Updates Within Deep Learning Workflow". SC'23: The International Conference for High Performance Computing, Networking, Storage, and Analysis (Denver, CO, 2023). [Poster][Extended Abstract]
  5. Jie Ye, Anthony Kougkas, and Xian-He Sun. "HDF5 VOL Connector to Apache Arrow". SC'21: The International Conference for High Performance Computing, Networking, Storage, and Analysis (St. Louis, MO, 2021). [Poster][Extended Abstract]

Contact Details

Jie Ye
Ph.D. Student
Department of Computer Science
Illinois Institute of Technology
10 West 35th Street, Chicago, IL 60616
The best way to contact me is by e-mail: jye20@hawk.illinoistech.edu