Experience
Building high-performance, high-reliability network infrastructure for AI at cloud scale.
I work on networking benchmarks, telemetry, resilient transport, and agentic infrastructure systems for large-scale AI and HPC clusters. My work spans production engineering, research collaboration, and practical systems design for cloud-scale infrastructure.

Software Engineer II · Azure HPC Team
Microsoft · Vancouver, Canada / Beijing, China
MRC and Resilient AI Supercomputer Networking
Network Benchmarking for AI/HPC Infrastructure
- Work on networking benchmarks and deployment readiness for high-performance AI clusters.
- Focus on topology-aware benchmarking across NVLink, multi-node NVLink, InfiniBand, and Ethernet fabrics.
- Develop practical reliability and performance signals for large-scale cluster buildout and production readiness.
Agentic Platform for Infrastructure Workflows
- Explore an Agentic Platform for AI infrastructure workflows that connects LLM-driven agents with benchmark selection, infrastructure evaluation, and operational automation.
- Build reliable interfaces between model capabilities and infrastructure engineering tasks.

Software Engineer · Azure HPC Team
Microsoft China · Beijing, China
SuperBench: Benchmarking and Topology-Aware Evaluation
- Worked on the open-source SuperBench benchmarking framework for cloud AI infrastructure.
- Focused on making benchmarking scalable, topology-aware, and useful for production readiness; the related paper received the Best Paper award at USENIX ATC 2024.
Moneo: Telemetry and Performance Observability
- Worked on the open-source Moneo telemetry stack for GPU, InfiniBand, and custom performance signals.
- Helped turn low-level system metrics into actionable signals for anomaly detection and infrastructure optimization.
Education
M.Eng. in Electrical Engineering · GPA 3.91/4.0
Oct. 2019 – Sept. 2021
Supervisor: Prof. Hiroshi Hasegawa
Thesis: Resource Allocation in Elastic Optical Networks via Reinforcement Learning
B.Sc. in Electrical Information and Engineering
Sept. 2015 – July 2019
Supervisor: Dr. Qing Liu
Thesis: An Encoding-Free Genetic Algorithm for Topology Optimization
Skills
Python · C/C++ · Rust · Golang · Shell · Torch · Slurm · InfiniBand · MPI · NCCL · Megatron-LM · Azure OpenAI · LLM Agents