Experience


A tiny Research Scientist and Engineering Director at ByteDance

Work Experience


Feb. 2025–Present

ByteDance Seed

Principal-level Research Scientist & Engineering Director

  • Lead research and engineering strategy for AI compilers, model deployment, and distributed training and inference acceleration.
  • Guide MLIR-based compiler initiatives (Triton-distributed), GPU/ASIC acceleration, and large-scale training and inference systems.
  • Mentor cross-disciplinary projects in programming languages, computer architecture, and system acceleration.

May 2023–Feb. 2025

ByteDance Seed/AML

Senior Staff-level Research Scientist & Engineering Manager

  • Led an MLIR-based compiler team targeting CPU, GPU, edge, AI ASICs, and distributed systems; drove SW/HW co-design for multiple generations of in-house AI ASICs and DPUs. Directed ByteIR in collaboration with OpenXLA, Torch-MLIR, and ONNX-MLIR.
  • Managed production deployment of thousands of models across tens of thousands of in-house AI ASICs.
  • Led a heterogeneous computing team optimizing training, inference, and deployment across commercial DLAs; oversaw benchmarking via ByteMLPerf.
  • Directed GPU acceleration initiatives, including FLUX and an AMD GPU acceleration effort.
  • Led Triton projects for in-house AI ASICs and distributed systems (Triton-distributed).
  • Supervised distributed LLM training and inference acceleration, including veScale and ShadowKV.
  • Initiated a large-scale model performance estimation program.
  • Mentored projects in programming languages, computer architecture, and acceleration.

Jun. 2021–May 2023

ByteDance AML

Staff-level Research Scientist

  • Founded and led a model compilation team for an MLIR-based AI compiler across CPUs, GPUs, and AI ASICs, including SW/HW co-design for AI ASICs; launched ByteIR.
  • Led a production team deploying numerous models on in-house AI ASICs.

Sept. 2018–Jun. 2021

Microsoft Cloud & AI

Senior Software Engineer

  • Led an AI compilation team building an LLVM/MLIR-based compiler for Maia 100, GPUs, and CPUs within ONNX Runtime. Led the Argo and Nuphar projects, delivering up to 3x cost reduction and 10x faster model release cycles for Azure Cognitive Services.
  • Developed performance optimizations and a programming model for Maia 100.
  • Co-designed pipeline parallelism for ONNX Runtime training.

July 2017–Aug. 2018

Microsoft AI & Research

Software Engineer

  • Co-led the AI compiler and runtime for CNTK models, achieving 10x+ speedups over native CNTK.

Fall 2009–Summer 2017

IMPACT Lab., UIUC

Supervised by Dr. Wen-mei W. Hwu

Research Assistant

  • Researched high-performance GPU computing, compiler optimization, and computer architecture.
  • Built a performance-portable programming system for CPUs, GPUs, FPGAs, and clusters, spanning language, compiler, and runtime design.
  • Contributed to benchmark suites for hardware characterization and optimization, including Parboil, SPEC ACCEL, and Chai.
  • Developed GPU optimization techniques for dynamic parallelism, data transformations, reductions, and dense/sparse BLAS; contributed the first GPU pivoting tridiagonal solver for NVIDIA cuSPARSE and multi-dimensional EMD for signal processing.
  • Analyzed cache sensitivity and implemented cache protection, bypassing, and thread throttling to improve throughput.

Summer 2012

NVIDIA

Intern

  • Implemented real-time image inpainting optimizations for Tegra 3 to improve graphics performance.

Feb. 2008–July 2009

Ultrasonic Imaging Lab., NTU

Supervised by Dr. Pai-Chi Li

Full-time Research Assistant & Engineer at a Stealth-Mode Startup

  • Designed and developed an embedded heterogeneous system for high-frequency real-time ultrasonic imaging using FPGA and GPU; the system was commercialized by a startup.

Sept. 2004–Jun. 2006

MPAC Lab., NTU

Supervised by Dr. Homer H. Chen

Undergraduate Research Assistant

  • Conducted early research on the rolling shutter effect in CMOS cameras, developing analysis and compensation techniques.
  • Explored light-field camera design and its visual effects.

Education


Jul. 2017

University of Illinois at Urbana-Champaign (UIUC), IL

Ph.D. in Electrical and Computer Engineering (ECE)

  • Advisor: Dr. Wen-mei Hwu.

Aug. 2014

University of Illinois at Urbana-Champaign (UIUC), IL

M.S. in Electrical and Computer Engineering (ECE)

  • Advisor: Dr. Wen-mei Hwu.

Jun. 2007

National Taiwan University (NTU), Taipei, Taiwan

B.S. in Electrical Engineering (EE)

  • Minor in Mathematics.

Aug. 2006–May 2007

UIUC

Visiting Student in ECE

Honors & Awards


2012–2013

ECE, UIUC

Dan Vivoli Endowed Fellowship

2009–2011

NSF, USA

Integrative Graduate Education and Research Traineeship (IGERT): Neuroengineering

2006–2007

Taiwan

Taiwan Merit Scholarship

2005

National Science Council, Taiwan

Undergraduate Student Research Fellowship

2005

Pan Wen-Yuan Foundation, Taiwan

Pan Wen-Yuan Scholarship

2002–2007, 3 times

NTU, Taiwan

Presidential Awards

2001

APMO

Gold Medal in the Asian Pacific Mathematics Olympiad
