About
Shawn is a technology professional with deep experience in, and a passion for, engineering and algorithms. He was a founding member of the Kuaishou Technology (SEHK:1024) AI Platform and the Kuaishou Seattle AI Lab. At Kuaishou, he directed Engineering with an emphasis on Personalization Infrastructure and led multiple core teams. He developed the first GPU-based large-scale advertising recommendation system at Kuaishou, which generated annual revenue of 6 billion USD. In 2021, he created the world's largest-scale recommender system (ACM SIGKDD 2022). He also invented and developed systems including large-scale storage engines, high-performance deep learning infrastructure, and model compression frameworks. His leadership and technical expertise make him a valuable asset to any team.
In addition to his industry experience, Shawn holds a PhD in Computer Science and Artificial Intelligence from the University of Rochester. During his academic career, he contributed to the field of distributed machine learning, publishing results on the theoretical justification of asynchronous SGD (NeurIPS 2015 spotlight) and the first decentralized SGD with linear speedup (NeurIPS 2017 oral). He received the 30 New Generation Digital Economy Talents (30 位新生代数字经济人才) award from the World Internet Conference and Big Data Digest in 2019.
Shawn's expertise in technology is rooted in a lifelong passion for programming that began even before he started preschool. This broad engineering knowledge, combined with a deep understanding of algorithms, has enabled him to guide teams to successful outcomes on even the most challenging real-world problems. Throughout his career, he has consistently applied his technical know-how and leadership skills to deliver significant results in his field.
Selected Projects
- PERSIA: Kuaishou's GPU-based large-scale learning system for ad
recommendation and CTR prediction tasks. Shawn launched PERSIA in
2018, and it was open-sourced in 2021. Supporting models with up to
100 trillion parameters, PERSIA is the fastest public recommendation
model training framework available. It is built in Rust for
high-performance computing and communication.
- Training Deep Learning-based recommender models with 100 trillion parameters over Google Cloud.
- PERSIA, the largest recommended training system in the history of open source by far.
- Story: 640x Faster GPU-Based Learning System for Ad Recommendation.
- Story: Innovation, Balance, and Big Picture: The Speed of Kwai Commercialization.
(in collaboration with ETH Zurich)
- Bagua: Kuaishou's deep learning training acceleration framework for
large-scale training tasks. Bagua speeds up training through data
loader optimization, advanced distributed training algorithms,
network communication optimization, and more. It was developed to
solve the training bottleneck at Kuaishou Technology, where more
than a million videos are uploaded every hour.
(in collaboration with ETH Zurich)
- Hammer: The automatic deep learning model compression tool developed by Kuaishou Technology. Hammer reduces the size of large models while maintaining their accuracy. At Kuaishou, it has saved thousands of GPU cards and enabled the deployment of hundreds of complex models. Hammer also helped the TAMU-KWAI team win 2nd prize in the IEEE Low-Power Computer Vision Challenge.
- DPSGD/ADPSGD Algorithms: Decentralized training algorithms for
artificial intelligence. Unlike traditional training algorithms,
they are designed for cloud computing environments where network
conditions and machine performance can vary. As demonstrated by IBM,
the DPSGD/ADPSGD algorithms can reduce training time for speech
recognition AI from a week to just 11 hours, while also delivering a
10x improvement in performance.
(in collaboration with IBM Thomas J. Watson Research Center)
- DouZero: A strong game AI for DouDizhu. The corresponding research
paper was accepted at ICML 2021.
(in collaboration with DATA Lab)
- cproxy: A handy tool Shawn created to apply proxies transparently to individual processes using cgroups.
Academic Professional Activities
- Senior Program Committee of AAAI
- Program Committee of ICML, NeurIPS, ICLR, AISTATS, AAAI, and ScaDL
- Journal Reviewer of
- Journal of Machine Learning Research (JMLR)
- Machine Learning
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IEEE Transactions on Information Theory
- IEEE Transactions on Network Science and Engineering
- IEEE Transactions on Neural Networks and Learning Systems
- IEEE Transactions on Knowledge and Data Engineering
- IEEE Transactions on Signal Processing
- IEEE Internet of Things Journal
- Data Mining and Knowledge Discovery
- BIT Numerical Mathematics
- Computational Optimization and Applications
- Optimization Methods and Software
- European Journal of Operational Research
- Journal of Parallel and Distributed Computing
- Pattern Recognition
- Neural Networks
- Parallel Computing
- Neurocomputing
- Measurement
- International Journal of Electrical Power & Energy Systems
- Journal of Optimization Theory and Applications