About me

I'm a second-year Ph.D. student advised by Professor Atlas Wang. I work on Graph Neural Networks, Efficient Learning, Deep Neural Network Sparsity, Mixture of Experts, and Sparse Neural Networks. Sometimes I also dabble in LLMs and Computer Vision.

Outside of research and studies, I enjoy doing human things like reading, photography, and CrossFit. Sometimes you can catch me outside during daylight; be sure to approach slowly, as I am easily spooked.

What I'm doing

  • Research

    Researching and developing state-of-the-art methods in Machine Learning and Artificial Intelligence.

  • Application development

    Working with industry to bring research ideas into working applications.

  • Photography

    I am passionate about photography; I mostly take nature and street photos.

Resume

Education

  1. University of Texas at Austin

    2021 — Present

    Ph.D. Student, Electrical and Computer Engineering

  2. Texas A&M University

    2018 — 2021

    M.S. Student, Computer Science

  3. University of Washington

    2014 — 2018

    B.Eng. Student, Electrical and Computer Engineering

Experience

  1. Apple

    June 2023 — Present

    Research Intern. Supervisor: Dr. Minsik Cho

    Topic: LLM compression and pruning for edge devices.

  2. SRI International

    June 2022 — Aug 2022

    Research Intern. Supervisor: Dr. Subhodev Das

    Topic: Out-of-Context detection in satellite images.

  3. University of Massachusetts, Amherst

    Sept 2021 — June 2022

    Research Collaboration. Supervisor: Prof. Prashant Shenoy

    Topic: Hardware-constrained dynamic inference for Deep Neural Networks.

  4. Tencent America

    March 2020 — Dec 2021

    Research Intern. Supervisor: Dr. Shih-Yao Lin

    Topic: Neural Architecture Search for Human Pose Estimation.

  5. Texas A&M Health Science Center

    Sept 2019 — Dec 2019

    Research Intern. Supervisor: Prof. Jun Wang

    Topic: Software development for neuron detection and visualization.

  6. University of Washington

    Sept 2016 — Dec 2017

    Undergraduate Research

    Topic: Electromagnetic sensing and PCB development.

My skills

  • Python
    100%
  • PyTorch
    90%
  • GNN
    80%
  • Computer Vision
    70%
  • Large Language Models
    60%

Publications

  1. Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph

    ICLR 2023 - Oral [link]

    Pruning at Initialization (PaI) in neural networks, which outperforms random pruning by identifying efficient subnetworks at the start, is gaining attention. However, the performance gap with post-training pruning remains. Our research interprets PaI using the Ramanujan Graph concept, highlighting a lack of correlation between highly sparse, connected networks and PaI's performance. We introduce the Iterative Mean Difference of Bound (IMDB) to relax the upper limit on the largest nontrivial eigenvalue of sparse networks' layers, and the Normalized Random Coefficient (NaRC) as a lower limit. Our analysis with these metrics shows that IMDB-conserving subnetworks perform better, and NaRC identifies regions with highly connected, sparse, non-trivial Ramanujan expanders.
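
    As a back-of-the-envelope illustration of the graph view taken here (and not of the paper's IMDB or NaRC metrics), a pruned layer's binary mask can be read as a bipartite graph whose nontrivial eigenvalues are the mask's singular values, and compared against the classical Ramanujan bound 2*sqrt(d-1) for mean degree d. A minimal NumPy sketch under those assumptions, with a random mask standing in for a real PaI mask:

      import numpy as np

      def ramanujan_margin(mask: np.ndarray) -> float:
          """Margin between the classical Ramanujan bound and the first
          nontrivial eigenvalue of the mask's bipartite graph."""
          m, n = mask.shape
          # The bipartite adjacency's eigenvalues are +/- the mask's singular
          # values, so the largest nontrivial eigenvalue is the second one.
          svals = np.linalg.svd(mask, compute_uv=False)
          lam2 = svals[1]
          d_mean = 2.0 * mask.sum() / (m + n)  # mean degree of the bipartite graph
          bound = 2.0 * np.sqrt(max(d_mean - 1.0, 0.0))
          return bound - lam2                  # > 0: within the Ramanujan bound

      rng = np.random.default_rng(0)
      mask = (rng.random((64, 64)) < 0.1).astype(float)  # ~90% sparse layer
      print(f"Ramanujan margin: {ramanujan_margin(mask):.3f}")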

  2. AutoMARS: Searching to Compress Multi-Modality Recommendation Systems

    CIKM 2022 - Poster [link]

    Web applications use Recommendation Systems (RS) to manage consumer over-choice, with multi-modality inputs (e.g., user interaction, images, texts, rating scores) boosting performance. However, these demand more computational resources and storage. Real-world RS must balance time, space, and user experience budgets. Therefore, efficiently compressing multi-modality RS is vital. This paper introduces a compression method for multi-modality RS called Auto Multi-modAlity Recommendation System (AutoMARS). It uses neural architecture search and distillation to allocate resources based on the importance of input data in preserving recommendation efficacy. AutoMARS outperformed previous compression methods in tests on three different Amazon datasets, for instance, achieving a 20% accuracy increase and a 65% reduction over baselines on the Amazon Beauty dataset.
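
    The distillation half of the recipe is the standard teacher-student setup; below is a minimal sketch of a temperature-softened KL distillation loss in PyTorch. The tensor shapes, temperature value, and the idea of treating recommendation scores as logits are illustrative assumptions, not AutoMARS's actual training code:

      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, T: float = 4.0):
          """Soften both distributions with temperature T and match them with KL."""
          p_teacher = F.softmax(teacher_logits / T, dim=-1)
          log_p_student = F.log_softmax(student_logits / T, dim=-1)
          # T^2 keeps gradient magnitudes comparable across temperatures.
          return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

      student = torch.randn(8, 100)  # e.g. scores from a compressed recommender
      teacher = torch.randn(8, 100)  # scores from the full multi-modality model
      print(distillation_loss(student, teacher))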

  3. AutoCoG: A Unified Data-Model Co-Search Framework for Graph Neural Networks

    AutoML 2022 - Poster [link]

    Neural architecture search (NAS) has demonstrated success in discovering promising architectures for vision or language modeling tasks, and it has recently been introduced to searching for graph neural networks (GNNs) as well. Despite the preliminary success, GNNs struggle in dealing with heterophily or low-homophily graphs where connected nodes may have different class labels and dissimilar features. To this end, we propose co-optimizing both the input graph topology and the model’s architecture topology simultaneously. That yields AutoCoG, the first unified data-model co-search NAS framework for GNNs. By defining a highly flexible data-model co-search space, AutoCoG is gracefully formulated as a principled bi-level optimization that can be end-to-end solved by the differentiable search methods. Experiments show AutoCoG achieves gains of up to 4% for Actor, 7.3% on average for Web datasets, 0.17% for CoAuthor-CS, and finally 5.4% for Wikipedia-Photo benchmarks.
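
    To picture the bi-level formulation, here is a deliberately tiny DARTS-style alternating update: inner steps fit model weights on a training split, outer steps update architecture logits on a validation split. The toy model, operator set, and optimizers are assumptions for illustration; AutoCoG's co-search space, which also rewires the input graph, is not reproduced:

      import torch

      # Toy model: a softmax-weighted mix of candidate operations.
      ops = [torch.nn.Identity(), torch.nn.ReLU(), torch.nn.Tanh(), torch.nn.Sigmoid()]
      w = torch.nn.Parameter(torch.randn(16))            # model weights (inner level)
      alpha = torch.nn.Parameter(torch.zeros(len(ops)))  # architecture logits (outer level)

      def forward(x):
          mix = torch.softmax(alpha, dim=0)  # differentiable operator selection
          return sum(m * op(w * x) for m, op in zip(mix, ops))

      opt_w = torch.optim.SGD([w], lr=0.05)
      opt_a = torch.optim.Adam([alpha], lr=0.01)

      for step in range(200):
          x_tr, y_tr = torch.randn(16), torch.randn(16)
          x_va, y_va = torch.randn(16), torch.randn(16)
          # Inner level: update weights on the training split.
          opt_w.zero_grad()
          ((forward(x_tr) - y_tr) ** 2).mean().backward()
          opt_w.step()
          # Outer level: update architecture logits on the validation split.
          opt_a.zero_grad()
          ((forward(x_va) - y_va) ** 2).mean().backward()
          opt_a.step()

      print(torch.softmax(alpha, dim=0))  # learned operator preferences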

  4. MM-Hand: 3D-Aware Multi-Modal Guided Hand Generation for 3D Hand Pose Synthesis

    MM 2020 - Poster [link]

    Estimating the 3D hand pose from a monocular RGB image is important but challenging. A solution is training on large-scale RGB hand images with accurate 3D hand keypoint annotations. However, it is too expensive in practice. Instead, we develop a learning-based approach to synthesize realistic, diverse, and 3D pose-preserving hand images under the guidance of 3D pose information. We propose a 3D-aware multi-modal guided hand generative network (MM-Hand), together with a novel geometry-based curriculum learning strategy. Our extensive experimental results demonstrate that the 3D-annotated images generated by MM-Hand qualitatively and quantitatively outperform existing options. Moreover, the augmented data can consistently improve the quantitative performance of the state-of-the-art 3D hand pose estimators on two benchmark datasets. The code will be available at https://github.com/ScottHoang/mm-hand.

  5. 3M-Pose: Multi-Resolution, Multi-Path and Multi-Output Neural Architecture Search for Bottom-Up Pose Prediction

    Master's thesis [link]

    Human pose estimation is a challenging computer vision task and often hinges on carefully handcrafted architectures. This paper aims to be the first to apply Neural Architecture Search (NAS) to automatically design a bottom-up, one-stage human pose estimation model with significantly lower computational costs and smaller model size than existing bottom-up approaches. Our framework, dubbed 3M-Pose, co-searches and co-trains with the novel building block of Early Escape Layers (EELs), producing native modular architectures that are optimized to support dynamic inference for even lower average computational cost. To flexibly explore the fine-grained spectrum between performance and computational budget, we propose Dynamic Ensemble Gumbel Softmax (Dyn-EGS), a novel approach to sampling micro and macro search spaces that allows varying numbers of operators and inputs to be individually selected for each cell. We additionally enforce a computational constraint with student-teacher guidance to avoid the trivial search collapse caused by the pursuit of lightweight models. Experiments demonstrate that 3M-Pose finds models of drastically superior speed and efficiency compared to existing works, reducing computational costs by up to 93% and parameter size by up to 75% at the cost of a minor loss in performance.
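
    For intuition, the Gumbel-Softmax trick underlying the sampler draws a discrete operator choice in the forward pass while keeping a usable gradient in the backward pass. A minimal PyTorch sketch of a single draw; Dyn-EGS itself samples a varying number of operators and inputs per cell, which one draw does not capture:

      import torch
      import torch.nn.functional as F

      logits = torch.zeros(5, requires_grad=True)  # 5 candidate operators, uniform prior
      # hard=True returns a one-hot sample in the forward pass while the
      # backward pass uses the soft relaxation (straight-through estimator).
      choice = F.gumbel_softmax(logits, tau=1.0, hard=True)
      print(choice)  # e.g. tensor([0., 0., 1., 0., 0.], grad_fn=...)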

Blog

Contact
