About me

I'm a second-year Ph.D. student advised by Professor Atlas Wang. I work on Graph Neural Networks, Efficient Learning, Deep Neural Network Sparsity, Mixture of Experts, and Sparse Neural Networks. Sometimes I also dabble in LLMs and Computer Vision.

Outside of research and studies, I enjoy doing human things like reading, photography, and CrossFit. Sometimes you can catch me outside during daylight; be sure to approach slowly, as I am easily spooked.

What I'm doing

  • Research

    Researching and developing state-of-the-art methods in Machine Learning and Artificial Intelligence.

  • Application development

    Working with industry to bring research ideas into working applications.

  • Photography

    I am passionate about photography; I mostly take nature and street photos.

Resume

Education

  1. University of Texas at Austin

    2021 — Present

    Ph.D. Student, Electrical and Computer Engineering

  2. Texas A&M University

    2018 — 2021

    M.S. Student, Computer Science

  3. University of Washington

    2014 — 2018

    B.Eng. Student, Electrical and Computer Engineering

Experience

  1. Apple

    June 2023 — Present

    Research Intern. Supervisor: Dr. Minsik Cho

    Topic: LLM compression and pruning for edge devices.

  2. SRI International

    June 2022 — Aug 2022

    Research Intern. Supervisor: Dr. Subhodev Das

    Topic: Out-of-Context detection in satellite images.

  3. University of Massachusetts, Amherst

    Sept 2021 — June 2022

    Research Collaboration. Supervisor: Prof. Prashant Shenoy

    Topic: Hardware-constrained dynamic inference for Deep Neural Networks.

  4. Tencent America

    March 2020 — Dec 2021

    Research Intern. Supervisor: Dr. Shih-Yao Lin

    Topic: Neural Architecture Search for Human Pose Estimation.

  5. Texas A&M Health Science Center

    Sept 2019 — Dec 2019

    Research Intern. Supervisor: Prof. Jun Wang

    Topic: Software development for neuron detection and visualization.

  6. University of Washington

    Sept 2016 — Dec 2017

    Undergraduate Research

    Topic: Electromagnetic sensing and PCB development.

My skills

  • Python
    100%
  • PyTorch
    90%
  • GNN
    80%
  • Computer Vision
    70%
  • Large Language Models
    60%

Publications

  1. Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph

    ICLR 2023 - Oral [link]

    Pruning at Initialization (PaI) in neural networks, which outperforms random pruning by identifying efficient subnetworks at the start, is gaining attention. However, the performance gap with post-training pruning remains. Our research interprets PaI using the Ramanujan Graph concept, highlighting a lack of correlation between highly sparse, connected networks and PaI's performance. We introduce the Iterative Mean Difference of Bound (IMDB) to relax the upper limit on the largest nontrivial eigenvalue of sparse networks' layers, and the Normalized Random Coefficient (NaRC) as a lower limit. Our analysis with these metrics shows that IMDB-conserving subnetworks perform better, and NaRC identifies regions with highly connected, sparse, non-trivial Ramanujan expanders.
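
    As a back-of-the-envelope illustration of the graph view taken here (and not of the paper's IMDB or NaRC metrics), a pruned layer's binary mask can be read as a bipartite graph whose nontrivial eigenvalues are the mask's singular values, and compared against the classical Ramanujan bound 2*sqrt(d-1) for mean degree d. A minimal NumPy sketch under those assumptions, with a random mask standing in for a real PaI mask:

      import numpy as np

      def ramanujan_margin(mask: np.ndarray) -> float:
          """Margin between the classical Ramanujan bound and the first
          nontrivial eigenvalue of the mask's bipartite graph."""
          m, n = mask.shape
          # The bipartite adjacency's eigenvalues are +/- the mask's singular
          # values, so the largest nontrivial eigenvalue is the second one.
          svals = np.linalg.svd(mask, compute_uv=False)
          lam2 = svals[1]
          d_mean = 2.0 * mask.sum() / (m + n)  # mean degree of the bipartite graph
          bound = 2.0 * np.sqrt(max(d_mean - 1.0, 0.0))
          return bound - lam2                  # > 0: within the Ramanujan bound

      rng = np.random.default_rng(0)
      mask = (rng.random((64, 64)) < 0.1).astype(float)  # ~90% sparse layer
      print(f"Ramanujan margin: {ramanujan_margin(mask):.3f}")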

  2. AutoMARS: Searching to Compress Multi-Modality Recommendation Systems

    CIKM 2022 - Poster [link]

    Web applications use Recommendation Systems (RS) to manage consumer over-choice, with multi-modality inputs (e.g., user interaction, images, texts, rating scores) boosting performance. However, these demand more computational resources and storage. Real-world RS must balance time, space, and user experience budgets. Therefore, efficiently compressing multi-modality RS is vital. This paper introduces a compression method for multi-modality RS called Auto Multi-modAlity Recommendation System (AutoMARS). It uses neural architecture search and distillation to allocate resources based on the importance of input data in preserving recommendation efficacy. AutoMARS outperformed previous compression methods in tests on three different Amazon datasets, for instance, achieving a 20% accuracy increase and a 65% reduction over baselines on the Amazon Beauty dataset.
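
    The distillation half of the recipe is the standard teacher-student setup; below is a minimal sketch of a temperature-softened KL distillation loss in PyTorch. The tensor shapes, temperature value, and the idea of treating recommendation scores as logits are illustrative assumptions, not AutoMARS's actual training code:

      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, T: float = 4.0):
          """Soften both distributions with temperature T and match them with KL."""
          p_teacher = F.softmax(teacher_logits / T, dim=-1)
          log_p_student = F.log_softmax(student_logits / T, dim=-1)
          # T^2 keeps gradient magnitudes comparable across temperatures.
          return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

      student = torch.randn(8, 100)  # e.g. scores from a compressed recommender
      teacher = torch.randn(8, 100)  # scores from the full multi-modality model
      print(distillation_loss(student, teacher))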

  3. AutoCoG: A Unified Data-Model Co-Search Framework for Graph Neural Networks

    AutoML 2022 - Poster [link]

    Neural architecture search (NAS) has demonstrated success in discovering promising architectures for vision or language modeling tasks, and it has recently been introduced to searching for graph neural networks (GNNs) as well. Despite the preliminary success, GNNs struggle in dealing with heterophily or low-homophily graphs where connected nodes may have different class labels and dissimilar features. To this end, we propose co-optimizing both the input graph topology and the model’s architecture topology simultaneously. That yields AutoCoG, the first unified data-model co-search NAS framework for GNNs. By defining a highly flexible data-model co-search space, AutoCoG is gracefully formulated as a principled bi-level optimization that can be end-to-end solved by the differentiable search methods. Experiments show AutoCoG achieves gains of up to 4% for Actor, 7.3% on average for Web datasets, 0.17% for CoAuthor-CS, and finally 5.4% for Wikipedia-Photo benchmarks.
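
    To picture the bi-level formulation, here is a deliberately tiny DARTS-style alternating update: inner steps fit model weights on a training split, outer steps update architecture logits on a validation split. The toy model, operator set, and optimizers are assumptions for illustration; AutoCoG's co-search space, which also rewires the input graph, is not reproduced:

      import torch

      # Toy model: a softmax-weighted mix of candidate operations.
      ops = [torch.nn.Identity(), torch.nn.ReLU(), torch.nn.Tanh(), torch.nn.Sigmoid()]
      w = torch.nn.Parameter(torch.randn(16))            # model weights (inner level)
      alpha = torch.nn.Parameter(torch.zeros(len(ops)))  # architecture logits (outer level)

      def forward(x):
          mix = torch.softmax(alpha, dim=0)  # differentiable operator selection
          return sum(m * op(w * x) for m, op in zip(mix, ops))

      opt_w = torch.optim.SGD([w], lr=0.05)
      opt_a = torch.optim.Adam([alpha], lr=0.01)

      for step in range(200):
          x_tr, y_tr = torch.randn(16), torch.randn(16)
          x_va, y_va = torch.randn(16), torch.randn(16)
          # Inner level: update weights on the training split.
          opt_w.zero_grad()
          ((forward(x_tr) - y_tr) ** 2).mean().backward()
          opt_w.step()
          # Outer level: update architecture logits on the validation split.
          opt_a.zero_grad()
          ((forward(x_va) - y_va) ** 2).mean().backward()
          opt_a.step()

      print(torch.softmax(alpha, dim=0))  # learned operator preferences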

  4. MM-Hand: 3D-Aware Multi-Modal Guided Hand Generation for 3D Hand Pose Synthesis

    MM 2020 - Poster [link]

    Estimating the 3D hand pose from a monocular RGB image is important but challenging. A solution is training on large-scale RGB hand images with accurate 3D hand keypoint annotations. However, it is too expensive in practice. Instead, we develop a learning-based approach to synthesize realistic, diverse, and 3D pose-preserving hand images under the guidance of 3D pose information. We propose a 3D-aware multi-modal guided hand generative network (MM-Hand), together with a novel geometry-based curriculum learning strategy. Our extensive experimental results demonstrate that the 3D-annotated images generated by MM-Hand qualitatively and quantitatively outperform existing options. Moreover, the augmented data can consistently improve the quantitative performance of the state-of-the-art 3D hand pose estimators on two benchmark datasets. The code will be available at https://github.com/ScottHoang/mm-hand.

  5. 3M-Pose: Multi-Resolution, Multi-Path and Multi-Output Neural Architecture Search for Bottom-Up Pose Prediction

    Master's thesis [link]

    Human pose estimation is a challenging computer vision task and often hinges on carefully handcrafted architectures. This paper aims to be the first to apply Neural Architecture Search (NAS) to automatically design a bottom-up, one-stage human pose estimation model with significantly lower computational costs and smaller model size than existing bottom-up approaches. Our framework, dubbed 3M-Pose, co-searches and co-trains with the novel building block of Early Escape Layers (EELs), producing native modular architectures that are optimized to support dynamic inference for even lower average computational cost. To flexibly explore the fine-grained spectrum between performance and computational budget, we propose Dynamic Ensemble Gumbel Softmax (Dyn-EGS), a novel approach to sampling micro and macro search spaces that allows varying numbers of operators and inputs to be individually selected for each cell. We additionally enforce a computational constraint with student-teacher guidance to avoid the trivial search collapse caused by the pursuit of lightweight models. Experiments demonstrate that 3M-Pose finds models of drastically superior speed and efficiency compared to existing works, reducing computational costs by up to 93% and parameter size by up to 75% at the cost of a minor loss in performance.
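
    For intuition, the Gumbel-Softmax trick underlying the sampler draws a discrete operator choice in the forward pass while keeping a usable gradient in the backward pass. A minimal PyTorch sketch of a single draw; Dyn-EGS itself samples a varying number of operators and inputs per cell, which one draw does not capture:

      import torch
      import torch.nn.functional as F

      logits = torch.zeros(5, requires_grad=True)  # 5 candidate operators, uniform prior
      # hard=True returns a one-hot sample in the forward pass while the
      # backward pass uses the soft relaxation (straight-through estimator).
      choice = F.gumbel_softmax(logits, tau=1.0, hard=True)
      print(choice)  # e.g. tensor([0., 0., 1., 0., 0.], grad_fn=...)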

Blog

Contact
