I am Tuan, a postdoctoral fellow working on developing foundation models for scientific discovery, especially for proteins and human genetics. I am jointly supervised by Prof. Vasilis Ntranos at UCSF and the Data Sciences group at Maze Therapeutics.
Education: I obtained my Ph.D. in Computer Sciences (minor in Statistics) with Prof. Kangwook Lee, studying modular neural networks built on pre-trained models. Previously, I completed my M.S. with Prof. Vikas Singh, studying GANs for graph-structured data, and my B.E. with Prof. Tru Cao, studying AI systems for disease forecasting and healthcare.
My research interests are in AI/ML and AI4Science. My current focus is on language models and modular deep learning, with applied research in computational biology.
Venue | Topic | Title | Summary | GitHub
---|---|---|---|---
NeurIPS'23 | LLM | Large Language Models of Code Fail at Completing Code with Potential Bugs | summary | code
TL;DR: | LLMs may fail drastically at completing functional code when potential bugs (a.k.a. anti-flow patterns) exist in the context. | | |
EMNLP'22 (Findings) | Multimodal | Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment | summary | code
TL;DR: | Text-image correlation (via CLIP embeddings) can be efficiently combined with static embeddings for robust word translation. | | |
NeurIPS'22 | LLM | LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks | summary | code
TL;DR: | Pretrained LLMs, via a language interface, can be useful for learning non-language tasks, e.g., tabular data classification. | | |
ICMLW'22 | GAN, PEFT | Improved Input Reprogramming for GAN Conditioning | summary | code
TL;DR: | Pretrained GANs can be efficiently repurposed (without modification) to conditionally generate samples within their support. | | |
ICML'21 (Oral) | MLSys, GAN | Coded-InvNet for Resilient Prediction Serving Systems | summary | code
TL;DR: | Coded-InvNet combines coded computation with image-to-image translation to improve the resilience of ML serving systems. | | |
TPAMI'20 | GAN, Medical Imaging | Performing Group Difference Testing on Graph Structured Data from GANs: Analysis and Applications in Neuroimaging | | code
TL;DR: | Analyzing when GAN-generated data yields conclusions similar to those drawn from the training data in scientific or biomedical studies. | | |
AAAI'20 (Oral) | Optimization, GAN | The Promise of Conditional Gradient Methods for Training Deep Models | | code
TL;DR: | Conditional gradient methods can be used to train deep networks faster, with provably better generalization guarantees. | | |
Venue | Topic | Title | Summary | GitHub
---|---|---|---|---
MobiCom'22 | Healthcare | PROS: an Efficient Pattern-Driven Compressive Sensing Framework for Low-Power Biopotential-based Wearables with On-chip Intelligence | | code
MobiSys'21 | Healthcare | WAKE: A Behind-the-ear Wearable System for Microsleep Detection | |
IEEE TMC'21 | Healthcare | Detection of Microsleep Events with a Behind-the-ear Wearable System | |
Oxford Journal'18 | Epidemiology | Forecasting Dengue Incidences: Statistical and Dynamic Models | |
CTAD'17 | Medical Imaging | Graph Imputation techniques for estimating amyloid positivity from longitudinal cognitive and MRI measurements for efficient secondary prevention trials | |
ACIIDS'16 (Oral) | Epidemiology | Forecasting the Magnitude of Dengue in Southern Vietnam | |
Patent | Topic | Title
---|---|---
US 11087525 | AI Framework, Inverse Graphics | Unsupervised learning of three dimensional visual alphabet |
US 16186121 | Algorithm, Training Framework | Training System for Artificial Neural Networks Having a Global Weight Constrainer |