Data Attribution for Segmentation Models
Investigated data attribution for pretrained image segmentation models. By extending TRAK, we can curate a subset of a large training set that is up to 3x smaller yet still improves model performance, and we can identify mislabeled examples in the training data (selection step sketched below).
Presented at NeurIPS ATTRIB Workshop 2023.
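A minimal sketch of the selection step, assuming per-example attribution scores have already been aggregated from a TRAK-style attributor; `curate_training_set`, `keep_fraction`, and the negative-score flagging rule are illustrative, not the project's exact pipeline:

```python
import numpy as np

def curate_training_set(scores: np.ndarray, keep_fraction: float = 0.33):
    """Select the most helpful training examples by attribution score.

    scores: (n_train,) aggregate attribution of each training example
            toward validation performance (higher = more helpful).
    Returns indices to keep and indices flagged as likely mislabeled.
    """
    order = np.argsort(scores)[::-1]           # most helpful first
    n_keep = int(len(scores) * keep_fraction)  # e.g. a ~3x smaller subset
    keep_idx = order[:n_keep]
    # Examples with negative aggregate influence are candidates for label
    # inspection: they consistently hurt validation predictions.
    flagged_idx = np.where(scores < 0)[0]
    return keep_idx, flagged_idx

# Example: keep the top third of a 90k-image dataset and flag suspects.
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.1, scale=1.0, size=90_000)
keep_idx, flagged_idx = curate_training_set(scores)
print(len(keep_idx), len(flagged_idx))
```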
Affine Transformations for Outlier-Resilient Post-Training Quantization on Diffusion Transformers
Final project for 6.5940 (Efficient Deep Learning Computing), Fall 2024 at MIT. We adapted FlatQuant, a post-training quantization method for LLMs that learns affine transformations to mitigate outliers in weights and activations, for diffusion transformers. When quantizing from fp16 to w8a8/w6a6, we achieved image quality comparable to the fp16 baseline on key metrics (FID, CLIP score, LPIPS).
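The core FlatQuant idea, roughly: rewrite each linear layer y = x W^T as y = (x A)(A^-1 W^T) with a learned invertible A, so the quantizer sees flatter tensors with fewer outliers. A minimal sketch under that assumption follows; the plain dense A and per-tensor min-max quantizer are simplifications of the paper's structured parameterization:

```python
import torch
import torch.nn as nn

def fake_quant(t: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor fake quantization: round to an int grid and back.
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max().clamp(min=1e-8) / qmax
    return (t / scale).round().clamp(-qmax, qmax) * scale

class AffineQuantLinear(nn.Module):
    """Linear layer quantized through a learned invertible transform A.

    y = x W^T is rewritten as y = (x A)(A^-1 W^T); only A is learned, on a
    small calibration set, so that x A and W A^-T have fewer outliers and
    quantize with less error. (Learning A through the rounding op would use
    a straight-through estimator; omitted here for brevity.)
    """
    def __init__(self, linear: nn.Linear, bits: int = 8):
        super().__init__()
        d = linear.in_features
        # Pretrained weights stay frozen during post-training quantization.
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = (None if linear.bias is None
                     else nn.Parameter(linear.bias.detach(), requires_grad=False))
        self.bits = bits
        # Initialize A near the identity; a structured parameterization
        # would keep it invertible in practice (plain matrix shown here).
        self.A = nn.Parameter(torch.eye(d) + 1e-3 * torch.randn(d, d))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        A_inv = torch.linalg.inv(self.A)
        x_q = fake_quant(x @ self.A, self.bits)             # quantized activations
        w_q = fake_quant(self.weight @ A_inv.T, self.bits)  # quantized weights
        y = x_q @ w_q.T                                     # = (x A)(A^-1 W^T)
        return y if self.bias is None else y + self.bias
```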
Guidance for Diffusion Language Models
Final project for 6.7960 (Deep Learning), Fall 2024 at MIT. We explored training-free guidance methods for discrete diffusion language models, adapting ideas from both diffusion modeling (autoguidance) and language modeling (contrastive decoding) to improve generation quality. We achieved lower perplexity than baseline diffusion LMs on WikiText2, without any additional training.
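A minimal sketch of the guidance rule, blending autoguidance-style extrapolation with contrastive decoding: at each reverse-diffusion step, logits from the full model are pushed away from those of a deliberately weaker guide model. The function names and guidance weight `w` here are illustrative assumptions, not the project's exact formulation:

```python
import torch

def guided_logits(strong_logits: torch.Tensor,
                  weak_logits: torch.Tensor,
                  w: float = 1.5) -> torch.Tensor:
    """Combine strong- and weak-model logits, contrastive-decoding style.

    strong_logits, weak_logits: (batch, seq, vocab) token logits from the
    full model and a weaker guide (smaller model or earlier checkpoint).
    w > 1 extrapolates away from the weak model's errors; w = 1 recovers
    the strong model unchanged.
    """
    return w * strong_logits + (1.0 - w) * weak_logits

def sample_step(strong_logits, weak_logits, w=1.5, temperature=1.0):
    # Each reverse-diffusion step samples tokens from the guided
    # distribution instead of the raw model distribution.
    logits = guided_logits(strong_logits, weak_logits, w) / temperature
    probs = torch.softmax(logits, dim=-1)
    flat = torch.multinomial(probs.flatten(0, -2), num_samples=1)
    return flat.view(probs.shape[:-1])  # (batch, seq) token ids
```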
zyzx
A natural language shell built in Zig, using a custom Mixtral 8x7B model fine-tuned on bash commands. Built at TreeHacks 2024.
Sakhi
A language learning tool built around practicing natural conversations with a chatbot, which can be personalized to any scenario. Built at LAHacks 2023.
StitchIt
A collaborative journaling app you can use to compile memories with your friends, supporting text, images, and audio. Built for web.lab 2023.