Investigated data attribution for pretrained image segmentation models. By extending TRAK, we can curate a training subset up to 3x smaller than the original dataset that improves model performance, and identify mislabeled examples in the training data.
Presented at NeurIPS ATTRIB Workshop 2023.
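For illustration, here is a minimal sketch of how TRAK-style attribution scores could drive subset curation and mislabel detection. The `scores` matrix is random stand-in data, and the two ranking heuristics (total positive influence for curation, strongly negative mean influence for suspected label noise) are illustrative, not the exact criteria used in the project.

```python
import numpy as np

# Hypothetical attribution matrix: scores[i, j] estimates the influence of
# training example i on validation example j (e.g. from a TRAK-style scorer).
rng = np.random.default_rng(0)
num_train, num_val = 10_000, 500
scores = rng.normal(size=(num_train, num_val))  # random stand-in data

# Curate a smaller training subset: rank examples by their total positive
# contribution to validation predictions and keep roughly the top third.
total_influence = scores.clip(min=0).sum(axis=1)
k = num_train // 3
curated_idx = np.argsort(total_influence)[-k:]

# Flag potentially mislabeled examples: training points whose influence is
# strongly negative on average are candidates for label noise
# (an illustrative heuristic, not the project's exact criterion).
mean_influence = scores.mean(axis=1)
suspect_idx = np.argsort(mean_influence)[:50]

print(f"curated subset size: {curated_idx.size} / {num_train}")
print(f"flagged {suspect_idx.size} possibly mislabeled examples")
```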
Final project for 6.5940 (Efficient Deep Learning Computing), Fall 2024 at MIT. We adapted FlatQuant, a post-training quantization method for LLMs which learns affine transformations to mitigate outliers in weights and activations, for diffusion transformers. We achieved image quality comparable to the fp16 baseline on key metrics (FID, CLIP score, LPIPS) when quantizing to w8a8/w6a6.
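As a rough sketch of the core idea we adapted: wrap a linear layer's matmul in an invertible transform so the full-precision product is unchanged, then quantize the transformed (flatter) operands. Here a random orthogonal matrix stands in for FlatQuant's learned affine transform, and `fake_quantize` is a simple symmetric per-tensor quantizer mimicking w8a8, not the project's actual kernels.

```python
import torch

def fake_quantize(t: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor fake quantization: round to the integer grid,
    # then dequantize (a stand-in for a real w8a8 kernel).
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max() / qmax
    return torch.round(t / scale).clamp(-qmax, qmax) * scale

torch.manual_seed(0)
batch, d_in, d_out = 16, 64, 128
X = torch.randn(batch, d_in)
X[:, 3] *= 25.0                      # inject an activation outlier channel
W = torch.randn(d_out, d_in)

# Invertible transform P inserted around the matmul:
#   Y = X W^T = (X P)(P^{-1} W^T)
# FlatQuant learns this transform; a random orthogonal matrix stands in here.
P = torch.linalg.qr(torch.randn(d_in, d_in)).Q
P_inv = torch.linalg.inv(P)

# Quantize the raw operands vs. the transformed operands.
Y_fp = X @ W.T
Y_naive = fake_quantize(X) @ fake_quantize(W).T
Y_flat = fake_quantize(X @ P) @ fake_quantize(W @ P_inv.T).T

print("naive w8a8 error:      ", (Y_naive - Y_fp).abs().mean().item())
print("transformed w8a8 error:", (Y_flat - Y_fp).abs().mean().item())
```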
Final project for 6.7960 (Deep Learning), Fall 2024 at MIT. We explored training-free guidance methods for discrete diffusion language models, adapting ideas from both diffusion (autoguidance) and language modeling (contrastive decoding) to improve generation quality. We achieved improvements in perplexity over baseline diffusion LMs on WikiText2, without any additional training.
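A toy sketch of the kind of per-step logit combination we experimented with: autoguidance-style extrapolation away from a weaker model's prediction, combined with a contrastive-decoding-style plausibility mask. The guidance weight `w`, cutoff `plaus_alpha`, and the random logits are illustrative assumptions; the project's exact formulation differed.

```python
import torch
import torch.nn.functional as F

def guided_logits(logits_strong: torch.Tensor,
                  logits_weak: torch.Tensor,
                  w: float = 1.5,
                  plaus_alpha: float = 0.1) -> torch.Tensor:
    # Autoguidance-style extrapolation away from the weaker model's prediction,
    # applied to the logits of one denoising step of a discrete diffusion LM.
    guided = logits_weak + w * (logits_strong - logits_weak)

    # Contrastive-decoding-style plausibility mask: only keep tokens to which
    # the strong model assigns at least plaus_alpha * its max probability.
    probs_strong = F.softmax(logits_strong, dim=-1)
    cutoff = plaus_alpha * probs_strong.max(dim=-1, keepdim=True).values
    return guided.masked_fill(probs_strong < cutoff, float("-inf"))

# Toy usage on a single denoising step: [batch, seq_len, vocab] logits.
torch.manual_seed(0)
batch, seq_len, vocab = 2, 8, 100
logits_strong = torch.randn(batch, seq_len, vocab)
logits_weak = torch.randn(batch, seq_len, vocab)

sampled = torch.distributions.Categorical(
    logits=guided_logits(logits_strong, logits_weak)).sample()
print(sampled.shape)  # torch.Size([2, 8])
```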