Orthogonal Butterfly: Parameter-Efficient Orthogonal Adaptation of Foundation Models via Butterfly Factorization (Talk)
Abstract: Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective and then identify a few key desiderata that enable better parameter efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in computer vision and natural language processing. The results validate the effectiveness of BOFT as a generic finetuning method.
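To make the butterfly idea concrete, below is a minimal PyTorch sketch of how a dense d x d orthogonal matrix can be assembled from log2(d) sparse butterfly factors, each applying 2x2 rotations to coordinate pairs in the Cooley-Tukey data-flow pattern. This is only an illustration of the principle, not the paper's actual BOFT parameterization; the class name ButterflyOrthogonal, the angle-based rotations, and the pairing scheme are assumptions made for clarity.

```python
import math

import torch
import torch.nn as nn


class ButterflyOrthogonal(nn.Module):
    """Illustrative sketch: dense orthogonal matrix from sparse butterfly factors.

    Each of the log2(d) factors rotates d/2 coordinate pairs (chosen with the
    stride pattern of the Cooley-Tukey FFT), so the full matrix uses
    (d/2) * log2(d) angles instead of the d*(d-1)/2 parameters of an
    unconstrained orthogonal matrix. All angles start at zero, so the matrix
    is initialized to the identity.
    """

    def __init__(self, d: int):
        super().__init__()
        assert d > 1 and (d & (d - 1)) == 0, "d must be a power of two in this sketch"
        self.d = d
        self.num_stages = int(math.log2(d))
        self.angles = nn.Parameter(torch.zeros(self.num_stages, d // 2))

    def matrix(self) -> torch.Tensor:
        R = torch.eye(self.d)
        for s in range(self.num_stages):
            stride = 2 ** s
            cos, sin = torch.cos(self.angles[s]), torch.sin(self.angles[s])
            B = torch.zeros(self.d, self.d)
            pair = 0
            # Within each block of size 2*stride, pair index i with i + stride
            # and place a 2x2 rotation on those two coordinates.
            for block in range(0, self.d, 2 * stride):
                for i in range(block, block + stride):
                    j = i + stride
                    B[i, i], B[i, j] = cos[pair], -sin[pair]
                    B[j, i], B[j, j] = sin[pair], cos[pair]
                    pair += 1
            # A product of sparse orthogonal factors remains orthogonal.
            R = B @ R
        return R


if __name__ == "__main__":
    # OFT-style usage: the frozen pretrained weight is multiplied by the
    # orthogonal matrix, and only the butterfly angles are trained.
    d = 8
    W = torch.randn(d, d)  # stands in for a pretrained weight matrix
    boft = ButterflyOrthogonal(d)
    R = boft.matrix()
    print(torch.allclose(R @ R.T, torch.eye(d), atol=1e-5))  # True: R is orthogonal
    W_adapted = R @ W  # equals W at initialization, since R starts as the identity
```

Because the composed matrix equals the identity at initialization, finetuning starts exactly from the pretrained model, which is the property orthogonal finetuning relies on; the sketch differs from the paper's parameterization in its details but shares this structure.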
Biography: Weiyang Liu is a PhD student at the Max Planck Institute for Intelligent Systems, Tübingen, and the University of Cambridge under the Cambridge-Tübingen PhD program, supervised by Bernhard Schölkopf and Adrian Weller. His research focuses on principled modeling of inductive bias in machine learning. These days, he is particularly drawn to the power of foundation models.
Details
- 19 October 2023 • 10:00 am - 11:00 am
- N3.022 Aquarium and Zoom
- Perceiving Systems