PEFT (parameter-efficient fine-tuning) adapts a pre-trained model by training only a small number of parameters. LoRA (Low-Rank Adaptation) is one such method. Consider a pre-trained weight matrix:

\begin{equation} W_0 \in \mathbb{R}^{d \times k} \end{equation}

Key intuition: the weight-update matrices learned during fine-tuning have low intrinsic rank. LoRA therefore parameterizes the update as a low-rank factorization:

\begin{equation} W_0 + \Delta W = W_0 + \alpha BA \end{equation}

where $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and $r \ll \min(d, k)$. Only $B$ and $A$ are trained; $W_0$ stays frozen. The scalar $\alpha$ scales the update, trading off pre-trained knowledge against task-specific knowledge.
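The update above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the shapes, initialization scale, and $\alpha = 1$ are assumptions chosen for the example. The zero initialization of $B$ mirrors the common practice of starting with $\Delta W = 0$ so the model initially behaves like the pre-trained one.

```python
import numpy as np

# Hypothetical dimensions: d x k frozen weight, rank-r update.
d, k, r, alpha = 64, 32, 4, 1.0

rng = np.random.default_rng(0)
W0 = rng.standard_normal((d, k))        # frozen pre-trained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                    # trainable, zero init => delta W = 0 at start

def effective_weight(W0, B, A, alpha):
    # W = W0 + alpha * B @ A, a rank-r update to the frozen weight.
    return W0 + alpha * (B @ A)

W = effective_weight(W0, B, A, alpha)
# Because B is zero-initialized, W equals W0 exactly at the start of training.
assert np.allclose(W, W0)

# Parameter count: d*k frozen vs r*(d + k) trainable.
print(d * k, r * (d + k))  # 2048 vs 384
```

Even in this toy setting the savings are visible: the frozen matrix has $d \cdot k = 2048$ entries, while the trainable factors have only $r(d + k) = 384$.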
