LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method designed specifically for Large Language Models (LLMs). Instead of updating all the model’s weights during training, LoRA freezes the original pre-trained weights and adds a small number of trainable parameters through low-rank matrices inserted into targeted layers (commonly attention and feedforward layers). This approach drastically reduces the number of trainable parameters, enabling:
In technical terms, LoRA decomposes the weight update matrix into the product of two smaller matrices — one with a lower rank — and adds them to the existing weights only during the forward pass. This maintains the expressiveness of the full model while optimizing for efficiency.
LoRA has become a standard method for customizing massive models like GPT, BERT, or LLaMA on domain-specific data without the need to retrain or store the full model for each task.
즉각적인 GPU 클라우드 액세스를 통해 인류의 AI 야망을 강화합니다.
2860 잔커 로드스위트 100 캘리포니아 산호세 95134
GMI Cloud
278 Castro St, Mountain View, CA 94041
Taiwan Office
GMI Computing International Ltd., Taiwan Branch
6F, No. 618, Ruiguang Rd., Neihu District, Taipei City 114726, Taiwan
Singapore Office
GMI Computing International Pte. Ltd.
1 Raffles Place, #21-01, One Raffles Place, Singapore 048616


© 2024 판권 소유.