LoRA LLM

Related terms

  • Deep Learning
  • Large Language Model (LLM)

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method designed specifically for Large Language Models (LLMs). Instead of updating all the model’s weights during training, LoRA freezes the original pre-trained weights and adds a small number of trainable parameters through low-rank matrices inserted into targeted layers (commonly attention and feedforward layers). This approach drastically reduces the number of trainable parameters, enabling:

  • Faster training times
  • Reduced hardware requirements
  • More adaptable multi-task models
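
To make this concrete, the following is a minimal sketch of a LoRA-augmented linear layer in PyTorch. The class name, the rank r, and the scaling factor alpha are illustrative choices, not any particular library's API:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        # Freeze the original pre-trained weights.
        for p in self.base.parameters():
            p.requires_grad = False
        # Low-rank factors: A is (r x in_features), B is (out_features x r).
        # A gets a small random init, B starts at zero, so the update is
        # initially a no-op and training begins from the pre-trained model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Original path plus the scaled low-rank update: W0 x + (alpha/r) B A x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

For a 4096 × 4096 projection with r = 8, the trainable factors hold 2 × 4096 × 8 = 65,536 parameters versus roughly 16.8 million in the full weight matrix, which is what makes the faster training and lower hardware requirements listed above possible.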

In technical terms, LoRA decomposes the weight update into the product of two much smaller matrices whose shared inner dimension, the rank r, is far lower than the dimensions of the original weight matrix. During the forward pass, this low-rank product is added to the frozen pre-trained weights, preserving the expressiveness of the full model while optimizing for efficiency.
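
Written out, for a frozen pre-trained weight matrix W₀ and input x, LoRA learns only the two low-rank factors B and A (the α/r scaling and symbol names follow the original LoRA paper's conventions):

```latex
h = W_0 x + \Delta W\, x = W_0 x + \frac{\alpha}{r}\, B A\, x,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Training touches only the r(d + k) entries of B and A instead of the d · k entries of W₀, which is where the savings come from.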

LoRA has become a standard method for customizing massive models like GPT, BERT, or LLaMA on domain-specific data without the need to retrain or store the full model for each task.
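
In practice, LoRA is usually applied through a library rather than hand-rolled. The sketch below uses Hugging Face's peft package; the checkpoint name and hyperparameter values are illustrative, and any causal LM from the Hub works the same way:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative base model; swap in any causal LM checkpoint.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor (applied as alpha / r)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Only the small adapter weights need to be saved per task; at serving time they can be merged into the base weights or swapped between tasks that share the same frozen model.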
