
Pruning


Pruning in artificial intelligence, particularly in deep learning, refers to the systematic removal of parts of a neural network (such as weights, neurons, or even layers) that contribute little to the model’s performance. The main goal is to make the model smaller, faster, and more efficient while maintaining similar accuracy and predictive capability.

Why Pruning Is Used:

  • Reduce model size: Pruning decreases the number of parameters, making the model easier to store and deploy, especially on edge devices like smartphones or IoT sensors.
  • Speed up inference: Fewer parameters mean fewer computations during prediction, which leads to faster response times.
  • Lower energy consumption: Pruned models require less compute, which helps with both sustainability and hardware-constrained deployments.
  • Combat overfitting: By eliminating redundant or weak connections, pruning can help the model generalize better on unseen data.

How It Works:

  1. Train a full model to achieve baseline performance.
  2. Evaluate the importance of individual weights, neurons, or filters using metrics like magnitude (L1/L2 norm) or gradient-based scores.
  3. Remove (prune) the least important ones based on a threshold or target sparsity.
  4. Fine-tune or retrain the model to recover any lost accuracy (a short code sketch of this workflow follows below).
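
As a concrete illustration of these four steps, here is a minimal sketch using PyTorch's torch.nn.utils.prune utilities; the two-layer model, the 30% sparsity target, and the omitted training loops are illustrative assumptions rather than details from this entry.

```python
# Minimal magnitude-pruning sketch with PyTorch's torch.nn.utils.prune.
# The architecture, the 30% sparsity target, and the omitted training
# loops are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Step 1: train the full model to a baseline (training loop omitted).

# Steps 2-3: score weights by magnitude (L1 norm) and zero out the lowest
# 30% across all Linear layers in a single global pass.
prunable = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]
prune.global_unstructured(
    prunable,
    pruning_method=prune.L1Unstructured,
    amount=0.30,  # target sparsity: fraction of weights to remove
)

# Step 4: fine-tune to recover accuracy. The masks stay fixed, so pruned
# weights stay zero in the forward pass while surviving weights keep learning.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
# ... run a few epochs of the usual training loop here ...

# Bake the masks into the weight tensors and drop the pruning hooks.
for module, name in prunable:
    prune.remove(module, name)

zeros = sum((m.weight == 0).sum().item() for m, _ in prunable)
total = sum(m.weight.numel() for m, _ in prunable)
print(f"global sparsity: {zeros / total:.1%}")
```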

Types of Pruning:

  • Weight pruning: Removes specific weights (connections) in the network.
  • Neuron pruning: Eliminates entire neurons or filters (in CNNs).
  • Structured pruning: Removes entire channels, layers, or blocks for better hardware compatibility (contrasted with weight pruning in the sketch after this list).
  • Dynamic pruning: Prunes during training instead of after.
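
To make the contrast between weight pruning and structured pruning concrete, the sketch below applies each to an identical convolution layer using PyTorch's pruning utilities; the layer shape and the 30% pruning amounts are arbitrary choices for illustration.

```python
# Weight (unstructured) pruning vs. structured (filter-level) pruning on
# two identical Conv2d layers; shapes and amounts are illustrative.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv_a = nn.Conv2d(16, 32, kernel_size=3)
conv_b = nn.Conv2d(16, 32, kernel_size=3)

# Weight pruning: zero the 30% of individual weights with the smallest
# absolute value. Sparsity is scattered, so the tensor shape is unchanged
# and real speedups need sparse-aware kernels.
prune.l1_unstructured(conv_a, name="weight", amount=0.3)

# Structured pruning: zero the 30% of output filters (dim=0) with the
# smallest L2 norm. Whole channels go to zero, which maps more directly
# onto speedups on ordinary dense hardware.
prune.ln_structured(conv_b, name="weight", amount=0.3, n=2, dim=0)

with torch.no_grad():
    scattered = (conv_a.weight == 0).float().mean().item()
    dead_filters = (conv_b.weight.abs().sum(dim=(1, 2, 3)) == 0).sum().item()
print(f"conv_a: {scattered:.0%} of individual weights zeroed")
print(f"conv_b: {dead_filters} of {conv_b.out_channels} filters zeroed")
```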

Pruning is commonly used in combination with other techniques like quantization or knowledge distillation to further optimize models for production use.
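
As one possible example of such a combination, the sketch below prunes a small model and then applies PyTorch's post-training dynamic quantization; the layer sizes and the 40% sparsity are placeholders, and knowledge distillation could likewise be folded into the fine-tuning step.

```python
# Chaining pruning with post-training dynamic quantization; sizes and
# the 40% sparsity are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 64))

# Prune 40% of the weights in each Linear layer, then bake the masks in.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")

# (Fine-tune here to recover accuracy before quantizing.)

# Quantize the remaining weights to int8 for smaller storage and faster
# CPU inference; activations are quantized dynamically at runtime.
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(quantized)  # Linear layers are now dynamically quantized modules
```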

