CryptoSpiel.com
No Result
View All Result
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams
No Result
View All Result
CryptoSpiel.com
No Result
View All Result

TEAL Introduces Training-Free Activation Sparsity to Boost LLM Efficiency

September 1, 2024
in Blockchain
Reading Time: 2 mins read
A A
0
Bitcoin Addresses Holding Between 100 and 10,000 BTC Hit a 7-Week High
0
SHARES
2
VIEWS
ShareShareShareShareShare


Zach Anderson
Sep 01, 2024 08:34

TEAL offers a training-free approach to activation sparsity, significantly enhancing the efficiency of large language models (LLMs) with minimal degradation.





TEAL (Training-Free Activation Sparsity in LLMs) has emerged as a groundbreaking approach to improve the efficiency of large language models (LLMs) without requiring additional training. According to together.ai, this method applies magnitude pruning to hidden states throughout the model, achieving 40-50% activation sparsity with minimal degradation. This innovation allows for the transfer of fewer weights to on-chip memory, addressing the memory-bound nature of LLM inference and translating into 1.53-1.8x wall-clock speedups in single-batch decoding.

Background

LLMs are known for their massive size, which poses challenges during inference, primarily due to the speed limitations of transferring parameters from device memory to registers. Various techniques such as quantization, weight sparsity, and speculative decoding have been developed to tackle this ‘memory wall’. Activation sparsity, which leverages zero values in hidden states, is a less explored method that avoids transferring unnecessary weight channels during decoding.

Older models like OPT-175B show high activation sparsity, enabling methods like DejaVu to achieve significant speedups. However, newer models like LLaMA have moved to SwiGLU variants, making it harder to apply such methods. Recent research has attempted to ‘recover’ models that exhibit activation sparsity, but these require extensive retraining on massive datasets.

Motivating Study: Distributional Properties of Activations in LLMs

Research has shown that hidden states in LLMs exhibit outliers and are zero-centered with similar distributional shapes across layers. Specifically, states before MLP and Attention Blocks are Gaussian-shaped, while intermediate states are Laplacian-shaped. This suggests that many low-magnitude activations can be pruned with negligible model degradation, a concept also observed in other studies like CATS.

TEAL

TEAL introduces an optimization by sparsifying every tensor in the model, achieving near-zero degradation at 25% sparsity and minimal degradation at 40% sparsity. At 50% sparsity, Llama-3 variants show slightly more degradation compared to older Llama-2 and Mistral variants. TEAL outperforms CATS by sparsifying every tensor and choosing to sparsify through input, yielding lower error.

Hardware-Aware Speed-up

To benchmark real-world speedups, TEAL was integrated with GPT-Fast, achieving significant speedups of up to 1.53x and 1.8x at 40% and 50% sparsity, respectively. While the kernel is faster than cuBLAS at 0% sparsity, there is still room for further optimization.

Compatibility with Quantization

TEAL also demonstrates compatibility with quantization, another technique for efficient LLM inference. Combining activation sparsity and quantization unlocks new regimes for transferring memory to GPU registers, allowing for higher inference speed-ups.

Applications

TEAL’s most immediate application is accelerating inference in resource-constrained edge settings, particularly in single-batch scenarios. It also aids inference providers like Together AI, which hosts over 100 open-source models across a large fleet of GPUs, by serving models more efficiently.

Image source: Shutterstock


Credit: Source link

RELATED POSTS

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

Tether’s Strategic Investment in Generative Bionics Boosts Innovative Humanoid Robotics

Harvey Integrates NetDocuments for Enhanced Legal Document Management

Buy JNews
ADVERTISEMENT
ShareTweetSendPinShare
Previous Post

Canadian Court Orders Man to Repay $1.2 Million in Bitcoin Loan Dispute

Next Post

763K Trades, $2.7B Volume Boost

Related Posts

Galaxy Digital: Ethereum Developers Discuss Key Upgrades During Latest Consensus Call
Blockchain

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

December 9, 2025
Tether Implements Wallet-Freezing Policy Aligned with US Regulations
Blockchain

Tether’s Strategic Investment in Generative Bionics Boosts Innovative Humanoid Robotics

December 8, 2025
Understanding Ambiguity: Causes and Effects
Blockchain

Harvey Integrates NetDocuments for Enhanced Legal Document Management

December 8, 2025
Next Post
763K Trades, $2.7B Volume Boost

763K Trades, $2.7B Volume Boost

Bitcoin ETF vs Buying BTC Directly: What’s Better?

Is Bitcoin Too Big to Fail? (opinion)

Recommended Stories

No Content Available

Popular Stories

  • Court Docs Reveal FTX Allowed Alameda to Borrow $65,000,000,000 for Trading, Made Firm Exempt From Liquidation

    Court Docs Reveal FTX Allowed Alameda to Borrow $65,000,000,000 for Trading, Made Firm Exempt From Liquidation

    0 shares
    Share 0 Tweet 0
  • Getting Started with BTTC: Writing Your First Smart Contract

    0 shares
    Share 0 Tweet 0
  • Onecoin Victims Petition Bulgaria for Seizure of Assets and Compensation – Bitcoin News

    0 shares
    Share 0 Tweet 0
  • New Web 3.0 Sports Game Announces Major Pack Sale, Token…

    0 shares
    Share 0 Tweet 0
  • This new protocol allows crypto traders to capture DeFi volatility

    0 shares
    Share 0 Tweet 0
CryptoSpiel.com

This is an online news portal that aims to provide the latest crypto news, blockchain, regulations and much more stuff like that around the world. Feel free to get in touch with us!

What’s New Here!

  • How crypto derivatives liquidation drove Bitcoin’s 2025 crash
  • Robinhood Charges Into Indonesia as Next Explosive Crypto Market
  • Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

Subscribe Now

Loading
  • Live Crypto Prices
  • Contact Us
  • Privacy Policy
  • Terms of Use
  • DMCA

© 2021 - cryptospiel.com - All rights reserved!

No Result
View All Result
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams

© 2021 - cryptospiel.com - All rights reserved!

Please enter CoinGecko Free Api Key to get this plugin works.