CryptoSpiel.com

Together AI Boosts NVIDIA H200 and H100 GPU Cluster Performance with Kernel Collection

September 6, 2024
in Blockchain

Joerg Hiller
Sep 06, 2024 07:14

Together AI enhances NVIDIA H200 and H100 GPU clusters with its Together Kernel Collection, offering significant performance improvements in AI training and inference.





Together AI has announced a significant enhancement to its GPU clusters with the integration of the NVIDIA H200 Tensor Core GPU, according to together.ai. This upgrade will be accompanied by the Together Kernel Collection (TKC), a custom-built kernel stack designed to optimize AI operations, providing substantial performance boosts for both training and inference tasks.

Enhanced Performance with TKC

The Together Kernel Collection (TKC) is engineered to accelerate common AI operations significantly. When compared to standard PyTorch implementations, TKC offers up to a 24% speedup for frequently used training operators and up to a 75% speedup for FP8 inference operations. This improvement is poised to reduce GPU hours, leading to cost efficiencies and faster time to market.
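To make these figures concrete, the short calculation below converts a speedup percentage into a reduction in GPU hours. It is only a rough sketch: it assumes that an "X% speedup" means throughput rises by that factor, so run time is divided by (1 + X/100). The article does not spell out the definition, and the function name and the 1,000-hour baseline are illustrative, not from Together AI.

```python
def gpu_hours_after_speedup(baseline_hours: float, speedup_pct: float) -> float:
    """Return GPU hours needed after a throughput speedup.

    Assumption: "speedup_pct% speedup" means throughput is multiplied by
    (1 + speedup_pct / 100), so run time is divided by the same factor.
    """
    if speedup_pct <= -100:
        raise ValueError("speedup_pct must be greater than -100")
    return baseline_hours / (1 + speedup_pct / 100)


baseline = 1000.0  # hypothetical baseline GPU hours, for illustration only
for label, pct in [("training operators (up to 24%)", 24.0),
                   ("FP8 inference operators (up to 75%)", 75.0)]:
    hours = gpu_hours_after_speedup(baseline, pct)
    saved_pct = 100 * (baseline - hours) / baseline
    print(f"{label}: {hours:.1f} GPU hours, {saved_pct:.1f}% fewer")
```

Under this reading, a 24% speedup trims roughly 19% of GPU hours and a 75% speedup roughly 43%, and only for the share of a workload actually spent in the accelerated operators.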

Training and Inference Optimization

TKC’s optimized kernels, such as the multi-layer perceptron (MLP) with SwiGLU activation, are crucial for training large language models (LLMs) like Llama-3. These kernels are reported to be 22-24% faster than standard implementations and up to 10% faster than the best existing baselines. For inference, Together AI has optimized a stack of FP8 kernels that it says delivers more than 75% speedup over base PyTorch implementations.
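For readers unfamiliar with the operation being accelerated, here is a minimal reference sketch of a SwiGLU MLP in plain Python. It follows the formulation commonly used in Llama-style models (a SiLU-gated projection multiplied elementwise by an "up" projection, then a "down" projection) and assumes no bias terms. It is not TKC's kernel, which the article does not show; the function and weight names are illustrative.

```python
import math


def silu(x: float) -> float:
    """SiLU (also called Swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(weights: list[list[float]], vec: list[float]) -> list[float]:
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def swiglu_mlp(x, w_gate, w_up, w_down):
    """Reference SwiGLU MLP: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]   # gated branch
    up = matvec(w_up, x)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # elementwise product
    return matvec(w_down, hidden)


# Tiny example: 2 input features, hidden width 3, 2 output features.
x = [1.0, -2.0]
w_gate = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]
w_up = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
w_down = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
print(swiglu_mlp(x, w_gate, w_up, w_down))
```

A naive implementation like this performs three matrix multiplications plus an elementwise step as separate passes, which is the kind of operator sequence that hand-tuned GPU kernels typically aim to speed up.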

Native PyTorch Compatibility

TKC is fully integrated with PyTorch, enabling AI developers to utilize its optimizations seamlessly within their existing frameworks. This integration simplifies the adoption of TKC, making it as easy as changing import statements within PyTorch.

Production-Level Testing

Together AI ensures that TKC undergoes rigorous testing to meet production-level standards, guaranteeing high performance and reliability for real-world applications. All Together GPU Clusters, whether H200 or H100, will feature TKC out of the box.

NVIDIA H200: Faster Performance and Larger Memory

The NVIDIA H200 Tensor Core GPU, built on the Hopper architecture, is designed for high-performance AI and HPC workloads. According to NVIDIA, the H200 offers 40% faster inference performance on Llama 2 13B and 90% faster on Llama 2 70B, compared to its predecessor, the H100. The H200 features 141GB of HBM3e memory and 4.8TB/s of memory bandwidth, nearly double the capacity and 1.4 times the bandwidth of the H100.
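The memory comparison can be checked with simple arithmetic. In the sketch below, the H200 figures come from the article, while the H100 figures (80GB and 3.35TB/s for the SXM variant) are an assumption taken from NVIDIA's published specifications and do not appear in the source.

```python
# H200 figures are from the article; H100 figures are an assumption taken
# from NVIDIA's published H100 SXM specifications (80 GB, 3.35 TB/s).
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBS = 4.8
H100_MEMORY_GB = 80
H100_BANDWIDTH_TBS = 3.35

memory_ratio = H200_MEMORY_GB / H100_MEMORY_GB
bandwidth_ratio = H200_BANDWIDTH_TBS / H100_BANDWIDTH_TBS

print(f"Memory capacity: {memory_ratio:.2f}x the H100")
print(f"Memory bandwidth: {bandwidth_ratio:.2f}x the H100")
```

With those H100 numbers, the ratios come to about 1.76 for capacity and 1.43 for bandwidth.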

High-Performance Interconnectivity

Together GPU Clusters leverage the SXM form factor for high bandwidth and fast data transfer, supported by NVIDIA’s NVLink and NVSwitch technologies for ultra-high-speed communication between GPUs. Combined with NVIDIA Quantum-2 3200Gb/s InfiniBand Networking, this setup is ideal for large-scale AI training and HPC workloads.

Cost-Effective Infrastructure

Together AI offers significant cost savings, with infrastructure designed to be up to 75% more cost-effective compared to cloud providers like AWS. The company also provides flexible commitment options, from one month to five years, ensuring the right resources at every stage of the AI development lifecycle.

Reliability and Support

Together AI’s GPU clusters come with a 99.9% uptime SLA and are backed by rigorous acceptance testing. The company’s White Glove Service offers end-to-end support, from cluster setup to ongoing maintenance, ensuring peak performance for AI models.
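As a rough guide to what a 99.9% uptime figure allows, the sketch below converts an uptime percentage into a downtime budget. How the SLA actually measures downtime (measurement window, exclusions for maintenance, and so on) is not stated in the article, so the 30-day month and 365-day year are assumptions, and the function name is illustrative.

```python
def allowed_downtime_minutes(uptime_pct: float, period_days: float) -> float:
    """Maximum downtime, in minutes, permitted by an uptime percentage."""
    if not 0 <= uptime_pct <= 100:
        raise ValueError("uptime_pct must be between 0 and 100")
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - uptime_pct / 100)


per_month = allowed_downtime_minutes(99.9, 30)   # 30-day month (assumed)
per_year = allowed_downtime_minutes(99.9, 365)   # non-leap year (assumed)
print(f"30-day month: {per_month:.1f} minutes")
print(f"365-day year: {per_year / 60:.2f} hours")
```

On these assumptions, 99.9% uptime corresponds to about 43 minutes of downtime in a 30-day month, or just under 9 hours over a year.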

Flexible Deployment Options

Together AI provides several deployment options, including Slurm for high-performance workload management, Kubernetes for containerized AI workloads, and bare metal clusters running Ubuntu for direct access and ultimate flexibility. These options cater to different AI project needs, from large-scale training to production-level inference.

Together AI continues to support the entire AI lifecycle with its high-performance NVIDIA H200 GPU Clusters and the Together Kernel Collection. The platform is designed to optimize performance, reduce costs, and ensure reliability, making it an ideal choice for accelerating AI development.

