AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities

Felix Pinkston
Aug 31, 2024 01:52

AMD’s Radeon PRO GPUs and ROCm software enable small enterprises to leverage advanced AI tools, including Meta’s Llama models, for various business applications.

AMD has announced advancements in its Radeon PRO GPUs and ROCm software, enabling small enterprises to leverage Large Language Models (LLMs) like Meta’s Llama 2 and 3, including the newly released Llama 3.1, according to AMD.com.

New Capabilities for Small Enterprises

With dedicated AI accelerators and substantial on-board memory, AMD’s Radeon PRO W7900 Dual Slot GPU offers market-leading performance per dollar, making it feasible for small firms to run custom AI tools locally. This includes applications such as chatbots, technical documentation retrieval, and personalized sales pitches. The specialized Code Llama models further enable programmers to generate and optimize code for new digital products.

The latest release of AMD’s open software stack, ROCm 6.1.3, supports running AI tools on multiple Radeon PRO GPUs. This enhancement allows small and medium-sized enterprises (SMEs) to handle larger and more complex LLMs, supporting more users simultaneously.

Expanding Use Cases for LLMs

While AI techniques are already prevalent in data analysis, computer vision, and generative design, the potential use cases for AI extend far beyond these areas. Specialized LLMs like Meta’s Code Llama enable app developers and web designers to generate working code from simple text prompts or debug existing code bases. The parent model, Llama, offers extensive applications in customer service, information retrieval, and product personalization.

Small enterprises can utilize retrieval-augmented generation (RAG) to make AI models aware of their internal data, such as product documentation or customer records. This customization results in more accurate AI-generated outputs with less need for manual editing.

Local Hosting Benefits

Despite the availability of cloud-based AI services, local hosting of LLMs offers significant advantages:

Data Security: Running AI models locally eliminates the need to upload sensitive data to the cloud, addressing major concerns about data sharing.
Lower Latency: Local hosting reduces lag, providing instant feedback in applications like chatbots and real-time support.
Control Over Tasks: Local deployment allows technical staff to troubleshoot and update AI tools without relying on remote service providers.
Sandbox Environment: Local workstations can serve as sandbox environments for prototyping and testing new AI tools before full-scale deployment.

AMD’s AI Performance

For SMEs, hosting custom AI tools need not be complex or expensive. Applications like LM Studio facilitate running LLMs on standard Windows laptops and desktop systems. LM Studio is optimized to run on AMD GPUs via the HIP runtime API, leveraging the dedicated AI Accelerators in current AMD graphics cards to boost performance.

Professional GPUs like the 32GB Radeon PRO W7800 and 48GB Radeon PRO W7900 offer sufficient memory to run larger models, such as the 30-billion-parameter Llama-2-30B-Q8. ROCm 6.1.3 introduces support for multiple Radeon PRO GPUs, enabling enterprises to deploy systems with multiple GPUs to serve requests from numerous users simultaneously.

Performance tests with Llama 2 indicate that the Radeon PRO W7900 offers up to 38% higher performance-per-dollar compared to NVIDIA’s RTX 6000 Ada Generation, making it a cost-effective solution for SMEs.

With the evolving capabilities of AMD’s hardware and software, even small enterprises can now deploy and customize LLMs to enhance various business and coding tasks, avoiding the need to upload sensitive data to the cloud.

Image source: Shutterstock

Credit: Source link

AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities

Anthropic Reveals Claude Code Tool Design Philosophy Behind AI Agent Development

Riot Platforms Sells $289M in Bitcoin as Mining Output Drops 4% in Q1

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

NVIDIA Introduces Fast Inversion Technique for Real-Time Image Editing

Exploring the Future of Real World Assets in DeFi

Related Posts

Anthropic Reveals Claude Code Tool Design Philosophy Behind AI Agent Development

Riot Platforms Sells $289M in Bitcoin as Mining Output Drops 4% in Q1

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

Exploring the Future of Real World Assets in DeFi

Quidax Becomes Nigeria’s first SEC licensed Crypto Exchange

Recommended Stories

Popular Stories

Could XRP Be a Good Investment During Economic Uncertainty? 70% Post-Shutdown Rally History Says Yes – What About XRP Tundra?

Manta Foundation Allocates Treasury to wUSDM, Backed by BlackRock’s BUIDL Fund

EOS Soars 8% While Bitcoin Marked 18-Day Low: Weekend Watch

Gemini Pro vs GPT-4: A Comprehensive Comparison of AI Powerhouses

Why Grayscale’s SEC Victory Is Unlikely to Benefit Bitcoin and Crypto Markets in the Long Run

What’s New Here!

Subscribe Now

AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities

New Capabilities for Small Enterprises

Expanding Use Cases for LLMs

Local Hosting Benefits

AMD’s AI Performance

RELATED POSTS

NVIDIA Introduces Fast Inversion Technique for Real-Time Image Editing

Exploring the Future of Real World Assets in DeFi

Related Posts

Recommended Stories

Popular Stories

What’s New Here!

Subscribe Now