CryptoSpiel.com
No Result
View All Result
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams
No Result
View All Result
CryptoSpiel.com
No Result
View All Result

NVIDIA NIM Enhances RAG Applications for Veterinary AI

August 27, 2024
in Blockchain
Reading Time: 6 mins read
A A
0
Nvidia Plans to add Innovation in the Metaverse with Software, Marketplace Deals
0
SHARES
7
VIEWS
ShareShareShareShareShare


Iris Coleman
Aug 27, 2024 19:56

NVIDIA NIM improves retrieval-augmented generation (RAG) applications, streamlining AI solutions in specialized fields like veterinary science.





The advent of large language models (LLMs) has significantly benefited the AI industry, offering versatile tools capable of generating human-like text and handling a wide range of tasks. However, while LLMs demonstrate impressive general knowledge, their performance in specialized fields, such as veterinary science, is limited when used out of the box. To enhance their utility in specific areas, two primary strategies are commonly adopted in the industry: fine-tuning and retrieval-augmented generation (RAG).

Fine-Tuning vs. RAG

Fine-tuning involves training the model on a carefully curated and structured dataset, demanding substantial hardware resources, as well as the involvement of domain experts, a process that is often time-consuming and costly. Unfortunately, in many fields, it’s incredibly challenging to access domain experts in a way that is compatible with business constraints.

Conversely, RAG involves building a comprehensive corpus of knowledge literature, alongside an effective retrieval system that extracts relevant text chunks to address user queries. By adding this retrieved information to the user query, LLMs can produce better answers. Although this approach still requires subject matter experts to curate the best sources for the dataset, it is more tractable and business-compatible than fine-tuning. Also, since extensive training of the model isn’t necessary, this approach is less computationally intensive and more cost-effective.

NVIDIA NIM and NLP Pipelines

NVIDIA NIM streamlines the design of NLP pipelines using LLMs. These microservices simplify the deployment of generative AI models across platforms, allowing teams to self-host LLMs while offering standard APIs to build applications.

NIM abstracts model inference internals like execution engines and runtime operations, ensuring optimal performance with TensorRT-LLM, vLLM, and others. Key features include:

  • Scalable deployment
  • Support for diverse LLM architectures with optimized engines
  • Flexible integration into existing workflows
  • Enterprise-grade security with safetensors and constant CVE monitoring

Developers can run NIM microservices with Docker and perform inference using APIs. Specialized trained model weights can also be used for specific tasks, such as document parsing, by modifying container commands.

Reimagining Veterinary Care with AI

At AITEM, a member of the NVIDIA Inception Program for startups, collaboration with NVIDIA has focused on AI-based solutions across multiple fields, including industrial and life sciences. In the veterinary sector, AITEM is working on LAIKA, an innovative AI copilot designed to assist veterinarians by processing patient data and offering diagnostic suggestions, guidance, and clarifications.

LAIKA integrates multiple LLMs and RAG pipelines. The RAG component retrieves relevant information from a curated dataset of veterinary resources. During preparation, each resource is divided into chunks, with embeddings calculated and stored in the RAG database. During inference, the query is pre-processed and its embeddings are computed and compared with those in the RAG database using geometric distance metrics. The closest matches are selected as the most relevant and used to generate responses.

Due to potential redundancy in the RAG database, multiple retrieved chunks might contain the same information, limiting the diversity of concepts provided to the answer system. To address this, LAIKA employs the Maximal Marginal Relevance (MMR) algorithm to minimize chunk redundancy and ensure a broader range of relevant information.

NVIDIA NeMo Retriever Reranking NIM Microservice

The NVIDIA API Catalog includes NeMo Retriever NIM microservices that enable organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses. The NVIDIA Retrieval QA Mistral 4B reranking NIM microservice is designed to assess the probability that a given text passage contains relevant information for answering a user query. Integrating this model into the RAG pipeline enables filtering out retrievals that do not pass the reranking model’s evaluation, ensuring that only the most relevant and accurate information is used.

To assess the impact of this step on the RAG pipeline, AITEM designed an experiment:

  1. Extract a dataset of ~100 anonymized questions from LAIKA users.
  2. Run the current RAG pipeline to retrieve chunks for each question.
  3. Sort the retrieved chunks based on probabilities provided by the reranking model.
  4. Evaluate each chunk for relevance to the query.
  5. Analyze the reranking model’s probability distribution in relation to the relevance determined in Step 4.
  6. Compare the ranking of chunks in Step 3 against their relevance from Step 4.

User questions in LAIKA can vary significantly in form. Some queries contain detailed explanations of a situation but lack a specific question. Others contain precise inquiries regarding research, while some seek guidance or differential diagnoses based on clinical cases or analysis documents.

Due to the large number of chunks per question, AITEM used the Llama 3.1 70B Instruct NIM microservice for evaluation, which is also available in the NVIDIA API Catalog.

To better understand the reranking model’s performance, specific queries and model responses were examined in detail. Table 1 highlights the top and bottom reranked chunks for a sample query regarding differential diagnoses for a cat losing weight.

Text Reranking Logit
Causes of weight loss that can be particularly difficult to diagnose … include gastric disease not causing vomiting, intestinal disease not causing vomiting or diarrhea, hepatic disease … 3.3125
Differential diagnoses for nonspecific signs like anorexia, weight loss, vomiting, and diarrhea … acute pancreatitis is rare in cats, … signs are nonspecific and ill-defined (anorexia, lethargy, weight loss). 2.3222
Severe weight loss (with or without increased appetite) may be noted where there is cancer cachexia, maldigestion/malabsorption … Appetite may be increased in some conditions, such as hyperthyroidism in cats, … However, a normal appetite does not rule out the presence of a serious condition. 2.2265
Overall, weight loss was the most common presenting sign … with little difference between the groups … -5.0078
Other client complaints include lethargy, anorexia, weight loss, vomiting … -7.3672
There were 6 British Shorthair, 4 European Shorthair, and 1 Bengal cat … Reported clinical signs by owners included: reduced appetite or anorexia… -10.3281
Table 1. Three highest-ranked chunks and three lowest-ranked text chunks

Figure 4 compares the reranking model probability output distribution (in logits) between relevant (good) and irrelevant (bad) chunks. The probabilities for good chunks are higher compared to bad chunks, and a t-test confirmed that this difference is statistically significant, with a p-value lower than 3e-72.

logit-distribution-good-bad-chunks-625x450.png
Figure 4. Distribution of reranking model output in terms of logits

Figure 5 shows the distribution difference in the reranking-induced sorting positions: good chunks are predominantly in top positions, while bad chunks are lower. The Mann-Whitney test confirmed that these differences are statistically significant, resulting in a p-value lower than 9e-31.

distribution-good-bad-chunks-model-sorting-625x458.png
Figure 5. Distribution of reranking model-induced sorting among the retrieved chunks

Figure 6 shows the ranking distribution and helps define an effective cutoff point. In the top five positions, most chunks are good, while the majority of chunks in positions 11-15 are bad. Thus, retaining only the top five retrievals or another chosen number can serve as one way to effectively exclude most of the bad chunks.

good-bad-chunk-balance-model-sorting-625x472.png
Figure 6. Balance between good and bad chunks by position in the sorting induced by the reranking model

To optimize retrieval pipelines, and minimize ingestion costs while maximizing accuracy, a lightweight embedding model can be paired with the NVIDIA reranking NIM microservice, to boost retrieval accuracy. Execution time can be improved by 1.75x (Figure 7).

nv-rerankqa-mistral4b-v3-comparison-625x711.png
Figure 7. NVIDIA reranking NIM microservice comparison

Better Answers with the NVIDIA Reranking NIM Microservice

The results demonstrate that adding the NVIDIA reranking NIM microservice to the LAIKA RAG pipeline positively affects the relevance of retrieved chunks. By forwarding more precise, specialized information to the downstream answering LLM, it equips the model with the knowledge that’s necessary for highly specialized fields like veterinary science.

The NVIDIA reranking NIM microservice, available in the NVIDIA API Catalog, simplifies adoption as you can easily pull and run the model and infer its evaluations through APIs. This eliminates stress related to environment settings and manual optimization, as it comes pre-quantized and optimized with NVIDIA TensorRT for almost any platform.

For more information and the latest updates about LAIKA and other AITEM projects, see AITEM Solutions and follow LAIKA and AITEM on LinkedIn.

Image source: Shutterstock


Credit: Source link

RELATED POSTS

Anthropic Reveals Claude Code Tool Design Philosophy Behind AI Agent Development

Riot Platforms Sells $289M in Bitcoin as Mining Output Drops 4% in Q1

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

Buy JNews
ADVERTISEMENT
ShareTweetSendPinShare
Previous Post

Ex-Goldman Sachs Executive Predicts 10x for Solana (SOL), Says ‘Really Good Trade’ Approaching

Next Post

Tether CEO Criticizes Durov’s Arrest, States Europe Is ‘Falling Into Dark Ages’

Related Posts

Bitcoin Addresses Holding Between 100 and 10,000 BTC Hit a 7-Week High
Blockchain

Anthropic Reveals Claude Code Tool Design Philosophy Behind AI Agent Development

April 10, 2026
Riot Blockchain Yearly Bitcoin Production Increases by 236%, Accumulates $194M in BTC
Blockchain

Riot Platforms Sells $289M in Bitcoin as Mining Output Drops 4% in Q1

April 2, 2026
Galaxy Digital: Ethereum Developers Discuss Key Upgrades During Latest Consensus Call
Blockchain

Exploring Chainlink’s Role Beyond Price Feeds in the Blockchain Ecosystem

December 9, 2025
Next Post
Tether CEO Criticizes Durov’s Arrest, States Europe Is ‘Falling Into Dark Ages’

Tether CEO Criticizes Durov’s Arrest, States Europe Is ‘Falling Into Dark Ages’

Small Litecoin (LTC) Fishes Are ‘Jumping Ship,’ Here’s What it Means

Small Litecoin (LTC) Fishes Are 'Jumping Ship,' Here's What it Means

Recommended Stories

No Content Available

Popular Stories

  • Winklevoss Twins Continue Crypto Donation Spree With Another $1,000,000 in Bitcoin (BTC)

    Trader Says DeFi Altcoin Aave Witnessing Clear Trend Switch, Updates Forecast on Two Low-Cap Coins

    0 shares
    Share 0 Tweet 0
  • Rich Dad Poor Dad’s Robert Kiyosaki Says He’s Buying Bitcoin and Ether as Inflation Escalates – Economics Bitcoin News

    0 shares
    Share 0 Tweet 0
  • 10 Best Crypto Presales for Future-Proof Investments (up to 10000x Long-term Gains)

    0 shares
    Share 0 Tweet 0
  • Chingari partners with Fashion TV for exclusive content

    0 shares
    Share 0 Tweet 0
  • LangChain Expands DeepAgents Capability with New Update

    0 shares
    Share 0 Tweet 0
CryptoSpiel.com

This is an online news portal that aims to provide the latest crypto news, blockchain, regulations and much more stuff like that around the world. Feel free to get in touch with us!

What’s New Here!

  • Ripple CEO Says CLARITY Act Talks Near Breakthrough as Senate Standoff Eases
  • SEC Opens Proceedings on NYSE Proposal to List Grayscale Crypto ETF Options – Regulation Bitcoin News
  • Anthropic Reveals Claude Code Tool Design Philosophy Behind AI Agent Development

Subscribe Now

Loading
  • Live Crypto Prices
  • Contact Us
  • Privacy Policy
  • Terms of Use
  • DMCA

© 2021 - cryptospiel.com - All rights reserved!

No Result
View All Result
  • Home
  • Live Crypto Prices
  • Live ICO
  • Exchange
  • Crypto News
  • Bitcoin
  • Altcoins
  • Blockchain
  • Regulations
  • Trading
  • Scams

© 2021 - cryptospiel.com - All rights reserved!

Please enter CoinGecko Free Api Key to get this plugin works.