AssemblyAI Unveils Enhanced PII Redaction and Entity Detection Features

Jessie A Ellis
Jul 26, 2024 05:52

AssemblyAI introduces PII Text Redaction in 47 languages and adds 16 new entity types to its Entity Detection model, citing 99% accuracy in major languages.

AssemblyAI has announced significant upgrades to its PII Redaction and Entity Detection features, aimed at enhancing data security and extracting key insights from audio transcripts. According to AssemblyAI, the latest updates include support for PII Text Redaction across 47 languages and the addition of 16 new entity types to its Entity Detection model, bringing the total to 44.

Enhanced PII Redaction Capabilities

The updated PII Text Redaction feature now supports 47 languages, extending protection of personally identifiable information (PII) to transcripts from more regions. Users can identify and remove sensitive data such as addresses, phone numbers, and credit card details from their transcripts. They can also generate transcripts with PII removed or “beep out” sensitive information in audio files.

An example of how to use the API for PII redaction is provided by AssemblyAI:

import assemblyai as aai

# Replace with your own AssemblyAI API key.
aai.settings.api_key = "YOUR API KEY"

audio_url = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

# Enable speaker labels and redact person names, organizations, and
# occupations, using the hash substitution policy for redacted spans.
config = aai.TranscriptionConfig(speaker_labels=True).set_redact_pii(
  policies=[
    aai.PIIRedactionPolicy.person_name,
    aai.PIIRedactionPolicy.organization,
    aai.PIIRedactionPolicy.occupation,
  ],
  substitution=aai.PIISubstitutionPolicy.hash,
)

transcript = aai.Transcriber().transcribe(audio_url, config)

# Print the redacted transcript speaker by speaker, then as a single block.
for utterance in transcript.utterances:
  print(f"Speaker {utterance.speaker}: {utterance.text}")

print(transcript.text)

Users can refer to AssemblyAI’s documentation for more detailed examples and a closer look at the updates.
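To make the effect of a substitution policy more concrete, the following is a toy sketch in plain Python that makes no AssemblyAI calls. The PiiSpan class, the redact function, the hand-supplied character offsets, and the sample sentence are all hypothetical, and the two output styles are assumptions chosen for illustration; the text the API actually returns may be formatted differently.

```python
from dataclasses import dataclass


@dataclass
class PiiSpan:
    """Hypothetical stand-in for one detected PII span (not an AssemblyAI type)."""
    start: int   # index of the first character of the span
    end: int     # index one past the last character of the span
    label: str   # e.g. "person_name"


def redact(text: str, spans: list[PiiSpan], style: str = "hash") -> str:
    """Replace each span with '#' characters or with a bracketed label.

    Assumes the spans do not overlap.
    """
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        pieces.append(text[cursor:span.start])  # untouched text before the span
        if style == "hash":
            pieces.append("#" * (span.end - span.start))
        else:
            pieces.append(f"[{span.label.upper()}]")
        cursor = span.end
    pieces.append(text[cursor:])  # untouched text after the last span
    return "".join(pieces)


# Made-up sample sentence with hand-computed offsets.
sample = "Call Jane Doe at 555-0100."
spans = [PiiSpan(5, 13, "person_name"), PiiSpan(17, 25, "phone_number")]

print(redact(sample, spans, "hash"))   # Call ######## at ########.
print(redact(sample, spans, "label"))  # Call [PERSON_NAME] at [PHONE_NUMBER].
```

The hash style keeps the text length unchanged, while the label style keeps the category of what was removed; which trade-off matters depends on how the transcript is used downstream.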

Expanded Entity Detection

The Entity Detection model has been upgraded with 16 new entity types for automatically identifying and categorizing key information in transcripts. This brings the total number of supported entity types to 44, including names, organizations, addresses, and more. According to AssemblyAI, the model reaches 99% accuracy in major languages, which the company presents as a basis for extracting insights from audio data.

An example of how to use the API for Entity Detection is also provided:

import assemblyai as aai

# Replace with your own AssemblyAI API key.
aai.settings.api_key = "YOUR API KEY"

audio_url = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

# Turn on Entity Detection for this transcription request.
config = aai.TranscriptionConfig(entity_detection=True)

transcript = aai.Transcriber().transcribe(audio_url, config)

# Print each detected entity with its type and position in the audio.
for entity in transcript.entities:
  print(entity.text)
  print(entity.entity_type)
  print(f"Timestamp: {entity.start} - {entity.end}\n")
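The loop above prints entities one at a time; in practice it is often handier to group them by type. The sketch below does that with a hypothetical Entity stand-in that mirrors only the four fields used in the example (text, entity_type, start, end), so it runs without an API key. The sample values are made up, the timestamps are assumed to be milliseconds, and in a real transcript entity_type may not be a plain string.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for a detected entity (not the SDK's own class)."""
    text: str
    entity_type: str
    start: int  # assumed: milliseconds from the start of the audio
    end: int    # assumed: milliseconds from the start of the audio


def group_by_type(entities):
    """Return a dict mapping each entity type to the texts found for it."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity.entity_type].append(entity.text)
    return dict(grouped)


# Made-up sample data standing in for transcript.entities.
sample_entities = [
    Entity("Canada", "location", 2500, 3100),
    Entity("Jane Doe", "person_name", 4200, 4900),
    Entity("US", "location", 7300, 7600),
]

grouped = group_by_type(sample_entities)
for entity_type, texts in grouped.items():
    print(f"{entity_type}: {', '.join(texts)}")
```

With a real transcript, passing transcript.entities in place of sample_entities should work as long as each item exposes text and entity_type, as the example above suggests.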

Additional Resources

AssemblyAI has also shared several new blog posts and tutorials to help users get the most out of its products. Topics include using Claude 3.5 Sonnet with audio data, understanding Microsoft’s Florence-2 image model, and creating a real-time language translation service with AssemblyAI and DeepL in JavaScript.

For more information on these updates and to explore additional resources, visit AssemblyAI’s official blog.
