CryptoSpiel.com

Anyscale Enhances Ray Data with Joins and Hash-Shuffle for Improved Performance

May 20, 2025
in Blockchain


Timothy Morano
May 20, 2025 04:25

Anyscale introduces a hash-based shuffle backend in Ray Data, enhancing joins and performance for repartitioning and aggregations. Discover the advancements in the Ray 2.46 release.





Anyscale has unveiled significant improvements to Ray Data, led by a new hash-based shuffle backend. The feature ships in the Ray 2.46 release and is intended to support joins and speed up data repartitioning and aggregations while reducing memory pressure.

Enhancements in Ray Data

The release adds native join support via the ds.join() API, key-based repartitioning, and a simplified custom aggregation API named AggregateFnV2. Large-scale sorting is also faster, which in turn improves the range-partitioning shuffle.

The new hash-based shuffle backend addresses limitations of the earlier range-based approach. In prior versions, shuffling relied on range partitioning, which was resource-intensive and prone to bottlenecks. The new method partitions each incoming data block by hashing the tuple of its key values, then sends every partition to a corresponding Aggregator actor for processing.
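To make the idea concrete, here is a minimal pure-Python sketch of hash partitioning. It is an illustration only and not Ray's implementation: the function names (stable_hash, hash_partition, shuffle), the list-of-dicts block format, and the plain lists standing in for Aggregator actors are all assumptions made for this example.

```python
import zlib
from collections import defaultdict


def stable_hash(key) -> int:
    """Deterministic hash, so every worker maps a key to the same partition.

    Python's built-in hash() is randomized per process for strings,
    so this sketch uses CRC32 of the key's repr instead (an assumption).
    """
    return zlib.crc32(repr(key).encode("utf-8"))


def hash_partition(block, key_columns, num_partitions):
    """Split one block (a list of dict rows) into per-partition row lists."""
    partitions = defaultdict(list)
    for row in block:
        key = tuple(row[c] for c in key_columns)  # tuple of key values
        partitions[stable_hash(key) % num_partitions].append(row)
    return partitions


def shuffle(blocks, key_columns, num_partitions):
    """Route each block's shards to the partition that owns the key.

    Plain lists stand in for the Aggregator actors described above.
    """
    aggregators = [[] for _ in range(num_partitions)]
    for block in blocks:
        shards = hash_partition(block, key_columns, num_partitions)
        for pid, rows in shards.items():
            aggregators[pid].extend(rows)
    return aggregators


# Hypothetical input: two blocks that both contain rows for user "a".
blocks = [
    [{"user": "a", "amount": 1}, {"user": "b", "amount": 2}],
    [{"user": "a", "amount": 3}, {"user": "c", "amount": 4}],
]
shuffled = shuffle(blocks, ["user"], num_partitions=3)
```

After the shuffle, every row with a given key sits in exactly one partition, no matter which block it arrived in. That property is what lets a per-key aggregation or join run on each partition independently.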

Implementing Joins with Hash Shuffle

Ray 2.46 supports several join types, including inner, left and right outer, and full outer joins. The hash-shuffle backend co-locates records that share a key in the same partition, so each partition can be joined independently. The join itself runs on Apache Arrow’s Acero engine through PyArrow’s native Table.join operation, which can be memory-intensive.
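The sketch below shows, in plain Python, how co-locating keys lets each partition be joined on its own. It is a conceptual illustration under stated assumptions, not the ds.join() API or the Acero engine: the helper names, the join-type strings, and the sample rows are hypothetical, and unmatched rows simply lack the other side's columns rather than carrying nulls as a real engine would.

```python
import zlib
from collections import defaultdict


def partition_id(key, num_partitions):
    """Deterministic key-to-partition mapping (CRC32 is an assumption)."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_partitions


def partition_rows(rows, on, num_partitions):
    """Hash-partition rows (dicts) by the join column `on`."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[partition_id(row[on], num_partitions)].append(row)
    return parts


def join_partition(left, right, on, join_type):
    """Join one co-located pair of partitions with an in-memory hash table."""
    right_by_key = defaultdict(list)
    for r in right:
        right_by_key[r[on]].append(r)

    out, matched_keys = [], set()
    for l in left:
        matches = right_by_key.get(l[on], [])
        if matches:
            matched_keys.add(l[on])
            out.extend({**r, **l} for r in matches)  # merge both sides
        elif join_type in ("left_outer", "full_outer"):
            out.append(dict(l))  # keep unmatched left row
    if join_type in ("right_outer", "full_outer"):
        for key, rs in right_by_key.items():
            if key not in matched_keys:
                out.extend(dict(r) for r in rs)  # keep unmatched right rows
    return out


def hash_join(left_rows, right_rows, on, join_type="inner", num_partitions=4):
    """Partition both sides with the same hash, then join partition by partition."""
    left_parts = partition_rows(left_rows, on, num_partitions)
    right_parts = partition_rows(right_rows, on, num_partitions)
    result = []
    for l, r in zip(left_parts, right_parts):
        result.extend(join_partition(l, r, on, join_type))
    return result


# Hypothetical sample data.
orders = [{"id": 1, "item": "pen"}, {"id": 2, "item": "ink"}, {"id": 4, "item": "cap"}]
users = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bo"}, {"id": 3, "name": "Cy"}]

inner = hash_join(orders, users, on="id", join_type="inner")
full = hash_join(orders, users, on="id", join_type="full_outer")
```

Because both inputs use the same hash function and partition count, matching keys can only meet inside the same partition pair. The per-partition hash table is also where memory pressure comes from, since one side of each partition must be held in memory while the other is probed.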

Benchmarking Performance

Benchmarks show substantial improvements across multiple workloads. Tests on a cluster of m7i.4xlarge and m7i.16xlarge instances show speedups of 3.3x to 5.6x with the hash-based shuffle compared to previous versions. Notably, the TPCH-Q1-SF1000 workload, which was previously unmanageable, is now feasible with the new backend.

Additional tests showed that the range-partitioning shuffle has also improved, with runtimes 1.6x to 4.3x faster. The hash shuffle backend also cuts peak memory usage by up to 3.9x.

Future Developments

Looking ahead, Anyscale plans to expand support for different join types and implement logical plan optimizations to reorder joins. Further enhancements to data preprocessors are also anticipated.

These advancements in Ray Data are set to empower developers with more efficient data processing capabilities. For more insights, visit the official Anyscale blog.

