Ray in 2023: Ray in Reflection

December 07, 2023

A quick recap of, and reflection on, how Ray progressed in 2023 and its pivotal role in the LLM stack and the generative AI domain.

Anyscale

Transcript

Ray in 2023
Robert Nishihara

Why Ray? Faster and cheaper.
Figures shown: 12x, 50%, 40%, 10x, 5x, 30%.

As AI capabilities have grown, so have the challenges: scale, future readiness, and cost.
These are the challenges Ray was built for.

Anyscale Endpoints: fine-tuning
Chart labels: Llama-2-7B, GPT-4, fine-tuned. Chart values: 86%, 3%, 78%.
Superior task-specific performance at 1/300th the cost of GPT-4!

Batch inference: costs
Cost to process 1M images. Chart labels: Spark, SageMaker, AWS. Chart values: $3.5, $7.3, $57, $2.5.

Anyscale Endpoints: cost-efficient LLM inference
Single GPU optimizations, multi-GPU modeling, inference server, autoscaling, multi-region, multi-cloud.
$1 / million tokens (Llama-2 70B)
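As a rough illustration of the $1 per million tokens figure above, the following Python sketch estimates inference cost for a given token count. It assumes a flat rate applied equally to prompt and completion tokens, which the slide does not state. The function name and the example workloads are hypothetical.

```python
# Illustrative arithmetic only. The $1 per million tokens rate comes from the
# slide; the flat-rate billing model and the token counts are assumptions.
PRICE_PER_MILLION_TOKENS_USD = 1.00  # Llama-2 70B rate quoted on the slide


def inference_cost_usd(
    total_tokens: int,
    price_per_million: float = PRICE_PER_MILLION_TOKENS_USD,
) -> float:
    """Return the cost in USD of processing `total_tokens` at a flat rate."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workloads, chosen only to show how the cost scales.
    for tokens in (250_000, 1_000_000, 50_000_000):
        print(f"{tokens:>12,} tokens -> ${inference_cost_usd(tokens):,.2f}")
```

Under this flat-rate assumption the cost scales linearly with tokens, so a workload ten times larger costs ten times as much.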