Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Riak Search: The Next Generation
Search
Tom Santero
September 17, 2013
Programming
0
160
Riak Search: The Next Generation
Presentation on Yokozuna (
https://github.com/basho/yokozuna
) at the NYC Riak Meetup group
Tom Santero
September 17, 2013
Tweet
Share
More Decks by Tom Santero
See All by Tom Santero
DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker
tsantero
1
350
Buridan's Principle
tsantero
1
410
Release Engineering from the Ground Up
tsantero
1
330
Beyond Fast and Slow
tsantero
0
250
Choose Your Own Consistency
tsantero
1
210
Erlang Fight Club
tsantero
5
460
Riak on Ruby: Keys, Values and CRDTs
tsantero
0
280
Consensus, Raft and Rafter
tsantero
21
3.7k
Riak: Distributed Storage for Games You Don't Have to Worry About
tsantero
6
1.8k
Other Decks in Programming
See All in Programming
20251127_ぼっちのための懇親会対策会議
kokamoto01_metaps
2
400
「コードは上から下へ読むのが一番」と思った時に、思い出してほしい話
panda728
PRO
1
1.1k
Building AI Agents with TypeScript #TSKaigiHokuriku
izumin5210
6
1.2k
UIデザインに役立つ 2025年の最新CSS / The Latest CSS for UI Design 2025
clockmaker
17
6.6k
CSC305 Lecture 15
javiergs
PRO
0
240
AWS CDKの推しポイントN選
akihisaikeda
1
240
【CA.ai #3】Google ADKを活用したAI Agent開発と運用知見
harappa80
0
260
TypeScript 5.9 で使えるようになった import defer でパフォーマンス最適化を実現する
bicstone
1
1k
TypeScriptで設計する 堅牢さとUXを両立した非同期ワークフローの実現
moeka__c
6
2.9k
関数実行の裏側では何が起きているのか?
minop1205
1
560
AIと協働し、イベントソーシングとアクターモデルで作る後悔しないアーキテクチャ Regret-Free Architecture with AI, Event Sourcing, and Actors
tomohisa
5
18k
令和最新版Android Studioで化石デバイス向けアプリを作る
arkw
0
110
Featured
See All Featured
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
The World Runs on Bad Software
bkeepers
PRO
72
12k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Unsuck your backbone
ammeep
671
58k
VelocityConf: Rendering Performance Case Studies
addyosmani
333
24k
Building Applications with DynamoDB
mza
96
6.8k
We Have a Design System, Now What?
morganepeng
54
7.9k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
1k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.5k
Art, The Web, and Tiny UX
lynnandtonic
303
21k
GraphQLの誤解/rethinking-graphql
sonatard
73
11k
Transcript
Riak Search the next generation Tuesday, September 17, 13
tsantero @ basho.com Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
2.0 coming soon.. Tuesday, September 17, 13
the history of Riak Search Tuesday, September 17, 13
home grown full-text search Tuesday, September 17, 13
lucene Tuesday, September 17, 13
SCALE Tuesday, September 17, 13
NODE # = HASH(KEY) % NUM_NODES NH(Ka) = 0 NH(Kb)
= 1 NH(Kc) = 2 NH(Kd) = 0 ... Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Kg Ki NODE 3 Ke Kf Kh Kj Kk Kl Km Kn Ko Kp Kq Kr Naive Hashing Tuesday, September 17, 13
K * (NN - 1) / NN => K •
K = # OF KEYS • NN = # OF NODES • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE Naive Hashing Tuesday, September 17, 13
PARTITION # = HASH(KEY) % PARTITIONS • # PARTITIONS REMAINS
CONSTANT • KEY ALWAYS MAPS TO SAME PARTITION • NODES OWN PARTITIONS • PARTITIONS CONTAIN KEYS • EXTRA LEVEL OF INDIRECTION Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr NODE 3 Consistent Hashing Tuesday, September 17, 13
NN * K/Q => K/Q • K = # OF
KEYS • NN = # OF NODES • Q = # OF PARTITIONS • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE Consistent Hashing Tuesday, September 17, 13
uniform distribution Consistent Hashing {logical vs physical partitioning scheme even
division of keys Tuesday, September 17, 13
the future of Riak Search Tuesday, September 17, 13
Tuesday, September 17, 13
persistence distributing Solr querying indexing Tuesday, September 17, 13
Each Riak node runs an instance of Solr Tuesday, September
17, 13
Solr index = riak bucket document = RObj value plaintext,
JSON, XML Tuesday, September 17, 13
Distributed Searching in Solr query faceting highlighting stats spell check
term vectors Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
Harvest vs Yield Tuesday, September 17, 13
A better measure of Availability Tuesday, September 17, 13
Queries Issues Queries Offered Yield = Tuesday, September 17, 13
Harvest = Data Available Total Dataset Tuesday, September 17, 13
Harvest Yield Tuesday, September 17, 13
Manage Harvest by storing Index Replicas Tuesday, September 17, 13
Term vs Document Partitioning Schemes Tuesday, September 17, 13
Node 0 Node 1 Node 2 Term Based Partitioning Tuesday,
September 17, 13
Node 0 Node 1 Node 2 Document Based Partitioning Tuesday,
September 17, 13
Replicas Node 0 Node 1 Node 2 Tuesday, September 17,
13
Quorums Tuesday, September 17, 13
Concurrency => Siblings Tuesday, September 17, 13
Read Repair (Anti-Entropy) Tuesday, September 17, 13
replica replica replica Tuesday, September 17, 13
replica replica replica X Tuesday, September 17, 13
replica replica replica replica replica replica Tuesday, September 17, 13
Active Anti-Entropy (self healing clusters) Tuesday, September 17, 13
real-time updates persistent non-blocking disk-based Tuesday, September 17, 13
Tuesday, September 17, 13
= hashes marked “dirty” Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
= keys to read-repair Tuesday, September 17, 13
Questions? make it so! Tuesday, September 17, 13