Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Riak Search: The Next Generation
Search
Tom Santero
September 17, 2013
Programming
0
160
Riak Search: The Next Generation
Presentation on Yokozuna (
https://github.com/basho/yokozuna
) at the NYC Riak Meetup group
Tom Santero
September 17, 2013
Tweet
Share
More Decks by Tom Santero
See All by Tom Santero
DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker
tsantero
1
350
Buridan's Principle
tsantero
1
410
Release Engineering from the Ground Up
tsantero
1
320
Beyond Fast and Slow
tsantero
0
240
Choose Your Own Consistency
tsantero
1
210
Erlang Fight Club
tsantero
5
460
Riak on Ruby: Keys, Values and CRDTs
tsantero
0
280
Consensus, Raft and Rafter
tsantero
21
3.7k
Riak: Distributed Storage for Games You Don't Have to Worry About
tsantero
6
1.8k
Other Decks in Programming
See All in Programming
ノーコードからの脱出 -地獄のデスロード- / Escape from Base44
keisuke69
0
670
エンジニアに事業やプロダクトを理解してもらうためにやってること
murabayashi
0
140
AIのバカさ加減に怒る前にやっておくこと
blueeventhorizon
0
160
Inside of Swift Export
giginet
PRO
1
530
2026年向け会社紹介資料
misu
0
150
Blazing Fast UI Development with Compose Hot Reload (droidcon London 2025)
zsmb
0
500
例外処理を理解して、設計段階からエラーを見つけやすく、起こりにくく #phpconfuk
kajitack
12
5.7k
Introducing RemoteCompose: break your UI out of the app sandbox.
camaelon
2
540
FlutterKaigi 2025 システム裏側
yumnumm
0
700
複数チーム並行開発下でのコード移行アプローチ ~手動 Codemod から「生成AI 活用」への進化
andpad
0
110
Flutterアプリ運用の現場で役立った監視Tips 5選
ostk0069
1
310
なんでRustの環境構築してないのにRust製のツールが動くの? / Why Do Rust-Based Tools Run Without a Rust Environment?
ssssota
15
48k
Featured
See All Featured
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.6k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
192
56k
For a Future-Friendly Web
brad_frost
180
10k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.1k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
658
61k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
31
2.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
970
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
34
2.3k
Into the Great Unknown - MozCon
thekraken
40
2.1k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.2k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Transcript
Riak Search the next generation Tuesday, September 17, 13
tsantero @ basho.com Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
2.0 coming soon.. Tuesday, September 17, 13
the history of Riak Search Tuesday, September 17, 13
home grown full-text search Tuesday, September 17, 13
lucene Tuesday, September 17, 13
SCALE Tuesday, September 17, 13
NODE # = HASH(KEY) % NUM_NODES NH(Ka) = 0 NH(Kb)
= 1 NH(Kc) = 2 NH(Kd) = 0 ... Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Kg Ki NODE 3 Ke Kf Kh Kj Kk Kl Km Kn Ko Kp Kq Kr Naive Hashing Tuesday, September 17, 13
K * (NN - 1) / NN => K •
K = # OF KEYS • NN = # OF NODES • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE Naive Hashing Tuesday, September 17, 13
PARTITION # = HASH(KEY) % PARTITIONS • # PARTITIONS REMAINS
CONSTANT • KEY ALWAYS MAPS TO SAME PARTITION • NODES OWN PARTITIONS • PARTITIONS CONTAIN KEYS • EXTRA LEVEL OF INDIRECTION Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr NODE 3 Consistent Hashing Tuesday, September 17, 13
NN * K/Q => K/Q • K = # OF
KEYS • NN = # OF NODES • Q = # OF PARTITIONS • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE Consistent Hashing Tuesday, September 17, 13
uniform distribution Consistent Hashing {logical vs physical partitioning scheme even
division of keys Tuesday, September 17, 13
the future of Riak Search Tuesday, September 17, 13
Tuesday, September 17, 13
persistence distributing Solr querying indexing Tuesday, September 17, 13
Each Riak node runs an instance of Solr Tuesday, September
17, 13
Solr index = riak bucket document = RObj value plaintext,
JSON, XML Tuesday, September 17, 13
Distributed Searching in Solr query faceting highlighting stats spell check
term vectors Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
Harvest vs Yield Tuesday, September 17, 13
A better measure of Availability Tuesday, September 17, 13
Queries Issues Queries Offered Yield = Tuesday, September 17, 13
Harvest = Data Available Total Dataset Tuesday, September 17, 13
Harvest Yield Tuesday, September 17, 13
Manage Harvest by storing Index Replicas Tuesday, September 17, 13
Term vs Document Partitioning Schemes Tuesday, September 17, 13
Node 0 Node 1 Node 2 Term Based Partitioning Tuesday,
September 17, 13
Node 0 Node 1 Node 2 Document Based Partitioning Tuesday,
September 17, 13
Replicas Node 0 Node 1 Node 2 Tuesday, September 17,
13
Quorums Tuesday, September 17, 13
Concurrency => Siblings Tuesday, September 17, 13
Read Repair (Anti-Entropy) Tuesday, September 17, 13
replica replica replica Tuesday, September 17, 13
replica replica replica X Tuesday, September 17, 13
replica replica replica replica replica replica Tuesday, September 17, 13
Active Anti-Entropy (self healing clusters) Tuesday, September 17, 13
real-time updates persistent non-blocking disk-based Tuesday, September 17, 13
Tuesday, September 17, 13
= hashes marked “dirty” Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
= keys to read-repair Tuesday, September 17, 13
Questions? make it so! Tuesday, September 17, 13