Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin Buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Figure: reference and topic panels. © Buzzcapture]
topic ≠ reference
topic reference
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
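The estimate on this slide, read as the fraction of a document set in which a word occurs, can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: the function name `doc_frequency_probability`, the toy documents, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter


def doc_frequency_probability(docs):
    """Return {word: fraction of docs containing that word}.

    Each word is counted at most once per document, so the value is
    (number of docs containing w) / (number of docs).
    """
    counts = Counter()
    for doc in docs:
        # Whitespace tokenization; set() ensures one count per document.
        counts.update(set(doc.split()))
    total = len(docs)
    return {word: count / total for word, count in counts.items()}


# Hypothetical toy data standing in for the topic and reference sets.
topic_docs = ["quick brown fox", "brown dog", "quick fox"]
reference_docs = ["brown dog", "lazy dog", "brown cow", "quick dog"]

p_topic = doc_frequency_probability(topic_docs)
p_reference = doc_frequency_probability(reference_docs)

# Words that are relatively more common in the topic set than in the reference set.
overrepresented = sorted(
    w for w in p_topic if p_topic[w] > p_reference.get(w, 0.0)
)
print(overrepresented)
```

The final comparison is only one simple way to contrast the two sets; the transcript does not say which measure the talk used.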
[Figure: the terms brown, dog, fox, quick linked to document IDs 2, 5, 6, 10, 12, 13]
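Assuming the slide shows terms linked to the IDs of the documents that contain them, the two directions of that mapping can be sketched as below. The postings assigned to each term are hypothetical (the transcript does not preserve which ID belongs to which term), and `uninvert` and `facet_counts` are illustrative names, not Elasticsearch or Lucene APIs.

```python
from collections import defaultdict

# Hypothetical postings: term -> IDs of documents containing it.
inverted_index = {
    "brown": [2, 5],
    "dog": [10, 12],
    "fox": [5, 6],
    "quick": [12, 13],
}


def uninvert(index):
    """Turn term -> doc IDs into doc ID -> terms."""
    doc_terms = defaultdict(list)
    for term in sorted(index):
        for doc_id in index[term]:
            doc_terms[doc_id].append(term)
    return dict(doc_terms)


def facet_counts(doc_terms, matching_doc_ids):
    """Count how many of the matching documents contain each term."""
    counts = defaultdict(int)
    for doc_id in matching_doc_ids:
        for term in doc_terms.get(doc_id, []):
            counts[term] += 1
    return dict(counts)


doc_terms = uninvert(inverted_index)
print(facet_counts(doc_terms, [5, 6]))
```

Faceting walks from matching documents to their terms, which is why the document-to-term direction has to be available in addition to the term list itself.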
In our index:
• Terms = 12GB
• “Arrows” = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms which occur too rarely
Drop docs with too many terms
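The effect of the filtering steps on these slides can be imitated outside Elasticsearch with a short Python sketch. It assumes that `min: 10` means "present in at least 10 documents" and that a document's size is measured in distinct whitespace tokens. The function `filter_terms` and its `max_terms_per_doc` parameter are hypothetical names for illustration; they are not Elasticsearch settings.

```python
import re
from collections import Counter


def filter_terms(docs, pattern=r"^#.*", min_count=10, max_terms_per_doc=None):
    """Return {term: document count} for the terms that survive filtering.

    1. Optionally drop documents with more than max_terms_per_doc distinct terms.
    2. Keep only terms matching the regex (hashtags by default).
    3. Drop terms that occur in fewer than min_count documents.
    """
    regex = re.compile(pattern)
    tokenized = [doc.split() for doc in docs]
    if max_terms_per_doc is not None:
        tokenized = [t for t in tokenized if len(set(t)) <= max_terms_per_doc]

    doc_freq = Counter()
    for tokens in tokenized:
        # Count each matching term once per document.
        doc_freq.update({t for t in tokens if regex.match(t)})
    return {term: n for term, n in doc_freq.items() if n >= min_count}


# Hypothetical tweets: "#a" appears in 10 documents, "#b" in only 3.
tweets = ["#a hello"] * 10 + ["#b world"] * 3
print(filter_terms(tweets))
```

Both filters shrink the set of term-to-document links that must be held in memory, at the cost of ignoring rare terms and very long documents.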
[Figure: reference and topic panels. © Buzzcapture]
iculture    10,122
floor        8,998
cover        6,874
toy          4,402
ground       3,841

4.0          7,878
4.1          4,292
rtacties     4,078
jelly        2,905
bean         2,857