Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Figure: a topic document set shown against reference document sets. © Buzzcapture]
topic ≠ reference
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
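The estimate P(w|T) on the slide reads as the fraction of documents in a set that contain the term w. A minimal Python sketch of that estimate follows, assuming whitespace-tokenized documents; the function name `term_probabilities` and the sample documents are hypothetical and not taken from the talk.

```python
from collections import Counter

def term_probabilities(docs):
    """Estimate P(w|T): the fraction of documents in `docs` that contain term w."""
    if not docs:
        return {}
    doc_freq = Counter()
    for doc in docs:
        # Split on whitespace and count each term at most once per document.
        doc_freq.update(set(doc.split()))
    return {term: count / len(docs) for term, count in doc_freq.items()}

# Hypothetical sample data, for illustration only.
topic_docs = ["#android 4.1 jelly bean", "jelly bean update", "new phone cover"]
reference_docs = ["new phone cover", "floor cover", "toy for the floor", "phone on the ground"]

p_topic = term_probabilities(topic_docs)
p_reference = term_probabilities(reference_docs)

for term in sorted(p_topic, key=p_topic.get, reverse=True):
    print(term, round(p_topic[term], 3), round(p_reference.get(term, 0.0), 3))
```

Running the same estimator over the topic set and the reference set gives two probabilities per term that can be compared side by side.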
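[Figure: the terms brown, dog, fox and quick, each pointing to the ids of the documents that contain it (ids 2, 5, 6, 10, 12, 13)]
[Figure: the same document ids 2, 5, 6, 10, 12, 13, each pointing back to its terms brown, dog, fox, quick]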
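The two directions in these slides, terms pointing to documents and documents pointing back to terms, can be sketched in Python as below. This is an illustrative sketch only: the document contents are hypothetical (the exact term-to-document mapping on the slides is not recoverable from the transcript), and the helper names `build_inverted_index`, `uninvert` and `facet_counts` are assumptions, not Elasticsearch or Lucene APIs.

```python
from collections import Counter, defaultdict

# Hypothetical documents keyed by id; not the mapping shown on the slides.
docs = {2: "quick brown fox", 5: "brown dog", 6: "quick dog"}

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            postings[term].append(doc_id)
    return dict(postings)

def uninvert(postings):
    """Reverse the index: map each document id to the terms it contains."""
    doc_terms = defaultdict(list)
    for term in sorted(postings):
        for doc_id in postings[term]:
            doc_terms[doc_id].append(term)
    return dict(doc_terms)

def facet_counts(doc_terms, hits):
    """Count terms over the matching documents using the doc-to-terms direction."""
    counts = Counter()
    for doc_id in hits:
        counts.update(doc_terms.get(doc_id, []))
    return counts

postings = build_inverted_index(docs)
doc_terms = uninvert(postings)
print(postings)
print(facet_counts(doc_terms, hits=[2, 5]))
```

Faceting walks from the matching documents to their terms, which is why the document-to-terms direction has to be available in addition to the inverted index.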
In our index:
• Terms = 12GB
• “Arrows” = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms that occur too rarely
Drop docs with too many terms
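The effect of these two pruning steps can be illustrated outside Elasticsearch with a short Python sketch. It assumes that the regex keeps only matching terms, that `min: 10` is an absolute document count, and that oversized documents are dropped by a cap on distinct terms; the names `filter_terms`, `drop_large_docs` and `max_terms_per_doc` are hypothetical and are not Elasticsearch settings.

```python
import re
from collections import Counter

def filter_terms(docs, pattern=r"^#.*", min_doc_freq=10):
    """Keep terms that match `pattern` and occur in at least `min_doc_freq` documents."""
    regex = re.compile(pattern)
    doc_freq = Counter()
    for terms in docs:
        doc_freq.update(set(terms))  # count each term once per document
    return {t for t, n in doc_freq.items() if n >= min_doc_freq and regex.match(t)}

def drop_large_docs(docs, max_terms_per_doc):
    """Drop documents that carry more distinct terms than the (hypothetical) cap."""
    return [terms for terms in docs if len(set(terms)) <= max_terms_per_doc]

# Hypothetical tokenized documents, for illustration only.
docs = (
    [["#es", "rocks"]] * 10                         # frequent hashtag
    + [["#rare", "rocks"]]                          # hashtag seen only once
    + [["#es"] + [f"w{i}" for i in range(50)]]      # document with too many terms
)

kept_docs = drop_large_docs(docs, max_terms_per_doc=20)
kept_terms = filter_terms(kept_docs)
print(len(kept_docs), kept_terms)
```

Both steps shrink what has to be held in memory: fewer distinct terms, and fewer document-to-term entries.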
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857