Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory
conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Slides: panels labelled "reference", "reference" and "topic" (© Buzzcapture)]
topic ≠ reference
topic / reference
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
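The formula above can be read as the fraction of the topic's documents D_t that contain the word w. The sketch below computes that fraction for a topic set and for a reference set so the two can be compared. The function name `term_probability` and the tiny document sets are invented for illustration; they are not taken from the talk, and the talk's actual scoring of topic against reference is not shown in the transcript.

```python
def term_probability(docs, term):
    """Fraction of documents in `docs` that contain `term`.

    `docs` is a list of sets of terms (one set per document).
    """
    if not docs:
        return 0.0
    return sum(1 for doc in docs if term in doc) / len(docs)


# Hypothetical toy data: each document is the set of its analyzed terms.
topic_docs = [
    {"quick", "brown", "fox"},
    {"brown", "dog"},
    {"quick", "dog"},
    {"fox"},
]
reference_docs = [
    {"dog"},
    {"dog", "fox"},
    {"brown"},
    {"dog"},
    {"quick"},
]

p_topic = term_probability(topic_docs, "brown")          # 2 of 4 documents
p_reference = term_probability(reference_docs, "brown")  # 1 of 5 documents
print(p_topic, p_reference)
```

A word whose probability in the topic set is clearly higher than in the reference set is a candidate for "trending"; how large the difference must be is left open here.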
[Figure: the terms "brown", "dog", "fox" and "quick" connected by arrows to document IDs 2, 5, 6, 10, 12 and 13]
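As far as the slide can be read, it contrasts two directions of the same data: terms pointing to document IDs (the inverted index) and document IDs pointing back to terms (what faceting needs). The sketch below builds both and counts facets from the per-document view. The function names and the assignment of terms to documents are assumptions made for illustration, not the layout shown in the talk or Lucene's actual data structures.

```python
from collections import Counter, defaultdict


def invert(docs):
    """Build term -> sorted list of doc IDs from doc ID -> list of terms."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id])):
            index[term].append(doc_id)
    return dict(index)


def uninvert(index):
    """Rebuild doc ID -> list of terms (the "arrows") from the inverted index."""
    per_doc = defaultdict(list)
    for term in sorted(index):
        for doc_id in index[term]:
            per_doc[doc_id].append(term)
    return dict(per_doc)


def facet_counts(per_doc, matching_ids):
    """Count how many of the matching documents contain each term."""
    counts = Counter()
    for doc_id in matching_ids:
        counts.update(per_doc.get(doc_id, []))
    return counts


# Hypothetical documents; IDs borrowed from the slide, contents invented.
docs = {2: ["quick", "brown"], 5: ["brown", "fox"], 6: ["dog"]}
index = invert(docs)
per_doc = uninvert(index)
counts = facet_counts(per_doc, [2, 5])
print(index)
print(counts.most_common())
```

The per-document structure holds one entry per (document, term) pair, which is why it can grow much larger than the term dictionary itself.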
In our index:
• Terms = 12GB
• "Arrows" = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms which occur too rarely
Drop docs with too many terms
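The effect of these filters can be sketched outside Elasticsearch. The code below keeps only terms that match a regular expression and occur in at least a minimum number of documents, and optionally skips documents with too many distinct terms. The function `filter_terms` and its parameters are hypothetical names for illustration; this is not how Elasticsearch implements field data filtering, and the order in which the two kinds of pruning are applied is an assumption.

```python
import re
from collections import Counter


def filter_terms(docs, pattern=r"^#.*", min_freq=10, max_terms_per_doc=None):
    """Prune terms before loading them for faceting.

    docs: list of term lists, one per document.
    pattern: only terms matching this regex are kept.
    min_freq: a term must occur in at least this many documents.
    max_terms_per_doc: documents with more distinct terms are dropped.
    """
    regex = re.compile(pattern)
    if max_terms_per_doc is not None:
        docs = [d for d in docs if len(set(d)) <= max_terms_per_doc]
    # Document frequency: count each term once per document.
    doc_freq = Counter(term for d in docs for term in set(d))
    keep = {t for t, n in doc_freq.items() if n >= min_freq and regex.match(t)}
    return [[t for t in d if t in keep] for d in docs]


# Hypothetical tweets, already split on whitespace.
tweets = [
    ["#es", "hello"],
    ["#es", "#rare"],
    ["#es", "a", "b", "c", "d"],
]
filtered = filter_terms(tweets, min_freq=2, max_terms_per_doc=3)
print(filtered)
```

Here the third tweet is skipped for having too many terms, "hello" fails the hashtag pattern, and "#rare" falls below the frequency threshold.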
[Slide: panels labelled "reference", "reference" and "topic" (© Buzzcapture)]
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857