Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Boaz Leskes
June 04, 2013
Technology
76
0
Share
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
330
Life of a Document in Elasticsearch
bleskes
3
3.3k
Resiliency in Elasticsearch & Lucene
bleskes
0
540
Resiliency in Elasticsearch & Lucene
bleskes
0
250
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
740
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
380
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
700
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
340
Other Decks in Technology
See All in Technology
ぼくがかんがえたさいきょうのあうとぷっと
yama3133
0
180
明日からドヤれる!超マニアックなAWSセキュリティTips10連発 / 10 Ultra-Niche AWS Security Tips
yuj1osm
0
550
Digitization部 紹介資料
sansan33
PRO
1
7.3k
職能の壁を取り払った先で見えた壁 -AI時代のクロスファンクショナル組織-
shimotaroo
1
120
CloudSec JP #005 後締め ~ソフトウェアサプライチェーン攻撃から開発者のシークレットを守る~
lhazy
0
230
AIを共同作業者にして書籍を執筆する方法 / How to Write a Book with AI as a Co-Creator
ama_ch
2
130
AIエージェントの権限管理 2: データ基盤の Fine grained access control 編
ren8k
0
120
基盤を育てる 外部SaaS連携の運用
gamonges_dresscode
1
110
Code Interpreter で、AIに安全に コードを書かせる。
yokomachi
0
7k
データを"持てない"環境でのアノテーション基盤設計
sansantech
PRO
1
100
マルチエージェント × ハーネスエンジニアリング × GitLab Duo Agent Platformで実現する「AIエージェントに仕事をさせる時代へ。」 / 20260421 GitLab Duo Agent Platform
n11sh1
0
140
こんなアーキテクチャ図はいやだ / Anti-pattern in AWS Architecture Diagrams
naospon
1
430
Featured
See All Featured
Amusing Abliteration
ianozsvald
1
150
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
3
110
sira's awesome portfolio website redesign presentation
elsirapls
0
220
Test your architecture with Archunit
thirion
1
2.2k
Code Reviewing Like a Champion
maltzj
528
40k
How Software Deployment tools have changed in the past 20 years
geshan
0
33k
svc-hook: hooking system calls on ARM64 by binary rewriting
retrage
2
210
Mobile First: as difficult as doing things right
swwweet
225
10k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.4k
Prompt Engineering for Job Search
mfonobong
0
270
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
27
3.4k
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
100
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857