Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Boaz Leskes
June 04, 2013
Technology
0
70
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Tweet
Share
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
330
Life of a Document in Elasticsearch
bleskes
3
3.3k
Resiliency in Elasticsearch & Lucene
bleskes
0
530
Resiliency in Elasticsearch & Lucene
bleskes
0
240
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
730
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
370
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
690
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
330
Other Decks in Technology
See All in Technology
AI時代にあわせたQA組織戦略
masamiyajiri
6
2.7k
BPaaSオペレーション・kubell社内 n8n活用による効率化検証事例紹介
kubell_hr
0
280
Vivre en Bitcoin : le tutoriel que votre banquier ne veut pas que vous voyiez
rlifchitz
0
370
Claude in Chromeで始める自律的フロントエンド開発
diggymo
1
270
それぞれのペースでやっていく Bet AI / Bet AI at Your Own Pace
yuyatakeyama
1
610
SOC2は、取った瞬間よりその後が面白い
3flower
1
250
SMTP完全に理解した ✉️
yamatai1212
0
110
「全社導入」は結果。1人の熱狂が組織に伝播したmikanのn8n活用
sota_mikami
0
520
新規事業における「一部だけどコア」な AI精度改善の優先順位づけ
zerebom
0
330
VRTと真面目に向き合う
hiragram
1
490
新規事業 toitta におけるAI 機能評価の話 / AI Feature Evaluation in toitta
pokutuna
0
280
re:Inventで見つけた「運用を捨てる」技術。
ezaki
1
150
Featured
See All Featured
Building Experiences: Design Systems, User Experience, and Full Site Editing
marktimemedia
0
400
Leadership Guide Workshop - DevTernity 2021
reverentgeek
1
190
A Guide to Academic Writing Using Generative AI - A Workshop
ks91
PRO
0
180
DBのスキルで生き残る技術 - AI時代におけるテーブル設計の勘所
soudai
PRO
61
49k
The Illustrated Children's Guide to Kubernetes
chrisshort
51
51k
Avoiding the “Bad Training, Faster” Trap in the Age of AI
tmiket
0
63
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
710
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.7k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.6k
AI in Enterprises - Java and Open Source to the Rescue
ivargrimstad
0
1.1k
Navigating Weather and Climate Data
rabernat
0
77
Site-Speed That Sticks
csswizardry
13
1k
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857