Ensemble Topic Modelling
Leland McInnes
July 12, 2019
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
LDA and pLSA are probabilistic matrix factorization methods
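To make the matrix factorization view concrete, here is a minimal NumPy sketch of pLSA fitted by EM. It is an illustration only, not the enstop implementation: the function name `plsa`, its parameters, and the toy count matrix are all assumptions made for this example. It approximates the row-normalized document-word matrix as P(w|d) ≈ Σ_z P(z|d) P(w|z), that is, a product of a document-topic matrix and a topic-word matrix whose rows are probability distributions.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA by EM on a dense (n_docs, n_words) count matrix.

    Hypothetical helper for illustration. Assumes every document
    contains at least one word.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape

    # Random starting points; each row is a probability distribution.
    doc_topic = rng.random((n_docs, n_topics))
    doc_topic /= doc_topic.sum(axis=1, keepdims=True)
    topic_word = rng.random((n_topics, n_words))
    topic_word /= topic_word.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Current model of P(w|d) = sum_z P(z|d) P(w|z).
        recon = doc_topic @ topic_word + 1e-12
        ratio = counts / recon
        # E and M steps folded together: the responsibilities P(z|d,w)
        # are never stored explicitly, only their weighted sums.
        new_doc_topic = doc_topic * (ratio @ topic_word.T)
        new_topic_word = topic_word * (doc_topic.T @ ratio)
        doc_topic = new_doc_topic / new_doc_topic.sum(axis=1, keepdims=True)
        topic_word = new_topic_word / new_topic_word.sum(axis=1, keepdims=True)

    return doc_topic, topic_word


# Toy corpus: two documents use words 0-1, two use words 2-3.
counts = np.array(
    [
        [4, 3, 0, 0],
        [5, 2, 0, 0],
        [0, 0, 3, 4],
        [0, 0, 2, 5],
    ],
    dtype=float,
)
doc_topic, topic_word = plsa(counts, n_topics=2)
```

Here `doc_topic` holds P(z|d) for each document and `topic_word` holds P(w|z) for each topic. Changing `seed` changes the starting point, which is where the run-to-run variation discussed later comes from.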
(Ensembles of) pLSA
Performance?
Quality?
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics? Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
Each cluster represents a stable topic
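The idea of pooling topics from many runs and clustering them can be sketched as follows. This is a rough illustration under stated assumptions, not how enstop does it: the slides do not say which clustering algorithm is used, so a simple greedy threshold clustering on Hellinger distance stands in for it. The names `hellinger`, `stable_topics`, `threshold`, and `min_fraction` are hypothetical, and the "runs" are synthetic topics (two recurring topics plus one spurious topic per run) rather than the output of real pLSA fits.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def stable_topics(runs, threshold=0.3, min_fraction=0.5):
    """Greedily cluster topics pooled across runs; keep well-supported clusters.

    runs: list of (n_topics, n_words) arrays, one per run, rows summing to 1.
    Returns one averaged topic per cluster that appears in at least
    min_fraction of the runs.
    """
    clusters = []  # each cluster is a list of (run_index, topic_vector)
    for run_index, topics in enumerate(runs):
        for topic in topics:
            for members in clusters:
                centroid = np.mean([t for _, t in members], axis=0)
                if hellinger(topic, centroid) < threshold:
                    members.append((run_index, topic))
                    break
            else:
                # No existing cluster is close enough: start a new one.
                clusters.append([(run_index, topic)])

    stable = []
    for members in clusters:
        # Support = number of distinct runs that contributed a topic.
        support = len({run_index for run_index, _ in members})
        if support >= min_fraction * len(runs):
            stable.append(np.mean([t for _, t in members], axis=0))
    return np.array(stable)


# Synthetic stand-in for 10 runs over a 6-word vocabulary.
rng = np.random.default_rng(0)
base = np.array(
    [
        [0.45, 0.35, 0.08, 0.08, 0.02, 0.02],
        [0.02, 0.02, 0.08, 0.08, 0.35, 0.45],
    ]
)
runs = []
for run_index in range(10):
    # The two recurring topics, slightly perturbed in every run.
    noisy = base + rng.uniform(0.0, 0.01, size=base.shape)
    # One spurious topic per run, concentrated on a different word.
    spurious = np.full(6, 0.02)
    spurious[run_index % 6] = 0.9
    topics = np.vstack([noisy, spurious])
    runs.append(topics / topics.sum(axis=1, keepdims=True))

result = stable_topics(runs)
```

In this toy setup the number of rows in `result` is not chosen in advance. It is whatever number of clusters has enough support across runs, which mirrors the claim that the number of topics can be determined automatically.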
• Greater stability
• Determines number of topics automatically
• Embarrassingly parallel computation
Implementation
sklearn API
https://github.com/lmcinnes/enstop
pip install enstop