Vector search
• Generate vectors (embeddings) from text documents.
• Store the vectors in a vector DB.
• Generate a vector from the search text.
• Find matching items by comparing vectors with a similarity algorithm.
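
The four steps above can be sketched in a few lines of plain Python. This is only an illustration: `embed` is a toy stand-in (normalized word counts) for a real embedding model, `VectorStore` is a hypothetical in-memory replacement for a vector DB, and cosine similarity is assumed as the similarity algorithm.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: L2-normalized word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def cosine(a, b):
    """Dot product of two normalized sparse vectors, i.e. cosine similarity."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


class VectorStore:
    """Hypothetical in-memory stand-in for a vector DB."""

    def __init__(self):
        self._items = []  # list of (doc_id, vector)

    def add(self, doc_id, text):
        # Steps 1 and 2: generate a vector from the document and store it.
        self._items.append((doc_id, embed(text)))

    def search(self, query, top_k=3):
        # Step 3: generate a vector from the search text.
        query_vec = embed(query)
        # Step 4: score every stored vector and keep the best matches.
        scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in self._items]
        scored.sort(reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:top_k]]


store = VectorStore()
store.add("doc-1", "how to request paid leave")
store.add("doc-2", "office network printer setup")
store.add("doc-3", "paid leave policy for new employees")
results = store.search("paid leave request", top_k=2)
```

With a real embedding model the vectors are dense and capture meaning rather than exact word overlap, but the store/search flow stays the same.
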
Model
• Choosing the embedding model is important.
• How to choose:
  ◦ Massive Text Embedding Benchmark (MTEB) Leaderboard
  ◦ https://huggingface.co/spaces/mteb/leaderboard
• I'm using "intfloat/multilingual-e5-large".
  ◦ Supports 100 languages, including Japanese and Thai.
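
As a rough sketch of how this model might be called locally: the example below assumes the `sentence-transformers` package, which is not mentioned on the slides, and follows the E5 convention (as I understand the model card) of prefixing indexed text with "passage: " and search text with "query: ". The helper names `as_passage`, `as_query`, `load_model` and `demo` are made up for illustration.

```python
MODEL_NAME = "intfloat/multilingual-e5-large"


def as_passage(text):
    """Prefix for documents to be indexed (E5 convention)."""
    return "passage: " + text.strip()


def as_query(text):
    """Prefix for search text (E5 convention)."""
    return "query: " + text.strip()


def load_model():
    # Assumption: `pip install sentence-transformers`.
    # Imported lazily because the first load downloads the model weights.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(MODEL_NAME)


def demo():
    """Embed two documents and one query, return cosine similarities."""
    model = load_model()
    docs = ["年次有給休暇の申請方法", "How to set up the office printer"]
    doc_vecs = model.encode(
        [as_passage(d) for d in docs], normalize_embeddings=True
    )
    query_vec = model.encode(
        [as_query("how do I request paid leave?")], normalize_embeddings=True
    )[0]
    # Vectors are normalized, so the dot product is the cosine similarity.
    return doc_vecs @ query_vec
```

Calling `demo()` runs entirely on the local machine once the model has been downloaded, so no document text is sent to an external API.
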
Situation
• I want our intranet Plone site to have more capable search.
• Search by whole sentences, not only by individual words.
• No OpenAI: intranet data should not leave the intranet boundary.
Feature
• A new index class, written with ZCTextIndex as a reference.
• The index is added to portal_catalog so that content is indexed automatically.
• The embedding model is "intfloat/multilingual-e5-large"; no OpenAI is involved.
• As a consequence, new keyword arguments are available on portal_catalog for searching.