ays LIKE %
• Shouldn't we just buy Algolia?
• Don't I need to create a vector store with embeddings and do some sort of AI search?
• I thought Meilisearch was the new hotness?
• Of course, there is also Typesense…
Schema | Explicit | Inferred
Indexing | Memory + Disk | Disk + Memory
High Availability | Replicas | Sharding
Geo | Yes | Meh
PHP SDK | Yes | Yes
Facets | Excellent | Decent
RAM / Disk | 1.5-2x Data Size | Less than Typesense
at in Docker
• Can be challenging in K8S, but certainly possible
• Clustering
• Raft, either use 1 node or at least 3
• Leader takes all writes, replicas handle actual user traffic
• Replicas contain full data
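Why "1 node or at least 3"? A rough sketch of the majority arithmetic behind Raft in general (nothing Typesense-specific is assumed here; the function names are made up for illustration):

```python
def raft_quorum(nodes: int) -> int:
    """Smallest number of nodes that still forms a majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority remains."""
    return nodes - raft_quorum(nodes)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): quorum={raft_quorum(n)}, can lose {tolerated_failures(n)}")
```

A 2-node cluster needs both nodes for a majority, so it tolerates no more failures than a single node does. Three is the smallest size that survives losing one.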
write, used on the server only
• Search Only Keys - Can search only, these can be allowed in JavaScript
• Can scope keys to and get short-lived keys for mobile apps
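For the scoped, short-lived keys, here is a sketch of how I understand the official clients derive one from a search-only parent key: an HMAC-SHA256 over the embedded parameters, packed together with the first four characters of the parent key. Treat the scheme as an assumption and prefer your SDK's own helper in real code. The parent key and the `user_id` filter below are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def generate_scoped_search_key(parent_key: str, embedded_params: dict) -> str:
    """Derive a scoped key from a search-only parent key (assumed scheme)."""
    params_json = json.dumps(embedded_params)
    # Sign the embedded parameters with the parent key.
    digest = base64.b64encode(
        hmac.new(
            parent_key.encode("utf-8"),
            params_json.encode("utf-8"),
            hashlib.sha256,
        ).digest()
    ).decode("utf-8")
    # Pack signature + parent key prefix + parameters into one token.
    raw = digest + parent_key[:4] + params_json
    return base64.b64encode(raw.encode("utf-8")).decode("utf-8")


PARENT_KEY = "abcd-hypothetical-search-only-key"  # never a write/admin key
scoped = generate_scoped_search_key(
    PARENT_KEY,
    {
        "filter_by": "user_id:=42",             # hypothetical per-user restriction
        "expires_at": int(time.time()) + 3600,  # short-lived: one hour
    },
)
print(scoped)
```

The derivation happens on your server, so the parent key never ships to the browser or the mobile app.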
database tables, they contain docs with the same schema
• Documents - well-structured JSON that meets a schema
• You can search across collections but it's not as useful as you might hope
• If you wanted "site wide search" it might be more useful to make a meta collection
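One way the meta collection idea could look: flatten documents from each source collection into a single common shape before indexing them. The field names (`source`, `title`, `body`) and the sample documents are invented for this sketch.

```python
def to_site_search_doc(collection: str, doc: dict, title_field: str, body_field: str) -> dict:
    """Map a document from any collection into one shared, hypothetical shape."""
    return {
        # Prefix the ID so documents from different collections cannot collide.
        "id": f"{collection}:{doc['id']}",
        "source": collection,
        "title": str(doc.get(title_field, "")),
        "body": str(doc.get(body_field, "")),
    }


product = {"id": "42", "name": "Widget", "description": "A very good widget"}
post = {"id": "42", "headline": "Launch day", "content": "We shipped."}

site_docs = [
    to_site_search_doc("products", product, "name", "description"),
    to_site_search_doc("posts", post, "headline", "content"),
]
print(site_docs)
```

Keeping `source` on every document means results can still be filtered or grouped by where they came from.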
Type | Description | Searchable? | Filterable? | Facetable? | Sortable?
string | Text field for full-text search. | ✅ | ✅ | ✅ | ❌
string[] | Array of text values (tags, categories). | ✅ (per element) | ✅ | ✅ | ❌
int32 | 32-bit integer value. | ❌ | ✅ | ✅ | ✅
int64 | 64-bit integer value for timestamps or large counts. | ❌ | ✅ | ✅ | ✅
float | Floating-point number (price, rating, etc.). | ❌ | ✅ | ✅ | ✅
bool | Boolean true/false field. | ❌ | ✅ | ✅ | ✅
geopoint | Array of two floats [lat, lon] for geo search. | ❌ | ✅ (geo filters) | ❌ | ✅ (distance sort)
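To make the field types concrete, here is a sketch of a collection schema as the JSON body you would send when creating a collection. The collection and field names are made up. `facet`, `optional` and `default_sorting_field` are options I believe the schema accepts, so check them against the docs for your version.

```python
# Field types taken from the table above.
KNOWN_TYPES = {"string", "string[]", "int32", "int64", "float", "bool", "geopoint"}

products_schema = {
    "name": "products_v1",  # hypothetical collection name
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "tags", "type": "string[]", "facet": True},
        {"name": "stock", "type": "int32"},
        {"name": "created_at", "type": "int64"},
        {"name": "price", "type": "float", "facet": True},
        {"name": "in_stock", "type": "bool", "facet": True},
        {"name": "location", "type": "geopoint", "optional": True},
    ],
    "default_sorting_field": "created_at",
}


def unknown_field_types(schema: dict) -> list:
    """Return the names of fields whose type is not in the table above."""
    return [f["name"] for f in schema["fields"] if f["type"] not in KNOWN_TYPES]


print(unknown_field_types(products_schema))
```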
unique ID within a given collection
• The field is called id and it is always a string. You do not (cannot) define it explicitly in your schema
• You must provide the ID yourself
• Nulls are allowed in concept but not in the actual data. To imply a null, you simply do not pass that key in the JSON document.
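In practice those two rules turn into a small clean-up step before indexing. A minimal sketch, assuming records arrive as plain dicts (the function name and sample record are hypothetical):

```python
def prepare_document(record: dict) -> dict:
    """Stringify the id and drop null values so they are simply absent."""
    if record.get("id") is None:
        raise ValueError("every document needs an id")
    # Omit keys whose value is None instead of sending an explicit null.
    doc = {key: value for key, value in record.items() if value is not None}
    # The id field is always a string, even if the database key is an integer.
    doc["id"] = str(record["id"])
    return doc


print(prepare_document({"id": 42, "name": "Widget", "subtitle": None}))
```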
- just had to reindex
• Newer versions - can change the index in place, but it's blocking / synchronous
• Aliases are the way to go
• Create a schema like products_v1
• Create an alias to it called products
• Make products_v2 with new schema, then update alias pointer
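The alias dance, sketched as the sequence of HTTP calls it boils down to. Nothing is sent here; the function only builds the plan. The paths reflect the REST API as I remember it, so verify them against the API reference before relying on them.

```python
def plan_alias_swap(alias: str, old_collection: str, new_schema: dict) -> list:
    """Return (method, path, json_body) steps for a zero-downtime schema change."""
    new_name = new_schema["name"]
    return [
        # 1. Create the new collection next to the old one.
        ("POST", "/collections", new_schema),
        # 2. Fill it; the request body would be JSONL, omitted in this sketch.
        ("POST", f"/collections/{new_name}/documents/import?action=upsert", None),
        # 3. Repoint the alias; searches against the alias now hit the new data.
        ("PUT", f"/aliases/{alias}", {"collection_name": new_name}),
        # 4. Drop the old collection once nothing reads from it.
        ("DELETE", f"/collections/{old_collection}", None),
    ]


steps = plan_alias_swap(
    "products",
    "products_v1",
    {"name": "products_v2", "fields": [{"name": "name", "type": "string"}]},
)
for method, path, body in steps:
    print(method, path, body)
```

The app only ever searches `products`, so the cutover in step 3 needs no deploy.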
Batch indexing 20,000 records for the schema we're talking about, in 1,000 item batches:
• Insert time per record: 0.088 milliseconds
• Upsert time per record: 0.052 milliseconds
• "Slow", per-record indexing:
• Insert time per record: 0.811 milliseconds
• Upsert time per record: 0.783 milliseconds
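At those per-record figures, inserting 20,000 records works out to roughly 1.8 seconds batched versus about 16 seconds one at a time. A sketch of the batching side, assuming the import endpoint takes newline-delimited JSON (helper names are hypothetical):

```python
import json
from typing import Iterable, Iterator, List


def batched(records: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield lists of at most `size` records."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


def to_jsonl(batch: List[dict]) -> str:
    """One JSON document per line, the shape a bulk import body would take."""
    return "\n".join(json.dumps(doc) for doc in batch)


records = [{"id": str(i), "name": f"Product {i}"} for i in range(20_000)]
payloads = [to_jsonl(batch) for batch in batched(records)]
print(len(payloads), "requests instead of", len(records))
```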
Scout does this correctly, IMO
• Only index fields you will actually search on
• If needed, bulk-rehydrate objects from your database
• SELECT * FROM … WHERE id IN (…)
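Rehydration, sketched with an in-memory SQLite table standing in for your real database. The `products` table and its rows are invented, and integer primary keys are assumed. The one thing worth noticing is that `IN (…)` does not preserve order, so the rows get re-sorted to match the search ranking.

```python
import sqlite3


def rehydrate(conn: sqlite3.Connection, hit_ids: list) -> list:
    """Fetch full rows for search hits in one query, keeping the search order."""
    if not hit_ids:
        return []
    ids = [int(i) for i in hit_ids]  # search IDs are strings; the DB key is an int
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT * FROM products WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row["id"]: dict(row) for row in rows}
    # Re-order by search ranking and skip IDs that no longer exist in the DB.
    return [by_id[i] for i in ids if i in by_id]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "Widget A", 9.99), (2, "Widget B", 19.99), (3, "Gadget C", 4.5)],
)

# Pretend the search engine ranked document "3" above document "1".
print(rehydrate(conn, ["3", "1"]))
```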
an enterprise Algolia customer for the better part of a decade
• I got fed up with their ridiculous price increases
• I moved to Meilisearch because it was hot. Worked fine, but I didn't ever really love it, felt like they were headed towards Algolia
• Typesense also has a cloud offering but they seem committed to open source
• If I was building a new search product today, this is absolutely what I'd use.