Improve your e-commerce's text search

Introduction
@guilhermeguitte
I work at Leroy Merlin Brasil.
Co-organizer of the Laravel Meetup in São Paulo.
Co-organizer of the Laravel BR Hangouts.

Distributed, scalable, and highly available
Real-time search and analytics capabilities
Sophisticated RESTful API
Schema-Free
Document-Oriented

$ bin/elasticsearch

Amount of data
Best search ever

Lesson #1
Don't treat search as a black box

{
  "_id": 88045034,
  "name": "Furadeira e parafusadeira 12W 127V (110V) 6271DWPETC Makita",
  "characteristics": {
    "produto": "Furadeira e parafusadeira",
    "marca": "Makita",
    "tensao": "127V (110V)",
    "alimentacao": "Bateria"
  },
  "created_at": {"sec": 1380291160, "usec": 7000},
  "updated_at": {"sec": 1380291160, "usec": 7000},
  "description": "Com design compacto e leve,..."
}

curl -XPOST http://192.168.59.103:9200/your_index/product -d '{
  "_id": 88045034,
  "name": "Furadeira e parafusadeira 12W 127V (110V) 6271DWPETC Makita",
  "characteristics": {
    "produto": "Furadeira e parafusadeira",
    "marca": "Makita",
    "tensao": "127V (110V)",
    "alimentacao": "Bateria"
  },
  "created_at": {"sec": 1380291160, "usec": 7000},
  "updated_at": {"sec": 1380291160, "usec": 7000},
  "description": "Com design compacto e leve,..."
  ...
}'

Your node: 192.168.59.103:9200
Index: your_index
Type: product

Request to index
Request to search

# Conceptual pseudocode: what Elasticsearch does with each field at index time.
@fields.each do |name, value|
  index    = Elasticsearch::index 'your_index'
  analyzer = index.get_analyzer_by_field name
  tokens   = analyzer.analyze name, value
  Elasticsearch::persist_it_to_lucene tokens
end

{
  "settings": {
    "analysis": {
      "filter": {
        "portuguese_stop_words": {
          "type": "stop",
          "stopwords": "_portuguese_"
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "whitespace",
          "filter": ["lowercase", "portuguese_stop_words"]
        }
      }
    }
  }
}

Furadeira e parafusadeira 12W 127V (110V) 6271DWPETC Makita

furadeira
parafusadeira
12w
127v
(110v)
6271dwpetc
makita
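
The token list above can be reproduced with a small sketch of the same chain (html_strip, whitespace tokenizer, lowercase, stop words). This is a toy re-implementation for illustration, not how Lucene does it; `analyze` and `STOP_WORDS` are hypothetical names, and the stop list is a tiny sample rather than the full `_portuguese_` list.

```ruby
# Toy version of the my_analyzer chain, for illustration only.
# STOP_WORDS is a small hypothetical sample, not Lucene's _portuguese_ list.
STOP_WORDS = %w[a o e de da do em um uma para com].freeze

def analyze(text)
  stripped = text.gsub(/<[^>]*>/, ' ')          # char_filter: crude html_strip
  tokens   = stripped.split                     # tokenizer: whitespace
  tokens   = tokens.map(&:downcase)             # filter: lowercase
  tokens.reject { |t| STOP_WORDS.include?(t) }  # filter: stop words
end

puts analyze('Furadeira e parafusadeira 12W 127V (110V) 6271DWPETC Makita').inspect
# => ["furadeira", "parafusadeira", "12w", "127v", "(110v)", "6271dwpetc", "makita"]
```

Note that the whitespace tokenizer keeps the parentheses in "(110v)", so a search for "110v" would not match that token.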

TERMS         | Document | Freq
-------------------------------
furadeira     | 1, 5, 7  | 3
parafusadeira | 1, 2     | 2
12w           | 1, 5     | 2
127v          | 1, 6     | 2
(110v)        | 1, 8     | 2
6271dwpetc    | 1        | 1
makita        | 1        | 1
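
A table like this one can be built with a few lines of Ruby. The sketch below is a simplified model of an inverted index (term to document ids), assuming lowercase plus whitespace splitting as the only analysis; `build_inverted_index` and the sample documents are made up for the example.

```ruby
# Builds a toy inverted index: term => list of document ids containing it.
# Analysis is simplified to lowercase + whitespace splitting.
def build_inverted_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    # uniq: a document is listed once per term, however often the term occurs
    text.downcase.split.uniq.each { |term| index[term] << id }
  end
  index
end

# Hypothetical documents, keyed by id.
docs = {
  1 => 'Furadeira parafusadeira Makita',
  2 => 'Parafusadeira Bosch',
  5 => 'Furadeira 12W'
}

build_inverted_index(docs).each do |term, ids|
  puts format('%-14s | %-8s | %d', term, ids.join(', '), ids.size)
end
```

Searching for a term then becomes a hash lookup instead of a scan over every document.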

localhost:9200/my_index/_analyze?analyzer=my_analyzer
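
To try the analyzer against your own text, the URL above needs the text to analyze. A minimal sketch, assuming the query-string form (`analyzer` and `text` parameters) accepted by Elasticsearch 1.x; newer versions expect a JSON body with the same two fields instead. `analyze_url` is a hypothetical helper, and nothing is sent until the commented lines are enabled.

```ruby
require 'uri'

# Builds the _analyze URL from the slide and appends the text to analyze.
# Host, index and analyzer defaults are the ones shown on the slides.
def analyze_url(text, host: 'localhost:9200', index: 'my_index', analyzer: 'my_analyzer')
  query = URI.encode_www_form(analyzer: analyzer, text: text)
  URI("http://#{host}/#{index}/_analyze?#{query}")
end

url = analyze_url('Furadeira Makita 127V')
puts url

# To actually send it (requires a running node):
#   require 'net/http'
#   puts Net::HTTP.get(url)
```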

Request to index
Request to search

Alchemy

Structured Search
Full-text search

Multi-field search

Proximity Matching

Partial Matching

Relevance

Ranges

Null values

Using Caching

Match

Multi-Match

Multi-word

Boosting

Most_fields

Best_fields

_all

Phrase matching

Ngram
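
As one concrete case from the list above, partial matching with n-grams can be sketched as follows. This is an illustration of the idea only, assuming fixed-size character n-grams; `ngrams` and `partial_match?` are hypothetical helpers, not Elasticsearch APIs.

```ruby
# Splits a term into lowercase character n-grams of a fixed size.
def ngrams(term, size)
  term.downcase.each_char.each_cons(size).map(&:join)
end

# A query partially matches a term when every n-gram of the query
# also appears among the n-grams of the term.
def partial_match?(query, term, size = 3)
  query_grams = ngrams(query, size)
  return false if query_grams.empty? # query shorter than one n-gram

  (query_grams - ngrams(term, size)).empty?
end

puts ngrams('Makita', 3).inspect       # => ["mak", "aki", "kit", "ita"]
puts partial_match?('maki', 'Makita')  # => true
puts partial_match?('bosch', 'Makita') # => false
```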

I found the perfect search!
"Furadeira Makita"

Joãozinho, come test the new search
"Furadeira Makita"

This search doesn't work!
Joãozinho

Lesson #2
Never underestimate a user's search query

[Figure: diagram comparing Actual and Expected results. Labels: OK, Garbage, Meaningful, What you want, More garbage.]
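
The Actual/Expected picture can be put into numbers. A minimal sketch, assuming "OK" is the overlap between what the query returned and what the user wanted, and "garbage" is what was returned but not wanted; `compare_results` and the product ids are made up, and precision and recall are the usual names for the two ratios.

```ruby
# Compares what a query returned (actual) with what the user wanted (expected).
def compare_results(actual, expected)
  ok = actual & expected
  {
    ok:        ok,                               # returned and wanted
    garbage:   actual - expected,                # returned but not wanted
    missed:    expected - actual,                # wanted but not returned
    precision: ok.size.to_f / actual.size,       # share of results that are OK
    recall:    ok.size.to_f / expected.size      # share of wanted items found
  }
end

# Hypothetical product ids.
report = compare_results([1, 2, 3, 4, 5, 6], [4, 5, 6, 7])
puts report.inspect
```

A looser query usually raises recall and lowers precision: more of what you want, and more garbage with it.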

NAME = John AND SURNAME = Silva # => 100
NAME = John OR SURNAME = Silva # => 1000
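
The same effect shows up on a tiny in-memory list. The sketch below uses made-up people and hypothetical helpers `match_all` and `match_any`; the counts are far smaller than the 100 and 1000 on the slide, but the direction is the same.

```ruby
Person = Struct.new(:name, :surname)

# Hypothetical data set.
PEOPLE = [
  Person.new('John', 'Silva'),
  Person.new('John', 'Souza'),
  Person.new('Maria', 'Silva'),
  Person.new('Ana', 'Costa')
].freeze

# AND: both conditions must hold (fewer, more precise hits).
def match_all(people, name, surname)
  people.select { |person| person.name == name && person.surname == surname }
end

# OR: either condition is enough (more hits, more noise).
def match_any(people, name, surname)
  people.select { |person| person.name == name || person.surname == surname }
end

puts "AND: #{match_all(PEOPLE, 'John', 'Silva').size}" # => AND: 1
puts "OR:  #{match_any(PEOPLE, 'John', 'Silva').size}" # => OR:  3
```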

?

Lesson #3
There is no single query that solves everything.

Search

Strict
Fuzzy
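
The difference between the two modes can be sketched with edit distance. This is an illustration of the idea, not Elasticsearch's implementation; `levenshtein`, `strict_search`, `fuzzy_search`, the catalog, and the default of two allowed edits are all assumptions made for the example.

```ruby
# Classic dynamic-programming Levenshtein edit distance.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,           # deletion
                  current[j - 1] + 1,        # insertion
                  previous[j - 1] + cost].min # substitution
    end
    previous = current
  end
  previous.last
end

# Hypothetical indexed terms.
CATALOG = %w[furadeira parafusadeira makita].freeze

# Strict: the term must match exactly.
def strict_search(term)
  CATALOG.select { |word| word == term.downcase }
end

# Fuzzy: the term may differ by up to max_edits single-character edits.
def fuzzy_search(term, max_edits = 2)
  CATALOG.select { |word| levenshtein(word, term.downcase) <= max_edits }
end

puts strict_search('makta').inspect # => []
puts fuzzy_search('makta').inspect  # => ["makita"]
```

Strict search misses the typo; fuzzy search recovers it, at the price of also accepting terms the user did not mean.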

We found out...
Joãozinho

Lesson #4
Don't rely on a single input to judge the quality of your query

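# Pseudocode: try a strict query first, then fall back to fuzzy.
# Assumes each query returns nil when nothing matches.
if @term.split.size >= 2
  @results = Elasticsearch::strict_query_with_two_terms @term
else
  @results = Elasticsearch::strict_query_with_one_term @term
end

unless @results
  @results = Elasticsearch::fuzzy_query @term
end

@results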

Polysemy

"Boias"

Shortcut

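# Pseudocode: a known shortcut term goes straight to its category page.
@category = MyModel::get_by_shortcut @term

return redirect_to(@category) if @category

# Otherwise: strict query first, then fuzzy (queries return nil on no match).
if @term.split.size >= 2
  @results = Elasticsearch::strict_query_with_two_terms @term
else
  @results = Elasticsearch::strict_query_with_one_term @term
end

unless @results
  @results = Elasticsearch::fuzzy_query @term
end

@results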

Relevance

Synonyms

"Bica"

Lesson #5
Design your own search pipeline

# Pseudocode for the full pipeline: shortcut, then strict, then fuzzy.
@category = MyModel::get_by_shortcut @term

return redirect_to(@category) if @category

# Queries are assumed to return nil when nothing matches.
if @term.split.size >= 2
  @results = Elasticsearch::strict_query_with_two_terms @term
else
  @results = Elasticsearch::strict_query_with_one_term @term
end

unless @results
  @results = Elasticsearch::fuzzy_query @term
end

@results
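
The pipeline can be exercised without Elasticsearch by injecting each step. The sketch below is one possible shape, not the speaker's production code: `SearchPipeline`, the stub data, the category path, and the crude prefix-based stand-in for a fuzzy query are all hypothetical.

```ruby
# Shortcut -> strict -> fuzzy, with every step passed in as a callable.
class SearchPipeline
  def initialize(shortcut:, strict_one:, strict_many:, fuzzy:)
    @shortcut    = shortcut
    @strict_one  = strict_one
    @strict_many = strict_many
    @fuzzy       = fuzzy
  end

  # Returns [:redirect, category] or [:results, hits].
  def call(term)
    category = @shortcut.call(term)
    return [:redirect, category] if category

    strict = term.split.size >= 2 ? @strict_many : @strict_one
    hits = strict.call(term)
    hits = @fuzzy.call(term) if hits.nil? || hits.empty?
    [:results, hits]
  end
end

# Stub data, made up for the example.
SHORTCUTS = { 'furadeiras' => '/categoria/furadeiras' }.freeze
PRODUCTS  = ['furadeira makita', 'parafusadeira bosch'].freeze

# Strict stand-in: every query word must be a word of the product name.
WORD_MATCH = lambda do |term|
  words = term.downcase.split
  PRODUCTS.select { |product| (words - product.split).empty? }
end

# Fuzzy stand-in: the product name contains the first four letters of the term.
PREFIX_MATCH = ->(term) { PRODUCTS.select { |product| product.include?(term.downcase[0, 4]) } }

PIPELINE = SearchPipeline.new(
  shortcut:    ->(term) { SHORTCUTS[term.downcase] },
  strict_one:  WORD_MATCH,
  strict_many: WORD_MATCH,
  fuzzy:       PREFIX_MATCH
)

p PIPELINE.call('furadeiras')    # shortcut hit: redirect
p PIPELINE.call('makita')        # strict hit
p PIPELINE.call('parafuzadeira') # typo: strict misses, fuzzy recovers
```

Treating an empty result like a missing one means the fuzzy fallback also runs when a strict query returns an empty list rather than nil.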

Final Lesson
Your search is not Elasticsearch.
Your search uses Elasticsearch.

Tips

Thank you!

@guilhermeguitte
guilhermeguitte
https://br.linkedin.com/in/guitte

Guilherme Guitte

September 19, 2015