Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Multi-table Full Text Search in Postgres

Multi-table Full Text Search in Postgres

Avatar for Caleb Hearth

Caleb Hearth

October 14, 2014
Tweet

More Decks by Caleb Hearth

Other Decks in Programming

Transcript

  1. Scopes def self.search(query) joins(:user) .where(<<-SQL, query, query, query) articles.body ILIKE

    %?% OR articles.title ILIKE %?% OR users.name ILIKE %?% SQL end
  2. Query object class Search def self.for(query) [ Article.where("title ILIKE %?%",

    query), Article.where("body ILIKE %?%", query), Article.joins(:user) .where("users.name ILIKE %?%", query), ].flatten.uniq end end
  3. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id ) AS this_doesnt_matter WHERE to_tsvector(the_text) @@ to_tsquery('?');
  4. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  5. id AS id, title || ' ' || body AS

    the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  6. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body A FROM articles UNION SELECT articles.id AS id, authors.name AS the_te
  7. title || ' ' || body A FROM articles UNION

    SELECT articles.id AS id, authors.name AS the_te FROM authors JOIN articles ON authors.id = articles ) AS this_doesnt_matter WHERE
  8. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS t FROM articles UNION SELECT
  9. We could throw that into our query object class Search

    def self.for(query) <<-SQL SQL end end
  10. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  11. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  12. CREATE VIEW users_with_recent_activity SELECT DISTINCT ON (users.id) users.*, activities.created_at AS

    active FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days
  13. CREATE VIEW users_with_recent AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    active_at FROM users JOIN activities ON activities.user_id = u
  14. VIEW users_with_recent_activity AS DISTINCT ON (users.id) users.*, activities.created_at AS active_at

    users activities activities.user_id = users.id activities.created_at >= CURRENT_DATE - interval '7 days';
  15. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  16. Querying a view looks just like querying a table SELECT

    * FROM users_with_recent_activity WHERE id IN (1,2,3…) ORDER BY active_at DESC
  17. SELECT id AS article_id, title || ' ' || body

    AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id;
  18. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "--The CREATE VIEW

    sql from before" ) end def down ActiveRecord::Base.connection.execute( 'DROP VIEW users_with_recent_activity' ) end end
  19. SELECT id AS article_id, title || ' ' || body

    AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id UNION SELECT article_id, body AS the_text FROM comments
  20. authors.name AS the_tex FROM authors JOIN articles ON authors.id =

    article UNION SELECT article_id, body AS the_text FROM comments
  21. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "CREATE OR REPLACE

    VIEW searches AS --The sql from the new view" ) end def down ActiveRecord::Base.connection.execute( "CREATE OR REPLACE VIEW searches AS --The sql from the old view" ) end end
  22. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "DROP VIEW searches;

    CREATE VIEW searches AS --The sql from the new view" ) end def down ActiveRecord::Base.connection.execute( "DROP VIEW searches; CREATE VIEW searches AS --The sql from the old view" ) end end
  23. Model generator $ rails generate scenic:model search create app/models/search.rb create

    db/views/searches_v01.sql create db/migrate/..._create_searches.rb
  24. $ cat app/views/searches_v01.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  25. $ cat app/views/searches_v02.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  26. $ cat app/views/searches_v02.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id UNION SELECT article_id, body AS the_text FROM comments
  27. Use GIN • You don't have hundreds of thousands of

    rows • You're not concerned about longer writes blocking the db • You're adding the index late in the game • You don't care about disk space • You want fast lookups (read-heavy)
  28. Use GiST • Very large tables • You have performance

    concerns • For some reason disk space is important • Your table is write-heavy
  29. add_index :articles, :body, using: :gin add_index :articles, :title, using: :gin

    add_index :authors, :name, using: :gin add_index :comments, :body, using: :gin
  30. ElasticSearch via Tire or elasticsearch-rails or Chewy require 'elasticsearch/model' class

    Article < ActiveRecord::Base include Elasticsearch::Model include Elasticsearch::Model::Callbacks end
  31. Solr via Sunspot class Post < ActiveRecord::Base searchable do text

    :title, :body text :comments do comments.map { |comment| comment.body } end boolean :featured integer :blog_id integer :author_id integer :category_ids, :multiple => true double :average_rating time :published_at time :expired_at string :sort_title do title.downcase.gsub(/^(an?|the)/, '') end end end
  32. boolean :featured integer :blog_id integer :author_id integer :category_ids, :multiple =>

    true double :average_rating time :published_at time :expired_at
  33. sphinx with ThinkingSphinx ThinkingSphinx::Index.define :article, :with => :active_record do indexes

    subject, :sortable => true indexes content indexes author.name, :as => :author, :sortable => true has author_id, created_at, updated_at end
  34. :(

  35. :)

  36. Bibliography • http://shisaa.jp/postset/postgresql-full-text-search-part-1.html • http://blog.lostpropertyhq.com/postgres-full-text-search-is-good- enough/ • http://www.postgresql.org/docs/9.3/static/textsearch.html • http://linuxgazette.net/164/sephton.html

    • http://www.postgresql.org/docs/current/static/sql-createview.html • http://www.postgresql.org/docs/current/static/indexes-examine.html • http://www.postgresql.org/docs/current/static/textsearch-indexes.html • http://en.wikipedia.org/wiki/Gin_and_tonic