New Features in Elasticsearch v1.0

Igor Motov
November 04, 2013

Boston Elasticsearch Meetup
Nov 4, 2013

Transcript

  1. Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited

    Igor Motov
    New Features in Elasticsearch 1.0
  2. about

    • Developer at Elasticsearch Inc
        joined Elasticsearch Inc.: Oct 2012
        Elasticsearch contributor since Apr 2011
    • Elasticsearch Inc
        founded: July 2012
        headquarters: Amsterdam and Los Altos, CA
        provides: training (public & onsite), development support, production support subscription
  3. v1.0?

    • v0.4.0 - Feb 8, 2010
    • v0.5.0 - Mar 5, 2010
    • …
    • v0.19.0 - Mar 1, 2012
    • v0.20.0 - Dec 7, 2012
    • v0.90.0 - Apr 29, 2013
    • v1.0 - Soon
  4. v1.0

    • rolling upgrades, because not everyone can afford "scheduled maintenance"
    • ability to back up data, because "rm -rf" happens
  5. v1.0

    • rolling upgrades
    • snapshot/restore (backup)
    • _cat API
    • aggregations
    • distributed percolator
  6. snapshot and restore

    Photo by John http://www.flickr.com/people/60026579@N00
  7. backup in 0.90

    1. disable flush
    2. find all primary shard locations (optional)
    3. copy files from primary shards (rsync)
    4. enable flush
  8. backup in v1.0

        $ curl -XPUT localhost:9200/_snapshot/my_backup/snapshot_20131010

    Callouts: repository (my_backup), snapshot name (snapshot_20131010)
  9. repositories

    • Snapshot Storage
        Shared File System - v1.0
        S3 - v1.0
        HDFS
        Google Compute Engine
        Microsoft Azure
        ...
  10. register repository

        $ curl -XPUT "localhost:9200/_snapshot/my_backup" -d '{
          "type": "fs",
          "settings": {
            "location": "/mnt/es-test-repo"
          }
        }'

    Callouts: repository (my_backup), repository type (fs), location (/mnt/es-test-repo)
  11. start snapshot

        $ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_20131010" -d '{
          "indices": "+test_*,-test_4"
        }'

    Callouts: repository (my_backup), snapshot name (snapshot_20131010), index list (optional)
  12. restore in 0.90

    1. close the index (shutdown the cluster)
    2. find all existing index shards
    3. replace all index shards with data from backup
    4. open the index (start the cluster)
  13. restore in 1.0

        $ curl -XPOST "localhost:9200/test_*/_close"

    Callout: close all indices that start with test_

        $ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_20131010" -d '{
          "indices": "test_*"
        }'

    Callouts: repository name (my_backup), snapshot name (snapshot_20131010), index list (test_*)
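
The snapshot and restore calls on slides 10 to 13 can be collected into one small plan. The sketch below only builds each request as a (method, path, body) tuple and sends nothing. The helper names are hypothetical, and the paths and bodies are copied from the slides, which describe a pre-release API that may differ from the endpoints shipped in 1.0.

```python
import json


def register_repository(repo, location):
    """Request that registers a shared file system repository (slide 10)."""
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{repo}", json.dumps(body))


def start_snapshot(repo, snapshot, indices=None):
    """Request that starts a snapshot; the index list is optional (slide 11)."""
    body = {"indices": indices} if indices else {}
    return ("PUT", f"/_snapshot/{repo}/{snapshot}", json.dumps(body))


def close_indices(pattern):
    """Request that closes every index matching the pattern (slide 13)."""
    return ("POST", f"/{pattern}/_close", "")


def restore_snapshot(repo, snapshot, indices):
    """Restore request, using the path exactly as shown on slide 13."""
    return ("POST", f"/_snapshot/{repo}/{snapshot}", json.dumps({"indices": indices}))


def backup_and_restore_plan():
    """The four requests from the slides, in the order they were presented."""
    return [
        register_repository("my_backup", "/mnt/es-test-repo"),
        start_snapshot("my_backup", "snapshot_20131010", "+test_*,-test_4"),
        close_indices("test_*"),
        restore_snapshot("my_backup", "snapshot_20131010", "test_*"),
    ]
```

Keeping the requests as plain tuples makes it easy to print them as curl commands or hand them to whichever HTTP client is in use.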
  14. https://github.com/elasticsearch/elasticsearch/issues/3826
  15. distributed percolator

    Image Source: Wikipedia,
  16. percolator

    • reverse search
    • alerts
    • updatable search results
  17. registering percolator in 0.90

        $ curl -XPUT "localhost:9200/_percolator/twitter/es-tweets" -d '{
          "query": {
            "match": { "text": "elasticsearch" }
          }
        }'

    Callouts: target index (twitter), query id (es-tweets)
  18. document percolation in 0.90

        $ curl -XGET "localhost:9200/twitter/tweet/_percolate" -d '{
          "doc": {
            "text": "#elasticsearch is awesome",
            "nick": "@imotov",
            "name": "Igor Motov",
            "date": "2013-11-03"
          }
        }'

    Callouts: target index (twitter), percolation end point (_percolate), document to be percolated

        {
          "ok": true,
          "matches": ["es-tweets"]
        }

    Callout: matching queries
  19. how does it work in 0.90?

    • all queries are stored in a special _percolator index
    • the _percolator index has 1 primary shard, which is replicated to every node
    • each percolated document is indexed in memory
    • all queries are executed against this document sequentially
    • execution time is linear in the number of queries
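
The 0.90 design described above can be mimicked with a toy model: keep every registered query in one place and run each of them, in turn, against the incoming document. This is only an illustration of why the cost grows linearly with the number of queries, not Elasticsearch code. `ToyPercolator`, its word-level `match`, and the `solr-tweets` query in the usage note are invented for this sketch.

```python
import re


def tokenize(text):
    """Very rough stand-in for analysis: lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


class ToyPercolator:
    """All queries live in a single structure, like the single-shard index in 0.90."""

    def __init__(self):
        self.queries = {}      # query id -> (field, term)
        self.evaluations = 0   # how many query executions have run so far

    def register(self, query_id, field, term):
        self.queries[query_id] = (field, term.lower())

    def percolate(self, doc):
        """Run every registered query against the one in-memory document."""
        matches = []
        for query_id, (field, term) in self.queries.items():
            self.evaluations += 1  # one execution per registered query
            if term in tokenize(doc.get(field, "")):
                matches.append(query_id)
        return matches
```

Registering `es-tweets` (term `elasticsearch`) and a second `solr-tweets` query, then percolating the tweet from slide 18, returns only `es-tweets` while `evaluations` rises by two: every percolation pays for every registered query.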
  21. registering percolator in 1.0

        $ curl -XPUT "localhost:9200/some_index/_percolator/es-tweets" -d '{
          "query": {
            "match": { "body": "elasticsearch" }
          }
        }'

    Callouts: any index with as many shards as you need (some_index), reserved percolator type (_percolator), query id (es-tweets)
  22. multi index support

        $ curl -XGET "localhost:9200/twitter,facebook/_percolate" -d '{
          "doc": {
            "body": "#elasticsearch is awesome",
            "nick": "@imotov",
            "name": "Igor Motov",
            "date": "2013-11-03"
          }
        }'

    Callout: document to be percolated
  23. full alias support

        $ curl -XGET "localhost:9200/soc_media_alias/_percolate" -d '{
          "doc": {
            "body": "#elasticsearch is awesome",
            "nick": "@imotov",
            "name": "Igor Motov",
            "date": "2013-11-03"
          }
        }'

    Callout: document to be percolated
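
In 1.0 the queries live in ordinary indices, so they can be spread over as many shards as needed and each shard only has to evaluate its own share. A toy model of that idea follows. The `ShardedPercolator` class is made up for this sketch; it routes each query to a shard by a CRC32 of its id and merges the per-shard matches. Real routing and matching in Elasticsearch are more involved.

```python
import re
import zlib


def tokens(text):
    """Very rough stand-in for analysis: lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


class ShardedPercolator:
    def __init__(self, num_shards):
        # One dict of queries per shard: query id -> (field, term)
        self.shards = [{} for _ in range(num_shards)]

    def register(self, query_id, field, term):
        # Deterministic routing: the same id always lands on the same shard.
        shard_no = zlib.crc32(query_id.encode("utf-8")) % len(self.shards)
        self.shards[shard_no][query_id] = (field, term.lower())

    def percolate(self, doc):
        """Percolate on every shard and merge the matches.

        The loop is sequential here; in a cluster each shard could do its
        part on a different node at the same time.
        """
        matches = []
        for shard in self.shards:
            for query_id, (field, term) in shard.items():
                if term in tokens(doc.get(field, "")):
                    matches.append(query_id)
        return sorted(matches)
```

The merged result does not depend on the number of shards; only the amount of work per shard changes.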
  24. other features

    • percolation of existing document
    • percolate count api
    • filter support (in addition to queries in 0.90)
    • highlighting
    • scoring
    • multi percolate (bulk percolation)
  25. https://github.com/elasticsearch/elasticsearch/issues/3173
  26. _cat/* api

    no, it will not help you organize your massive collection of cat pictures
  27. _cat/* api

    It's because humans suck at reading JSON
  29. Which one is the master? (v0.90)

        $ curl "localhost:9200/_cluster/state?pretty&filter_metadata=true&filter_routing_table=true"
        {
          "cluster_name" : "elasticsearch",
          "master_node" : "GNf0hEXlTfaBvQXKBF300A",
          "blocks" : { },
          "nodes" : {
            "ObdRqLHGQ6CMI5rOEstA5A" : {
              "name" : "Triton",
              "transport_address" : "inet[/10.0.1.11:9300]",
              "attributes" : { }
            },
            "4C7pKbfhTvu0slcSy_G4_w" : {
              "name" : "Kid Colt",
              "transport_address" : "inet[/10.0.1.12:9300]",
              "attributes" : { }
            },
            "GNf0hEXlTfaBvQXKBF300A" : {
              "name" : "Lang, Steven",
              "transport_address" : "inet[/10.0.1.13:9300]",
              "attributes" : { }
            }
          }
        }
  30. Which one is the master? (v1.0)

        $ curl localhost:9200/_cat/master
        GNf0hEXlTfaBvQXKBF300A 10.0.1.13 Lang, Steven
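
The difference between the two answers is easiest to see in code. A minimal sketch, assuming the response shapes shown on the slides: the cluster state needs JSON parsing plus a lookup of the master id among the nodes, while the `_cat/master` line splits on whitespace. Both function names are hypothetical, and the sample data is trimmed from the slides.

```python
import json

# Cluster state as shown on the v0.90 slide, trimmed to the fields used here.
CLUSTER_STATE = """
{
  "cluster_name": "elasticsearch",
  "master_node": "GNf0hEXlTfaBvQXKBF300A",
  "nodes": {
    "ObdRqLHGQ6CMI5rOEstA5A": {"name": "Triton", "transport_address": "inet[/10.0.1.11:9300]"},
    "4C7pKbfhTvu0slcSy_G4_w": {"name": "Kid Colt", "transport_address": "inet[/10.0.1.12:9300]"},
    "GNf0hEXlTfaBvQXKBF300A": {"name": "Lang, Steven", "transport_address": "inet[/10.0.1.13:9300]"}
  }
}
"""

CAT_MASTER_LINE = "GNf0hEXlTfaBvQXKBF300A 10.0.1.13 Lang, Steven"


def master_from_cluster_state(state_json):
    """Find the master by joining master_node against the nodes map."""
    state = json.loads(state_json)
    master_id = state["master_node"]
    node = state["nodes"][master_id]
    # transport_address looks like "inet[/10.0.1.13:9300]"; keep only the IP.
    ip = node["transport_address"].split("/")[1].split(":")[0]
    return master_id, ip, node["name"]


def master_from_cat(line):
    """Split into id, ip and name; the name may itself contain spaces."""
    node_id, ip, name = line.strip().split(None, 2)
    return node_id, ip, name
```

Both functions return the same `(id, ip, name)` triple for the sample data; the second one needs no knowledge of the cluster state layout.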
  31. _cat/count

        $ curl localhost:9200/_cat/count
        1383501234301 12:53:54 3344067

    Callout: count (3344067)
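
The one-line output is just as easy to consume from a script. A small sketch, assuming the three columns are an epoch timestamp in milliseconds, a wall-clock time, and the document count; only the last column is labelled on the slide, so the first two meanings are inferred from their shape. The function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_cat_count(line):
    """Split a _cat/count line into its three whitespace-separated columns."""
    epoch_ms, clock, count = line.split()
    return {"epoch_ms": int(epoch_ms), "time": clock, "count": int(count)}


def utc_clock(epoch_ms):
    """Render an epoch-milliseconds value as HH:MM:SS in UTC."""
    moment = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    return moment.strftime("%H:%M:%S")
```

For the line on the slide, `utc_clock` gives 17:53:54, which differs from the printed 12:53:54 by exactly five hours. That is consistent with the second column being the server's local time.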
  32. _cat/* api

    • /_cat/allocation
    • /_cat/count
    • /_cat/health
    • /_cat/master
    • /_cat/nodes
    • /_cat/recovery
    • /_cat/shards
    • /_cat/indices
  33. facets in 0.90

    • terms / terms stats
    • range
    • histogram / date histogram
    • filter/query
    • statistical
    • geo distance
  34. terms facet

    • Divides documents into buckets based on the value of a selected term
    • Calculates statistics on some other field of these documents for each bucket
  35. index of large US cities

        {
          "rank": "21",
          "city": "Boston",
          "state": "MA",
          "population2012": "636479",
          "population2010": "617594",
          "land_area": "48.277",
          "density": "12793",
          "ansi": "619463",
          "location": {
            "lat": "42.332",
            "lon": "71.0202"
          }
        }
  36. terms facet request

        $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
          "facets": {
            "stat1": {
              "terms_stats": {
                "key_field": "state",
                "value_field": "density"
              }
            }
          }
        }'

    Callouts: group by this field (key_field), calculate stats for this field (value_field)
  37. terms facet response

        "facets" : {
          "stat1" : {
            "_type" : "terms_stats",
            "missing" : 0,
            "terms" : [ {
              "term" : "CA",
              "count" : 69,
              "total_count" : 69,
              "min" : 1442.0,
              "max" : 17179.0,
              "total" : 383545.0,
              "mean" : 5558.623188405797
            }, {
              "term" : "TX",
              "count" : 32,
              "total_count" : 32,
              "min" : 1096.0,
              "max" : 3974.0,
              "total" : 79892.0,
              "mean" : 2496.625
            }, {
              "term" : "FL",
              "count" : 20,

    Callouts: group by field (term), stats
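
What the terms_stats facet computes can be reproduced locally on a handful of documents. The sketch below is not how Elasticsearch implements it. It groups documents by `key_field` and derives count, min, max, total and mean of `value_field` for each group, ordering the groups by count as the response on the slide appears to do (an assumption). The function name and the sample values in the usage note are made up, apart from the Boston density from slide 35.

```python
from collections import defaultdict


def terms_stats(docs, key_field, value_field):
    """Group docs by key_field and compute simple stats on value_field."""
    groups = defaultdict(list)
    for doc in docs:
        # The slide's documents store numbers as strings, so convert here.
        groups[doc[key_field]].append(float(doc[value_field]))

    entries = []
    for term, values in groups.items():
        entries.append({
            "term": term,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "total": sum(values),
            "mean": sum(values) / len(values),
        })
    # Largest buckets first (assumed ordering, see lead-in).
    entries.sort(key=lambda entry: entry["count"], reverse=True)
    return entries
```

Given two hypothetical CA cities with densities 1000 and 3000 plus Boston (MA, 12793), the first entry is the CA bucket with count 2, total 4000.0 and mean 2000.0, followed by a single-document MA bucket.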
  38. range facet request

        $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
          "facets": {
            "population_ranges": {
              "histogram": {
                "key_field": "population2012",
                "value_field": "density",
                "interval": 500000
              }
            }
          }
        }'

    Callouts: group by this field (key_field), calculate stats by this field (value_field)
  39. histogram facet response

        "facets" : {
          "population_ranges" : {
            "_type" : "histogram",
            "entries" : [ {
              "key" : 0,
              "count" : 255,
              "min" : 171.0,
              "max" : 17346.0,
              "total" : 980306.0,
              "total_count" : 252,
              "mean" : 3890.1031746031745
            }, {
              "key" : 500000,
              "count" : 25,
              "min" : 956.0,
              "max" : 17179.0,
              "total" : 116597.0,
              "total_count" : 25,
              "mean" : 4663.88
            }, {
              "key" : 1000000,
              "count" : 4,
              "min" : 2798.0,

    Callouts: group by field (population), stats (density)
  40. MOAR!!!

    But what if I want an average density by population histogram for each state?
  41. aggs = buckets + calcs

    CA TX MA CO AZ

        "facets" : {
          "population_ranges" : {
            "_type" : "histogram",
            "entries" : [ {
              "key" : 0,
              "count" : 255,
              "min" : 171.0,
              "max" : 17346.0,
              "total" : 980306.0,
              "total_count" : 252,
              "mean" : 3890.1031746031745
            }, {
              "key" : 500000,
              "count" : 25,
              "min" : 956.0,
              "max" : 17179.0,
              "total" : 116597.0,
              "total_count" : 25,
              "mean" : 4663.88
            }, {
              "key" : 1000000,
              "count" : 4,
              "min" : 2798.0,
              "max" : 4020.0,
              "total" : 13216.0,
              "total_count" : 4,
              "mean" : 3304.0
            }, {
  42. density by state aggregation

        $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
          "aggs" : {
            "mean_density_by_state" : {
              "terms" : {
                "field" : "state"
              },
              "aggs": {
                "mean_density": {
                  "avg" : {
                    "field" : "density"
                  }
                }
              }
            }
          }
        }'

    Callouts: group by this field (state), calculate stats for this field (density)
  43. aggregation response

        "aggregations" : {
          "mean_density_by_state" : {
            "terms" : [ {
              "term" : "CA",
              "doc_count" : 69,
              "mean_density" : {
                "value" : 5558.623188405797
              }
            }, {
              "term" : "TX",
              "doc_count" : 32,
              "mean_density" : {
                "value" : 2496.625
              }
            }, {
              "term" : "FL",
              "doc_count" : 20,
              "mean_density" : {
                "value" : 4006.6
              }
            }, {
              "term" : "CO",
              "doc_count" : 11,

    Callouts: group by state, density stats
  44. density by population aggregation

        $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
          "aggs" : {
            "mean_density_by_population" : {
              "histogram" : {
                "field" : "population2012",
                "interval": 500000
              },
              "aggs": {
                "mean_density": {
                  "avg" : {
                    "field" : "density"
                  }
                }
              }
            }
          }
        }'

    Callouts: group by population, calculate stats on density
  45. aggregation response

        "aggregations" : {
          "mean_density_by_population" : [ {
            "key" : 0,
            "doc_count" : 255,
            "mean_density" : {
              "value" : 3890.1031746031745
            }
          }, {
            "key" : 500000,
            "doc_count" : 25,
            "mean_density" : {
              "value" : 4663.88
            }
          }, {
            "key" : 1000000,
            "doc_count" : 4,
            "mean_density" : {
              "value" : 3304.0
            }
          }, {
            "key" : 1500000,
            "doc_count" : 1,
            "mean_density" : {

    Callouts: group by population, density stats
  46. density by population by state

        $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
          "aggs" : {
            "mean_density_by_population_by_state": {
              "terms" : { "field" : "state" },
              "aggs": {
                "mean_density_by_population" : {
                  "histogram" : {
                    "field" : "population2012",
                    "interval": 500000
                  },
                  "aggs": {
                    "mean_density": {
                      "avg" : {
                        "field" : "density"
                      }
                    }
                  }
                }
              }
            }
          }
        }'

    Callouts: group by state, group by population, calculate stats on density
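
Because every bucket aggregation can carry its own "aggs" section, request bodies like the one above are easy to assemble from small pieces. A minimal sketch follows; the helpers `bucket` and `metric` are hypothetical conveniences, and the aggregation names, fields and interval are taken from the slide.

```python
import json


def metric(kind, field):
    """A calc aggregation such as avg, min or max on one field."""
    return {kind: {"field": field}}


def bucket(kind, sub_aggs=None, **params):
    """A bucket aggregation, optionally with nested sub-aggregations."""
    agg = {kind: params}
    if sub_aggs:
        agg["aggs"] = sub_aggs
    return agg


def density_by_population_by_state():
    """Rebuild the request body from the slide, inside out."""
    mean_density = {"mean_density": metric("avg", "density")}
    by_population = {
        "mean_density_by_population": bucket(
            "histogram", mean_density, field="population2012", interval=500000
        )
    }
    by_state = {
        "mean_density_by_population_by_state": bucket("terms", by_population, field="state")
    }
    return {"aggs": by_state}
```

`json.dumps(density_by_population_by_state())` yields a body that can be passed to curl with `-d`. Adding another level of nesting is one more `bucket(...)` call around the existing pieces.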
  47. aggregation response

        "aggregations" : {
          "mean_density_by_population_by_state" : {
            "terms" : [ {
              "term" : "CA",
              "doc_count" : 69,
              "mean_density_by_population" : [ {
                "key" : 0,
                "doc_count" : 64,
                "mean_density" : {
                  "value" : 5382.453125
                }
              }, {
                "key" : 500000,
                "doc_count" : 3,
                "mean_density" : {
                  "value" : 8985.333333333334
                }
              }, {
                "key" : 1000000,
                "doc_count" : 1,
                "mean_density" : {
                  "value" : 4020.0
                }
              }, {
                "key" : 3500000,
                "doc_count" : 1,
                "mean_density" : {
                  "value" : 8092.0
                }
              } ]
            }, {
              "term" : "TX",
              "doc_count" : 32,

    Callouts: group by state, group by population, stats on density
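
The "buckets + calcs" idea behind this response can be imitated on a few in-memory documents: split by state, split each state bucket into population ranges, then average density inside each innermost bucket. This is a sketch of the concept only, with hypothetical function names; it assumes a histogram key is the value rounded down to a multiple of the interval, which matches the keys 0, 500000, 1000000 seen in the responses.

```python
from collections import defaultdict


def terms_buckets(docs, field):
    """Bucket documents by the exact value of a field."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def histogram_buckets(docs, field, interval):
    """Bucket documents by their value rounded down to a multiple of interval."""
    buckets = defaultdict(list)
    for doc in docs:
        key = (int(doc[field]) // interval) * interval
        buckets[key].append(doc)
    return dict(buckets)


def avg(docs, field):
    """Calc step: mean of a numeric field stored as a string."""
    return sum(float(doc[field]) for doc in docs) / len(docs)


def mean_density_by_population_by_state(docs, interval=500000):
    """terms(state) -> histogram(population2012) -> avg(density)."""
    result = {}
    for state, state_docs in terms_buckets(docs, "state").items():
        ranges = histogram_buckets(state_docs, "population2012", interval)
        result[state] = {
            key: {"doc_count": len(bucket), "mean_density": avg(bucket, "density")}
            for key, bucket in sorted(ranges.items())
        }
    return result
```

With the Boston document from slide 35 (population 636479, density 12793), the MA bucket contains a single range keyed 500000 with `doc_count` 1 and `mean_density` 12793.0.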
  48. calc aggregators

    • avg
    • min
    • max
    • sum
    • count
    • stats
    • extended stats
  49. bucket aggregators

    • global
    • filter
    • missing
    • terms
    • range
    • date range
    • ip range
    • histogram
    • date histogram
    • geo distance
    • nested
  50. https://github.com/elasticsearch/elasticsearch/issues/3300
  51. thank you!

    Igor Motov
    twitter: @imotov
    email: [email protected]

    • Support: http://elasticsearch.com/support
    • Training: http://training.elasticsearch.com/
    • We are hiring: http://elasticsearch.com/about/jobs/