The New Elasticsearch .NET Client: Getting Started and Behind the Scenes

Elasticsearch is a leading search and analytics solution used by thousands of companies worldwide for use cases including search, analytics, monitoring, and security information and event management (SIEM). With an emphasis on speed, scale, and relevance, it's transforming how the world uses data.

In this session, we'll learn about leveraging the power of Elasticsearch within .NET applications, utilising the new Elasticsearch .NET client library. Join Steve to learn about the .NET client and how to install it in .NET applications and use it to begin indexing and searching documents. We'll even take a sneak peek at how we maintain the library using code generation.

This session is aimed at software developers looking to get started by combining the capabilities of Elasticsearch with their .NET applications. You'll leave with the core knowledge required to begin using Elasticsearch and the .NET client library.

Steve Gordon

June 23, 2022

Transcript

  1. 1 Steve Gordon (Senior Engineer @ Elastic) @stevejgordon | stevejgordon.co.uk

    The New Elasticsearch .NET Client Getting Started and Behind the Scenes
  2. 2 Agenda

    • Introduction to Elasticsearch
    • .NET Client for Elasticsearch
      ‒ Problems with the existing client
      ‒ Introducing the new v8 client
    • Demos
    • Behind the scenes
      ‒ Building the Elasticsearch specification
      ‒ Code generation of the new .NET client
  3. 4 Elastic Stack (diagram)

    • Solutions
    • Visualize & Manage: Kibana
    • Store, Search, & Analyze: Elasticsearch
    • Ingest: Beats, Logstash, Elastic Agent
    • Deployment labels: SaaS, On-Prem, Elastic Cloud, Elastic Cloud Enterprise, Standalone, Elastic Cloud on Kubernetes
  4. 7 Basic Terminology

    CLUSTER: A collection of one or more nodes (servers) that together hold your data and provide federated indexing and search capabilities across all nodes.
  5. 8 Basic Terminology

    NODE: A single server that is part of your cluster, stores your data, and participates in the cluster's indexing and search capabilities. (Diagram labels: CLUSTER, NODE 1, NODE 2, NODE 3.)
  6. 9 Basic Terminology

    INDEX: A collection of documents that have somewhat similar characteristics. (Diagram labels: CLUSTER, NODE 1, NODE 2, NODE 3, INDEX.)
  7. 10 Basic Terminology

    SHARD: Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. (Diagram labels: CLUSTER, NODE 1, NODE 2, NODE 3, INDEX, primary shards P1, P2, P3, replica shards R1, R2, R3.)
  9. 12 Basic Terminology

    DOCUMENT: The basic unit of information that can be indexed, expressed in JSON form. (Diagram labels: CLUSTER, NODE 1, NODE 2, NODE 3, INDEX, primary shards P1, P2, P3, replica shards R1, R2, R3, each shard holding DOC entries.)
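
To make the terminology concrete, here is a minimal TypeScript sketch of a document and of how a document ID could be mapped onto one of an index's primary shards. The `BookDocument` shape and the `toyHash` and `pickPrimaryShard` functions are hypothetical names invented for this illustration. Elasticsearch uses its own hash function and routing settings, so only the general "hash, then modulo the primary shard count" idea is meant to carry over.

```typescript
// A document is a JSON object. This particular shape is hypothetical.
interface BookDocument {
  title: string;
  author: string;
  year: number;
}

// Toy stand-in for a routing hash (NOT the hash Elasticsearch uses).
function toyHash(value: string): number {
  let hash = 0;
  for (let i = 0; i < value.length; i++) {
    // ">>> 0" keeps the running value an unsigned 32-bit integer.
    hash = (hash * 31 + value.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Choose a primary shard number in the range [0, primaryShardCount).
function pickPrimaryShard(documentId: string, primaryShardCount: number): number {
  if (primaryShardCount < 1 || Math.floor(primaryShardCount) !== primaryShardCount) {
    throw new RangeError("primaryShardCount must be a positive integer");
  }
  return toyHash(documentId) % primaryShardCount;
}

const book: BookDocument = { title: "Dune", author: "Frank Herbert", year: 1965 };
const bookJson = JSON.stringify(book); // the JSON form that would be indexed
const shardForBook = pickPrimaryShard("book-1", 3); // three primaries, as in the diagram
```

The same ID always maps to the same shard number, which is why a document can later be found again without asking every shard.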
  10. 14 The Elasticsearch API in Numbers

    • > 400 API endpoints
    • > 2000 data structures
      ‒ 50 query types
      ‒ 70 aggregation types
      ‒ 30 field types
  11. 15 Language Clients • .NET • Java • JavaScript •

    Ruby • Go • PHP • Perl • Python • Rust
  12. 16 Existing Elasticsearch .NET Client (7.x)

    NEST (High Level Client)
    • Methods for every API
    • Strongly-typed requests & responses
    • Aggregations
    • Mappings
    • Query DSL
    • Fluent syntax
    • Helpers

    Elasticsearch.Net (Low Level Client)
    • Dependency free, unopinionated client
    • Handles transport
    • Client-side load balancing
    • Request parameters (query string)
    • Serialisation
    • API URL resolution
  13. 18 Problems with the Existing Client

    • Hand written
      ‒ API is not always consistent
      ‒ A lot of maintenance work (400 endpoints and thousands of types!)
    • Legacy internalised JSON serialiser based on Utf8Json
    • ~12 years of historical decisions
  15. 20 Introducing Elastic.Clients.Elasticsearch

    The new .NET client for v8.0
    • A new generation of the Elasticsearch client
    • Code generated
      ‒ Based on a formal specification of the Elasticsearch API
    • Uses System.Text.Json serializer
    • Built on a common Elastic.Transport layer
    • Removes some of the legacy of the past to create a cleaner API
  16. 24 REST API specification → OpenAPI?

    OpenAPI is too limited
    • Elasticsearch API is complex and not “canonical”
    • Would require custom extensions
    • Our problem is mostly about data structures, not so much URLs

    OpenAPI is complex
    • “The Schema Object is a superset of the JSON Schema Specification Draft 2020-12” 😱
    • 400 endpoints, 2000 structures… in YAML/JSON 😓
  17. 25 JSON API Specification → TypeScript!

    TypeScript’s type system is built to represent JSON/JS
    • Static type checking of the API
    • Strong IDE support
    • ts-morph: a library to build TS code processors
      ‒ Setup, navigation, and manipulation of the TypeScript AST can be a challenge. This library wraps the TypeScript compiler API so it's simple.
  18. 27 Example: Search Request

    /**
     * @rest_spec_name search
     * @since 0.0.0
     * @stability stable
     */
    export interface Request extends RequestBase {
      path_parts: {
        index?: Indices
      }
      query_parameters: {
        allow_no_indices?: Boolean
        ...
        size?: integer
        from?: integer
        sort?: string | string[]
      }
      body: {
        /** @aliases aggs */ // ES uses "aggregations" in serialization
        aggregations?: Dictionary<string, AggregationContainer>
        collapse?: FieldCollapse
        /**
         * If true, returns detailed information about score computation as part of a hit.
         * @server_default false
         */
        explain?: boolean
        /** @server_default 0 */
        from?: integer
        ...
      }

    export type IndexName = string
    export type Indices = IndexName | IndexName[]

    Slide callouts: Meta information, Alias tag, Documentation comment
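
The three sections of the request definition (`path_parts`, `query_parameters`, `body`) correspond to the three places a value can travel in an HTTP call. The sketch below shows that mapping under stated assumptions: `SearchRequest`, `HttpRequest`, and `toHttpRequest` are hypothetical names, the type is a heavily reduced subset of the one on the slide, and this is not how any of the generated clients actually build requests.

```typescript
// Reduced, hypothetical subset of the search request from the slide.
type IndexName = string;
type Indices = IndexName | IndexName[];

interface SearchRequest {
  path_parts: { index?: Indices };
  query_parameters: { allow_no_indices?: boolean; size?: number; from?: number };
  body: { explain?: boolean; from?: number };
}

interface HttpRequest {
  method: string;
  url: string;
  body: string;
}

function toHttpRequest(request: SearchRequest): HttpRequest {
  // path_parts become URL segments; several indices are comma-separated.
  const index = request.path_parts.index;
  const names: string[] = index === undefined ? [] : Array.isArray(index) ? index : [index];
  const path = names.length > 0
    ? "/" + names.map((indexName) => encodeURIComponent(indexName)).join(",") + "/_search"
    : "/_search";

  // query_parameters become the query string; unset values are skipped.
  const query: string[] = [];
  const q = request.query_parameters;
  if (q.allow_no_indices !== undefined) query.push("allow_no_indices=" + q.allow_no_indices);
  if (q.size !== undefined) query.push("size=" + q.size);
  if (q.from !== undefined) query.push("from=" + q.from);

  // body properties are serialised as the JSON payload.
  return {
    method: "POST",
    url: query.length > 0 ? path + "?" + query.join("&") : path,
    body: JSON.stringify(request.body),
  };
}

const exampleCall = toHttpRequest({
  path_parts: { index: ["books", "films"] },
  query_parameters: { size: 10 },
  body: { explain: true },
});
```

Because `from` appears both as a query parameter and as a body property in the specification, a generator has to keep track of which section each property came from; the sketch keeps them apart for the same reason.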
  19. 28 Example: Search Response

    export class Response<TDocument> {
      body: ResponseBody<TDocument>
    }

    export class ResponseBody<TDocument> {
      took: long
      timed_out: boolean
      _shards: ShardStatistics
      hits: HitsMetadata<TDocument>
      aggregations?: Dictionary<AggregateName, Aggregate>
      _clusters?: ClusterStatistics
      fields?: Dictionary<string, UserDefinedValue>
      max_score?: double
      num_reduce_phases?: long
      profile?: Profile
      pit_id?: Id
      _scroll_id?: ScrollId
      suggest?: Dictionary<SuggestionName, Suggest<TDocument>[]>
      terminated_early?: boolean
    }

    Slide callout: User-provided type
  20. 29 Example: Search Response (same Response and ResponseBody definitions as the previous slide, plus:)

    export class HitsMetadata<T> {
      total?: TotalHits | long
      hits: Hit<T>[]
      max_score?: double | null
    }

    export class HitMetadata<TDocument> {
      _id: Id
      _index: IndexName
      _primary_term: long
      _routing: string
      _seq_no: SequenceNumber
      _source: TDocument
      _version: VersionNumber
    }
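
The generic `TDocument` parameter is what lets a caller get typed documents back from a search. The following sketch consumes a response typed this way and handles the `TotalHits | long` union. The interfaces are a reduced, hypothetical subset of the slide's types, the members given to `TotalHits` are an assumption for the example, and `Book`, `totalHitCount`, and `sources` are invented names.

```typescript
// Reduced, hypothetical subset of the response types from the slide.
interface TotalHits {
  value: number;
  relation: "eq" | "gte";
}

interface Hit<TDocument> {
  _id: string;
  _index: string;
  _source: TDocument;
}

interface HitsMetadata<TDocument> {
  total?: TotalHits | number;
  hits: Hit<TDocument>[];
  max_score?: number | null;
}

interface ResponseBody<TDocument> {
  took: number;
  timed_out: boolean;
  hits: HitsMetadata<TDocument>;
}

// `total` is a union in the specification: an object or a bare number.
function totalHitCount<TDocument>(responseBody: ResponseBody<TDocument>): number | undefined {
  const total = responseBody.hits.total;
  if (total === undefined) return undefined;
  return typeof total === "number" ? total : total.value;
}

// The caller chooses TDocument; the JSON itself carries no type information.
function sources<TDocument>(responseBody: ResponseBody<TDocument>): TDocument[] {
  return responseBody.hits.hits.map((hit) => hit._source);
}

interface Book {
  title: string;
}

const capturedJson =
  '{"took":3,"timed_out":false,"hits":{"total":{"value":1,"relation":"eq"},' +
  '"max_score":1.0,"hits":[{"_id":"1","_index":"books","_source":{"title":"Dune"}}]}}';
const parsedBody = JSON.parse(capturedJson) as ResponseBody<Book>;
const bookTitles = sources(parsedBody).map((book) => book.title);
```

The cast after `JSON.parse` is unchecked, which is one reason a typed client benefits from a specification that has been validated against real responses.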
  21. 30 Example: Query

    /**
     * @variants container
     * @non_exhaustive
     * @doc_id query-dsl
     */
    export class QueryContainer {
      bool?: BoolQuery
      boosting?: BoostingQuery
      /** @deprecated 7.3.0 */
      common?: SingleKeyDictionary<Field, CommonTermsQuery>
      /** @since 7.13.0 */
      combined_fields?: CombinedFieldsQuery
      constant_score?: ConstantScoreQuery
      dis_max?: DisMaxQuery
      distance_feature?: DistanceFeatureQuery
      exists?: ExistsQuery
      function_score?: FunctionScoreQuery
      fuzzy?: SingleKeyDictionary<Field, FuzzyQuery>
      geo_bounding_box?: GeoBoundingBoxQuery
      geo_distance?: GeoDistanceQuery
      geo_polygon?: GeoPolygonQuery
      ...

    Slide callouts:
    • Container variant is used for types that contain all the variants inside the definition
    • Properties can be tagged as deprecated since a particular version
    • We also track versions where new properties have been added
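
A container type lists every variant as an optional property, so the type system alone does not say which one is in use. The sketch below finds the active variant at runtime, assuming a well-formed container sets exactly one of them. The three-variant `QueryContainer` and the `variantOf` function are hypothetical and far smaller than the real definition.

```typescript
// Hypothetical, heavily reduced container: every variant is an optional property.
interface QueryContainer {
  bool?: { must?: QueryContainer[] };
  exists?: { field: string };
  fuzzy?: { [field: string]: { value: string } };
}

type QueryVariant = keyof QueryContainer;

// Return the name of the single variant that is set, or throw otherwise.
function variantOf(container: QueryContainer): QueryVariant {
  let found: QueryVariant | undefined;
  let count = 0;
  for (const key of Object.keys(container) as QueryVariant[]) {
    if (container[key] !== undefined) {
      found = key;
      count++;
    }
  }
  if (count !== 1 || found === undefined) {
    throw new Error("expected exactly one query variant, found " + count);
  }
  return found;
}

const existsQuery: QueryContainer = { exists: { field: "title" } };
const activeVariant = variantOf(existsQuery);
```

A code generator can use the `@variants container` annotation to emit this kind of check, or a dedicated union-like type, instead of leaving it to each caller.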
  22. 31 Validating the Specification

    • Piggy-back on Elasticsearch integration tests
      ‒ Capture request and response JSON
      ‒ Does it fit in the corresponding TS type?
      ‒ > 5400 validation tests!
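
The question "does the captured JSON fit the type?" can be pictured as a runtime shape check. The sketch below is a deliberately shallow illustration of that idea, not the validation tooling the slide refers to: `TypeSpec`, `PropertySpec`, and `validate` are hypothetical names, and only top-level property presence and primitive type are checked.

```typescript
// Hypothetical mini descriptor: property name -> expected JSON type.
type JsonKind = "number" | "boolean" | "string" | "object";

interface PropertySpec {
  kind: JsonKind;
  required: boolean;
}

type TypeSpec = { [property: string]: PropertySpec };

// Check captured JSON against a descriptor and return a list of mismatches.
function validate(json: string, spec: TypeSpec): string[] {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["expected a JSON object"];
  }
  const record = value as { [key: string]: unknown };
  const errors: string[] = [];

  // Every property in the descriptor must be present (if required) and well-typed.
  for (const property of Object.keys(spec)) {
    const expected = spec[property];
    if (expected === undefined) continue;
    const actual = record[property];
    if (actual === undefined) {
      if (expected.required) errors.push("missing required property: " + property);
    } else if (typeof actual !== expected.kind) {
      errors.push("wrong type for " + property + ": " + typeof actual);
    }
  }

  // Properties the descriptor does not know about point at a gap in the spec.
  for (const property of Object.keys(record)) {
    if (!Object.prototype.hasOwnProperty.call(spec, property)) {
      errors.push("unknown property: " + property);
    }
  }
  return errors;
}

const searchResponseSpec: TypeSpec = {
  took: { kind: "number", required: true },
  timed_out: { kind: "boolean", required: true },
  terminated_early: { kind: "boolean", required: false },
};

const cleanResult = validate('{"took":3,"timed_out":false}', searchResponseSpec);
const brokenResult = validate('{"took":"3","extra":1}', searchResponseSpec);
```

Reporting unknown properties matters as much as reporting wrong types: a field the server returns but the specification omits would otherwise be silently missing from every generated client.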
  23. 34 TypeScript to Code

    Generating code from the TypeScript AST
    • Too low level
    • Not constrained enough

    Transform TypeScript to a simpler schema
    • Tailor-made for Elastic’s specific needs
    • Simple unambiguous meta-model
  24. 35 Code Generation Pipeline (diagram)

    specification.ts (TypeScript API request & response bodies) → Spec compiler → schema.json (endpoints, request & response bodies + rich annotations)
    schema.json → .NET Code Generator → .NET Client
    schema.json → More Code Generators → Java, Go, JS, Python, Rust, Ruby, PHP clients
    schema.json → OpenAPI → OpenAPI API Docs
    schema.json → Even more generators
  25. 36 .NET Code Generator Process

    Deserialise JSON → Build contexts → Mark and enrich contexts → Build Roslyn AST → Write .cs files
  26. 37 .NET Code Generator: Marking and Enrichment

    • Establish naming and namespaces for generated types
    • Walk type hierarchy
    • Identify relationships
    • Mark request types
    • Mark containers and variants
    • Simplify type aliases to built-in types
    • Determine which types require which descriptors
    • Mark specialised serialisation needs (Bulk etc.)
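
Two of the marking steps above, "mark request types" and "simplify type aliases to built-in types", can be sketched in a few lines. This is an illustration only: the real generator is written in C#, and the `TypeDefinition`, `TypeContext`, `resolveAlias`, and `enrich` names, along with the shape of the input table, are hypothetical.

```typescript
// Hypothetical, tiny stand-in for the generator's deserialised schema.
interface TypeDefinition {
  kind: "alias" | "interface" | "request";
  aliasOf?: string; // only meaningful when kind === "alias"
}

type TypeTable = { [typeName: string]: TypeDefinition };

// Hypothetical enriched context produced by the marking pass.
interface TypeContext {
  typeName: string;
  isRequest: boolean;
  resolvedTo?: string;
}

const BUILT_INS = ["string", "integer", "long", "boolean", "double"];

// Follow an alias chain until a built-in or a non-alias type is reached.
function resolveAlias(typeName: string, types: TypeTable): string {
  const seen: string[] = [];
  let current = typeName;
  while (BUILT_INS.indexOf(current) === -1) {
    const definition = types[current];
    if (definition === undefined || definition.kind !== "alias" || definition.aliasOf === undefined) {
      return current; // not an alias, so nothing left to simplify
    }
    if (seen.indexOf(current) !== -1) throw new Error("alias cycle at " + current);
    seen.push(current);
    current = definition.aliasOf;
  }
  return current;
}

// Build one context per type, marking requests and recording resolved aliases.
function enrich(types: TypeTable): TypeContext[] {
  const contexts: TypeContext[] = [];
  for (const typeName of Object.keys(types)) {
    const definition = types[typeName];
    if (definition === undefined) continue;
    const context: TypeContext = { typeName: typeName, isRequest: definition.kind === "request" };
    if (definition.kind === "alias") context.resolvedTo = resolveAlias(typeName, types);
    contexts.push(context);
  }
  return contexts;
}

const sampleTypes: TypeTable = {
  IndexName: { kind: "alias", aliasOf: "string" },
  IndexAlias: { kind: "alias", aliasOf: "IndexName" },
  SearchRequest: { kind: "request" },
};
const sampleContexts = enrich(sampleTypes);
```

Doing this as a separate pass means the later code-emitting stage can ask simple questions of each context instead of re-walking the schema.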
  27. 38 .NET Code Generator: Build Roslyn AST

    • Roslyn includes a very rich API
    • Not a great deal of documentation
    • roslynquoter.azurewebsites.net
    • https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model
  28. 40

    // ** REQUEST PARAMETERS
    var requestParametersClass = ClassDeclaration(request.RequestParametersName)
        .AddModifiers(Token(SyntaxKind.PublicKeyword), Token(SyntaxKind.SealedKeyword))
        .AddBaseListTypes(SimpleBaseType(ParseName($"RequestParameters<{request.RequestParametersName}>")))
        .AddMembers(request.QueryStringParameters.Select(a => a.QueryParameterProperty())
            .Where(p => p is not null).ToArray());

    AddClass(requestParametersClass);

    var (constructors, descriptorConstructors) = CreateConstructors();

    // ** REQUEST CLASS
    var requestClass = ClassDeclaration(request.TypeInfo.Name)
        .AddModifiers(Token(SyntaxKind.PublicKeyword), Token(SyntaxKind.PartialKeyword))
        .AddBaseListTypes(SimpleBaseType(ParseName($"PlainRequestBase<{request.RequestParametersName}>")))
        .AddMembers(constructors.ToArray())
        .AddMembers(request.GetCommonRequestProperties())
        .AddMembers(request.GenericArguments.Where(x => x.Name == "TDocument")
            .Select(p => p.GenericPropertySyntax()).Where(p => p is not null).ToArray())
        .AddMembers(request.QueryStringParameters.Select(p => p.QueryStringProperty())
            .Where(p => p is not null).ToArray())
        .AddMembers(request.BodyProperties.Select(p => p.SerializablePropertySyntax())
            .Where(p => p is not null).ToArray());
    ...
  34. 46

    public static PropertyDeclarationSyntax CreateSerializableProperty(PropertyV2 property, bool selfDeserialisable = false)
    {
        if (TryResolveMemberToPropertyType(property.Type, property, out var typeSyntax))
        {
            var propertyDeclaration = PropertyDeclaration(typeSyntax.TypeSyntax, Identifier(property.CodegenName))
                .AddModifiers(Token(SyntaxKind.PublicKeyword));

            propertyDeclaration = propertyDeclaration.AddAttributeLists(
                AttributeList(SingletonSeparatedList(Attribute(IdentifierName("JsonInclude")))),
                AttributeList(SingletonSeparatedList(Attribute(IdentifierName("JsonPropertyName"))
                    .AddArgumentListArguments(AttributeArgument(
                        LiteralExpression(SyntaxKind.StringLiteralExpression, Literal(property.JsonName)))))));

            if (property.IsSourceProperty)
                propertyDeclaration = propertyDeclaration.AddAttributeLists(
                    AttributeList(SingletonSeparatedList(Attribute(IdentifierName("SourceConverter")))));
            ...
  36. 48

            ...
            if (property.OwningType.UsedInRequest)
            {
                propertyDeclaration = propertyDeclaration.AddAccessorListAccessors(
                    AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                        .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                    AccessorDeclaration(SyntaxKind.SetAccessorDeclaration)
                        .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
            }
            else
            {
                propertyDeclaration = propertyDeclaration.AddAccessorListAccessors(
                    AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                        .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                    AccessorDeclaration(SyntaxKind.InitAccessorDeclaration)
                        .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
            }

            return propertyDeclaration;
        }

        return null;
    }
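
The decisions made by `CreateSerializableProperty` are easier to see when the output is written as text. The sketch below mirrors them in TypeScript: always emit `JsonInclude` and `JsonPropertyName`, add `SourceConverter` for source properties, and choose `set` or `init` depending on whether the owning type is used in a request. It builds strings rather than a Roslyn syntax tree, and `PropertyModel` and `renderProperty` are hypothetical names, so treat it as an approximation of the generated shape rather than the generator itself.

```typescript
// Hypothetical property model; field names loosely mirror the slide's C# code.
interface PropertyModel {
  codegenName: string; // C# property name
  jsonName: string; // name used in the JSON payload
  typeName: string; // already-resolved C# type name
  isSourceProperty: boolean;
  usedInRequest: boolean;
}

// Render a C# auto-property as text, following the branches shown on the slides.
function renderProperty(property: PropertyModel): string {
  const lines: string[] = [
    "[JsonInclude]",
    '[JsonPropertyName("' + property.jsonName + '")]',
  ];
  if (property.isSourceProperty) {
    lines.push("[SourceConverter]");
  }
  // Request types get a mutable setter; all other types get an init-only setter.
  const setter = property.usedInRequest ? "set" : "init";
  lines.push(
    "public " + property.typeName + " " + property.codegenName + " { get; " + setter + "; }"
  );
  return lines.join("\n");
}

const tookProperty = renderProperty({
  codegenName: "Took",
  jsonName: "took",
  typeName: "long",
  isSourceProperty: false,
  usedInRequest: false,
});
```

Response types end up with init-only properties, which suits objects that are populated once during deserialisation, while request types stay mutable so callers can build them up.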
  37. 49 Resources

    • github.com/elastic/elasticsearch-net
    • elastic.co/guide/en/elasticsearch/client/net-api
    • nuget.org/packages/NEST
    • nuget.org/packages/Elastic.Clients.Elasticsearch
    • github.com/stevejgordon/elasticsearch-examples
    • github.com/elastic/elasticsearch-specification
    • discuss.elastic.co