Writing Code with Code: Getting Started with the Roslyn APIs

As developers, we spend our days writing code. What if we could get the computer to write it for us? Using the .NET Roslyn APIs, we can do precisely that!

In this session, Steve will share his latest work at Elastic, generating the majority of the .NET client for Elasticsearch from a simple specification. Steve will introduce and demonstrate how to get started with leveraging the Roslyn APIs. He will show you how the Elastic .NET code generator transforms a JSON spec into a C# syntax tree and outputs thousands of classes in only a few seconds.

You'll leave this session with an understanding of the Roslyn APIs and how you can leverage them in your work. You'll understand core concepts such as syntax trees, the SyntaxFactory and tools to help you write code with code.

Steve Gordon

January 24, 2023

Transcript

  1. 1 Writing Code with Code: Getting Started with the Roslyn APIs

    Steve Gordon (Engineer @ Elastic)
    @stevejgordon | fosstodon.org/@stevejgordon | stevejgordon.co.uk
    bit.ly/writing-code-with-code
  2. 2 Agenda

    • Introduction to Roslyn
    • Roslyn API demos
      ‒ Visualising syntax trees
      ‒ Generating C# code from syntax trees
    • Code-generating the Elasticsearch .NET client
      ‒ Creating a specification/schema
      ‒ Transforming a spec to a strongly-typed language
      ‒ Building syntax trees
      ‒ Emitting C# files
      ‒ Future enhancements (lessons learned)
  3. 4 What is Roslyn?

    • Open source, open box, compilers for C# and VB.NET
    • Compiler platform
    • Used heavily to provide Visual Studio IDE capabilities
      ‒ Maker of squiggles!!
      ‒ Finder of things!!
  4. 5 Analyzers and Code Fixes

    • An analyzer contains code that recognizes violations of a rule
    • Rules can relate to code structure, coding style, naming conventions etc.
    • A code fix contains the code that fixes the violation
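
The analyzer/code-fix split can be illustrated with a toy model. This is a minimal sketch, not the Roslyn analyzer API: the `FieldInfo` and `Diagnostic` shapes, the rule id `DEMO001`, and the underscore naming rule are all assumptions made up for the example.

```typescript
// Toy model only. Real Roslyn analyzers inspect syntax trees and symbols,
// not this simplified (hypothetical) FieldInfo shape.
interface FieldInfo {
  readonly fieldName: string;
  readonly isPrivate: boolean;
}

interface Diagnostic {
  readonly ruleId: string;
  readonly message: string;
  readonly field: FieldInfo;
}

// Analyzer: recognises violations of a hypothetical naming rule
// ("private fields start with an underscore").
function analyzeFields(fields: ReadonlyArray<FieldInfo>): Diagnostic[] {
  return fields
    .filter((f) => f.isPrivate && !f.fieldName.startsWith("_"))
    .map((f) => ({
      ruleId: "DEMO001",
      message: `Private field '${f.fieldName}' should start with '_'`,
      field: f,
    }));
}

// Code fix: produces the corrected declaration for a single diagnostic.
function fixField(diagnostic: Diagnostic): FieldInfo {
  return { ...diagnostic.field, fieldName: `_${diagnostic.field.fieldName}` };
}
```

Keeping detection and repair in separate functions mirrors the slide: the analyzer only reports, and the fix is applied per reported violation.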
  5. 7 Source Generators

    • C# compiler feature that lets developers inspect user code as it is being compiled
    • Develop components which run during compilation with access to rich metadata
    • Can create new C# source files on the fly that are added to a compilation
  6. 8 Compiler Flow

    • C#, VB.NET and F# compile to IL
    • At runtime, IL code is Just-In-Time (JIT) compiled to machine code
    • AoT / Native compilation (No JIT required)
    https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model
  7. 9 Roslyn APIs

    • Roslyn includes a very rich compiler API
    https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model
  9. 11 Syntax Trees

    • The result of the syntax analysis phase of a compiler
    • Tree representation of syntactic structure of source code
      ‒ Nodes, Tokens and Trivia
    • Interact with source code on a deeply meaningful level. It's no longer text strings, but data that represents the structure of a program
    • Immutable and round-trippable data structure exposed by the compiler APIs
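
To make "nodes, tokens and trivia" and "immutable and round-trippable" concrete, here is a toy tree. It is a sketch under stated assumptions, not the Roslyn API: `SyntaxNode`, `SyntaxToken`, `toFullString` and `replaceToken` are names invented for this example.

```typescript
// Toy model only: these types are illustrative and are not the Roslyn API.
interface SyntaxToken {
  readonly kind: "token";
  readonly leadingTrivia: string;  // whitespace/comments before the text
  readonly text: string;
  readonly trailingTrivia: string; // whitespace/comments after the text
}

interface SyntaxNode {
  readonly kind: "node";
  readonly nodeName: string;       // e.g. "ClassDeclaration"
  readonly children: ReadonlyArray<SyntaxNode | SyntaxToken>;
}

const token = (text: string, leadingTrivia = "", trailingTrivia = ""): SyntaxToken =>
  ({ kind: "token", leadingTrivia, text, trailingTrivia });

const node = (nodeName: string, ...children: Array<SyntaxNode | SyntaxToken>): SyntaxNode =>
  ({ kind: "node", nodeName, children });

// Round trip: concatenating every token with its trivia reproduces the source text.
function toFullString(item: SyntaxNode | SyntaxToken): string {
  return item.kind === "token"
    ? item.leadingTrivia + item.text + item.trailingTrivia
    : item.children.map(toFullString).join("");
}

// "Editing" builds a new tree; the original tree is left untouched.
function replaceToken(root: SyntaxNode, oldText: string, newText: string): SyntaxNode {
  const children = root.children.map((child) =>
    child.kind === "node"
      ? replaceToken(child, oldText, newText)
      : child.text === oldText ? { ...child, text: newText } : child);
  return { ...root, children };
}

const sourceText = "public class Person { } // a comment";
const personTree = node("ClassDeclaration",
  token("public", "", " "),
  token("class", "", " "),
  token("Person", "", " "),
  token("{", "", " "),
  token("}", "", " // a comment"));

const renamedTree = replaceToken(personTree, "Person", "Customer");
```

Because whitespace and comments live on the tokens as trivia, nothing is lost when the tree is turned back into text, and the rename leaves `personTree` unchanged.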
  11. 16 The Elasticsearch API in Numbers

    • > 400 API endpoints
    • > 2000 data structures
      ‒ 50 query types
      ‒ 70 aggregation types
      ‒ 30 field types
  12. 17 Problems with the v7 Client

    • Hand written
      ‒ API is not always consistent
      ‒ A lot of maintenance work (400 endpoints and thousands of types!)
  13. 18 Elastic.Clients.Elasticsearch: The new .NET client for v8.0

    • A new generation of the Elasticsearch client
    • Code Generated
      ‒ Based on a formal specification of the Elasticsearch API
  14. 19 Options for Code Generation

    • Strings/StringBuilder
    • Templates (T4, Razor etc.)
    • Roslyn APIs
      ‒ Construct syntax tree and produce C#
  15. 24 REST API specification → OpenAPI?

    OpenAPI is too limited
    • Elasticsearch API is complex and not “canonical”
    • Would require custom extensions
    • Our problem is mostly about data structures, not so much URLs

    OpenAPI is complex
    • “The Schema Object is a superset of the JSON Schema Specification Draft 2020-12” 😱
    • 400 endpoints, 2000 structures… in YAML/JSON 😓
  16. 25 JSON API Specification → TypeScript!

    TypeScript’s type system is built to represent JSON/JS
    • Static type checking of the API
    • Strong IDE support
    • ts-morph: a library to build TS code processors
      ‒ “Setup, navigation, and manipulation of the TypeScript AST can be a challenge. This library wraps the TypeScript compiler API so it's simple.”
  17. 27 Example: Search Request

    /**
     * @rest_spec_name search
     * @since 0.0.0
     * @stability stable
     */
    export interface Request extends RequestBase {
      path_parts: {
        index?: Indices
      }
      query_parameters: {
        allow_no_indices?: Boolean
        ...
        size?: integer
        from?: integer
        sort?: string | string[]
      }
      ...

    export type IndexName = string
    export type Indices = IndexName | IndexName[]

    Callout: Meta information
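
The `Indices` alias is a union of a single name and a list of names, so any consumer has to handle both shapes. A minimal sketch of one way to do that follows; `toIndexList` and `searchPath` are hypothetical helpers written for this example and are not part of the specification or the client.

```typescript
// The two aliases are copied from the slide; everything else is illustrative.
type IndexName = string;
type Indices = IndexName | IndexName[];

// Normalise the union to an array so later code only handles one shape.
function toIndexList(indices?: Indices): IndexName[] {
  if (indices === undefined) return [];
  return Array.isArray(indices) ? indices : [indices];
}

// Hypothetical helper: builds a search path from the optional `index` path part.
// No URL encoding is applied in this sketch.
function searchPath(indices?: Indices): string {
  const list = toIndexList(indices);
  return list.length === 0 ? "/_search" : `/${list.join(",")}/_search`;
}
```

A generator targeting a language without union types has to make a similar decision once, in generated code, rather than leaving it to every caller.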
  18. 28 Example: Search Request

    ...
    body: {
      /** @aliases aggs */
      aggregations?: Dictionary<string, AggregationContainer>
      collapse?: FieldCollapse
      /**
       * If true, returns detailed information about score computation as part of a hit.
       * @server_default false
       */
      explain?: boolean
      from?: integer
      ...
    }

    Callouts: Alias tag; Documentation comment
  19. 30 Example: Search Response

    export class Response<TDocument> {
      body: ResponseBody<TDocument>
    }

    export class ResponseBody<TDocument> {
      took: long
      timed_out: boolean
      _shards: ShardStatistics
      hits: HitsMetadata<TDocument>
      aggregations?: Dictionary<AggregateName, Aggregate>
      _clusters?: ClusterStatistics
      fields?: Dictionary<string, UserDefinedValue>
      max_score?: double
      num_reduce_phases?: long
      profile?: Profile
      ...
    }

    Callout: User-provided type
  20. 31 Example: Search Response

    export class Response<TDocument> {
      body: ResponseBody<TDocument>
    }

    export class ResponseBody<TDocument> {
      took: long
      timed_out: boolean
      _shards: ShardStatistics
      hits: HitsMetadata<TDocument>
      aggregations?: Dictionary<AggregateName, Aggregate>
      _clusters?: ClusterStatistics
      fields?: Dictionary<string, UserDefinedValue>
      max_score?: double
      num_reduce_phases?: long
      profile?: Profile
      ...
    }

    export class HitsMetadata<T> {
      total?: TotalHits | long
      hits: Hit<T>[]
      max_score?: double | null
    }

    export class Hit<TDocument> {
      _index: IndexName
      _id: Id
      _score?: double | null
      _explanation?: Explanation
      ...
      _source: TDocument
      _seq_no?: SequenceNumber
      _version?: VersionNumber
    }
  23. 37 Example: Query

    /**
     * @variants container
     * @non_exhaustive
     * @doc_id query-dsl
     */
    export class QueryContainer {
      bool?: BoolQuery
      boosting?: BoostingQuery
      /** @deprecated 7.3.0 */
      common?: SingleKeyDictionary<Field, CommonTermsQuery>
      /** @since 7.13.0 */
      combined_fields?: CombinedFieldsQuery
      constant_score?: ConstantScoreQuery
      dis_max?: DisMaxQuery
      distance_feature?: DistanceFeatureQuery
      exists?: ExistsQuery
      function_score?: FunctionScoreQuery
      ...

    Callout: Container variant is used for types that contain all the variants inside the definition
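
A container type lists every variant as an optional property. The sketch below assumes that a well-formed container carries exactly one variant at a time; that rule, the trimmed-down `BoolQuery` and `ExistsQuery` shapes, and both helper functions are assumptions for illustration rather than definitions taken from the specification.

```typescript
// Simplified stand-ins for the spec types; the real definitions are much larger.
interface ExistsQuery { field: string }
interface BoolQuery { must?: QueryContainer[] }

interface QueryContainer {
  bool?: BoolQuery;
  exists?: ExistsQuery;
}

// Returns the names of the variants that are actually set on a container.
function setVariants(container: QueryContainer): string[] {
  return Object.entries(container)
    .filter(([, value]) => value !== undefined)
    .map(([key]) => key);
}

// Assumed rule for this sketch: exactly one variant may be present.
function isValidContainer(container: QueryContainer): boolean {
  return setVariants(container).length === 1;
}
```

For example, `{ exists: { field: "user" } }` passes the check, while an empty object or one with both `bool` and `exists` does not.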
  24. 38 Validating the Specification

    • Piggy-back on Elasticsearch integration tests
      ‒ Capture request and response JSON
      ‒ Does it fit in the corresponding TS type?
      ‒ > 5400 validation tests!
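
The "does the captured JSON fit the type?" check can be sketched as a small runtime validator. This is not the validation tooling used for the real specification: the `Shape` description, `findMismatches`, and the three-property `responseShape` are hypothetical and cover only flat objects with primitive properties.

```typescript
// Hypothetical, heavily simplified description of an expected JSON object.
type FieldKind = "string" | "number" | "boolean";
interface FieldRule { readonly kind: FieldKind; readonly required: boolean }
type Shape = Readonly<Record<string, FieldRule>>;

// Compares captured JSON with the expected shape and lists every mismatch.
function findMismatches(json: string, shape: Shape): string[] {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["body is not a JSON object"];
  }
  const body = value as Record<string, unknown>;
  const problems: string[] = [];
  for (const [key, rule] of Object.entries(shape)) {
    const actual = body[key];
    if (actual === undefined) {
      if (rule.required) problems.push(`missing required property '${key}'`);
    } else if (typeof actual !== rule.kind) {
      problems.push(`property '${key}' should be ${rule.kind}`);
    }
  }
  // Properties the shape does not know about point at a gap in the spec.
  for (const key of Object.keys(body)) {
    if (!(key in shape)) problems.push(`unexpected property '${key}'`);
  }
  return problems;
}

// A few properties of the search response body from the earlier slide.
const responseShape: Shape = {
  took: { kind: "number", required: true },
  timed_out: { kind: "boolean", required: true },
  max_score: { kind: "number", required: false },
};
```

Reporting unexpected properties as well as missing ones matters here, since the goal is to find places where the specification disagrees with what the server really sends.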
  25. 41 TypeScript to Code

    Generating code from the TypeScript AST
    • Too low level
    • Not constrained enough

    Transform TypeScript to a simpler schema
    • Tailor-made for Elastic’s specific needs
    • Simple unambiguous meta-model
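
A "simple unambiguous meta-model" can be pictured as a small closed set of type shapes that every generator switches over. The `TypeRef` union and `renderTypeRef` below are a hypothetical sketch of that idea; the names and the three kinds are assumptions and do not reproduce the actual schema.json format.

```typescript
// Hypothetical meta-model: a small closed set of type shapes.
type TypeRef =
  | { readonly kind: "named"; readonly typeName: string }
  | { readonly kind: "array"; readonly item: TypeRef }
  | { readonly kind: "union"; readonly options: ReadonlyArray<TypeRef> };

// Every consumer handles exactly these cases; the compiler checks exhaustiveness.
function renderTypeRef(ref: TypeRef): string {
  switch (ref.kind) {
    case "named":
      return ref.typeName;
    case "array": {
      const inner = renderTypeRef(ref.item);
      // Parenthesise unions so "(A | B)[]" is not confused with "A | B[]".
      return ref.item.kind === "union" ? `(${inner})[]` : `${inner}[]`;
    }
    case "union":
      return ref.options.map(renderTypeRef).join(" | ");
  }
}

// Model of: export type Indices = IndexName | IndexName[]
const indicesAlias: TypeRef = {
  kind: "union",
  options: [
    { kind: "named", typeName: "IndexName" },
    { kind: "array", item: { kind: "named", typeName: "IndexName" } },
  ],
};
```

Compared with walking a full TypeScript AST, a generator written against a model like this only ever sees the handful of cases it is meant to support.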
  26. 42 Code Generation Pipeline

    (Diagram; labels recovered from the slide)
    • specification.ts: TypeScript API request & response bodies
    • Spec compiler
    • schema.json: Endpoints, Request & response bodies + Rich annotations
    • .NET Code Generator → .NET Client
    • More Code Generators → Java, Go, JS, Python, Rust, Ruby, PHP clients
    • OpenAPI Generator → OpenAPI Spec, API Docs
    • Even more generators
  27. 44 .NET Code Generator Process

    Deserialise JSON → Build contexts → Mark and enrich contexts → Build Roslyn Syntax Trees → Write .cs files
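
The five stages above can be sketched as a chain of small functions. This is an illustration of the staging only, under several assumptions: the `SchemaType`, `TypeContext` and `GeneratedFile` shapes are invented, plain strings stand in for Roslyn syntax trees, the "names ending in Request" rule is made up, and files are returned in memory instead of being written to disk.

```typescript
// Hypothetical shapes for this sketch; the real schema and contexts are far richer.
interface SchemaType { readonly typeName: string }
interface TypeContext {
  readonly schema: SchemaType;
  readonly className: string;
  readonly isRequest: boolean;
}
interface GeneratedFile { readonly fileName: string; readonly contents: string }

// 1. Deserialise JSON (no shape validation in this sketch).
function deserialise(json: string): SchemaType[] {
  return JSON.parse(json) as SchemaType[];
}

// 2. Build contexts: one working object per schema type.
function buildContexts(types: ReadonlyArray<SchemaType>): TypeContext[] {
  return types.map((schema) => ({ schema, className: schema.typeName, isRequest: false }));
}

// 3. Mark and enrich: assumed rule, names ending in "Request" are request types.
function markAndEnrich(contexts: ReadonlyArray<TypeContext>): TypeContext[] {
  return contexts.map((c) => ({ ...c, isRequest: c.className.endsWith("Request") }));
}

// 4 + 5. Build the output and "write" it: strings stand in for syntax trees.
function emitFiles(contexts: ReadonlyArray<TypeContext>): GeneratedFile[] {
  return contexts.map((c) => ({
    fileName: `${c.className}.g.cs`,
    contents: `// request type: ${c.isRequest}\npublic partial class ${c.className} { }\n`,
  }));
}

function generate(json: string): GeneratedFile[] {
  return emitFiles(markAndEnrich(buildContexts(deserialise(json))));
}
```

Each stage takes the previous stage's output and returns new objects, so a stage can be tested or replaced without touching the others.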
  28. 45 .NET Code Generator: Marking and Enrichment

    • Establish naming and namespaces for generated types
    • Walk type hierarchy
    • Identify relationships
    • Mark request types
    • Mark containers and variants
    • Simplify type aliases to built-in types
    • Mark specialised serialisation needs (Bulk etc.)
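
Two of the enrichment steps, naming and alias simplification, are easy to sketch in isolation. The helpers below are hypothetical: `toPascalCase` assumes JSON names are snake_case, and the `aliasTable` entries other than `IndexName` (which appears on an earlier slide) are assumed for the example.

```typescript
// Naming: derive a C#-style member name from a JSON property name.
// Assumes snake_case input; a leading underscore (as in "_shards") is dropped.
function toPascalCase(jsonName: string): string {
  return jsonName
    .split("_")
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Hypothetical alias table: alias name -> the type it stands for.
const aliasTable: ReadonlyMap<string, string> = new Map([
  ["IndexName", "string"],
  ["SequenceNumber", "long"], // assumed for this sketch
  ["VersionNumber", "long"],  // assumed for this sketch
]);

// Follow alias chains until a non-alias type is reached; `seen` guards against cycles.
function simplifyAlias(typeName: string, table: ReadonlyMap<string, string>): string {
  const seen = new Set<string>();
  let current = typeName;
  for (;;) {
    const target = table.get(current);
    if (target === undefined || seen.has(current)) return current;
    seen.add(current);
    current = target;
  }
}
```

With these, `timed_out` becomes `TimedOut`, and a property declared as `IndexName` can be generated as a plain `string`.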
  29. 48

    // ** REQUEST PARAMETERS
    var requestParametersClass = ClassDeclaration(request.RequestParametersName)
        .AddModifiers(
            Token(SyntaxKind.PublicKeyword),
            Token(SyntaxKind.SealedKeyword))
        .AddBaseListTypes(
            SimpleBaseType(
                ParseName($"RequestParameters<{request.RequestParametersName}>")))
        .AddMembers(request.QueryStringParameters
            .Select(a => a.QueryParameterProperty())
            .Where(p => p is not null).ToArray());

    AddClass(requestParametersClass);
    ...
  30. 49

    var (constructors, descriptorConstructors) = CreateConstructors();

    // ** REQUEST CLASS
    var requestClass = ClassDeclaration(request.TypeInfo.Name)
        .AddModifiers(
            Token(SyntaxKind.PublicKeyword),
            Token(SyntaxKind.PartialKeyword))
        .AddBaseListTypes(
            SimpleBaseType(
                ParseName($"PlainRequestBase<{request.RequestParametersName}>")))
        .AddMembers(constructors.ToArray())
        .AddMembers(request.GetCommonRequestProperties())
        .AddMembers(request.GenericArguments.Where(x => x.Name == "TDocument")
            .Select(p => p.GenericPropertySyntax())
            .Where(p => p is not null).ToArray())
        .AddMembers(request.QueryStringParameters
            .Select(p => p.QueryStringProperty())
            .Where(p => p is not null).ToArray())
        .AddMembers(request.BodyProperties
            .Select(p => p.SerializablePropertySyntax())
            .Where(p => p is not null).ToArray());
    ...
  34. 58

    public static PropertyDeclarationSyntax CreateSerializableProperty(
        Property property, bool selfDeserialisable = false)
    {
        if (TryResolveMemberToPropertyType(property.Type, property, out var typeSyntax))
        {
            var propertyDeclaration = PropertyDeclaration(
                    typeSyntax.TypeSyntax, Identifier(property.CodegenName))
                .AddModifiers(Token(SyntaxKind.PublicKeyword));

            propertyDeclaration = propertyDeclaration.AddAttributeLists(
                AttributeList(SingletonSeparatedList(
                    Attribute(IdentifierName("JsonInclude")))),
                AttributeList(SingletonSeparatedList(
                    Attribute(IdentifierName("JsonPropertyName"))
                        .AddArgumentListArguments(AttributeArgument(
                            LiteralExpression(SyntaxKind.StringLiteralExpression,
                                Literal(property.JsonName)))))));
            ...
  37. 64

    if (property.OwningType.UsedInRequest)
    {
        propertyDeclaration = propertyDeclaration
            .AddAccessorListAccessors(
                AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                AccessorDeclaration(SyntaxKind.SetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
    }
    else
    {
        propertyDeclaration = propertyDeclaration
            .AddAccessorListAccessors(
                AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                AccessorDeclaration(SyntaxKind.InitAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
    }

    return propertyDeclaration;
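
The branch on this slide picks `get; set;` for types used in requests and `get; init;` otherwise. The sketch below restates that decision with a tiny data model and a renderer so the resulting C# text is visible. It is not the generator's code: `PropertyModel`, `createProperty` and `renderProperty` are hypothetical, and the rendered string only approximates what the Roslyn tree would print.

```typescript
// Hypothetical model of a generated auto-property.
type Accessor = "get" | "set" | "init";

interface PropertyModel {
  readonly typeName: string;
  readonly propName: string;
  readonly accessors: ReadonlyArray<Accessor>;
}

// Same decision as the slide: request types are settable, other types are init-only.
function createProperty(typeName: string, propName: string, usedInRequest: boolean): PropertyModel {
  return { typeName, propName, accessors: ["get", usedInRequest ? "set" : "init"] };
}

// Renders the model as C# source text.
function renderProperty(property: PropertyModel): string {
  const accessors = property.accessors.map((a) => `${a};`).join(" ");
  return `public ${property.typeName} ${property.propName} { ${accessors} }`;
}
```

For a response type, `renderProperty(createProperty("long", "Took", false))` gives `public long Took { get; init; }`; passing `true` yields a `set;` accessor instead.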
  39. 67 Future Plans

    • Refactoring the Code Generator
      ‒ Pluggable transform pipeline (JSON input)
      ‒ Pluggable filter pipeline for endpoints (JSON input)
      ‒ Easier to configure for non-developers
      ‒ Decouple specification from syntax building (intermediate models)
    • Analyse existing project via Workspace APIs
      ‒ Determine differences and breaking changes
      ‒ Check generated project compiles (in memory)
  40. 68 Resources

    • bit.ly/writing-code-with-code
    • github.com/stevejgordon/writing-code-with-code-demos
    • learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/
    • roslynquoter.azurewebsites.net/
    • github.com/elastic/elasticsearch-net
    • github.com/elastic/elasticsearch-specification