Ruleguard vs Semgrep vs CodeQL

```
| Topic | Ruleguard vs Semgrep vs CodeQL |
| Location | online |
| Date | October 17, 2020 |
```

Sub-topics:

- go/analysis example
- Ruleguard example
- Semgrep example
- CodeQL example
- Using ruleguard from golangci-lint
- Ruleguard guide
- How ast matching works
- Type matching examples
- Side-by-side comparison

Iskander (Alex) Sharipov

October 17, 2020

Transcript

  1. Our starting point

    We assume that:
    • You know that static analysis is cool
    • You’re using golangci-lint
  2. Our starting point

    We assume that:
    • You know that static analysis is cool
    • You’re using golangci-lint
    • You want to create custom code checkers
  3. !

  4. 6 hours later...

    Why?! WDYM AST type types are not “types”?! No results on stackoverflow?! How?!
  5. Analyzer definition

    var analyzer = &analysis.Analyzer{
        Name: "writestring",
        Doc:  "find sloppy io.WriteString() usages",
        Run:  run,
    }

    func run(pass *analysis.Pass) (interface{}, error) {
        // Analyzer implementation...
        return nil, nil
    }
  6. Analyzer implementation

    for _, f := range pass.Files {
        ast.Inspect(f, func(n ast.Node) bool {
            // Check n node...
        })
    }
  7. Check n node: part 1

    // 1. Is it a call expression?
    call, ok := n.(*ast.CallExpr)
    if !ok || len(call.Args) != 2 {
        return true
    }
  8. Check n node: part 2

    // 2. Is it io.WriteString() call?
    fn, ok := call.Fun.(*ast.SelectorExpr)
    if !ok || fn.Sel.Name != "WriteString" {
        return true
    }
    pkg, ok := fn.X.(*ast.Ident)
    if !ok || pkg.Name != "io" {
        return true
    }
  9. Check n node: part 3

    // 3. Is second arg a string(b) expr?
    stringCall, ok := call.Args[1].(*ast.CallExpr)
    if !ok || len(stringCall.Args) != 1 {
        return true
    }
    stringFn, ok := stringCall.Fun.(*ast.Ident)
    if !ok || stringFn.Name != "string" {
        return true
    }
  10. Check n node: part 4

    // 4. Does b have the type []byte?
    b := stringCall.Args[0]
    if pass.TypesInfo.TypeOf(b).String() != "[]byte" {
        return true
    }
  11. Check n node: part 5

    // 5. Report the issue
    msg := "io.WriteString(w, string(b)) -> w.Write(b)"
    pass.Reportf(call.Pos(), msg)
  12. io could be something else! Need to check that io is a package

    func f(io InputController, b []byte) {
        io.WriteString(w, string(b))
    }
  13. io could be something else! But even if it is a package we can get confused

    import "github.com/quasilyte/io" // not stdlib!

    func f(b []byte) {
        io.WriteString(w, string(b))
    }
  14. writestring.yml

    rules:
      - id: writestring
        patterns:
          - pattern: io.WriteString($W, string($B))
        message: "use $W.Write($B)"
        languages: [go]
        severity: ERROR
  15. writestring.yml (TODO: type filters)

    rules:
      - id: writestring
        patterns:
          - pattern: io.WriteString($W, string($B))
        message: "use $W.Write($B)"
        languages: [go]
        severity: ERROR
  16. Using YAML5 format for semgrep rules

    {
      rules: [
        {
          id: 'writestring',
          patterns: [
            {pattern: 'io.WriteString($W, string($B))'},
          ],
          message: 'use $W.Write($B)',
          languages: ['go'],
          severity: 'ERROR',
        },
      ],
    }
  17. CodeQL query

    import go

    from CallExpr c, Expr w, ConversionExpr conv, SelectorExpr fn
    where
      w = c.getArgument(0) and
      conv = c.getArgument(1) and
      fn = c.getCalleeExpr() and
      fn.getSelector().getName() = "WriteString" and
      fn.getBase().toString() = "io" and
      conv.getOperand().getType() instanceof ByteSliceType and
      conv.getType() instanceof StringType
    select c, "use " + w + ".Write(" + conv.getOperand() + ")"
  18. How to run?

    • Use the online query console
    • Select quasilyte/codeql-test project
    • Copy/paste query from the previous slide
  19. CodeQL pros

    • SSA support
    • Taint analysis (source-sink)
    • Not limited by (Go) syntax rules
    • Real declarative programming language
  20. CodeQL pros

    • SSA support
    • Taint analysis (source-sink)
    • Not limited by (Go) syntax rules
    • Real declarative programming language
    • Backed by GitHub
  21. CodeQL pros

    • SSA support
    • Taint analysis (source-sink)
    • Not limited by (Go) syntax rules
    • Real declarative programming language
    • Backed by GitHub Microsoft
  22. CodeQL pros

    • SSA support
    • Taint analysis (source-sink)
    • Not limited by (Go) syntax rules
    • Real declarative programming language
    • Backed by GitHub Microsoft
    • 1st class GitHub integration
  23. CodeQL cons

    The main points that I want to cover:
    1. Steep learning curve
    2. Simple things are not simple
    3. Non-trivial QL may look alien to many people
  24. Why Ruleguard then?

    • Very easy to get started (just “go get” it)
    • Rules are written in pure Go
  25. Why Ruleguard then?

    • Very easy to get started (just “go get” it)
    • Rules are written in pure Go
    • Integrated in golangci-lint and go-critic
  26. Why Ruleguard then?

    • Very easy to get started (just “go get” it)
    • Rules are written in pure Go
    • Integrated in golangci-lint and go-critic
    • Simple things are simple
  27. Why Ruleguard then?

    • Very easy to get started (just “go get” it)
    • Rules are written in pure Go
    • Integrated in golangci-lint and go-critic
    • Simple things are simple
    • Very Go-centric (both pro and con)
  28. Enabling Ruleguard

    1. Install golangci-lint in your pipeline (if you haven't yet)
    2. Prepare a rules file (a Go file with ruleguard rules)
    3. Enable ruleguard in golangci-lint config

    You can also use Ruleguard directly or via go-critic.
  29. .golangci.yml checklist: go-critic linter should be enabled

    linters:
      enable:
        - gocritic
    linters-settings:
      gocritic:
        enabled-checks:
          - ruleguard
        settings:
          ruleguard:
            rules: "rules.go"
  30. .golangci.yml checklist: ruleguard checker should be enabled

    linters:
      enable:
        - gocritic
    linters-settings:
      gocritic:
        enabled-checks:
          - ruleguard
        settings:
          ruleguard:
            rules: "rules.go"
  31. .golangci.yml checklist: rules param should be set

    linters:
      enable:
        - gocritic
    linters-settings:
      gocritic:
        enabled-checks:
          - ruleguard
        settings:
          ruleguard:
            rules: "rules.go"
  32. AST matching engine

    func match(pat, n ast.Node) bool

    pat is a compiled pattern
    n is a node being matched
  33. Algorithm

    • Both pat and n are traversed
    • Non-meta nodes are compared normally
    • pat meta nodes are separate cases
    • Named matches are collected (capture)
    • Some patterns may involve backtracking
  34. Meta node examples

    • $x is a simple “match any” named match
    • $_ is a “match any” unnamed match
    • $*_ matches zero or more nodes
  35. Pattern matching

    Pattern: $x=$x
    Target: a=10
    $x is bound to a
  36. Pattern matching

    Pattern: $x=$x
    Target: a=a
    $x is bound to a
  37. Pattern matching

    Pattern: $x=$x
    Target: a=a
    a = a, pattern matched
  38. Where() expression operands

    • Matched text predicates
    • Properties like AssignableTo/ConvertibleTo/Pure
    • Check whether a value implements interface
  39. Where() expression operands

    • Matched text predicates
    • Properties like AssignableTo/ConvertibleTo/Pure
    • Check whether a value implements interface
    • Type matching expressions
  40. Where() expression operands

    • Matched text predicates
    • Properties like AssignableTo/ConvertibleTo/Pure
    • Check whether a value implements interface
    • Type matching expressions
    • File-related filters (like “file imports X”)
  41. Type matching examples

    $t                    Arbitrary type
    []byte                Byte slice type
    []$t                  Arbitrary slice type
    map[$t]$t             Map with $t key and value types
    map[$t]struct{}       Any set-like map
    func($_) $_           Any T1->T2 function type
  42. Type matching examples (cont.)

    struct{$*_}           Arbitrary struct
    struct{$x; $x}        Struct of 2 $x-typed fields
    struct{$_; $_}        Struct with any 2 fields
    struct{$x; $*_}       Struct that starts with $x field
    struct{$*_; $x}       Struct that ends with $x field
    struct{$*_; $x; $*_}  Struct that contains $x field
  43. Report() and Suggest() handle a match

    // Just report a message
    m.Report("warning message")

    // Report + do an auto fix in -fix mode
    m.Suggest("autofix template")
  44. Find mutex usage issues (real-world example)

    func badLock(m fluent.Matcher) {
        m.Match(`$mu.Lock(); $mu.Unlock()`).
            Report(`$mu unlocked immediately`)

        m.Match(`$mu.Lock(); defer $mu.RUnlock()`).
            Report(`maybe $mu.RLock() is intended?`)
    }
  45. Running ruleguard with -e

    # -e runs a single inline rule
    ruleguard -e 'm.Match(`!($x != $y)`)' file.go
  46. Ruleguard vs Semgrep vs CodeQL: Written in

    go-ruleguard: Go
    Semgrep: Mostly OCaml
    CodeQL: ??? (Compiler+Runtime are closed source)
  47. Ruleguard vs Semgrep vs CodeQL: Type matching mechanism

    go-ruleguard: Typematch patterns + predicates
    Semgrep: N/A (planned, but not implemented yet)
    CodeQL: Type assertion-like API
  48. Ruleguard vs Semgrep vs CodeQL: Supported languages

    go-ruleguard: Go
    Semgrep: Go + other languages
    CodeQL: Go + other languages
  49. Ruleguard vs Semgrep vs CodeQL: How much you can do

    go-ruleguard: Simple-medium diagnostics
    Semgrep: Simple-medium diagnostics
    CodeQL: Almost whatever you want
  50. Links

    • Ruleguard quickstart: EN, RU
    • Ruleguard DSL documentation
    • Ruleguard examples: one, two
    • gogrep - AST patterns matching library for Go
    • A list of similar tools
    • .golangci.yml from go-critic (uses ruleguard)