Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Database Firewall from Scratch

Database Firewall from Scratch

Arseny Reutov

May 25, 2017
Tweet

More Decks by Arseny Reutov

Other Decks in Programming

Transcript

  1. About us • Arseny Reutov • Head of application security

    research at Positive Technologies • Member of Positive Hack Days (https://phdays.com) conference board • Occasional web security blogger (https://raz0r.name) • Denis Kolegov • Team lead of Application Firewall research at Positive Technologies • PhD, associate professor at Tomsk State University • Web security micro blogger (https://twitter.com/dnkolegov)
  2. Outline • The stuff we are going to talk about

    is joint work of PT Application Firewall Research Team developing a database firewall prototype as a part of our application firewall • Thanks to  Arseny Reutov  Denis Kolegov  Vladimir Kochetkov  Igor Kanygin  Nikolay Tkachenko  Ivan Hudyashov  Sergey Grechnev  Sergey Reshetnikov
  3. Agenda • Intro • WAF and DBFW • Related Work

    • Our Prototype • Parser • Protectors  Profiler  Dejector  SQLi  Access Control  IAM • Roadmap
  4. What is WAF for most people? WAF is a blackbox

    system that applies a set of rules to an HTTP conversation
  5. What is WAF for most people? For a WAF web

    application is just a series of HTTP transactions
  6. What is Database Firewall? Database firewalls are a type of

    application firewalls which • Monitor database activity • Detect database specific attacks • Protect sensitive information stored in the databases • Implement adequate access control models
  7. Database Firewall Deployment Like WAFs database firewalls can be deployed

    • in proxy mode • in sniffer mode via a SPAN port (mirrored traffic) • as a host-based agent
  8. What Database Firewall Can Do? Database firewalls can do several

    actions on each query: • Pass • Log for monitoring purposes • Alert • Rewrite query • Block (either by dropping connection or by generating a native error code)
  9. WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR

    CSRF Path Traversal Open Redirect Object Injection Session Fixation …
  10. WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR

    CSRF Path Traversal Open Redirect Object Injection Session Fixation … DBFW Segregation of Duties Audit & Monitoring Sensitive Data Discovery
  11. WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR

    CSRF Path Traversal Open Redirect Object Injection Session Fixation … DBFW Segregation of Duties Audit & Monitoring Sensitive Data Discovery Access Control SQL Injection Buffer Overflow Data Leakage Prevention Data Masking
  12. SQL Injection Detection: Green SQL • Green SQL have been

    mod_security of DBFWs for many years, but open source project is no longer maintained • SQL Injection detection is based on risk score using metrics:  SQL comments  Sensitive tables  OR token  UNION token  Variable comparison  Always true expressions  and more
  13. SQL Injection Detection: Machine Learning SOFIA: An Automated Security Oracle

    for Black-Box Testing of SQL-Injection Vulnerabilities
  14. SQL Injection Detection: Machine Learning • ”SOFIA is significantly more

    accurate than antiSQLi and GreenSQL and significantly faster than antiSQLi in classifying legitimate SQL statements and SQLi attacks.” • However, it takes lots of computing power to train the model since tree operations are time expensive • The algorithm is not tolerant to attacks during training
  15. Architecture • Core  Go  Python/Twisted • Parser 

    ANTLR 4  ANTLR Grammars-v4  Python/C++ • Storage  Mongo • UI  React  Redux  GraphQL
  16. DBFW Data Flow dbfwgo (1) xsql query (6) check result

    dbfwpython (2) request context Web server Parser DBMS (10) xsql result (7) response context (3) parser request (4) response (AST, tokens, etc.) (8) xsql query (9) xsql result Protector 1 … Protector N (5) query attributes
  17. Naïve Approach Syntax analysis Protectors Protector 1 … Protector N

    Concrete syntax tree Parser mysql Parser tsql Parser plsql query
  18. SQL Parsing Flow Mysql parser AST tsqlParser plsqlParser Syntax analysis

    Semantic analysis … Collector 1 … Collector N CST AST Abstract syntax analysis query
  19. SQL Parsing Implementation ANTLR4 xsql parser Code generator xsql grammar

    SQL AST specification Parser’s classes AST Node classes query xsql visitor CST AST
  20. MySQL Parsing Time Results Python 2.7 ~ 0.3 / 0.001

    sec ~ 0.4 / 0.001 sec ~ 1.5 / 0.003 sec ~ 1.6 / 0.004 sec C++ 11 ~ 0.01 / 0.00008 sec ~ 0.01 / 0.00029 sec ~ 0.06 / 0.0001 sec ~ 0.06 / 0.004 sec Test Minimal query* parsing Minimal query full parsing Heavy query** parsing Heavy query full parsing ~ 983 sec ~ 2 KB/sec ~ 48 sec ~ 40 KB/sec Bitrix queries file (2 MB) ~ 432 sec ~ 420 query/sec ~ 16 sec ~ 11000 query/sec Wordpress queries (181540) * Minimal query: select 1 ** Heavy query: select * from (((((((select col1 from t1) as ttt))))))
  21. Profiler • SQL profiler is a basic protection mechanism implemented

    in all database firewalls • It works like linting utilities or linters (e.g. eslint, pylint, cppint, etc.), but analyses SQL queries and check if they satisfy security policy (SQL profile) • The main goal is to prevent using of SQLi automatic tools and exploits • SQL profile can be  Static: created by manual configuration  Dynamic: created by source code analysis tools
  22. Example of SQL Profile Rules Options ["log", 20] Rule no-union

    ["block", 150] max-query-length ["block", 6] max-columns ["block", 0] max-union ["block", "benchmark", "md5", "if"] function-blacklist ["block"] entity-length "off" no-subqueries ["block"] no-hex ["block"] no-stacked-queries ["block"] no-comments ["block"] no-os-commands
  23. AST Example Dejector is a context-free parse tree validation approach

    to preventing SQL Injection, proposed by Hansen and Patterson in 2005 Given a set of known-good queries and the base formal grammar, Dejector builds a new subgrammar that contains only the rules required to produce exactly the queries in the known-good set Strings recognized by the subgrammar are guaranteed to be structurally identical to those in the known-good set The subgrammar is then used with a parser generator such as bison or ANTLR to produce a recognizer for the sublanguage
  24. ANTLR-based Dejector xsql Parser Compiler (3) Trusted xsql queries Aggregator

    (4) CST list (6) Union CST (7) Grammar subxsql.g4 ANTLR (8) subxsql parser (1) xsql.g4 (2) xsql Parser (5) Dejector mode Application Security language mechanism Developer
  25. Strict Mode a b c f h g i j

    l k m UCST ANTLR v4 grammar: a: b | c | pd ; c: f; f: g | h; g: i | j; i: l; … p d
  26. Strict Mode a b c f d h g i

    j l k m a b f d UCST CST
  27. Strict Mode a b c f h g i j

    l k m b1 f d1 new UCST ANTLR v4 grammar: a: b | c | pd | b1 | d1 ; b1: f; c: f; f: g | h; g: i | j; i: l; … p d
  28. Strict Dejector Parsing Time Results Python 2.7 MySQL ~ 0.643

    / 0.0019 sec ~ 0.67 / 0.002 sec ~ 0.33 / 0.003 sec ~ 0.32 / 0.009 sec Python 2.7 SubMySQL ~ 0.09 / 0.0011 sec ~ 0.102 / 0.0011 sec ~ 0.09 / 0.001 sec ~ 0.18 / 0.005 sec Test SELECT * FROM a WHERE b='c' SELECT * FROM a WHERE b BETWEEN 'c' AND 'd' INSERT INTO passbook VALUES('a','b','c','d','e','f','g','h') CREATE TABLE a (b int(5) AUTO_INCREMENT, c date, d VARCHAR(255), e VARCHAR(255), f VARCHAR(255), g int(10), h int(10), i float(10,2), j VARCHAR(255), PRIMARY KEY (b)) ~ 1.54 / 0.003 sec ~ 0.09 / 0.001 sec SELECT * FROM (((((((SELECT col1 FROM t1) AS ttt))))))* * Query can not be derived in SubMySQL grammar
  29. WAF + DBFW • Suppose that we have both WAF

    and DBFW deployed: Client WAF Web Server DBFW Database
  30. HTTP & SQL Correlation In order to correlate SQL queries

    with HTTP packets a host-based module can be deployed on the web server which will append session cookie into each SQL query in a comment section
  31. HTTP & SQL Correlation • When these modified queries reach

    DBFW it can look up those session identifiers in the database shared with WAF • WAF holding access control policy for web users acts as information point, i.e. it provides user information given a session cookie • DBFW serves as enforcement point, effectively blocking or allowing queries
  32. HTTP & SQL Correlation • What if we do not

    have a chance to deploy a host module (agent)? • We can still try to correlate HTTP and SQL using time-throttled request processing • Idea is that we process HTTP requests synchronously, observe emitted SQL queries, and associate them with HTTP requests
  33. SQL Injection Detection • Using host-based agent we can effectively

    detect SQL Injections • Agent injects into an SQL comment data about HTTP parameters that were observed when executing SQL query
  34. SQLi Detection Approach • DBFW replaces each occurrence of HTTP

    parameter value found in SQL query with a constant • Then it tries to parse and get tokens firstly for the original query and then for the second one with replaced constants • If a number of tokens is different, an SQL Injection is reported since constant replacement have caused changes in the query structure
  35. AST-based Detection • A better approach is to compare ASTs

    instead • After traversal of the ASTs, if differences are found, an SQL Injection is reported because constant replacement have caused changes in AST
  36. It decreases number of false positives. Does this mechanism decrease

    false negatives too? One of bypasses for owasp-modsecurity-crs found by Ivan Novikov It is not detected by libinjection too due to the context issue From Theory to Practice curl 'localhost/index.html?id=1%20or%20true' 1%20or%20true id=1.or-id id=.1or-UTC_DATE— )-sleep(9999 sleep(9999) */UNION SELECT password FROM users--
  37. SQLi Detection Example GET /app/?id=1%20or%20true HTTP/1.1 Host: example.com Connection: close

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
  38. SQLi Detection Example GET /app/?id=1%20or%20true HTTP/1.1 Host: example.com Connection: close

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4 /*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true
  39. SQLi Detection Example GET /app/?id=1%20or%20true HTTP/1.1 Host: example.com Connection: close

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4 /*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true select * from users where clientid = "" select * from users where clientid = 1 or true
  40. SQLi Detection Example select * from users where clientid =

    "" select * from users where clientid = 1 or true
  41. SQLi Detection Example select * from users where clientid =

    "" select * from users where clientid = 1 or true select * from users where id = "" Lexems select * from users where id = 1 or true Lexems
  42. SQLi Detection Example select * from users where clientid =

    "" select * from users where clientid = 1 or true select * from users where id = "" Lexems select * from users where id = 1 or true Lexems 8 ≠ 10
  43. SQLi Detection Example select * from users where id =

    "" select * from users where id = 1 or true select * from users where clientid = "" Lexems select * from users where clientid = 1 or true Lexems 8 ≠ 10
  44. Access Control • All types of application firewalls should have

    access control mechanisms • The main statement of any access policy: All entities must be identified • Entities identification in account-based systems: at least it is necessary to identify web application subjects (users) that initiate queries to DBMS • Approaches  Many-to-many applications  HTTP and SQL user tracking  RASP • Angine - ABAC eNgine
  45. Angine Data Flow Model Subjects, resources, environment PAP (6) attribute

    query (1) ALFAScript policy (5) request context (12) response (4) request Compiler Access requester PDP (2) Lua code Context handler PEP (3) access request PIP (9) attribute (7) attribute query (11) response context (10) attributes (8) attributes Generator Data structures Language ALFAScript IDL policy spec
  46. ALFAScript Policy Example interface Entity { abstract id: String; }

    interface SQLEntity <: Entity { database: String; table: String; column: String; level: Number; tags: [String]; } interface Subject <: Entity { name: String; level: Number; tags: [String]; }
  47. ALFAScript Policy Example namespace example { export policy Main {

    target clause action == "select“ apply denyUnlessPermit rule mls { permit target clause subject.level > entity.level } rule rbac { permit condition rbacCheckRoles() == true } } }
  48. Generated Classes class Entity(object): def __init__(self): super(Entity, self).__init__() @abstractproperty def

    id(self): pass class SQLEntity(Entity): def __init__(self, database=None, table=None, column=None, level=None, tags=None): super(SQLEntity, self).__init__() self.database = database self.table = table self.column = column self.level = level self.tags = tags class Subject(Entity): def __init__(self, name=None, level=None, tags=None): super(Subject, self).__init__() self.name = name self.level = level self.tags = tags
  49. Angine Example from Angine.policy import Policy from Angine.pip import PIP

    from Angine.pdp import PDP def pep(): ... request = get_request(network) policy = Policy(alfa_mysql_policy) pip = PIP.init_data(mongo_connection) pdp = PDP(policy.get_lua_policy()) ctx = pip.create_ctx(request) response = pdp.evaluate(ctx) if response["result"]["decision"] != "permit": return None else: return process(request)
  50. Angine Results • ALFAScript IDL to runtime language code generator

    • ALFAScript language • ALFAScript to Lua transcompiler • PDP and PIP implementations for runtime language • Common parsers (HTTP, mysql, tsql)
  51. Roadmap • Host agents for C#, Java • ANTLR-based C++

    parser • Release MySQL grammar for ANTLR4 • PT Application Firewall integration • SQL user tracking • Machine learning for sensitive data discovery • Inspected Application Module for DBFW
  52. Inspected Application Module Vladimir Kochetkov. Do WAFs dream of static

    analyzers? Peculiarities  Web-only IAM can not process non HTTP attack vectors  There are some cases when CompFG is not adequate to detect attacks • Loops, recursion • Internal and external dependencies The idea is to build SQL profile based on application code, compile it to binary module and run on the DBFW This approach can be used to detect second order SQL injection attacks
  53. Inspected Application Module Flow Application Inspector Deployed application Configuration WAF

    IAM WAF Front-end Web client (2) HTTP request context Application CompFG (5) HTTP response context (11) SQL response context Computation flow model (8) SQL request context DBMS (7) SQL request (6) HTTP request (12) SQL request (13) SQL response (14) SQL response (15) HTTP response (16) HTTP response (1) HTTP request Compiled binary module Source code Application Firewall
  54. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' "
  55. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " Untrusted data read from database. What if fname is ' or '1' = '1 ? Second order SQL injection
  56. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " The main SQL injection feature: a number of tokens more that one
  57. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " (concat "select * from data where fname=" ( concat (index-access row 2) "'"))
  58. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " (concat "select * from data where fname=" ( concat (index-access row 2) "'")) (call mysql_fetch_row (call mysql_query (concat "select * from data where id=intval(" (concat (index- access POST, "id") ")")) connection))
  59. SQL IAM Example … $sql = "select * from data

    where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " № Query hash Index Tokens 1 87248237482347 [(28,-1)] 1 2 13475837458758 [(32,-1)] 1
  60. SQLi IAM Example GET /app/?id=1000 HTTP/1.1 Host: example.com Connection: close

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4 select * from data where id=1000 select * from data where fname='john' or '1'='1' 1 ≠ 2