Curling with Julia (JuliaCon 2024)

We have a Julia application that communicates with other services over REST APIs. These services use mutual TLS for authentication, which in turn requires client-side TLS certificates. We found that LibCURL was the best way to do HTTPS with client certificates, but its very C-like interface made it hard for Julia developers to use.

This led us to build CurlHTTP, a very Julia-like interface for doing HTTP(S) with LibCURL.

LibCURL.jl and HTTP.jl are the two primary ways to write HTTP clients in Julia. They both have their pros and cons.

HTTP.jl
While HTTP.jl has a very easy-to-use interface to the majority of its features, including cookie handling and streaming, it becomes very complicated as soon as you try to use TLS client certificates. HTTP.jl also cannot do multiple downloads in parallel on a single execution thread.

LibCURL.jl
LibCURL makes it very easy to do mutual TLS, supporting client and server TLS certificates with well-documented APIs, and it has a well-documented interface for multiple downloads. However, it provides a fairly low-level API and is primarily suited to developers familiar with writing applications in C.

For our use case, we decided to build a higher-level Julia wrapper around LibCURL's HTTP functionality (similar to what SMTPClient.jl does for SMTP over LibCURL), while making some of curl's more complicated features easy to use in a Julian way.

CurlHTTP.jl
The result is CurlHTTP.jl, which provides a high-level HTTP interface to LibCURL, supporting single and multiple downloads, mutual TLS, and data streaming.

Talk Structure
This talk will first briefly cover the features of CurlHTTP and how to use it.
We will then spend a little more time going into some of the nuances and gotchas we faced while developing this library, particularly with passing Julian data types through the C interface and back to our callbacks, and with dealing with memory management in a clean way.
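
The callback and memory-management concern mentioned above can be illustrated without libcurl at all. The following is a minimal sketch, not CurlHTTP's actual implementation: `ResponseState`, `write_cb`, and `simulate_transfer` are hypothetical names, and the C side is simulated by invoking the function pointer through `ccall`. The callback signature mirrors the one libcurl documents for `CURLOPT_WRITEFUNCTION`.

```julia
# Hypothetical container for response data. It must be mutable so that
# pointer_from_objref can hand a stable address to C.
mutable struct ResponseState
    body::Vector{UInt8}
end

# Same shape as libcurl's write callback: (data, size, nmemb, userdata) -> bytes consumed.
function write_cb(data::Ptr{UInt8}, sz::Csize_t, nmemb::Csize_t, userdata::Ptr{Cvoid})::Csize_t
    # Recover the Julia object from the opaque pointer that C passed back.
    state = unsafe_pointer_to_objref(userdata)::ResponseState
    nbytes = Int(sz * nmemb)
    # Copy the bytes out: the C buffer is only valid for the duration of the call.
    append!(state.body, unsafe_wrap(Array, data, nbytes))
    return Csize_t(nbytes)
end

function simulate_transfer(chunks)
    state = ResponseState(UInt8[])
    # A plain C function pointer; this works because write_cb is not a closure.
    cb = @cfunction(write_cb, Csize_t, (Ptr{UInt8}, Csize_t, Csize_t, Ptr{Cvoid}))
    # Keep `state` rooted while C holds a raw pointer to it.
    GC.@preserve state begin
        userdata = pointer_from_objref(state)
        for chunk in chunks
            bytes = Vector{UInt8}(chunk)
            # Stand-in for libcurl invoking the callback with a block of data.
            ccall(cb, Csize_t, (Ptr{UInt8}, Csize_t, Csize_t, Ptr{Cvoid}),
                  bytes, 1, length(bytes), userdata)
        end
    end
    return String(state.body)
end
```

The two points the sketch tries to make are that the Julia object travels through C as an opaque `Ptr{Cvoid}`, and that `GC.@preserve` keeps it alive for as long as that raw pointer is in use.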

Philip Tellis

July 10, 2024

Transcript

  1. Quick start
     https://github.com/bluesmoon/CurlHTTP.jl
     • Available in the Package Registry
     • Pkg.add("CurlHTTP")
     • Current version v0.1.3
     • Docs: https://bluesmoon.github.io/CurlHTTP.jl/
     • Julia 1.6+
  2. Ways to HTTP in Julia
     • LibCURL.jl - low level wrapper around libcurl C library.
       ◦ Good for short recipes, but quickly gets cumbersome with complex use cases.
       ◦ Does more than just HTTP (eg: FTP, SMTP, SSH, Telnet)
     • HTTP.jl - higher level Julia API
       ◦ Supports client & server mode as well as websockets
       ◦ Complicated to do mutual TLS (client certificates)
       ◦ Doesn’t support single-threaded parallel requests
     • Downloads.jl
     • CurlHTTP.jl - higher level wrapper around libcurl C library
       ◦ This is what the rest of this talk is about!
  3. CurlHTTP at a glance
     ✔ Two interfaces - CurlEasy and CurlMulti for single and multi connection usage.
     ✔ Same API across both interfaces.
     ✔ Read/write data from/to HTTP servers in blocks or as streams.
     ✔ Support for TLS client certificates and CA certificates.
     ❌ No websocket support
     ❌ No cookie handling yet (you have to maintain your own cookie jar)
  4. Our use-case
     • Make GET/POST/PUT/DELETE requests to a server over HTTPS
     • Use mutual TLS for authentication
     • Use a custom CA (Certificate Authority) certificate to validate the server
     • Process response data (JSON stream data) as a stream
     • Occasionally make multiple parallel requests and combine the post-processed responses
  5. A quick note about TLS
     • Transport Layer Security. Used by HTTPS and a bunch of other protocols to encrypt traffic over the wire.
     • Uses asymmetric key cryptography. Typically the server has a certificate and a private key. The certificate is signed by a certificate authority that the client trusts. This allows the client to trust that the server is who they say they are.
     • mTLS is Mutual TLS. The client also has a certificate & key, but this is typically signed by the server. This is used for authentication.
     • Certificates contain a common name. This is used for authorization.
  6. Simple GET request using CurlHTTP

         curl = CurlEasy(
             url = "https://postman-echo.com/get?foo=bar",
             method = CurlHTTP.GET,
             verbose = true
         )

         res, http_status, errormessage = curl_execute(curl)

         # curl.userdata[:databuffer] is a Vector{UInt8} containing the bytes of the response
         responseBody = String(curl.userdata[:databuffer])

         # curl.userdata[:responseHeaders] is a Vector{String} containing the response headers
         responseHeaders = curl.userdata[:responseHeaders]
  7. POST with streaming response using CurlHTTP

         curl = CurlEasy(
             url = "https://postman-echo.com/post",
             method = CurlHTTP.POST
         )

         requestBody = """{"testName":"test_writeCB"}"""
         headers = ["Content-Type: application/json"]

         databuffer = UInt8[]

         res, http_status, errormessage = curl_execute(curl, requestBody, headers) do d
             if isa(d, Array{UInt8})
                 append!(databuffer, d)
             end
         end

         responseBody = String(databuffer)
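
The slide above appends every block to one buffer. For the "JSON stream data" use case, a handler would more likely want to act on each record as soon as it is complete. The sketch below assumes newline-delimited records, which the slides do not state; `LineSplitter` and `feed!` are hypothetical names and are not part of CurlHTTP.

```julia
# Hypothetical helper: accumulates bytes and emits complete newline-terminated records.
mutable struct LineSplitter
    pending::Vector{UInt8}   # bytes of a record that has not ended yet
    lines::Vector{String}    # complete records seen so far
end
LineSplitter() = LineSplitter(UInt8[], String[])

function feed!(s::LineSplitter, d)
    # Mirror the slide's type check: ignore anything that is not a block of bytes.
    d isa AbstractVector{UInt8} || return s
    append!(s.pending, d)
    while true
        nl = findfirst(==(UInt8('\n')), s.pending)
        nl === nothing && break
        line = String(s.pending[1:nl-1])   # slicing copies, so `pending` is untouched
        deleteat!(s.pending, 1:nl)         # drop the record and its newline
        isempty(line) || push!(s.lines, line)
    end
    return s
end

splitter = LineSplitter()
feed!(splitter, Vector{UInt8}("{\"a\":1}\n{\"b\""))  # second record is incomplete
feed!(splitter, Vector{UInt8}(":2}\n"))              # completes the second record
```

In the streaming form shown on the slide, a call such as `feed!(splitter, d)` would presumably take the place of `append!(databuffer, d)` inside the `do d ... end` block.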
  8. mTLS

         curl = CurlEasy(;
             url = "https://your-https-server/url/",
             method = CurlHTTP.OPTIONS,
             cacertpath = "/path/to/custom-ca-cert.crt", # optional
             certpath = "/path/to/client-tls-cert.crt",
             keypath = "/path/to/client-tls-key.key"
         )

     • The certificate may contain a complete certificate chain.
     • All three of these files may be combined into a single file:
       ◦ Certificate + certificate chain goes first
       ◦ Private key goes second
       ◦ CA Certificate goes last
     • By default CurlHTTP will validate certificates
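
The file ordering listed on the slide above can be sketched as a small helper that builds the combined file. `combine_pem` is a hypothetical name, not something CurlHTTP provides, and the sketch assumes all three inputs are PEM text files. How the combined path is then passed to `CurlEasy` is not shown on the slide, so it is left out here.

```julia
# Hypothetical helper: concatenates PEM files in the order listed on the slide.
function combine_pem(outpath, certpath, keypath, cacertpath)
    # Create the file and restrict permissions before the private key is written to it.
    touch(outpath)
    chmod(outpath, 0o600)
    open(outpath, "w") do out
        # Order: certificate (+ chain) first, private key second, CA certificate last.
        for path in (certpath, keypath, cacertpath)
            pem = read(path, String)
            write(out, pem)
            # Ensure each PEM block ends with a newline so that blocks do not run together.
            endswith(pem, '\n') || write(out, '\n')
        end
    end
    return outpath
end
```

Restricting the file mode first is a design choice of this sketch: the bundle contains the private key, so it should not be readable by other users even briefly.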
  9. Just TLS
     • You don’t need anything special to do TLS.
     • If your URL is over https, then CurlHTTP will use TLS and use the most secure options available [1].
     • The default CA certificate uses LibCURL.cacert (which comes from Mozilla)
     • Pass in cacertpath to override it

     [1] Security also depends on the version of libcurl compiled in. On Julia 1.6, there are known vulnerabilities: https://github.com/JuliaLang/julia/pull/46116
  10. Default Options
      • FOLLOWLOCATION: Follows 30x
      • SSL_VERIFYPEER: Verify the authenticity of the peer’s certificate
      • SSL_VERIFYHOST: Verify that the certificate matches the host
      • SSL_VERSION (highest possible up to TLS 1.3)
      • HTTP_VERSION (H2 over TLS or HTTP/1.1)
      • TCP_FASTOPEN disabled due to tracking vuln and to allow TLS session cache
      • TCP_KEEPALIVE: use TCP KeepAlive probes
      • ACCEPT_ENCODING best supported compression
      • TRANSFER_ENCODING
      • DNS_CACHE_TIMEOUT disabled because your DNS server should handle this
  11. Parallel requests

          pool = map(1:3) do i
              curl = CurlEasy(url="https://postman-echo.com/post?val=$i", method=CurlHTTP.POST)
              requestBody = """{"testName":"test_multiPOST","value":$i}"""
              curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, length(requestBody))
              curl_easy_setopt(curl, CURLOPT_COPYPOSTFIELDS, requestBody)
              curl_add_headers(curl, [
                  "Content-Type: application/json",
                  "Content-Length: $(length(requestBody))"
              ])
              return curl
          end

          multi = CurlMulti(pool)
          res = curl_execute(multi)
  12. Parallel requests
      • Requests are made when we call curl_execute
      • All requests run in parallel
      • Responses are streamed back in parallel to the response handler for each CurlEasy handle

          pool = map(1:3) do i
              curl = CurlEasy(url="https://postman-echo.com/post?val=$i", method=CurlHTTP.POST)
              requestBody = """{"testName":"test_multiPOST","value":$i}"""
              curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, length(requestBody))
              curl_easy_setopt(curl, CURLOPT_COPYPOSTFIELDS, requestBody)
              curl_add_headers(curl, [
                  "Content-Type: application/json",
                  "Content-Length: $(length(requestBody))"
              ])
              return curl
          end

          multi = CurlMulti(pool)
          res = curl_execute(multi)
  13. Streaming parallel responses

          pool = map(1:3) do i
              curl = CurlEasy(url="https://postman-echo.com/post?val=$i", method=CurlHTTP.POST)
              requestBody = """{"testName":"test_multiPOST","value":$i}"""
              curl.userdata[:index] = i # userdata is a Dict to store anything you want
              curl.userdata[:channel] = Channel()
              CurlHTTP.curl_setup_request(curl, requestBody, ["Content-Type: application/json"];
                  data_channel = curl.userdata[:channel]
              )
              return curl
          end

          # Set up tasks to handle the data channels before calling curl_execute

          multi = CurlMulti(pool)
          res = curl_execute(multi)
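
The slide above leaves the "set up tasks to handle the data channels" step as a comment. The sketch below shows one way such tasks could look, using only Base Julia. `start_consumer` and `simulate_responses` are hypothetical names, and `simulate_responses` stands in for `curl_execute(multi)`. Two assumptions are made here: that data arrives on each channel as blocks of `UInt8`, and that each channel is closed when its transfer finishes. The slides do not confirm either.

```julia
# Hypothetical consumer: drains one channel and stores the assembled body under `index`.
function start_consumer(index::Int, ch::Channel, results::Dict{Int,String})
    return @async begin
        buf = UInt8[]
        for chunk in ch   # blocks until data arrives; the loop ends when the channel is closed
            chunk isa AbstractVector{UInt8} && append!(buf, chunk)
        end
        results[index] = String(buf)
    end
end

# Stand-in for curl_execute(multi): interleaves chunks across channels, then closes them.
function simulate_responses(channels)
    for part in 1:2, (i, ch) in enumerate(channels)
        put!(ch, Vector{UInt8}("part$part-of-$i;"))
    end
    foreach(close, channels)
end

channels = [Channel() for _ in 1:3]   # unbuffered, as on the slide
results = Dict{Int,String}()
tasks = [start_consumer(i, ch, results) for (i, ch) in enumerate(channels)]
simulate_responses(channels)          # consumers already exist, so put! does not stall
foreach(wait, tasks)
```

Starting the consumers first matters because `Channel()` is unbuffered: a `put!` blocks until some task is ready to take the value, so a producer with no consumer would wait forever.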
  14. Summary
      • CurlHTTP offers some enhancements over existing ways of doing HTTP in Julia
      • mTLS and single task parallel requests are easy
      • https://github.com/bluesmoon/CurlHTTP.jl
      • https://github.com/bluesmoon/CurlHTTP.jl/issues