C/Elixir Interop

C/Elixir Interop: A libVIPS/Phoenix Case Study
Evadne Wu, Head of Exam Systems, Faria Education Group
@evadne
Last updated 26 October 2016

Takeaway
• Learn how you can embed your C program in Elixir
• Investigate various integration approaches
• Get a proper project that does what it says

Structure
1. Overview
2. Requirements
3. Solutions Assessment
4. Single-Solution Deep Dive
5. Demo
6. Observations / Q&A

Overview
• If you’ve written a web application, chances are that you’ll need to generate thumbnails.
• If you have a Phoenix application, you’d probably like to generate thumbnails within your application so everything is in the same place.
• You’d probably want a solution that just works but is also quite performant.

Requirements
• Generate thumbnails from Elixir/Erlang in an Erlang-like manner (i.e. it isolates faults, is fast enough, and does not do strange things)

Solutions Assessment
1. Fork: spawn a process with arguments, wait for completion.
2. Daemon: swap messages with a long-running daemon.
3. NIF: run C code in BEAM directly.
4. Pattern Match: implement scaling code in Erlang/Elixir directly.
5. Persistent C Server: swap messages with a supervised process.

Solution 1: Fork
• Fork an OS process to generate one thumbnail at a time.
• Start a child process with appropriate arguments.
• Wait for the child process to finish.
• Look at what the child process has sent.

Fork: Characteristics
• Assembly Required: No implicit flow control or resource cap.
• Safe: Crashes isolated to external OS processes; resource cleanup done by OS.
• Slow: Code/data needs to be reloaded on each run.
• Expensive: Bigger servers, smaller conference budget.

Fork: Good Bits
• A simple forked process which exits and returns results at the same time is very easy to reason about. This can be attractive when you do not have a concurrency requirement.
• Thorough cleanup is almost guaranteed upon process exit.

Forking: Bad Bits
• Forking is quite bad if your process needs to first load data into memory, or has a heavy initialisation process.
• You may create a fork bomb if multiple forks can happen concurrently and there is no safeguard.
• You will most likely need a timeout.

Forking: Example
  # arguments/2 is the author's own helper; it is not shown on the slide
  System.cmd "mogrify",
    arguments(image, output_path),
    stderr_to_stdout: true
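At the OS level, the steps listed under Solution 1 come down to fork, exec and wait. The following is a minimal POSIX sketch in C, not code from the talk: `run_and_wait` is a hypothetical helper name, it only reports the child's exit status rather than capturing its output, and it has none of the timeout or concurrency safeguards that the "Bad Bits" slide calls for.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the given NULL-terminated
 * argument vector and blocks until the child exits.
 * Returns the child's exit status, or -1 if the child could not be
 * forked, could not be waited for, or was killed by a signal. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                /* fork itself failed */

    if (pid == 0) {
        execvp(argv[0], argv);    /* replaces the child with the program */
        _exit(127);               /* reached only if exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)       /* retry only when interrupted by a signal */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every call pays for a fresh process and a fresh program load, which is the "Slow" characteristic above; a crash in the child, on the other hand, never reaches the caller's memory.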

Solution 2: Daemon
• Either implement a daemon for your code, or find a project that has one and use that.
• The daemon will field your requests either over a port directly or via forked child processes that pass messages.
• Some daemons may even have concurrency support.

Daemon: Characteristics
• Faster Per-File Processing Times: No need to reload data on each call.
• Less Memory Pressure: Possible to share some memory among all processes.
• Faults Isolated: Crashes isolated to an external OS process and its children. Possible to have an OS-level process manager restart the daemon(s).
• Multiple Failure Modes: Errors can propagate and cause grief because the daemon is probably not written in Erlang.

Daemon: Good Bits
• ClamAV, a popular open-source virus scanner project, can run in two ways: as a daemon which then accepts work, or standalone. In practice the daemon scans a file about 10 to 100 times faster because it does not have to repeatedly load virus definitions.
• This is an example of a proper daemon not written in Erlang (and you can still supervise a daemon using Erlang).

Daemon: Bad Bits
• It makes no sense to implement half of Erlang in another language. It takes longer to do that than to learn Erlang.
• If you do not have the daemon supervised by your application, you will not have a common root for all activities, and that leads to madness.
• You need to find a way to send a message to a daemon. You may need to make a binary/text interface, or you may need to take the hit of forking something that does that for you. Either way it is a lot of work.

Solution 3: NIF
• Write your code in C.
• Expose it as NIFs (Native Implemented Functions).
• Call them from BEAM, wait for the response (synchronously), then use that response.

NIF: Good Bits
• Concurrent: NIFs can be marked "dirty", and they will be run on a separate set of schedulers.
• Fast: No context-switching required, so calling NIFs can be quite fast.

NIF: Bad Bits
• No Fault Isolation: A crash in your NIF brings down the BEAM.
• Hard to De-Risk: Image formats can be complex; it would be difficult to proclaim any code manipulating them bug-free. Images can come from the Internet (i.e. user-provided input).
• Elbow Grease Required: Special care is required to mark a NIF dirty. Failure to cover all bases may cause issues.

Solution 4: Pattern-Match
• Write your conversion code in Erlang using pattern-matching.
• Requires intimate understanding of all image format specifications and of the BEAM, as you will be moving a lot of binary data around.
• A good weekend project for the tenacious.

Solution 5: Persistent C Server
• Write a synchronous, single-threaded C Server reading from STDIN and writing to STDOUT/STDERR.
• Supervise the C Server with appropriate Erlang code which restarts the process as needed. Let the C Server crash whenever it needs to.
• Put as many of these pairs in a connection pool as needed.
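The synchronous, single-threaded loop described above can be sketched in a few lines of C. This is an illustration under stated assumptions rather than the talk's actual server: `serve`, `handler_fn` and `echo_handler` are hypothetical names, the one-line-in, one-line-out framing is assumed, and the streams are passed in as parameters so that the loop can be exercised without a real pipe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical handler type: turns one request line into one reply line. */
typedef void (*handler_fn)(const char *request, char *reply, size_t reply_len);

/* Placeholder handler for illustration; a real server would do work here. */
void echo_handler(const char *request, char *reply, size_t reply_len)
{
    snprintf(reply, reply_len, "OK %s", request);
}

/* Reads one request per line from `in` and writes one reply per line to
 * `out`, strictly one at a time. Returns the number of requests served
 * once `in` reaches end of file. Lines longer than the buffer are treated
 * as several requests in this sketch. */
int serve(FILE *in, FILE *out, handler_fn handle)
{
    char line[4096];
    char reply[4096];
    int served = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the line terminator */
        handle(line, reply, sizeof reply);
        fprintf(out, "%s\n", reply);
        fflush(out);  /* stdout is fully buffered on a pipe; flush per reply */
        served++;
    }
    return served;
}
```

A `main` would simply call `serve(stdin, stdout, some_handler)`. The loop ends when STDIN is closed, so the process exits on its own when the supervising side goes away, and there is no shared state to clean up if it crashes midway.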

Single Solution
We can summarise a few more data points from all available information. The ideal solution should be…
• Not Forking;
• Isolated in Own Memory Space;
• Crash-Resistant.

Single Solution: Ingredients
• Image Manipulation: libVIPS and its high-level C binding. This is a proven solution and is faster than ImageMagick. Its functions can be picked and chosen as needed in our custom C Server.
• Protocol: Text-Based. This means the C Server can be tested in isolation without an elaborate test harness, will be able to work over STDIN/STDOUT, and will not require code to handle a binary protocol.

Single Solution: Layout
• Implement a worker pool using Poolboy.
• In each worker, pull in Erlexec and run/maintain a C server.
• Implement a façade that checks out a process from the pool and uses it.

Single Solution: C Server
  $ scaler
  20 20 foobar
  ERROR - Unable to open file
  288 288 /Users/evadne/Pictures/IMG_0245.PNG
  /tmp/converter-lDMsQF.png
  $ identify /tmp/converter-lDMsQF.png
  /tmp/converter-lDMsQF.png PNG 288x216 288x216+0+0 8-bit sRGB 4.74KB 0.000u 0:00.000
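The session above suggests a request line of the form `<width> <height> <path>`. Below is a hedged sketch of how such a line might be parsed in C; the `struct request` layout, the `parse_request` name and the 1023-character path limit are assumptions for illustration, and the actual libVIPS resizing call is deliberately left out.

```c
#include <stdio.h>

#define REQUEST_PATH_MAX 1024

/* Hypothetical in-memory form of one "<width> <height> <path>" request. */
struct request {
    int width;
    int height;
    char path[REQUEST_PATH_MAX];
};

/* Parses one request line into `req`.
 * Returns 0 on success, or -1 if a field is missing or a dimension is
 * not a positive integer. */
int parse_request(const char *line, struct request *req)
{
    /* %1023[^\n] takes the rest of the line, so paths may contain spaces;
     * the width limit keeps the read inside req->path. */
    int fields = sscanf(line, "%d %d %1023[^\n]",
                        &req->width, &req->height, req->path);

    if (fields != 3 || req->width <= 0 || req->height <= 0)
        return -1;
    return 0;
}
```

Because the protocol is plain text, a parser like this can be checked with ordinary strings or by typing at a terminal, which is the testing benefit the Ingredients slide points to.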

Single Solution: External Façade
  def preview(conn, params) do
    in_path = params["image"].path
    {:ok, out_path} = Resampler.request(in_path, 512, 512)
    {:ok, image} = File.read(out_path)
    base64 = Base.encode64(image)
    # `diff` and formatted_diff/1 are not defined on this slide
    render conn, "preview.html", base64: base64, diff: formatted_diff(diff)
  end

Demo
• Note the local dependency and how a Makefile can be written to build the C bits in the right way.
• Note how the performance gap seems to widen as the input gets larger.

Observations
• The solution is indeed a bit faster than the others; the performance gap widens as the images grow larger.
• Mixing Erlang/Elixir and C does not need to be hard.
• Use the best tool for the job.

Open Source
evadne/supervised-scaler
Elixir + Phoenix
MIT License
