Engineering at FIVE Inc. • Largest Mobile Video Advertising Platform in Japan • My engineering roles are: Front-end Servers, Back-end Servers, Dashboards, Log Analysis and Reporting Batches, Operation Tools, Android SDK, iOS SDK, etc…
how we use Finagle/Thrift • Storing serialized data in DB • Reloading in-process cache • Scoring • Communicating with external service • Sharing schema in dashboard’s JavaScript and server • Conclusion
Scala • Twitter is an early adopter of Scala • Future, Try, Duration, … Many Scala utilities (mostly related to concurrency) are inspired by Twitter’s common library.
• JavaScript’s Promise, Java 8’s CompletableFuture, C++11’s std::future • A value of Future[A] is a placeholder that holds the result of an asynchronous operation • Typically the operation issues some IO that may fail • What makes it different from the traditional callback style is that it is composable
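To make "composable" concrete, here is a minimal sketch. It uses the standard library's scala.concurrent.Future rather than Twitter's Future, and the names fetchUserId, fetchScore, scoreFor and scoreOrZero are hypothetical, chosen only for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureComposition {
  // Hypothetical asynchronous lookups; a real server would issue IO here.
  def fetchUserId(name: String): Future[Int] =
    Future {
      if (name.nonEmpty) name.length
      else throw new IllegalArgumentException("empty name")
    }

  def fetchScore(userId: Int): Future[Double] =
    Future(userId * 1.5)

  // Sequential composition: fetchScore runs only after fetchUserId succeeds.
  // A failure in either step short-circuits the whole chain.
  def scoreFor(name: String): Future[Double] =
    for {
      id    <- fetchUserId(name)
      score <- fetchScore(id)
    } yield score

  // Error handling composes as well: a failed Future becomes a default value.
  def scoreOrZero(name: String): Future[Double] =
    scoreFor(name).recover { case _: IllegalArgumentException => 0.0 }

  def main(args: Array[String]): Unit = {
    println(Await.result(scoreFor("alice"), 5.seconds)) // 7.5
    println(Await.result(scoreOrZero(""), 5.seconds))   // 0.0
  }
}
```

With callbacks, each of these steps would be nested inside the previous one; here the two lookups and the error handler are ordinary values that can be combined and passed around.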
Service represents both server and client • A server is a function that implements the Service; Finagle dispatches incoming requests to it • A client is a function that uses the Service; Finagle dispatches its requests to the service and handles the responses
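The idea that one type serves both sides can be sketched without Finagle itself. The Service alias below is a simplified stand-in (a plain function returning a standard-library Future), not Finagle's actual class, and echoServer, greet and upperCased are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ServiceSketch {
  // Simplified stand-in for the Service abstraction:
  // a function from a request to a Future of a response.
  type Service[Req, Rep] = Req => Future[Rep]

  // "Server" side: we implement the function.
  val echoServer: Service[String, String] =
    request => Future.successful(s"echo: $request")

  // "Client" side: we are handed a Service and simply call it.
  def greet(client: Service[String, String], name: String): Future[String] =
    client(s"hello $name")

  // Because both sides share one type, a wrapper can decorate either of them.
  def upperCased(underlying: Service[String, String]): Service[String, String] =
    request => underlying(request).map(_.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(Await.result(greet(echoServer, "five"), 5.seconds))
    println(Await.result(greet(upperCased(echoServer), "five"), 5.seconds))
  }
}
```

In this sketch the client calls the server in the same process; with Finagle the network sits between the two, but the calling code keeps the same shape.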
Users write data definitions and RPC interfaces in Thrift IDL • The Thrift compiler generates code for serializing/deserializing the data and for the RPC client and server • Originally developed by Facebook; now an Apache project
double, binary, string • User defined types: enum, struct, union • Container types: list<T>, set<T>, map<K,V> (fields can also be marked optional) • Unlike protobuf, a map key can be any type, including user defined structs • But I don’t recommend it • The `service` keyword defines an RPC interface • Each struct field and RPC parameter has a unique id: 1,2,3,…
that can be used from Scala • “But the generated code uses Java collections and mutable “bean” classes, causing some annoying boilerplate conversions to be hand-written.” • Twitter developed their own Thrift parser/generator, called Scrooge • https://twitter.github.io/scrooge/
Scala-friendly API • list, set, map are mapped to scala.collection.{Seq, Set, Map} • struct is mapped to an immutable case class • enum and union are mapped to sealed traits • Easy interface to send/receive RPC • sbt support • Code examples come later!
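The shape of that mapping can be approximated by hand. The following is a hand-written sketch of the style described above, not actual Scrooge output; the types AdFormat, Campaign and Target and their fields are hypothetical.

```scala
object ScroogeStyleSketch {
  // enum -> sealed trait with one case object per value
  sealed trait AdFormat
  object AdFormat {
    case object Video extends AdFormat
    case object Banner extends AdFormat
  }

  // struct -> immutable case class; list -> Seq, optional field -> Option
  final case class Campaign(
    id: Long,
    name: String,
    format: AdFormat,
    tags: Seq[String] = Seq.empty,
    dailyBudget: Option[Long] = None
  )

  // union -> sealed trait with one case class per alternative
  sealed trait Target
  object Target {
    final case class ByCountry(code: String) extends Target
    final case class ByAppId(appId: Long) extends Target
  }

  // Sealed traits let the compiler check that every alternative is handled.
  def describe(target: Target): String = target match {
    case Target.ByCountry(code) => s"country=$code"
    case Target.ByAppId(appId)  => s"app=$appId"
  }

  def main(args: Array[String]): Unit = {
    val campaign = Campaign(1L, "spring", AdFormat.Video)
    // "Updating" an immutable struct creates a modified copy.
    println(campaign.copy(name = "summer"))
    println(describe(Target.ByAppId(42L)))
  }
}
```

Compared with mutable Java beans and Java collections, nothing here needs conversion code before it can be used with pattern matching or the Scala collection API.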
an indexed KVS • All serialized data is stored in a `bytes` column; only the index keys are defined as separate columns • Joins and aggregate functions are computed in the application layer
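A minimal in-memory sketch of this layout follows. The Row, Creative and Campaign types are hypothetical, the "table" is just a Seq, and a tab-separated string codec stands in for Thrift serialization so that the example stays self-contained.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object IndexedKvsSketch {
  // One table row: index columns plus an opaque serialized payload.
  final case class Row(id: Long, campaignId: Long, bytes: Array[Byte])

  final case class Creative(id: Long, campaignId: Long, title: String)
  final case class Campaign(id: Long, name: String)

  // Stand-in codec (assumption): a real system would use Thrift here.
  def serialize(c: Creative): Array[Byte] =
    s"${c.id}\t${c.campaignId}\t${c.title}".getBytes(UTF_8)

  def deserialize(bytes: Array[Byte]): Creative =
    new String(bytes, UTF_8).split("\t", 3) match {
      case Array(id, campaignId, title) =>
        Creative(id.toLong, campaignId.toLong, title)
      case _ =>
        throw new IllegalArgumentException("malformed payload")
    }

  // Index columns are copied out of the value when the row is written.
  def toRow(c: Creative): Row = Row(c.id, c.campaignId, serialize(c))

  // The "database" can filter on index columns only, never inside `bytes`.
  def selectByCampaign(table: Seq[Row], campaignId: Long): Seq[Row] =
    table.filter(_.campaignId == campaignId)

  // Join and aggregate in the application: creatives per campaign name.
  def creativeCountByCampaign(
      campaigns: Seq[Campaign],
      table: Seq[Row]
  ): Map[String, Int] =
    campaigns.map { campaign =>
      val creatives = selectByCampaign(table, campaign.id).map(r => deserialize(r.bytes))
      campaign.name -> creatives.size
    }.toMap
}
```

The trade-off is visible in the last function: what would be a JOIN with GROUP BY in SQL becomes an explicit loop over rows fetched by index.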
• But the Thrift schema often changes drastically, which requires data migration • There is no DB migration tool or “UPDATE” statement applicable to serialized Thrift data • If you want to update data, you need to create a new sbt project, write a program that accesses the DB, changes the data and saves it, build a jar file, deploy it, … • Or you can do it on the sbt console in the production environment… but that is painful
scripts in Scala, called the “ScalaScript” project • Originally implemented with Twitter’s util-eval, which is no longer needed • 1. Prepare a fat jar that includes all common libraries. 2. Run the jar with our script as the argument. 3. The script is dynamically loaded, compiled and executed with the full classpath available!
library, such as DB access, Thrift serialization, Thrift RPC, running BigQuery, uploading to SpreadSheet, posting to Slack, … • What’s more, it’s type safe! • Write a script and run it
dashboard for daily operation • But experimental features are often added to the schema, and the dashboard can’t keep up with them • We need a simple data viewer/editor • There is no “phpMyAdmin”
are cached in each server’s process • This reduces Redis access • But if you update the master copy of such cold data, the servers need to reload it • The dashboard copies the MySQL data to Redis, then sends an RPC to each server to reload its cache.
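The reload flow can be sketched in a single process. FakeRedis and InProcessCache are hypothetical stand-ins: the first plays the role of Redis, and the reload method represents the body of the RPC handler that the dashboard would call on each server.

```scala
import java.util.concurrent.atomic.AtomicReference

object CacheReloadSketch {
  // Stand-in for Redis: the store the dashboard writes master data to.
  final class FakeRedis {
    private val data = new AtomicReference(Map.empty[String, String])
    def putAll(entries: Map[String, String]): Unit = data.set(entries)
    def snapshot(): Map[String, String] = data.get()
  }

  // Per-server cache: reads never touch the store, only the local snapshot.
  final class InProcessCache(redis: FakeRedis) {
    private val cached = new AtomicReference(redis.snapshot())

    def get(key: String): Option[String] = cached.get().get(key)

    // Body of the hypothetical "reload" RPC. The whole immutable map is
    // swapped atomically, so concurrent readers see either the old or the
    // new snapshot, never a partially updated one.
    def reload(): Unit = cached.set(redis.snapshot())
  }

  def main(args: Array[String]): Unit = {
    val redis = new FakeRedis
    redis.putAll(Map("campaign:1" -> "v1"))
    val cache = new InProcessCache(redis)

    redis.putAll(Map("campaign:1" -> "v2")) // dashboard updates the store
    println(cache.get("campaign:1"))         // still the old value
    cache.reload()                           // dashboard sends the reload RPC
    println(cache.get("campaign:1"))         // now the new value
  }
}
```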
• If it takes more time than expected, we will partition the campaigns and calculate their scores on different nodes. • The paper “Your Server as a Function” shows an example of a search query spread across different instances
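Partitioned scoring of this kind is a scatter/gather over Futures. The sketch below uses the standard library's Future; Campaign, Scored and the scoring formula are hypothetical, and scoreOnNode stands in for an RPC to a scoring node.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PartitionedScoring {
  final case class Campaign(id: Long, bid: Double)
  final case class Scored(campaignId: Long, score: Double)

  // Hypothetical scoring; in a real deployment this would be a remote call.
  def scoreOnNode(partition: Seq[Campaign]): Future[Seq[Scored]] =
    Future(partition.map(c => Scored(c.id, c.bid * 2)))

  // Scatter the campaigns over `nodes` partitions, score them concurrently,
  // then gather the results and keep the highest score.
  def best(campaigns: Seq[Campaign], nodes: Int): Future[Option[Scored]] = {
    require(nodes > 0, "nodes must be positive")
    val size = math.max(1, math.ceil(campaigns.size.toDouble / nodes).toInt)
    val partitions = campaigns.grouped(size).toSeq
    Future.sequence(partitions.map(scoreOnNode)).map { results =>
      results.flatten.sortBy(-_.score).headOption
    }
  }

  def main(args: Array[String]): Unit = {
    val campaigns = Seq(Campaign(1L, 1.0), Campaign(2L, 3.0), Campaign(3L, 2.0))
    println(Await.result(best(campaigns, nodes = 2), 5.seconds))
  }
}
```

Future.sequence fails as a whole if any partition fails; whether to tolerate a missing partition instead is a design decision this sketch leaves open.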
time communication with external servers. • Here “external” means another company whose servers are across the internet • If we don’t have any ad to show but a partner company does, we want to deliver theirs • We want to separate the servers that receive high-traffic requests from the ones that send outgoing requests.
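The fallback to a partner's ad can be sketched as a composition of two asynchronous lookups. Ad and pickAd are hypothetical names, the lookups are passed in as plain functions, and treating a failed partner call as "no ad" is an assumption of this sketch rather than something stated above.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

object PartnerFallback {
  final case class Ad(source: String, id: String)

  // Ask our own inventory first; call the partner only when we have nothing.
  def pickAd(
      own: () => Future[Option[Ad]],
      partner: () => Future[Option[Ad]]
  ): Future[Option[Ad]] =
    own().flatMap {
      case None =>
        // Assumption: a partner failure should not fail our response.
        partner().recover { case NonFatal(_) => None }
      case found =>
        Future.successful(found)
    }

  def main(args: Array[String]): Unit = {
    val noOwnAd  = () => Future.successful(Option.empty[Ad])
    val partner  = () => Future.successful(Option(Ad("partner", "p1")))
    println(Await.result(pickAd(noOwnAd, partner), 5.seconds))
  }
}
```

If the partner lookup runs on a separate tier of servers, `partner` here would be an RPC client for that tier, which keeps outgoing internet calls off the high-traffic front-end servers.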