expensive… even with commodity hardware…
• Hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs)
• Hard to deal with failures…
• What else could we do…?
of managing warehouse-scale computing: http://research.google.com/pubs/pub35290.html
• Google hit a lot of these problems before many other companies and came up with interesting solutions: http://youtube.com/watch?v=0ZFMlO98Jkc
doing research in this area, we decided to partner and hire researchers: https://amplab.cs.berkeley.edu/tag/mesos/
• “Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon”: http://www.wired.com/2013/03/google-borg-twitter-mesos
spun into an open source project at the Apache Foundation: https://blog.twitter.com/2012/incubating-apache-mesos
• https://twitter.com/ApacheMesos/statuses/360039441500340224
source project with a healthy independent community: http://mesos.apache.org
• Mesos is a distributed system to build and run distributed systems
• Mesos provides fine-grained resource sharing and isolation
• Mesos enables high-availability and fault-tolerance for your cluster
Mesos consists of master/slave nodes
[Diagram: a ZooKeeper quorum and three Mesos masters; Hadoop and Marathon schedulers; a Mesos slave running Docker executors and the tasks ./ruby XYZ, java -jar XYZ.jar and ./xyz]
*Thank you to Niklas Nielsen and Adam Borlen for the following diagrams explaining Mesos: https://www.youtube.com/watch?v=EI0ROkf0vks
Applications are known as frameworks in Mesos; they interact with the master
Master schedules tasks to run on slaves’ available resources; slaves use executors to coordinate execution of tasks
Tasks are the unit of execution
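The idea of running tasks on slaves' available resources can be sketched in a few lines of Python. This is only a toy first-fit model, not the Mesos API or its real allocation algorithm; the names `Slave`, `Task` and `place_tasks` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    """A worker node advertising its free resources (hypothetical model)."""
    name: str
    cpus: float
    mem_mb: int
    tasks: list = field(default_factory=list)


@dataclass
class Task:
    """The unit of execution, with the resources it needs."""
    name: str
    cpus: float
    mem_mb: int


def place_tasks(slaves, tasks):
    """First-fit placement: put each task on the first slave with room.

    Returns a dict of task name -> slave name. Tasks that fit nowhere
    are left out of the result.
    """
    placement = {}
    for task in tasks:
        for slave in slaves:
            if slave.cpus >= task.cpus and slave.mem_mb >= task.mem_mb:
                # Deduct the task's share from the slave's free resources.
                slave.cpus -= task.cpus
                slave.mem_mb -= task.mem_mb
                slave.tasks.append(task.name)
                placement[task.name] = slave.name
                break
    return placement
```

Calling `place_tasks` with a small and a large slave shows the effect: small tasks pack onto the first slave until it runs out of CPUs or memory, and larger tasks spill over to the next slave that still has room.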
[Diagram: Containerizer API with a Launcher, a CGroups CPU isolator and a CGroups Memory isolator; Container foo holding Executor bar and Task baz]
When a slave starts, you can specify a “containerizer” to launch the container and a set of isolators to enforce resource constraints (CPU/memory)
Mesos can track and allocate more resource types, allowing you to manage resources like IP addresses, ports, disk space and even GPUs!
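The split between a containerizer and a pluggable set of isolators can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the classes `Isolator` and `Containerizer` are hypothetical, and real cgroups isolators enforce limits in the kernel rather than checking usage samples after the fact.

```python
class Isolator:
    """Checks one resource type against a container's limit (toy model)."""

    def __init__(self, resource):
        self.resource = resource

    def violations(self, limits, usage):
        limit = limits.get(self.resource)
        used = usage.get(self.resource, 0)
        if limit is not None and used > limit:
            return [f"{self.resource}: used {used} > limit {limit}"]
        return []


class Containerizer:
    """Launches containers and applies a configurable set of isolators."""

    def __init__(self, isolators):
        self.isolators = isolators
        self.containers = {}  # container id -> resource limits

    def launch(self, container_id, limits):
        self.containers[container_id] = dict(limits)

    def check(self, container_id, usage):
        """Return every limit violation reported by the enabled isolators."""
        limits = self.containers[container_id]
        problems = []
        for isolator in self.isolators:
            problems.extend(isolator.violations(limits, usage))
        return problems
```

Only resource types with an enabled isolator are checked, so a container launched with a disk limit but no disk isolator can exceed that limit unnoticed. That mirrors the point of choosing the isolator set when the slave starts.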
[Diagram labels: Containerizer API, Mesos Slave Process, External Containerizer Program, Container foo (MySQL, Ubuntu 13.10), Container bar (Ruby, Centos 6.4)]
github.com/mesosphere/deimos
tasks and waits for a node to reconnect; the master will update the framework with any tasks that were completed while it was gone)
Tasks keep running!
[Diagram: Framework, Masters]
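One way to picture the master updating a framework about tasks that finished while it was gone is a buffer of undelivered status updates. The sketch below is a hypothetical illustration of that idea only; the `Master` class and its methods are invented here and do not reflect how Mesos implements failover internally.

```python
class Master:
    """Toy master that buffers task updates for disconnected frameworks."""

    def __init__(self):
        self.connected = {}  # framework id -> callback receiving updates
        self.pending = {}    # framework id -> updates not yet delivered

    def register(self, framework_id, callback):
        """(Re-)register a framework and replay anything it missed."""
        self.connected[framework_id] = callback
        for update in self.pending.pop(framework_id, []):
            callback(update)

    def disconnect(self, framework_id):
        self.connected.pop(framework_id, None)

    def status_update(self, framework_id, task_id, state):
        """Deliver an update now, or hold it until the framework returns."""
        update = (task_id, state)
        callback = self.connected.get(framework_id)
        if callback is not None:
            callback(update)
        else:
            self.pending.setdefault(framework_id, []).append(update)
```

The tasks themselves are never touched by `disconnect`, which is the point of the slide: losing the framework's connection affects update delivery, not task execution.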
learn what pods to reconnect for each task and re-registers with the master)
Tasks keep running!
[Diagram: Compute Node with a Mesos Slave Process and two Mesos Executors]
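A restarted slave can only reconnect to its running executors if it wrote down where they were before it went away. The following is a minimal sketch of that checkpoint-and-recover pattern, assuming a JSON file on local disk; the file layout and the function names `checkpoint` and `recover` are hypothetical and are not the Mesos on-disk format.

```python
import json
import os
import tempfile


def checkpoint(path, executors):
    """Atomically write the task -> executor endpoint map to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(executors, f)
    # Rename over the old file so a crash never leaves a half-written state.
    os.replace(tmp, path)


def recover(path):
    """Read the last checkpoint; a fresh slave starts with no executors."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)
```

On restart the slave process would call `recover`, reconnect to each executor it finds there, and then re-register with the master, while the tasks owned by those executors keep running throughout.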
[Diagram labels: Logic; Tweet Service, User Service, Timeline Service, SocialGraph Service, DM Service; Presentation; API, Web, Search, Feature X, Feature Y; TFE (netty) Reverse Proxy; HTTP, Thrift; Aurora, Mesos, Monorail]
run distributed systems (think datacenter OS)
• Mesos enables resource sharing, high-availability and fault-tolerance for your data centers
• Mesos is an open source project with a healthy independent community: http://mesos.apache.org
• So please check it out, use it, or contribute back if you can to make it better!
http://mesos.apache.org
email: {user,dev}@mesos.apache.org
Also thanks to Niklas Nielsen and Adam Borlen for their slides explaining Mesos from ApacheCon 2014: https://www.youtube.com/watch?v=EI0ROkf0vks