- Quickly set up a test cluster
▸ Pig - High-level programming language for MapReduce jobs
▸ Sqoop - Imports data from MySQL and other RDBMSs
▸ Spark - Alternative to MapReduce designed for fast analytics
▸ Flume - Streaming data collection / aggregation manager
▸ Oozie - MapReduce workflow manager & scheduler
▸ Whirr - Deployment of clusters to AWS
▸ HBase - Low-latency, distributed, non-relational database
▸ ZooKeeper - Distributed application HA management
▸ HCatalog - Interop between Pig and Hive
distributed under the hood
▸ Hive: Using Hadoop as an RDBMS; writing
▸ Presto: A Facebook library we can use to query the cluster; reading
▸ Phresto: A library to make Presto accessible from PHP userland
▸ Phresto & Doctrine
CONTENTS