[Diagram labels: Data Warehouse (TBs); Cloud Data Lake (TBs -> PBs); SQL Query Processing; Reporting & Dashboarding; Data Science; In-Data Lake Transformation; Open Data Lake Security, Reliability, and Governance]
engine
• Originally developed at Meta/Facebook as a replacement for Hive
• Query in place -- no need to move data (ETL) from the source
• Federated querying -- join data from different source formats
• ANSI SQL compliant
• Designed from the ground up for fast analytic queries against data of any size
• Proven on petabytes of data
• SQL-on-Anything
• Federated querying and pluggable architecture to support many connectors
• Open source, hosted on GitHub
• https://github.com/prestodb
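To make the federated querying bullet concrete, here is a minimal sketch that builds a single SQL statement joining a data lake table with an operational database table. Presto addresses tables as catalog.schema.table, which is what lets one query span connectors. The catalog, schema, table, and column names used here (hive.sales.orders, mysql.crm.customers, customer_id, region) are hypothetical and would need to match catalogs configured on an actual cluster.

```python
# Sketch only: composes a federated Presto query as a string.
# All catalog, schema, table, and column names are hypothetical examples.


def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name (catalog.schema.table)."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    orders = qualified("hive", "sales", "orders")        # assumed: files in a data lake
    customers = qualified("mysql", "crm", "customers")   # assumed: a MySQL database table
    return (
        "SELECT c.region, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


if __name__ == "__main__":
    print(build_federated_query())
```

The resulting statement could be submitted through any Presto client or BI tool; the data stays in its source systems and is joined at query time rather than copied by an ETL job first.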
Cloud is SaaS to Query Data Lakes
• Simplifies SQL analytics on cloud data lakes like S3

Team Ahana Cloud, Database & Presto Experts
• Steven Mih, Cofounder, CEO
• Dipti Borkar, Cofounder, Chief Products Officer
• Dave Simmen, Cofounder, Chief Technical Officer

• 2021 DBTA Best Data 100
• 2021 Stevie Best Startup
• 2021 Coolest Analytics
• 2021 Top 10 Hot Big Data
• 2020 Datanami Best Big Data Startup Awards
• 2020 Open Source 100
Minutes.
1. Managed cloud service: no installation or configuration.
2. Built for data teams of all experience levels.
3. Moderate level of control over deployment, without complexity.
4. Dedicated support from Presto experts.
AWS Serverless options get very expensive for growing data volumes
▪ Cloud data warehouse costs grow much faster than compute engine costs
▪ Serverless options like AWS Athena charge per query and get expensive

“Do it yourself” approach is complicated
▪ Big data skills in platform teams are limited
▪ Presto is complicated and operationally very time consuming

Presto on AWS like AWS Athena has limited capabilities and doesn’t scale
▪ Limited concurrency of 20 per account
▪ No visibility into cluster logs or query logs, and no flexibility or control over scale
[Architecture diagram: Ahana Cloud for Presto]
ACCESS BILLING & SUPPORT
Ahana Cloud Account: Ahana console oversees and manages every Presto cluster
Customer Cloud Account: In-VPC orchestration of Presto clusters, where metadata, monitoring, and data sources reside
▪ In-VPC Presto Clusters (Compute Plane): AD HOC (CLUSTER 1), TEST (CLUSTER 2), PROD (CLUSTER N)
▪ Glue, S3, RDS, Elasticsearch
proven scalability
• Interactive ANSI SQL queries
• Query data where it lives with Federated Connectors (no ETL)
• High concurrency
• Separation of compute and storage
from the Data Warehouse to the Open Data Lakehouse powered by Presto & Ahana to power 200K orders/day
• “Everything delivered in 10 minutes”

“Ahana is providing Blinkit with a SaaS managed service for Presto, providing the company with the advanced data management capabilities it needs to meet its instant delivery promise.”
Satyam Krishna, Engineering Manager at Blinkit
AWS S3 with an open data lake + USER
▪ Presto compute brought to your data in your VPC in your account
▪ Fully managed Presto cluster life cycle, including idle-time management
▪ Query AWS DBs: RDS/MySQL, RDS/Postgres, Elasticsearch, Redshift
▪ Cloud-native and highly available, running on Kubernetes
▪ Bring your own
  ▪ BI tool / Data Science Notebook
  ▪ Metadata Catalog
  ▪ Transaction Manager

Easy to use; 3x Price Performance; Open & Flexible