Building Scalable and Flexible API by Leveraging GraphQL and BigTable

Andi N. Dirgantara, Lead Data Engineer at Traveloka

My Profile

• More than 6 years as a software engineer
• Last 4 years focused on data engineering (big data)
• Lead Data Engineer at Traveloka
• Lead of Facebook Developer Circles Malang
• Working remotely from Malang
• Urban and Regional Planning graduate
• Owner of The Bros Coffee and Coworking Space (@thebros_co)
• Owner of Cahayu Aesthetic and Slimming Center (@cahayu.clinic)

The Problems

We had MySQL installed, exposed by our application through a REST API. Everything went well until…

• We faced 5,000 rows written per second (18 million rows per hour)
• Storage consumption grew by more than 100 GB each day
• A single query could take more than 5 hours
• A single row could have up to 1,000 columns
• The system was hit by 10,000 requests per second (RPS)

What should we do?

Breaking Down The Problems

• 5,000 rows written per second
• 100 GB each day
• Queries take >5 hours
• 1,000 columns
• 10,000 RPS

These break down into Storage (Database) Problems and API Problems. We need scalable storage and a flexible API.

Solutions

Use a distributed system to leverage horizontal scalability

We Choose BigTable for Distributed Storage

• Low-latency distributed storage
• Eventually consistent, to gain high throughput and high availability through replication (it can be set to strong consistency too if we want)
• Wide-column storage that is able to store millions of columns

For more information, go to https://cloud.google.com/bigtable/

[Diagram: writes and reads are spread across Machine 1 (Data 1), Machine 2 (Data 2), … Machine n (Data n)]
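The diagram can be sketched as a toy model. Bigtable keeps rows sorted by row key and splits contiguous key ranges across machines; the class below imitates that idea in memory only. ToyCluster, the split points, and the sample values are all hypothetical, and none of this is the Bigtable client API.

```python
import bisect


class ToyCluster:
    """Toy model of range-partitioned storage (not the Bigtable client API)."""

    def __init__(self, split_points):
        # n sorted row-key boundaries give n + 1 key ranges, one per machine.
        self.split_points = sorted(split_points)
        self.machines = [dict() for _ in range(len(self.split_points) + 1)]

    def machine_for(self, row_key):
        # bisect_right returns the index of the key range that owns this row key.
        return bisect.bisect_right(self.split_points, row_key)

    def write(self, row_key, columns):
        # Only the machine owning the key range is touched by the write.
        row = self.machines[self.machine_for(row_key)].setdefault(row_key, {})
        row.update(columns)

    def read(self, row_key):
        # Reads are routed the same way, so no other machine is involved.
        return self.machines[self.machine_for(row_key)].get(row_key, {})


cluster = ToyCluster(split_points=["4", "8"])  # hypothetical boundaries, 3 machines
cluster.write("123456", {"hotel:name": "Hotel A"})
cluster.write("523456", {"flight:origin": "MLG"})
cluster.write("923456", {"flight:destination": "CGK"})
rows_per_machine = [len(m) for m in cluster.machines]
```

Because each write and read touches a single machine, adding machines (more split points) spreads the load. This is also why row-key design matters: keys that all fall into one range would send all traffic to one machine.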

Now our system is already scalable, but…

• We have 30 products
• Each product has at least 10 columns
• Some products have 50 columns
• How many REST endpoints should we provide?
• Who will maintain each endpoint?
• What if some systems need to consume more than 5 endpoints?

We need a queryable API...

GraphQL Comes to The Rescue

• Model the business case as a graph
• Just query what you need
• Has a dashboard playground
• Reduces network requests to 1 for multiple “endpoint” requests

For more information, visit https://graphql.org/
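"Just query what you need" can be illustrated with a minimal sketch. The select function below is a hypothetical stand-in for what a GraphQL server does with a selection set; it is not a GraphQL library, and the customer data is made up for illustration.

```python
def select(data, selection):
    """Return only the fields named in `selection`.

    `selection` is a nested dict; a value of None marks a leaf field.
    """
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        # Recurse into nested objects, copy leaf values as they are.
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


# Hypothetical backend data covering two products.
customer = {
    "hotel": {"name": "Hotel A", "address": "Malang", "checkInTime": "14:00", "price": 100},
    "flight": {"bookingId": "F-1", "origin": "MLG", "destination": "CGK", "seat": "12A"},
}

# One request asks for fields from both products at once.
selection = {"hotel": {"name": None}, "flight": {"origin": None, "destination": None}}
response = select(customer, selection)
```

The client gets hotel and flight data in one round trip and receives nothing it did not ask for, where a REST design would typically need one call per product endpoint.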

GraphQL + BigTable = Profit!

    query {
      customer(profileId: "123456") {
        hotel {
          edges { node { name address checkInTime } }
        }
        flight {
          edges { node { bookingId origin destination } }
        }
      }
    }

RowKey: 123456
Columns: hotel name, hotel address, hotel checkInTime, flight bookingId, flight origin, flight destination

Only scanning and delivering the necessary data
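A rough sketch of how the fields in a query like the one above could be turned into a single-row, column-limited read. The "family:qualifier" column naming, the in-memory table, and both helper functions are assumptions for illustration; the real resolvers and the Bigtable client calls are not shown in the deck.

```python
def columns_for(selection):
    """Flatten {family: [qualifier, ...]} into 'family:qualifier' column names."""
    return [
        f"{family}:{qualifier}"
        for family, qualifiers in selection.items()
        for qualifier in qualifiers
    ]


def read_row(table, row_key, columns):
    """Fetch only the requested columns of one row; absent cells are skipped."""
    row = table.get(row_key, {})
    return {column: row[column] for column in columns if column in row}


# Hypothetical in-memory stand-in for a wide, sparse table keyed by profileId.
table = {
    "123456": {
        "hotel:name": "Hotel A",
        "hotel:address": "Malang",
        "hotel:checkInTime": "14:00",
        "flight:bookingId": "F-1",
        "flight:origin": "MLG",
        "flight:destination": "CGK",
        "train:bookingId": "T-9",  # stored, but never requested below
    }
}

# Fields taken from the GraphQL selection, grouped by product.
selection = {"hotel": ["name", "checkInTime"], "flight": ["origin"]}
cells = read_row(table, "123456", columns_for(selection))
```

The row key comes from the profileId argument and the column list comes from the selected fields, so columns that were not requested are neither read nor returned.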

What’s Next?

The Tradeoff and Room to Improve

Room to Improve
• Optimizing/leveraging optional data storage for more complex use cases (CockroachDB, Spanner, CitusDB, etc.)
• Access control system per column
• … suggestion? ...

Tradeoff
• The learning curve is steep
• Hard to find talent experienced in our tech stack (GoLang, GraphQL, BigTable, etc.)
• A BigTable cluster is relatively expensive, so proper data modelling is necessary to avoid wasting resources

Thank you and let’s keep in touch! fb.me/andi.n.dirgantara andi_dirgantara hellowin
Do you want to propose a better solution? Our team is hiring...
