An Approach to Design Large Scale Data Centric Architecture Using MongoDB

AN APPROACH TO DESIGN LARGE SCALE DATA CENTRIC ARCHITECTURE USING MONGODB
By Sushmitha Diwakar, Aruljothi Annamalai, Clarence J M Tauro
Department of Computer Science, Christ University, Hosur Road, Bangalore
5th National Conference on Emerging Trends in IT, 26 February 2014

Objectives
• What Scale Is?
• How is/was Scale achieved? Traditional Way
• Scaling Today
• Introduction to NoSQL
• MongoDB
• Scaling with MongoDB
• Replication in MongoDB
• Search Design

What Scale Is?
• How well a solution to a problem works as the size of the problem increases
• Massive adoption/usage

How is/was Scale achieved? Traditional Way
• Fewer joins; fewer triggers
• Denormalize as much as possible
• Horizontal/vertical replication
• Increase hardware
• Traditional RDBMS; use ORMs like Hibernate
• Manual process – the developers' job

Scaling Today
• Many more persistence options
• Cloud-based architectures – completely abstract the underlying hardware from the developer
• Use PaaS – Cloud Foundry from Pivotal
• Fewer developers

Example: Scaling with Cloud Foundry v2
cf scale appName --instances 10

Introduction to NoSQL
• NoSQL stands for
  – “NoSQL” = “No SQL” = not using a traditional relational DBMS
  – “No SQL” = don’t use the SQL language
  – No joins
• Usually does not require a fixed table schema
• All NoSQL offerings relax one or more of the ACID properties

MongoDB
• MongoDB (from “humongous”)
• Cross-platform, schemaless, document-oriented NoSQL database
• MongoDB uses BSON (a binary, JSON-like format)
• Features include:
  – File storage
  – Indexing
  – Scaling
  – Replication

Sample MongoDB Document
{
  _id: ObjectId("4e77bb3b8a3e000000004f7a"),
  when: new Date("2014-02-26T02:10:11.300Z"),
  author: "arul",
  title: "MongoDB",
  text: "This is the text of the post",
  tags: ["JSON", "BSON"],
  votes: 5,
  voters: ["sushmita", "clarence", "jothi"]
}

Scaling – Larger Level
• Prefer simpler architectures
• Break down the workload completely
• Fine-tune your workload
• Do NOT use an ORM unless you really want to
  – Use simpler standards, e.g. Spring’s JdbcTemplate
• Use smaller, fine-grained components to deploy your application
• Shard
• Replicate

Scaling – Micro Level
• Multiple documents vs. nested documents
• Indexing
  – Need the right number of indexes
  – More indexes slow the DB down, because every index must be updated on each write; especially in MongoDB
• Transactions vs. compensating transactions
  – JTA transactions are highly discouraged
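A compensating transaction undoes completed steps with explicit "reverse" operations instead of relying on a database rollback. The following is a minimal sketch of that idea in plain JavaScript. The in-memory `accounts` object and the helpers `debit`, `credit`, `runWithCompensation` and `transfer` are hypothetical names chosen for illustration; a real application would issue MongoDB writes in their place.

```javascript
// Hypothetical in-memory "collection"; stands in for real database writes.
const accounts = { alice: 100, bob: 20 };

function debit(name, amount) {
  if (!(name in accounts)) throw new Error(`unknown account: ${name}`);
  if (accounts[name] < amount) throw new Error(`insufficient funds: ${name}`);
  accounts[name] -= amount;
}

function credit(name, amount) {
  if (!(name in accounts)) throw new Error(`unknown account: ${name}`);
  accounts[name] += amount;
}

// Run the steps in order. If one fails, run the compensations of the
// steps that already completed, in reverse order.
function runWithCompensation(steps) {
  const done = [];
  try {
    for (const step of steps) {
      step.action();
      done.push(step);
    }
    return true;
  } catch (err) {
    for (const step of done.reverse()) step.compensate();
    return false;
  }
}

function transfer(from, to, amount) {
  return runWithCompensation([
    { action: () => debit(from, amount), compensate: () => credit(from, amount) },
    { action: () => credit(to, amount), compensate: () => debit(to, amount) },
  ]);
}

const transferOk = transfer("alice", "bob", 30);       // both steps succeed
const transferFailed = transfer("alice", "carol", 30); // credit fails, debit is undone
console.log(transferOk, transferFailed, accounts);
```

The second transfer fails at the credit step because "carol" does not exist, so the earlier debit is compensated and the balances stay as they were after the first transfer.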

Nested/Embedded Data Model - MongoDB
Single I/O – or at least stored in contiguous blocks

Normalized Data Model - MongoDB
How many I/Os? Well, it depends on the storage
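The difference between the two data models can be sketched by counting document fetches. This is a toy model in plain JavaScript, not a measurement of real disk I/O: the blog data, the `Map`-based "collections" and the helper names are all assumptions made for illustration. A real MongoDB client could also fetch all referenced comments with one extra query, so the normalized count below is an upper bound for this access pattern.

```javascript
// Embedded model: comments live inside the post document.
const embeddedPosts = new Map([
  ["p1", {
    title: "MongoDB",
    comments: [{ by: "jothi", text: "Nice" }, { by: "clarence", text: "+1" }],
  }],
]);

// Normalized model: comments are separate documents, referenced by id.
const posts = new Map([["p1", { title: "MongoDB", commentIds: ["c1", "c2"] }]]);
const comments = new Map([
  ["c1", { by: "jothi", text: "Nice" }],
  ["c2", { by: "clarence", text: "+1" }],
]);

// Count every document fetch as a stand-in for one storage read.
let lookups = 0;
function fetchDoc(collection, id) {
  lookups += 1;
  return collection.get(id);
}

function readEmbedded(postId) {
  lookups = 0;
  const post = fetchDoc(embeddedPosts, postId);
  return { comments: post.comments, lookups };
}

function readNormalized(postId) {
  lookups = 0;
  const post = fetchDoc(posts, postId);
  const found = post.commentIds.map((id) => fetchDoc(comments, id));
  return { comments: found, lookups };
}

const embeddedRead = readEmbedded("p1");
const normalizedRead = readNormalized("p1");
console.log(embeddedRead.lookups, normalizedRead.lookups);
```

Both reads return the same two comments; the embedded read needs one fetch, while the normalized read needs one fetch for the post plus one per referenced comment.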

Scaling – Shard Keys
• Sharding is the process of storing data records across multiple machines and is MongoDB’s approach to meeting the demands of data growth
  – MongoDB does range-based sharding
  – Sharding can increase the number of queries
• Figure out the most common use case and then decide on sharding
  – Do this at design time
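Range-based sharding can be pictured as a lookup table from shard-key ranges to shards. The sketch below is a simplified illustration in plain JavaScript, not MongoDB's router: the shard key `userId`, the chunk boundaries and the function names are assumptions. It also shows why the choice of shard key matters at design time: a query that does not include the shard key has to be sent to every shard.

```javascript
// Hypothetical chunk ranges on the shard key `userId`; each range is [min, max).
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard2" },
];

// Exactly one chunk contains any given key.
function shardFor(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// Shards whose chunk overlaps the inclusive key range [lo, hi].
function shardsForRange(lo, hi) {
  return chunks.filter((c) => lo < c.max && hi >= c.min).map((c) => c.shard);
}

// A query with the shard key goes to one shard;
// a query without it is broadcast to all shards.
function shardsForQuery(query) {
  if (typeof query.userId === "number") return [shardFor(query.userId)];
  return chunks.map((c) => c.shard);
}

console.log(shardFor(42));
console.log(shardsForRange(900, 1200));
console.log(shardsForQuery({ author: "arul" }));
```

If the most common query filters on `userId`, it is served by a single shard; if it filters on some other field, every shard has to answer, which is one way sharding can increase the number of queries.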

Replication in MongoDB
• MongoDB uses replica sets to achieve replication
• A replica set is a group of MongoDB instances that host the same data set
• A replica set has one primary node, which receives all write operations; all other instances are secondaries, which apply operations from the primary so that they have the same data set
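The primary/secondary flow described above can be sketched as a primary that records each write in an operation log, and secondaries that replay the log entries they have not yet applied. This is a toy in-memory model in plain JavaScript; the classes `Member` and `ReplicaSet` and their methods are hypothetical names, and the sketch leaves out elections, failover and networking entirely.

```javascript
// One member of the set: holds a copy of the data and remembers
// how many log entries it has applied so far.
class Member {
  constructor(name) {
    this.name = name;
    this.data = new Map();
    this.applied = 0;
  }
  apply(op) {
    this.data.set(op.key, op.value);
    this.applied += 1;
  }
}

class ReplicaSet {
  constructor(primary, secondaries) {
    this.primary = primary;
    this.secondaries = secondaries;
    this.oplog = []; // ordered log of write operations
  }
  // All writes go to the primary and are appended to the log.
  write(key, value) {
    const op = { key, value };
    this.primary.apply(op);
    this.oplog.push(op);
  }
  // Each secondary replays the log entries it has not applied yet.
  sync() {
    for (const s of this.secondaries) {
      while (s.applied < this.oplog.length) s.apply(this.oplog[s.applied]);
    }
  }
}

const rs = new ReplicaSet(new Member("primary"), [new Member("s1"), new Member("s2")]);
rs.write("a", 1);
rs.write("b", 2);
const sizeBeforeSync = rs.secondaries[0].data.size; // secondaries lag until they sync
rs.sync();
console.log(sizeBeforeSync, rs.secondaries[0].data.size);
```

Until `sync()` runs, the secondaries lag behind the primary, which is why a read from a secondary may return older data than a read from the primary.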

Replication in MongoDB

Search Functionality
• A key characteristic is “multi-attribute querying”: matching on several fields in a single query
• Multi-attribute querying
  – Using the $and operator
db.inventory.find( { $and: [ { price: 1.99 }, { qty: { $lt: 20 } }, { sale: true } ] } )
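To make the semantics of the query above concrete, here is a toy matcher in plain JavaScript that supports only equality, `$lt` and `$and`. It is a sketch for illustration, not MongoDB's implementation; the `matches` function and the sample `inventory` array are assumptions. It also shows that listing the conditions side by side in one query document (an implicit AND, which MongoDB supports) selects the same documents as the explicit `$and` form.

```javascript
// Toy matcher for a tiny subset of the query language: equality, $lt and $and.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    if (field === "$and") return cond.every((sub) => matches(doc, sub));
    if (cond !== null && typeof cond === "object") {
      if ("$lt" in cond) return doc[field] < cond.$lt;
      throw new Error(`unsupported operator in ${JSON.stringify(cond)}`);
    }
    return doc[field] === cond;
  });
}

// Hypothetical sample data.
const inventory = [
  { item: "pen", price: 1.99, qty: 10, sale: true },
  { item: "pad", price: 1.99, qty: 50, sale: true },
  { item: "ink", price: 1.99, qty: 5, sale: false },
];

const explicitAnd = { $and: [{ price: 1.99 }, { qty: { $lt: 20 } }, { sale: true }] };
const implicitAnd = { price: 1.99, qty: { $lt: 20 }, sale: true };

const fromExplicit = inventory.filter((d) => matches(d, explicitAnd)).map((d) => d.item);
const fromImplicit = inventory.filter((d) => matches(d, implicitAnd)).map((d) => d.item);
console.log(fromExplicit, fromImplicit);
```

Only "pen" satisfies all three conditions: "pad" fails the quantity check and "ink" is not on sale.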

Search Design
• Search is based on the sharding id
• Horizontal scaling is implemented with the help of indexes

Conclusion • Design Matters!

Future Work
• There is more to consider when designing a scalable architecture:
  – Locking
  – Random partitioning
  – Write concerns

Questions?


Clarence J M Tauro (Couchbase)
