Mobile Media Site using Drupal and MongoDB with 10gen and CIGNEX Datamatics

August 07, 2012

Join us for a webinar showcasing an integration of Drupal and MongoDB to create a mobile media site, presented by 10gen and CIGNEX Datamatics.

The webinar will feature an online and mobile media site developed in Drupal that stores millions of digital photos, together with their metadata, in MongoDB. The site lets users create and manage albums, extract and store metadata, and run advanced searches across a large repository for instantaneous retrieval of images.

The solution will demonstrate how MongoDB can be leveraged to store and search millions of images and associated metadata in a centralized and massively scalable repository.


  1. Mobile Media Site using Drupal & MongoDB

     Aug 7, 2012
     Presented by:
     Yash Badiani, Big Data Practice Lead, CIGNEX Datamatics
     Gaurav Khambhala, Technical Lead, CIGNEX Datamatics
     2000, CIGNEX Datamatics has implemented over 400 Open Source enterprise solutions addressing business requirements related to Portals, Content & Big Data
     SOLUTIONS
     •  Managed Cloud Services - Develop, Deploy, Manage
     •  Annual Product Subscription: Liferay, Alfresco, Magento, Hadoop, Selenium
     •  Extended Development Center – Center of Excellence

     SERVICES
     •  UI Development, Integration, Customization, Migration, Testing, Training, Support (24*7)

     •  Portals (User eXperience Platform): Liferay, Magento, Drupal, Adobe CQ
        –  Intranet, Extranet, Social Collaboration, Mobile Portals, E-Commerce
     •  Content (Enterprise Content Management): Alfresco, Drupal, Magento, Adobe CQ, Moodle, Ephesoft
        –  WCM, DM, RM, DAM, E-Commerce, E-learning, ERP, Imaging Solutions
     •  Big Data (Making Data Work): Hadoop, MongoDB, HBase, Neo4j, Solr
        –  Analytics, Mobile, Social, Web, Real-time, DW - BI, Log Processing and Analysis, Enterprise Search
        –  Velocity, Complexity, Volume, Variety
     –  Experts in improving enterprise productivity through Process Engineering & Information Management Solutions
     •  Key Highlights
        –  Founded in 1975
        –  Publicly listed in India
        –  Annual consolidated revenue of US$100 Million
        –  Fortune 500 clients
        –  4,400+ employees across 22 offices in 9 countries
     •  Strategic Alliances
     Yash Badiani is the Big Data Practice Lead at CIGNEX Datamatics and focuses on Big Data technologies including MongoDB & Hadoop. He has worked extensively on large data warehousing & business intelligence projects with tools such as Business Objects, Microsoft SQL Server, MicroStrategy, and IBM Cognos.

     Gaurav Khambhala works at CIGNEX Datamatics as Technical Lead. He is a senior member of the PHP Practice at CIGNEX Datamatics and is involved in various technology initiatives such as Big Data, where he focuses on integrating PHP with NoSQL sources like MongoDB. He has wide industry experience in software development & management with Open Source technologies such as Drupal, Moodle & WordPress.
  6. Agenda

     CIGNEX Datamatics Confidential
     •  The Mobile Media Use Case
     •  Requirements and Challenges
     •  Solution: Mobile Media site using Drupal & MongoDB
     •  Why Drupal and MongoDB?
     •  Demo and Solution Features
     •  Benefits
     •  Summary
  7. The Mobile Explosion!

     •  By 2015, at least 60% of information workers will interact with their content applications via a mobile device
     •  Employees work on proposals and presentations on mobile devices while travelling
     •  People use digital assets (videos, images) longer on tablets and mobiles compared to desktops

     Based on a report by a leading IT advisory firm
  8. Mobile Media Use Case

     •  Mobile Media site includes the following features:
        –  Store a variety of images & associated metadata
        –  Massively scalable to store billions of images
        –  Access through mobile
        –  Create / edit albums
        –  Add images to the albums
        –  Add / edit metadata of images
        –  Search images / albums by date, metadata, albums, etc.
        –  Social media features – likes, comments
  9. Requirements of Mobile Media sites

     •  Velocity: fast performance, large user base, concurrent CRUD, access through various channels
     •  Volume: millions of digital assets, variety of content, complexity of data
     •  User experience: rich UI features, social features, mobile access, fast search
     •  Scalability: elastic scaling, cost effectiveness, centralized storage, ease of maintenance
     •  Security & Availability: high availability, automatic failover, user management
     •  Flexibility & Agility: easy integration, shorter dev cycle, faster deployment, ease of schema design
  10. Standard Three Layered Data Architecture

     •  Application layer
     •  Standard three layered storage
        –  File System
        –  Metadata in RDBMS
        –  Search
  11. Limitations of RDBMS

     •  Support limited to terabytes
        –  No support for petabytes to exabytes
     •  Manage only structured data
        –  No support for semi-structured and unstructured data
     •  RDBMS don't scale inherently
        –  Scale up / scale out (load balancing & replication)
     •  Hard to shard / partition
        –  Large data files
     •  Both read / write throughput not possible
        –  Transactional / analytical databases
     •  Specialized hardware - expensive

     RDBMS can't manage all dimensions of data with speed & at lower cost.
  12. NoSQL is the right solution

     NoSQL: Not Only SQL
     •  They are schema-less
     •  Designed to support huge data volumes
        –  Facebook 135 billion messages/month; Twitter 7 TB data/day
     •  Scalable replication and distribution mechanism
        –  Thousands of machines distributed around the world
     •  Massive write performance with asynchronous inserts and updates
     •  Designed to give high query performance
     •  Runs on commodity hardware
     •  Most NoSQL databases are Open Source
  13. NoSQL – Data Models

     NoSQL Databases
     •  4 broad data models
     •  120+ variants available in the market

     •  Column Families
        –  Usage: read/write intensive
        –  Popular databases: HBase, Cassandra
     •  Document Store
        –  Usage: working with occasionally changing/consistent data
        –  Popular databases: CouchDB, MongoDB
     •  Graph Database
        –  Usage: spatial data storage
        –  Popular databases: Neo4j, Bigdata
     •  Key Value / Tuple Store
        –  Usage: briskly changing data and high availability
        –  Popular databases: Riak, Redis, Azure Table storage
  14. Requirements of Mobile Media sites - Recap

     Mobile Media Site
     •  Velocity: fast performance, large user base, concurrent CRUD, access through various channels
     •  Volume: millions of digital assets, variety of content, complexity of data
     •  User experience: rich UI features, social features, mobile access, fast search
     •  Scalability: elastic scaling, cost effectiveness, centralized storage, ease of maintenance
     •  Security & Availability: high availability, automatic failover, user management
     •  Flexibility & Agility: easy integration, shorter dev cycle, faster deployment, ease of schema design
  15. Drupal with MongoDB Solution

     •  Drupal (PHP)
        –  Themes
        –  Core Modules: Nodes, Taxonomy, User Roles, Forms & Menu
        –  Custom Modules: Workflow, Forums, Comments & Ratings, Tagging
     •  Web Services: 3rd party & internal applications
     •  MongoDB Driver
     •  Mongos routing process
     •  Replica Set: MongoDB, MongoDB
     •  Replica Set: MongoDB, MongoDB
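
As a rough illustration of the layout on this slide, the application only needs the addresses of the mongos routers; the replica sets behind them are not addressed directly. The Python sketch below builds a standard `mongodb://` connection URI that lists several routers. The host names, the port, and the database name `media` are assumptions made for illustration and do not come from the webinar.

```python
def build_mongo_uri(routers, database, port=27017):
    """Build a mongodb:// URI that lists one or more mongos routers.

    All host names passed in are hypothetical; the deck does not name any.
    """
    if not routers:
        raise ValueError("at least one mongos router is required")
    # Each router is written as host:port, separated by commas.
    hosts = ",".join(f"{host}:{port}" for host in routers)
    return f"mongodb://{hosts}/{database}"


uri = build_mongo_uri(["mongos1.example.com", "mongos2.example.com"], "media")
print(uri)
```

Listing more than one router lets a driver fall back to another mongos if one becomes unreachable, which fits the availability requirements listed earlier in the deck.
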
  16. Why Drupal?

     •  Pluggable Architecture
     •  Data Abstraction Layer
     •  Easy to Upgrade
     •  Secure
     •  Active Community
     •  Widely Adopted
     •  Scalable
     •  3rd Party Tools Integration
     •  User Management & Permissions
     •  HTML5 & CSS Support
  17. Websites using Drupal

     •  Whitehouse.gov
     •  Data.gov.uk
     •  mtv.co.uk
     •  research.yahoo.com
     •  pdx.edu
     •  EndPoverty2015.org
  18. Why MongoDB?

     •  Agile and Scalable
     •  Full Index Support
     •  Document Oriented Storage
     •  Replication
     •  Querying
     •  Atomic Updates
     •  Data Processing and Aggregation
     •  High Availability
  19. Customers using MongoDB

     •  Centralized data management platform
     •  2 billion+ documents
     •  20 TB of photo metadata
     •  TV episodes and series
     •  Risk solutions auditing data

     Source: http://www.10gen.com/customers
  20. Demo

     •  Media site on mobile simulator
     •  Like & comment on an image on mobile simulator
     •  Mobile site on web browser
     •  Verify 'Like' & comment of the same image on web browser
     •  Search images & access control
  21. Architecture

     •  User: Browser / Mobile Device
     •  Drupal: Browser/Mobile Theme, Form API, Menu API, Drupal API, Custom Module
     •  MongoDB PHP Driver
     •  MongoDB: User Metadata, Albums, Image Metadata, GridFS, Indexes
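
GridFS stores a file as one descriptor document in `fs.files` plus a series of fixed-size pieces in `fs.chunks`. The Python sketch below only illustrates that document layout with plain dictionaries; it is not the driver's implementation, and a real driver performs these writes for you. The file id, the file name, and the chunk sizes used here are assumptions for illustration.

```python
from datetime import datetime, timezone

CHUNK_SIZE = 255 * 1024  # assumption: a common driver default; it is configurable


def to_gridfs_documents(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Split raw bytes the way GridFS lays them out across two collections.

    Returns (files_doc, chunk_docs): one descriptor for fs.files and one
    document per chunk for fs.chunks.
    """
    chunks = [
        {"files_id": file_id, "n": index, "data": data[start:start + chunk_size]}
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
    }
    return files_doc, chunks


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


photo = bytes(range(256)) * 10  # 2,560 bytes standing in for image data
files_doc, chunks = to_gridfs_documents("img-1", "beach.jpg", photo, chunk_size=1024)
print(files_doc["length"], len(chunks))
print(reassemble(chunks) == photo)
```

Because each chunk carries its parent `files_id` and its position `n`, an image can be read back in order, or partially, without loading the whole file.
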
  22. Flow

     •  User Actions: Add Album, Add Image, View Album, View Individual Images, Like Image, Comment Image, Add Tags to Images, View Counter, Search Images By Tags
     •  MongoDB Collections (linked through DBRef): User Metadata, Albums, GridFS, Image Metadata
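
Several of the user actions on this slide map onto MongoDB's atomic update operators. The Python sketch below builds the filter and update documents for a few of them as plain dictionaries. The field names (`likes`, `view_counter`, `comments`, `tags`) are assumptions based on the labels in the deck, not the presenters' actual schema.

```python
def like_image(image_id):
    """'Like Image': atomically increment the like counter."""
    return {"_id": image_id}, {"$inc": {"likes": 1}}


def count_view(image_id):
    """'View Counter': atomically increment the view counter."""
    return {"_id": image_id}, {"$inc": {"view_counter": 1}}


def comment_on_image(image_id, user_id, text):
    """'Comment Image': append a comment sub-document to the comments array."""
    comment = {"user_id": user_id, "text": text}
    return {"_id": image_id}, {"$push": {"comments": comment}}


def add_tags(image_id, tags):
    """'Add Tags to Images': $addToSet with $each avoids duplicate tags."""
    return {"_id": image_id}, {"$addToSet": {"tags": {"$each": list(tags)}}}


def search_by_tags(tags):
    """'Search Images By Tags': $all matches images carrying every tag."""
    return {"tags": {"$all": list(tags)}}


query_filter, update = like_image("img-1")
print(query_filter, update)
print(search_by_tags(["beach", "sunset"]))
```

With a driver such as PyMongo, each filter/update pair would typically be passed to a call like `collection.update_one(query_filter, update)`, and the search filter to `collection.find(...)`. Because `$inc` and `$push` are applied on the server, concurrent likes and comments from web and mobile clients do not overwrite each other.
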
  23. Schema Design

     •  User Metadata: User ID, DBRef (Album), Tags, Thumbnail, Likes, View Counter, Comments, Permission
     •  GridFS: FS.Files, FS.Chunks
     •  Albums: User ID, Tags, Title
     •  Image Metadata: Make, Model, Date Time, Aperture, Exposure, DBRef (GridFS)
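
One way to read the schema slide is as three document shapes that point at each other through DBRef-style references. The Python sketch below builds such documents as plain dictionaries. The assignment of fields to collections, the lowercase field names, the collection name `albums`, and the default permission value are all assumptions for illustration; only the `$ref`/`$id` keys follow the standard DBRef convention.

```python
def dbref(collection, object_id):
    """Return a DBRef-style reference using the conventional $ref/$id keys."""
    return {"$ref": collection, "$id": object_id}


def new_album(album_id, user_id, title, tags=()):
    """Album document: owner, title, and tags."""
    return {"_id": album_id, "user_id": user_id, "title": title, "tags": list(tags)}


def new_image_metadata(image_id, gridfs_file_id, exif):
    """Camera fields from the slide plus a reference to the stored file."""
    return {
        "_id": image_id,
        "make": exif.get("make"),
        "model": exif.get("model"),
        "date_time": exif.get("date_time"),
        "aperture": exif.get("aperture"),
        "exposure": exif.get("exposure"),
        "file": dbref("fs.files", gridfs_file_id),
    }


def new_user_metadata(image_id, user_id, album_id, permission="private"):
    """User-facing data for an image: album link, social counters, access."""
    return {
        "_id": image_id,
        "user_id": user_id,
        "album": dbref("albums", album_id),
        "tags": [],
        "thumbnail": None,
        "likes": 0,
        "view_counter": 0,
        "comments": [],
        "permission": permission,
    }


album = new_album("album-1", "user-7", "Goa 2012", tags=["beach"])
image = new_image_metadata("img-1", "file-1", {"make": "Canon", "aperture": "f/2.8"})
social = new_user_metadata("img-1", "user-7", album["_id"])
print(social["album"], image["file"])
```

Keeping the camera metadata separate from the frequently updated counters and comments means likes and views touch a small document, while the image bytes stay in GridFS.
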
  24. MongoDB Monitoring Service (MMS)

     •  DB Storage
     •  Cursors
     •  Replica Sets
     •  Network Connections
     •  Non Mapped Virtual Memory
     •  Opcounters
  25. Benefits

     Drupal
     •  Most advanced content management solutions
     •  Highly customized websites
     •  Most search friendly CMS
     •  Less coding, high on automation
     •  Powered by 7000 plugins and extensions
     •  Active community, real time assistance

     MongoDB
     •  Scalability – billions of content items, millions of users
     •  Performance – FAST writes through sharding, reads through indexes
     •  Data safety through replication
     •  Centralized single system for data storage
     •  Monitoring through MMS
     •  Enterprise support through 10gen
  26. Summary & Key Takeaways

     •  MongoDB provides the RIGHT fit for CMS applications with flexibility, scale & speed
     •  Drupal's advanced & automated CMS features and tight integration with MongoDB make it the right choice for building agile websites
     •  Both Drupal & MongoDB are feature rich and, being Open Source, provide significant cost benefits
     Yash Badiani
     Big Data Practice Lead

     Gaurav Khambhala
     Technical Lead