The BIG Opportunity - A peek into Big Data Research

My talk at the Tcrix Faculty Summit '13 on research opportunities in Big Data and IIIT's journey so far.

Other talks at http://dharmeshkakadia.github.io/talks

dharmeshkakadia

October 01, 2013

Transcript

  1. Whoami
     Sep 20 & 21, 2013, Faculty Summit on Big Data ©TCRIX
     •  MS student @ IIIT-H working under Prof. Vasudeva Varma
     •  Just finishing my thesis in Scheduling
     •  Love large scale [systems | data | learning]
     •  Automation freak
     •  Like to work at the intersection of Data and System
     •  Want to work on interesting things
  2. Why bother?
     “It’s not who has the best algorithm that wins. It’s who has the most data.”
     - Banko and Brill, 2001
     Source: https://amplab.cs.berkeley.edu/2013/02/07/for-big-data-moores-law-means-better-decisions
  3. What is Data?
  4. What is Big Data?
     Anything that is too big or too fast or too hard for existing tools.
  5. What is Big Data?
     •  Anything that is too big, too fast or too hard for existing tools
        – 92% of the world's data was generated in the past 2 years
        – 1.4 trillion digital transactions per month
        – 30 billion+ pieces of data added to Facebook every month
        – …
  6. What is Big Data?
     •  Anything that is too big, too fast or too hard for existing tools
        – Think Twitter
        – Think ad display on the web
        – Think stocks
        – Think medical equipment
  7. What is Big Data?
     •  Anything that is too big, too fast or too hard for existing tools
        – Jeopardy?
        – Brain simulations?
        – And everything else that we don’t know yet.
  8. Why should I care?
     •  March 2012: the White House announced a national “Big Data Initiative”, committing more than $200 million to big data research projects
     •  The European Commission is funding a 2-year-long Big Data Public Private Forum
     •  Open Data Initiative by Government of India
     •  Endless enterprise investments
     Big Data is here to stay
  9. The Big Data Tools Ecosystem
     Source: http://www.bigdata-startups.com/open-source-tools/
  10. Why is it hard interesting?
     •  Interdisciplinary, by definition
     •  Requires thinking beyond your comfort zone
        – Machine Learning
        – Statistics
        – Systems
        – Visualization
        – Signal Processing
        – …
  11. Why is it hard interesting?
     •  Interdisciplinary, by definition
     •  Requires thinking beyond your comfort zone
        – Machine Learning
        – Statistics
        – Systems
        – Visualization
        – Signal Processing
        – …
  12. Big Data Research @ IIIT-H
     •  Multiple Research Centers involved
        – Centre for Data Engineering (CDE)
        – Search and Information Extraction Lab (SIEL)
        – Center for Visual Information Technology (CVIT)
        – Speech and Vision Laboratory (SVL)
        – Center for Structural Engineering (CASE)
        – Language Technologies Research Center (LTRC)
     •  Areas of focus in Big Data
        – Systems
        – Applications
  13. Big Data Systems : Data Processing frameworks
     •  Improving Processing efficiency
        – Hadoop Scheduler
        – Hive query optimizations
     •  Improving Human efficiency
        – Automate everything
        – Better Visualization techniques
     •  How to process new kinds of data?
        – Image
        – Video
        – Speech
  14. Big Data Systems : Cloud
     •  Converged Infrastructure
        – Utilize full capabilities of infrastructure
        – Integration of private and public resources
     •  Resource optimization
        – For energy, SLA, performance ...
        – Hot replication of storage
     •  Security & Privacy
        – Privacy preserving computation
        – Security against theft
  15. Big Data Application : Text Analytics
     •  Entity linking
     •  Summarize
     •  Sarcasm detection
     •  Author profiling
     •  Sentiment analysis
     •  Cross language search
     •  Question answering
  16. Big Data Applications in Languages
     •  How do you model languages?
     •  Auto generation of resources
     •  Part of Speech tagging
     •  Stemming
     •  Morphological analysis
     •  Machine translation
     •  Transfer learning
  17. Big Data Applications in Speech
     •  Can we understand what is being said in real time?
     •  Speech synthesis
     •  Emotion Detection in speech
     •  Translate speech from one language to another
     •  “Ok Google” “ठीक है गूगल”
  18. Big Data Applications in Vision
     •  Image Search
     •  Cancer Detection from scan
     •  3D construction from 2D
     •  Perfect Group Photo?
  19. Big Data has a lot to offer
     •  Education
     •  Healthcare
     •  Bioscience
     •  Energy
     •  Economics
     •  Defense
     •  Environmental Science
  20. Big Data Impact : Education
     •  Intelligent Tutors and Environments
     •  Personalized Learning
        – Identify student’s competencies and knowledge over time, understand interests, goals and characteristics to improve learning experience.
     •  Education Data mining
        – Educational data based on an individual’s work and behaviors can be mined to better understand learning achievements, approaches, etc.
  21. Big Data Impact : Economics
     •  Decision support for governments
     •  Fraud detection
     •  Effectiveness of various government initiatives and spending
     •  Helping policy and administrative decisions
     •  Finding and correcting operational efficiency issues
  22. Big Data Impact : Defense
     •  Smart sensing, perception and decision support for autonomous systems
     •  Situational awareness for warfighters
     •  Communication analytics of all forms to prevent unwanted events
  23. Big Data Impact : Energy
     •  Data analytics to understand building energy consumption
     •  Grid Analytics
     •  Optimized distribution and generation of electric power
     •  Self-healing capabilities to anticipate and respond to system disturbances
  24. Big Data Impact : Bioscience and Healthcare
     •  Genomics
     •  Personalized Medicine
     •  Data-driven drug discovery
     •  Focus on wellbeing rather than disease
     •  Healthcare that is preventive, proactive, evidence-based and person-centered
     •  Treatment personalization
     •  Evaluating effectiveness of treatments
  25. Big Data Impact : Environmental Science
     •  Causes and effects of climate change
     •  Land fertility and usage over time
     •  Discovery of natural resources
     •  Predictive data analysis for disaster prevention
     •  Quick response for disaster management
  26. Challenges
     •  Lack of Data
     •  Nuggets vs Noise
     •  Talent lag
     •  Data Governance Policy
  27. Take away
  28. Thanks
     @dharmeshkakadia
     [email protected]
  29. References
     1.  Bertino, Elisa et al. Challenges and Opportunities with Big Data. Community whitepaper.
     2.  Shah, Rajvi et al. All Smiles: Automatic Photo Enhancement by Facial Expression Analysis. CVMP’12.
     3.  Halevy, A. et al. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 2009.