Taiwan University
Overview Diagram
[Diagram; arrow labels: upload samples; request data; return data; cluster & job status; submit computing-intensive job; ask for data to process; provide data]
Front-end provides:
• interface to run & design pipelines
• monitor system status
• view & manipulate analysis results
2013.09 Slides by Liang Bo Wang
Tool with Hadoop
[Diagram; arrow labels: cluster & job status; submit computing-intensive job; ask for data to process; provide data. Panel labels: Try-n-Error Analysis; Heavy Computation]
Features – Graphically Interactive
• Manage samples online
  – experiment metadata: condition, tissue type, …
  – grouping: folder, labeling
• Genome browser
  – interact with results
    • results are shown alongside the reference genome
    • clicking a record jumps to its region
• Perform analysis
  – manipulate results: table filtering, search, …
  – visualization
    • static
    • interactive: HTML5, SVG, D3.js
  – export results
    • to Excel
    • download through link
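As a rough illustration of the "table filtering", "export", and "jump to a region" features listed above, here is a minimal Python sketch. The row fields, the values, and the helper names (`filter_rows`, `to_csv`, `region_for`) are all hypothetical and made up for illustration; they are not part of the platform. CSV is used as a stand-in export format because Excel can open it.

```python
import csv
import io

# Hypothetical result rows; field names and values are invented for illustration.
results = [
    {"gene": "GENE_A", "chrom": "chr1", "start": 1000, "end": 1500, "score": 0.92},
    {"gene": "GENE_B", "chrom": "chr2", "start": 200, "end": 900, "score": 0.35},
    {"gene": "GENE_C", "chrom": "chr1", "start": 5000, "end": 5400, "score": 0.78},
]


def filter_rows(rows, min_score):
    """Keep only rows whose score is at least min_score."""
    return [row for row in rows if row["score"] >= min_score]


def to_csv(rows):
    """Serialize rows to CSV text; returns an empty string for no rows."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def region_for(row, padding=100):
    """Build a 'chrom:start-end' region string around a clicked record."""
    start = max(1, row["start"] - padding)
    end = row["end"] + padding
    return f"{row['chrom']}:{start}-{end}"


filtered = filter_rows(results, min_score=0.5)
csv_text = to_csv(filtered)
region = region_for(filtered[0])
print(csv_text)
print(region)
```

A front-end would pass a string like `region` to whichever genome browser it embeds; the exact hand-off depends on the browser chosen.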
Features – Flexibility
• Database backend
  – MySQL, PostgreSQL, MongoDB, Redis
• Storage
  – Local server
  – Cloud: Amazon S3, Microsoft Azure, Google Cloud
• Computing cluster
  – optional (not all labs have this)
  – implementation not yet decided (message queue: ZeroMQ)
  – Foxconn custom Hadoop Cluster
  – Amazon Elastic MapReduce
Features – Extensible
• Provide API
  – for all bindings, e.g., DB, storage, …
  – for communication with main platform
    • set up one’s own pipeline
• Provide SDK (Software Development Kit)
  – combining their own tools
  – for their own cloud
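One way an API "for all bindings" could look for the storage case is a small abstract interface with interchangeable implementations. This is only a sketch under assumptions: `StorageBackend`, `LocalStorage`, `InMemoryStorage`, and `run_step` are hypothetical names chosen for illustration, not the platform's actual API, and a cloud binding would be another subclass of the same interface.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical storage binding: every backend offers the same two calls."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def load(self, key: str) -> bytes:
        ...


class InMemoryStorage(StorageBackend):
    """Keeps objects in a dict; handy for tests."""

    def __init__(self) -> None:
        self._objects = {}

    def save(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def load(self, key: str) -> bytes:
        return self._objects[key]


class LocalStorage(StorageBackend):
    """Stores each object as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, key: str, data: bytes) -> None:
        (self.root / key).write_bytes(data)

    def load(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def run_step(storage: StorageBackend, key: str) -> bytes:
    """Toy pipeline step: store input, upper-case it, store and return the output."""
    storage.save(key, b"acgt")
    processed = storage.load(key).upper()
    storage.save(key + ".out", processed)
    return storage.load(key + ".out")


with tempfile.TemporaryDirectory() as tmp:
    backends = (InMemoryStorage(), LocalStorage(tmp))
    outputs = [run_step(backend, "sample1.txt") for backend in backends]
print(outputs)
```

The pipeline step only sees the interface, so swapping local storage for a cloud service would not require changing pipeline code.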
Genome Browser
• ChromoZOOM https://github.com/rothlab/chromozoom
• scribl (HTML5) http://chmille4.github.io/Scribl/
Tedious to set up all instances
• Create a system image
• With a script (Azure SDK), one can control many instances at the same time.
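The "control many instances at the same time" idea can be sketched as a simple fan-out. The slides do not show the actual script, so `start_instance` below is a hypothetical placeholder that only returns a string; a real script would call the cloud provider's SDK or command-line tool at that point. The instance names are also invented.

```python
from concurrent.futures import ThreadPoolExecutor


def start_instance(name: str) -> str:
    """Hypothetical placeholder; a real script would call the provider's SDK here."""
    return f"{name}: started"


def control_many(names, action, max_workers=8):
    """Apply one action to every instance concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(action, names))


# Hypothetical names for 20 workers created from the same system image.
instance_names = [f"worker-{i:02d}" for i in range(1, 21)]
results = control_many(instance_names, start_instance)
print(results[0])
print(len(results))
```

The same `control_many` helper could take a stop action instead, which makes it easy to shut everything down after a job.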
How’s the Price? (Cont’d)
• Actually it is cheap
  – You won’t keep an instance running 24/7
    • Unless you are running a web server
    • Azure has a 6-month/12-month pre-paid discount (20% off)
  – Use it when you need intensive computing
    • Example
    • L (NT$7.5/hr) x 20 instances x 5 hr ≈ NT$800
  – Remember to turn instances off
    • The account is linked to your credit card, so don’t be stupid
• Worth a try!
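The example above is plain rate x instances x hours arithmetic. The short sketch below reproduces it: the raw product is NT$750, in line with the slide's rounded figure of about NT$800. The always-on comparison over a 30-day month is an added assumption for contrast, not a figure from the slides, and `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(rate_per_hour: float, instances: int, hours: float) -> float:
    """Pay-as-you-go cost of a run, in the same currency as the hourly rate."""
    return rate_per_hour * instances * hours


# Figures from the slide: L instance at NT$7.5/hr, 20 instances, 5 hours.
burst_cost = estimate_cost(7.5, 20, 5)

# Assumed comparison (not from the slides): the same 20 instances left on for 30 days.
always_on_cost = estimate_cost(7.5, 20, 24 * 30)

print(f"5-hour burst: NT${burst_cost:,.0f}")
print(f"30 days always on: NT${always_on_cost:,.0f}")
```

The gap between the two numbers is the reason for the reminder to turn instances off.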
Summary
• CentOS 6.3 fails, needs some hacks
  – Switch to Ubuntu 13.04 temporarily
  – Solution
    • Build VM locally on Windows Server (.VHD)
    • Upload VHD image file to server
• Seems cheap, and I have some quota to play around with
  – $6,300 for 1-month trial
    • Though Google gives me $63,000 for trial
  – Try Google Cloud Platform later
FYI – Data Visualization
• http://www.visualisingdata.com/index.php/2013/09/essential-resources-programming-languages-toolkits-and-libraries/