The Post–Data Era

Mosky Liu
November 29, 2020

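"Data scientist is the sexiest job of the 21st century." But if you work in the data field, do you actually feel sexy? A software engineer (the previous generation's sexiest job) who now leads a backend team and has lived through several waves of tech hype would like to share the undercurrents of the data field with you.

We all love data and hope to use it to create value and benefit people. I hope this talk helps everyone steer clear of the pitfalls and use data to solve problems more efficiently!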

Transcript

  1. Mosky
     2014: Graph-Tool
     2017: Data Science With Python
     2018: Hypothesis Testing With Python
     2019: Statistical Regression With Python
  2. Statistics vs. Machine Learning
     ➤ Statistics constructs more solid inferences.
     ➤ Machine learning constructs more interesting predictions.
     ➤ Machine Learning ⊃ Deep Learning
     ➤ The models may be the same, but the focuses are different.
     ➤ Good predictions usually need good inferences on the dataset.
  3. Study Designs
     • RCT (A/B testing)
     • Cohort Study: Group by exposure.
     • Case-Control Study: Diff to find the exposure.
     • Case Series
     • Case Report
     Source: Oxford CEBM 2009, Study Designs
  4. Role Matters: Science, Analysis, Scientist, and Engineering
     ➤ Data Engineering / Data Engineer: Prepare the data infrastructure so that others can do their work.
     ➤ Data Analysis / Data Analyst: Analyze data to support the company's decisions.
     ➤ Data Scientist: Create software to optimize the company's operations.
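  5. • Many subtle technical problems arise naturally → you need solid fundamentals.
     • Being explainable is not enough; you must also understand the data and the model → statistics and research methods offer a rich set of tools.
     • Many subtle non-technical problems arise naturally too → it takes domain knowledge to spot them.
     • One person's time is limited → define your role, hone your collaboration skills (for example, project management and product management), and persevere.
     • Creating value? Make people happy! Not only users, but colleagues and bosses too.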
  6. Image Credits
     • “NoSQL”: https://www.reddit.com/r/ProgrammerHumor/comments/2mk8sb/history_of_nosql/
     • “NoSQL Databases”: https://www.tech2shout.com/nosql-database-solutions-5-types/
     • “Hype Cycle”: https://en.wikipedia.org/wiki/Hype_cycle#/media/File:Hype-Cycle-General.png
     • “Overfitting”: https://en.wikipedia.org/wiki/Overfitting#/media/File:Overfitting.svg
     • “Data Leakage”: https://www.kaggle.com/dansbecker/data-leakage
     • “Husky”: https://en.wikipedia.org/wiki/Husky
     • “Wolf”: https://en.wikipedia.org/wiki/Wolf#/media/File:Front_view_of_a_resting_Canis_lupus_ssp.jpg
     • “Stationarity”: https://en.wikipedia.org/wiki/Expected_value#/media/File:Largenumbers.svg
     • “Non-Stationarity”: https://en.wikipedia.org/wiki/Stationary_process#/media/File:Stationarycomparison.png
     • “Houses”: https://unsplash.com/photos/vZEPXDQHR4s
     • “Linear PCA vs. Nonlinear Principal Manifolds”: https://en.wikipedia.org/wiki/Principal_component_analysis#/media/File:Elmap_breastcancer_wiki.png
     • “Teamwork”: https://unsplash.com/photos/g1Kr4Ozfoac
     • “Smile”: https://unsplash.com/photos/4K2lIP0zc_k
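
The Study Designs slide describes a cohort study as grouping by exposure and a case-control study as diffing to find the exposure. The sketch below is one way to read that distinction, not code from the talk: the `records` table and both function names are hypothetical and invented for illustration. It assumes the cohort view is summarized by a risk ratio and the case-control view by an odds ratio, which are the usual measures for those designs.

```python
from collections import Counter

# Hypothetical toy data: (exposed, outcome) pairs, invented for illustration.
records = (
    [(True, True)] * 30 + [(True, False)] * 70
    + [(False, True)] * 10 + [(False, False)] * 90
)


def cohort_risk_ratio(records):
    """Cohort view: group by exposure, then compare the outcome risk."""
    counts = Counter(records)
    exposed_total = counts[(True, True)] + counts[(True, False)]
    unexposed_total = counts[(False, True)] + counts[(False, False)]
    risk_exposed = counts[(True, True)] / exposed_total
    risk_unexposed = counts[(False, True)] / unexposed_total
    return risk_exposed / risk_unexposed


def case_control_odds_ratio(records):
    """Case-control view: group by outcome, then compare the odds of exposure."""
    counts = Counter(records)
    odds_among_cases = counts[(True, True)] / counts[(False, True)]
    odds_among_controls = counts[(True, False)] / counts[(False, False)]
    return odds_among_cases / odds_among_controls


risk_ratio = cohort_risk_ratio(records)
odds_ratio = case_control_odds_ratio(records)
print(f"risk ratio: {risk_ratio:.2f}")  # prints "risk ratio: 3.00"
print(f"odds ratio: {odds_ratio:.2f}")  # prints "odds ratio: 3.86"
```

Both functions read the same toy table here only to keep the sketch short. A real case-control study samples cases and controls separately, so the outcome risk cannot be estimated from it directly, which is why the odds ratio is used instead.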