LEARNING ALGORITHM, MODEL FUNCTION
▸ Input about the world (human analogue: outside world)
▸ Processing resources (human analogue: eyes + brain)
▸ Learned representation (human analogue: neural association), e.g. "DOG"
OUTSIDE WORLD INTO A MACHINE?
▸ Input about the world: different grayscale pixels
▸ Describe or capture: 1 person, 2 trees, 1 animal, lots of grass, 1 path
▸ Remove context (extracted relevant information): People 1, Trees 2, Animals 1, Grass Yes, Paths 1
▸ Summarize with numbers (numerical representation): (1 2 1 1 1)
▸ Result: data vector representation
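The step from a described scene to a data vector can be sketched in a few lines of Python. This is only an illustration: the feature order and the names `FEATURES` and `to_vector` are assumptions made for this example, not part of the slides.

```python
# Hypothetical feature order, taken from the column headings on the slide.
FEATURES = ["people", "trees", "animals", "grass", "paths"]


def to_vector(description):
    """Turn a described scene into a fixed-length list of numbers."""
    vector = []
    for name in FEATURES:
        value = description.get(name, 0)  # a missing feature counts as 0
        vector.append(int(value))         # True/False ("Yes"/"No") becomes 1/0
    return vector


# The scene from the slide: 1 person, 2 trees, 1 animal, grass present, 1 path.
scene = {"people": 1, "trees": 2, "animals": 1, "grass": True, "paths": 1}
vector = to_vector(scene)
print(vector)  # [1, 2, 1, 1, 1]
```

Fixing the feature order once is what makes vectors from different scenes comparable: position 2 always means "number of trees".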
LEARN?
A function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output. (Wikipedia)
MACHINES LEARN PREVIOUSLY UNKNOWN FUNCTIONS MAPPING FROM GIVEN INPUT TO GIVEN RESULTS
MODEL FUNCTION: f( ) = 1, f( ) = 0, f(x) = ?
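As a minimal sketch of "learning an unknown function from given inputs and results", the snippet below fits a straight line to example pairs with ordinary least squares. The target function (here secretly f(x) = 2x + 1), the sample points and the name `fit_line` are assumptions chosen for illustration; the slides do not prescribe any particular algorithm.

```python
def fit_line(xs, ys):
    """Learn a model function f(x) = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Given inputs and given results (hypothetical examples of the unknown function).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

f = fit_line(xs, ys)  # the learned model function
prediction = f(10)    # apply it to an input that was never seen
print(prediction)     # 21.0
```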
data → Learned Model
▸ Terrain data (slope, roughness, etc.) → Function mapping terrain to speed
▸ Customer & market data and past prices → Function mapping input to future prices
▸ Gene sequence identification: Lots and lots of genome data → Clusters of recurring gene sequence patterns
LEARNING
Supervised (examples with a known result, here the "Umbrella?" column):
Rain    Wind    Umbrella?
heavy   light   yes
none    light   no
light   strong  no
light   light   yes
none    strong  no
Unsupervised (examples without a known result), user tastes:
▸ User 1 likes The Clash
▸ User 23 likes Die Ärzte
▸ User 42 likes Helene Fischer
▸ User 1 likes The Sex Pistols
▸ User 42 likes Heino
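The two data sets on this slide can be used for a small sketch of both learning styles. The supervised part predicts "Umbrella?" by a majority vote among the most similar known rows (a simple nearest-neighbour rule); the unsupervised part groups artists that are liked by the same user, with no target column involved. Both algorithms and the names `predict_umbrella` and `group_artists` are assumptions picked for illustration, not methods named on the slide.

```python
from collections import Counter, defaultdict

# Supervised: each row is ((rain, wind), umbrella?) as on the slide.
UMBRELLA = [
    (("heavy", "light"), "yes"),
    (("none", "light"), "no"),
    (("light", "strong"), "no"),
    (("light", "light"), "yes"),
    (("none", "strong"), "no"),
]


def predict_umbrella(rain, wind):
    """Majority vote among the rows that differ from the query in the fewest features."""
    query = (rain, wind)

    def distance(features):
        return sum(a != b for a, b in zip(features, query))

    best = min(distance(features) for features, _ in UMBRELLA)
    votes = Counter(label for features, label in UMBRELLA if distance(features) == best)
    return votes.most_common(1)[0][0]


# Unsupervised: (user, artist) pairs without any label to predict.
LIKES = [
    (1, "The Clash"),
    (23, "Die Ärzte"),
    (42, "Helene Fischer"),
    (1, "The Sex Pistols"),
    (42, "Heino"),
]


def group_artists(likes):
    """Group artists that share a fan; returns sorted lists for stable output."""
    by_user = defaultdict(set)
    for user, artist in likes:
        by_user[user].add(artist)
    return sorted(sorted(artists) for artists in by_user.values())


known = predict_umbrella("heavy", "light")    # row exists in the table
unseen = predict_umbrella("heavy", "strong")  # combination not in the table
groups = group_artists(LIKES)
print(known, unseen, groups)
```

With only five rows and three users, the results say little about real weather or taste; the point is the difference in what the data provides: a result column in one case, none in the other.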
in a very similar way to human learning!
▸ Learning: pattern recognition and dealing with unfamiliar situations based on experience
▸ Situations and experience can be abstracted into data so that they become accessible to machines
▸ Machines learn previously unknown functions from data
▸ An ML system consists of input data, ML algorithms, model functions, results and, optionally, feedback and training data
OF RECOMMENDATION)
▸ Tutorial for the "Kaggle Titanic Competition" (using R): http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r
▸ More advanced tutorial based on the same dataset using Python (Scikit-learn, Pandas, Tensorflow): https://blog.socialcops.com/technology/data-science/machine-learning-python/
▸ Online courses (MOOCs):
  ▸ Udacity: Intro to Machine Learning: https://www.udacity.com/course/intro-to-machine-learning--ud120 (Excellent intro to applied ML using scikit-learn and Python)
  ▸ Coursera: Machine Learning: https://www.coursera.org/learn/machine-learning (Friendly intro to the theory behind common ML algorithms)
▸ Machine Learning Mastery: Lots of self-study guides for ML learners: http://machinelearningmastery.com/
▸ UCI ML Repository: Collection of "toy problems" for ML: http://archive.ics.uci.edu/ml/datasets.html
▸ Toolkits:
  ▸ Scikit-Learn (Python, great online documentation): http://scikit-learn.org/stable/
  ▸ stats package (R, pre-installed, many simple ML algorithms). Examples: http://www.statmethods.net/stats/regression.html
▸ Book: Abu-Mostafa, Magdon-Ismail, Lin: Learning From Data - A Short Course (AMLbook.com) (Good intro to more academic perspectives, notation and vocabulary on ML)
LEARNING PROBLEMS
1. Understand the problem and context
2. Understand & clean the data, create some features
3. For supervised learning: Split into training and test data
4. Evaluate different algorithms with default parameters
5. Optimize the parameters and compute the results
6. Interpret the results
7. Repeat with different features until you get useful results
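Steps 3 and 4 of this workflow can be sketched without any ML library. The snippet splits a toy data set into training and test data, then compares two very simple algorithms on the held-out test rows. The data set, the 25% test fraction, the fixed seed and all function names are assumptions for this example; in practice a toolkit such as Scikit-Learn would provide these building blocks.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly, then hold out a fraction of the rows for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def fit_majority(train):
    """Baseline: always predict the most common label in the training data."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_neighbour(train):
    """Predict the label of the closest training example."""
    def predict(x):
        _, y = min(train, key=lambda row: abs(row[0] - x))
        return y
    return predict


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the known label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Toy data: one numeric feature, two well-separated classes (hypothetical).
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(20, 30)]

train, test = train_test_split(data)  # step 3
scores = {                            # step 4
    name: accuracy(fit(train), test)
    for name, fit in [("majority", fit_majority),
                      ("nearest_neighbour", fit_nearest_neighbour)]
}
print(scores)
```

Evaluating on rows the algorithms never saw during training is what makes the comparison meaningful; the baseline gives a reference point for judging whether a more elaborate algorithm is actually learning anything.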