Word embeddings under the hood - Strata Data Conference

Slides from the talk "Word embeddings under the hood: How neural networks learn from language" as presented on March 8, 2018 at the Strata Data Conference in San Jose.

https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63773

Patrick Harrison

March 08, 2018

Transcript

  1. “We started with the scallop dish as an appetizer, followed by the spaghetti with tomato sauce and duck and foie gras ravioli.” How do we represent data like this?
  2. One-Hot Encoding
                1   2   3   …   V
     we         1   0   0   …   0
     started    0   1   0   …   0
     with       0   0   1   …   0
     …          …   …   …   …   …
     ravioli    0   0   0   …   1
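
The one-hot table on slide 2 can be reproduced with a few lines of Python. This is a minimal sketch, not the speaker's code: the tokenization (lowercase, strip commas and periods, split on spaces) and the function names `build_vocabulary` and `one_hot` are assumptions made for illustration.

```python
def build_vocabulary(tokens):
    """Map each distinct token to an index, in order of first appearance."""
    vocabulary = {}
    for token in tokens:
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def one_hot(token, vocabulary):
    """Return a list of V zeros with a single 1 at the token's index."""
    vector = [0] * len(vocabulary)
    vector[vocabulary[token]] = 1
    return vector


review = ("We started with the scallop dish as an appetizer, followed by the "
          "spaghetti with tomato sauce and duck and foie gras ravioli.")

# Assumed tokenization: lowercase, drop commas and periods, split on spaces.
tokens = review.lower().replace(",", "").replace(".", "").split()
vocabulary = build_vocabulary(tokens)

# Print the rows that appear on the slide.
for word in ("we", "started", "with", "ravioli"):
    print(f"{word:10s}", one_hot(word, vocabulary))
```

Any two distinct one-hot vectors have a dot product of 0, so this encoding by itself says nothing about which words are similar.
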
  3. …but one-hot encoding leaves a lot to be desired. Are better word representations possible?
  4. [Scatter plot: x and y axes, each running from -2 to 2; labeled points: beer, wine, cocktail, spoon, fork, knife, spaghetti, pasta, lasagna]
  5. [Scatter plot: x and y axes, each running from -2 to 2; labeled points: beer, wine, cocktail, spoon, fork, knife, spaghetti, pasta, lasagna]
                    x      y
     spaghetti     1.0    1.5
     pasta         1.2    1.3
     …             …      …
     fork          0.0   -0.7
     spoon        -0.5   -1.5
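
Once words are points in a plane, "similar" can be measured as a distance or an angle. The sketch below uses only the four coordinate pairs listed on slide 5; the helper names and the choice of Euclidean distance and cosine similarity are assumptions for illustration, not something the slides specify.

```python
import math

# Coordinates copied from the table on slide 5.
embeddings = {
    "spaghetti": (1.0, 1.5),
    "pasta": (1.2, 1.3),
    "fork": (0.0, -0.7),
    "spoon": (-0.5, -1.5),
}


def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(word, embeddings):
    """Return the other word whose point is closest to `word`."""
    others = (w for w in embeddings if w != word)
    return min(others,
               key=lambda w: euclidean_distance(embeddings[word], embeddings[w]))


for word in embeddings:
    print(word, "->", nearest(word, embeddings))
```

With these four points, the food words end up nearest to each other and the utensils nearest to each other, which one-hot vectors cannot express.
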
  6. Postulate #1: “You shall know a word by the company it keeps.” (J.R. Firth, 1957)
  7. spaghetti   followed    1
     spaghetti   by          1
     …           …           …
     spaghetti   sauce       1
     spaghetti   we          0
     spaghetti   parking     0
     …           …           …
     spaghetti   sushi       0
  8. with        by          1
     with        the         1
     …           …           …
     with        and         1
     with        appetizer   0
     with        loud        0
     …           …           …
     with        up          0
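
The rows on slides 7 and 8 look like (focus word, other word, label) triples: label 1 when the other word appears near the focus word, label 0 when it does not. Here is a hedged sketch of how such triples could be generated. The window size of 3 is inferred from the label-1 rows, and drawing label-0 words at random resembles word2vec-style negative sampling; neither is confirmed by the transcript. The function names are hypothetical.

```python
import random


def positive_pairs(tokens, window=3):
    """Pair each focus word with every word within `window` positions (label 1)."""
    pairs = []
    for i, focus in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((focus, tokens[j], 1))
    return pairs


def negative_pairs(positives, vocabulary, per_focus=3, seed=0):
    """Pair each focus word with random words it was never seen near (label 0)."""
    rng = random.Random(seed)
    seen = {}
    for focus, context, _ in positives:
        seen.setdefault(focus, set()).add(context)
    pairs = []
    for focus in sorted(seen):
        candidates = [w for w in vocabulary
                      if w != focus and w not in seen[focus]]
        for context in rng.sample(candidates, min(per_focus, len(candidates))):
            pairs.append((focus, context, 0))
    return pairs


tokens = ("we started with the scallop dish as an appetizer followed by the "
          "spaghetti with tomato sauce and duck and foie gras ravioli").split()

# "parking", "sushi", "loud" and "up" are label-0 words from the slides; here
# they stand in for the rest of a larger corpus vocabulary (an assumption).
vocabulary = sorted(set(tokens) | {"parking", "sushi", "loud", "up"})

positives = positive_pairs(tokens)
negatives = negative_pairs(positives, vocabulary)
print([p for p in positives if p[0] == "spaghetti"])
```

With a window of 3, "spaghetti" is paired with "followed", "by", "the", "with", "tomato" and "sauce", which agrees with the label-1 rows shown for it.
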
  9. “Sigmoid” Activation Function: $\sigma(z) = \frac{1}{1 + e^{-z}}$
     [Plot: Activation Value as a function of Weighted Input]
  10. “Sigmoid” Activation Function: $\sigma(z) = \frac{1}{1 + e^{-z}}$
      [Plot: Activation Value as a function of Weighted Input, with the value 0.88 marked]
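
The sigmoid on slides 9 and 10 squashes any weighted input into a value between 0 and 1. A minimal sketch follows; the two-branch form is a standard trick to avoid overflow for large negative inputs and is an implementation choice here, not something from the slides.

```python
import math


def sigmoid(z):
    """Compute 1 / (1 + e^(-z)) without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, rewrite as e^z / (1 + e^z) so exp() never overflows.
    e = math.exp(z)
    return e / (1.0 + e)


for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 2))
```

A weighted input of 2 gives roughly 0.88, the value marked on slide 10. The input used on the slide is not recoverable from the transcript, so that pairing is an assumption.
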
  11. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
  12. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  13. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  14. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  15. “Loss” Function: $L(\hat{y}) = -\ln(\hat{y})$ (right answer: 1)
      [Plot: Penalty as a function of Model Prediction]
  16. “Loss” Function: $L(\hat{y}) = -\ln(1 - \hat{y})$ (right answer: 0)
      [Plot: Penalty as a function of Model Prediction]
  17. “Loss” Function: $L(\hat{y}) = -\ln(\hat{y})$ (right answer: 1)
      [Plot: Penalty as a function of Model Prediction]
  18. “Loss” Function: $L(\hat{y}) = -\ln(\hat{y})$ (right answer: 1)
      [Plot: Penalty as a function of Model Prediction, with the point at prediction 0.51, penalty 0.67 marked]
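
The two loss formulas on slides 15 and 16 can be combined into one small function. This is a sketch under the assumption that the "right answer" is always exactly 0 or 1; the clamping constant `eps` is an addition to keep the logarithm finite and does not come from the slides.

```python
import math


def loss(y_hat, right_answer, eps=1e-12):
    """Penalty for predicting y_hat when the right answer is 0 or 1."""
    # Clamp so that log(0) can never occur (an implementation safeguard).
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    if right_answer == 1:
        return -math.log(y_hat)        # L = -ln(y_hat)
    return -math.log(1.0 - y_hat)      # L = -ln(1 - y_hat)


for y_hat in (0.01, 0.51, 0.99):
    print(y_hat, round(loss(y_hat, 1), 2), round(loss(y_hat, 0), 2))
```

A prediction of 0.51 with right answer 1 gives a penalty of about 0.67, matching the pair of numbers on slide 18. The penalty shrinks toward 0 as the prediction approaches the right answer and grows without bound as it approaches the wrong one.
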
  19. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  20. “Loss” Function: $L(\hat{y}) = -\ln(\hat{y})$ (right answer: 1)
      [Plot: Penalty as a function of Model Prediction, with the point at prediction 0.51, penalty 0.67 marked]
  21. “Loss” Function: $L(\hat{y}) = -\ln(\hat{y})$, with slope $\frac{\partial L(\hat{y})}{\partial \hat{y}} = -\frac{1}{\hat{y}}$ (right answer: 1)
      [Plot: Penalty as a function of Model Prediction, with the point at prediction 0.51, penalty 0.67 marked]
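
The slope on slide 21 tells us which way to move the prediction to reduce the penalty. The sketch below checks the formula against a finite-difference estimate and takes one small step against the slope. The step size of 0.1 is a hypothetical value, and nudging the prediction directly is a simplification: a real model nudges its weights, which in turn change the prediction.

```python
import math


def loss(y_hat):
    """Penalty when the right answer is 1: L = -ln(y_hat)."""
    return -math.log(y_hat)


def loss_slope(y_hat):
    """Slope of the penalty with respect to the prediction: -1 / y_hat."""
    return -1.0 / y_hat


def numeric_slope(f, x, h=1e-6):
    """Central finite-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


y_hat = 0.51
slope = loss_slope(y_hat)

# A negative slope means the penalty falls as the prediction rises,
# so stepping against the slope raises the prediction.
step_size = 0.1  # hypothetical value chosen for illustration
nudged = y_hat - step_size * slope

print("analytic slope:", slope)
print("numeric slope: ", numeric_slope(loss, y_hat))
print("penalty before:", loss(y_hat), "after:", loss(nudged))
```
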
  22. It works! Now…
      1. How do we know which direction to nudge a weight?
      2. How can we calculate this automatically for all the weights?
  23. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
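
The four steps can be put together into a tiny training loop. This is a sketch under stated assumptions, not the model from the talk: each word gets a 2-dimensional focus vector and a 2-dimensional context vector, the prediction is the sigmoid of their dot product (similar in spirit to word2vec's skip-gram with negative sampling), and the learning rate, epoch count and random seed are arbitrary. The training triples are the rows visible on slide 7.

```python
import math
import random


def sigmoid(z):
    """Overflow-safe 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(focus_vec, context_vec):
    """Predicted probability that the two words appear near each other."""
    return sigmoid(sum(f * c for f, c in zip(focus_vec, context_vec)))


def train(pairs, dim=2, learning_rate=0.1, epochs=500, seed=0):
    rng = random.Random(seed)
    focus_vecs, context_vecs = {}, {}
    for focus, context, _ in pairs:
        focus_vecs.setdefault(
            focus, [rng.uniform(-0.5, 0.5) for _ in range(dim)])
        context_vecs.setdefault(
            context, [rng.uniform(-0.5, 0.5) for _ in range(dim)])
    for _ in range(epochs):
        for focus, context, label in pairs:
            f, c = focus_vecs[focus], context_vecs[context]
            # 1. Make a prediction.
            y_hat = predict(f, c)
            # 2. Measure how wrong we are. For both loss formulas, the slope
            #    of the penalty with respect to the weighted input is
            #    y_hat - label.
            error = y_hat - label
            # 3. Tune each weight a little, against its slope.
            for k in range(dim):
                f_k, c_k = f[k], c[k]
                f[k] = f_k - learning_rate * error * c_k
                c[k] = c_k - learning_rate * error * f_k
            # 4. Repeat with the next context clue (next loop iteration).
    return focus_vecs, context_vecs


pairs = [
    ("spaghetti", "followed", 1), ("spaghetti", "by", 1),
    ("spaghetti", "sauce", 1), ("spaghetti", "we", 0),
    ("spaghetti", "parking", 0), ("spaghetti", "sushi", 0),
]
focus_vecs, context_vecs = train(pairs)
for focus, context, label in pairs:
    y_hat = predict(focus_vecs[focus], context_vecs[context])
    print(focus, context, label, round(y_hat, 2))
```
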
  24. Context clues trained: 1
      “amazing” as focus word: 0
        topics discerning masked sweets carmelized shelly cue prepare
      “server” as focus word: 0
        cheerful succulent adjusting pop antenna suggesting vinegary brothers
      “spaghetti” as focus word: 1
        ignorant sop refrigerators bags recliner introduce covered petco
  25. Context clues trained: 2,000,000
      “amazing” as focus word: 1,854
        awesome delicious super here ) $ customer excellent
      “server” as focus word: 780
        thru along crab tacos / windows chef 1
      “spaghetti” as focus word: 84
        dollar rings loves = opened wrapped form provided
  26. Context clues trained: 100,000,000
      “amazing” as focus word: 87,864
        incredible awesome outstanding excellent phenomenal fabulous superb fantastic
      “server” as focus word: 48,492
        waiter waitress bartender hostess guide technician cashier barista
      “spaghetti” as focus word: 3,600
        risotto veal katsu goat turkey enchilada raspberry meatloaf
  27. $\text{bun} - \text{american} + \text{mexican} \approx \text{tortilla}$
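
The word arithmetic on slide 27 amounts to adding and subtracting vectors, then looking up the nearest remaining word. The sketch below uses hypothetical 3-dimensional toy vectors invented purely for illustration (real embeddings are learned and have many more dimensions); the function name `analogy` is also an assumption.

```python
import math

# Hypothetical toy vectors, invented for this example only.
# Rough meaning of the axes: American-ness, Mexican-ness, bread-wrapper-ness.
vectors = {
    "american": (1.0, 0.0, 0.0),
    "mexican": (0.0, 1.0, 0.0),
    "bun": (1.0, 0.0, 1.0),
    "tortilla": (0.1, 0.9, 1.0),
    "burger": (0.9, 0.1, 0.3),
    "salsa": (0.1, 0.9, -0.2),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word closest to a - b + c, excluding a, b and c."""
    target = tuple(x - y + z
                   for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates,
               key=lambda w: cosine_similarity(target, vectors[w]))


print(analogy("bun", "american", "mexican", vectors))
```

Excluding the three input words from the candidates is a common convention for this kind of lookup, since the result vector often stays close to one of its inputs.
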
  28. No magical black box AI… Just context clues and some arithmetic! Bonus: now you know the fundamentals of all neural network learning.