Marco Bonzanini
September 27, 2018

An Introduction to Natural Language Generation in Python

Presented at the London Python meet-up, September 2018:
https://www.meetup.com/LondonPython/events/254408773/

Title:
Let the AI Do the Talk: Adventures with Natural Language Generation

Abstract:
Recent advances in Artificial Intelligence have shown how computers can compete with humans in a variety of mundane tasks, but what happens when creativity is required?

This talk introduces the concept of Natural Language Generation, the task of automatically generating text, for example articles on a particular topic, poems that follow a particular style, or speech transcripts that express some attitude. Specifically, we'll discuss the case for Recurrent Neural Networks, a family of algorithms that can be trained on sequential data, and how they improve on traditional language models.

The talk is for beginners: we'll focus more on the intuitions behind the algorithms and their practical implications, and less on the mathematical details. Practical examples with Python will showcase Keras, a library to quickly prototype deep learning architectures.

Transcript

  1. Let the AI do the Talk: Adventures with Natural Language Generation
    @MarcoBonzanini, London Python meet-up // September 2018
  2. • Sept 2016: Intro to NLP
    • Sept 2017: Intro to Word Embeddings
    • Sept 2018: Intro to NLG
    • Sept 2019: ???
  3. Infinite Monkey Theorem

        from random import choice
        from string import printable

        def monkey_hits_keyboard(n):
            output = [choice(printable) for _ in range(n)]
            print("The monkey typed:")
            print(''.join(output))
  4. Infinite Monkey Theorem

        >>> monkey_hits_keyboard(30)
        The monkey typed:
        % a9AK^YKx OkVG)u3.cQ,31("!ac%
        >>> monkey_hits_keyboard(30)
        The monkey typed:
        fWE,ou)cxmV2IZ l}jSV'XxQ**9'|
  5. n-grams

        >>> from nltk import ngrams
        >>> list(ngrams("pizza", 3))
        [('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]
  6. n-grams

        >>> from nltk import ngrams
        >>> list(ngrams("pizza", 3))
        [('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]

    character-based trigrams
  7. n-grams

        >>> s = "The quick brown fox".split()
        >>> list(ngrams(s, 2))
        [('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
  8. n-grams

        >>> s = "The quick brown fox".split()
        >>> list(ngrams(s, 2))
        [('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]

    word-based bigrams
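The sliding-window idea behind the two n-gram examples above can also be sketched in plain Python, for readers without NLTK installed. The helper name `simple_ngrams` is hypothetical and not part of NLTK; it only mimics the basic behaviour shown on the slides.

```python
def simple_ngrams(sequence, n):
    """Return all contiguous n-item tuples from a string or list."""
    items = list(sequence)
    # zip over n shifted copies of the sequence; zip stops at the shortest one
    return list(zip(*(items[i:] for i in range(n))))


char_trigrams = simple_ngrams("pizza", 3)
word_bigrams = simple_ngrams("The quick brown fox".split(), 2)
print(char_trigrams)
print(word_bigrams)
```

The same function covers both cases because a string is just a sequence of characters and a split sentence is a sequence of words.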
  9. From n-grams to Language Model
    • Given a large dataset of text
    • Find all the n-grams
    • Compute probabilities, e.g. count bigrams
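The formula from this slide did not survive in the transcript. As a rough sketch, a common way to turn bigram counts into probabilities is the relative-frequency estimate P(next | previous) = count(previous, next) / count(previous). The function name and the toy corpus below are assumptions made for illustration, not taken from the talk.

```python
from collections import Counter, defaultdict


def bigram_probabilities(tokens):
    """Estimate P(next | previous) from raw bigram counts."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only where it has a successor, so the
    # probabilities for every previous word sum to 1.
    prev_counts = Counter(tokens[:-1])
    probs = defaultdict(dict)
    for (prev, nxt), count in pair_counts.items():
        probs[prev][nxt] = count / prev_counts[prev]
    return dict(probs)


# Toy corpus (hypothetical); a real model needs a large dataset of text.
corpus = "the cat sat on the mat the cat ran".split()
model = bigram_probabilities(corpus)
print(model["the"])
```

Here "the" is followed by "cat" twice and by "mat" once, so the estimate gives "cat" two thirds of the probability mass.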
  10. Example: Predictive Text in Mobile
    Marco is a good time to get the latest flash player is required for video playback is unavailable right now because this video is not sure if you have a great day.
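Text like the predictive-keyboard example above can be produced by repeatedly picking a next word given only the previous one. The following is a minimal sketch of that idea with a bigram model; the function names and the toy corpus are hypothetical, and a phone keyboard will use a far larger model.

```python
import random
from collections import Counter, defaultdict


def build_successors(tokens):
    """Map each word to a Counter of the words seen right after it."""
    successors = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        successors[prev][nxt] += 1
    return successors


def generate(successors, seed, length, rng):
    """Generate up to `length` words, sampling each from the bigram counts."""
    words = [seed]
    for _ in range(length - 1):
        options = successors.get(words[-1])
        if not options:
            break  # dead end: the last word was never seen with a successor
        candidates = list(options)
        weights = [options[w] for w in candidates]
        words.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(words)


# Toy corpus (hypothetical)
corpus = ("this video is not available right now because "
          "this video is required for video playback").split()
successors = build_successors(corpus)
text = generate(successors, "this", 8, random.Random(0))
print(text)
```

Every pair of adjacent words in the output occurs in the corpus, yet the sentence as a whole can drift, which is the "local context only" limitation discussed next.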
  11. Limitations of LM so far
    • P(word | full history) is too expensive
    • P(word | previous few words) is feasible
    • … Local context only! Lack of global context
  12. Neural Networks
    [Diagram: input layer (x1, x2), hidden layer(s) (h1, h2, h3), output layer (y1)]
  13. Training the Network
    • Random weight init
    • Run input through the network
    • Compute error (loss function)
    • Use error to adjust weights (gradient descent + back-propagation)
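The four steps above can be sketched for the simplest possible "network": a single linear neuron fitted with gradient descent in plain Python. With one layer the gradients can be written by hand; back-propagation generalises the same idea to several layers. The function name, data and hyperparameters are assumptions chosen for illustration.

```python
import random


def train(data, learning_rate=0.05, epochs=500, seed=0):
    rng = random.Random(seed)
    # 1. Random weight initialisation
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            # 2. Run the input through the "network"
            pred = w * x + b
            # 3. Compute the error (the loss here is the mean squared error)
            error = pred - y
            # Accumulate the gradient of the loss w.r.t. w and b
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # 4. Use the error to adjust the weights (gradient descent step)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1
data = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0]]
w, b = train(data)
print(round(w, 2), round(b, 2))
```

On this toy data the learned weight and bias should end up close to 2 and 1, the values used to generate it.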
  14. More on Training
    • Batch size
    • Iterations and Epochs
    • e.g. 1,000 data points, if batch size = 100 we need 10 iterations to complete 1 epoch
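The arithmetic in the example above can be written as a one-line helper. The function name is hypothetical; the ceiling is an assumption that a final, smaller batch still counts as one iteration.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_samples / batch_size)


print(iterations_per_epoch(1000, 100))  # 10, as in the slide's example
```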
  15. Deep Learning in Python
    • Some NN support in scikit-learn
    • Many low-level frameworks: Theano, PyTorch, TensorFlow
    • … Keras!
    • Probably more
  16. Keras
    • Simple, high-level API
    • Uses TensorFlow, Theano or CNTK as backend
    • Runs seamlessly on GPU
    • Easier to start with
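  17. LSTM Example
    Define the network

        # Imports assume the standalone Keras package
        from keras.models import Sequential
        from keras.layers import Dense, LSTM

        # maxlen and chars are not defined on this slide
        model = Sequential()
        model.add(LSTM(128, input_shape=(maxlen, len(chars))))
        model.add(Dense(len(chars), activation='softmax'))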
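The slide does not show where `maxlen` and `chars` come from. One plausible setup, sketched here in plain Python, is a character-level model where each training example is a window of `maxlen` characters, one-hot encoded over the character vocabulary. The toy text, the `step` value and the helper `one_hot` are assumptions; in practice the encoded data would be NumPy arrays.

```python
text = "hello world, hello python"  # toy text (hypothetical)
maxlen = 5   # characters per input window (assumed value)
step = 2     # stride between consecutive windows (assumed value)

chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}
indices_char = {i: c for i, c in enumerate(chars)}


def one_hot(index, size):
    """Return a list of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


# Each window of maxlen characters is paired with the character that follows it
windows, next_chars = [], []
for start in range(0, len(text) - maxlen, step):
    windows.append(text[start:start + maxlen])
    next_chars.append(text[start + maxlen])

# x has shape (num_windows, maxlen, len(chars)); y has shape (num_windows, len(chars))
x = [[one_hot(char_indices[c], len(chars)) for c in w] for w in windows]
y = [one_hot(char_indices[c], len(chars)) for c in next_chars]
print(len(x), len(x[0]), len(x[0][0]))
```

Each example in `x` then has the shape `(maxlen, len(chars))` expected by the `input_shape` argument, and each target in `y` has one entry per character, matching the size of the softmax output.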
  18. LSTM Example
    Generate text

        for i in range(output_size):
            ...
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            generated += next_char
  19. LSTM Example
    Seed text

        for i in range(output_size):
            ...
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            generated += next_char
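The body of `sample` is not shown on the slides. A minimal sketch follows, assuming that `diversity` acts as a softmax temperature, as is common in character-level text generators; this is an illustration in plain Python, not the speaker's implementation, and the `rng` parameter is added only to make the example repeatable.

```python
import math
import random


def sample(preds, temperature=1.0, rng=random):
    """Draw an index from `preds` after rescaling by temperature."""
    # Low temperature sharpens the distribution (safer choices);
    # high temperature flattens it (more surprising choices).
    logits = [math.log(max(p, 1e-12)) / temperature for p in preds]
    highest = max(logits)
    # Subtract the maximum before exponentiating for numerical stability
    weights = [math.exp(value - highest) for value in logits]
    return rng.choices(range(len(preds)), weights=weights)[0]


preds = [0.1, 0.7, 0.2]  # toy predicted distribution (hypothetical)
rng = random.Random(42)
picks = [sample(preds, temperature=0.2, rng=rng) for _ in range(1000)]
print(picks.count(1) / len(picks))
```

At a low temperature such as 0.2 almost every draw is the most likely index, while values above 1.0 spread the draws more evenly across all indices.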
  20. Sample Output: seed text, then generated text after 1 epoch
    are the glories it included. Now am I lrA to r ,d?ot praki ynhh kpHu ndst -h ahh umk,hrfheleuloluprffuamdaedospe aeooasak sh frxpaphrNumlpAryoaho (…)
  21. Sample Output: after ~5 epochs
    I go from thee: Bear me forthwitht wh, t che f uf ld,hhorfAs c c ff.h scfylhle, rigrya p s lee rmoy, tofhryg dd?ofr hl t y ftrhoodfe- r Py (…)
  22. Sample Output: after 20+ epochs
    a wild-goose flies, Unclaim'd of any manwecddeelc uavekeMw gh whacelcwiiaeh xcacwiDac w fioarw ewoc h feicucra h,h, :ewh utiqitilweWy ha.h pc'hr, lagfh eIwislw ofiridete w laecheefb .ics,aicpaweteh fiw?egp t? (…)
  23. Tuning: after 1 epoch
    Wyr feirm hat. meancucd kreukk? , foremee shiciarplle. My, Bnyivlaunef sough bus: Wad vomietlhas nteos thun. lore orain, Ty thee I Boe, I rue. niat
  24. Tuning: much later
    Second Lord: They would be ruled after this chamber, and my fair nues begun out of the fact, to be conveyed, Whose noble souls I'll have the heart of the wars.
    Clown: Come, sir, I will make did behold your worship.
    http://karpathy.github.io/2015/05/21/rnn-effectiveness/
  25. A Couple of Tips
    • You’ll need a GPU
    • Develop locally on very small dataset then run on cloud on real data
    • At least 1M characters in input, at least 20 epochs for training
    • model.save() !!!
  26. Summary
    • Natural Language Generation is fun
    • Simple models vs. Neural Networks
    • Keras makes your life easier
    • A lot of trial-and-error!
  27. Readings & Credits
    • Brandon Rohrer on "Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)": https://www.youtube.com/watch?v=WCUNPb-5EYI
    • Chris Olah on "Understanding LSTM Networks": http://colah.github.io/posts/2015-08-Understanding-LSTMs/
    • Andrej Karpathy on "The Unreasonable Effectiveness of Recurrent Neural Networks": http://karpathy.github.io/2015/05/21/rnn-effectiveness/
    Pics:
    • Weather forecast icon: https://commons.wikimedia.org/wiki/File:Newspaper_weather_forecast_-_today_and_tomorrow.svg
    • Stack of papers icon: https://commons.wikimedia.org/wiki/File:Stack_of_papers_tied.svg
    • Document icon: https://commons.wikimedia.org/wiki/File:Document_icon_(the_Noun_Project_27904).svg
    • News icon: https://commons.wikimedia.org/wiki/File:PICOL_icon_News.svg
    • Cortana icon: https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Microsoft_Cortana_light.svg/1024px-Microsoft_Cortana_light.svg.png
    • Siri icon: https://commons.wikimedia.org/wiki/File:Siri_icon.svg
    • Google assistant icon: https://commons.wikimedia.org/wiki/File:Google_mic.svg