Building smarter apps with machine learning, from magic to reality

Laurent Picard
December 06, 2023

“Any sufficiently advanced technology is indistinguishable from magic.”
— Arthur C Clarke

Well, machine learning can look like magic, but you don't need to be a data scientist or an ML researcher to develop with ML.

So, what about making your solution smarter without any knowledge of AI? With pre-trained models and a few lines of code, machine learning APIs can analyze your data. Moreover, AutoML techniques can now help you get more specific insights tailored to your needs.

In this session, you’ll see how to transform or extract information from text, image, audio & video with the latest ML APIs and how to train an AutoML custom model, and you’ll take an active part in a live demo. Don't put your smartphone in airplane mode!

Transcript

  1. Building smarter solutions
    with no expertise
    in machine learning
    Laurent Picard
    @PicardParis
    Google Amsterdam
    December 6, 2023

  2. Hey! I'm Laurent!
    Laurent Picard ‒ @PicardParis
    ○ Developer Advocate ‒ Google Cloud
    ○ Applied AI, Serverless, Python
    Previous lives
    ○ CTO, cofounder of Bookeen
    ○ Ebook pioneer (17 years)
    ○ Educational products (CD-ROMs)

  3. Any sufficiently advanced technology
    is indistinguishable
    from magic
    — Arthur C. Clarke

  4. @PicardParis
    What is machine learning (for me)?
    Data
    Information

  5. @PicardParis
    What is machine learning, for real?
    Artificial Intelligence (make machines "intelligent")
    Machine Learning (learn from data)
    Deep Learning (using neural networks)
    Generative AI
    (create new content)

  6. @PicardParis
    How does deep learning work?
    How
    Using
    many examples
    to find answers
    Result
    Solving problems
    without explicitly
    knowing the answer
    Origin
    Trying to mimic
    how (we think)
    our brain works

  7. @PicardParis
    Why is machine learning now possible?
    Theory Data
    Computing
    ML

  8. @PicardParis
    Four ways we can build with ML in 2023
    Building blocks (Focus on Dev ↔ Focus on ML):
    - ML APIs: ready-to-use models
    - AutoML: customized models
    - ML: data & neural networks
    - Gen AI: generative models

  9. 01
    Machine
    Learning
    APIs
    Ready-to-use models

  10. @PicardParis
    Ready-to-use models
    - Vision API: Image → Info
    - Video Intelligence API: Video → Info
    - Natural Language API: Text → Info
    - Translation API: Text → Translation
    - Speech-To-Text API: Speech → Text
    - Text-To-Speech API: Text → Speech

  11. @PicardParis
    Generative AI
    Text
    Prompt → Text
    - Chat
    - Summarization
    - Classification
    - Extraction
    - Writing/ideation
    Image → Info
    - Image captioning
    Image+Q → Info
    - Visual Q & A
    Text → App
    - Generative AI agent
    - Enterprise Search
    - Recommendations
    Image
    Prompt → Code
    - Code generation
    - Code completion
    Vertex AI
    > Generative AI Studio
    > Search and Conversation
    Speech
    Prompt → Image
    - Image generation
    + Prompt → Image
    - Image editing

  12. @PicardParis
    Vision API
    Extract information from images

  13. @PicardParis
    Computer vision before ML
    Photo by Shaun Jeffers: hobbitontours.com Edge detection with Sobel convolution filter
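
Before ML, edge detection relied on hand-written filters such as the Sobel operator named on this slide. The following is a minimal pure-Python sketch of that technique, assuming a tiny grayscale image given as a list of rows. The function name `sobel_magnitude` and the sample image are illustrative and do not come from the talk.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (GX) and vertical (GY) gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for each interior pixel (borders stay 0)."""
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            result[y][x] = math.hypot(gx, gy)
    return result


# Hypothetical 5x5 image: dark columns on the left, bright columns on the right.
image = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

The filter responds strongly where brightness changes (the vertical edge) and returns zero in uniform areas, but it says nothing about what the image contains, which is the gap the Vision API fills.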

  14. @PicardParis
    Label detection
    Photo by Shaun Jeffers: hobbitontours.com
    "labelAnnotations": [
      {
        "description": "Nature",
        "mid": "/m/05h0n",
        "score": 0.9516123
      },
      {
        "description": "Flower",
        "mid": "/m/0c9ph5",
        "score": 0.91467637
      },
      {
        "description": "Garden",
        "mid": "/m/0bl0l",
        "score": 0.903375
      },
      …
    ]
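
A response like this one is plain JSON, so it can be post-processed with a few lines of code. Here is a small sketch, assuming the response has already been parsed into a dictionary. The helper name `confident_labels` and the 0.91 threshold are illustrative choices, not part of the API.

```python
# Label annotations copied from the response shown on this slide.
response = {
    "labelAnnotations": [
        {"description": "Nature", "mid": "/m/05h0n", "score": 0.9516123},
        {"description": "Flower", "mid": "/m/0c9ph5", "score": 0.91467637},
        {"description": "Garden", "mid": "/m/0bl0l", "score": 0.903375},
    ]
}


def confident_labels(response, min_score):
    """Return the descriptions of labels scoring at least `min_score` (hypothetical helper)."""
    labels = response.get("labelAnnotations", [])
    return [label["description"] for label in labels if label["score"] >= min_score]


print(confident_labels(response, 0.91))
```

Raising or lowering the threshold trades fewer, safer labels against broader coverage.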

  15. @PicardParis
    Photo by Dominic Monaghan (Instagram)
    Object detection
    "localizedObjectAnnotations": [
      {
        "boundingPoly": {…},
        "mid": "/m/01g317",
        "name": "Person",
        "score": 0.90216154
      },
      {
        "boundingPoly": {…},
        "mid": "/m/01g317",
        "name": "Person",
        "score": 0.88069034
      },
      {
        "boundingPoly": {…},
        "mid": "/m/01g317",
        "name": "Person",
        "score": 0.86947715
      },
      …
    ]

  16. @PicardParis
    Rendering by Elendil:
    www.zbrushcentral.com/printthread.php?t=45397
    Face detection
    "faceAnnotations": [{
      "detectionConfidence": 0.93634903,
      "boundingPoly": {…},
      "fdBoundingPoly": {…},
      "landmarkingConfidence": 0.18798567,
      "landmarks": [{
        "type": "LEFT_EYE",
        "position": {…}
      },…],
      "panAngle": -1.7626401,
      "rollAngle": 7.024975,
      "tiltAngle": 9.038818,
      "angerLikelihood": "LIKELY",
      "joyLikelihood": "VERY_UNLIKELY",
      "sorrowLikelihood": "VERY_UNLIKELY",
      "surpriseLikelihood": "VERY_UNLIKELY",
      "headwearLikelihood": "VERY_UNLIKELY",
      "blurredLikelihood": "VERY_UNLIKELY",
      "underExposedLikelihood": "VERY_UNLIKELY"
    }]
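
The emotion fields are ordered likelihood levels rather than numeric scores, so comparing them requires a ranking. The sketch below assumes the face annotation has been parsed into a dictionary. The helper `dominant_emotion` and its default `POSSIBLE` cut-off are illustrative assumptions.

```python
# Likelihood levels used by the Vision API, from least to most likely.
LIKELIHOODS = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]

# Emotion fields copied from the face annotation shown on this slide.
face = {
    "angerLikelihood": "LIKELY",
    "joyLikelihood": "VERY_UNLIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
    "surpriseLikelihood": "VERY_UNLIKELY",
}


def dominant_emotion(face, minimum="POSSIBLE"):
    """Return the most likely emotion, or None if none reaches `minimum` (hypothetical helper)."""
    emotions = ("anger", "joy", "sorrow", "surprise")
    ranked = {
        emotion: LIKELIHOODS.index(face.get(f"{emotion}Likelihood", "UNKNOWN"))
        for emotion in emotions
    }
    best = max(ranked, key=ranked.get)
    return best if ranked[best] >= LIKELIHOODS.index(minimum) else None


print(dominant_emotion(face))
```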

  17. @PicardParis
    Screenshot from Goodreads: goodreads.com/quotes/4454
    Text detection
    "fullTextAnnotation": {
    "text": "
    J.R.R. Tolkien > Quotes > Quotable Quote
    \"Three Rings for the Elven-kings under the…
    Seven for the Dwarf-lords in their halls of…
    Nine for Mortal Men, doomed to die,
    One for the Dark Lord on his dark throne
    In the Land of Mordor where the Shadows lie.
    One Ring to rule them all, One Ring to find…
    One Ring to bring them all and in the darkn…
    In the Land of Mordor where the Shadows lie.\"
    - J. R. R. Tolkien, The Lord of the Rings
    "
    }

  18. @PicardParis
    Screenshot from Goodreads: goodreads.com/quotes/4454
    Text detection
    "fullTextAnnotation": {
    "text": "
    J.R.R. Tolkien > Quotes > Quotable Quote
    \"Three Rings for the Elven-kings under the…
    Seven for the Dwarf-lords in their halls of…
    Nine for Mortal Men, doomed to die,
    One for the Dark Lord on his dark throne
    In the Land of Mordor where the Shadows lie.
    One Ring to rule them all, One Ring to find…
    One Ring to bring them all and in the darkn…
    In the Land of Mordor where the Shadows lie.\"
    - J. R. R. Tolkien, The Lord of the Rings
    "
    }

  19. @PicardParis
    Tolkien handwriting: pinterest.com/pin/145311525456602832
    Handwriting detection
    "fullTextAnnotation": {
    "text": "
    The Lord of
    the Rings.
    Three Rings for the Elven-kings under the sky,
    Seven for the Dwarf-lords in their halls of…
    Nine for Mortal Men doomed to die,
    One for the Dark Lord on his dark throne
    In the Land of Mordor where the shadows lie.
    One Ring to rule them all, One Ring to find…
    One Ring to bring them all and in the shadows…
    In the Land of Mordor where the shadows lie\".
    "
    }

  20. @PicardParis
    Landmark detection
    "landmarkAnnotations": [
      {
        "boundingPoly": {…},
        "description": "Hobbiton Movie Set",
        "locations": [
          {
            "latLng": {
              "latitude": -37.8723441,
              "longitude": 175.6833613
            }
          }
        ],
        "mid": "/m/012r3jqg",
        "score": 0.61243546
      }
    ]
    Original photo by Shaun Jeffers: hobbitontours.com

  21. @PicardParis
    Web entity detection and image matching
    "webDetection": {
      "bestGuessLabels": [
        {
          "label": "jrr tolkien",
          "languageCode": "es"
        }
      ],
      "webEntities": [
        {
          "entityId": "/m/041h0",
          "score": 14.976,
          "description": "J. R. R. Tolkien"
        },…
      ],
      "partialMatchingImages": [
        {
          "url": "http://e00-elmundo.uecdn.es/…jpg"
        },…
      ],
      "pagesWithMatchingImages": […],
      "visuallySimilarImages": […]
    }
    Photo by Bill Potter: elmundo.es/cultura/2017/08/11/598c81b6e2704ebf238b469e.html

  22. @PicardParis
    Client libraries
    from google.cloud import vision
    uri_base = 'gs://cloud-vision-codelab'
    pics = ('face_surprise.jpg', 'face_no_surprise.png')
    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    for pic in pics:
        image.source.image_uri = f'{uri_base}/{pic}'
        response = client.face_detection(image=image)
        for face in response.face_annotations:
            likelihood = vision.Likelihood(face.surprise_likelihood)
            vertices = [f'({v.x},{v.y})' for v in face.bounding_poly.vertices]
            print(f'Face surprised: {likelihood.name}')
            print(f'Face bounds: {",".join(vertices)}')
    Python package: pypi.org/project/google-cloud-vision
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-vision-api-python

  23. Demo – Vision API

  24. @PicardParis
    Video Intelligence API
    Extract information from videos

  25. @PicardParis
    Video Intelligence API
    Label Detection
    Detect entities within
    the video, such as "dog",
    "flower" or "car".
    Enable Video Search
    Search your video
    catalog the same way
    you search text
    documents.
    Insights from Videos
    Extract actionable
    insights from video files
    without requiring any
    machine learning or
    computer vision
    knowledge.
    More…
    Detect sequences
    Detect and track objects
    Detect explicit content
    Transcribe speech
    + OCR, logo, face,
    person detection, pose
    estimation…
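
Searching a video catalog "the same way you search text documents" comes down to indexing the detected labels. The sketch below shows one way to do it with an inverted index. The video URIs, the labels, and the helper names `build_index` and `search` are all hypothetical, and the label lists stand in for results that would come from the API.

```python
from collections import defaultdict

# Hypothetical label-detection results: video URI -> labels detected in it.
video_labels = {
    "gs://my-bucket/garden.mp4": ["flower", "dog"],
    "gs://my-bucket/road.mp4": ["car", "dog"],
}


def build_index(video_labels):
    """Build an inverted index mapping each label to the videos containing it."""
    index = defaultdict(set)
    for video, labels in video_labels.items():
        for label in labels:
            index[label.lower()].add(video)
    return index


def search(index, query):
    """Return the sorted list of videos in which `query` was detected."""
    return sorted(index.get(query.lower(), set()))


index = build_index(video_labels)
print(search(index, "dog"))
print(search(index, "car"))
```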

  26. Demo - Video Intelligence API

  27. Demo – Video Intelligence API

  28. @PicardParis
    Client libraries
    from google.cloud import videointelligence
    def track_objects(video_uri, segments=None):
        video_client = videointelligence.VideoIntelligenceServiceClient()
        features = [videointelligence.Feature.OBJECT_TRACKING]
        context = videointelligence.VideoContext(segments=segments)
        print(f'Processing video "{video_uri}"...')
        # annotate_video starts a long-running operation
        operation = video_client.annotate_video(
            request={'input_uri': video_uri,
                     'features': features,
                     'video_context': context})
        # Block until the annotation results are available
        return operation.result()
    Python package: pypi.org/project/google-cloud-videointelligence
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-video-intelligence-python3

  29. @PicardParis
    Natural Language API
    Extract information from text

  30. @PicardParis
    Syntax analysis
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.

  31. @PicardParis
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.
    {
    "language": "en"
    }
    Syntax analysis

  32. @PicardParis
    Entity detection
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.

  33. @PicardParis
    Entity detection
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.

  34. @PicardParis
    Entity detection
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.
    {
      "name": "British",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/07ssc",
        "wikipedia_url": "https://en.wikipedia.org/wiki/United_Kingdom"
      }
    }
    {
      "name": "Tolkien",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/041h0",
        "wikipedia_url": "https://en.wikipedia.org/wiki/J._R._R._Tolkien"
      }
    }
    {
      "name": "The Silmarillion",
      "type": "WORK_OF_ART",
      "metadata": {
        "mid": "/m/07c4l",
        "wikipedia_url": "https://en.wikipedia.org/wiki/The_Silmarillion"
      }
    }

  35. @PicardParis
    Content classification
    Tolkien was a British writer, poet, philologist, and university professor
    who is best known as the author of the classic high-fantasy works
    The Hobbit, The Lord of the Rings, and The Silmarillion.
    {
      "categories": [
        {
          "name": "/Books & Literature",
          "confidence": 0.97
        },
        {
          "name": "/People & Society/Subcultures…",
          "confidence": 0.66
        },
        {
          "name": "/Hobbies & Leisure",
          "confidence": 0.58
        }
      ]
    }

  36. @PicardParis
    Sentiment analysis
    2 example reviews of “The Hobbit”:
    - Positive from the NYT (1938)
    - Negative from GoodReads

  37. @PicardParis
    Text moderation
    Detect sensitive or harmful content
    by scoring 16 categories.

  38. @PicardParis
    Client libraries
    from google.cloud import language
    def analyze_text_sentiment(text):
        client = language.LanguageServiceClient()
        document = language.Document(content=text,
                                     type_=language.Document.Type.PLAIN_TEXT)
        response = client.analyze_sentiment(document=document)
        sentiment = response.document_sentiment
        results = [('text', text),
                   ('score', sentiment.score),
                   ('magnitude', sentiment.magnitude)]
        # Print each field left-aligned in a 10-character column
        for k, v in results:
            print('{:10}: {}'.format(k, v))
    Python package: pypi.org/project/google-cloud-language
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-natural-language-python3
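
The sentiment result combines a score (from -1.0 for negative to 1.0 for positive) with a magnitude (0 and up, the overall strength of emotion). One possible way to turn the pair into a coarse label is sketched below. The helper `describe_sentiment` and both thresholds are illustrative assumptions to tune for your own data.

```python
def describe_sentiment(score, magnitude, threshold=0.25, strong=1.0):
    """Map a (score, magnitude) pair to a coarse label (thresholds are assumptions)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    # Near-zero score: a high magnitude suggests mixed feelings, a low one neutrality.
    return "mixed" if magnitude >= strong else "neutral"


print(describe_sentiment(0.8, 0.8))
print(describe_sentiment(-0.6, 1.2))
print(describe_sentiment(0.0, 3.0))
print(describe_sentiment(0.1, 0.1))
```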

  39. @PicardParis
    Translation API
    Translate text
    in 100+ languages

  40. @PicardParis
    Translation API
    Translate Many
    Languages
    100+ different
    languages, from
    Afrikaans to Zulu.
    Used in combination,
    this enables translation
    between thousands of
    language pairs.
    Language Detection
    Translation API can
    automatically identify
    languages with high
    accuracy.
    Simple Integration
    Easy to use Google
    REST API.
    No need to extract text
    from your document,
    just send it HTML
    documents and get
    back translated text.
    High Quality
    Translations
    High quality
    translations that push
    the boundary of
    Machine Translation.
    Updated constantly to
    seamlessly improve
    translations and
    introduce new
    languages and
    language pairs.
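
The "thousands of language pairs" figure follows from simple counting. As a quick check, assuming every supported language can be translated into every other one, the number of ordered source/target pairs is n × (n − 1). The function name `language_pairs` is illustrative.

```python
def language_pairs(language_count):
    """Number of ordered (source, target) pairs among `language_count` languages."""
    return language_count * (language_count - 1)


print(language_pairs(100))
```

With 100 languages this already gives 9,900 directed pairs.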

  41. @PicardParis
    Switch to a neural translation model in 2016
    Neural Network for Machine Translation, at Production Scale (ai.googleblog.com)

  42. @PicardParis
    Models match empirical studies
    Exploring Massively Multilingual, Massive Neural Machine Translation (ai.googleblog.com)

  43. @PicardParis
    Models keep improving over time
    Recent Advances in Google Translate (ai.googleblog.com)

  44. @PicardParis
    Client libraries
    from google.cloud import translate_v2 as translate
    def translate_text(target, text):
        """Translates text into the target language."""
        translate_client = translate.Client()
        # Text can also be a sequence of strings, in which case this method
        # will return a sequence of results for each text.
        result = translate_client.translate(text, target_language=target)
        print('Text: {}'.format(result['input']))
        print('Translation: {}'.format(result['translatedText']))
        print('Detected source language: {}'.format(result['detectedSourceLanguage']))
    Sample from Python open source client library
    github.com/GoogleCloudPlatform/python-docs-samples

  45. @PicardParis
    Speech-to-Text API
    Convert speech to text
    in 125 languages

  46. @PicardParis
    Speech-to-Text API
    Speech Recognition
    Recognizes 125
    languages & variants.
    Uses deep learning
    neural networks
    to power
    your applications.
    Real-Time Results
    Can stream text results,
    returning partial
    recognition results as
    they become available.
    Can also be run on
    buffered or archived
    audio files.
    Noise Robustness
    No need for signal
    processing or noise
    cancellation before
    calling API.
    Can handle noisy audio
    from a variety of
    environments.
    More…
    Customized recognition
    Word timestamps
    Auto-punctuation
    Profanity filter
    Spoken punctuation
    Spoken emojis

    (Preview)
    Language auto-detection
    Multiple speaker detection
    Word-level confidence

  47. @PicardParis
    Speech timestamps
    Search for text
    within your audio
    "transcript": "Hello world…",
    "confidence": 0.96596134,
    "words": [
      {
        "startTime": "1.400s",
        "endTime": "1.800s",
        "word": "Hello"
      },
      {
        "startTime": "1.800s",
        "endTime": "2.300s",
        "word": "world"
      },
      …
    ]
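
Word timestamps are what make audio searchable: once each word carries its start and end offsets, finding a word means scanning the list. The sketch below assumes the result has been parsed into a dictionary shaped like the JSON on this slide. The helper name `find_word` is illustrative.

```python
# Recognition result shaped like the JSON shown on this slide.
result = {
    "transcript": "Hello world…",
    "confidence": 0.96596134,
    "words": [
        {"startTime": "1.400s", "endTime": "1.800s", "word": "Hello"},
        {"startTime": "1.800s", "endTime": "2.300s", "word": "world"},
    ],
}


def find_word(result, query):
    """Return (start, end) offsets in seconds for each occurrence of `query`."""
    matches = []
    for info in result["words"]:
        if info["word"].lower() == query.lower():
            # Offsets are strings such as "1.400s": drop the unit, then parse.
            start = float(info["startTime"].rstrip("s"))
            end = float(info["endTime"].rstrip("s"))
            matches.append((start, end))
    return matches


print(find_word(result, "world"))
```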

  48. @PicardParis
    Client libraries
    from google.cloud import speech_v1 as speech
    def speech_to_text(config, audio):
        client = speech.SpeechClient()
        response = client.recognize(config=config, audio=audio)
        # Print the best transcription alternative of each result
        for result in response.results:
            best = result.alternatives[0]
            print(f'Transcript: {best.transcript}')
            print(f'Confidence: {best.confidence:.0%}')
    config = {'language_code': 'fr-FR',
              'enable_automatic_punctuation': True,
              'enable_word_time_offsets': True}
    audio = {'uri': 'gs://cloud-samples-data/speech/corbeau_renard.flac'}
    speech_to_text(config, audio)
    """
    Transcript: Maître corbeau sur un arbre perché tenait en son bec un fromage...
    Confidence: 93%
    """
    Python package: pypi.org/project/google-cloud-speech
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-speech-text-python3

  49. @PicardParis
    Text-to-Speech API
    Generate natural speech

  50. @PicardParis
    WaveNet natural voices, by Deepmind
    https://deepmind.com/blog/wavenet-generative-model-raw-audio
    https://deepmind.com/blog/high-fidelity-speech-synthesis-wavenet

  51. @PicardParis
    Which one is the original recording?

  52. Demo – Live Search & Response

  53. @PicardParis
    Client libraries
    from google.cloud import texttospeech
    def text_to_wav(voice_name, text):
        language_code = "-".join(voice_name.split("-")[:2])
        text_input = texttospeech.SynthesisInput(text=text)
        voice = texttospeech.VoiceSelectionParams(language_code=language_code, name=voice_name)
        audio_config = texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.LINEAR16)
        client = texttospeech.TextToSpeechClient()
        response = client.synthesize_speech(input=text_input, voice=voice, audio_config=audio_config)
        # LINEAR16 audio content includes a WAV header, so it can be written as-is
        with open(f"{language_code}.wav", "wb") as out:
            out.write(response.audio_content)
    text_to_wav("en-AU-Wavenet-A", "What is the temperature in Sydney?")
    text_to_wav("en-GB-Wavenet-B", "What is the temperature in London?")
    text_to_wav("en-IN-Wavenet-C", "What is the temperature in Delhi?")
    Python package: pypi.org/project/google-cloud-texttospeech
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-text-speech-python3

  54. @PicardParis
    Generative AI
    Text
    Prompt → Text
    - Chat
    - Summarization
    - Classification
    - Extraction
    - Writing/ideation
    Image → Info
    - Image captioning
    Image+Q → Info
    - Visual Q & A
    Text → App
    - Generative AI agent
    - Enterprise Search
    - Recommendations
    Image
    Prompt → Code
    - Code generation
    - Code completion
    Vertex AI
    > Generative AI Studio
    > Search and Conversation
    Speech
    Prompt → Image
    - Image generation
    + Prompt → Image
    - Image editing

  55. 02
    AutoML
    Build your custom model
    with no expertise

  56. @PicardParis
    Generic results with the Vision API

  57. @PicardParis
    More specific results?
    CIRRUS
    ALTOCUMULUS

  58. @PicardParis
    AutoML
    AutoML
    Train Deploy Serve
    Your training data
    Your custom model
    with a REST API
    Your custom
    edge model
    TF Lite mobile
    TF.js browser
    Container anywhere

  59. @PicardParis
    Dataset

  60. @PicardParis
    Training

  61. @PicardParis
    Evaluating

  62. @PicardParis
    Serving

  63. @PicardParis
    Auto-generate a custom model from your data
    - AutoML Vision (Image): Custom Classification, Object Detection, Pix Segmentation
    - AutoML Video Intelligence (Video): Custom Classification, Shot Detection, Obj. Detect./Track.
    - AutoML Natural Language (Text): Custom Classification, Entity Extraction, Sentiment Analysis
    - AutoML Translation (Text): Custom Translation
    - AutoML Tables (Structured Data): Custom Classification, Metrics Prediction

  64. @PicardParis
    AutoML in Vertex AI
    Image classification (single-label) Image classification (multi-label) Image object detection
    Video classification Video action recognition Video object tracking
    Text classification (single-label) Text classification (multi-label) Text entity extraction Text sentiment analysis
    Regression/classification
    Image segmentation
    + Custom translation models with AutoML Translation
    Forecasting

  65. @PicardParis
    What are your emotions?
    Ready-to-use model
    Vision API
    😃 Joy
    😮 Surprise
    😢 Sorrow
    😠 Anger
    I want to detect faces + general emotions
    Custom model
    AutoML Vision
    😛 Tongue out
    🥱 Yawning
    😴 Sleeping
    I want to detect new custom expressions

  66. Stache Club demo serverless architecture
    Source Selfies
    Cloud Storage
    Stache Club App
    App Engine
    Stache Club Selfies
    Cloud Storage
    Face Detection
    Vision API
    Custom Detection
    AutoML Vision
    Selfie Processing
    Cloud Functions
    User
    Web request
    Event trigger
    1. Upload a selfie
    2. Function is automatically triggered
    3. Function gets insights from ML APIs
    4. Function uploads result image
    Admin

  67. @PicardParis
    Evaluation: results vs expectations
                                                      Results we expect    Results we don't expect
    Model positives (results returned by model)       True positives       False positives
    Model negatives (results not returned by model)   False negatives      True negatives

  68. @PicardParis
    Model precision
    $$\text{Precision} = \frac{\text{True positives}}{\text{True positives} + \text{False positives}}$$
    Precision can be seen as a measure
    of exactness or quality.
    High precision means that the model
    returns substantially more expected
    results than unexpected ones.

  69. @PicardParis
    Model recall
    Recall can be seen as a measure
    of completeness or quantity.
    High recall means that the
    model returns most of the
    expected results.
    $$\text{Recall} = \frac{\text{True positives}}{\text{True positives} + \text{False negatives}}$$
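
Precision and recall can both be computed directly from the set of results we expect and the set the model returns. The sketch below does this with set operations. The cloud-type labels and the helper name `precision_recall` are hypothetical examples, not data from the talk.

```python
def precision_recall(expected, returned):
    """Compute (precision, recall) from the expected and returned result sets."""
    true_positives = len(expected & returned)
    false_positives = len(returned - expected)
    false_negatives = len(expected - returned)
    precision = true_positives / (true_positives + false_positives) if returned else 0.0
    recall = true_positives / (true_positives + false_negatives) if expected else 0.0
    return precision, recall


# Hypothetical example: 4 expected labels, 3 returned, 2 of them correct.
expected = {"cirrus", "cumulus", "stratus", "nimbus"}
returned = {"cirrus", "cumulus", "fog"}
precision, recall = precision_recall(expected, returned)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three returned results are expected (precision 2/3), while only two of the four expected results are returned (recall 1/2).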

  70. @PicardParis
    Learning to learn
    Models to identify
    optimal model
    architectures
    AutoML under the hood
    Transfer learning
    Build on existing
    models
    Hyperparameter auto-tuning
    Algorithm for finding the best
    hyperparameters for your
    model & data

  71. @PicardParis
    Learning to learn: neural architecture search
    Controller: proposes ML models
    Train & evaluate models
    20K
    times
    Iterate to find the
    most accurate
    model
    Layers
    Learning rate
    Research paper: bit.ly/nas-paper

  72. @PicardParis
    Updated output using
    your training data
    Transfer learning
    Model trained on a lot of data
    Your data
    Hidden layers

  73. @PicardParis
    Hyperparameter tuning
    ● Hyperparameters: any value which
    affects the accuracy of an algorithm,
    but is not directly learned by it
    ● HyperTune: Google-developed
    algorithm to find the best
    hyperparameter combinations for
    your model
    ● Available as a Cloud API:
    Vertex AI Vizier
    [Figure: Objective plotted over HyperParam #1 and HyperParam #2; annotations: "Want to find this", "Not these"]
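
To make the idea of searching for the best hyperparameter combination concrete, here is a deliberately naive grid search over two hyperparameters. It is not HyperTune or Vertex AI Vizier, which use smarter search strategies. The `objective` function is a made-up stand-in for "train a model and measure its accuracy", and the candidate values are arbitrary.

```python
import itertools


def objective(learning_rate, layers):
    """Hypothetical stand-in for training a model and returning its accuracy."""
    return 1.0 - 1000 * (learning_rate - 0.01) ** 2 - 0.01 * (layers - 3) ** 2


def grid_search(learning_rates, layer_counts):
    """Evaluate every combination and return the best-scoring one."""
    candidates = itertools.product(learning_rates, layer_counts)
    return max(candidates, key=lambda params: objective(*params))


best = grid_search([0.001, 0.01, 0.1], [1, 2, 3, 4])
print(best)
```

Exhaustive search becomes expensive quickly as the number of hyperparameters grows, which is why a dedicated tuning service is useful.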

  74. @PicardParis
    MVP with Vertex AI

  75. 03
    More machine learning!
    From focusing on industry verticals…
    …to building neural networks

  76. @PicardParis
    AI platforms & industry verticals
    ● Vertex AI
    DS+AutoML+MLOps+…
    ● Document AI
    OCR+HW+Tables+Forms
    Invoices+Receipts+…
    ● Dialogflow
    Build your chat bot
    ● Call Center AI

  77. @PicardParis
    MVP with Document AI - Form Automation

  78. @PicardParis
    MVP with Document AI - Identity Form Autofiller

  79. @PicardParis
    TensorFlow: favorite ML repo on GitHub
    ML framework consolidation, 5 years later
    ● TensorFlow
    ● PyTorch
    ● JAX
    ● Keras (multi-framework)

  80. Time to
    wrap up!

  81. @PicardParis
    How fast & easy is it to build a prototype?
    (Focus on Dev ↔ Focus on ML)
                  Gen AI              ML APIs               AutoML              ML
                  Generative models   Ready-to-use models   Customized models   Data & neural networks
    Time?         hours               hours                 days                days, weeks…
    Difficulty?   prompt              none                  dataset             dataset + NN + …

  82. Resources
    Ready-to-use machine learning models
    Cloud Vision API cloud.google.com/vision
    Cloud Video Intelligence API cloud.google.com/video-intelligence
    Cloud Natural Language API cloud.google.com/natural-language
    Cloud Translation API cloud.google.com/translation
    Cloud Speech-To-Text API cloud.google.com/speech-to-text
    Cloud Text-to-Speech API cloud.google.com/text-to-speech
    Use, customize, and deploy generative models
    Generative AI Studio cloud.google.com/generative-ai-studio
    Build your custom model with your own data without any expertise
    Cloud AutoML cloud.google.com/automl
    Build your model from scratch with deep learning expertise
    Vertex AI cloud.google.com/vertex-ai
    Extract structured information from documents
    Document AI cloud.google.com/document-ai

  83. @PicardParis
    Python resources & articles
    Codelabs (g.co/codelabs)
    Using the Vision API Using the Video Intelligence API
    Using the Natural Language API Using the Translation API
    Using the Speech-to-Text API Using the Text-to-Speech API
    Inspirational articles (medium.com/@PicardParis)
    Summarizing videos in 300 lines of code
    Tracking video objects in 300 lines of code
    Face detection and processing in 300 lines of code
    Deploy a coloring page generator in minutes
    From pixels to information with Document AI
    Automate identity document processing
    Moderating text with the Natural Language API
    Deploying a Python serverless function in minutes

  84. bit.ly/ml-comic
    Google AI Online Comic

  85. goo.gle/vertexai
    Join Google Cloud Innovators

  86. Thank you!
    Any questions?
    Presentation + pointers
    bit.ly/ml-for-developers
    Laurent Picard
    @PicardParis
