PM Update: What's New on Oracle Machine Learning on Autonomous

In this weekly Office Hours session for Oracle Machine Learning on Autonomous Database, the OML team gave an update on the many new features of OML, including OML Notebooks updates, an updated structure for OML and REST URLs, OML Services support for clustering ONNX models, support for the Italian language in OML Services cognitive text, and new filters for the modelType and shared attributes. They also showed a new ESA Wiki model for Database 19c that can be used on premises and in the cloud.

The Oracle Machine Learning product family supports data scientists, analysts, developers, and IT to achieve data science project goals faster while taking full advantage of the Oracle platform.

Oracle Machine Learning Notebooks offers an easy-to-use, interactive, multi-user, collaborative interface based on Apache Zeppelin notebook technology, and supports SQL, PL/SQL, Python and Markdown interpreters. It is available on all Autonomous Database versions and tiers, including the always-free editions.

OML includes AutoML, which automates algorithm selection, feature selection and model tuning, in addition to a specialized AutoML UI exclusive to the Autonomous Database.

OML Services is also included in Autonomous Database, where you can deploy and manage native in-database OML models as well as ONNX ML models (for classification and regression) built using third-party engines, and can also invoke cognitive text analytics.

Marcos Arancibia

December 07, 2021

Transcript

  1. OML Product Management Update: What’s New in Oracle Machine Learning

    OML AskTOM Office Hours Marcos Arancibia and Sherry LaMonica Product Management, Oracle Machine Learning Move the Algorithms; Not the Data! Copyright © 2021, Oracle and/or its affiliates. This Session will be Recorded
  2. What’s New in Oracle Machine Learning?

    • OML Notebook updates
    • Updated structure for OML and REST URLs
    • OML Services
      – Clustering for ONNX models
      – Cognitive Text: New support for Italian language
      – New filters for modelType and shared attributes
    • New ESA Wiki model for Database 19c
  3. OML Notebooks: New additions and updates

    • Upgraded to Zeppelin 0.9
    • Import Jupyter notebooks (*.ipynb)
    • 79 template example notebooks
      – 40 OML4Py notebooks
      – 39 OML4SQL notebooks (4 are 21c-only)
    • 15 new notebooks
    • 3 OML XGBoost + MSET templates for 21c
    • Data loading from GitHub mechanism highlighted (OML Run-me-first)
    • Included details of available algorithm settings options in applicable notebooks
    • Added demos to OML4Py notebooks that use OML4SQL to score data and display prediction details
  4. OML4Py and OML Services REST APIs: New URL structure

    • Base URL now includes tenancy ID and database name, and works for OML Notebooks too
    • Token request no longer needs Tenancy OCID nor Database name in the PATH
    • Root domain is now oraclecloudapps.com

    CURRENT: https://adb.us-sanjose-1.oraclecloud.com/tenant/ocid1.tenancy.oc1..aaaaa…/database/omldb
    – Labeled parts: Datacenter region, root domain, Tenancy OCID, Database name
    – A marked section of this URL is the part required for Token acquisition

    NEW: https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com
    – Labeled parts: Tenancy ID (not OCID nor name), Database name, Datacenter region, root domain
    – Same for Token acquisition
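The new URL layout described above can be sketched in a few lines of Python. This is only an illustration based on the single example host shown on the slide: the function names are hypothetical, and the assumption that the database name never contains a hyphen (so the last hyphen before `.adb.` separates tenancy ID from database name) is not confirmed by the deck.

```python
from urllib.parse import urlparse

ROOT_DOMAIN = "oraclecloudapps.com"  # new root domain named on the slide


def oml_base_url(tenancy_id: str, database: str, region: str) -> str:
    """Compose <tenancy id>-<database>.adb.<region>.oraclecloudapps.com (hypothetical helper)."""
    host = f"{tenancy_id}-{database}.adb.{region}.{ROOT_DOMAIN}"
    # Host names are case-insensitive; the slide example shows the lowercase form.
    return "https://" + host.lower()


def split_oml_base_url(url: str) -> dict:
    """Recover tenancy ID, database name and region from a new-style URL.

    Assumption: the database name contains no hyphen.
    """
    host = urlparse(url).hostname or ""
    suffix = "." + ROOT_DOMAIN
    if not host.endswith(suffix) or ".adb." not in host:
        raise ValueError(f"not a new-style OML URL: {url!r}")
    prefix, region = host[: -len(suffix)].split(".adb.", 1)
    tenancy_id, database = prefix.rsplit("-", 1)
    return {"tenancy_id": tenancy_id, "database": database, "region": region}


if __name__ == "__main__":
    url = oml_base_url("qtraya2braestch", "OMLDB", "us-sanjose-1")
    print(url)
    print(split_oml_base_url(url))
```

An old-style URL such as https://adb.us-sanjose-1.oraclecloud.com is rejected by `split_oml_base_url`, which gives a quick way to spot bookmarks that still use the previous structure.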
  5. REST API Authentication: Standard call for all OML REST API token endpoints

    New style URL: omlserver/omlusers/api/oauth2/v1/token
    • omlserver = OML cloud service location URL for Autonomous Database, for example: https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com

    Old style URL: omlserver/omlusers/tenants/tenant/databases/database/api/oauth2/v1/token
    • omlserver = OML cloud service location URL for Autonomous Database, for example: https://adb.us-sanjose-1.oraclecloud.com
    • tenant = Oracle Autonomous Database Tenancy OCID, in the form of: OCID1.TENANCY.OC1..AAAAAAAAFCUE4……
    • database = Oracle Autonomous Database database name, for example: OMLDB
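To make the difference between the two token endpoints concrete, here is a small Python sketch that assembles both forms from the path templates on the slide. The function names and the sample OCID value are hypothetical placeholders.

```python
def token_url_new(omlserver: str) -> str:
    """New style: tenancy and database are already encoded in the host name."""
    return f"{omlserver.rstrip('/')}/omlusers/api/oauth2/v1/token"


def token_url_old(omlserver: str, tenant: str, database: str) -> str:
    """Old style: Tenancy OCID and database name travel in the path."""
    return (
        f"{omlserver.rstrip('/')}/omlusers/tenants/{tenant}"
        f"/databases/{database}/api/oauth2/v1/token"
    )


if __name__ == "__main__":
    print(token_url_new("https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com"))
    # "ocid1.tenancy.oc1..example" is a made-up placeholder, not a real OCID.
    print(token_url_old("https://adb.us-sanjose-1.oraclecloud.com",
                        "ocid1.tenancy.oc1..example", "OMLDB"))
```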
  6. Oracle Machine Learning RESTful URLs: Where can I find the URLs that correspond to my tenancy?

    Location of REST URLs, from your Oracle Autonomous Database instance:
    1. Click Service Console
    2. Click Development
    3. Scroll down to Oracle Machine Learning RESTful Services and copy the URL

    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/ords/
  7. Token acquisition: Initial call to get a token and be able to access OML REST endpoints

    To request a token for accessing OML REST API endpoints, you need a valid username and password for your Oracle Autonomous Database, with the OML Developer grants assigned by the OML Administrator.

    For the following REST call, we will consider: omlserver=https://<tenancy id>-<database>.adb.<region>.oraclecloudapps.com

    $ curl -i \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      -d '{"grant_type":"password", "username": "YourOMLuser", "password": "YourOMLpass"}' \
      "omlserver/omlusers/api/oauth2/v1/token"
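The same token request can be sketched with Python's standard library. The request body and headers mirror the curl call above; `build_token_request` and `fetch_token` are hypothetical helper names, and the `accessToken` response field is an assumption, since the slide does not show the response.

```python
import json
import urllib.request


def build_token_request(omlserver: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the new-style token endpoint."""
    url = f"{omlserver.rstrip('/')}/omlusers/api/oauth2/v1/token"
    body = json.dumps(
        {"grant_type": "password", "username": username, "password": password}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_token(request: urllib.request.Request) -> str:
    """Send the request and return the token.

    Assumption: the JSON response carries the token in an "accessToken" field.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)["accessToken"]


if __name__ == "__main__":
    # Placeholder server and credentials; replace with your own before calling fetch_token.
    req = build_token_request("https://example.invalid", "YourOMLuser", "YourOMLpass")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from the network call makes it easy to inspect the URL and payload before any credentials leave the machine.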
  8. Send a Request – OML Services: Call to get the Open API description for the current OML Services

    To review the Open API specification for the OML Services REST endpoints, you need to pass a valid token.

    For the following REST call, we will consider OML_URL = omlserver/omlmod, and remember to provide the full token after "Bearer" (the eyJ... string is the token):

    $ curl --location --request GET 'OML_URL/v1/api' \
      --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='
  9. Send a Request – OML4Py: Call to get the Open API description for the current OML4Py REST services

    To review the Open API specification for the OML4Py REST endpoints, you need to pass a valid token.

    For the following REST call, we will consider OML_URL = omlserver/oml, and remember to provide the full token after "Bearer" (the eyJ... string is the token):

    $ curl --location --request GET 'OML_URL/api/py-scripts/v1' \
      --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='
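Both Open API calls differ only in their path, so they can share one request builder. The sketch below uses only the paths shown on these two slides; the helper names are hypothetical and nothing is sent over the network.

```python
import urllib.request


def openapi_urls(omlserver: str) -> dict:
    """Return the two Open API description URLs named on the slides."""
    base = omlserver.rstrip("/")
    return {
        "oml_services": f"{base}/omlmod/v1/api",
        "oml4py": f"{base}/oml/api/py-scripts/v1",
    }


def build_bearer_get(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the full token after 'Bearer'."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


if __name__ == "__main__":
    urls = openapi_urls("https://example.invalid")  # placeholder server
    for name, url in urls.items():
        req = build_bearer_get(url, "PLACEHOLDER_TOKEN")
        print(name, req.get_method(), req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would perform the call once a real server URL and token are supplied.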
  10. Oracle Machine Learning: Link to access OML in Autonomous Database

    Original link to access the OML User Interface today from a bookmarked link:
    https://adb.us-sanjose-1.oraclecloud.com/omlusers/login.html
    Plus the necessary options:
    ?tenant=OCID1.TENANCY.OC1..AAAAA....&database=OMLDB&redirect_uri=https://adb.us-sanjose-1.oraclecloud.com/omlusers/api/oauth2/v1/login

    New style URL to access the OML User Interface:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml

    New style URL to access the OML User Interface user administration:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers
  11. OML Services: ONNX Clustering is now supported

    Example of clustering two attributes from the Breastcancer Dataset
    Visualization of the original data with the target in different colors
  12. OML Services: ONNX Clustering is now supported

    • Create the clustering model using only two input attributes for this small test, and show the predictions
    • Show the cluster centroids as an example
    • Export the scikit-learn cluster model to ONNX format, in a Zip file for OML Services (which includes the .onnx file and the metadata.json)
    • cURL example of the OML Services scoring:

    $ curl -L -X POST 'https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/v1/deployment/SKLearn_kmeans_BC/score' \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer eyJ………iOiJ' \
      -d '{"inputRecords":[{ "X": [[10.38, 17.77]]}] }' | jq
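The scoring payload in the cURL example can be generated programmatically. The sketch below reproduces the shape shown on the slide, one record per row with the two attributes wrapped as a 1x2 array under the input name "X". Sending one record per row for several rows is an assumption, and the helper names are hypothetical.

```python
import json


def score_url(omlserver: str, deployment: str) -> str:
    """Scoring endpoint for a deployed model, following the path on the slide."""
    return f"{omlserver.rstrip('/')}/omlmod/v1/deployment/{deployment}/score"


def build_score_payload(rows, input_name: str = "X") -> str:
    """Serialize rows of attribute values as {"inputRecords": [{"X": [[...]]}, ...]}.

    Assumption: each row becomes its own input record, as in the single-row
    example on the slide.
    """
    records = [{input_name: [list(row)]} for row in rows]
    return json.dumps({"inputRecords": records})


if __name__ == "__main__":
    print(score_url("https://example.invalid", "SKLearn_kmeans_BC"))  # placeholder server
    print(build_score_payload([(10.38, 17.77)]))
```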
  13. OML Services: Cognitive text capability for Italian Language

    Example: Topic Discovery. Returns most relevant topics and weights:

    $ curl -X POST "${omlserver}/omlmod/v1/cognitive-text/topics" \
      --header 'Content-Type: application/json' \
      --header "Authorization: Bearer ${token}" \
      --data '{ "topN":5, "language": "ITALIAN", "textList":["Con Oracle Machine Learning, Oracle sposta gli algoritmi sui dati. Oracle esegue …… l'\''automazione richieste dai progetti di data science su scala aziendale, sia on-premise che nel cloud."] }'

    "topicResults": [
      { "topic": "Oracle Corporation", "weight": 0.23331640964885378 },
      { "topic": "Oracle Database", "weight": 0.20443284083978977 },
      { "topic": "Big data", "weight": 0.16381463223223036 },
      { "topic": "Base di conoscenza", "weight": 0.13233125000617454 },
      { "topic": "Apprendimento automatico", "weight": 0.13091866812720565 }
    ]

    Blog: OML Services Cognitive Text – Italian Language
    https://blogs.oracle.com/machinelearning/post/oml-services-cognitive-text---italian-language-now-available
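As a rough illustration of working with this endpoint from Python, the sketch below builds the request body with the same three fields as the curl call and ranks a `topicResults` list like the one on the slide. The helper names are hypothetical, and the function takes the `topicResults` list directly because the slide does not show how it is wrapped in the full response.

```python
import json


def build_topics_payload(texts, language: str = "ITALIAN", top_n: int = 5) -> str:
    """Serialize the request body with the topN, language and textList fields."""
    return json.dumps({"topN": top_n, "language": language, "textList": list(texts)})


def rank_topics(topic_results, min_weight: float = 0.0):
    """Return (topic, weight) pairs sorted by descending weight, dropping weak topics."""
    ordered = sorted(topic_results, key=lambda item: item["weight"], reverse=True)
    return [(item["topic"], item["weight"]) for item in ordered if item["weight"] >= min_weight]


# The topicResults values shown on the slide.
SLIDE_TOPICS = [
    {"topic": "Oracle Corporation", "weight": 0.23331640964885378},
    {"topic": "Oracle Database", "weight": 0.20443284083978977},
    {"topic": "Big data", "weight": 0.16381463223223036},
    {"topic": "Base di conoscenza", "weight": 0.13233125000617454},
    {"topic": "Apprendimento automatico", "weight": 0.13091866812720565},
]

if __name__ == "__main__":
    for topic, weight in rank_topics(SLIDE_TOPICS, min_weight=0.15):
        print(f"{topic}: {weight:.3f}")
```

Using `json.dumps` for the body also sidesteps shell quoting problems with apostrophes in the Italian text.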
  14. OML Services: New filters for model type and shared model attributes

    Filter by ONNX models:
    $ curl -X GET --header "Authorization: Bearer $token" "${omlserver}/omlmod/v1/models?modelType=ONNX"

    Filter by shared models:
    $ curl -X GET --header "Authorization: Bearer $token" "${omlserver}/omlmod/v1/models?shared=true"
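The two filters are plain query parameters, so the URLs can be assembled with `urllib.parse.urlencode`. This is a minimal sketch with a hypothetical helper name. The slide shows each filter on its own; combining both in one request is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode


def models_url(omlserver: str, model_type: Optional[str] = None,
               shared: Optional[bool] = None) -> str:
    """Build the model listing URL with optional modelType and shared filters."""
    params = {}
    if model_type is not None:
        params["modelType"] = model_type
    if shared is not None:
        params["shared"] = "true" if shared else "false"
    base = f"{omlserver.rstrip('/')}/omlmod/v1/models"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    server = "https://example.invalid"  # placeholder server
    print(models_url(server, model_type="ONNX"))
    print(models_url(server, shared=True))
```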
  15. New ESA Wiki Model: Built under Database 19c

    • ESA is a pre-built model for feature extraction of explicit features in a knowledge base
      – Maps words to relevant concepts
      – Wikipedia is a good source for ESA: a comprehensive knowledge base
    • The new ESA model was built using millions of Wikipedia articles available as of July 1, 2021
      – Topics reduced to about 161,000
      – Users can also create their own custom, domain-specific ESA models

    Blog: New Wiki ESA model available for 19c
    https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c
  16. Steps used by the Oracle Team (internally) to process the Wikipedia data

    (shown only for general interest – many people/companies use their own custom processing)

    Diagram labels: XML; Article pages; Category pages; TSV pages; TSV pages x-links; TSV pages by category; OML in-DB ESA Wiki Model

    Load Wikipedia dumps
    Wikipedia dumps are compressed XML files. Individual pages are tagged as <page>. The contents of the pages are tagged as <text>. Contents inside <text> contain plenty of Wikipedia-specific information that is not visible, and various brackets are present.

    Page Filtering
    Collecting the pages that describe concepts and more general knowledge about various subjects takes a lot of processing: parsing and stripping HTML tags from pages, partial tokenization, removal of special characters, dropping of words with special characters or numbers, and more. The outcome of Wikipedia page processing is tab-separated files.

    Category & Article
    DocStore from Oracle Labs is used to remove non-usable information and to split the Wikipedia XML dumps into individual entities, including article and category pages (ignoring other types of pages). The outcome of DocStore processing is text with HTML tags.

    ESA Model Build
    We calculate the number of incoming links for every page using cross-page links. The ESA model is reduced to retain the pages that are more general and describe concepts, filtering out References, References and links, Sources, Further reading, etc. The final ESA model is built with a limit of 200,000 Features and 1,000 Top Features retained, resulting in some 27 million records and 800 MB in size (current version).
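For readers who want to experiment with their own processing, the first and last steps can be approximated with the Python standard library. This is not the Oracle team's pipeline (which uses DocStore from Oracle Labs); it is a toy sketch that streams <page> elements out of a dump and counts incoming links. The tiny inline sample, the use of <title> and <revision> elements, and the `[[...]]` link pattern are assumptions based on the public MediaWiki dump format.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Matches the target of a wiki link such as [[target]] or [[target|label]].
LINK = re.compile(r"\[\[([^\]\|#]+)")

# Tiny made-up sample standing in for a (decompressed) Wikipedia dump.
SAMPLE = (
    b"<mediawiki>"
    b"<page><title>Machine learning</title><revision>"
    b"<text>'''Machine learning''' is a field of [[artificial intelligence]].</text>"
    b"</revision></page>"
    b"<page><title>Empty</title><revision><text/></revision></page>"
    b"</mediawiki>"
)


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{uri}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, text) for each <page>, streaming so large dumps fit in memory."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the parsed subtree


def incoming_link_counts(pages) -> Counter:
    """Count, for each link target, how many pages link to it."""
    counts = Counter()
    for _title, text in pages:
        for target in {match.strip() for match in LINK.findall(text)}:
            counts[target] += 1
    return counts


if __name__ == "__main__":
    pages = list(iter_pages(io.BytesIO(SAMPLE)))
    print(pages)
    print(incoming_link_counts(pages))
```

Real dumps are compressed, so the stream would typically come from `bz2.open(path, "rb")` rather than an in-memory buffer.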
  17. Where can I download the new ESA Wiki model?

    Download from https://oss.oracle.com/machine-learning/

    Blog: New Wiki ESA model available for 19c
    https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c

    For complete examples, search OML Notebooks Template Examples for "ESA".
  18. Quick demo

    • New URL to access OML
    • OML Services on Postman
    • OML Services new URL
    • OML Services new modelType filter
    • OML Services new Clustering ONNX model support
    • OML4Py REST APIs new URL