Generative A.I. + Law - Background, Applications and Use Cases Including GPT-4 Passes the Bar Exam

Professor Daniel Martin Katz - Generative A.I. + Law - Background, Applications and Use Cases Including GPT-4 Passes the Bar Exam - Topics include some History of NLP, The Path to Generative A.I. (including ChatGPT and GPT-4), Some Applications / Use Cases in the Legal Sector, GPT-4 Passes the Bar Exam, Legal A.I. Operational Issues and the Path Forward ... (Updated 06.14.23)

Daniel Martin Katz

April 20, 2023

Transcript

  1. GENERATIVE A.I. + LAW

    @computational professor daniel martin katz danielmartinkatz.com Illinois tech - chicago kent law 273Ventures.com BACKGROUND, APPLICATIONS AND USE CASES INCLUDING GPT-4 PASSES THE BAR EXAM https://bit.ly/3yxhY2e
  2. I THINK THIS DAY WILL GO DOWN AS A VERY

    IMPORTANT DAY IN THE HISTORY OF TECHNOLOGY
  3. SO NOW WE ARE JUST A FEW MONTHS INTO WHAT

    IS ARGUABLY THE MOST SUCCESSFUL PRODUCT LAUNCH IN HISTORY …
  4. < CONSIDERATION 1 > LANGUAGE IS THE ‘COIN OF THE

    REALM’ HERE IN THE WORLD OF LAW < CONSIDERATION 2 > THE MARCH OF LEGAL COMPLEXITY < CONSIDERATION 3 > MACHINES ARE INCREASINGLY IMPROVING IN LANGUAGE PROCESSING < CONSIDERATION 4 > LEGAL LANGUAGE != REGULAR LANGUAGE ?
  5. NOT ONLY IS IT THE SHEER VOLUME OF TEXT BUT

    ALSO THESE MASSIVE VOLUMES OF TEXT ARE NOTORIOUSLY COMPLEX
  6. WHETHER YOU ARE A LARGE MULTINATIONAL CORPORATION,

    A SMALL BUSINESS OR AN INDIVIDUAL CITIZEN …
  7. LAW HAS A COMPLEXITY CHALLENGE …
  8. DANIEL MARTIN KATZ, CORINNA COUPETTE, JANIS BECKEDORF & DIRK HARTUNG,

    COMPLEX SOCIETIES AND THE GROWTH OF THE LAW, 10 SCIENTIFIC REPORTS 18737 (2020) CORINNA COUPETTE, JANIS BECKEDORF, DIRK HARTUNG, MICHAEL BOMMARITO, & DANIEL MARTIN KATZ, MEASURING LAW OVER TIME: A NETWORK ANALYTICAL FRAMEWORK WITH AN APPLICATION TO STATUTES AND REGULATIONS IN THE UNITED STATES AND GERMANY, FRONT. PHYS. (2021 FORTHCOMING)
  9. KEY TAKE AWAY - OVER PAST TWO DECADES ~50%+ MORE

    STATUTES ~200% MORE REGULATIONS
  10. LET’S BE CLEAR — FOR AI / NLP TO MAKE

    A DEEP INCURSION INTO THIS FIELD …
  11. THEY DESCRIBE LEGAL DOCUMENTS AND ARGUMENTS USING TERMS SUCH AS

    ‘LEGALESE’ ‘LEGAL JARGON’ ‘LEGAL GOBBLEDYGOOK’
  12. FRANKLY *NO* LEGAL TECH SOLUTION BUILT TO DATE HAS BEEN

    ABLE TO REALLY WORK WELL WITH NUANCES OF LEGAL LANGUAGE …
  13. PRESENTATION IN FIVE PARTS

    (1) THE (LONG) PATH TO GENERATIVE A.I. (2) WHAT IS GPT / CHATGPT / GPT-4 ? (3) BAR EXAM AS WINDOW INTO CAPABILITIES (4) LLMs IN THE DELIVERY OF LEGAL SERVICES (5) STRATEGIC CONSIDERATIONS AND THE ROAD AHEAD
  14. OVER THE COURSE OF MANY YEARS WE WERE ABLE TO

    LEVERAGE ALTERNATIVE FORMS OF COMPUTATION …
  15. “BEFORE 1949, ‘COMPUTERS LACKED A KEY PREREQUISITE FOR INTELLIGENCE: THEY

    COULDN’T STORE COMMANDS, ONLY EXECUTE THEM …” http://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/
  16. BIG IDEA IN AI IS TO DEVELOP IN MACHINES SOME

    LEVEL OF SYNTHETIC (OR ARTIFICIAL) REPRESENTATION OF A PREVIOUSLY HUMAN CENTERED PROCESS
  17. EVEN IN THE FOUNDATIONAL DAYS THERE WAS A VIEW ABOUT

    THE ROLE OF A ‘CONVERSATIONAL AGENT’ AS A MEANS TO EVALUATE THE QUALITY OF AN AI SYSTEM
  18. WHICH IS TO SAY THAT LANGUAGE HAS ALWAYS BEEN A

    CORE TOPIC FOR THE FIELD OF A.I.
  19. NOTE THE MOVIE WAS MOSTLY ABOUT CRACKING ENIGMA BUT HERE

    IS WHERE IT GETS ITS TITLE THE TURING TEST
  20. IT IS IMPORTANT TO REMEMBER WITH TOPICS SUCH AS ARTIFICIAL

    INTELLIGENCE AND NATURAL LANGUAGE PROCESSING …
  21. “MACHINES WILL BE CAPABLE, WITHIN TWENTY YEARS, OF DOING ANY

    WORK THAT A MAN CAN DO.” – Herbert Simon in 1965
  22. “MACHINES WILL BE CAPABLE, WITHIN TWENTY YEARS, OF DOING ANY

    WORK THAT A MAN CAN DO.” – Herbert Simon in 1965 did not pan out by 1985 but did turn out okay for him …
  23. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174698 Katz DM, Bommarito MJ II, Blackman J (2017), A

    General Approach for Predicting the Behavior of the Supreme Court of the United States. PLoS ONE 12(4): e0174698.
  24. http://www.sciencemag.org/news/2017/05/artificial-intelligence-prevails-predicting-supreme-court-decisions Professor Katz noted that in

    the long term … “We believe the blend of experts, crowds, and algorithms is the secret sauce for the whole thing.” May 2nd 2017
  25. “Lawyers say the real value in mediation and arbitration might

    in the future come from large-scale data analysis of arbitrators and mediators themselves, in an effort to predict outcomes and potentially affect the course of settlements … Matthew Saunders, partner at Ashurst, notes that data analytics “could be extended to predicting which way arbitrators or a mediator might go”. Such technology may yet be some way off. In mediation, “a skilled facilitator helps the parties to explore where common ground can be found as the basis for an amicable settlement,” says James Freeman, arbitration partner at Allen & Overy. “The mediation process”, he adds, “is inherently a human one”.
  26. AND OF COURSE THERE ARE OTHER GENERATIVE MODELS MY SON

    AND I RECENTLY CREATED ‘BROCCOLI MAN’ GENERATIVE A.I. ART
  27. MARK LIBERMAN, "THE TREND TOWARDS STATISTICAL MODELS IN NATURAL LANGUAGE

    PROCESSING." NATURAL LANGUAGE AND SPEECH: SYMPOSIUM PROCEEDINGS BRUSSELS, SPRINGER, (1991)
  28. “There have been a series of clever approaches to backdoor

    into semantics* … (*while also being scalable) Semantic Methods (Fairly Difficult) Syntax Methods (Fairly Easy) Historically, Big Divide between Semantics and Syntax Quasi-Semantic Methods
  29. REGULAR EXPRESSION (REGEX) RULES-BASED METHOD(S) THAT CAN BE USED TO

    LOOK FOR WORD PATTERNS AND RETURN RESULTS FOR THOSE PATTERNS
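A minimal sketch of the rules-based idea on this slide, using Python's standard `re` module. The citation pattern, the function name `find_usc_citations`, and the sample sentence are illustrative assumptions, not taken from the deck; real citation formats vary far more than this pattern allows.

```python
import re

# Hypothetical pattern for U.S. Code citations such as "42 U.S.C. § 1983".
# Group 1 captures the title number, group 2 the section number.
USC_CITATION = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+§+\s*(\d+[a-z]?)")


def find_usc_citations(text: str) -> list[tuple[str, str]]:
    """Return (title, section) pairs for each U.S. Code citation found."""
    return USC_CITATION.findall(text)


sample = "Plaintiff sues under 42 U.S.C. § 1983 and cites 28 U.S.C. § 1331."
for title, section in find_usc_citations(sample):
    print(f"Title {title}, Section {section}")
```

The rule is easy to write and fast to run, but it only returns what the pattern anticipates, which is why the deck places regex on the syntax side of the syntax / semantics divide.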
  30. TF - IDF TERM FREQUENCY INVERSE DOCUMENT FREQUENCY EXPLOITS THE

    FREQUENCY OF WORDS IN DOCUMENTS IN ORDER TO PROFILE THEM
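One way to make the TF-IDF idea concrete is the small sketch below, written with the standard library only. It assumes the simplest textbook weighting (relative term frequency times the natural log of inverse document frequency); production libraries typically add smoothing and normalization. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Profile each tokenized document with unsmoothed TF-IDF weights."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    profiles = []
    for doc in docs:
        counts = Counter(doc)
        profiles.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return profiles


toy_docs = [
    ["the", "lease", "term"],
    ["the", "trust", "assets"],
    ["the", "lease", "breach"],
]
profiles = tf_idf(toy_docs)
print(profiles[0])
```

A word that appears in every document (here "the") gets a weight of zero, while a word unique to one document gets the highest weight, which is what lets the method profile documents without any notion of meaning.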
  31. SEMANTICS IS ABOUT THE MEANING OF INDIVIDUAL WORDS SEMANTICS IS

    ALSO ABOUT THE RELATIONSHIP BETWEEN WORDS THAT INTERACT TO PRODUCE HIGHER ORDER MEANING
  32. LOTS OF THE TRAINING FOR LAWYERS IS ACTUALLY ABOUT THE

    DEEP SEMANTIC INTERPRETATION OF LANGUAGE CONTRACTS STATUTES REGULATIONS JUDICIAL DECISIONS EXAMPLES —>
  33. HERE WE ANALYZE TRENDS WITHIN EVERY LEGAL NLP PAPER (MORE

    THAN 600 PAPERS) WRITTEN OVER THE PAST DECADE
  34. GPT = IT IS A LARGE LANGUAGE MODEL (LLM) BUT

    FAR FROM THE ONLY ONE GENERATIVE PRE-TRAINED TRANSFORMER
  35. TODAY I WILL MOSTLY LIMIT MY COMMENTS TO GPT BUT

    JUST WANTED TO FLAG THERE ARE OTHER MODELS OUT THERE
  36. THE LEADING LLM MODELS XAVIER AMATRIAIN, TRANSFORMER MODELS: AN INTRODUCTION

    AND CATALOG, ARXIV:2302.07730 (2023) HTTPS://ARXIV.ORG/ABS/2302.07730
  37. THE OPEN AI GPT MODELS COMBINE SEVERAL IDEAS TOGETHER FROM

    THE VARIOUS PAPERS IN THIS PROGRESSION …
  38. CHATGPT / GPT-3.5 / GPT-4 ARE PROPRIETARY MODELS SO WE

    DO NOT KNOW FOR CERTAIN ALL OF THE DEEP DETAILS
  39. I MET STEPHEN WOLFRAM’S ROBOT PRESENCE IN HOUSTON IN 2011

    HE APPEARED AT A COMPUTATIONAL LAW CONFERENCE VIA ROBOT (A REAL BOSS MOVE) STEPHEN AND YOURS TRULY TALKING ABOUT RULE 90 IT WAS A ‘BIG BANG’ MOMENT
  40. THE PATH TO MEGASCALE LARGE LANGUAGE MODELS HAS BEEN DRIVEN BY A MIXTURE OF

    BETTER HARDWARE, PARALLEL COMPUTING, AND THE ATTENTION MECHANISM
  41. THE SAME TECH THAT HAS POWERED HIGHLY IMMERSIVE GAMING HAS

    ALSO PUSHED SCIENCE FORWARD … https://www.thegamer.com/pc-games-best-intense-graphics/
  42. APRIL 5, 2023 NOT COMPARED DIRECTLY WITH THE H100 BUT

    THIS SHOWS YOU THE NATURE OF THE COMPETITION TAKING PLACE https://www.cnbc.com/2023/04/05/google-reveals-its-newest-ai-supercomputer-claims-it-beats-nvidia-.html
  43. WOULD TAKE 300 YEARS TO TRAIN EVEN GPT-3 ON A

    SINGLE GPU BUT THE TRANSFORMER ARCHITECTURE ALLOWS FOR SIGNIFICANT PARALLELIZATION
  44. TRAINED MODEL ON A BILLION PAIRS OF WORDS IN JUST

    3.5 DAYS ON EIGHT NVIDIA GPUS … THIS APPROACH HIGHLIGHTED A PATH TO THE PRESENT WITH EVER LARGER FUTURE https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/
  45. GPT-3 (RELEASED IN 2020) IS THIRD GENERATION OF THE GPT

    FAMILY GPT-3.5 IS AN INTERMEDIATE IMPROVEMENT ON ORIGINAL GPT-3
  46. GPT-3 (RELEASED IN 2020) IS THIRD GENERATION OF THE GPT

    FAMILY GPT-4 IS THE MOST RECENT RELEASE AND IT IS LIKELY TO BE A ‘FAMILY OF MODELS’ GPT-3.5 IS AN INTERMEDIATE IMPROVEMENT ON ORIGINAL GPT-3
  47. SOME INPUTS TO GPT COMMON CRAWL WEBTEXT2 BOOKS1/2 WIKIPEDIA OPENAI

    LIKELY DID SOME SIGNIFICANT CLEANING / PREPROCESSING OF THESE SOURCES PRIOR TO TRAINING GPT-3 IS 175B PARAMETERS TRAINED WITH 499B TOKENS
  48. JUST SOME KEY TERMINOLOGY YOU SHOULD LEARN

    TRANSFORMER ARCHITECTURE, NEURAL NETWORK, MODEL SCALE, REINFORCEMENT LEARNING, PRETRAINING, CONTEXT WINDOW, ATTENTION MECHANISM, MODEL TUNING, GRADIENT DESCENT
  49. COMBINE SPECIFIC TRAINING ON INSTRUCTIONS TAKE ORIGINAL LLM GPT-3

    https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tracing-Emergent-Abilities-of-Language-Models-to-their-Sources-b9a57ac0fcf74f30a1ab9e3e36fa1dc1 CHATGPT IS AN OFFSHOOT OF GPT-3
  50. FURTHER TUNE ON INSTRUCTIONS COMBINE SPECIFIC TRAINING ON INSTRUCTIONS TAKE

    ORIGINAL LLM GPT-3 https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tracing-Emergent-Abilities-of-Language-Models-to-their-Sources-b9a57ac0fcf74f30a1ab9e3e36fa1dc1 CHATGPT IS AN OFFSHOOT OF GPT-3
  51. FURTHER TUNE ON INSTRUCTIONS SUPPORT WITH SPECIFIC REINFORCEMENT LEARNING FEEDBACK

    LOOP COMBINE SPECIFIC TRAINING ON INSTRUCTIONS TAKE ORIGINAL LLM GPT-3 https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tracing-Emergent-Abilities-of-Language-Models-to-their-Sources-b9a57ac0fcf74f30a1ab9e3e36fa1dc1 CHATGPT IS AN OFFSHOOT OF GPT-3
  52. RLHF WAS CLEARLY PART OF THE BREW HERE BUT GENERALLY

    DOES NOT SEEM TO BE MOVING THE NEEDLE ALL THAT MUCH AT THIS POINT
  53. SO THE POINT HERE IS TO MAKE CLEAR THAT THERE

    ARE MANY STEPS AND MOVING PARTS …
  54. THERE ARE OTHER PLAYERS BESIDES OPENAI AND THEY WILL LIKELY

    PURSUE OTHER STEPS / MOVING PARTS … AND MANY MORE …
  55. AGAIN WE HAD BEEN TRYING TO DISCUSS / HIGHLIGHT THE

    INCREASING CAPABILITIES OF LANGUAGE MODELS FOR SOME TIME
  56. REMEMBER THIS IS A TASK THAT MANY WOULD THINK IS

    IMPOSSIBLE (LAST YEAR I WOULD HAVE SAID IT WOULD NOT OCCUR FOR MANY YEARS)
  57. SOME TASKS LAWYERS REGULARLY UNDERTAKE ARE ACTUALLY WAY *EASIER* THAN

    THE BAR EXAM (AND OTHERS ARE HARDER) LET’S BE CLEAR …
  58. EXAMINEE MUST POSSESS A THRESHOLD AMOUNT OF LEGAL KNOWLEDGE AND

    READING COMPREHENSION SKILLS AND SEMANTIC AND SYNTACTIC COMMAND OF THE ENGLISH LANGUAGE
  59. TO BE INVOLVED IN THE LAUNCH OF GPT-4* *OBVIOUSLY WE

    ARE JUST PLAYING A VERY SMALL ROLE IN THE BIG PICTURE (BUT WE WILL TAKE IT!)
  60. CHATGPT BAR EXAM PASSES Paper Now Available on SSRN! March

    15, 2023 - Version 1.01 https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4389233
  61. MULTISTATE BAR EXAM (MBE) SUBJECTS TESTED TORTS CONTRACTS EVIDENCE REAL

    PROPERTY CIVIL PROCEDURE CONSTITUTIONAL LAW CRIMINAL LAW AND PROCEDURE
  62. MULTISTATE ESSAY EXAMINATION (MEE) SUBJECTS TESTED TORTS EVIDENCE CONTRACTS FAMILY

    LAW REAL PROPERTY CONFLICT OF LAWS TRUSTS AND ESTATES CONSTITUTIONAL LAW BUSINESS ASSOCIATIONS FEDERAL CIVIL PROCEDURE UNIFORM COMMERCIAL CODE CRIMINAL LAW AND PROCEDURE
  63. ‘REPRESENTATIVE GOOD ANSWERS’ https://mdcourts.gov/sites/default/files/import/ble/examanswers/2022/202207uberepgoodanswers.pdf SEVERAL STATE BARS

    RELEASE REPRESENTATIVE GOOD ANSWERS THESE ARE VERY HELPFUL FOR EVALUATION PURPOSES AS THEY ARE ACTUAL ANSWERS WHICH ARE ABOVE MERELY PASSING ANSWERS (BUT NOT NECESSARILY PERFECT)
  64. MULTISTATE PERFORMANCE TEST (MPT) SKILLS TESTED EXAMINEE MUST COMPLETE A

    PRACTICAL LAWYERING TASK SUCH AS LEGAL ANALYSIS, FACT ANALYSIS, PROBLEM SOLVING, ORGANIZATION AND MANAGEMENT OF INFORMATION, AND CLIENT COMMUNICATION
  65. MULTISTATE PERFORMANCE TEST (MPT) 10-15 PAGES OF MATERIALS THE FILE

    = THE FACTS THE LIBRARY = THE LAW ~5000 TOKEN INPUTS (ACCESS HERE) BIT.LY/40F3FQ2
  66. SOME KEY SHORTCOMINGS OF GPT-4 ON THE BAR EXAM FAILED

    TO PROPERLY CALCULATE THE DISTRIBUTION OF ASSETS FROM A TESTAMENTARY TRUST WHICH HAS BEEN DEEMED TO BE INVALID PROVIDED AN INCORRECT ANSWER ON A CIVIL PROCEDURE QUESTION REGARDING DIVERSITY JURISDICTION AFTER THE JOINDER OF A NECESSARY PARTY PROVIDED IMPROPER ANALYSIS ON A REAL PROPERTY (REAL ESTATE) SUBQUESTION REGARDING BOTH THE PROPER DESIGNATION OF A FUTURE INTEREST AND THE APPLICATION OF THE RULE AGAINST PERPETUITIES
  67. ZERO SHOT: ENTER PROMPT, RECEIVE ANSWER. ‘PROMPT ENGINEERING’ IS ABOUT

    TUNING / REFINING PROMPTS TO OBTAIN BETTER ANSWERS
  68. *NOT* A VERY SOPHISTICATED TAKE ON THIS SITUATION BUT REFLECTIVE

    OF THE MODAL PERSPECTIVE OF LAWYERS / LAW PROFS NOTE: ALL COMMERCIAL RESEARCH TOOLS ALREADY HAVE ‘A.I.’ IN THEM
  69. ONE WAY TO DRAMATICALLY REDUCE THE LIKELIHOOD OF A HALLUCINATION

    IS TO MOVE OUT OF THE ‘ZERO SHOT’ PARADIGM
  70. ONE SHOT (RETRIEVAL AUGMENTATION): ENTER PROMPT, GET RESULT,

    QUERY THAT RESULT AGAINST SOMETHING ELSE, REFINE RESULT (IF NEEDED), OUTPUT FINAL ANSWER. EXAMPLES
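The flow on this slide (enter a prompt, get a result, query that result against something else, refine it, output a final answer) might be sketched roughly as follows. Everything here is an assumption made for illustration: `retrieve` is a toy word-overlap ranker standing in for a real search index, `llm` is any callable that maps a prompt string to a response string, and the corpus and prompt wording are invented. No real model API is called.

```python
from typing import Callable


def retrieve(query: str, corpus: dict[str, str], k: int = 1) -> list[str]:
    """Rank passage names by word overlap with the query (toy retriever)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda name: len(query_words & set(corpus[name].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer_with_retrieval(
    prompt: str, llm: Callable[[str], str], corpus: dict[str, str]
) -> str:
    """Draft an answer, check it against retrieved sources, then refine."""
    draft = llm(prompt)                                # enter prompt, get result
    sources = retrieve(f"{draft} {prompt}", corpus)    # query result against sources
    context = "\n".join(corpus[name] for name in sources)
    refined_prompt = (
        f"Question: {prompt}\nDraft: {draft}\nSources:\n{context}\n"
        "Revise the draft using only the sources."
    )
    return llm(refined_prompt)                         # output final answer


# Hypothetical two-passage corpus used only to exercise the sketch.
corpus = {
    "28 USC 1332": "district courts have jurisdiction over diversity cases",
    "Rule 19": "joinder of required parties",
}
```

Grounding the second call in retrieved text is what moves the system out of the zero-shot paradigm; the quality of the final answer then depends heavily on the retriever, which this sketch deliberately leaves crude.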
  71. THE GENERALIZATION OF ALL OF THIS IS AN ORCHESTRATION LAYER

    TO BRING MULTIPLE STREAMS / TECHNOLOGIES TOGETHER
  72. HERE DAMIEN RIEHL FROM FASTCASE WILL WALK YOU THROUGH HOW

    TO BUILD MORE OF A BRIEF THROUGH SEQUENTIAL PROMPTING … HTTPS://WWW.LINKEDIN.COM/PULSE/CHATGPT-LEGAL-BRIEFWRITING-TOOL-DAMIEN-RIEHL/
  73. FEW SHOT (LEGAL DOMAIN) (LANGCHAIN / AUTOGPT STYLE ORCHESTRATION LAYER)

    https://docs.kelvin.legal/docs/examples/due-dilligence/ https://docs.kelvin.legal/docs/examples/litigation-automation/
  74. OR EVEN OTHER PLUGINS … FOR EXAMPLE WOLFRAM COULD HELP

    SOLVE FOR ISSUES WITH QUANTITATIVE REASONING
  76. (2) DRAFT A REPLY BRIEF HERE DAMIEN RIEHL FROM FASTCASE

    WILL WALK YOU THROUGH HOW TO BUILD MORE OF A BRIEF THROUGH SEQUENTIAL PROMPTING … HTTPS://WWW.LINKEDIN.COM/PULSE/CHATGPT-LEGAL-BRIEFWRITING-TOOL-DAMIEN-RIEHL/
  77. HERE ARE JUST A FEW MORE OF MANY OTHER POTENTIAL

    AREAS WHERE THIS CLASS OF TECH MIGHT BE USEFUL … E-DISCOVERY LEGAL BILLING DEALS DATABASES CONTRACT DRAFTING AND MANY MANY OTHER EXAMPLES
  78. HERE ARE JUST A COUPLE OF RESOURCES THAT HAVE BEEN

    MADE AVAILABLE (THERE WILL BE MANY MORE) https://ssrn.com/abstract=4404017
  79. IT IS IMPORTANT TO USE THESE TOOLS (AT LEAST FOR

    NOW) AS PART OF A HUMAN IN THE LOOP PROCESS
  80. IN OTHER WORDS, FOR MOST USE CASES HUMANS SHOULD ALWAYS

    BE PART OF THE ‘RETRIEVAL AUGMENTATION’ LAYER
  81. FOMO (FEAR OF MISSING OUT): AMAZING CAPABILITIES, KEEPING UP WITH THE MARKET,

    NOT WANTING TO LOOK OUT OF STEP TO CLIENTS
  82. FUD (FEAR, UNCERTAINTY, & DOUBT) VS FOMO (FEAR OF MISSING OUT)

    ORGANIZATIONS / INDIVIDUALS TOGGLE BETWEEN THESE TWO POINTS OF VIEW (SOMETIMES WITHIN THE SAME CONVERSATION)
  83. OBVIOUSLY WE HAVE A COMPANY AND WE WORKED WITH CASETEXT

    ON THE BAR PAPER — SO PERHAPS WE ARE NOT 100% NEUTRAL … SHANG GAO PABLO ARREDONDO
  84. BUT OF COURSE THESE WILL NOT BE THE LAST TOOLS

    AND IT REMAINS TO BE SEEN HOW IT WILL ALL SHAKE OUT …
  85. WHAT IS THE DATA AND TECHNOLOGY STRATEGY THAT ORGANIZATIONS CAN

    EMPLOY IN LIGHT OF THESE CHANGES IN THE TECHNOLOGY LANDSCAPE ?
  86. BUILD VS ASSEMBLE VS BUY BUILD IS LIKELY OFF THE

    TABLE FOR THE SHORT TO MEDIUM TERM ASSEMBLE HOWEVER IS A VIABLE OPTION SO CAREFUL PROCUREMENT IS GOING TO BE THE STRATEGIC PATH
  87. DATA STRATEGY HOW TO COLLECT / REGULARIZE DATA FOR USE

    INSIDE THESE SYSTEMS OR AS A LAYER ON TOP OF THESE SYSTEMS
  88. SEVERAL OTHER STRATEGIC CONSIDERATIONS WHAT IS THE ROLE OF DOMAIN

    SPECIFIC TRAINING? HOW COULD I CUSTOMIZE THESE MODELS FOR USE WITHIN MY OWN ORGANIZATION ? WHICH MODELS / LLMS DO I LEVERAGE ? HOW DO I CHOOSE ? CAN I MIX AND MATCH ? WHAT IS THE DIFFERENTIAL IMPACT OF THESE MODELS BY LEGAL ORGANIZATION ? HOW DO I THINK ABOUT QUESTIONS OF PRIVACY AND INFORMATION SECURITY ? COPYRIGHT?
  89. LOOK TO PLACES WHERE THERE IS A CONCENTRATION OF LOW

    TO MEDIUM COMPLEXITY TASKS MACRO LEVEL ALSP, LPO, LEGAL OPS, SPECIALIZED LAW FIRMS, ETC.
  90. TRAINING FOLKS TO USE THESE TOOLS TO THE MAXIMUM EXTENT

    POSSIBLE HUMAN CAPITAL (THIS WILL BE AN IMPORTANT LIMITATION ON ORGANIZATIONAL SUCCESS)
  91. SHOULD / CAN I PUT CONFIDENTIAL CLIENT DATA IN THESE

    SYSTEMS ? MANY FOLKS HAVE ASKED ME —
  92. PROCEED WITH CAUTION — I WOULD TELL YOU TO AUDIT

    PRECISELY WHAT IS HAPPENING AND WHERE IT IS HAPPENING SHOULD / CAN I PUT CONFIDENTIAL CLIENT DATA IN THESE SYSTEMS ? (CLIENT CONSENT + AUDITED SAFEGUARDS) MANY FOLKS HAVE ASKED ME —
  93. THE ROGUE ASSOCIATE (OR PARTNER) PASTING STUFF INTO GPT OR

    ANOTHER LLM MODEL ONE VERSION OF THE CONCERN
  94. WE HAVE BEEN WORKING WITH A FEW FIRMS ON HOW

    TO NAVIGATE THE THORNY INFO SEC AND LEGAL ETHICS ISSUES
  95. SOME VENDOR QUESTIONS

    IS THE SOLUTION AVAILABLE ON PREM ? IF NOT, DOES VENDOR HAVE SOC2 / ISO 27001 ? HOW MANY OTHER PARTIES ARE IN THE FLOW OF YOUR INFORMATION ? CONTRACTUAL REPRESENTATIONS FROM THOSE 3RD PARTIES ? WHAT CONTROLS DOES THE VENDOR APPLY TO THE TRANSITING OF YOUR DATA ?
  96. LEVERAGING THESE TOOLS ON INTERNAL DATA IS REALLY THE HOLY

    GRAIL …. INTERNAL DATA WE ARE GOING TO GET THERE BUT IT REQUIRES A REAL THOUGHT OUT APPROACH HERE
  97. WHAT IS GOING TO HAPPEN WITH COPYRIGHT ? I THINK

    THIS IS GOING TO LEAD TO A MAJOR DECISION IN COPYRIGHT AND/OR ACTION BY CONGRESS, ETC. MANY FOLKS HAVE ASKED ME —
  98. WILL A GENERAL MODEL BEAT A DOMAIN SPECIFIC MODEL ?

    SURE THAT IS POSSIBLE / LIKELY — IF THE GENERAL MODEL IS LARGE AND DOMAIN MODEL IS SMALL
  99. SOURCES OF LLM MODEL IMPROVEMENT USE DIFFERENT TRAINING DATA NEURAL

    NET ACCESS / IMPLEMENTATION INSTRUCTION MODULE FINE TUNING RETRIEVAL AUGMENTATION (RAG) PROMPT ENGINEERING CHAIN OF THOUGHT PROMPTING RLHF AGENT KNOWLEDGE GRAPH TRAVERSAL MODEL CREATOR MODEL USER (REINFORCEMENT LEARNING FROM HUMAN FEEDBACK)
  100. SO WILL THERE BE A ‘LAW GPT’? PROBABLY SOON IT

    IS AN OPEN QUESTION WILL IT BE BETTER THAN GPT-4 ON LAW TASKS?
  101. WHICH MODELS / LLMS DO I LEVERAGE ? HOW DO

    I CHOOSE ? CAN I MIX AND MATCH ? (5)
  102. SO WE HAVE SEEN QUITE A BIT OF THIS SINCE

    THE LAUNCH OF CHATGPT NOW
  103. THIS IS NOT INHERENTLY BAD BUT I THINK ORGANIZATIONS NEED

    TO THINK ABOUT THEIR PRECISE STRATEGY HERE NOW
  104. SOURCES OF LLM MODEL IMPROVEMENT USE DIFFERENT TRAINING DATA NEURAL

    NET ACCESS / IMPLEMENTATION (ALPACA, ETC.) INSTRUCTION MODULE FINE TUNING RETRIEVAL AUGMENTATION (RAG) PROMPT ENGINEERING CHAIN OF THOUGHT PROMPTING (LANGCHAIN, ETC.) RLHF (E.G. BLOOMBERG GPT) (DOLLY, ETC.) (PLUGINS / UI / GROUNDING) MODEL CREATOR MODEL USER (VARIOUS FORMS) (VARIOUS FORMS) (REINFORCEMENT LEARNING FROM HUMAN FEEDBACK) (AGENTS BUILD / TRAVERSE GRAPHS) (MANY APPROACHES) AGENT KNOWLEDGE GRAPH TRAVERSAL
  105. SO WHAT IS THE VENDOR VALUE ADD OVER BASE LLM

    MODEL? TRUST BUT VERIFY THERE IS GOING TO BE SIGNIFICANT PRESSURE ON VENDORS TO OVERSTATE THE ACTUAL CONTRIBUTION OF THEIR OFFERING ABOVE JUST AN LLM API CALL I SUGGEST YOU TRUST BUT VERIFY ALL CLAIMS
  106. CAN YOU DETERMINE WHETHER A VENDOR IS SIMPLY SELLING YOU

    A UI/ UX WRAPPER ON TOP OF AN LLM ? AGAIN THIS MIGHT BE OKAY BUT DO NOT PAY AN EXCESSIVE PREMIUM …
  107. IT WOULD BE A BIG MISTAKE TO GO ALL IN

    ON ANY VENDOR OR SPECIFIC LLM AT THIS POINT PLACE A SMALL TO MEDIUM BET
  108. IT IS A HIGHLY COMPETITIVE LANDSCAPE AND OTHER FOLKS WILL

    BE ENTERING THE MARKET WITH ADDITIONAL OFFERINGS
  109. THERE ARE OTHER PLAYERS BESIDES OPENAI AND THEY WILL LIKELY

    PURSUE OTHER STEPS / MOVING PARTS … AND MANY MORE …
  110. REMEMBER THAT GOOGLE ACTUALLY INVENTED MUCH OF THE TECH THAT

    BROUGHT YOU GPT https://www.forbes.com/sites/richardnieva/2023/02/08/google-openai-chatgpt-microsoft-bing-ai/ https://www.nytimes.com/2023/01/20/technology/google-chatgpt-artificial-intelligence.html
  111. CAN YOU BOTH RAPIDLY AND RIGOROUSLY EVALUATE THE QUALITY OF

    MODELS OR FIGURE OUT HOW TO MIX AND MATCH THEM … ? (ENSEMBLE LEARNING ANYONE?)
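As a rough illustration of the mix-and-match question, the simplest ensemble is a majority vote over the answers several models give to the same multiple-choice question. The function name `majority_vote` and the example answers are hypothetical, and a real evaluation would also need to track each model's accuracy on a held-out question set.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


# Three hypothetical models answer the same multiple-choice question.
model_answers = ["B", "A", "B"]
print(majority_vote(model_answers))
```

Voting only helps when the models make different mistakes, so measuring how correlated their errors are is part of the rapid and rigorous evaluation the slide asks about.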
  112. NOTE THIS IS A CHATGPT PAPER NOT EVEN CONSIDERING POTENTIAL

    ADDITIONAL GAINS IN GPT-4 https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4375283 “TIME TAKEN DECREASES BY 0.8 SDS AND OUTPUT QUALITY RISES BY 0.4 SDS.”
  113. OVER 50% PRODUCTIVITY GAINS FROM GPT-3.5 VERSION OF COPILOT (USED

    FOR PROGRAMMING) https://arxiv.org/pdf/2302.06590 (FEBRUARY 13 2023) COPILOT X RELEASED MARCH 23RD 2023 https://github.blog/2023-03-22-github-copilot-x-the-ai-powered-developer-experience/
  114. https://arxiv.org/abs/2303.10130 “OUR FINDINGS REVEAL THAT AROUND 80% OF THE U.S.

    WORKFORCE COULD HAVE AT LEAST 10% OF THEIR WORK TASKS AFFECTED BY THE INTRODUCTION OF LLMS, WHILE APPROXIMATELY 19% OF WORKERS MAY SEE AT LEAST 50% OF THEIR TASKS IMPACTED. WE DO NOT MAKE PREDICTIONS ABOUT THE DEVELOPMENT OR ADOPTION TIMELINE OF SUCH LLMS”
  115. TASKS VS JOBS WHAT IS BEING MADE POSSIBLE? WE RACE

    WITH THE MACHINE LABOR MARKET IMPLICATIONS (IT IS COMPLICATED) (IN A WORLD WHERE TEXT GENERATION IS FAR CHEAPER)
  116. I THINK THIS IS GOING TO BE THE MOST EXCITING

    YEAR IN TECHNOLOGY IN A VERY LONG TIME !
  117. GENERATIVE A.I. + LAW

    @computational professor daniel martin katz danielmartinkatz.com Illinois tech - chicago kent law 273Ventures.com BACKGROUND, APPLICATIONS AND USE CASES INCLUDING GPT-4 PASSES THE BAR EXAM https://bit.ly/3yxhY2e