Why Generative AI makes collaborative, versioned science more important than ever

Arfon Smith / 17 December 2024 / Schmidt Sciences

Software as a creditable research activity

Plan for today
1. Defining Collaborative, Versioned Science: what it is, what its core qualities are, why it matters.
2. Generative AI is here: current capabilities, what’s readily possible, where some challenges lie, where we might be heading.
3. Making predictions, anticipating challenges: why Collaborative, Versioned Science might be the answer.

Collaborative, Versioned Science

Open Source: Right to modify, not to contribute
Open source refers to material (often software) released under terms that allow it to be freely shared, used, and modified by anyone.
Open source projects often, though not always, also have a highly collaborative development process and are receptive to contributions of code, documentation, discussion, etc. from anyone who shows competent interest.

Collaborative Open Source: the “open source” way of working
Modular, composable, reusable: Well structured open code, with clear rules around reuse.
Transparency and inclusivity: Clear rules around governance, how decisions are made, roadmaps and project goals.
Process automation: Automation around testing, communications, and other key activities.
Documented: How to use, how to contribute, guided tutorials, all in electronic form.
High-quality review processes: Code review, process updates, testing procedures.

Collaborative, Versioned Science
Modular, composable, reusable: Well-structured, interoperable research methods and data (protocols, datasets, and pipelines). Open frameworks to accelerate time to scientific value.
Transparency and inclusivity: Open governance and community-driven science. Transparent decision making, sharing of findings, assigning of credit. Diverse contributors (academic, industry, citizen science).
Process automation: Automated reproducibility and validation. Automated workflows for data analysis, hypothesis testing, updating research outputs, and ensuring accuracy.
Documented: Comprehensive and accessible scientific knowledge. Open protocols, how-to guides, and tutorials for reproducing experiments. Step-by-step instructions simplify access.
High-quality review processes: Rigorous peer review and testing. Community-led, version-controlled review systems ensure that research is thoroughly tested, validated, and reproducible.

Transparent, documented decisions (1/3)

Modular, reusable code

High-quality review processes

Process automation

Collaborative, reproducible computational workflows

Platform for scalable climate and geoscience analyses

Collaborative catalyst research and evaluation

Transparent, open review process (2/4)

Process automation (3/4)

Process automation (4/4)

“reproducible by necessity”
Fernando Perez (IPython, Jupyter, Berkeley)
https://web.archive.org/web/20140214000007/http://blog.fperez.org/

Generative AI is here

Some working assumptions
Based on the last ~2.5 years actively building with them.
The people around you are likely using it: Whether it’s ChatGPT or GitHub Copilot, these are technologies people are using.
There is a “there” there: Generative AI can be genuinely useful when applied to the right problems.
“Moore’s law for LLMs” will continue to hold: Models will likely become more capable, costs will reduce, more will be possible for less.
People are trying all sorts of crazy things: Just check your favourite tech news site.
As influencers of the future, we should have opinions: And I’m sharing mine with you today.

Quick survey of the room
With a show of hands…
Using generative AI in their daily work? e.g., GitHub Copilot (or other code tool), ChatGPT, Claude, something else.
Building a system that incorporates LLMs? Building a net-new piece of infrastructure or building something new?
Exploring capabilities as part of their work? e.g., evaluating models for existing or future workloads?
Living with the consequences of LLMs in their work? e.g., a collaborator using generative AI tools.
Building their own base model? Advanced mode…

What are they capable of?
Current capabilities being leveraged widely.
Natural language processing tasks: Text classification, sentiment analysis, named entity recognition, intent detection.
Synthesizing information: Especially when combined with techniques like Retrieval Augmented Generation (RAG).
Personalization (ELI5, ELIM, ELIDKCVW*): Customized responses based on individual preferences/background/knowledge.
Fine tuning for domain-specific tasks or behaviours: Conversational agents, code generation, tool calling.
Exploring topics, generating ideas: With some caveats, can be excellent tools for brainstorming and learning.
* Explain Like I Don’t Know C Very Well

What are some challenges?
Hallucinations: They are always hallucinating, it’s just sometimes they are useful.
Biases, safety, security: Reflecting, and sometimes amplifying, biases present in training data or fine tuning.
Inconsistency, non-determinism, and evaluation: Different answers for the same questions due to random sampling and model state.
Structured outputs: Although many models have now been trained for this specifically.
Unjustified certainty in outputs: Generate detailed responses without any sense of reality.
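The non-determinism point can be illustrated with a minimal sketch of temperature sampling. This is a toy model of how a decoder picks the next token, not any vendor's implementation; the `LOGITS` scores and the `answers` helper are hypothetical and exist only for illustration.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick one token from a dict of {token: logit} scores."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token.
        return max(logits, key=logits.get)
    # Softmax with temperature; subtract the max for numerical stability.
    top = max(logits.values())
    weights = [math.exp((v - top) / temperature) for v in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

# Hypothetical next-token scores for one fixed prompt (an assumption).
LOGITS = {"yes": 2.0, "no": 1.5, "maybe": 1.0}

def answers(temperature, seed, n=20):
    """Ask the 'same question' n times with a seeded random generator."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(n)]

print(answers(0, seed=1)[:5])    # greedy: the same token every time
print(answers(1.0, seed=1)[:5])  # sampled: answers can differ run to run
```

Fixing the seed (or decoding greedily) makes this toy reproducible, which is one reason pinned seeds and recorded sampling settings matter when evaluating real systems.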

Generative AI is here: Meta-level capabilities
Amplifying human cognition: Summarizing large amounts of content (e.g., through retrieval-augmented generation), providing a mechanism for rapid ideation through conversational chatbots, retrieving information from a broad search space.
Accelerating work through assistance: Summarizing content into more digestible forms, generating new representations of information. Supporting analysts generating SQL queries to support business reporting.
Extending creativity and adapting to individual needs: Content generation (e.g., creative writing, event planning), exploration of new topics, and personalization of outputs specific to the individual needs or preferences of the user.

Generative AI is here: More specifically (e.g., software engineering)
Amplifying human cognition: Generating new code based on training. Explaining existing code, explaining the call chain, retrieving relevant context across the entire software development lifecycle, and summarising.
Accelerating work through assistance: Authoring new code, generating documentation, providing a first code review based on business rules that would otherwise have occupied human time.
Extending creativity and adapting to individual needs: Translating, content generation (e.g., creative writing, event planning), exploration of new topics, and personalization of outputs specific to the individual needs or preferences of the user.

Adoption and perception of AI tools
Significant adoption in many fields: 200M+ ChatGPT users, 1M+ GitHub Copilot users.
76% are using or are planning to use AI tools: Up from 70% last year.
81% cite increased productivity as biggest benefit: Novice developers cite accelerating learning as biggest benefit.
Trust in output/accuracy of tools mixed: 43% feel positive, 31% skeptical. Learners trust more than experienced developers.
70% of developers do not perceive AI as a job threat: The share who *do* is marginally higher for learners (15% vs 12%).
Source: Stack Overflow

Significant value in many fields
Models, architectures, tools, and platforms maturing

Generative AI is here: Models are getting more and more capable
~5% month over month improvements in last year (on new benchmarks)
SWE-Bench Verified
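As a rough worked example of what that rate implies, assuming the ~5% figure compounds multiplicatively each month (an interpretation, not something the slide states), the arithmetic looks like this:

```python
import math

def compound(monthly_rate, months):
    """Total growth factor after compounding a monthly rate."""
    return (1 + monthly_rate) ** months

annual = compound(0.05, 12)
doubling_months = math.log(2) / math.log(1.05)

print(f"{annual:.2f}x per year")                        # 1.80x per year
print(f"doubles in about {doubling_months:.1f} months")  # about 14.2 months
```

Under that assumption, a steady 5% monthly gain is roughly an 80% relative gain per year. If the slide's figure is instead an additive change in benchmark score, this calculation does not apply.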

Generative AI is here: We’re learning how to use them more effectively
Reflecting on a task with the LLM: Iterative generation and evaluation of outputs with feedback via self-critiques and/or new information (e.g., running tests and inspecting outputs).
Tool use / function calling: Models fine-tuned to use tools, extracting parameters from interactions and passing them to tools to augment responses (e.g., search the web, execute code, detect objects).
Planning and reasoning: Ask the model to build a plan/sequence of actions to take to solve the incoming request/task before executing on them.
Multi-agent collaboration: Prompt model(s) to take on different roles. LLM operates with a degree of autonomy to complete a task, often managing its own internal processes and sub-tasks to achieve a goal.
https://www.promptingguide.ai/research/llm-agents
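The reflection pattern (generate, run tests, feed failures back) can be sketched in a few lines. This is a minimal illustration with a hard-coded stand-in for the model: `stub_model`, `run_tests`, and `reflect_loop` are hypothetical names, and no real LLM API is called.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[int], int]

def run_tests(candidate: Candidate) -> List[str]:
    """Return failure messages; an empty list means every test passed."""
    failures = []
    for x, expected in [(0, 0), (2, 4), (-3, 9)]:
        got = candidate(x)
        if got != expected:
            failures.append(f"f({x}) returned {got}, expected {expected}")
    return failures

def stub_model(feedback: List[str]) -> Candidate:
    """Stand-in for an LLM call: revises its draft once it sees feedback."""
    if not feedback:
        return lambda x: x * 2  # first draft: passes two tests, fails f(-3)
    return lambda x: x * x      # revised draft after reading the failures

def reflect_loop(model, max_rounds: int = 3) -> Tuple[Optional[Candidate], int]:
    """Generate, evaluate, and feed failures back until tests pass."""
    feedback: List[str] = []
    for round_number in range(1, max_rounds + 1):
        candidate = model(feedback)
        feedback = run_tests(candidate)
        if not feedback:
            return candidate, round_number
    return None, max_rounds

solution, rounds = reflect_loop(stub_model)
print(rounds)  # 2: the first draft fails a test, the revision passes
```

The external signal (test output) is what makes the loop useful here; a real system would replace `stub_model` with a model call and would need a budget on rounds, as `max_rounds` does.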

Generative AI is here: Plethora of platforms, tools, and technologies
https://www.sequoiacap.com/article/llm-stack-perspective/

Generative AI (in science) is here

Generic tools are already useful
Significant value with ‘off the shelf’ tools for many research activities.
Amplifying human cognition: Summarizing large amounts of content (e.g., through retrieval-augmented generation), providing a mechanism for rapid ideation through conversational chatbots, retrieving information from a broad search space.
Accelerating work through assistance: Summarizing content into more digestible forms, generating new representations of information. Supporting analysts generating SQL queries to support business reporting.
Extending creativity and adapting to individual needs: Content generation (e.g., creative writing, event planning), exploration of new topics, and personalization of outputs specific to the individual needs or preferences of the user.

Domain-level models*
AstroLLaMA: Towards Specialized Foundation Models in Astronomy. arXiv:2309.06126
Fine-tuned LLaMA-2 variant: Tuned on 300,000+ abstracts on the arXiv.
Improved context-awareness over LLaMA-2 + GPT4: ‘Completions’ of abstracts show a deep(er) understanding of astronomical concepts.
Higher-fidelity embeddings: Capable of facilitating better document retrieval and semantic analysis.
* Typically fine-tuned

Domain-level specializations
Scientific Large Language Models: A Survey on Biological & Chemical Domains. arXiv:2401.14656

Generative AI (in science) is here: Scientific process
Inquiring, Hypothesize, Experiment, Evaluate

Generative AI (in science) is here: Inquiring
Knowledge sourced via retrieval or model training: Model knowledge derived from training, retrieved knowledge derived from various levels of retrieval (RAG) complexity.
Retrieval can be as simple as web search: See ChatGPT and Bing as examples. User question triggers custom web searches.
Sophisticated agents can exceed human abilities: e.g., PaperQA2 from FutureHouse exceeds subject matter experts on realistic literature research tasks.
Relatively low cost ($1-3 per query): Prompt model(s) to take on different roles. LLM operates with a degree of autonomy to complete a task, often managing its own internal processes and sub-tasks to achieve a goal.

Language Agents Achieve Superhuman Synthesis of Scientific Knowledge
Michael D. Skarlinski (1), Sam Cox (1,2), Jon M. Laurent (1), James D. Braza (1), Michaela Hinks (1), Michael J. Hammerling (1), Manvitha Ponnapati (1), Samuel G. Rodriques (1,3,*), Andrew D. White (1,2,*)
(1) FutureHouse Inc., San Francisco, CA; (2) University of Rochester, Rochester, NY; (3) Francis Crick Institute, London, UK. * These authors jointly supervise technical work at FutureHouse. Correspondence to: {sam,andrew}@futurehouse.org
arXiv:2409.13740v2 [cs.CL] 26 Sep 2024

Abstract: Language models are known to “hallucinate” incorrect information, and it is unclear if they are sufficiently accurate and reliable for use in scientific research. We developed a rigorous human-AI comparison methodology to evaluate language model agents on real-world literature search tasks covering information retrieval, summarization, and contradiction detection tasks. We show that PaperQA2, a frontier language model agent optimized for improved factuality, matches or exceeds subject matter expert performance on three realistic literature research tasks without any restrictions on humans (i.e., full access to internet, search tools, and time). PaperQA2 writes cited, Wikipedia-style summaries of scientific topics that are significantly more accurate than existing, human-written Wikipedia articles. We also introduce a hard benchmark for scientific literature research called LitQA2 that guided design of PaperQA2, leading to it exceeding human performance. Finally, we apply PaperQA2 to identify contradictions within the scientific literature, an important scientific task that is challenging for humans. PaperQA2 identifies 2.34 ± 1.99 (mean ± SD, N = 93 papers) contradictions per paper in a random subset of biology papers, of which 70% are validated by human experts. These results demonstrate that language model agents are now capable of exceeding domain experts across meaningful tasks on scientific literature.

1 Introduction: Large language models (LLMs) have the potential to assist scientists with retrieving, synthesizing, and summarizing the literature [1,2,3], but still have several limitations for use in research tasks. Firstly, factuality is essential in scientific research, and LLMs hallucinate [4], confidently stating information that is not grounded in any existing source or evidence. Secondly, science requires extreme attention to detail, and LLMs can overlook or misuse details when faced with challenging reasoning problems [5]. Finally, benchmarks for retrieval and reasoning across the scientific literature today are underdeveloped. They do not consider the entire literature, but instead are restricted to abstracts [6], retrieval on a fixed corpus [7], or simply provide the relevant paper directly [8]. These benchmarks are not suitable as performance proxies for real scientific research tasks, and, more importantly, often lack a direct comparison to human performance. Thus, it remains unclear whether language models and agents are suitable for use in scientific research. We therefore set out to develop a rigorous comparison between the performance of AI systems and humans on three real-world tasks: a retrieval task involving searching the entire literature to answer questions; a summarization task involving producing cited, Wikipedia-style articles on scientific topics; and a contradiction-detection task, involving extracting all claims from papers and checking them for contradictions against all of literature. This is, to our knowledge, …
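The retrieval step behind RAG can be sketched as "rank documents against the question, then put the best ones in the prompt". The example below is a toy illustration only: it scores by word overlap, whereas production systems (PaperQA2 included) use far more sophisticated retrieval, and `CORPUS`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, document):
    """Count query-word occurrences that also appear in the document."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    return sum(min(q[w], d[w]) for w in q)

def retrieve(query, corpus, k=2):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus, k=2):
    """Assemble a prompt that asks the model to answer from numbered sources."""
    sources = retrieve(query, corpus, k)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(sources))
    return f"Answer using only the numbered sources.\n{context}\nQuestion: {query}"

# Hypothetical three-document corpus (an assumption for illustration).
CORPUS = [
    "Galaxy rotation curves stay flat at large radii.",
    "Transformers use attention to weigh tokens.",
    "Dark matter halos can explain flat rotation curves.",
]

QUESTION = "Why are galaxy rotation curves flat?"
print(build_prompt(QUESTION, CORPUS))
```

Numbering the sources in the prompt is what lets the generated answer carry citations that a human can check.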

Generative AI (in science) is here: Hypothesis forming
Leverage relatively simple RAG architecture: Retrieving information from a subset of the NASA ADS (journal index database).
Multiple GPT-4 instances working ‘adversarially’: Generation, critiquing, feedback moderation provided by separate models.
Promise of ‘lowering barriers to realizing value’: Authors suggest that a relatively simple ‘human in the loop’ architecture may allow many to realize significant value.

Harnessing the Power of Adversarial Prompting and Large Language Models for Robust Hypothesis Generation in Astronomy
Ioana Ciucă (*, 1, 2), Yuan-Sen Ting (*, 1, 2), Sandor Kruk (3), Kartheik Iyer (4)
* Equal contribution. (1) Research School of Astronomy & Astrophysics, Australian National University, Cotter Rd., Weston, ACT 2611, Australia; (2) School of Computing, Australian National University, Acton, ACT 2601, Australia; (3) European Space Astronomy Centre, European Space Agency, Villafranca del Castillo, Madrid 28692, Spain; (4) Columbia Astrophysics Laboratory, Columbia University, New York, NY 10027, USA. Correspondence to: Ioana Ciucă, Yuan-Sen Ting.
Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
arXiv:2306.11648v1 [astro-ph.IM] 20 Jun 2023

Abstract: This study investigates the application of Large Language Models (LLMs), specifically GPT-4, within Astronomy. We employ in-context prompting, supplying the model with up to 1000 papers from the NASA Astrophysics Data System, to explore the extent to which performance can be improved by immersing the model in domain-specific literature. Our findings point towards a substantial boost in hypothesis generation when using in-context prompting, a benefit that is further accentuated by adversarial prompting. We illustrate how adversarial prompting empowers GPT-4 to extract essential details from a vast knowledge base to produce meaningful hypotheses, signaling an innovative step towards employing LLMs for scientific research in Astronomy.

1. Introduction: Significant strides in Natural Language Processing (NLP) have been made possible through attention mechanisms and transformer architecture, leading to the development of Large Language Models (LLMs) such as GPT-4 (Vig, 2019; Brown et al., 2020; Ouyang et al., 2022). These models exhibit extraordinary aptitude in understanding, generating, and interacting with human language. They go beyond deciphering complex linguistic patterns to making non-trivial deductions and forming relationships across diverse contexts (e.g., Devlin et al., 2018; Elkins & Chun, 2020).
Two intriguing facets of these models have stirred excitement for their potential that surpasses their initial intended applications. Firstly, despite LLMs’ propensity to sample posterior means of languages (a factor that can occasionally result in non-trivial hallucination problems), improved performance has been witnessed through in-context prompting (Wang et al., 2022; Wei et al., 2022; Zhang et al., 2022). This enhancement enables them to handle complex, domain-specific tasks (e.g., Radford & Narasimhan, 2018; Brown et al., 2020; Lu et al., 2022). Secondly, these models, when combined with revolutionary technologies like Langchain (https://python.langchain.com) to provide extensive context to the LLMs, expand their functionality across a wide range of fields.
While methods like the use of adapters (He et al., 2021; Karimi Mahabadi et al., 2021; Hu et al., 2021) can remarkably augment performance for domain-specific tasks through fine-tuning the LLMs, these approaches often prove challenging for institutions without sufficient resources. In this study, we delve into the application of low-cost in-context prompting (Chen et al., 2021; Xie et al., 2021) in the realm of astronomy.
Astronomy offers a compelling case study due to three key reasons. Firstly, although the field is rich in literature, the inclusion of such text in the vast corpus used to train GPT models is probably limited. This lack leads to noticeable hallucination problems when employing naive versions of LLMs (Ciucă et al., 2023). Secondly, unlike domains that focus more on intensive, detailed studies, advancements in astronomy often stem from “connecting the dots” across different subfields due to the universality of underlying physical processes at various scales. This feature fosters the hypothesis that extensive in-context prompting could significantly enhance hypothesis generation if LLMs are initially exposed to a broad range of literature.
Lastly, astronomy’s longstanding “open sky” policy makes it an ideal candidate for in-context prompting research. This policy ensures that most data sets are publicly available immediately or after a short proprietary period (Almeida et al., 2023; Fabricius et al., 2021). Further, the field possesses a comprehensive, well-curated literature database. The internet has enabled the archiving of astronomical knowledge, …

Generative AI (in science) is here: Experimentation
‘AI Scientist’ going from idea → ‘reviewed’ paper: Generates ideas, writes code, executes experiments, analyzes results, authors paper and simulates reviews. Costs about $15 per outputted paper.
Experiments currently limited to machine learning: Limited to domains where ‘experiment’ is code/numerical-based (diffusion modeling, learning dynamics, transformer-based language modeling).
Results can be interesting, but with many caveats: Creates interesting ideas but struggles with implementation, rigor, and accuracy due to limitations in computation, experimental depth, and current model capabilities*.
* “GPT-4o in particular frequently fails to write LaTeX that compiles.”

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2024-9-4)
Chris Lu (1,2,*), Cong Lu (3,4,*), Robert Tjarko Lange (1,*), Jakob Foerster (2,†), Jeff Clune (3,4,5,†) and David Ha (1,†)
* Equal Contribution, † Equal Advising. (1) Sakana AI, (2) FLAIR, University of Oxford, (3) University of British Columbia, (4) Vector Institute, (5) Canada CIFAR AI Chair. Corresponding author(s): Chris Lu, Cong Lu, and Robert Tjarko Lange.
arXiv:2408.06292v3 [cs.AI] 1 Sep 2024

Abstract: One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used as aides to human scientists, e.g. for brainstorming ideas, writing code, or prediction tasks, they still conduct only a small part of the scientific process. This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier large language models (LLMs) to perform research independently and communicate their findings. We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation. In principle, this process can be repeated to iteratively develop ideas in an open-ended fashion and add them to a growing archive of knowledge, acting like the human scientific community. We demonstrate the versatility of this approach by applying it to three distinct subfields of machine learning: diffusion modeling, transformer-based language modeling, and learning dynamics. Each idea is implemented and developed into a full paper at a meager cost of less than $15 per paper, illustrating the potential for our framework to democratize research and significantly accelerate scientific progress. To evaluate the generated papers, we design and validate an automated reviewer, which we show achieves near-human performance in evaluating paper scores. The AI Scientist can produce papers that exceed the acceptance threshold at a top machine learning conference as judged by our automated reviewer. This approach signifies the beginning of a new era in scientific discovery in machine learning: bringing the transformative benefits of AI agents to the entire research process of AI itself, and taking us closer to a world where endless affordable creativity and innovation can be unleashed on the world’s most challenging problems. Our code is open-sourced at https://github.com/SakanaAI/AI-Scientist.

1. Introduction: The modern scientific method (Chalmers, 2013; Dewey, 1910; Jevons, 1877) is arguably one of the greatest achievements of the Enlightenment. Traditionally, a human researcher collects background knowledge, drafts a set of plausible hypotheses to test, constructs an evaluation procedure, collects evidence for the different hypotheses, and finally assesses and communicates their findings. Afterward, the resulting manuscript undergoes peer review and subsequent iterations of refinement. This procedure has led to countless breakthroughs in science and technology, improving human quality of life. However, this iterative process is inherently limited by human researchers’ ingenuity, background knowledge, and finite time. Attempting to automate general scientific discovery (Langley, 1987, 2024; Waltz and Buchanan, 2009) has been a long ambition of the community since at least the early 70s, with computer-assisted works like the Automated Mathematician (Lenat, 1977; Lenat and Brown, 1984) and DENDRAL (Buchanan and Feigenbaum, 1981). In the field of AI, researchers have envisioned the possibility of automating AI research using AI itself (Ghahramani, 2015; Schmidhuber, 1991, 2010a,b, 2012), leading to “AI-generating algorithms” (Clune, 2019). More recently, foundation models have seen tremendous advances in their general capabilities (Anthropic, 2024; Google DeepMind Gemini Team, 2023; Llama Team, 2024; OpenAI, 2023), but they have only been shown to accelerate individual parts of the research pipeline, e.g. the writing of scientific manuscripts (Altmäe et al., 2023; …

Generative AI (in science) is here: Evaluation

Making predictions
What might the future look like for science?

Making predictions: (Lots) more science outputs
AI agents capable of ‘doing science’ will mature: Agents will be able to generate and critique hypotheses, source data to test them, generate results, and evaluate the potential value of the results. What we are seeing today is very, very early.
AI tools will enable more ‘results’ to be generated by more people: AI tools will make existing scientists more productive and enable more people to participate in the scientific process. This is already leading to 50% more code written by developers, but also no/low-code solutions (e.g., CHat Oriented Programming (CHOP): “coding via iterative prompt refinement”).
Cost of generating an output that resembles a novel result will trend to zero: The cost today of writing an AI-generated paper is around $15; over time this will trend towards zero. The cost to achieve a minimum MMLU score (e.g., ~40 / ~80) is decreasing 10x year over year.
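To make "trend to zero" concrete, here is a small worked projection. It assumes, purely for illustration, that the ~$15 per-paper cost falls at the same 10x-per-year rate the slide quotes for reaching a given MMLU score; the slide does not claim the two rates are equal, and the function names are hypothetical.

```python
def projected_cost(initial, annual_factor, years):
    """Cost after a number of years of constant multiplicative decline."""
    return initial / (annual_factor ** years)

def years_until_below(initial, annual_factor, threshold):
    """Whole years of decline needed before the cost drops under threshold."""
    years = 0
    cost = initial
    while cost >= threshold:
        cost /= annual_factor
        years += 1
    return years

for year in range(4):
    print(year, f"${projected_cost(15.0, 10, year):.3f}")

print(years_until_below(15.0, 10, 0.01))  # 4 years to fall below one cent
```

Under that assumed rate, a $15 paper costs about a cent and a half after three years, which is the sense in which review capacity, not generation cost, becomes the bottleneck.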

Making predictions: Possible consequences
As the pace of publishing accelerates, journals (in their current form) will break: An increasing fraction of scientific outputs will come from AI agents. Humans will not be able to keep up.
Traditional mechanisms of sharing (and evaluating) science are going to be insufficient: Either we’ll spend all of our time reviewing AI-generated outputs, or we’ll need to find a better way…
Papers will become an increasingly unsatisfactory way of sharing information: Papers are designed for humans to read, include lots of (arguably) unnecessary information, and aren’t machine readable.

How will we still do science?
By instilling and enshrining practices of Collaborative, Versioned Science.

CVS as our lodestar: Modular, composable, reusable
Standardize frameworks: Promote universal frameworks (e.g., FAIR principles for data) to ensure interoperability between AI tools, human researchers, and datasets.
Build AI model registries: Maintain version-controlled registries of AI models used in research. Each model version should link to specific studies to ensure reproducibility and transparency.
Invest in composable research pipelines: Lean on existing workflow technologies and place them at the heart of AI-enabled research code generation.
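One way to picture a model registry that links each model version to the studies that used it is the in-memory sketch below. Every name here (`ModelVersion`, `Registry`, the example model and study identifiers) is hypothetical; a real registry would persist to version-controlled storage rather than a Python dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: str
    weights_sha256: str  # checksum pins the exact artifact that was used

@dataclass
class Registry:
    _models: Dict[Tuple[str, str], ModelVersion] = field(default_factory=dict)
    _studies: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def register(self, model: ModelVersion) -> None:
        key = (model.name, model.version)
        if key in self._models:
            # Versions are immutable: changed weights need a new version.
            raise ValueError(f"{model.name} {model.version} already registered")
        self._models[key] = model
        self._studies[key] = []

    def link_study(self, name: str, version: str, study_id: str) -> None:
        # Raises KeyError if the model version was never registered.
        self._studies[(name, version)].append(study_id)

    def studies_for(self, name: str, version: str) -> List[str]:
        return list(self._studies[(name, version)])

registry = Registry()
registry.register(ModelVersion("astro-model", "1.0.0", "ab12" * 16))
registry.link_study("astro-model", "1.0.0", "study-001")
print(registry.studies_for("astro-model", "1.0.0"))
```

Refusing to overwrite an existing version is the design choice that matters: a study that cites version 1.0.0 should always resolve to the same weights.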

CVS as our lodestar: Transparency and inclusivity
Diversify collaboration platforms: With results coming from more diverse participants, the venues in which they convene will need to evolve.
Automated credit attribution: Mechanisms for automated tracking and recording of contributions will become essential, for both transparency and fairness.
Open, peer-led governance: In order to seek out diverse opinions on the direction of AI-driven research studies.

CVS as our lodestar: Process automation
Automated reproduction checks: Automated validation pipelines that rerun experiments, check for statistical significance, and compare outputs across different environments.
Versioned outputs: Increased use of versioning technologies and AI agents to track changes in research outputs, maintaining the latest version of outputs.
Focus AI on ‘lower-order’ scientific work: Leverage AIs for repetitive tasks such as data cleaning, documenting protocols, literature reviews.
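An automated reproduction check can be as simple as "rerun with the recorded seed and compare against the published number within a tolerance". The sketch below uses a toy seeded computation as the experiment; `experiment`, `reproduces`, and the tolerance values are assumptions for illustration, not a recommended standard.

```python
import math
import random
import statistics

def experiment(seed, n=1000):
    """Toy 'experiment': the mean of n seeded pseudo-random draws."""
    rng = random.Random(seed)
    return statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))

def reproduces(published, rerun, rel_tol=1e-9):
    """True when a rerun matches the published value within tolerance."""
    return math.isclose(published, rerun, rel_tol=rel_tol, abs_tol=1e-12)

published_value = experiment(seed=1)  # what the paper reported
rerun_value = experiment(seed=1)      # what a CI job recomputes later

print(reproduces(published_value, rerun_value))         # same seed: matches
print(reproduces(published_value, experiment(seed=2)))  # different seed: does not
```

In practice the tolerance has to be chosen per result, since reruns on different hardware or library versions rarely match bit for bit.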

CVS as our lodestar: Documentation
Automated documentation generation: Leverage AI to generate comprehensive, standardized documentation of protocols, methods, and results, in multiple forms.
AI-generated research tutorials/onboarding: Use AI to generate custom tutorials for each study, thereby lowering the time taken for humans to onboard into a research domain.
Semantic versioning of experiments: Similar to software, adopt the SemVer concept in science, ensuring increased historical traceability.
Machine-readable experimental metadata: Document the whole research cycle as machine-readable metadata to enable AI-driven discovery and cross-disciplinary reuse.
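A minimal sketch of how SemVer and machine-readable metadata might combine for an experiment follows. The mapping from kinds of change to major/minor/patch bumps in `BUMP_RULES` is an assumption made for illustration (the talk does not define one), as are the metadata field names.

```python
import json

# Hypothetical mapping from a kind of change to a SemVer component.
BUMP_RULES = {
    "conclusion_changed": "major",  # results now support a different claim
    "analysis_added": "minor",      # new data or analysis, same conclusions
    "typo_fixed": "patch",          # no change to methods or results
}

def bump(version, change):
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    kind = BUMP_RULES[change]
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

def metadata(title, version, inputs):
    """Serialize experiment metadata as JSON with a stable key order."""
    record = {"title": title, "version": version, "inputs": inputs}
    return json.dumps(record, indent=2, sort_keys=True)

next_version = bump("1.2.3", "analysis_added")
print(metadata("Toy experiment", next_version, ["dataset-v4", "pipeline-v2"]))
```

Sorting keys keeps the serialized record stable between runs, so diffs in version control show only real changes.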

CVS as our lodestar: High-quality review process
AI-augmented peer review: AI assisting reviewers by flagging methodological errors, inconsistencies, or potential biases while preserving human judgment.
Automated reproducibility checks: Ahead of publication (or perhaps even submission), AI tooling automatically re-running experiments to validate results.
Community-enabled peer review: If outputs are versioned and incremental, perhaps review becomes this way too?

Thanks!
