LLMs in Text-Based Virtual Worlds
• Simulating Human-like Daily Activities with Desire-driven Autonomy
Reasoning
• MISR: Measuring Instrumental Self-Reasoning in Frontier Models
• RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
Training
• Training Agents with Weakly Supervised Feedback from Large Language Models
• MALT: Improving Reasoning with Multi-Agent LLM Training
• Personalized Multimodal Large Language Models: A Survey
Agent Framework
• Practical Considerations for Agentic LLM Systems
• Challenges in Human-Agent Communication
• Specifications: The missing link to making the development of LLM systems an engineering discipline
Agents
• Enhancing LLMs for Impression Generation in Radiology Reports through a Multi-Agent System
Digital Agent
• Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
• AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
• The BrowserGym Ecosystem for Web Agent Research
• PAFFA: Premeditated Actions For Fast Agents
• Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms
Data Agent
• DataLab: A Unified Platform for LLM-Powered Business Intelligence
• AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark
• Towards Agentic Schema Refinement
Programming through LLM Multi-Agent Collaboration
Embodied Agent
• Navigation World Models
• From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons
Multi Agent System
• GENMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration
• A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
• From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents
• LMAgent: A Large-scale Multimodal Agents Society for Multi-user Simulation
Agentic RAG
• Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models
• A Collaborative Multi-Agent Approach to Retrieval-Augmented Generation Across Diverse Data
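a Multi-Agent System
• Proposes "RadCouncil", a multi-agent system that supports the task of generating the impression from the findings in radiology reports
The impression summarizes the findings and is the key content clinicians use to quickly decide on a patient's diagnosis and treatment
1. Retrieval: retrieves similar past reports from a vector DB
2. Radiologist: generates an impression based on the findings
3. Reviewer: verifies the consistency and accuracy of the impression and proposes corrections
Agent workflow
Agentic AI Systems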
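The three-step workflow above (Retrieval, Radiologist, Reviewer) can be sketched as a small pipeline. This is only an illustrative sketch, not RadCouncil's implementation: the function names, the toy bag-of-words similarity standing in for a vector DB, the rule-based stubs standing in for LLM calls, and the sample reports are all assumptions made up for this example.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Report:
    findings: str
    impression: str


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (assumption); a real system would use a learned encoder.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(findings: str, store: list[Report], k: int = 1) -> list[Report]:
    # Step 1 (Retrieval): rank stored reports by similarity to the new findings.
    query = embed(findings)
    ranked = sorted(store, key=lambda r: cosine(query, embed(r.findings)), reverse=True)
    return ranked[:k]


def radiologist(findings: str, examples: list[Report]) -> str:
    # Step 2 (Radiologist): hypothetical stand-in for an LLM call;
    # it simply reuses the impression of the closest retrieved report.
    return examples[0].impression if examples else "No impression available."


def reviewer(findings: str, impression: str) -> tuple[bool, str]:
    # Step 3 (Reviewer): hypothetical stand-in check; approves the draft only
    # if it shares at least one term with the findings.
    shared = set(embed(findings)) & set(embed(impression))
    if shared:
        return True, "Impression is consistent with the findings."
    return False, "Revise: impression shares no terms with the findings."


def run_pipeline(findings: str, store: list[Report]) -> dict:
    examples = retrieve(findings, store)
    draft = radiologist(findings, examples)
    approved, note = reviewer(findings, draft)
    return {"impression": draft, "approved": approved, "review_note": note}


# Made-up sample reports, for illustration only.
EXAMPLE_STORE = [
    Report("Opacity in the right lower lobe with air bronchograms",
           "Right lower lobe pneumonia"),
    Report("No acute fracture of the left wrist", "Normal left wrist"),
]

result = run_pipeline("Patchy opacity in the right lower lobe", EXAMPLE_STORE)
print(result)
```

Keeping the three roles as separate functions mirrors the division of labor in the list: each stub could be replaced by a real retriever or model call without changing `run_pipeline`.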
Realizing AI agents
• Pydantic AI
• "Copilot Vision" preview released; working side by side with AI looks within reach
• OpenAI o1 System Card
• Introducing Gemini 2.0: our new AI model for the agentic era
• The next chapter of the Gemini era for developers
• Google announces Project Mariner: an AI agent that uses the web on the user's behalf
• Introducing Google Agentspace: Bringing AI agents and AI-powered search to enterprises
• Google announces a glasses-type device with a built-in AI agent, powered by Gemini 2.0
• Devin is generally available today!
Repositories
• awesome-llm-apps
AI Applications
• Magentic-One, AutoGen, LangGraph, CrewAI, or OpenAI Swarm: Which Multi-AI Agent Framework is Best?
• GenAIOps: Operationalize Generative AI - A Practical Guide
• From SaaS to Vertical AI Agents
• How to Build a General-Purpose LLM Agent
• Agentic AI: six promising use cases in business
• How to use AI for Prototyping as a PM
• What is AI Engineering?
• Outcome-based pricing for AI agents