limited opportunity to verify the results by direct observation. Users of the analysis have no option but to trust the analysis, and by extension the software that produced it. Both the data analyst and the software provider therefore have a strong responsibility to produce a result that is trustworthy, and, if possible, one that can be shown to be trustworthy. This obligation I label the Prime Directive.
Transparency: The methods of the analysis can be inspected ✅ Reproducibility: The analysis can be repeated on the same data, hopefully producing the same results
LLMs are infamous for giving convincing but wrong answers ❌ Transparency: Nobody understands (yet) how/why LLMs do what they do ❌ Reproducibility: LLMs are nondeterministic black boxes
rmly inside the LLM’s capability frontier • Augment the LLM with (safe, deterministic) tools to increase its usefulness • Instruct the LLM to stick to the prescribed task • Resist the urge to feature-creep to the edge of the capability frontier • Example: LLM -> SQL -> Dashboard
it quite well • Transparency: Every SQL query is displayed to user • Reproducibility: The SQL is reproducible The “SQL chatbot applied to data dashboard” approach worked so well, we introduced an open-source package querychat to let anyone recreate the experience with their own data and visualizations
Outcomes that are pretty obviously right or wrong (or subjective) • Human micromanages the AI so closely that mistakes are all but guaranteed to be caught • Example: Plot tweaking tool
fewer mistakes than a human does when fumbling through a visualization; mistakes are usually easy to catch • Transparency: The user is directing, and can see the R code at all times • Reproducibility: The R code is generally reproducible Somehow feels far lower stakes. Helps that a lot of aspects of data viz are subjective.
the AI, but with a looser hand • Be aware of what it’s doing and why, but don’t closely scrutinize its work for errors and hallucinations • Enjoy fast progress/exploration, while piling up “review debt” • Before “shipping” your work, stop and carefully review • Akin to working on a git branch and getting a code review before merging • Example: Databot
take the time to review) and expertise (to spot problems in the analysis); or, rapidly improving models • Transparency: There’s R code, but it goes by pretty fast • Reproducibility: Databot will generate a reproducible report for you on demand High risk of misuse. But so incredibly useful...
{ellmer}: Easily call LLMs from R • {querychat}: Enhance Shiny data dashboards with LLMs that speak SQL • Databot: Exploratory Data Analysis agent for Positron