Writing is important
time communicating in writing
  – With collaborators, the general public, future you
  – About data cleaning, analyses, results
  – In formal reports, brief summaries, replies to questions
• Time to get good
Tools
tools that combine your code and text
• Greatly facilitates reproducibility, which is a big concept
  – In short, someone you don’t know or work with should be able to reproduce each step of your analysis
  – As a part of this, they should understand why you did what you did
  – (Again, this someone is often future you)
• We’ll use R Markdown to write reproducible reports
General tips
  – How many details do they want / need?
• Say exactly what you did
  – Don't leave anything important out
  – Not the same as a step-by-step list of what you typed into R
General structure
File names
  – Summary statistics
  – Exploratory analysis
  – Formal analysis
• Results
• Discussion
• Some version of these exists in almost everything I write
• Sometimes these are long, sometimes they’re a sentence
Data
Reorganizing into usable form
  – Identifying missing values
  – Recoding and creating variables
• Summary statistics
  – Sample size
  – Means or proportions of major variables
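The data steps listed above (identifying missing values, recoding and creating variables, then reporting sample size and means or proportions) might look roughly like this in base R. This is only a sketch: the built-in `airquality` data stands in for your own dataset, and the variable names `month` and `hot_day` and the 85-degree cutoff are hypothetical choices made for illustration.

```r
# Built-in example data; it stands in for your own dataset (assumption)
data(airquality)
aq <- airquality

# Identify missing values: number of NAs in each variable
missing_counts <- colSums(is.na(aq))

# Recode and create variables: month as a labelled factor,
# plus an indicator for hot days (85 F is an arbitrary illustrative cutoff)
aq$month <- factor(aq$Month, levels = 5:9,
                   labels = c("May", "Jun", "Jul", "Aug", "Sep"))
aq$hot_day <- aq$Temp >= 85

# Summary statistics: sample size, means, and a proportion
n_obs <- nrow(aq)
mean_ozone <- mean(aq$Ozone, na.rm = TRUE)  # Ozone has missing values
mean_temp <- mean(aq$Temp)
prop_hot <- mean(aq$hot_day)                # proportion of TRUE values

print(missing_counts)
print(c(n = n_obs, mean_ozone = mean_ozone,
        mean_temp = mean_temp, prop_hot = prop_hot))
```

In a written report these numbers would go into a sentence or a small table, along with a note on how the missing values were handled, rather than being shown as raw console output.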
Results
missing values? data distributions? notable features?)
• What happened in your modeling?
• What is your final model, and what are the important quantities?
Discussion
you hoped to answer?
• What were the limitations of your data or your analysis?
• What open questions remain? Are any of these solvable with the current data?
• What are your next steps?
R Markdown?
can be easily converted to HTML or another format (PDF, Word)
• R Markdown lets you combine formatted text with code chunks and the results of those chunks
• Having text and code in the same place, and having the combined output be user-friendly, is huge for your workflow
R for Data Science
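To make the idea of combining formatted text with code chunks concrete, here is a small sketch that writes a minimal R Markdown file from R and knits it to HTML if the tools are available. The file name `example_report.Rmd`, the chunk label `speed-summary`, and the use of the built-in `cars` data are all hypothetical choices for illustration. In practice you would normally write the `.Rmd` file directly in an editor such as RStudio.

```r
# Three backticks mark the start and end of a code chunk in R Markdown
fence <- strrep("`", 3)

# Lines of a minimal R Markdown file: YAML header, formatted text,
# one inline R expression, and one code chunk
rmd_lines <- c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "## Data",
  "",
  "The built-in `cars` data has `r nrow(cars)` observations.",
  "",
  paste0(fence, "{r speed-summary}"),
  "summary(cars$speed)",
  fence
)

# Write the file to a temporary directory (hypothetical file name)
rmd_path <- file.path(tempdir(), "example_report.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit to HTML only if the rmarkdown package and pandoc are installed
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

The knitted output shows the text, the chunk's code, and the chunk's results together in one document. Changing `output: html_document` to `pdf_document` or `word_document` would request one of the other formats mentioned above, provided the extra software those formats need is installed.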