Data Science gains new insights from business data. As software developers, why don't we use Data Science to analyze the data from our own software systems, too?
In this session, I will talk about approaches to mining software data that build on ideas from the Data Science field. We'll also look at the standard tools used in this area to analyze and communicate software development problems easily. With tools such as computational notebooks, data analysis frameworks, visualization, and machine learning libraries, we can make hidden issues visible in a data-driven way.
Attendees will learn how to leverage scientific thinking, manage the analysis process, and apply literate statistical programming to analyze software data in an understandable way.
The main part will be hands-on live coding with Open Source tools like Jupyter Notebook, Python, pandas, jQAssistant, and Neo4j. I'll show which new insights we can gain from data sources such as Git repositories, performance measurements, or the source code itself.
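
To give a flavor of this kind of analysis, here is a minimal sketch of loading a Git history into pandas and counting commits per author and per month. It is only an illustration under stated assumptions, not the code from the session: the sample log is made up, the pipe-separated layout assumes output from `git log --pretty=format:"%h|%an|%ad" --date=short`, and the function names `load_commits`, `commits_per_author`, and `commits_per_month` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample, shaped like the output of:
#   git log --pretty=format:"%h|%an|%ad" --date=short
# The commits and authors below are made up for illustration.
SAMPLE_LOG = """\
a1b2c3d|Alice|2023-01-03
b2c3d4e|Bob|2023-01-03
c3d4e5f|Alice|2023-01-10
d4e5f6a|Alice|2023-02-01
e5f6a7b|Carol|2023-02-14
"""


def load_commits(log_text: str) -> pd.DataFrame:
    """Parse pipe-separated git log text into a DataFrame."""
    return pd.read_csv(
        io.StringIO(log_text),
        sep="|",
        names=["sha", "author", "date"],
        dtype={"sha": str, "author": str},  # keep hashes as text
        parse_dates=["date"],
    )


def commits_per_author(commits: pd.DataFrame) -> pd.Series:
    """Count commits per author, most active first."""
    return commits["author"].value_counts()


def commits_per_month(commits: pd.DataFrame) -> pd.Series:
    """Count commits per calendar month."""
    return commits.groupby(commits["date"].dt.to_period("M")).size()


if __name__ == "__main__":
    commits = load_commits(SAMPLE_LOG)
    print(commits_per_author(commits))
    print(commits_per_month(commits))
```

In a real analysis, the sample string would be replaced by the log of an actual repository, and the resulting tables could be plotted directly in a notebook to make trends such as knowledge concentration or activity drops visible.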