When machine learning models make decisions that affect people's lives, how can you be sure those decisions are fair? When you build a machine learning product, how can you be sure it isn't biased? What does it even mean for an algorithm to be 'fair'? As machine learning becomes more prevalent in socially impactful domains like policing, lending, and education, these questions take on a new urgency.
In this talk I'll introduce several common metrics that measure the fairness of model predictions. I'll then relate these metrics to different notions of fairness and show how the context in which a model or product is used determines which metrics (if any) are applicable. To illustrate this context-dependence, I'll describe a case study based on anonymized real-world data. I'll also highlight some open source tools in the Python ecosystem that address model fairness. Finally, I'll argue that if your job involves building these kinds of models or products, then it is your responsibility to think about the answers to these questions.
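As a taste of the kind of metric the talk covers, here is a minimal sketch of one common fairness measure, the demographic parity difference (the gap in positive-prediction rates between groups). The data, variable names, and helper function are made up for illustration and are not the specific metrics or case study from the talk.

```python
import numpy as np

# Illustrative (made-up) data: binary model decisions and a sensitive attribute.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def demographic_parity_difference(y_pred, group):
    """Gap between the highest and lowest positive-prediction (selection)
    rates across groups: 0 means every group is selected at the same rate,
    larger values indicate greater disparity."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

print(demographic_parity_difference(y_pred, group))  # ~0.2 for this toy data
```

This is only one of several competing definitions of fairness; which one (if any) is appropriate depends on the context in which the model is deployed, which is exactly the point the talk explores.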