Algorithmic Bias in Machine Learning

Jill Cates
November 16, 2019

Machine learning algorithms are susceptible to both intentional and unintentional bias. Relying on biased algorithms to drive decisions can lead to unfair outcomes with serious consequences for underrepresented groups of people. In this talk, we'll walk through examples of algorithmic bias in machine learning algorithms and explore tools (in Python) that can measure this bias.

Transcript

  1. Algorithmic Bias in Machine Learning. Jill Cates, Data Scientist at BioSymetrics. November 16, 2019, PyCon Canada, Toronto. Image source: Wired.
  2. Algorithmic Bias, an unfortunate by-product of machine learning: “systematic and repeatable errors in a computer system that create unfair outcomes, such as privileging one arbitrary group of users over others.” “Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.” - Cathy O’Neil (Weapons of Math Destruction)
  3. Algorithmic Bias, an unfortunate by-product of machine learning. Affected domains: online advertising, the legal system, HR recruitment, facial recognition, healthcare, and credit scores.
  4. Algorithmic Bias, an unfortunate by-product of machine learning. Future implications: MIT Media Lab's Moral Machine, http://moralmachine.mit.edu/ (source: MIT Technology Review).
  5. Building a President Classifier: a machine learning algorithm trained on data about past presidents produces an uncertain result (“???”) for Zuzana Čaputová, president of Slovakia. Biased data = biased model.
  6. Clinical Decision-Making
     • Recent paper published in Science (Oct 25, 2019)
     • Assessed a U.S. healthcare decision-making algorithm
     • Looked at 50,000 records from an academic hospital
     • Found that white patients were given higher risk scores and received more care than black patients
     • Clinical diagnostic tools and early detection of disease
     • When to admit or discharge a patient from the hospital
     • Automated triaging of patients
     • Assessing patient risk
  7. Clinical Decision-Making
     • MIMIC-III is a widely used healthcare dataset in machine learning research
     • Developed and de-identified by the MIT Laboratory for Computational Physiology
     • Contains electronic medical record data of 50,000 hospital admissions for 40,000 critical care patients
     • Collected at Beth Israel Deaconess Medical Center between 2001 and 2012
     Some papers that use MIMIC-III:
  8. MIMIC-III Dataset Demographics (mean age of adult patients is 62.5 years): 70% of patients are white, 47% are insured by Medicare, 35% are Catholic, and 41% are married (a quick way to compute such breakdowns is sketched below).
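
As a minimal sketch (not from the talk), demographic breakdowns like the ones above can be computed with pandas from the MIMIC-III admissions table; the file and column names below follow the standard MIMIC-III release, and access requires credentialed PhysioNet approval:

    import pandas as pd

    # Load the MIMIC-III admissions table (CSV from the credentialed PhysioNet release).
    adm = pd.read_csv("ADMISSIONS.csv")

    # Share of admissions in each demographic category, as fractions of the total.
    for col in ["ETHNICITY", "INSURANCE", "RELIGION", "MARITAL_STATUS"]:
        print(adm[col].value_counts(normalize=True).head(), "\n")
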
  9. Language Translation: translating gender-neutral languages
     • Gender-neutral languages: Malay, Farsi (Persian), Hungarian, Armenian, Tagalog, etc.
     • Google Translate determines which gender should be assigned to which role
     • Trained on examples of translations from the web
  10. Language Translation: word embeddings and Word2Vec
     • Words are represented as vectors
     • Similar words have similar representations
     • Word2Vec generates word embeddings (gensim); see the sketch below
     • Reveals semantic relations (associations) between words
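
A minimal Word2Vec sketch with gensim (the toy corpus and parameters are illustrative, not from the talk; the keyword arguments assume gensim 4.x):

    from gensim.models import Word2Vec

    # Toy corpus of tokenized sentences; real models are trained on large text collections.
    sentences = [
        ["the", "doctor", "treated", "the", "patient"],
        ["the", "nurse", "treated", "the", "patient"],
        ["the", "president", "gave", "a", "speech"],
    ]

    # Each word is mapped to a dense vector; similar words get similar vectors.
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=100)

    # Nearest neighbours in the embedding space reveal learned associations.
    print(model.wv.most_similar("doctor", topn=3))
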
  11. Language Translation
     • Gender bias is captured by a direction in the word embedding (see the projection sketch below)
     • Gender-neutral words are distinct from gender-definition words in the word embedding
     (Figure: number of generated analogies vs. number of stereotypic analogies)
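
The bias-direction idea can be sketched with pretrained GloVe vectors from gensim's downloader and a simple he-minus-she direction; this is only an approximation (the original analysis by Bolukbasi et al. derives the direction from a PCA over several gender-definition pairs):

    import numpy as np
    import gensim.downloader as api

    # Pretrained 100-dimensional GloVe vectors (downloaded on first use).
    wv = api.load("glove-wiki-gigaword-100")

    # Approximate the gender direction as the normalized difference of "he" and "she".
    direction = wv["he"] - wv["she"]
    direction /= np.linalg.norm(direction)

    # Project occupation words onto the direction:
    # positive values lean towards "he", negative towards "she".
    for word in ["doctor", "nurse", "engineer", "homemaker", "programmer"]:
        vec = wv[word] / np.linalg.norm(wv[word])
        print(f"{word:12s} {np.dot(vec, direction):+.3f}")
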
  12. Mitigating the Risk of Bias: FairTest by Columbia University
     • Contains metrics to test for “unwarranted associations between an algorithm's outputs and certain user subpopulations identified by protected features”
     • Identifies subpopulations with disproportionately high error rates, assesses offensive labeling, and detects uneven rates of algorithmic error
     • Some included metrics: Normalized Mutual Information, Normalized Conditional Mutual Information, Binary Ratio, Binary Difference (the first is illustrated below)
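
FairTest's own API is not shown in the talk; as a rough illustration of the first listed metric, normalized mutual information between a protected attribute and a model's decisions can be computed with scikit-learn (the arrays below are hypothetical):

    import numpy as np
    from sklearn.metrics import normalized_mutual_info_score

    # Hypothetical binary protected attribute and binary model decisions.
    protected = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    decisions = np.array([1, 1, 1, 0, 0, 0, 0, 1])

    # 0 means the decisions carry no information about group membership;
    # values near 1 indicate a strong (potentially unfair) association.
    print(normalized_mutual_info_score(protected, decisions))
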
  13. Mitigating the Risk of Bias: AIF360 (AI Fairness 360) by IBM
     • Fairness metrics to test for biases (e.g., the Generalized Entropy Index evaluates inequality in a dataset)
     • Bias mitigation algorithms (e.g., Adversarial Debiasing, Learning Fair Representations, Disparate Impact Remover, etc.)
     A small usage sketch follows below.
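
A minimal AIF360 sketch, assuming a hypothetical toy DataFrame with a binary protected attribute "sex" and a binary outcome "label" (the numbers are made up; the dataset and metric classes are from AIF360):

    import pandas as pd
    from aif360.datasets import BinaryLabelDataset
    from aif360.metrics import BinaryLabelDatasetMetric

    # Hypothetical toy data: 'sex' is the protected attribute, 'label' the favorable outcome.
    df = pd.DataFrame({
        "sex":   [0, 0, 0, 0, 1, 1, 1, 1],
        "age":   [25, 32, 47, 51, 29, 36, 44, 58],
        "label": [0, 0, 1, 0, 1, 1, 0, 1],
    })

    dataset = BinaryLabelDataset(
        df=df, label_names=["label"], protected_attribute_names=["sex"],
        favorable_label=1, unfavorable_label=0,
    )

    metric = BinaryLabelDatasetMetric(
        dataset,
        unprivileged_groups=[{"sex": 0}],
        privileged_groups=[{"sex": 1}],
    )

    # Ratio of favorable-outcome rates between groups (1.0 = parity) and their difference.
    print("disparate impact:", metric.disparate_impact())
    print("statistical parity difference:", metric.statistical_parity_difference())
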
  14. Mitigating the Risk of Bias: explaining models with LIME
     • Deep learning models are difficult to interpret (you can’t directly extract feature importances)
     • LIME (Local Interpretable Model-agnostic Explanations) provides explanations of a black-box model’s predictions
     • Can be used to interpret image and text classifiers (a text-classifier sketch follows below)
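
A minimal LIME sketch for a text classifier (the tiny scikit-learn pipeline and the example sentences are hypothetical stand-ins for the black-box model being audited):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from lime.lime_text import LimeTextExplainer

    # Hypothetical toy classifier standing in for the black-box model to explain.
    texts = ["great staff and fast recovery", "terrible wait times and rude staff",
             "excellent care and clear communication", "awful experience overall"]
    labels = [1, 0, 1, 0]

    pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
    pipeline.fit(texts, labels)

    # LIME perturbs the input text and fits a local linear model to estimate
    # how much each word contributed to the prediction.
    explainer = LimeTextExplainer(class_names=["negative", "positive"])
    explanation = explainer.explain_instance(
        "the staff were rude but recovery was fast",
        pipeline.predict_proba,
        num_features=5,
    )
    print(explanation.as_list())
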