What does Fairness in Information Access Mean and Can We Achieve It?

wing.nus
March 13, 2023

Abstract: Bias in data, as well as lack of transparency and fairness in algorithms, are not new problems, but with increasing scale, complexity, and adoption, most AI systems are suffering from these issues at an unprecedented level. Information access systems are not spared: these days, almost all large-scale information access is mediated by algorithms. These algorithms are optimized not only for relevance, which is subjective to begin with, but also for measures of engagement and impressions. They pick up signals of what may be 'good' from individuals and perpetuate them through learning methods that are opaque and hard to debug. Considering 'fairness' and introducing more transparency can help, but it can also backfire or create other issues. We also need to understand how and why users of these systems engage with content. In this talk, I will share some of our attempts to bring fairness to ranking systems and then discuss why the solutions are not that simple.

Speaker Bio: Dr. Chirag Shah is a Professor in the Information School, an Adjunct Professor in the Paul G. Allen School of Computer Science & Engineering, and an Adjunct Professor in Human Centered Design & Engineering (HCDE) at the University of Washington (UW). He is the Founding Director of the InfoSeeking Lab and the Founding Co-Director of RAISE, a Center for Responsible AI. His research revolves around intelligent systems. He received his PhD in Information Science from the University of North Carolina (UNC) at Chapel Hill.

Transcript

  1. What does Fairness in Information Access Mean and Can We Achieve It?
     Chirag Shah @chirag_shah
  2. We live in a biased world
     • We are biased.
     • Any dataset can be biased.
     • Any model can be biased.
     • No data is a perfect representation of the world; tradeoffs are made
       during data collection, storage, and analysis.
  3. Fairness = Lack of Bias?
     • Bias is not always bad.
     • Three definitions of fairness (the first two are sketched in code after this slide):
       ◦ Statistical parity
       ◦ Disparate impact
       ◦ Disparate treatment
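
To make the first two definitions concrete in a ranking context, here is a minimal Python sketch (not from the talk) that operationalizes both as exposure in the top-k of a ranked list; the group labels, the top-k cutoff, and the 0.8 threshold convention are illustrative assumptions.

```python
# Hedged sketch: statistical parity and disparate impact as top-k exposure.
from collections import Counter

def exposure_rates(ranking, groups, k=10):
    """Fraction of each group's items that appear in the top-k."""
    topk = Counter(groups[doc] for doc in ranking[:k])
    total = Counter(groups[doc] for doc in ranking)
    return {g: topk.get(g, 0) / total[g] for g in total}

def statistical_parity_gap(rates):
    """Statistical parity holds when all groups have equal rates (gap ~ 0)."""
    return max(rates.values()) - min(rates.values())

def disparate_impact_ratio(rates):
    """Ratio of the worst-off to best-off group's rate; values well below
    1.0 (conventionally < 0.8) signal adverse impact."""
    return min(rates.values()) / max(rates.values())

# Toy example: 6 documents, two topical groups, all of group A ranked first.
groups = {"d1": "A", "d2": "A", "d3": "A", "d4": "B", "d5": "B", "d6": "B"}
ranking = ["d1", "d2", "d3", "d4", "d5", "d6"]
rates = exposure_rates(ranking, groups, k=3)
print(statistical_parity_gap(rates), disparate_impact_ratio(rates))  # 1.0 0.0
```

Disparate treatment, the third definition, is about the process rather than the outcome (explicitly using the protected attribute in the decision), so it is not captured by an outcome metric like the two above.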
  4. Addressing fairness through diversity
     • Took a sliver of search data (queries, top results).
     • Clustered the results and quantified the amount of topical bias
       (see the sketch after this slide).
     • Designed new algorithms to re-rank those results to produce a fairer ranking.
     • Two forms of fairness:
       ◦ Statistical parity
       ◦ Disparate impact
     Ruoyuan Gao (Amazon)
     Gao, R. & Shah, C. (2020). Toward Creating a Fairer Ranking in Search Engine Results. Information Processing and Management (IP&M), 57(1).
     Gao, R. & Shah, C. (2019). How Fair Can We Go: Detecting the Boundaries of Fairness Optimization in Information Retrieval. In Proceedings of the ACM International Conference on Theory of Information Retrieval (ICTIR), pp. 229-236. October 2-5, 2019. Santa Clara, CA, USA.
     Gao, R., Ge, Y., & Shah, C. (2022). FAIR: Fairness-Aware Information Retrieval Evaluation. Journal of the Association for Information Science and Technology (JASIST).
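
The cited papers define their measures more carefully; as one hedged reading of "quantified the amount of topical bias", the sketch below compares each subtopic cluster's share of the top-k against an equal share. The uniform target and the L1 distance are my assumptions, not necessarily the papers' metric.

```python
# Hedged sketch: topical bias of a ranking as the L1 distance between
# observed top-k cluster shares and a uniform (equal-share) target.
def topical_bias(cluster_of, ranking, k):
    shares = {}
    for doc in ranking[:k]:
        c = cluster_of[doc]
        shares[c] = shares.get(c, 0.0) + 1 / k
    all_clusters = set(cluster_of.values())
    target = 1 / len(all_clusters)  # assumption: equal share per subtopic
    return sum(abs(shares.get(c, 0.0) - target) for c in all_clusters)

# Toy example: cluster 0 dominates the top-4, giving a nonzero bias score.
clusters = {"d1": 0, "d2": 0, "d3": 0, "d4": 1, "d5": 1}
print(topical_bias(clusters, ["d1", "d2", "d3", "d4", "d5"], k=4))  # 0.5
```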
  5. Datasets
     • Google
       ◦ From Google Trends (June 23-June 29, 2019)
       ◦ 100 queries
       ◦ Top 100 results per query
     • New York Times
       ◦ 1.8M articles published by NYT
       ◦ 50 queries
       ◦ Top 100 results per query
     • Clustering with two subtopics (a clustering sketch follows this slide).
       Example queries: madden shooting, hurricane lane update, jacksonville
       shooting video, shanann watts, holy fire update, fortnite galaxy skin,
       new deadly spider, stolen plane, …
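
The deck does not say how the two subtopics per query were obtained; a common, minimal approach, shown here purely as an assumption, is TF-IDF vectors plus k-means with k=2 over the result snippets (scikit-learn).

```python
# Hedged sketch: split a query's result snippets into two subtopic clusters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def cluster_results(snippets, k=2):
    """Assign each result snippet to one of k subtopic clusters."""
    X = TfidfVectorizer(stop_words="english").fit_transform(snippets)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Toy snippets echoing two of the slide's example queries.
snippets = [
    "madden nfl tournament shooting in jacksonville",
    "jacksonville shooting video released by police",
    "hurricane lane weakens as it nears hawaii",
    "hurricane lane update: flooding continues",
]
print(cluster_results(snippets))  # e.g., [0 0 1 1]
```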
  6. Page-wise with Disparate Impact
     • Problem: we are not getting enough diversity by sampling from the top.
       (The slide's figure shows a roughly 70%/30% split between the two subtopics.)
  7. ε-greedy
     • Explore the results with probability ε, exploit with probability 1-ε.
       ◦ ε = 0.0 → no exploration
       ◦ ε = 1.0 → full exploration (randomness)
     • Non-fair (naïve) ε-greedy:
       ◦ with probability ε, randomly select from the entire rank list (100 results)
       ◦ with probability 1-ε, pick from the top
     • Fair ε-greedy (sketched after this slide):
       ◦ with probability ε, randomly select a cluster, then pick the top result from that cluster
       ◦ with probability 1-ε, pick the "fair" cluster, then pick the top result from that cluster
     • Fairness targets: statistical parity | disparate impact
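
A minimal Python sketch of the fair ε-greedy procedure as described on this slide. How the "fair" cluster is chosen at each step (here: the cluster least represented in the output so far, a statistical-parity reading) is my interpretation; the papers also consider a disparate-impact variant.

```python
# Hedged sketch: fair ε-greedy re-ranking over subtopic clusters.
import random

def fair_epsilon_greedy(ranked, cluster_of, epsilon, seed=0):
    """Re-rank `ranked` (best first): explore with probability ε, else exploit."""
    rng = random.Random(seed)
    # Per-cluster queues that preserve the original (relevance) order.
    queues = {c: [d for d in ranked if cluster_of[d] == c]
              for c in set(cluster_of.values())}
    out, counts = [], {c: 0 for c in queues}
    while any(queues.values()):
        live = [c for c in queues if queues[c]]
        if rng.random() < epsilon:
            c = rng.choice(live)                    # explore: random cluster
        else:
            c = min(live, key=lambda c: counts[c])  # exploit: "fair" (least-shown) cluster
        out.append(queues[c].pop(0))                # take that cluster's top result
        counts[c] += 1
    return out

ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]
clusters = {"d1": 0, "d2": 0, "d3": 0, "d4": 0, "d5": 1, "d6": 1}
print(fair_epsilon_greedy(ranked, clusters, epsilon=0.3))
```

The naïve variant on the slide replaces the cluster step with a direct draw from the full rank list, so ε interpolates between the original ranking (ε = 0) and a random shuffle (ε = 1).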
  8. Translating systems to experiences
     • Most people can't tell the difference between the original Google results and those re-ranked with ε = 0.3.
     • But they can if ε > 0.5.
     • Lesson: we can introduce diversity into search results in a careful way that reduces bias while keeping users satisfied.
  9. Feng, Y. & Shah, C. (2022). Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search. AAAI Conference on Artificial Intelligence. February 22-March 1, 2022. Vancouver, Canada.
  10. Reducing bias using the "fair-greedy" approach (a hedged sketch follows this slide).
      Feng, Y. & Shah, C. (2022). Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search. AAAI Conference on Artificial Intelligence. February 22-March 1, 2022. Vancouver, Canada.
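
The deck does not spell out the "fair-greedy" algorithm itself; as one hedged reading of the name, the sketch below greedily fills each position with the most relevant remaining image from whichever gender group is furthest below a target share. The 50/50 target and the deficit rule are illustrative assumptions, not necessarily the paper's method.

```python
# Hedged sketch: greedy re-ranking of image results toward target group shares.
def fair_greedy(images, group_of, target):
    """Re-rank `images` (best first) toward `target`, e.g., {"f": 0.5, "m": 0.5}."""
    queues = {g: [i for i in images if group_of[i] == g] for g in target}
    out, counts = [], {g: 0 for g in target}
    for pos in range(1, len(images) + 1):
        live = [g for g in target if queues[g]]
        # Pick the group furthest below its target share at this position.
        g = max(live, key=lambda g: target[g] - counts[g] / pos)
        out.append(queues[g].pop(0))  # queues preserve within-group relevance order
        counts[g] += 1
    return out

# Toy example: an all-male top-4 gets interleaved toward a 50/50 split.
images = ["i1", "i2", "i3", "i4", "i5", "i6"]
groups = {"i1": "m", "i2": "m", "i3": "m", "i4": "m", "i5": "f", "i6": "f"}
print(fair_greedy(images, groups, {"f": 0.5, "m": 0.5}))
```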
  11. But…
      • [Technical] Multi-objective optimization (fairness in a marketplace) is hard and not always well-defined.
      • [Business] Re-ranking brings additional costs.
      • [Social] Our notions of what's biased, what's fair, and what's good keep changing.
  12. Summary
      • Large-scale information access systems suffer from problems of bias, unfairness, and opaqueness – some due to technical issues, some due to business objectives, and some due to social ones.
      • We could audit these systems and build education, awareness, and advocacy around them.
      • Ideally, we need a multifaceted approach, similar to how smoking was curbed.