DOI: 10.1145/3531146.3533113
Research Article | Open Access

Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits

Published: 20 June 2022
Abstract

    Recent years have seen the development of many open-source ML fairness toolkits aimed at helping ML practitioners assess and address unfairness in their systems. However, there has been little research investigating how ML practitioners actually use these toolkits in practice. In this paper, we conducted the first in-depth empirical exploration of how industry practitioners (try to) work with existing fairness toolkits. In particular, we conducted think-aloud interviews to understand how participants learn about and use fairness toolkits, and explored the generality of our findings through an anonymous online survey. We identified several opportunities for fairness toolkits to better address practitioner needs and scaffold them in using toolkits effectively and responsibly. Based on these findings, we highlight implications for the design of future open-source fairness toolkits that can support practitioners in better contextualizing, communicating, and collaborating around ML fairness efforts.


        Published In

        FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
        June 2022, 2351 pages
        ISBN: 9781450393522
        DOI: 10.1145/3531146
        This work is licensed under a Creative Commons Attribution 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 20 June 2022


        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

        • Carnegie Mellon University Block Center for Technology and Society Award
        • Aviva and the UK Engineering and Physical Sciences Research Council
        • Jacob Foundation for CERES network
        • National Science Foundation

        Conference

        FAccT '22


        Article Metrics

        • Downloads (Last 12 months): 1,071
        • Downloads (Last 6 weeks): 92

        Cited By

        • (2024)"It's the most fair thing to do but it doesn't make any sense": Perceptions of Mathematical Fairness Notions by Hiring ProfessionalsProceedings of the ACM on Human-Computer Interaction10.1145/36373608:CSCW1(1-35)Online publication date: 26-Apr-2024
        • (2024)Interpretability Gone Bad: The Role of Bounded Rationality in How Practitioners Understand Machine LearningProceedings of the ACM on Human-Computer Interaction10.1145/36373548:CSCW1(1-34)Online publication date: 26-Apr-2024
        • (2024)Real Risks of Fake Data: Synthetic Data, Diversity-Washing and Consent CircumventionProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency10.1145/3630106.3659002(1733-1744)Online publication date: 3-Jun-2024
        • (2024)Learning about Responsible AI On-The-Job: Learning Pathways, Orientations, and AspirationsProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency10.1145/3630106.3658988(1544-1558)Online publication date: 3-Jun-2024
        • (2024)Impact Charts: A Tool for Identifying Systematic Bias in Social Systems and DataProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency10.1145/3630106.3658965(1187-1198)Online publication date: 3-Jun-2024
        • (2024)The Fall of an Algorithm: Characterizing the Dynamics Toward AbandonmentProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency10.1145/3630106.3658910(337-358)Online publication date: 3-Jun-2024
        • (2024)SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational NotebooksExtended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems10.1145/3613905.3650848(1-17)Online publication date: 11-May-2024
        • (2024)Human-Centered Evaluation and Auditing of Language ModelsExtended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems10.1145/3613905.3636302(1-6)Online publication date: 11-May-2024
        • (2024)JupyterLab in Retrograde: Contextual Notifications That Highlight Fairness and Bias Issues for Data ScientistsProceedings of the CHI Conference on Human Factors in Computing Systems10.1145/3613904.3642755(1-19)Online publication date: 11-May-2024
        • (2024)Towards a Non-Ideal Methodological Framework for Responsible MLProceedings of the CHI Conference on Human Factors in Computing Systems10.1145/3613904.3642501(1-17)Online publication date: 11-May-2024
