DOI: 10.1145/3531146.3533221
FAccT '22 Conference Proceedings | Research Article | Public Access

Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness

Published: 20 June 2022
Abstract

    Much of machine learning research focuses on predictive accuracy: given a task, create a machine learning model (or algorithm) that maximizes accuracy. In many settings, however, the final prediction or decision of a system is under the control of a human, who uses an algorithm’s output along with their own personal expertise in order to produce a combined prediction. One ultimate goal of such collaborative systems is complementarity: that is, to produce lower loss (equivalently, greater payoff or utility) than either the human or algorithm alone. However, experimental results have shown that even in carefully-designed systems, complementary performance can be elusive. Our work provides three key contributions. First, we provide a theoretical framework for modeling simple human-algorithm systems and demonstrate that multiple prior analyses can be expressed within it. Next, we use this model to prove conditions where complementarity is impossible, and give constructive examples of where complementarity is achievable. Finally, we discuss the implications of our findings, especially with respect to the fairness of a classifier. In sum, these results deepen our understanding of key factors influencing the combined performance of human-algorithm systems, giving insight into how algorithmic tools can best be designed for collaborative environments.
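The abstract's central notion, complementarity, can be made concrete with a toy sketch. This is not the paper's formal model: the error structure (an algorithm with a "blind spot" band, a human with uniform noise) and the deferral rule are invented here purely for illustration of how a combined system can achieve lower 0-1 loss than either predictor alone.

```python
import random

random.seed(0)

N = 10_000
instances = [random.random() for _ in range(N)]  # feature x in [0, 1)

def truth(x):
    return 1 if x >= 0.5 else 0

def algorithm(x):
    # The algorithm is accurate everywhere except a "hard" band
    # near the decision boundary, where it is always wrong.
    if 0.4 <= x < 0.5:
        return 1 - truth(x)
    return truth(x)

def human(x):
    # The human is noisier overall (10% random error) but has no blind spot.
    return truth(x) if random.random() < 0.9 else 1 - truth(x)

def combined(x):
    # Deferral rule: trust the algorithm outside its hard band,
    # defer to the human inside it.
    return human(x) if 0.4 <= x < 0.5 else algorithm(x)

def loss(predict):
    # Empirical 0-1 loss over the sampled instances.
    return sum(predict(x) != truth(x) for x in instances) / N

print(f"algorithm alone: {loss(algorithm):.3f}")  # ~0.10 (the whole band)
print(f"human alone:     {loss(human):.3f}")      # ~0.10
print(f"combined:        {loss(combined):.3f}")   # ~0.01 -- complementarity
```

The combined system errs only when the human errs inside the hard band (roughly 10% of 10% of instances), so its loss is lower than either predictor's alone. The paper's contribution is characterizing when such complementarity is and is not achievable; this sketch only shows one favorable case.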


Cited By

• (2024) "It depends": Configuring AI to Improve Clinical Usefulness Across Contexts. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 874–889. https://doi.org/10.1145/3643834.3660707
• (2023) A Framework for Human-Algorithm Teaming in Biometric Identity Workflows. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 67(1), 523–528. https://doi.org/10.1177/21695067231192692
• (2023) Toward Operationalizing Pipeline-aware ML Fairness: A Research Agenda for Developing Practical Guidelines and Tools. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 1–11. https://doi.org/10.1145/3617694.3623259
• (2023) Something Borrowed: Exploring the Influence of AI-Generated Explanation Text on the Composition of Human Explanations. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–7. https://doi.org/10.1145/3544549.3585727
• (2023) Harnessing human and machine intelligence for planetary-level climate action. npj Climate Action 2(1). https://doi.org/10.1038/s44168-023-00056-3

Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022, 2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146

Publisher

Association for Computing Machinery, New York, NY, United States

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

Conference

FAccT '22

