— Alpha 5 (Power Rangers, 2017)
Peeking into the Black Box – Trust in AI – Part 2
Part 1 of this series introduced the concept of explainability of AI-systems (XAI) as a vital component of AiX Design. This article looks at one of the primary goals of explainability: trust in AI.
The high-level “Ethics Guidelines for Trustworthy AI” published by the European Commission (High-level Expert Group on Artificial Intelligence, 2019) emphasize the connection between explainability and trust, stating: “Explicability is crucial for building and maintaining users’ trust in AI-systems.” This implies not only that trust is necessary, but also that understanding AI-systems will enable it.
Complexity often makes understanding AI-driven automation impractical or even impossible, confronting users and other stakeholders with uncertainty and risk. In such situations of vulnerability, trust becomes the enabling factor for collaboration. In fact, trust is a fundamental requirement for the wide acceptance of AI-systems.
Consequently, when developing AI-enabled products, product managers, software developers and designers often face the question: “How can we increase our users’ trust in our AI-enabled products?” To successfully design for trust in AI, the assumptions underlying this question need to be scrutinized: Does trustworthiness determine trust? In turn, does trust determine usage? And is trust always desirable?
This article is intended to encourage product managers, developers, and designers to take a differentiated perspective on trust in AI. Rather than simply asking how to increase trust, it looks at the factors that influence the dynamics of trust and user behavior.
What makes AI-systems trustworthy?
The default position of humans is to trust: we trust unless we have reason to believe that trust is inappropriate. Research shows that humans place high trust in unfamiliar automated decision aids, expecting them to be reliable and to outperform human aids (Dzindolet et al., 2002 and 2003).
This circumstance is less beneficial for product developers than it may seem. In practice, an advance in trust means users approach products with expectations that are easily violated. Even systems that perform well, but not perfectly, can then seem untrustworthy as users overcompensate for their disappointment. Unfortunately, trust tends to be conditioned on a system’s worst behavior rather than its overall performance (Muir and Moray, 1996), and trust, once lost, builds back up only slowly. Moderating users’ expectations, through the framing of a product as well as through its design, is therefore crucial to counteract this effect.
To design for trust, it is necessary to first understand what trust is built upon. What attributes of an AI-system need to be considered? There are three fundamental bases on which humans assess the trustworthiness of automation (Lee and Moray, 1992):
- Performance – What is the system doing?
(What is its behavior, reliability, predictability, competence, expertise?)
- Process – How is the system operating?
(What are its mechanisms? Is the system’s algorithm appropriate? Does it accept feedback to improve?)
- Purpose – Why does the system exist?
(What is its intended use? Are its creator’s motives benevolent?)
Humans make inferences between the three bases (e.g., a transparent process indicates good intentions). Thus, trust founded on only one of them tends to be fragile (Lee and See, 2004). Consequently, providing test metrics on performance is not sufficient to instill robust trust; transparency about the working mechanisms (process) and the purpose of an AI-system is required as well.
Does trust lead to usage?
The question posed in the introduction, “How can we increase our users’ trust in our AI-enabled products?”, implies that trusting an AI-system is equivalent to relying on it, but research contradicts this assumption. Trust is a belief about the trustworthiness of an AI-system. It moderates, but does not determine, a user’s behavior, that is, whether the user relies on and uses the AI-system or not. Several factors may lead users to use a distrusted AI-system or not to use a trusted one.
In the diagram above, Chancey et al. (2017) introduce risk as one of the moderating factors, but additional ones need to be considered:
- Attentional capacity (workload, motivation, stress, boredom)
High workload, especially multitasking under time constraint, forces users to rely on an otherwise distrusted AI-system.
- Availability of alternatives
Users might not use a trusted AI-system because other choices are available.
- Effort to set up and engage the AI-system
When the cost to use an AI-system outweighs expected benefits, users might refrain from using an AI-system they otherwise trust.
- Investment in unaided performance
Users might have personal reasons not to delegate tasks to an otherwise trusted AI-system (e.g., reputation, the value of challenge, etc.).
- Perceived risk
With an increase in perceived risk, reliance on automated aids over human aids tends to increase (Lyons and Stokes, 2012).
- Self-confidence
Users more confident in their own capabilities tend to rely less on otherwise trusted AI-systems.
- Subject matter expertise
Experts tend to rely less on otherwise trusted AI-systems than novices.
These factors should be kept in mind, especially during user research as they might not take effect in a lab setting. Users who trust an AI-system might still decide not to use it under certain circumstances.
Is more trust always better?
As shown, trust is a belief based on perceptions of trustworthiness. These perceptions need not reflect reality. Jacovi et al. (2021) distinguish between “warranted” and “unwarranted” trust. This reveals another issue with the question: “How can we increase our users’ trust in our AI-enabled products?” Product developers cannot just uncritically strive to enhance trust. They naturally want to prevent users from distrusting and consequently not using their products. But they also need to help users avoid overtrusting and thus misusing AI-systems. Only appropriate trust can reliably improve joint human-AI performance above the performance of each alone (Sorkin and Woods, 1985; Wickens et al., 2000). Accepting that AI-systems are imperfect, product developers need to guide users to trust appropriately to help optimize the outcome of their decision-making processes.
If trust is not properly calibrated, unwarranted distrust (where trust falls short of an AI-system’s capabilities) may lead to disuse, such that the benefits of AI support remain untapped. Equally important, overtrust (trust that exceeds the system’s capabilities) may lead to overreliance and misuse, putting users and other stakeholders at risk (Lee and Moray, 1994; Muir, 1987). For an extreme example: in 2018, an Uber car in fully automated driving mode caused a deadly accident that might have been avoided had the safety driver not relied excessively on the automation and instead paid appropriate attention to the road (BBC, 2020).
When calibrating trust, two factors need to be considered (Lee and See, 2004; see the diagram below):
- How closely does a user’s trust reflect the AI-system’s actual capabilities?
- To what degree can a user assign trust to components of an AI-system rather than to the system as a whole? How quickly does a user’s trust change with changing capabilities of the AI-system?
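To make the idea of calibration more tangible, the comparison between trust and capability can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from Lee and See: the class `TrustObservation`, the functions `calibration_gap` and `classify`, the 0 to 1 scales, the tolerance of 0.1, and all numbers are hypothetical and invented for this example. The sketch compares trust and measured reliability per component rather than for the system as a whole.

```python
from dataclasses import dataclass


@dataclass
class TrustObservation:
    """One hypothetical measurement for a single component of an AI-system."""
    component: str
    user_trust: float          # self-reported trust, scaled to 0..1 (assumed scale)
    actual_reliability: float  # measured share of correct outputs, 0..1


def calibration_gap(obs: TrustObservation) -> float:
    """Positive: trust exceeds capability. Negative: trust falls short of it."""
    return obs.user_trust - obs.actual_reliability


def classify(obs: TrustObservation, tolerance: float = 0.1) -> str:
    """Label trust relative to capability; the tolerance is an arbitrary assumption."""
    gap = calibration_gap(obs)
    if gap > tolerance:
        return "overtrust"
    if gap < -tolerance:
        return "distrust"
    return "calibrated"


# Invented example values, one entry per component.
observations = [
    TrustObservation("lane keeping", user_trust=0.95, actual_reliability=0.70),
    TrustObservation("route planning", user_trust=0.40, actual_reliability=0.90),
    TrustObservation("speed control", user_trust=0.85, actual_reliability=0.80),
]

for obs in observations:
    print(f"{obs.component}: gap={calibration_gap(obs):+.2f} -> {classify(obs)}")
```

In user research, a per-component view like this could help show where expectations need to be moderated (overtrust) and where the system's capabilities need to be communicated better (distrust). Repeating the measurement over time would indicate how quickly trust follows changes in capability.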
The challenge of appropriate trust also applies to explainability. Explanations of why an AI-system might be mistaken have been shown to increase trust (e.g., Dzindolet et al., 2003) and, in turn, reliance, as they make the process of the system observable. But this holds true independent of actual performance. Thus, explanations can lead to overtrust and misuse as well. A study by Eiband et al. (2019) shows that the gain in trust from detailed factual explanations is comparable to that of “placebic” explanations, those that pretend to explain without providing information, like “We need you to provide this data because the algorithm needs it to work.” Consequently, explanations need to be designed carefully so as to help users build informed, calibrated trust.
Product managers, software developers and designers need to be aware that automation does not simply reduce user errors but replaces them with designer errors. This creates responsibility for product management that needs to be taken seriously. Hence, the first requisite to gain the trust of users is to genuinely strive to create trustworthy products and services – and to act trustworthily and responsibly as producers.
Humans are prone to automation bias (Parasuraman and Riley, 1997). They tend to disregard information that contradicts an automated solution they have already accepted as correct. While product developers want users to benefit from the opportunities of AI-technologies, they also need to enable users to make informed decisions about when it is prudent to rely on automation and when it is not. This requires not only proving that AI-systems work, but also being transparent about their limitations and about the intentions behind developing them.
The Independent High-Level Expert Group on Artificial Intelligence set up by the European Commission provides an Assessment List for Trustworthy Artificial Intelligence that you can use as a starting point to assess the trustworthiness of your AI-systems in practice.
- Peeking into the Black Box – A Design Perspective on Comprehensible AI – Part 1
- The Need for AiX Design
- The Need for an AiX Design System
- BBC (with analysis by Cellan-Jones R) (2020). ‘Uber's self-driving operator charged over fatal crash’, BBC News [online], accessible at: https://www.bbc.com/news/technology-54175359, BBC. (Accessed: 2 December 2021)
- Chancey E T, Bliss J P, Yamani Y, Handley H A H (2017). ‘Trust and the Compliance-Reliance Paradigm: The Effects of Risk, Error Bias, and Reliability on Trust and Dependence’, Human Factors, vol 59, no 3, pp 333–345, Human Factors and Ergonomics Society.
- Dzindolet M T, Peterson S A, Pomranky R A, Pierce L G, Beck H P (2003). ‘The role of trust in automation reliance’, International Journal of Human-Computer Studies, vol 58, pp 697–718, Elsevier Science Ltd.
- Dzindolet M T, Pierce L, Beck H P, Dawe L (2002). ‘The Perceived Utility of Human and Automated Aids in a Visual Detection Task’, Human Factors, vol 44, pp 79–94, Human Factors and Ergonomics Society.
- Eiband M, Buschek D, Kremer A, Hussmann H (2019). ‘The Impact of Placebic Explanations on Trust in Intelligent Systems’, CHI ’19 Extended Abstracts, Association for Computing Machinery (ACM).
- Jacovi A, Marasović A, Miller T, Goldberg Y (2021). ‘Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI’, ACM Conference on Fairness, Accountability, and Transparency ’21 (FAccT ’21), Association for Computing Machinery (ACM).
- High-level Expert Group on Artificial Intelligence (2019). ‘Ethics Guidelines for Trustworthy AI’, European Commission.
- Lee J, Moray N (1992). ‘Trust, control strategies and allocation of function in human-machine systems’, Ergonomics, vol 35, no 10, pp 1243–1270, Taylor & Francis Ltd.
- Lee J, Moray N (1994). ‘Trust, self-confidence, and operators’ adaptation to automation’, International Journal of Human-Computer Studies, vol 40, pp 153–184, Academic Press Ltd.
- Lee J D, See K A (2004). ‘Trust in Automation: Designing for Appropriate Reliance’, Human Factors, vol 46, no 1, pp 50–80, Human Factors and Ergonomics Society.
- Lyons J B, Stokes C (2012). ‘Human-Human Reliance in the Context of Automation’, Human Factors, vol 54, no 1, pp 112–121, Human Factors and Ergonomics Society.
- Muir B M (1987). ‘Trust between humans and machines, and the design of decision aids’, International Journal of Man-Machine Studies (continued as International Journal of Human-Computer Studies), vol 27, iss 5–6, pp 527–539, Elsevier Ltd.
- Muir B M, Moray N (1996). ‘Trust in Automation. Part II. Experimental Studies of Trust and Human Intervention in a Process Control Simulation’, Ergonomics, vol 39, no 3, pp 429–460, Taylor & Francis Ltd.
- Parasuraman R, Riley V (1997). ‘Humans and Automation: Use, Misuse, Disuse, Abuse’, Human Factors, vol 39, no 2, pp 230–253, Human Factors and Ergonomics Society.
- Sorkin R D, Woods D D (1985). ‘Systems with Human Monitors: A Signal Detection Analysis’, Human-Computer Interaction, vol 1, pp 49–75, Lawrence Erlbaum Associates, Inc.
- Wickens C D, Gempler K, Morphew M E (2000). ‘Workload and Reliability of Predictor Displays in Aircraft Traffic Avoidance’, Transportation Human Factors, vol 2, iss 2, pp 99–126, Taylor & Francis Ltd.
- Adams B D (2005). ‘Trust vs. Confidence’, Defence Research and Development Canada (DRDC) Toronto.
- Alarcon G M, Gibson A M, Jessup S A (2020). ‘Trust Repair in Performance, Process, and Purpose Factors of Human-Robot Trust’, 2020 IEEE International Conference on Human-Machine Systems (ICHMS), pp 1–6, IEEE.
- Antifakos S, Kern N, Schiele B, Schwaninger A (2005). ‘Towards Improving Trust in Context-Aware Systems by Displaying System Confidence’, Mobile HCI’05, Association of Computing Machinery (ACM).
- Coeckelbergh M (2011). ‘Can we trust robots?’, Ethics and Information Technology, 14 (2012), pp 53–60, Springer Nature.
- D’Cruz J (2020). ‘Trust and Distrust’, The Routledge Handbook of Trust and Philosophy, 1st edition, ch 3, Routledge.
- Ess C M (2020). ‘Trust and Information and Communication Technologies’, The Routledge Handbook of Trust and Philosophy, 1st edition, ch 31, Routledge.
- Foroughi C K, Devlin S, Pak R, Brown N L, Sibley C, Coyne J T (2021). ‘Near-Perfect Automation: Investigating Performance, Trust, and Visual Attention’, Human Factors, Human Factors and Ergonomics Society.
- Fox J E, Boehm-Davis D A (1998). ‘Effects of Age and Congestion Information Accuracy of Advanced Traveler Information Systems on User Trust and Compliance’, Transportation Research Record: Journal of the Transportation Research Board, vol 1621, iss 1, pp 43–49, The National Academies of Sciences, Engineering, Medicine – Transportation Research Board.
- Gille F, Jobin A, Ienca M (2020). ‘What we talk about when we talk about trust: Theory of trust for AI in healthcare’, Intelligence-Based Medicine, Elsevier B.V.
- Goldberg S C (2020). ‘Trust and Reliance’, The Routledge Handbook of Trust and Philosophy, 1st edition, ch 8, Routledge.
- Grodzinsky F, Miller K, Wolf M J (2020). ‘Trust in Artificial Agents’, The Routledge Handbook of Trust and Philosophy, 1st edition, ch 23, Routledge.
- Hoff K A, Bashir M (2015). ‘Trust in Automation: Integrating Empirical Evidence on Factors That Influence Trust’, Human Factors, vol 57, no 3, pp 407–434, Human Factors and Ergonomics Society.
- Jiang H, Kim B, Guan M Y (2018). ‘To Trust Or Not To Trust A Classifier’, 32nd Conference on Neural Information Processing Systems, Neural Information Processing Systems (NeurIPS).
- Kunkel J, Donkers T, Michael L, Barbu C-M, Ziegler J (2019). ‘Let Me Explain: Impact of Personal and Impersonal Explanations on Trust in Recommender Systems’, CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Association for Computing Machinery (ACM).
- Mayer R C, Davis J H, Schoorman F D (1995). ‘An Integrative Model of Organizational Trust’, Academy of Management Review, vol 20, no 3, pp 709–734, Academy of Management.
- Muir B M (1994). ‘Trust in automation: Part I. Theoretical issues in the study of trust and human intervention in automated systems’, Ergonomics, vol 37, no 11, pp 1905–1922, Taylor & Francis Ltd.
- Reeves B, Nass C (1996). The media equation: how people treat computers, television, and new media like real people and places, Cambridge University Press.
- Sullins J P (2020). ‘Trust in Robots’, The Routledge Handbook of Trust and Philosophy, 1st edition, ch 24, Routledge.
- Troshani I, Hill S R, Sherman C, Arthur D (2020). ‘Do We Trust in AI? Role of Anthropomorphism and Intelligence’, Journal of Computer Information Systems, vol 61, iss 5, pp 481–491, Taylor & Francis Online.
- Tschopp M (2020). ‘AI and Trust: Stop asking how to increase trust in AI’, scip AG blog [online], accessible at: https://www.scip.ch/en/?labs.20200220, SCIP. (Accessed: 1 November 2021)
- Wang N, Pynadath D V, Rovira E, Barnes M J, Hill S G (2018). ‘Is It My Looks? Or Something I Said? The Impact of Explanations, Embodiment, and Expectations on Trust and Performance in Human-Robot Teams’, Persuasive Technology: 13th International Conference, pp 56–69, Springer.
- Wieringa M (2020). ‘What to account for when accounting for algorithms’, FAT* '20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp 1–18, Association of Computing Machinery (ACM).
- Yin M, Vaughan J W, Wallach H (2019). ‘Understanding the Effect of Accuracy on Trust in Machine Learning Models’, CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Association of Computing Machinery (ACM).
- Zhang Y, Liao Q V, Bellamy R K E (2020). ‘Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making’, FAT* '20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp 295–305, Association for Computing Machinery (ACM).