Development of a measurement scale to assess Automated Driving System HMIs
report
While the introduction of Automated Driving Systems (ADS) is intended to make driving safer and more comfortable, it may also make the interaction between user and vehicle more complex and prone to confusion. Users will sometimes be the active driver and at other times act as supervisor of the automation system. These different roles come with different responsibilities for the user in terms of vehicle control and traffic monitoring, which may create unintended safety risks if the user is not aware of what his/her responsibilities are and what the automation system is capable of. It is therefore important to be able to assess to what extent an ADS provides its users with clear and unambiguous information about the system state and about the actions that are allowed or expected from the user. In this project, TNO has developed a questionnaire to measure whether the information provided by an ADS to its user is experienced as clear and unambiguous. The questionnaire consists of nine items, which were shown to discriminate between good and bad Human Machine Interface (HMI) designs in terms of user perception and comprehension. The development and evaluation of this questionnaire was done in four steps.

First, a summary was compiled of existing methods and knowledge on the evaluation of HMIs with respect to "clear and unambiguous" and similar concepts. This was done on the basis of a literature review and expert interviews, with a particular focus on automation systems. Knowledge from other relevant domains, such as automation in the medical and military domains as well as in aviation, was also included. The results showed that, while design guidelines and principles are often fairly concrete and specific for the low-level perceptual characteristics of HMIs (e.g., use of colour and symbol size), they quickly become more general and abstract when it comes to how well the user can comprehend the information provided by a system. Moreover, quantitative measures of HMI quality are rare and no clear criteria for HMI evaluation are available. Hence, the current state of affairs is that each individual HMI has to be assessed on an ad-hoc basis, with methods that need to be defined and implemented specifically for that HMI and are often qualitative rather than quantitative.

The second step was to define questionnaire items for a more general HMI evaluation method, applicable not only to HMIs for different ADS, but also to a variety of user groups and use contexts. Initial items were formulated on the basis of the literature review and expert interviews and were then ranked by a group of HMI design experts. This resulted in a set of 15 questionnaire items, presented as Likert items on a 7-point scale.

In order to test the reliability and internal consistency of the questionnaire, as well as to explore its underlying factors, an online survey was conducted as a third step. In total, 99 participants evaluated six HMI displays for six different ADAS using the set of 15 questionnaire items defined in step 2. The HMIs were designed to vary in the degree to which they provided clear and unambiguous information. Data analysis showed that responses on some of the items were highly correlated. A subset of nine items was found to discriminate well between HMIs with different levels of clarity and ambiguity. Exploratory factor analysis revealed two underlying factors, which could be interpreted as relating to perception and comprehension of HMI information.
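The report itself contains the full analysis; as a minimal sketch only, the kind of item-level reliability and factor analysis described for step 3 could look as follows in Python, assuming a hypothetical 99 × 15 matrix of 7-point Likert responses. The data, the use of pandas and scikit-learn's FactorAnalysis, and the unrotated two-factor model are illustrative assumptions, not the study's actual tooling.

```python
# Sketch (hypothetical data): reliability and factor structure of 7-point
# Likert responses, roughly mirroring the step-3 analysis (99 respondents x 15 items).
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Placeholder response matrix: 99 respondents, 15 items, scores 1..7.
responses = pd.DataFrame(
    rng.integers(1, 8, size=(99, 15)),
    columns=[f"item_{i + 1}" for i in range(15)],
)

def cronbach_alpha(df: pd.DataFrame) -> float:
    """Cronbach's alpha as a measure of internal consistency of the scale items."""
    k = df.shape[1]
    item_vars = df.var(axis=0, ddof=1)
    total_var = df.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))

# Inter-item correlations: highly correlated items are candidates for removal.
corr = responses.corr()
off_diag = corr.where(~np.eye(len(corr), dtype=bool))
print("Highest inter-item correlation:", round(off_diag.max().max(), 2))

# Two-factor exploratory model, as a stand-in for the EFA reported in the study.
fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(responses)
loadings = pd.DataFrame(
    fa.components_.T, index=responses.columns, columns=["factor_1", "factor_2"]
)
print(loadings.round(2))
```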
In the fourth and final step, the 9-item questionnaire was tested in an experimental setup. Twenty-three participants were presented with movie clips of transitions from or to automated driving, including a view of the traffic situation (through the windshield) and of the instrument cluster and steering wheel. Again, the HMIs were designed to be either clear and unambiguous or unclear and ambiguous. The results showed that the questionnaire discriminated reliably between the two versions of the HMI. As in the online survey, high reliability and internal consistency were observed. Additional questions intended to measure user comprehension more objectively turned out to be less useful. By contrast, qualitative measurement of user experience by means of Product Reaction Cards did show clear differences between the two HMI versions.

While the questionnaire discriminated successfully between clear and unclear HMIs, it only measured user perception and comprehension, not user response, as the latter was beyond the scope of the project. In real life, of course, user response is highly relevant, as it determines the safety impact of an HMI. Further research should therefore focus on the extent to which an HMI promotes adequate and timely user responses. Several other relevant factors, such as user diversity, user attention and the traffic situation, should be considered as well. Moreover, in order to use the questionnaire in HMI assessment, a reference point should be established by means of a benchmark HMI.
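For the experimental comparison in the fourth step, a minimal sketch of a within-subject comparison between the two HMI versions is given below, assuming hypothetical per-participant mean questionnaire scores; the paired t-test is an illustrative choice only, as the abstract does not name the statistical test actually used.

```python
# Sketch (hypothetical scores): comparison of mean 9-item questionnaire scores
# for a clear vs. an unclear HMI version, in the spirit of the step-4 experiment
# with 23 participants. The actual analysis in the report may differ.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_participants = 23
# Placeholder mean scores on the 7-point scale per participant and HMI version.
clear_hmi = np.clip(rng.normal(5.8, 0.6, n_participants), 1, 7)
unclear_hmi = np.clip(rng.normal(3.4, 0.8, n_participants), 1, 7)

t_stat, p_value = stats.ttest_rel(clear_hmi, unclear_hmi)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```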
TNO Identifier
987831
Publisher
TNO
Collation
66 p. (incl. appendices)
Place of publication
Helmond