Spoken Natural Language Dialog Systems: A Practical Approach
Ronnie W. Smith and D. Richard Hipp

Print publication date: 1995

Print ISBN-13: 9780195091878

Published to Oxford Scholarship Online: November 2020

DOI: 10.1093/oso/9780195091878.001.0001

Experimental Results

Chapter 7 Experimental Results

Oxford University Press

One of the main goals of this research was to develop a computational model that could be implemented and tested. Testing could serve at least two purposes: (1) demonstrate the viability of the Missing Axiom Theory for dialog processing; and (2) determine the ways that varying levels of dialog control influence the interaction between user and computer. Consequently, an experiment involving use of the system was constructed to test the effects of different levels of dialog control. The format and results of this experiment are reported in this chapter.

The following hypotheses describe how user performance is expected to change as users gain experience and have the initiative.

• Task completion time will decrease.
• The number of utterances per dialog will decrease.
• The percentage of “non-trivial” utterances will increase (a non-trivial utterance is any utterance longer than one word).
• The average length of a non-trivial utterance will increase.
• The rate of speech (number of utterances per minute) will decrease.

These hypotheses are consistent with the intuition that as the user has more initiative, the user will put more thought into the process, reducing the rate of interaction. In addition, it is expected that when the user has more initiative, the user will attempt to convey more detailed information in each non-trivial utterance. Finally, it is also believed that increased user initiative will be more helpful when the user gains experience and has more knowledge about performing the task independent of computer guidance.

Two graduate students in computer science volunteered to use the system. Each subject received about 75 minutes of training on the speech recognizer with the 125-word vocabulary. The subjects then participated in three sessions on different days. Each session consisted of four different problems, where each problem consisted of a single missing wire. The results from these subjects tended to support our hypotheses. However, the experimental control for this testing was not well defined. The two subjects are involved in AI and NL research and consequently have strong preconceptions about NL systems and what constitutes “proper” behavior toward such systems.
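
The five measures named in the hypotheses above can be computed directly from a timed transcript of one dialog. The sketch below is only an illustration and is not taken from the book: the function name `dialog_metrics`, the `DialogMetrics` record, and the input format (a list of user utterance strings plus a completion time in seconds) are all assumptions, as is the choice to count only user utterances and to split words on whitespace.

```python
from dataclasses import dataclass


@dataclass
class DialogMetrics:
    """Hypothetical record holding the five measures for one dialog."""
    completion_minutes: float
    utterance_count: int
    nontrivial_percent: float
    mean_nontrivial_length: float
    utterances_per_minute: float


def dialog_metrics(utterances, completion_seconds):
    """Compute the five hypothesis measures for a single dialog.

    utterances: list of user utterance strings (assumed transcript format).
    completion_seconds: task completion time in seconds (must be positive).
    """
    if completion_seconds <= 0:
        raise ValueError("completion_seconds must be positive")

    # Word count per utterance, using whitespace splitting (an assumption).
    word_counts = [len(u.split()) for u in utterances]

    # A non-trivial utterance is any utterance longer than one word.
    nontrivial = [n for n in word_counts if n > 1]

    total = len(word_counts)
    minutes = completion_seconds / 60.0

    return DialogMetrics(
        completion_minutes=minutes,
        utterance_count=total,
        nontrivial_percent=100.0 * len(nontrivial) / total if total else 0.0,
        mean_nontrivial_length=(
            sum(nontrivial) / len(nontrivial) if nontrivial else 0.0
        ),
        utterances_per_minute=total / minutes,
    )


# Made-up example dialog, not data from the experiment.
example = dialog_metrics(
    ["yes", "the switch is up", "no", "the LED is off"],
    completion_seconds=120,
)
```

Comparing these values across sessions and across levels of dialog control is what the hypotheses call for; the sketch covers only the per-dialog computation.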

Keywords: Audio tape transcription, Computer-human interaction, Experimental design, Human-computer interaction, Metadialog, Non-trivial utterances, Pilot subjects, Radio Shack, Subject vocabulary, Verbex

