Audio-CASI and its telephone counterpart, interactive voice response (IVR, also called telephone ACASI), have been shown to increase the reporting of socially undesirable behaviors relative to interviewer-administered surveys. But developing the recorded voice files is costly and time-consuming, and the recorded voice may in fact reintroduce some social presence, with respondents reacting to characteristics of the voice such as gender. One potential solution to both problems may be the use of computer-generated voices (text-to-speech systems). We conducted an experiment to explore these issues, using an IVR survey on sensitive topics. We contrasted live interviewers (CATI) and recorded human voices with two different text-to-speech (TTS) systems, one sounding more human-like, the other more machine-like. We crossed these four voice types with gender of the voice, yielding a 4 × 2 experiment. Equal numbers of male and female subjects were recruited by telephone from list-based samples of Michigan residents and randomly assigned to mode, for a total of almost 1,400 completed interviews. We examined the effect of gender and “humanness” of voice on the reporting of socially desirable and undesirable behaviors. We also examined respondents’ reactions to the different voices and compared break-off rates across conditions to explore whether TTS systems could be a reasonable alternative to recorded human voices for audio-CASI and IVR applications.