Voice AI

A collection of our works on the topic of Voice AI.

Singing for the Missing: Bringing the Body Back to AI Voice and Speech Technologies

The ever-evolving capabilities of AI technologies to generate and synthesise voice material pose unique challenges when it comes to understanding how these technologies implicate the Body. Our paper seeks to uncover new practices for bringing back the Body in voice-Bodies, whilst also nurturing a space for the constructive development, with the participation of practitioners, informed use, and consensual deployment of generative voice and speech technology.

This approach is informed by interdisciplinary perspectives from science and technology studies, philosophy of technology, critical technology studies in feminism, AI and machine learning, sound studies, and musical practices. Drawing on these perspectives, we probe how the Body is made invisible, connecting with concepts from sound studies and electroacoustic music compositional theory and practice.

Informed by this interdisciplinary perspective, we propose a series of recommendations, based on similar progress in adjacent media fields, for how to take the Body back when engaging with AI speech and voice technologies.

The paper is now accessible via the following link: https://dl.acm.org/doi/10.1145/3658852.3659065

Sounding out extra-normal AI voice: Non-normative musical engagements with normative AI voice and speech technologies

This paper explores SpeechBrain, OpenAI, and CoquiTTS voice and speech models through a research-through-design-inspired, exploratory engagement with pre-trained speech synthesis models, seeking their musical affordances in an experimental vocal practice.

This paper discusses the subversion of the normative function of speech recognition and speech synthesis models to provoke nonsensical AI-mediation of human vocality. Emerging from a research-through-design-inspired process, we uncover the generative potential of non-normative usage of normative AI voice and speech models and contribute insights about the affordances of research-through-design to inform artistic working processes with AI models. How do AI-mediations reform our understandings of human vocality? How can artistic perspectives and practice guide the uncovering of knowledge when working with technology?

The paper is accessible via the following link: https://aimc2024.pubpub.org/pub/extranormal-aivoice/release/1

Acknowledgements

This work was supported by the Wallenberg AI, Autonomous Systems and Software Program – Humanities and Society (WASP-HS) funded by the Marianne and Marcus Wallenberg Foundation and the Marcus and Amalia Wallenberg Foundation.