CompSci students help singers voice their vowels

Voice students who want to perfect how they sing their vowels could get help from a new, simple, free application developed by a group of University of Rochester students as part of their Human-Computer Interaction computer science class.

Ehsan Hoque, an assistant professor in computer science who recently joined the University and was teaching the class for the first time in fall 2013, wanted students to take one lesson away from the class: when using computing to solve real-life problems, it is important to consider people first.

His own research led to the connections that inspired this project. Katherine Ciesinski, a mezzo-soprano and professor at the University’s Eastman School of Music, had read in a University newsletter about an award Hoque had received for his work on human nonverbal behavior analysis. Hoque’s research focus is on improving computers’ understanding of human emotions from voice and facial cues and leveraging that to help people in a range of situations.

“Singing is in great part conveying emotions,” said Ciesinski. “Learning how to do that is part of the learning process of becoming a singer.” She thought there might be areas of common interest between the two departments and reached out to Hoque. He then invited her to speak to his class about possible challenges in voice training that a computer could help solve.

With that in mind, Ciesinski and her voice and opera colleagues at the Eastman School of Music put their heads together and came up with a series of issues that they thought might pose interesting, useful problems for the computer science students to work on.

“We were motivated to solve a real-life problem,” said Cynthia Ryan, a graduate student in the class. “When Professor Ciesinski showed us how learning to sing vowels is challenging, it caught our attention.” Team Moose, as the group that created the vowel singing computer application “Vowel Shapes” named themselves, worked on developing a program that would address some of the challenges voice students face.

Currently, students learn how to sing their vowels by listening to their teacher sing and trying to match the sound. With their application Vowel Shapes, Team Moose planned to add an extra sense to their learning experience – vision. The application automatically analyzes the vowel sounds produced by a singer and generates a visual representation of the sound in real-time.

The students from the Human-Computer Interaction class also needed to ensure that their application would offer advantages over existing systems. Existing speech training systems are not only expensive, but also not designed with singers in mind, as they require singers to wear some form of apparatus around their throat, which constrains the way they sing.

Figure 1 – Circle, as in the “eh” sound

Figure 2 – Wide and short oval, as in the “ee” sound

The students ran iterative experiments with the singers using different visualizations and found that depicting the sung vowels as an oval was the best way for voice students to quickly learn how to use the application. The oval shapes generated by the application vary depending on the sounds – from a circle, to a flattened, wide and short oval, or to a tall and narrow one. For example, an “eh” sound yields something close to a circle (figure 1). On the other hand, the “ee” sound would be described by a wide but short oval (figure 2).
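The three oval shapes described above can be told apart by a single number, the ratio of width to height. The following is a minimal sketch of that idea only, not Team Moose's code: the function name `describe_oval` and the `roundness_tolerance` parameter are hypothetical, and the article does not say how the application derives an oval's width and height from the sound.

```python
def describe_oval(width: float, height: float, roundness_tolerance: float = 0.1) -> str:
    """Classify an oval by its aspect ratio (width divided by height).

    Hypothetical helper for illustration; the 0.1 tolerance is an assumed value.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    aspect = width / height
    # An aspect ratio close to 1 is treated as a circle (the "eh" example).
    if abs(aspect - 1.0) <= roundness_tolerance:
        return "circle"
    # Wider than tall matches the "ee" example; otherwise tall and narrow.
    return "wide and short" if aspect > 1.0 else "tall and narrow"
```

Under these assumptions, an oval twice as wide as it is tall would be labeled "wide and short," like the “ee” sound in figure 2.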

Vowel Shapes allows the teacher to be a central part of the learning process. The application records the teacher singing the required vowel sounds. The students and teacher can collect a whole library of sounds the student needs to practice. Any of these vowel sounds can be recalled from the library and will be shown as a blue oval on a screen. The student will then sing into the microphone, trying to match the teacher’s sound. As the student sings, the program automatically generates an oval shape on the screen, shown in yellow. The shape of the oval dynamically changes as the students vary their vowel sound. When the program establishes that the student matched the teacher’s vowel, it changes the color of the oval to green.
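The color change described above can be sketched as a simple comparison between the teacher's oval and the student's. This is an illustration under stated assumptions rather than the application's actual logic: the `Oval` class, the `student_oval_color` function, and the 10 percent default `tolerance` are all hypothetical, and the real program may compare the sounds in a different way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oval:
    """Hypothetical representation of an oval drawn on screen."""
    width: float
    height: float


def student_oval_color(teacher: Oval, student: Oval, tolerance: float = 0.1) -> str:
    """Return the color for the student's oval: green on a match, yellow otherwise.

    A match is assumed to mean that width and height are each within
    `tolerance` (as a fraction) of the teacher's recorded oval.
    """
    width_error = abs(student.width - teacher.width) / teacher.width
    height_error = abs(student.height - teacher.height) / teacher.height
    matched = width_error <= tolerance and height_error <= tolerance
    return "green" if matched else "yellow"
```

In this sketch, a smaller `tolerance` makes the oval harder to turn green, and a larger one makes it easier.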

To validate Vowel Shapes, the team set up a study with 11 voice students and compared their performance using Vowel Shapes with their performance using traditional methods. Results demonstrated that students were able to produce vowels more effectively in less time using Vowel Shapes than using the traditional method (i.e., practice with a professor only).

Using Vowel Shapes provided an added level of feedback. The students could see how slight changes in the shape of their mouth or the position of their tongue would lead to changes in the displayed oval shape and therefore in the vowel they were singing.

One of the main advantages of Vowel Shapes is its portability and accessibility. The students have made it publicly available and it can be downloaded onto a laptop, for example, to test it out. This means students could try this at home, or in a practice studio, having previously recorded the teacher’s vowels they want to practice. Ciesinski explains that this better suits the needs of training singers, as they often only get an hour a week with their teachers and a lot of the practice needs to come in their own time.

The students know that the application can still be improved. “There was only so much we could fine tune in six weeks,” Ryan explains. “But we’ve had great feedback from voice students who tried it so we want to continue making some changes.” For example, they want to improve the signal-to-noise ratio, and they want to continue to experiment with the tolerance levels – the point at which the program decides the vowels match and the oval becomes green. They have also made the application open source so that anyone who is interested can adjust it to their needs.

The Human-Computer Interaction class students who form Team Moose all have very different backgrounds. The team was formed by Veronika Alex, a senior studying economics, computer science, and media studies; Josh Bronstein, a senior pursuing a B.A. in political science and legal studies; Nathan Buckley, a sophomore who is interested in computer science and linguistics; Tait Madsen, who began his undergraduate studies at the Eastman School of Music as a classical bass trombonist, but who has now changed his major to computer science; and Cynthia Ryan, who is pursuing a master’s degree at the University and also has been generating applications and firmware for about 30 years.
