NRT Trainee Team Presentations

The Practicum in Augmented and Virtual Reality (ECE 501-1) is the third course offered as part of the PhD training program on augmented and virtual reality (AR/VR). The goal of the course is to provide interdisciplinary collaborative project experience in AR/VR. The course involves small teams of students from multiple departments working together on semester-long projects on AR/VR with the guidance of one or more faculty involved in the PhD training program. The expected end products of this practicum course are tangible artifacts that represent what the students have learned, discovered, or invented. Types of artifacts include research papers; patent applications; open-source software; as well as online tutorials and videos for undergraduates, K-12 students, or the general public.

At the end of the semester, trainee teams of 2-3 present their work in a session for fellow trainees, faculty, and invited guests. The teams are also expected to continue testing and refining their research and to present their practicum projects at future NRT-organized or NRT-sponsored events.

Fall 2021 Presentation Abstracts

An acoustic and computer graphics perspective for HRTF personalization for spatial audio in AR/VR

Yuxiang Wang and Neil Zhang

The head-related transfer function (HRTF) describes how humans perceive spatial audio. HRTFs vary with direction and differ from person to person. Most current spatial audio displays use a generic HRTF to provide spatial cues to users, which may result in vague or misplaced spatial perception. Personalizing HRTFs has wide applications in AR/VR games and virtual scene reconstruction.

To achieve this goal efficiently, we need a research background in both acoustics and machine learning, as well as some input from computer vision. We are combining our strengths from different backgrounds to find a unique approach to this research question.

We first found efficient representations to reduce the dimensionality of both the HRTF data and the head-torso geometry. Then we built a machine learning framework to learn the relation between the compact representations of the acoustic data and the geometric data. By adapting these methods, we were able to predict global HRTFs with acceptable error across all spatial and temporal dimensions.
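The two-stage idea described above (compress both data sets, then learn a mapping between the compact codes) can be illustrated with a minimal Python sketch. This is not the team's method: PCA and a least-squares linear map are stand-ins chosen for brevity, the data are synthetic, and every name and dimension here (`fit_pca`, `run_demo`, 40 subjects, 3 latent factors) is an assumption made for illustration.

```python
import numpy as np


def fit_pca(data, n_components):
    """Fit a PCA basis via SVD; return (mean, components)."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def run_demo(seed=0):
    """Return the relative HRTF prediction error on held-out synthetic subjects."""
    rng = np.random.default_rng(seed)
    n_subjects, n_latent, geom_dim, hrtf_dim = 40, 3, 30, 200

    # Synthetic stand-in data: geometry and HRTFs share a few latent factors.
    latent = rng.normal(size=(n_subjects, n_latent))
    geometry = latent @ rng.normal(size=(n_latent, geom_dim))
    geometry += 0.01 * rng.normal(size=geometry.shape)
    hrtf = latent @ rng.normal(size=(n_latent, hrtf_dim))
    hrtf += 0.01 * rng.normal(size=hrtf.shape)

    train, test = slice(0, 30), slice(30, None)

    # Step 1: compact representations of both data sets (training subjects only).
    g_mean, g_basis = fit_pca(geometry[train], n_latent)
    h_mean, h_basis = fit_pca(hrtf[train], n_latent)
    g_codes = (geometry[train] - g_mean) @ g_basis.T
    h_codes = (hrtf[train] - h_mean) @ h_basis.T

    # Step 2: learn a linear map from geometry codes to HRTF codes.
    weights, *_ = np.linalg.lstsq(g_codes, h_codes, rcond=None)

    # Predict HRTFs for held-out subjects from their geometry alone.
    test_codes = (geometry[test] - g_mean) @ g_basis.T
    predicted = test_codes @ weights @ h_basis + h_mean
    return np.linalg.norm(predicted - hrtf[test]) / np.linalg.norm(hrtf[test])


if __name__ == "__main__":
    print(f"relative error on held-out subjects: {run_demo():.4f}")
```

On real measurements the relation between geometry and HRTF is unlikely to be this close to linear, so a nonlinear encoder and regressor would probably replace both steps; the sketch only shows how the pieces fit together.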

Augmented Pianoroll

Frank Cwitkowitz, Eleni Patelaki, & Jeremy Goodsell

Learning to play the piano can be a difficult, lengthy, and frustrating task. What’s more, this process typically involves learning how to read sheet music, which can be cumbersome and unintuitive for many people. Recently, in the digital era, the need to understand sheet music has been circumvented by employing animated “pianoroll”, a binary time-frequency representation that indicates the piano keys which should be active across time. This representation of music makes playing piano more accessible, but it is still disconnected from the instrument, as one must follow the animation on a separate device. In this project, we propose to overlay animated pianoroll directly atop a physical piano using the HoloLens 2. We generalize our application such that, through a user interface, a user can select any MIDI file in the device’s storage to display as augmented pianoroll, anchoring it to the piano using QR code tracking. Furthermore, we provide real-time feedback on the user’s performance by comparing the expected key activity to the key activity estimated using an on-device piano transcription model. The application was successful and will undergo future development in preparation for some form of release.
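One simple way to turn the comparison between expected and estimated key activity into feedback is a frame-wise precision/recall score. The Python sketch below is an illustration under assumptions, not the application's actual scoring: it represents each pianoroll as a list of frames, each frame being a set of active MIDI pitches, and the function name `framewise_scores` is hypothetical.

```python
def framewise_scores(expected, estimated):
    """Compare two pianorolls frame by frame.

    Each pianoroll is a list of frames; each frame is a set of active
    MIDI pitch numbers. Returns (precision, recall, f1).
    """
    true_pos = false_pos = false_neg = 0
    for exp_frame, est_frame in zip(expected, estimated):
        true_pos += len(exp_frame & est_frame)   # keys played as expected
        false_pos += len(est_frame - exp_frame)  # extra keys played
        false_neg += len(exp_frame - est_frame)  # expected keys missed
    played = true_pos + false_pos
    wanted = true_pos + false_neg
    precision = true_pos / played if played else 0.0
    recall = true_pos / wanted if wanted else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Four frames: the player misses E4 (64) once and adds a stray G4 (67).
expected = [{60}, {60, 64}, {64}, set()]
estimated = [{60}, {60}, {64, 67}, set()]
print(framewise_scores(expected, estimated))  # (0.75, 0.75, 0.75)
```

Low recall means expected keys were missed and low precision means extra keys were pressed, so the two numbers could drive different visual hints.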

An AR Solution for Blind Navigation

Narges Mohammadi and Shadi Sartipi

This project, entitled “Developing AR/VR solutions for blind people navigation”, is about developing a user-friendly application that aims to translate the visual information captured by the device's cameras and sensors into a mix of audio signals that give the user a sense of awareness of the surrounding environment. This application requires scene understanding using spatial meshing, machine vision for object detection, and spatial audio generation. In effect, this work tries to give a sound to silent objects in the environment and to present the semantic information in the surroundings to users through combinations of spatial audio cues, so that they can not only avoid obstacles but also locate objects of interest, without cognitive overload.
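As a rough illustration of the last step, the Python sketch below maps the position of a detected object to left and right channel gains. It is a deliberately simplified stand-in, not the project's audio pipeline: it uses constant-power stereo panning plus a linear distance falloff rather than true HRTF-based spatial audio, and the function name `stereo_cue`, the coordinate convention, and the 5 m range are assumptions.

```python
import math


def stereo_cue(x, z, max_range=5.0):
    """Map an object position to (left_gain, right_gain).

    x: metres to the user's right (negative = left).
    z: metres straight ahead of the user.
    Objects at or beyond max_range are silent.
    """
    # Azimuth: 0 is straight ahead, positive is to the right.
    azimuth = math.atan2(x, z)
    # Clamp to the frontal hemisphere for this simple panner.
    azimuth = max(-math.pi / 2, min(math.pi / 2, azimuth))
    # Constant-power panning: pan angle runs from 0 (left) to pi/2 (right).
    pan = (azimuth + math.pi / 2) / 2
    # Closer objects are louder; loudness falls linearly to zero at max_range.
    distance = math.hypot(x, z)
    loudness = max(0.0, 1.0 - distance / max_range)
    return loudness * math.cos(pan), loudness * math.sin(pan)


# A chair 2 m ahead and 2 m to the right is heard mostly in the right ear.
left, right = stereo_cue(2.0, 2.0)
print(f"left={left:.2f} right={right:.2f}")
```

A real system would also need elevation and front/back cues, which stereo panning cannot convey, and a policy for limiting how many objects sound at once to keep the cognitive load manageable.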