Xerox Engineering Research Fellows
2018 Research Opportunities
Research Project #1: Learning Models for Spatio-Temporal Video Analysis
The availability of large-scale annotated image datasets and the rapid growth of GPU computing capability have paved the way for the recent success of computer vision. Modern vision systems can now match or even exceed human performance in some image analysis domains. However, the same level of performance has not been observed for their counterparts in video. Part of the reason is the lack of large-scale annotated video datasets, but the core problem is that current models make poor use of the spatio-temporal structure inherent in video. In this project, you will collect data from the web, e.g., YouTube, and build deep learning models for action recognition in videos.
Research Project #2: 3D Reconstruction of Live Performances
When attending a live musical performance, people often use smartphones or cameras to record exciting moments. However, these videos are often imperfect: they suffer from fixed viewing angles, occlusions, poor lighting, etc. Imagine a situation in which these cameras can "talk" to each other. We could then build a vision system that uses the multi-view video information to recognize fine-grained actions and scene elements, register the space-time locations of the recordings, and reconstruct the whole live performance so that no exciting moment is missed. In this project, you will collect data with wearable devices, such as GoPro cameras, and build a vision system to analyze data collected from multiple devices.
How do we develop technologies that understand and respond to human emotion? Can a computer reliably understand facial expressions, spoken words, body language, and intonation, and make a prediction about the mental state of the human participant?
Beyond recognizing human mental state, what new possibilities can this technology enable? Can it help us be more creative, become better speakers, improve our language-learning skills, get feedback on group dynamics in a meeting, help individuals with social difficulties, and much more?
We welcome students with interest and expertise in machine learning, speech analysis, web programming, UX design, signal processing, and running experiments with human participants. To learn more about the group, visit http://www.cs.rochester.edu/hci/ or follow us @rochci on Twitter.
We are looking for students with strong math (MTH 161/165) and programming (CSC 171/172) skills to work on one of the following topics:
- Computer vision: recognition of objects, scenes, people, locations, actions and events from images and video
- Social media data mining: prediction, forecasting, profiling, and recommendation using open source data
- Machine learning: learning with large scale loosely labeled web data, cross-domain learning, language + vision
- Biomedical informatics: structural and topological analysis from 3D imaging data; surgical video analysis
- Mobile / Pervasive computing: context-aware applications; multimodal inference from multiple sensors