
Undergraduate Programs

Xerox Engineering Research Fellows

2018 Research Opportunities

Computer Science


Professor Chenliang Xu
Department of Electrical and Computer Engineering and Computer Science
chenliang.xu@rochester.edu

Research Project #1: Learning Models for Spatio-Temporal Video Analysis

The availability of large-scale annotated image datasets and the rapid growth of GPU computing capability have paved the way for the recent success of computer vision. Modern vision systems can now match or even exceed human performance in some image analysis domains. However, the same level of performance has not been observed for their video counterparts. Part of the reason is the lack of large-scale annotated video datasets, but at the core is the weak exploitation of the spatio-temporal structure inherent in video. In this project, you will collect data from the web (e.g., YouTube) and build deep learning models for action recognition in videos.

Research Project #2: 3D Reconstruction of Live Performances

When attending a live musical performance, people often use smartphones or cameras to record exciting moments. However, these videos are often imperfect: they suffer from fixed viewing angles, occlusions, poor lighting, and similar problems. Imagine a situation in which these cameras can "talk" to each other. We could then build a vision system that uses the multi-view video information to recognize fine-grained actions and scene elements, register the space-time locations of the recordings, and reconstruct the whole live performance so that no exciting moment is missed. In this project, you will collect data with wearable devices, such as GoPro cameras, and build a vision system to analyze data collected from multiple devices.


Professor M. Ehsan Hoque
Department of Computer Science
mehoque@cs.rochester.edu

Research Project:

How do we develop technologies that understand and respond to human emotion? Can a computer reliably understand facial expressions, spoken words, body language, and intonation, and predict the mental state of the human participant?

In addition to recognizing human mental state, what new possibilities can this technology enable? Can it help us be more creative, become better speakers, and improve our language learning skills? Can it give us feedback on group dynamics in a meeting, help individuals with social difficulties, and much more?

We welcome students with interest and expertise in machine learning, speech analysis, web programming, UX design, signal processing, and running experiments with human participants. To learn more about the group, visit http://www.cs.rochester.edu/hci/ or follow us @rochci on Twitter.


Professor Jiebo Luo
Department of Computer Science
jluo@cs.rochester.edu

Research Project:

We are looking for students with strong math (MTH 161/165) and programming (CSC 171/172) skills to work on one of the following topics:

  1. Computer vision: recognition of objects, scenes, people, locations, actions and events from images and video
  2. Social media data mining: prediction, forecasting, profiling, and recommendation using open source data
  3. Machine learning: learning with large scale loosely labeled web data, cross-domain learning, language + vision
  4. Biomedical informatics: structural and topological analysis from 3D imaging data; surgical video analysis
  5. Mobile / pervasive computing: context-aware applications; multimodal inference from multiple sensors

Professor Yuhao Zhu
Department of Computer Science
yzhu@rochester.edu

Research Project #1: Storage Optimization for Virtual Reality Videos

Virtual Reality (VR) will have profound social impact and enhance human abilities in transformative ways. For instance, VR offers the potential to solve the opioid epidemic, as VR experiences have been shown to reduce patient pain more effectively than traditional medical treatments. However, today's VR experience is far from desirable because today's computer systems lag fundamentally behind the unprecedented computational requirements that VR technologies entail. As a first step toward improving the VR experience, this project will focus on using machine learning techniques to optimize the delivery of 360-degree VR videos from VR service providers (e.g., YouTube, Facebook) to end-user devices (e.g., Google Cardboard and Samsung Gear VR). We will examine both algorithmic limitations and operating/storage system-level inefficiencies.

Research Project #2: Image Signal Processor Design for Stereo Camera Systems

The Image Signal Processor (ISP) is at the heart of any modern camera. It converts raw camera sensor data to RGB frames that can then be processed by various computer vision and robotics algorithms. This project focuses on ISP design for stereo camera systems, which use multiple camera sensors to obtain depth information from the scene (e.g., Microsoft Xbox Kinect and Intel RealSense). We will focus on understanding the computational inefficiencies in today's stereo camera ISPs and on designing better image signal processing algorithms and hardware systems.