Fall 2018 Projects
Magic Paintbrush: Using Neural Style Transfer to convert an interactive painting to a masterpiece in real time
Zen Tang & Ngan Vu
🏆 Class Choice Award!
Being able to produce beautiful paintings is a dream that many of us have had. Recent work on style transfer has shown that translating sketches into art-like paintings is possible. In this project, we seek to create a real-time interactive system for generating high-fidelity images with low effort. The user will be able to use a painting interface to iteratively refine an image, and their changes will be reflected in the generated image. The goal of this project is to create a medium that feels like a canvas: because of the high-quality, immediate feedback, the user will feel that they are directly influencing what is being generated.
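The abstract does not specify the model or interface; as a minimal sketch of the interactive loop it describes, one could feed the user's canvas through a pretrained image-to-image generator on every frame. The checkpoint name `sketch_to_painting.pt` and the use of a webcam as the canvas source are illustrative assumptions, not details from the project:

```python
# Minimal sketch of the interactive painting loop described above.
# Assumes a pretrained image-to-image generator saved as a TorchScript
# module; the checkpoint name and canvas source are hypothetical.
import cv2
import numpy as np
import torch

generator = torch.jit.load("sketch_to_painting.pt")  # hypothetical checkpoint
generator.eval()

cap = cv2.VideoCapture(0)  # stand-in for the painting canvas feed
while True:
    ok, canvas = cap.read()
    if not ok:
        break
    # Convert the user's canvas (BGR uint8) to a normalized tensor.
    rgb = cv2.cvtColor(canvas, cv2.COLOR_BGR2RGB)
    x = torch.from_numpy(rgb).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        y = generator(x)  # stylized image, same spatial size
    out = (y.squeeze(0).permute(1, 2, 0).clamp(0, 1).numpy() * 255).astype(np.uint8)
    cv2.imshow("generated painting", cv2.cvtColor(out, cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```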
Learned babbling for more emotionally intelligent virtual agents
Trevor Buckner & Robert Baines
A human's ability to connect emotionally with a virtual agent depends largely on the agent's ability to express emotion and to detect the emotion of the user. A large part of emotional expression in humans comes from speech, and from the underlying tones and cadences within that speech. Current technology is not at a stage where a virtual agent can dynamically generate authentic, intelligible speech beyond pre-selected phrases. We propose a system based on deep neural networks that is trained to recognize and classify the emotion in a person's speech, and then to reciprocate that emotion by generating babbling noises that approximate human speech with the same emotion. This would provide more relatable emotional feedback to the user, and could perhaps be used for next-generation non-player characters in video games, or as a practical tool to help people on the autism spectrum learn how the tonality of their voice might affect social perception.
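The abstract leaves the architecture open; a minimal sketch of the speech-emotion classification half, assuming MFCC summary features and a small feed-forward classifier (the feature choice, emotion set, and file name are illustrative assumptions, not project details):

```python
# Minimal sketch of the speech-emotion classification half of the system.
# MFCC features, the emotion labels, and the input file are assumptions.
import librosa
import torch
import torch.nn as nn

def mfcc_features(wav_path, n_mfcc=40):
    """Load an audio clip and summarize it as a fixed-length MFCC vector."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return torch.from_numpy(mfcc.mean(axis=1)).float()       # (n_mfcc,)

# Small feed-forward classifier over the summary features.
emotions = ["neutral", "happy", "sad", "angry"]
model = nn.Sequential(
    nn.Linear(40, 64), nn.ReLU(),
    nn.Linear(64, len(emotions)),
)

x = mfcc_features("utterance.wav")            # hypothetical input clip
probs = torch.softmax(model(x), dim=-1)
print(emotions[int(probs.argmax())])
```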
Dance Dance Robolution: Detecting the beat of a dancer in a video stream
Julia Lu, Roland Huang & Valerie Chen
This paper outlines an optical tracking system in development for detecting the beat of a dancing person captured on a video stream. Our system uses a color-detection algorithm to track the dancer's hand and a novel approach to convert hand movement into beats per minute. We have prototyped two modules that use our beat-detection system: one aims to play back songs from a standard list whose BPM matches the person's rhythm, and the other is a multi-tracking system that aims to generate music with varying pitch and tempo.
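The abstract mentions color-based hand tracking and conversion of hand movement into BPM; a minimal sketch of such a pipeline with OpenCV and SciPy might look like the following (the HSV range for the tracked hand or marker and the 10-second capture window are assumptions):

```python
# Minimal sketch of the color-tracking + BPM pipeline described above.
# The HSV range for the tracked hand/marker is an assumption.
import cv2
import numpy as np
from scipy.signal import find_peaks

LOWER = np.array([100, 120, 70])   # assumed HSV range for a colored marker
UPPER = np.array([130, 255, 255])

cap = cv2.VideoCapture(0)
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
ys = []  # vertical position of the hand over time

while len(ys) < int(fps * 10):     # collect ~10 seconds of motion
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)
    m = cv2.moments(mask)
    if m["m00"] > 0:               # centroid of the colored region
        ys.append(m["m01"] / m["m00"])
cap.release()

# Treat local maxima of the vertical motion as beats.
ys = np.array(ys)
peaks, _ = find_peaks(ys, distance=fps * 0.3)   # caps estimate at ~200 BPM
if len(peaks) > 1:
    avg_period = np.mean(np.diff(peaks)) / fps  # seconds per beat
    print("estimated BPM:", 60.0 / avg_period)
```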
AR-Mirror
Jared Weinstein & Robert Gerdisch
In the last couple of years, “smart” mirrors have gained popularity as an easy DIY project. These systems overlay additional visual information on the mirror’s surface; popular widgets include weather reports, calendars, and running news feeds. We seek to improve on previous work by creating a system that displays content in the mirror’s reflected 3-D space. The content moves in real time according to changes in the user’s perspective.
This paper describes our proof-of-concept system. We limit ourselves to the same constraints as prior mirror projects: our system captures information through a single camera source and must render content in real time. With these constraints in mind, we tackle three primary goals: (1) recognize and track a user’s position in real-world coordinates, (2) display content superimposed on the real world based on the user’s perspective, and (3) detail a simple way of interacting with the mirror through head movement.
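Goal (2) above reduces to a small piece of projective geometry: given the tracked eye position, find where on the glass a virtual point in the reflected 3-D space should be drawn. A minimal sketch, assuming the mirror plane sits at z = 0 with the viewer on the positive-z side (a convention not stated in the abstract):

```python
# Minimal sketch of the perspective math behind goal (2): where on the
# mirror surface to draw a virtual object so that it appears at a fixed
# point in the reflected 3-D space. The coordinate convention (mirror at
# z = 0, viewer at z > 0, virtual content at z < 0) is an assumption.
import numpy as np

def project_to_mirror(eye, point):
    """Intersect the eye->point ray with the mirror plane z = 0.

    eye   -- tracked viewer position, z > 0 (in front of the mirror)
    point -- virtual object position, z < 0 (inside the reflected space)
    Returns the (x, y) coordinate on the mirror surface.
    """
    eye, point = np.asarray(eye, float), np.asarray(point, float)
    t = eye[2] / (eye[2] - point[2])        # parameter where the ray hits z = 0
    hit = eye + t * (point - eye)
    return hit[:2]

# As the viewer's head moves, the same virtual object lands on a
# different spot of the glass -- the effect the abstract describes.
print(project_to_mirror(eye=[0.0, 0.0, 0.6], point=[0.1, 0.2, -0.5]))
print(project_to_mirror(eye=[0.2, 0.0, 0.6], point=[0.1, 0.2, -0.5]))
```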
Creating an Autonomous System to Play Pool
Shivam Sarodia & Dibya Bhattacharjee
In this project, we will produce a perception and decision-making system to play the game of pool. Pool is a turn-based two-player game in which each player strikes the white cue ball to pot the balls of their assigned group (stripes or solids) into one of the table's six pockets, until only the black ball remains, which must then be potted. We build our system to operate on an online pool video game. Our system is given access to an API for mouse movement and an API for taking screenshots of the computer screen. Using this functionality, our autonomous system attempts to beat its opponent at pool.
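The abstract does not detail the decision-making; one standard geometric building block for choosing shots is the “ghost ball” aim point, sketched below (the ball radius and coordinates are illustrative assumptions, not values from the project):

```python
# Sketch of one standard geometric building block for shot selection:
# the "ghost ball" position the cue ball must reach so that the object
# ball is driven toward a pocket. Radius and coordinates are illustrative.
import numpy as np

BALL_RADIUS = 11.25  # assumed units (e.g., pixels on the screenshot)

def ghost_ball(object_ball, pocket, r=BALL_RADIUS):
    """Point the cue ball's center must occupy at the moment of contact."""
    object_ball, pocket = np.asarray(object_ball, float), np.asarray(pocket, float)
    direction = pocket - object_ball
    direction /= np.linalg.norm(direction)
    # Back off two radii from the object ball, opposite the pocket direction.
    return object_ball - 2 * r * direction

def aim_angle(cue_ball, target_point):
    """Angle (radians) the mouse drag should make from the cue ball."""
    dx, dy = np.asarray(target_point, float) - np.asarray(cue_ball, float)
    return np.arctan2(dy, dx)

target = ghost_ball(object_ball=[400, 300], pocket=[800, 0])
print(target, aim_angle([200, 350], target))
```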
Perspective-Matching Background Replacement
Jason Chen and Tim Adamson
In this paper, we describe a system developed to perform perspective-matching background replacement. Our system uses several hardware components, including a robotic arm, a ZED stereo camera, three computers, and a Jetson TX2. The robotic arm tracks a person as she moves around, allowing the camera to continually film her. A mask is then applied to the images received from the camera, segmenting out everything in the room except the person. The masked image of the person is then displayed in a virtual environment built in Unity. As the person moves around the room, her character in the virtual environment, which looks exactly like her, moves around the virtual environment accordingly. This allows the 2D character of the person to interact with the 3D world in a variety of ways, such as moving around 3D objects. After the system was developed, it was tested with a variety of experiments, including navigating around 3D objects in the virtual environment, walking through a virtual doorway, and moving the robotic arm through a range of motions. The experiments demonstrated the success of the system, but also revealed its inability to run in real time.
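As a rough illustration of the masking step described above (not the project's actual segmentation code), the person's pixels can be kept and everything else made transparent before the frame is handed to the virtual environment:

```python
# Minimal sketch of the masking step: keep only the person's pixels and
# make everything else transparent. The mask source here (a precomputed
# grayscale image) stands in for the project's actual person segmentation.
import cv2

def cut_out_person(frame_bgr, person_mask):
    """frame_bgr: HxWx3 uint8 image; person_mask: HxW uint8, 255 = person."""
    rgba = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2BGRA)
    rgba[:, :, 3] = person_mask        # alpha channel follows the mask
    return rgba

frame = cv2.imread("frame.png")                      # hypothetical ZED frame
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # hypothetical segmentation
if frame is not None and mask is not None:
    cv2.imwrite("person_cutout.png", cut_out_person(frame, mask))
```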
Target Tracking: Using imitation learning to track moving targets
Kristina Shia & Anushree Agrawal
In this project, we use imitation learning to teach the simulated Shutter robot to follow an object of interest that moves across its field of vision over time in a more natural, human-like way. Previously, we have worked on orienting Shutter to exactly center an object in its view; this may not be the most natural way to indicate that the object has attracted the robot’s attention, especially in social interaction settings, where precise target tracking may feel uncanny. We use machine learning models to map absolute target positions to robot joint states and relative target positions to changes in joint state. Using these models, the robot’s behavior remains precise but not human-like.
We then use teleoperation to generate data, in which an expert directly controls the robot with a joystick to follow a target in a natural way. We train a machine learning model on this data and find that the resulting imitation-learning behavior is more natural than that produced by the previous absolute and relative inverse-kinematics approaches. Further work includes conducting user evaluation studies to see whether people prefer target tracking driven by the imitation learning-based model or by the inverse kinematics-based one. Additionally, extending the system to the real robot tracking human movement in a room would test whether the imitation learning method is also more natural in a real-life setting.
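As a minimal sketch of the supervised mapping described above (relative target position to change in joint states, fit to the teleoperation demonstrations), with array shapes and file names as assumptions rather than project details:

```python
# Minimal sketch of the supervised mapping described above: relative
# target position -> change in joint states, fit to teleoperation
# demonstrations. Array shapes and file names are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

# Each demonstration row pairs a relative target offset with the
# joint-state change the human operator commanded in response.
demos = np.load("teleop_demos.npz")          # hypothetical recorded data
X = demos["relative_target"]                 # (N, 2)
Y = demos["joint_deltas"]                    # (N, num_joints)

policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
policy.fit(X, Y)

# At run time, feed the current target offset and apply the predicted
# joint-state change to Shutter's controllers.
joint_delta = policy.predict(np.array([[0.15, -0.05]]))
print(joint_delta)
```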
Imitation Learning for Dexterous Manipulation with an Underactuated Hand
Yutaro Yamada & Seonghoon Noh
Advancement in dexterous robot manipulation continues to extend its application to different tasks. A particular type of robotic hand known as an underactuated robotic hand has demonstrated its ability to successfully grasp and manipulate objects of different sizes and geometries. While the hand’s passive compliance given by the underactuated mechanism makes it practical for many manipulation tasks, much work needs to be done in designing the control methodology for within-hand manipulation.
In this paper, we present a general within-hand controller for the Yale OpenHand Model T42, an underactuated robotic hand, for object reconfiguration. The general controller consists of locally optimal policies trained via imitation learning. The controller takes a pair of configurations (A and B) as an input and generates a sequence of low-level actuator commands called Precision Manipulation Primitives (PMPs) to manipulate an object from A to B within the hand.
The derivation of the PMPs is briefly presented followed by a description of the hardware setup, the data collection process, and the neural network. Preliminary evaluation of the general controller is discussed, and issues with the hand’s actuators which inhibited the complete evaluation are addressed.
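The abstract describes the controller's interface: a pair of configurations in, a sequence of low-level actuator commands out. A schematic sketch of that loop is shown below, with the policy and hand model as placeholders rather than the project's actual PMP implementation:

```python
# Schematic sketch of the controller interface described above: given a
# pair of object configurations (A, B), repeatedly query a learned local
# policy for actuator commands until the object reaches B. The policy
# and hand model are placeholders, not the project's code.
import numpy as np

class WithinHandController:
    def __init__(self, policy, tolerance=0.005, max_steps=200):
        self.policy = policy          # maps (current_config, goal) -> command
        self.tolerance = tolerance
        self.max_steps = max_steps

    def move(self, hand, config_a, config_b):
        """Drive the object from config_a toward config_b within the hand."""
        config = np.asarray(config_a, float)
        goal = np.asarray(config_b, float)
        commands = []
        for _ in range(self.max_steps):
            if np.linalg.norm(config - goal) < self.tolerance:
                break
            cmd = self.policy(config, goal)       # one primitive-level command
            commands.append(cmd)
            config = hand.apply(cmd)              # observed object configuration
        return commands

# Toy demonstration: a proportional "policy" and a hand model that moves
# the object half of each commanded step.
class ToyHand:
    def __init__(self, config):
        self.config = np.asarray(config, float)
    def apply(self, cmd):
        self.config = self.config + 0.5 * cmd
        return self.config

hand = ToyHand([0.0, 0.0])
controller = WithinHandController(policy=lambda c, g: g - c)
print(len(controller.move(hand, [0.0, 0.0], [0.02, -0.01])))
```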