Difference between revisions of "Stride Kinect 2012-13"

Revision as of 00:35, 3 October 2013

--Thiebaut 15:24, 19 September 2012 (EDT)

GUI Design for Kinect-Based Choreographic Tool
Julia Edwards


This Stride project is the continuation of In Kyung (Inky) Lee's Independent Study of Spring 2012. Inky's project has been cleaned up and given a new GUI to create a more user-friendly tool for dancers and choreographers.


The motivation for our project is to provide choreographers with a new software and hardware tool for composing and recording multi-dancer choreographies from a single dancer. We use a Microsoft Kinect camera/sensor to record 3D point-cloud sequences of a dancer in motion. To produce a point cloud, the Kinect projects a grid of 640x480 infrared dots, a sensor records the position of these dots in space, and the (x, y, z) coordinates of each point are computed and recorded 30 times a second in a movie. The body of a dancer intersecting the projected infrared dots creates a 3D surface moving in space. When viewing this human surface on the computer, the viewpoint can be chosen to be different from the point where the original Kinect stood during the recording.



A Video Capture Tool for Dancers & Choreographers

The movies below, titled Breakdown and North Korea, were created by InKyung (Inky) Lee as part of an Independent Study in the Fall of 2012. Breakdown was presented at Dance New Amsterdam in New York City, NY, in the Frameworks Dance Film Series event, on February 12, 2012.


Citing Inky's poster abstract[1] for CCSCNE 2012:

The video component of the movie was generated from frames captured with a Microsoft Kinect sensor, and rendered using Shiffman’s Processing application[2] that generates real-time point-clouds, and provides a user interface (UI) allowing the user to reverse, concatenate, trim, and overlap video sequences. Individual frames can be tagged as starting points for the starting overlap of other sequences. The merging of individual sequences maintains the 3D information so that the virtual view-point can be manipulated by the user on the resulting sequence. Moreover, live feed from the Kinect can be merged in real time over prerecorded choreography, allowing dancers to interact with previously recorded movies of themselves.

Additional examples of videos:

An example of colored points with special effects (wisp)

Development System

Our project uses a stand-alone Xbox 360 Kinect and two Mac computers, one running OS X 10.8.2 with a 1.8 GHz Intel Core i7 processor, and the other running OS X 10.7.4 with a 2.8 GHz Intel Quad-Core Xeon processor. The stand-alone Kinect comes with a USB adaptor, which we use to connect the device to our computers. If the Kinect was purchased along with an Xbox, the USB adaptor must be purchased separately. Currently, our project does not support Windows operating systems, as Daniel Shiffman's Point Cloud library (the code base for our project) only works with Mac OS X.

Other Notable Dance-Related Kinect Projects

Inky's Page

Check out In Kyung Lee's page on technology and the world of dance!


Alexiadis jbe.png

Alexiadis et al.[3] are using the Kinect to track a dancer (using a high level human-skeleton tracking module from the OpenNI SDK) and evaluate that dancer's performance against a gold standard. Their project provides feedback to the dancers by "aligning dance movements from two different users and quantitatively evaluating one performance against another" by comparing vectors of joint positions and joint velocities (outputs of the skeleton tracking module) and 3D-Flow Error, calculated from taking the inner product of a 3D velocity vector.

Similarly, the research of Huang et al.[4] from National Dong Hwa University of Taiwan also exploits the Kinect to evaluate a dancer's performance against the sample motions of the "teacher". Their criteria evaluate the performance based on posture and tempo accuracy. Unlike Alexiadis et al., this group created their own skeleton tracking module.

Liutkus et al.[5] from Institut Telecom use the Kinect to aid in the decomposition of dance movements into elementary motions. They also use the OpenNI SDK to access Kinect depth maps, which they incorporate into a multimodal data set with video, audio, and inertial measurements (similar to Essid et al., see directly below).

Essid jbe.png
The Kinect skeletal tracking component for a dancer

Essid et al.[6] use the depth map generated by the Kinect, along with synchronized video cameras, synchronized multi-channel audio, and body motion sensors, to capture a dancer's performance. This multi-modal corpus provides extensive abilities to capture and synchronize this data (specifically tailored for a dance class), and is intended to be used as a tool for further research in real-time interaction between humans in virtual environments (both dance-related and other environments).

Jung jbe.png
An image of a dancer, as represented by energy "particles"

Jung et al.[7] use the Kinect's 3D sensor to capture input of a dancer, process that input, and feed the processed input back into the performance (so that the dancer can interact with past input). They track their dancers using the “Flexible Action and Articulated Skeleton Toolkit”, and pass the acquired data over to a Processing program, where it is interpreted and used to create new graphics/output. One output schema was based on the concept of energy, with the dancer being represented by energy particles that are partially lost into space. Other graphics schemas are currently under development.

Original GUI

The original GUI of our system is shown below, with its main features highlighted:


Area 1
This area contains various counters, indicating how many frames are processed per second, as well as a frame counter for movies being played.
Area 2
This area contains sliders that can affect the frame rate, the depth axis of the Kinect camera, the depth of the view point, and the scale factor, giving the viewer the option of slowing down the movie and shrinking or enlarging the 3D space.
Area 3
In order to speed up real-time rendering of the points of the cloud, the software can skip every other point, or every 3rd point, 4th point, 5th point, or 6th point. While this makes the computation faster and increases the frame rate, it makes the point cloud thinner as the number of points skipped is increased.
Area 4
The input area is used to specify the index of the data file to be loaded or created.
Area 5
These are the main buttons that can be used to start recording a Kinect movie, play a Kinect movie, or set the mode to continuous play.
Area 6
These are virtual joysticks which, when moved left, right, up or down, start moving the virtual view point in the same direction in a continuous movement.
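
The point skipping described under Area 3 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name `PointSkipper`, the method `subsample`, and the flat (x, y, z) array layout are all assumptions made for the example.

```java
// Hypothetical sketch: names and data layout are assumptions, not the project's code.
class PointSkipper {
    /**
     * Returns every step-th point of a frame stored as flat (x, y, z) triples.
     * step = 1 keeps all points; step = 2 skips every other point, and so on.
     */
    static float[] subsample(float[] xyz, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        int pointCount = xyz.length / 3;
        int kept = (pointCount + step - 1) / step; // ceiling division
        float[] out = new float[kept * 3];
        int o = 0;
        for (int p = 0; p < pointCount; p += step) {
            out[o++] = xyz[3 * p];     // x
            out[o++] = xyz[3 * p + 1]; // y
            out[o++] = xyz[3 * p + 2]; // z
        }
        return out;
    }
}
```

With a full 640x480 grid (307,200 points), a step of 2 leaves 153,600 points to render and a step of 6 leaves 51,200, which is why the cloud looks thinner as the frame rate goes up.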

While this GUI is functional and has been useful for creating several video projects, it is buggy and needs a complete overhaul. The redesign is the subject of this Stride project.

The Bug Page is available here.



(From Top to Bottom)
Kinect Controls

This area controls the Kinect live feed as well as the physical Kinect:

Top Three Sliders
The same as Area 1 of the Original GUI above: This area contains various counters, indicating how many frames are processed per second, as well as a frame counter for movies being played.
Two Sliders (Teal Green)
Depth and Zoom sliders, to control shrinking or enlarging the 3D space.
Knob with 4 Buttons to Left
This is a new feature. It controls the degree of the tilt of the Kinect. The user can alter the tilt manually (using the knob), can reset the Kinect to a base tilt using the top button, the minimum tilt using the second, the maximum tilt using the third, and can make the Kinect continuously tilt up and down (like a wave) using the fourth (bottom-most) button.
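
One way to model the tilt knob and its four buttons is sketched below. Everything here is hypothetical: the class `TiltController`, its method names, and the angle limits and step size passed to the constructor are illustrative assumptions, and no actual Kinect motor call is made.

```java
// Hypothetical sketch: class name, limits, and step size are assumptions for illustration.
class TiltController {
    private final float minDeg;
    private final float maxDeg;
    private final float baseDeg;
    private float stepDeg; // signed; its sign flips when the wave reaches a limit
    private float current;

    TiltController(float minDeg, float maxDeg, float baseDeg, float stepDeg) {
        if (minDeg > maxDeg || baseDeg < minDeg || baseDeg > maxDeg || stepDeg <= 0) {
            throw new IllegalArgumentException("inconsistent tilt settings");
        }
        this.minDeg = minDeg;
        this.maxDeg = maxDeg;
        this.baseDeg = baseDeg;
        this.stepDeg = stepDeg;
        this.current = baseDeg;
    }

    /** Knob: sets the tilt manually, clamped to the allowed range. */
    float set(float deg) {
        current = Math.max(minDeg, Math.min(maxDeg, deg));
        return current;
    }

    float reset() { return set(baseDeg); } // top button
    float toMin() { return set(minDeg); }  // second button
    float toMax() { return set(maxDeg); }  // third button

    /** Fourth button: call once per update to sweep up and down like a wave. */
    float waveTick() {
        float next = current + stepDeg;
        if (next > maxDeg || next < minDeg) {
            stepDeg = -stepDeg; // reverse direction at a limit
            next = current + stepDeg;
        }
        return set(next);
    }
}
```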

Playback Controls

This area controls playback and recording of videos:

Blue Textbox
The same as Area 4 above, the input area is used to specify the index of the data file to be loaded or created
Record Button
Start or stop the recording of a new video
Play Button
Begin playing or stop playing a previously recorded video
Synchronous Record
Currently does nothing; see the "What To Do Next" section below.
Loop Toggle
Controls whether the video playing plays through only once or continuously

A Dissection of the Application

A block diagram of the application is shown below, along with a description of the various components.


PointCloud
This is the main class of the project, and it is written in Processing. It is in charge of initializing the PointCloudRingBuffer, all FileWriter threads, all FileReader threads, and the GUI. It receives and handles the raw data that is sent from the Kinect, and manages movie recording and playback. It works with FSM to communicate when events that affect the state of the GUI (such as movieStopped()) occur.
FSM (Finite State Machine)
Our implementation of a Finite State Machine is used to synchronize and communicate between the GUI and PointCloud. It is a static Java class that is initialized from PointCloud. It receives actions from both PointCloud and GUI that are processed and cause the FSM to transition from its current state to other states (if appropriate).
GUI
This class is in charge of instantiating the GUI in a separate (Processing) window from the Kinect stream. It extends the Control P5 Library[8], and informs the FSM whenever an event (e.g., a button click) is received.
PointCloudRingBuffer
This is a circular buffer data structure that we use to store video frames when either recording movies or playing back those movies. It is written in Java.
FileWriter Thread
This thread instructs PointCloudRingBuffer to save its contents to the hard disk while a movie is being recorded. It is written in Java.
FileReader Thread
This thread takes the Kinect frames from saved movies (by reading out of a ByteBuffer that is reading the file using a file channel) and puts those frames into the PointCloudRingBuffer. It is written in Java.
This file contains static variables that are necessary to record, retrieve, and display Kinect cloud points on the screen (for example, it contains a mask to retrieve a point's ID, which is useful information for when we want to properly overlap two cloud points). It is written in Java.
Parameter File
This file saves the user's playback preferences - such as the value of "ContinuousPlay" - to a file called "parameters.txt", which is located in the user's documents. It is written in Java.
XML Reader
This file reads and parses an XML file (default name "background.xml", resides in the directory of the project) that contains the names of the background files to use as overlays on cloud points, and the association of cloud-point IDs to the background files to use. It is written in Java.
This class allows for the manipulation of the viewpoint of the camera. It is written in Java.
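
To illustrate the circular buffer that sits between the Kinect stream and the FileWriter/FileReader threads, here is a minimal sketch. It is not the project's PointCloudRingBuffer: the class `FrameRingBuffer`, its generic frame type, and the non-blocking `offer`/`poll` methods are assumptions chosen to keep the example short.

```java
// Hypothetical sketch of a fixed-capacity circular buffer shared between threads.
class FrameRingBuffer<T> {
    private final Object[] slots;
    private int head; // index of the oldest stored frame
    private int size; // number of frames currently stored

    FrameRingBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        slots = new Object[capacity];
    }

    /** Adds a frame at the tail; returns false if the buffer is full. */
    synchronized boolean offer(T frame) {
        if (size == slots.length) {
            return false;
        }
        slots[(head + size) % slots.length] = frame;
        size++;
        return true;
    }

    /** Removes and returns the oldest frame, or null if the buffer is empty. */
    @SuppressWarnings("unchecked")
    synchronized T poll() {
        if (size == 0) {
            return null;
        }
        T frame = (T) slots[head];
        slots[head] = null; // release the reference
        head = (head + 1) % slots.length;
        size--;
        return frame;
    }

    synchronized int size() {
        return size;
    }
}
```

In this sketch a producer (the Kinect handler while recording, or a reader thread during playback) would call `offer`, and a consumer (a writer thread, or the renderer) would call `poll`. The indices wrap around the array, so no frames are ever shifted in memory.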

Designing a New GUI

The approach to designing the new GUI is to carefully describe the interaction between a user and the application, account for every click, double-click, and drag the user may perform, intentionally or not, and make sure the application responds in a predictable and robust fashion. Some of our preliminary work is detailed in this page.

Looking at Other GUI Designs

The first example is from MapMyRun.com, and uses a basic GUI to allow runners to create/map different types of running routes. The GUI is simple and elegant, not very interesting or complicated visually (in terms of color/design). It is a good example of implementing a straightforward, easy-to-use controller that is quite powerful.


The second example is from a music synthesis application called "Circle". The GUI is used to alter the sound/music being produced, and does not manipulate a separate screen (unlike our project). This is an example of a really complex, hard-to-understand GUI. It is extremely powerful once you know what all of the knobs, graphs, and buttons do, but is not very friendly to new users or users without the relevant audio programming knowledge. This might be a good example of what NOT to do, with the exception of the color scheme (which I think is pretty cool).

Circle JBE.png

The third example is an image I found from a Stanford student's final music project. The author used the ControlP5 library to create this GUI, and it is a good example of a nice, clean layout with an appealing color scheme. Here we can see buttons (any rectangle with a label on the inside), knobs, vertical slider bars, and labels.


Play/Start/Stop Buttons

We have decided to use a Finite State Machine (FSM) to code this interaction, starting with a GUI that has just the start and stop buttons and allows the application to play a pre-recorded movie, stop it, or record a new one. The Continuous button toggles whether the movie loops continuously.

Our original state diagram is shown below:



Added February, 2013

Idle
Not playing or recording anything, just displaying raw Kinect video.
Play Once
Transitions from Idle state when the Play Button is clicked, and Continuous Play is false. Upon entering the state, PointCloud is told to play the movie, and the GUI's Play Button is set to "P-Stop". When either another play-click is received, or the movie has played through once completely, the Play Button is set back to "Play" and the movie is stopped (if it wasn't stopped by a play-click).
Play Continuously
Transitions from Idle state when the Play Button is clicked, and Continuous Play is true. Upon entering the state, PointCloud is told to play the movie, and the GUI's Play Button is set to "P-Stop". When another play-click is received, the Play Button is set back to "Play" and the movie is stopped.
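
The play-related transitions described above can be sketched as a small state machine. This is an illustration only: the class `PlayFsm`, its method names, and the `moviePlaying` flag (which stands in for the calls made to PointCloud) are assumptions, not the project's FSM class.

```java
// Hypothetical sketch of the Idle / Play Once / Play Continuously transitions.
class PlayFsm {
    enum State { IDLE, PLAY_ONCE, PLAY_CONTINUOUSLY }

    private State state = State.IDLE;
    private String playButtonLabel = "Play";
    private boolean moviePlaying = false; // stands in for telling PointCloud to play/stop

    /** Called when the Play Button is clicked. */
    void playClicked(boolean continuousPlay) {
        if (state == State.IDLE) {
            state = continuousPlay ? State.PLAY_CONTINUOUSLY : State.PLAY_ONCE;
            moviePlaying = true;
            playButtonLabel = "P-Stop";
        } else {
            stop(); // a second play-click stops playback in either play state
        }
    }

    /** Called when the movie has played through once completely. */
    void movieFinished() {
        if (state == State.PLAY_ONCE) {
            stop();
        }
        // In PLAY_CONTINUOUSLY the movie simply loops, so the state is unchanged.
    }

    private void stop() {
        moviePlaying = false;
        playButtonLabel = "Play";
        state = State.IDLE;
    }

    State state() { return state; }
    String playButtonLabel() { return playButtonLabel; }
    boolean isMoviePlaying() { return moviePlaying; }
}
```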

Record Button

The next installment of the FSM is below (implementing the record button):



Added March 6, 2013

Idle
Not playing or recording anything, just displaying raw Kinect video.
Record Synchronously
Transitions from Idle state after the delay() timer has elapsed. After the timer has elapsed, the GUI's Play Button and Record Button are set to "P-STOP" and "R-STOP" respectively. PointCloud is then instructed to begin playing the specified video, and then to begin recording a new video (this synchronizes the start of the video with the start of recording). PointCloud keeps recording until another record-click is received, at which point both recording and playback are stopped.
Record Asynchronously
Transitions from Idle state after the delay() timer has elapsed. After the timer has elapsed, PointCloud is instructed to begin recording a new video (which will just be the live Kinect video). PointCloud continues recording until another record-click is received.
delay()
A function called from Idle state when the Record Button is clicked, used to implement a timer. If the Record Button is clicked again before the delay timer has run out, then the FSM does not transition to a new state. Otherwise, after the timer runs out, the FSM transitions to the desired record state.
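
The record-side behavior, including the cancellable delay, can be sketched as follows. This is a simplified model rather than the project's implementation: the class `RecordFsm` and its methods are hypothetical, and the timer expiry is represented by an explicit call to `delayElapsed()` instead of a real timer.

```java
// Hypothetical sketch of the record transitions with a cancellable start delay.
class RecordFsm {
    enum State { IDLE, RECORD_SYNC, RECORD_ASYNC }

    private State state = State.IDLE;
    private boolean delayPending = false; // true while the start-delay timer is running
    private boolean pendingSync = false;  // which record state the pending delay leads to

    /** Called when the Record Button is clicked. */
    void recordClicked(boolean synchronous) {
        if (state != State.IDLE) {
            state = State.IDLE; // stop recording (and playback, when synchronous)
        } else if (delayPending) {
            delayPending = false; // clicked again before the delay ran out: cancel
        } else {
            delayPending = true; // start the delay; no state change yet
            pendingSync = synchronous;
        }
    }

    /** Called when the delay timer runs out. */
    void delayElapsed() {
        if (!delayPending) {
            return; // the delay was cancelled, stay in Idle
        }
        delayPending = false;
        state = pendingSync ? State.RECORD_SYNC : State.RECORD_ASYNC;
    }

    State state() { return state; }
}
```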

Meeting Page

This page contains notes and To-Do items resulting from meetings conducted throughout the academic year 2012-13.

What To Do Next

Ideas for Further Development

As the motivation behind this program was to create multi-dancer choreographies out of a single dancer, a logical next step is to include video editing tools in the GUI to allow for the merging of videos. I imagine this functionality could be added to the GUI in several ways, but some considerations include:

  • two text boxes for the user to write the names of the videos to merge
  • a text box for the user to write the name of the output (merged) video
  • two "start frame" text boxes, signaling at which frame to begin merging each video (so that the user doesn't have to merge the two videos from the first frame of each)
  • two "end frame" text boxes, signaling at which frame to end merging each video
  • a textbox for the name of an image file to impose over the dancers (like the fire or flower previously used by Inky)
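
As a rough sketch of what the start-frame and end-frame fields above could drive, the following code overlays two frame ranges by concatenating the points of paired frames. It is only an illustration under stated assumptions: the class `MovieMerger`, the representation of a frame as a flat array of (x, y, z) values, and the pairing rule (stop when the shorter range ends) are all hypothetical, and this is not how Inky's videos were actually merged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a movie is a list of frames, a frame is a flat (x, y, z) array.
class MovieMerger {
    /** Concatenates the points of two frames into a single frame. */
    static float[] mergeFrames(float[] a, float[] b) {
        float[] out = new float[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /**
     * Overlays frames startA..endA of movie a with frames startB..endB of movie b
     * (both ranges inclusive), pairing them frame by frame. The result is as long
     * as the shorter of the two ranges.
     */
    static List<float[]> merge(List<float[]> a, int startA, int endA,
                               List<float[]> b, int startB, int endB) {
        checkRange(a, startA, endA);
        checkRange(b, startB, endB);
        int n = Math.min(endA - startA, endB - startB) + 1;
        List<float[]> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(mergeFrames(a.get(startA + i), b.get(startB + i)));
        }
        return out;
    }

    private static void checkRange(List<float[]> movie, int start, int end) {
        if (start < 0 || end < start || end >= movie.size()) {
            throw new IllegalArgumentException("invalid frame range");
        }
    }
}
```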

Note that the successful incorporation of these editing tools would likely remove the need for "Synchronous Record" (currently unimplemented), as this button is intended to allow for the synchronous recording of a previously recorded video with the live Kinect feed. With editing tools that could merge two videos, all that Synchronous record would have to offer is the ability for a dancer to see the previously recorded movie as she or he records the second video. This could indeed be helpful, although the difficulty in implementation might make it an undesirable move.

  • For example, an implementation difficulty could be: since Synchronous Record would use both a FileReaderThread AND a FileWriterThread simultaneously (Reader to display the old movie on the screen, Writer to write the new video out), the merging of the live Kinect frames and the old video frames might be off slightly, due to the delay in displaying the old frames on the screen and processing the newly acquired live frames. This means that a dancer might be dancing along to frames that were already added into the new video, but are displayed on the screen afterwards (because of the small lag).
  • A solution could be: Use Synchronous Record (consider changing the name) to play a previously recorded movie on the screen, but only record the live Kinect feed in the new video. This way, even though the lag issue still applies, the dancer can just alter the starting and ending frame numbers in the editing section to correct for the offset.

What Classes Will Be Needed?

While it is desirable that the next person working on this project be familiar with all of the classes, the main classes that one needs to know to alter the GUI are:

  • PointCloud - this is the main class. Don't worry too much about understanding how points are displayed on the screen, although this definitely could be a consideration if you need to use an equation to merge the two videos. Ask DT about how Inky would manually merge her videos. I imagine this merge will be the job of another class.
  • NewControlP5GUI - this is the class in charge of setting up the GUI. You add controls and change the layout (including colors of the GUI) here.
  • GUIFrame - this is the class that contains all of the widgets' listeners. Remember that almost every widget in the ControlP5 library needs its parent (i.e. GUIApplet) to contain a listener with the SAME NAME as is specified when that widget is created in NewControlP5GUI. Most of the new listeners created (for editing) will just make calls to the FSM class.
  • FSM - this is the finite state machine to manage all activity that both GUIFrame and PointCloud can do to alter each other or something (i.e. a Thread) that they both can update.
  • ParameterClass - this will be useful if you want to save the editing data. For example, if you want to save all of the information that the user inputs in the editing text boxes, this class is what you'll alter. Don't forget to make the call from your GUIFrame listeners to save the data.
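
The "SAME NAME" rule for listeners noted above can be illustrated with plain Java reflection. This sketch does not use ControlP5 and does not claim to reproduce its internals: the classes `WidgetListeners` and `NameDispatch`, and the widget names, are hypothetical. It only shows why a listener is never called if its method name does not match the widget's name.

```java
import java.lang.reflect.Method;

// Hypothetical parent object holding one listener method per widget name.
class WidgetListeners {
    String last = "";

    public void recordButton() { last = "record"; }
    public void playButton() { last = "play"; }
}

// Hypothetical name-based dispatcher: finds a public no-argument method
// whose name equals the widget's name and invokes it.
class NameDispatch {
    static boolean dispatch(Object parent, String widgetName) {
        try {
            Method listener = parent.getClass().getMethod(widgetName);
            listener.invoke(parent);
            return true;
        } catch (ReflectiveOperationException e) {
            return false; // no listener with that exact name: the event is dropped
        }
    }
}
```

Here `NameDispatch.dispatch(listeners, "playButton")` runs the matching method, while a misspelled name such as "playbutton" finds nothing and the click is silently ignored.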

Grace Hopper ACM Student Research Competition 2013

This project was presented in the ACM Student Research Competition at the 2013 Grace Hopper Celebration of Women in Computing. The poster (also shown at the top of this page) can be found here:

References and Related Material

  1. InKyung Lee, Kinect-Based Choreography, Poster abstract for CCSCNE 2012, cs.smith.edu/classwiki/images/9/98/InkyCCSCNE12_PosterAbstract.pdf , captured 3/1/2012
  2. Daniel Shiffman, Getting started with Kinect and Processing, from www.shiffman.net/p5/kinect/, captured 2/16/12
  3. Alexiadis et al., Evaluating a dancer's performance using kinect-based skeleton tracking, in Proc. 19th ACM Int’l Conf. Multimedia, New York, NY, 2011. http://doras.dcu.ie/16574/1/ACMGC_Doras.pdf, captured 4/2/2013
  4. Huang et al., Automatic Dancing Assessment Using Kinect, in Smart Innovation, Systems and Technologies Volume 21, 2013. http://link.springer.com/content/pdf/10.1007%2F978-3-642-35473-1_51 captured 4/3/2013
  5. Liutkus et al., Analysis of Dance Movements Using Gaussian Processes, in Proc. 20th ACM Int’l Conf. Multimedia, New York, NY, 2012. [1] captured 4/3/2013
  6. Essid S., et al., A multi-modal dance corpus for research into interaction between humans in virtual environments, J. Multimodal User Interfaces, March 2013, V. 7, N 1-2. http://doras.dcu.ie/16794/1/gc-2011-dataset.pdf, captured 4/2/2013
  7. Jung, D., et al., Requirements on Dance-driven 3-D Camera Interaction: A Collaboration between Dance, Graphic Design and Computer Science, in Proc. of the 12th Conference of the New Zealand Chapter of the ACM Special Interest Group on Computer-Human Interaction, New York, NY, 2011. [2], captured 4/2/2013
  8. ControlP5 Library, from www.sojamo.de/libraries/controlP5/, captured 3/13/13