Kinect generates a lot of data. For example, 1 second of video at a resolution of 640x480 pixels for both depth and color at 30 frames per second generates approximately 70 MB of data (640 x 480 x 4 bytes per pixel x 30 frames per second x 2 streams = 73,728,000 bytes), together with a comparatively small amount of audio and skeletal tracking data.
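The arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate: the constant and function names are illustrative, and the 4 bytes per pixel figure is taken from the text as-is for both streams.

```python
# Raw data rate for one second of Kinect video, using the figures from the text.
WIDTH, HEIGHT = 640, 480      # pixels per frame
BYTES_PER_PIXEL = 4           # as assumed in the text, for both color and depth
FRAMES_PER_SECOND = 30
STREAMS = 2                   # one color stream plus one depth stream


def raw_bytes_per_second(width=WIDTH, height=HEIGHT,
                         bytes_per_pixel=BYTES_PER_PIXEL,
                         fps=FRAMES_PER_SECOND, streams=STREAMS):
    """Return the uncompressed size in bytes of one second of video."""
    return width * height * bytes_per_pixel * fps * streams


total = raw_bytes_per_second()
mebibytes = total / 2**20     # 1 MiB = 1,048,576 bytes

print(total)       # 73728000
print(mebibytes)   # 70.3125
```

The "approximately 70 MB" figure corresponds to binary megabytes (about 70.3 MiB); in decimal units the same number is about 73.7 MB.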
One option is to compress the data using standard image compression. Using lossy JPEG for color and lossless PNG for depth reduces the overall size of a typical recording by approximately 95%. A single frame is shown below in Figure 1.
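To make the lossless half of that idea concrete, here is a minimal sketch of storing a depth frame as a 16-bit grayscale PNG, using only the Python standard library. It is not the recorder's actual code: it assumes depth values fit in 16 bits, uses a synthetic frame, and the function names are made up for illustration. The decoder only understands files written by this encoder, and the compression achieved on synthetic data says nothing about real recordings.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_depth_png(depth, width, height):
    """Encode a row-major list of 16-bit depth values as a grayscale PNG."""
    # IHDR: width, height, bit depth 16, color type 0 (grayscale),
    # compression 0, filter method 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 16, 0, 0, 0, 0)
    rows = bytearray()
    for y in range(height):
        rows.append(0)  # per-scanline filter type 0 (None)
        row = depth[y * width:(y + 1) * width]
        rows += struct.pack(f">{width}H", *row)  # big-endian samples
    idat = zlib.compress(bytes(rows), 9)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


def decode_depth_png(png):
    """Decode a PNG produced by encode_depth_png (not a general decoder)."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, width, height, idat = 8, 0, 0, b""
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width, height = struct.unpack(">II", data[:8])
        elif kind == b"IDAT":
            idat += data
        pos += 12 + length  # length + type + data + CRC
    raw = zlib.decompress(idat)
    stride = 1 + 2 * width  # filter byte plus two bytes per sample
    depth = []
    for y in range(height):
        row = raw[y * stride + 1:(y + 1) * stride]
        depth.extend(struct.unpack(f">{width}H", row))
    return depth


# Synthetic 640x480 frame: a smooth ramp of made-up depth values.
WIDTH, HEIGHT = 640, 480
frame = [800 + x // 4 + y // 8 for y in range(HEIGHT) for x in range(WIDTH)]

png = encode_depth_png(frame, WIDTH, HEIGHT)
restored = decode_depth_png(png)
print(restored == frame)                 # True: the round trip is lossless
print(len(png), "bytes vs", WIDTH * HEIGHT * 2, "raw")
```

Because PNG is lossless, every depth value survives the round trip exactly, which matters for depth data in a way it does not for color, where JPEG artifacts are usually tolerable.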
I also wanted to package the recording files (i.e. color, depth, audio, skeletal tracking, etc.) into a single container. The .NET System.IO.Packaging namespace provides a convenient wrapper for packaging files according to the Open Packaging Conventions (OPC). Provided the packages are named with a .zip extension, they can be opened using Windows Explorer, and additional files can be added, provided they correspond to MIME types defined in the [Content_Types].xml file in the package.
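The layout of such a package can be sketched outside .NET as well. The Python below writes a zip archive containing a [Content_Types].xml part and then reads back which file extensions it declares. It imitates the container layout only and does not use System.IO.Packaging; the part names, placeholder payloads, and helper functions are hypothetical, and a package built this way is not guaranteed to satisfy everything a strict OPC reader expects.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Namespace used by [Content_Types].xml in OPC packages.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def content_types_xml(defaults):
    """Build [Content_Types].xml from an {extension: MIME type} mapping."""
    rows = "".join(
        f"<Default Extension={quoteattr(ext)} ContentType={quoteattr(mime)}/>"
        for ext, mime in defaults.items()
    )
    return (f'<?xml version="1.0" encoding="utf-8"?>'
            f'<Types xmlns="{CT_NS}">{rows}</Types>')


def write_package(target, parts, defaults):
    """Write parts ({name: bytes}) plus the content-types part into a zip."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", content_types_xml(defaults))
        for name, data in parts.items():
            zf.writestr(name, data)


def declared_extensions(source):
    """Return the set of extensions declared in [Content_Types].xml."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    return {el.get("Extension").lower()
            for el in root.findall(f"{{{CT_NS}}}Default")}


# Hypothetical part names; the payloads are placeholders, not real images.
parts = {
    "color/000000.jpg": b"placeholder color frame",
    "depth/000000.png": b"placeholder depth frame",
}
defaults = {"jpg": "image/jpeg", "png": "image/png", "xml": "application/xml"}

buffer = io.BytesIO()          # in-memory stand-in for a recording.zip file
write_package(buffer, parts, defaults)
buffer.seek(0)

extensions = declared_extensions(buffer)
print(sorted(extensions))      # ['jpg', 'png', 'xml']
print("wav" in extensions)     # False: a .wav part would need its own entry
```

A check like `declared_extensions` is a quick way to see whether a file dropped into the archive by hand has a matching content type before the package is reopened from .NET.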
Another advantage of using images to encode color and depth data is the ability to browse the data using thumbnail icons in Windows Explorer.