Surface Mapping

By Dave | Project | Published 9 Dec 2012 00:49 | Last Modified 13 Jan 2013 17:11

Now that I can generate a normal map from depth data, I can avoid mapping color pixels to back-facing surfaces (technically, surfaces which are facing away from the sensor), as shown below in Video 1.

Video 1. Back-face removal of color pixels in Kinect data.

This is important to avoid visual confusion when rotating the model, since it is difficult to distinguish between the inside and outside of a textured surface.
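The test itself is just a dot product: compare the estimated normal with the direction from the surface point back to the sensor, and skip the color mapping when the surface faces away. A minimal sketch is shown below; the coordinate convention (sensor at the origin, looking down +Z) and the helper names are assumptions, not the code actually used here.

    using System;

    static class BackFaceFilter
    {
        // Returns true when a point should receive a color sample, i.e. its surface
        // normal faces the sensor (assumed to sit at the origin, looking down +Z).
        public static bool IsFrontFacing(
            float px, float py, float pz,   // point position in sensor space (metres)
            float nx, float ny, float nz,   // unit normal estimated from the depth data
            float threshold = 0.0f)         // raise slightly to also cull near-grazing surfaces
        {
            // Direction from the surface point back towards the sensor.
            float vx = -px, vy = -py, vz = -pz;
            float len = (float)Math.Sqrt(vx * vx + vy * vy + vz * vz);
            if (len < 1e-6f) return false;

            // A positive dot product means the normal points towards the sensor.
            float dot = (nx * vx + ny * vy + nz * vz) / len;
            return dot > threshold;
        }
    }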

Normal Mapping

By Dave | Project | Published 25 Nov 2012 18:40 | Last Modified 13 Jan 2013 17:12

There are some excellent solutions to surface reconstruction using Kinect, such as Kinect Fusion; however, I was still keen to understand the feasibility of extracting a basic normal map from the depth data.

In order to determine the normal vector for a given depth pixel, I simply sample surrounding pixels and look at the local surface gradient. However, because the depth values are stepped, a small sample area (particularly at larger depth values) produces a lot of forward-facing normals from the surfaces of the depth "planes", as shown below in Figure 1. Using a larger sample area improves things significantly, as shown in the second image.

Figure 1. Normal maps from raw depth data, using smaller and larger sample areas.
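For anyone curious about the mechanics, estimating a normal from the depth image amounts to taking finite differences over a sample window and crossing the resulting tangent vectors. The sketch below is illustrative only (depth in millimetres, names and conventions mine); a larger radius corresponds to the larger sample area described above.

    using System;

    static class NormalEstimation
    {
        // Estimates a normal for pixel (x, y) from a depth image by taking central
        // differences over a window of +/- radius pixels. A larger radius smooths
        // over the stepped depth values at the cost of fine detail.
        public static (float X, float Y, float Z) EstimateNormal(
            ushort[] depthMm, int width, int height, int x, int y, int radius)
        {
            int x0 = Math.Max(x - radius, 0), x1 = Math.Min(x + radius, width - 1);
            int y0 = Math.Max(y - radius, 0), y1 = Math.Min(y + radius, height - 1);

            // Average depth gradient (change in depth per pixel) in x and y.
            float dzdx = (depthMm[y * width + x1] - depthMm[y * width + x0]) / (float)(x1 - x0);
            float dzdy = (depthMm[y1 * width + x] - depthMm[y0 * width + x]) / (float)(y1 - y0);

            // Cross product of the image-space tangents (1, 0, dzdx) and (0, 1, dzdy);
            // flip the sign if your coordinate convention needs the normal reversed.
            float nx = -dzdx, ny = -dzdy, nz = 1.0f;
            float len = (float)Math.Sqrt(nx * nx + ny * ny + nz * nz);
            return (nx / len, ny / len, nz / len);
        }
    }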

The normal map then enables the point cloud to be rendered using directional lighting, as shown below in Figure 3.

Figure 3. Diffuse and specular lighting applied to point cloud.

Note that the images above are still rendered as point clouds, rather than a surface mesh.
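The lighting itself is the standard directional-light calculation evaluated per point: a Lambert diffuse term from the normal and light direction, plus a specular term, here sketched as Blinn-Phong. This is an illustrative sketch, not the shader actually used for the images above.

    using System;

    static class PointLighting
    {
        // Diffuse + specular intensity for one point under a directional light.
        // All input vectors are assumed to be normalised.
        public static float Shade(
            float nx, float ny, float nz,   // normal from the normal map
            float lx, float ly, float lz,   // direction towards the light
            float vx, float vy, float vz,   // direction towards the viewer
            float shininess, float specularStrength)
        {
            float diffuse = Math.Max(nx * lx + ny * ly + nz * lz, 0f);

            // Blinn-Phong specular from the half-vector between light and view directions.
            float hx = lx + vx, hy = ly + vy, hz = lz + vz;
            float hLen = (float)Math.Sqrt(hx * hx + hy * hy + hz * hz);
            float specular = 0f;
            if (hLen > 1e-6f)
            {
                float nDotH = Math.Max((nx * hx + ny * hy + nz * hz) / hLen, 0f);
                specular = specularStrength * (float)Math.Pow(nDotH, shininess);
            }
            return diffuse + specular;
        }
    }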

Smoothing Depth Data

By Dave | Project | Published 25 Nov 2012 17:38 | Last Modified 13 Jan 2013 17:14

The Kinect for Windows SDK exposes depth data as an array of 16-bit values, with the least-significant 3 bits used for the player index.1 There are therefore 2^13 = 8192 values available to report depth within the supported range. A sample depth image is shown below in Figure 1. Note that the shape is a result of the sensor being angled downwards, and that the black areas correspond to pixels where no depth information was reported by the sensor.

Figure 1. Kinect depth image.
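Unpacking a raw 16-bit sample into its depth and player index is a couple of bit operations, following the layout described above (the SDK also exposes bitmask constants for this; the sketch below is just for illustration):

    static class DepthUnpack
    {
        const int PlayerIndexBits = 3;                          // least-significant 3 bits
        const int PlayerIndexMask = (1 << PlayerIndexBits) - 1;

        // Splits a raw 16-bit depth sample into depth (millimetres) and player index
        // (0 = no player).
        public static (int DepthMm, int PlayerIndex) Unpack(short raw)
        {
            int value = raw & 0xFFFF;
            return (value >> PlayerIndexBits, value & PlayerIndexMask);
        }
    }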

If this image is rotated and viewed from above, as shown below in Figure 2, discrete depth bands become visible.

Figure 2. Kinect depth image, rotated to highlight depth-banding.

The intervals between depth values for another sample image are plotted against depth in Figure 3 below. Note that the sample image used did not contain any data around 1.5m in depth, so there are some jumps in the data at this point. The graph shows how the depth intervals increase in size with distance from the sensor, from around a 2mm gap at 1m depth to around a 45mm gap at 4m depth.

Figure 3. Depth step by depth.

My initial attempt at a smoothing algorithm is shown below in Figure 4. This approach looks for horizontal and vertical lines of equal depth, and interpolates data between the discrete depth bands. Since these depth bands increase in size further away from the camera, smoothing is more effective for larger depth values.

Figure 4. Raw and smoothed depth image.
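A one-dimensional sketch of the idea is shown below: find runs of equal depth along a row and replace adjacent runs with a linear ramp between their centres, leaving holes and genuine depth edges alone. This is illustrative only; the actual implementation also works vertically.

    using System;
    using System.Collections.Generic;

    static class DepthSmoothing
    {
        // Smooths one row of depth values (millimetres) by interpolating between the
        // centres of adjacent runs of equal depth. Runs separated by holes (zeros) or
        // by a step larger than maxStepMm (a real edge) are left untouched.
        public static void SmoothRow(int[] row, int maxStepMm)
        {
            var runs = new List<(int Start, int Length, int Value)>();
            for (int i = 0; i < row.Length; )
            {
                int j = i;
                while (j < row.Length && row[j] == row[i]) j++;
                if (row[i] != 0) runs.Add((i, j - i, row[i]));
                i = j;
            }

            for (int r = 0; r + 1 < runs.Count; r++)
            {
                var a = runs[r];
                var b = runs[r + 1];
                if (b.Start != a.Start + a.Length) continue;            // hole between runs
                if (Math.Abs(b.Value - a.Value) > maxStepMm) continue;  // genuine depth edge

                float c0 = a.Start + a.Length * 0.5f;
                float c1 = b.Start + b.Length * 0.5f;
                for (int x = (int)c0; x < (int)c1; x++)
                {
                    float t = (x - c0) / (c1 - c0);
                    row[x] = (int)Math.Round(a.Value + t * (b.Value - a.Value));
                }
            }
        }
    }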

1 As of version 1.6, the Kinect for Windows SDK exposes extended depth information.

Player Extraction

By Dave | Project | Published 2 Nov 2012 23:16 | Last Modified 13 Jan 2013 17:14

I thought it would be of interest to discuss how I obtained the player images shown in the previous post.

Stereo cameras have been around for some time. Kinect automates the process of extracting depth values from digital image frames, and while Kinect only provides information at a resolution of 640x480 pixels, it does so at very low cost, with relatively low computational resources, and at 30 frames per second. Figure 1 below shows a single Kinect "frame" which has been rotated and rendered as a point-cloud. The frame was captured facing the player, hence the shadows and degree of distortion in the rotated image.

Figure 1. Single Kinect frame, rotated to highlight depth.

Amongst other things, the Kinect API also has the ability to identify people and provide real-time information on joint positions in 3D space. This is shown below in Figure 2, where the skeletal information has been overlaid on the same Kinect frame as in Figure 1.

Figure 2. Single Kinect frame with skeleton overlay, rotated to highlight depth.

When skeletal tracking is enabled, "player" information is included as part of the depth feed, allowing automatic separation of pixels belonging to tracked individuals. This is shown below in Figure 3, where the same frame is rendered with non-player pixels removed.

Figure 3. Single Kinect frame showing player only, rotated to highlight depth.
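Since the player index lives in the low bits of each depth sample, removing non-player pixels is a simple mask over the raw depth array. A sketch (the names are mine, not the SDK's):

    static class PlayerExtraction
    {
        // Zeroes out any depth sample whose player index is 0, i.e. pixels that do
        // not belong to a tracked player.
        public static void MaskNonPlayerPixels(short[] rawDepth)
        {
            for (int i = 0; i < rawDepth.Length; i++)
            {
                int playerIndex = rawDepth[i] & 0x07;   // least-significant 3 bits
                if (playerIndex == 0)
                    rawDepth[i] = 0;
            }
        }
    }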

Depth Image Rendering

By Dave | Project | Published 24 Oct 2012 19:01 | Last Modified 13 Jan 2013 17:15

There are numerous ways to render depth data captured from Kinect. One option is to use a point-cloud, where each depth value is represented by a pixel positioned in 3D space. In the absence of a 3D display, one of the ways to convey depth for still images is the use of stereograms, as shown below in Figure 1.

Figure 1. Point-cloud stereogram.1
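For context, positioning each pixel in 3D space is essentially a pinhole back-projection of the depth value. The sketch below assumes nominal field-of-view figures for the depth camera and is not the SDK's own coordinate mapping:

    using System;

    static class PointCloud
    {
        // Nominal Kinect depth-camera field of view (assumed values).
        const float HorizontalFovDeg = 57f;
        const float VerticalFovDeg = 43f;

        // Back-projects a depth pixel into a 3D point in metres, with the sensor at
        // the origin looking down +Z and +Y pointing up.
        public static (float X, float Y, float Z) ToWorld(
            int px, int py, int depthMm, int width, int height)
        {
            float z = depthMm / 1000f;
            float fx = (width / 2f) / (float)Math.Tan(HorizontalFovDeg * Math.PI / 360.0);
            float fy = (height / 2f) / (float)Math.Tan(VerticalFovDeg * Math.PI / 360.0);
            float x = (px - width / 2f) * z / fx;
            float y = (height / 2f - py) * z / fy;
            return (x, y, z);
        }
    }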

In case you are wondering, I'm holding a wireless keyboard to control the image capture. Next I needed to map the texture from the color camera onto the point-cloud, as shown below in Figure 2.

Figure 2. Point-cloud stereogram1 with color mapping.

Another approach to simulating 3D without special display hardware (though it does require special glasses2) is the use of anaglyphs, which avoid the degree of training involved in "seeing" images such as stereograms. An example is shown below in Figure 3.

Figure 3. Point-cloud color anaglyph.2

Anaglyphs can be adjusted to move the image plane "forwards" or "backwards" in relation to the screen, as shown by the grayscale anaglyphs in Figures 4-6 below.

Figures 4-6. Point-cloud grayscale anaglyphs2 "behind", "co-planar" with, and "in-front" of screen plane.
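Composing an anaglyph from two rendered views is straightforward: take the red channel from the left-eye image and the green and blue (cyan) channels from the right-eye image, and shift one view horizontally to move the image plane relative to the screen. A sketch for grayscale views (packed 0xAARRGGBB output; names are illustrative):

    static class Anaglyph
    {
        // Combines left/right grayscale views (one byte per pixel) into a red/cyan
        // anaglyph. 'shift' translates the right view horizontally, which moves the
        // perceived image plane forwards or backwards relative to the screen.
        public static uint[] Compose(byte[] left, byte[] right, int width, int height, int shift)
        {
            var output = new uint[width * height];
            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    int xr = x + shift;
                    byte l = left[y * width + x];
                    byte r = (xr >= 0 && xr < width) ? right[y * width + xr] : (byte)0;
                    // Red from the left eye; green and blue from the right eye.
                    output[y * width + x] = 0xFF000000u | ((uint)l << 16) | ((uint)r << 8) | (uint)r;
                }
            }
            return output;
        }
    }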

1In order to perceive a 3D image the viewer must decouple convergence and focusing of their eyes. Looking "through" the image results in four images. The eyes are correctly converged when the two centre images "overlap". At this point the eyes must be refocussed without changing their convergence.

2In order to perceive a 3D image the viewer must use coloured filters for each eye, in this case red (left) and cyan (right).

Depth Image Capture

By Dave | Project | Published 30 Sep 2012 23:00 | Last Modified 13 Jan 2013 17:16

I previously discussed an approach for visualising 3D on a Microsoft Surface device using autostereograms. This had the advantage of supporting more than a single user, since simultaneous depth-perception is possible from opposite sides of the device. However, it suffered from the disadvantages that a degree of training is involved in "seeing" the image (in particular when the image is animated and uses a random dot pattern), and that this type of autostereogram is unable to convey color.

I thought I'd start a new project to explore the use of Microsoft Kinect to work with 3D.

Kinect is a great example of the powerful combination of both hardware (e.g. the depth camera) and software (skeletal tracking). Intriguingly, one way to think about how the depth sensor in Kinect actually works is to compare it to an autostereogram. These images allow depth perception since the human brain has a remarkable ability to infer depth from a random dot pattern when shifted in a particular way. The depth sensor in Kinect also uses shifts in position of a random dot pattern (due to parallax between the emitter and receiver) to infer depth values.

Capturing depth images using Kinect is straightforward, as demonstrated extensively in the Software Development Kit.

NUIverse Download

By Dave | Project | Published 4 Sep 2012 20:41 | Last Modified 13 Jan 2013 18:05

A beta build of NUIverse is now available for download at http://www.nuiverse.com, along with some brief documentation and additional data downloads.

Note that NUIverse is only available for installation on the Samsung SUR40 with Microsoft PixelSense, and that it is still one of my spare-time projects. As such, many features remain unimplemented and there are still bugs to be fixed. However, I welcome feedback and will do my best to respond to any questions as soon as possible.

Surface 2 Physics Download

By Dave | Project | Published 27 Jul 2012 14:30 | Last Modified 13 Jan 2013 18:04

I've finally migrated the original Surface Physics v1 library and sample to .NET 4 and the Samsung SUR40 with Microsoft PixelSense.

For many apps, migrating from Surface v1 to the SUR40 is very easy, and simply involves a search & replace of controls in the Surface v1 namespace with their new versions. In my case, because I had to do some lower-level contact-handling, things were a little more complicated.

The sample is broadly similar to the previous version, except that I have removed the "interactions" page, which relied (amongst other things) on the API accurately reporting blob orientation. Blob orientations are now only reported as either 0 or 90°, and I didn't have time to implement the raw-image processing required to replicate the behaviour originally demonstrated on this page.

The following downloads are available:

  1. Surface Physics Sample (install), .msi (zip'd), 860Kb. The sample application for demonstrating the physics library and layout control.
  2. Surface Physics Sample (source code), Visual Studio 2010 Project (zip'd), 730Kb. Source code for the sample application.
  3. Physics Library (binary), .dll (zip'd), 17Kb. The physics library and layout control.

The Readme for the v1 sample application may also prove useful.

You'll need the Microsoft Surface 2 SDK, available from the MSDN site, and access to a SUR40, or at least the Input Simulator included in the SDK.

See the project archive for older posts, and the gallery for screenshots.

NUIverse Video Part 2

By Dave | Project | Published 19 Jul 2012 09:59 | Last Modified 13 Jan 2013 17:17

I had the opportunity to demo NUIverse at the Microsoft Worldwide Partner Conference last week, and I thought I'd share the video which shows some updates since the previous recording.

Video 1. NUIverse on Samsung SUR40 with Microsoft PixelSense.

Key things demonstrated in the video include:

  • Multi-touch to control complex camera motion
  • Multi-direction UI consistent with a horizontal display form-factor and multiple concurrent users
  • Level-of-Detail rendering for planetary bodies and backgrounds
  • Independent control of time and position
  • Control selection using just-in-time-chrome
  • Satellite model rendering

Planet Selection

By Dave | Project | Published 7 Jul 2012 18:10 | Last Modified 10 Jul 2012 03:25

While it is possible to select a planetary body using a touch-and-hold gesture on its label or the body itself, finding the planet can be tricky. At the very least, the orbital lines and labels need to be visible. A bigger problem, however, is that the "inner" and "outer" planet orbits have markedly different scales. Hence, when in orbit around an outer planet, it will not generally be possible to resolve an inner planet for touch-and-hold.

I therefore added a further control for selecting a planet, which can be added from the tag selection bar, as shown below in Figure 1. Since moons are relatively close to their planet in comparison with other planets, touch-and-hold can still be used to move to a moon in orbit around the currently selected planet. Touching a planet in the selector switches the camera to orbit mode around that body.

Figure 1. Planet Selector Control. Earth is the current planetary system.

The planet names and images are built at run time to ensure that they reflect any extensions to the base data.
