Agenda:
Intro
The Sensor
Data Source
Kinect Evolution
Data Source
Windows Store App
Body Frame
Coordinate Mapper
Kinect Studio
Gesture Recognition
Gesture Builder
2. Matteo Valoriani
CEO of
Speaker and Consultant
PhD at Politecnico di Milano
Microsoft MVP
Intel Software Innovator
mvaloriani@gmail.com
@MatteoValoriani
https://it.linkedin.com/in/matteovaloriani
Nice to Meet You
3. Clemente Giorio
Senior Developer at
Speaker, Author and Instructor
Microsoft MVP
email Clemente.Giorio@live.com
@Tinux80
http://it.linkedin.com/pub/clemente-giorio/11/618/3a
Nice to Meet You
4. Agenda
• Intro
• The Sensor
• Data Source
• Kinect Evolution
• Data Source
• Store App
• Body Frame
• Coordinate Mapper
• Kinect Studio
• Gesture Recognition
• Gesture Builder
13. 1920 x 1080 array of color pixels
• 30 or 15 fps, depending on lighting conditions
Processed image formats: RGBA, BGRA, YUY2, …
Raw data: YUY2
ColorFrameSource
19. Version 1 vs Version 2

Feature          | Version 1           | Version 2
Depth range      | 0.4 m → 4.0 m       | 0.5 m → 4.5 m
Color stream     | 640×480 @ 30 fps    | 1920×1080 @ 30 fps
Depth stream     | 320×240             | 512×424
Infrared stream  | None                | 512×424
Type of light    | Light coding        | Time of Flight (ToF)
Audio stream     | 4-mic array, 16 kHz | 4-mic array, 48 kHz
USB              | 2.0                 | 3.0
# Bodies tracked | 2 (+4)              | 6
# Joints         | 20                  | 25
Hand tracking    | External tools      | Yes
Face tracking    | Yes                 | Yes + expressions
FOV              | 57° H × 43° V       | 70° H × 60° V
Tilt             | Motorized           | Manual
20. System / Software Requirements
OS        Windows 8, 8.1, Embedded 8, Embedded 8.1 (x64)
CPU       Intel Core i7 (recommended)
RAM       4 GB (or more recommended)
GPU       DirectX 11 (required)
USB       USB 3.0 (Intel or Renesas chipsets)
Compiler  Visual Studio 2012, 2013 (Express editions supported)
Language  Native (C++), Managed (C#, VB.NET), WinRT (C#, HTML)
Other     Unity (plugin), Cinder, openFrameworks (wrappers)
23. Basic Flow of Programming
Kinect for Windows SDK v1: Sensor → Stream → Frame → Data
Kinect for Windows SDK v2: Sensor → Source → Reader → Frame → Data
Each data type has its own independent Source
(e.g. ColorSource, DepthSource, InfraredSource, BodyIndexSource, BodySource, …)
Sources do not depend on one another
(e.g. you do not need the Depth Source to retrieve Body data)
24. In “New Project” create a new Windows Store app
Enable Microphone and Webcam capabilities
Add a reference to Microsoft.Kinect
Use the Microsoft.Kinect namespace in your code
Creating a new store app using Kinect
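The setup steps above can be sketched as a minimal code-behind. This is a hedged sketch, not the talk's own sample: `MainPage` is the default page name in a blank Store app project, and the only Kinect calls used are `KinectSensor.GetDefault()` and `Open()` from the v2 SDK.

```
using Microsoft.Kinect;

public sealed partial class MainPage
{
    // One KinectSensor instance represents the physical device.
    private KinectSensor sensor;

    public MainPage()
    {
        this.InitializeComponent();

        // Acquire and open the default sensor; requires the Microphone
        // and Webcam capabilities enabled in the app manifest.
        this.sensor = KinectSensor.GetDefault();
        this.sensor.Open();
    }
}
```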
25. Represents a single physical sensor
Always valid: when the device is disconnected, no more frames are
generated
Use the IsAvailable property to verify that the device is connected
The KinectSensor class
this.kinectSensor = KinectSensor.GetDefault();
this.kinectSensor.Open();
// Make the world a better place with Kinect
this.kinectSensor.Close();
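Because `GetDefault()` always returns a valid object even with no device plugged in, connection state is observed through `IsAvailable`. A small sketch (the `Debug` class is `System.Diagnostics`; the handler body is illustrative):

```
using System.Diagnostics;
using Microsoft.Kinect;

KinectSensor sensor = KinectSensor.GetDefault();
sensor.Open();

// Fired whenever the physical device is connected or disconnected.
sensor.IsAvailableChanged += (s, e) =>
{
    // e.IsAvailable is true when the device is connected and ready.
    Debug.WriteLine(e.IsAvailable ? "Sensor connected" : "Sensor disconnected");
};
```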
27. Give access to frames via:
– Events
– Polling
Multiple readers may be created on a single source
Readers can be paused
Readers
InfraredFrameReader reader = sensor.InfraredFrameSource.OpenReader();
reader.FrameArrived += InfraredReaderFrameArrived;
...
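The snippet above uses the event model; polling is the other access pattern the slide mentions. A sketch, assuming `sensor` is an open `KinectSensor` and `PollOnce` is called from your own loop (e.g. a render tick):

```
// Polling alternative: ask the reader for the newest frame on demand.
InfraredFrameReader reader = sensor.InfraredFrameSource.OpenReader();

void PollOnce()
{
    using (InfraredFrame frame = reader.AcquireLatestFrame())
    {
        if (frame == null)
        {
            return; // no new frame since the last poll
        }
        // ... copy frame data here, inside the using block ...
    }
}
```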
29. • Gives access to the frame data
– Make a local copy or access the underlying buffer directly
• Contains metadata for the frame
– e.g. Color: format, width, height, etc.
• Important: minimize how long you hold onto the frame
– Not disposing frames will prevent you from receiving new ones
Frames
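The "hold frames briefly" rule above can be sketched as: acquire, copy, and let `using` dispose. This is an illustrative handler, not the talk's sample; `ColorImageFormat.Bgra` and `CopyConvertedFrameDataToArray` are real v2 SDK members.

```
using Microsoft.Kinect;

void ColorFrameArrived(object sender, ColorFrameArrivedEventArgs e)
{
    using (ColorFrame frame = e.FrameReference.AcquireFrame())
    {
        if (frame == null)
        {
            return; // frame already expired or was skipped
        }

        // Metadata travels with the frame.
        int width = frame.FrameDescription.Width;
        int height = frame.FrameDescription.Height;

        // Make a local copy, then let 'using' dispose the frame promptly;
        // holding frames blocks delivery of new ones.
        byte[] pixels = new byte[width * height * 4];
        frame.CopyConvertedFrameDataToArray(pixels, ColorImageFormat.Bgra);
    }
}
```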
30. • Allows the app to get a matched set of frames from multiple
sources on a single event
• Delivers frames at the lowest FPS of the selected sources
MultiSourceFrameReader
MultiSourceFrameReader multiReader =
    sensor.OpenMultiSourceFrameReader(FrameSourceTypes.Color |
                                      FrameSourceTypes.BodyIndex |
                                      FrameSourceTypes.Body);

var frame = args.FrameReference.AcquireFrame();
if (frame != null) {
    using (var colorFrame = frame.ColorFrameReference.AcquireFrame())
    using (var bodyFrame = frame.BodyFrameReference.AcquireFrame())
    using (var bodyIndexFrame = frame.BodyIndexFrameReference.AcquireFrame()) {
        // process the matched set of frames here
    }
}
32. Range is 0.5-4.5 meters
Frame data is a collection of Body objects each
with 25 joints
Each joint has position in 3D space and an orientation
Up to 6 simultaneous bodies
30fps
Hand State on 2 bodies
Lean
BodyFrameSource
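Consuming the body data described above follows the same reader/frame pattern. A hedged sketch (the handler name and threshold-free joint access are illustrative; `GetAndRefreshBodyData`, `JointType`, and `HandState` are real v2 SDK members):

```
using Microsoft.Kinect;

private Body[] bodies;

void BodyFrameArrived(object sender, BodyFrameArrivedEventArgs e)
{
    using (BodyFrame frame = e.FrameReference.AcquireFrame())
    {
        if (frame == null)
        {
            return;
        }

        // Allocate once: up to 6 Body slots, reused across frames.
        this.bodies = this.bodies ?? new Body[frame.BodyCount];
        frame.GetAndRefreshBodyData(this.bodies);

        foreach (Body body in this.bodies)
        {
            if (!body.IsTracked)
            {
                continue;
            }
            // 25 joints per body, positioned in 3D camera space (meters).
            CameraSpacePoint head = body.Joints[JointType.Head].Position;
            HandState rightHand = body.HandRightState; // e.g. Open, Closed, Lasso
        }
    }
}
```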
34. Improved reliability and accuracy
More reliable lock-on and more stable joints
More anatomically correct skeleton
Hips in the right place, new shoulder parent
Six players tracked at all times
Simplified engagement, bystander involvement
Hand-tip and thumb joints
Enables subtle and more nuanced hand gestures
Per-joint orientation
Great for character retargeting
Skeletal Tracking Features
35. ColorSpace (coordinate system of the color image)
… Color
DepthSpace (coordinate system of the depth data)
… Depth, Infrared, BodyIndex
CameraSpace (coordinate system with the origin located at the depth sensor)
… Body (Joint)
Coordinate System
36. Three coordinate systems
Coordinate mapper provides conversions between each system
Convert single or multiple points
Coordinate mapping

Name             | Applies to                  | Dimensions | Units  | Range     | Origin
ColorSpacePoint  | Color                       | 2          | pixels | 1920×1080 | Top-left corner
DepthSpacePoint  | Depth, Infrared, Body index | 2          | pixels | 512×424   | Top-left corner
CameraSpacePoint | Body                        | 3          | meters | –         | Infrared/depth camera
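A conversion between the systems in the table can be sketched as follows, assuming `sensor` is an open `KinectSensor` and `body` a tracked `Body`; `MapCameraPointToColorSpace` and `MapCameraPointToDepthSpace` are real `CoordinateMapper` methods.

```
using Microsoft.Kinect;

// Map a joint from CameraSpace (3D, meters) into ColorSpace
// (1920x1080 pixels) so it can be drawn over the color image.
CoordinateMapper mapper = sensor.CoordinateMapper;

CameraSpacePoint headPosition = body.Joints[JointType.Head].Position;
ColorSpacePoint colorPoint = mapper.MapCameraPointToColorSpace(headPosition);

// colorPoint.X / colorPoint.Y are now pixel coordinates (top-left origin).
// MapCameraPointToDepthSpace does the same for the 512x424 depth image:
DepthSpacePoint depthPoint = mapper.MapCameraPointToDepthSpace(headPosition);
```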
42. New tool, shipping with v2 SDK
Organize data using projects and solutions
Give meaning to data by tagging gestures
Build gestures using machine learning technology
Adaptive Boosting (AdaBoost) Trigger
• Determines if player is performing gesture
Random Forest Regression (RFR) Progress
• Determines the progress of the gesture performed by player
Analyze / test the results of gesture detection
Live preview of results
Gesture Builder
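A trained Gesture Builder database is consumed at runtime through the Microsoft.Kinect.VisualGestureBuilder API. A hedged sketch: the file name `swing.gbd` and the confidence threshold are hypothetical placeholders, and `sensor` is assumed to be an open `KinectSensor`.

```
using Microsoft.Kinect;
using Microsoft.Kinect.VisualGestureBuilder;

// Load the gestures tagged and trained in Gesture Builder.
VisualGestureBuilderDatabase db =
    new VisualGestureBuilderDatabase(@"Database\swing.gbd");

// 0 = no body assigned yet; set TrackingId once a body is tracked.
VisualGestureBuilderFrameSource vgbSource =
    new VisualGestureBuilderFrameSource(sensor, 0);
vgbSource.AddGestures(db.AvailableGestures);

VisualGestureBuilderFrameReader vgbReader = vgbSource.OpenReader();
vgbReader.FrameArrived += (s, e) =>
{
    using (VisualGestureBuilderFrame frame = e.FrameReference.AcquireFrame())
    {
        if (frame == null || frame.DiscreteGestureResults == null)
        {
            return;
        }
        foreach (var pair in frame.DiscreteGestureResults)
        {
            // AdaBoost trigger result: Detected flag plus a confidence value.
            if (pair.Value.Detected && pair.Value.Confidence > 0.7f)
            {
                // gesture pair.Key.Name recognized
            }
        }
        // ContinuousGestureResults exposes the RFR progress values.
    }
};
```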
45. Heuristic
• Gesture is a coding problem
• Quick for simple gestures/poses
(hand over head)
• ML can also be useful to find
good signals for the heuristic
approach
Machine Learning (ML) with G.B.
• Gesture is a data problem
• Handles signals that may not be
easily human-understandable
(progress in a baseball swing)
• Large investment for production
• Danger of over-fitting: being too
specific eliminates recognition of
generic cases
Gesture Recognition
46. General Info & Blog -> http://kinectforwindows.com
Purchase Sensor -> http://aka.ms/k4wv2purchase
Developer Forums -> http://aka.ms/k4wv2forum
Twitter Account -> @KinectWindows
Facebook Group -> http://on.fb.me/1LSflbX
LinkedIn Group -> http://linkd.in/1J9gFcY
Twitter Account -> @KinectDevelop
Google Plus Page -> http://bit.ly/1SHtduT
Kinect Resources
47. Slides and Code will be available on:
http://www.dotnetcampus.it
THANK YOU!
Q&A
Editor's Notes
The big success of Kinect is linked to the “magic” perceived by the user.
Kinect lets you create an immersive user experience in which the user is swept into another world and transformed into an avatar or a magical elf
Wider/expanded field of view
The expanded field of view enables a larger area of a scene to be captured by the camera. As a result, users can be closer to the camera and still in view, and the camera is effective over a larger total area.
In a retail scenario, for example, the sensor could be placed on a wall and track more people within a larger area for more versatile and effective interactions.
Called “Streams” in V1
1080p color camera, 30 Hz (15 Hz in low light)
The color camera captures full, beautiful 1080p video that can be displayed in the same resolution as the viewing screen, allowing for a broad range of powerful scenarios. In addition to improving video communications and video analytics applications, this provides a stable input on which to build high quality, interactive applications.
Build high-quality, augmented reality scenarios that could be used with digital signage for retail, museums, lobbies, and public spaces.
Not heat – different frequency
People have used this for: night vision, computer vision applications that need consistent illumination like face recognition, retro reflective materials also show up very well
New active infrared (IR) capabilities: 512 x 424, 30 Hz
In addition to allowing the sensor to see in the dark, the new infrared (IR) capabilities produce a lighting-independent view—and you can now use IR and color at the same time.
The new IR capabilities enable new machine learning applications, such as face recognition in varying light conditions.
Depth sensing: 512 x 424, 30 Hz, FOV: 70 x 60, one mode: 0.5–4.5 meters
With higher depth fidelity and a significantly improved noise floor, the sensor gives you improved 3D visualization, improved ability to see smaller objects and all objects more clearly, and improves the stability of body tracking.
Fitness, wellness, and entertainment scenarios can take advantage of the improved 3D visualization.
Could detect a body moving by comparing one frame to the next
Can be used directly for some background removal
Full skeletons tracked: v1: 2, v2: 6
Latency: v1 ~90 ms with processing, v2 ~60 ms with processing
Unity Pro Support:For more than just gaming, Unity Pro offers cross-platform rapid prototyping.
Build and publish apps by using tools that you already know across multiple platforms.
From Unity 5.x onward, the Pro edition is no longer required.
One instance of the service per session
All the same – not lower-higher level
Talk to USB3, Windows 8, GPU requirement
Everything is currently “WindowsPreview.Kinect”; that, of course, will change before the final RTM.
Simultaneous multi-app support
Improved multi-app support enables multiple applications to access a single sensor simultaneously.
For example, by allowing a retail app and a business intelligence app to access the same sensor, you can get business intelligence in real time in your retail space.
Replaces AllFramesReady event in V1
Windows Store support
You can now create Windows Store apps by using tools you already know, and offer them to a broad consumer audience.
You can list and sell your Kinect v2 applications in the Windows Store.
Skeleton Joints Defined:
- v1: 20 joints
- v2: 25 joints
Improved body, hand and joint orientation
With the ability to track as many as six people and 25 skeletal joints per person—including new joints for hand tips, thumbs, and shoulder center—and improved understanding of the soft connective tissue and body positioning, you get more anatomically correct positions for crisp interactions, more accurate avateering, and avatars that are more lifelike.
New and better scenarios in fitness, wellness, education and training, entertainment, gaming, movies, and communications
Improved body tracking
The enhanced fidelity of the depth camera, combined with improvements in the software, have led to a number of body tracking developments. The latest sensor tracks as many as six complete skeletons (compared to two with the original sensor), and 25 joints per person (compared to 20 with the original sensor). The tracked positions are more anatomically correct and stable and the range of tracking is broader.
In interactive scenarios, your avatars will be more stable—with more accurate body position evaluation and crisper interactions—and you have the potential for bystanders to participate.
The first generation sensor never really understood hips before. In the new generation we’ve lowered the hips to be anatomically correct
Seated positions are far superior with the new generation (no more seated mode so no more toggling…you always get all 25 joints now)
“Strong Leans” (more than 30 degrees of lean… all the way to 90 degrees)
This gives us more reliable tracking and enables things like ST tracking for push-ups
Spine Mid joint -> used to be called “spine” and was lower
Spine Base joint was higher
Clavicle is new…called “Spine Shoulder”
Xbox 360 had 20 joints and joint orientations
CameraSpacePoint was named SkeletonPoint in the V1 API. But it applies more generally, not only to Body joint positions, but to any point that is within the depth camera’s field of view. Hence, the new name.
Powerful tooling
Kinect Studio provides enhanced recording and playback, and Visual Gesture Builder lets developers build their own custom gestures that the system recognizes and uses to write code by using machine learning, increasing productivity and cost efficiency.
Develop on the go without the need to bring the Kinect sensor with you. Create custom gestures that decrease the time to prototype and test solutions.
Applications can combine triggers and progress at runtime
Use trigger to determine context, e.g. punching, walking, etc.
Use progress to drive animation once we know the context
KinectRegion kinectRegion = new KinectRegion();
Grid grid = new Grid();
KinectUserViewer userViewer = new KinectUserViewer()
{
Height = 100,
Width = 121,
HorizontalAlignment = HorizontalAlignment.Center,
VerticalAlignment = VerticalAlignment.Top,
};
grid.Children.Add(kinectRegion);
grid.Children.Add(userViewer);
kinectRegion.Content = rootFrame;
// Place the frame in the current Window
Window.Current.Content = grid;