I spent the earlier part of this month finishing up a large interactive gesture wall which was exhibited at the Eular medical congress in Paris.
For the main attraction on the exhibition stand, I developed a gestural user interface which allowed visitors to guide molecules around a large video wall, measuring approximately 6 metres wide by 4 metres high, with the objective of joining them to other molecules. Uniting molecules would cause them to bind and expand to reveal small pieces of information in the form of infographics.
The application for the wall was developed using the Processing framework, which is easily one of my favourite tools when it comes to larger-scale interactive projects as it’s generally really solid, capable of handling rich visual content, and has loads of amazing libraries and hardware integration options. It’s a great framework with a really creative, inspirational and helpful community of developers backing it. I’ve donated to the foundation and would definitely urge anyone else using it to do so too.
The hardware for the wall itself consisted of 24 bezel-less screens, stacked 6×4, with each column of 4 running off its own PC, 6 PCs in total. All 6 PCs were connected to a server which handled the synchronisation of content moving across the whole wall. Connected to each PC was a Microsoft Kinect which handled the user input by providing X and Y coordinates of the user’s hand. The interaction was very simple and allowed the user to guide molecules around. The molecules also had attraction and repulsion behaviours, which let them interact with other molecules by snapping towards target molecules and steering away from others to make space when information was revealed.
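To make the attraction and repulsion idea a little more concrete, here is a minimal sketch in plain Java of how such a behaviour could be written. It is not the code that ran on the wall: the class name, the linear falloff, and the strength, radius and damping values are all assumptions chosen for illustration.

```java
// Hypothetical sketch: names and constants are illustrative, not taken from the project.
final class MoleculeForces {
    /** Minimal 2D molecule: position and velocity in wall pixels. */
    static final class Molecule {
        double x, y, vx, vy;
        Molecule(double x, double y) { this.x = x; this.y = y; }
    }

    /**
     * Accelerates m towards other (positive strength) or away from it (negative strength).
     * The force only acts inside `radius` and fades linearly to zero at its edge.
     */
    static void applyForce(Molecule m, Molecule other, double strength, double radius) {
        double dx = other.x - m.x;
        double dy = other.y - m.y;
        double dist = Math.sqrt(dx * dx + dy * dy);
        if (dist == 0 || dist >= radius) return; // exactly overlapping, or out of range
        double falloff = 1.0 - dist / radius;
        m.vx += strength * falloff * dx / dist;
        m.vy += strength * falloff * dy / dist;
    }

    /** Moves the molecule by its velocity, then damps the velocity so it settles. */
    static void step(Molecule m, double damping) {
        m.x += m.vx;
        m.y += m.vy;
        m.vx *= damping;
        m.vy *= damping;
    }

    public static void main(String[] args) {
        Molecule guided = new Molecule(100, 100);
        Molecule target = new Molecule(160, 100);
        Molecule bystander = new Molecule(120, 100);
        applyForce(guided, target, 2.0, 200);     // snap towards the target
        applyForce(bystander, guided, -2.0, 200); // make space around the guided molecule
        step(guided, 0.9);
        step(bystander, 0.9);
        System.out.println("guided x = " + guided.x + ", bystander x = " + bystander.x);
    }
}
```

Using one function with a signed strength keeps attraction and repulsion symmetrical, so tuning one behaviour against the other is just a matter of adjusting two numbers.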
As with all of the best projects, we faced some really significant challenges in the development of the gesture wall, so I thought I’d cover a few that we came across, the solutions, and perhaps how we’d do things differently next time.
From the off, one of the first things we had to figure out was how we would display and run interactive content over such a large display area. Since we were using the Processing framework, I quickly found out about Daniel Shiffman’s ‘Most Pixels Ever’ library. This excellent library allowed for content to be synchronised across separate applications so that we could create the impression that a swarm of molecules was drifting over the entire display area. Most Pixels Ever utilises a Java-based server to which each client in the overall display area connects.
Once up and running, the server provides each client with information about the positions of on-screen graphics and the current frame count. The clients then update accordingly and return their positional info to the server.
Most Pixels Ever allows for an amazing degree of flexibility, including the ability to define very specific screen layouts, even those in which screens are spaced apart, or don’t conform to a grid-like pattern. It also allows for settings like the frame rate to be adjusted over all clients, and provides really comprehensive debug output which helps when trying to set everything up. Can’t recommend it enough!
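The underlying idea of one large canvas shared between several clients can be sketched independently of the library. The example below is not the Most Pixels Ever API. It is a hypothetical, plain Java illustration of how a client that owns one column of the wall might convert a wall-wide X coordinate into its own local coordinate and decide whether a molecule needs drawing. The column width of 1920 pixels and the wall height of 4320 pixels are assumed values.

```java
// Hypothetical illustration of a shared canvas split into columns.
// This does not use, or imitate, the Most Pixels Ever API.
final class WallLayout {
    final int columns;     // number of client PCs, one per column of screens
    final int columnWidth; // width of one column in pixels (assumed value)
    final int wallHeight;  // height of the whole wall in pixels (assumed value)

    WallLayout(int columns, int columnWidth, int wallHeight) {
        this.columns = columns;
        this.columnWidth = columnWidth;
        this.wallHeight = wallHeight;
    }

    /** Left edge of a column, measured in wall-wide pixels. */
    int offsetX(int column) {
        return column * columnWidth;
    }

    /** Converts a wall-wide X coordinate into the local X of the given column. */
    double toLocalX(int column, double globalX) {
        return globalX - offsetX(column);
    }

    /** True if a circle of the given radius overlaps the column at all. */
    boolean isVisible(int column, double globalX, double radius) {
        double localX = toLocalX(column, globalX);
        return localX + radius >= 0 && localX - radius <= columnWidth;
    }

    public static void main(String[] args) {
        WallLayout wall = new WallLayout(6, 1920, 4320);
        double moleculeX = 1900; // sits close to the seam between columns 0 and 1
        for (int column = 0; column < wall.columns; column++) {
            if (wall.isVisible(column, moleculeX, 50)) {
                System.out.println("column " + column + " draws it at local x = "
                        + wall.toLocalX(column, moleculeX));
            }
        }
    }
}
```

A molecule near a seam is visible on two columns at once, which is why every client needs the same positions for the same frame: any drift between clients would show up as a tear along the join.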
As mentioned, we used a series of Microsoft Kinects to provide the input for the gestural interface on the wall. I did a fair bit of experimentation with the Kinect in order to determine the best way to position it, and how best to translate the user’s arm movements into meaningful gestural input. During this experimentation process, we were considering ways to maximise the use of the Kinect by attempting to capture both the area immediately in front of the wall, to pick up a user’s gestures directly with the content on the wall, and the area behind the directly interacting users, to catch people passing by the wall.
With this idea in mind, we initially considered mounting the Kinects above the wall, looking down over the user from the top of the wall, angled at approximately 45 degrees so as to position the Kinect’s field of view to achieve the coverage described above. The only issue with this positioning is that the Kinect’s Y and Z axes are then projected at angles which don’t run parallel with the real world, which makes them harder to utilise. The Y axis rises up and away from the base of the wall, and the Z axis extends downwards and away from the wall.
In the interest of keeping the development of the project going at a swift pace, we decided to opt for a different solution where we positioned the Kinect at the foot of the wall, situated behind a mirror angled at 45 degrees. The Kinect was able to see over the top of the mirror, to catch the background interaction, while the bottom half of its view was bounced directly upwards by the mirror which, being at the foot of the wall, allowed it to capture gestures being performed directly in front of the wall. This solution worked fairly well, save for a few issues with lighting and a lot of tweaking to prevent the Kinect’s effective field of view from throwing too short, which we realised it tends to do if the view is obstructed.
This final solution allowed us to capture user input directly in front of the wall, along with some background tracking of passers-by. The background tracking didn’t work so well, due to blackouts in tracking wherever users were standing immediately within the field of view. I think this is a use case where an overhead tracking method would have worked far better.
Whilst, unfortunately, we didn’t have time to pursue the overhead approach on this particular project, I’ve been considering ways that we could implement it. Ultimately, the X, Y and Z positions observed within the view of the Kinect, angled 45 degrees down from above, would need to be translated in some way to provide a real-world coordinate. I think this may require the use of the polar coordinate system, as opposed to the Cartesian coordinate system, in some way. Although I haven’t completely sussed it out yet, I’m sure I’ll figure it out at some point in future!
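One candidate worth trying, offered here as an untested sketch rather than a worked-out solution, is a plain rotation about the sensor’s X axis followed by an offset for the mounting height, which would keep everything in Cartesian coordinates. The code below assumes the sensor reports points in metres with X to the right, Y up and Z along its line of sight, that the tilt angle and mounting height are measured by hand, and that the sensor has no roll or yaw. The method name and the 4 metre mounting height are hypothetical.

```java
// Hypothetical sketch: converts a point seen by a downward-tilted sensor
// into wall-aligned coordinates. All names and values are illustrative.
final class TiltedKinect {
    /**
     * @param xc, yc, zc   point in sensor space (X right, Y up, Z forward), in metres
     * @param tiltDegrees  how far the sensor is pitched down from horizontal
     * @param mountHeight  height of the sensor above the floor, in metres
     * @return {x along the wall, height above the floor, horizontal distance from the wall}
     */
    static double[] cameraToWorld(double xc, double yc, double zc,
                                  double tiltDegrees, double mountHeight) {
        double t = Math.toRadians(tiltDegrees);
        double cos = Math.cos(t);
        double sin = Math.sin(t);
        double worldX = xc;                               // unchanged by a pitch rotation
        double worldY = mountHeight + yc * cos - zc * sin; // sensor Z points partly downwards
        double worldZ = yc * sin + zc * cos;               // sensor Y leans away from the wall
        return new double[] { worldX, worldY, worldZ };
    }

    public static void main(String[] args) {
        // A hand seen 3 m along the line of sight of a sensor mounted 4 m up, tilted 45 degrees.
        double[] hand = cameraToWorld(0.5, 0.0, 3.0, 45, 4.0);
        System.out.println("x = " + hand[0] + ", height = " + hand[1]
                + ", distance from wall = " + hand[2]);
    }
}
```

A quick sanity check for the formula is a point on the sensor’s line of sight at a distance of the mounting height times the square root of two: at 45 degrees that should land on the floor, as far from the wall as the sensor is high.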
I think that one of the biggest lessons from the issues we battled with is that it is essential to be able to control the environment, as far as possible. As I mentioned, we experienced some really strange problems with the Kinect having its range cut really short because its view was obstructed too early on. It’s almost as if there is some form of adaptive ranging at play where the effectiveness is limited by how far the Kinect can see. We also found that lighting played a significant role in affecting how well the Kinect could see things, and we often had dark patches where the Kinect appeared to be locking on to light sources in its view. Apparently though, with the various modifications and updates to the hardware, this won’t be so much of an issue with the next generation of Kinect, which will come as a huge relief.
On the whole, the large gesture wall was a great success and there was a lot to be learned, purely through the observation of visitors and how they interacted with it. Although devices like the Kinect, PlayStation Move, the Wiimote, and more recently, the Leap Motion, have been emerging over the past few years, I would estimate that gestural interfaces of this kind, i.e. ‘the flailing of limbs in the air, in order to manipulate some form of digital application’, are still a relatively new and unfamiliar paradigm within the day-to-day experiences of an average user. I say this with caution as, of course, people are indeed using these interfaces. I think the real challenge, however, lies in the fact that a common gestural language is still in its infancy. With a computer mouse, we all know that left click allows us to select something and right click often allows us to explore the further options available for a particular item on screen in the form of a context menu. The same sort of conventions don’t yet exist for gestural interfaces. What this means is that, at this stage, we often have to adopt a straightforward and logical approach to gestural interaction. This also means that it’s a really exciting time to be working in this field, as people developing these interfaces have a real opportunity to come up with novel and exciting ways to bring gestural interaction into a more public domain, through the development of usable interactions and feedback.