The SlideShare above, titled Face Tracking & Face Recognition in AR, is from my presentation at Engage! Expo, at the Jacob K. Javits Convention Center in New York, on February 16, 2011, as part of the panel Augmented Reality: The State of the Market. I was joined by Ori Inbar and Alpay Kasal. The slides can also be downloaded here as a PDF.
The SlideShare above is the deck from my presentation at are2010 (Augmented Reality Event), at the Santa Clara Convention Center. It is titled Mobile AR, OOH and the Mirror World, and is partially based on my lengthier video, Mobile Augmented Reality, The Ultimate Game Changer, with more emphasis on marketing, and the implications on advertising business structure, with a focus on the convergence of mobile and OOH.
Synopsis:
There is a convergence well underway in the marketing/advertising world between mobile and OOH/DOOH marketing, that is being accelerated by a technological convergence between Mobile Augmented Reality and Mirror Worlds (3D mapping of the real world, such as Street View). This development is explored from several angles, including a thorough, but easy to understand explanation of the relevant technologies. Subplot: For structural reasons, the large-network advertising agencies are not responding to this development.
This 20 minute presentation was given on Thursday morning, June 3, to an audience of slightly over a hundred people. It was in the “Business Track” of the three-track event (the other tracks being “Technology” and “Production”).
I was pleased that the presentations of both Earthmine CEO Anthony Fassero (in the Technology track) and Blaise Aguera y Arcas, keynote speaker from Microsoft Bing Maps, covered some of the same territory, lending some qualified backing to my projections on the future shape of convergence between Mobile AR and Mirror Worlds. Excerpts from their presentations are in the videos below. Speaking on Photosynth, Microsoft Bing’s Street-Side View, and how Mirror Worlds and computer vision will provide the tight visual registration to reality needed for strong mobile AR, Aguera y Arcas put it in his own words: “That’s really… at the core of the vision that we’re trying to pursue.”
Blaise Aguera y Arcas:
Anthony Fassero:
Sources, References & Inspirations:
Selected Research: Oxford
Georg Klein: Website of PTAM researcher.
PTAM: Parallel Tracking and Mapping.
Selected Research: University of Washington
Noah Snavely: Team Leader, now at Cornell.
Photo Tourism: Precursor to Microsoft Photosynth.
Company Links: Computer Vision & Mirror Worlds:
QderoPateo: Chinese-launched AR mobile platform.
Zenitum: Computer vision software.
Microsoft LiveLabs: Home of Photosynth.
Google Earth: Google’s Mirror World.
Earthmine: Mirror World urban mapping.
EveryScape: Mirror World urban mapping.
Part Three of Six
Mobile Augmented Reality, The Ultimate Game Changer
I’ve taken my Augmented Reality & Emerging Technologies presentation and serialized it into six video episodes; the third is featured above. I am releasing them here in sequence.
If you only watch one video in this series, this is the one to watch.
Sources, References & Inspirations:
Selected Research: Oxford
Georg Klein: Website of PTAM researcher.
PTAM: Parallel Tracking and Mapping.
Selected Research: University of Washington
Noah Snavely: Team Leader, now at Cornell.
Photo Tourism: Precursor to Microsoft Photosynth.
Selected Research: Carnegie Mellon University
Yaser Ajmal Sheikh: Assistant Research Professor.
Robotics Institute: Where Carnegie Mellon’s AR research is done.
Video: Keynote:
Eric Schmidt: Speaks at Mobile World Congress.
DataViz: Data Visualization:
Links to the websites of the Data Visualization artists mentioned:
Aaron Koblin and Jer Thorp.
Data: Sources, companies or organizations mentioned:
NY Data Mine: Public Datasets released by New York City.
NYC Big Apps: Winning entries, including “WayFinder” for Android.
Data.gov: Public Datasets as released by the US federal government.
DataSF: Public Datasets as released by San Francisco.
London Datastore: Public Datasets as released by London.
Pachube: Realtime environmental sensor data.
InfoChimps: Data Commons and Data Marketplace.
DataMarketplace: Marketplace for structured data.
Numbrary: Directory of structured data.
Factual: Data mashup platform.
Company Links: Some companies mentioned:
PolarRose: Facial recognition software.
Face.com: Facial recognition software.
TAT: (The Astonishing Tribe) Mobile software.
Comverse: Mobile & network software.
QderoPateo: Chinese-launched AR mobile platform.
LiveLabs: Microsoft LiveLabs, home of Photosynth.
Leica Geosystems: Maker of 3D laser scanners.
Artescan: 3D scanning services company.
Earthmine: Mirror World urban mapping.
EveryScape: Mirror World urban mapping.
Google Earth: Google’s Mirror World.
Ustream: live streaming video.
Livestream: live streaming video.
Justin.tv: live streaming video.
Zenitum: Computer vision software.
Metaio: Augmented Reality development software.
Seac02: Augmented Reality development software.
I’ve taken my Augmented Reality & Emerging Technologies presentation and serialized it into six video episodes; the first is featured above. The remaining five episodes have been animated and scripted, and will be released in sequence as I complete the voice-overs.
On a side note, I have just returned from France where I was attending Laval Virtual and, thanks to Ben Thomas, had the opportunity to tour the impressive advanced technology showroom at Echangeur in Paris. In fact, it is thanks to Ben that I was able to make it to Laval at all. When the French train workers’ union went on strike, he rented a car to drive us down.
Sources, References & Inspirations:
VIDEO: Dennou Coil, the complete series (links to episode 1)
PDF: Man-Computer Symbiosis by J.C.R. Licklider, 1960
(PDF also includes The Computer as a Communication Device by J.C.R. Licklider & Robert Taylor, 1968)
When discussing Location-Aware Mobile Augmented Reality with clients or friends they are often initially mystified by how it works without using any form of tagging or QR codes. In short, this video is a visualization of the first conversation I usually have when the subject comes up. I’ve created it as a simple explanation to demystify the technology for those who are just becoming familiar with it.
The visual shown is not of any specific AR application, it is only meant to be a general representation of the underlying technology.
Many of the links in this article are for video demos. Rather than having a string of 30+ videos cluttering and breaking up the article, I’ve chosen to set up a separate video page. When you click a video link, it will open a second window, where you can view the related video as well as navigate to all of the other videos. If your monitor is large enough, I would even suggest leaving the second window open as you read through the article, so you can cue up each video when needed. To differentiate the video links from other links, each link to a video is followed by a “¤”. To open the window now, click here ¤.
While social media in general, and Facebook and Twitter specifically, have been monopolizing mainstream media’s coverage of online trends, augmented reality is getting a lot of inside-the-industry exposure, mostly for its undeniable wow factor. But that wow factor is a double edged sword, and advertising has a way of turning trends into fads, just before they move on to the next brand new thing. So for this article I wish to focus on practical applications and augmented reality with clear end user benefits. I’ve deliberately chosen not to address entertainment and gaming related executions as it is beyond the scope of this article and frankly merits dedicated attention all its own. And perhaps I’ll do just that in a future article.
Can it Save the Car? The automotive industry was an early adopter. Because of the manufacturing process, the CAD models already exist, and the technology is well suited to showing off an automobile from a god’s-eye view. Mini ¤ may have been first off the pole position, with Toyota ¤, Nissan ¤ and BMW ¤ tailgating close behind. Some implementation of AR will soon replace (or augment) the “car customizer” feature that is, in some form, standard on all automobile websites.
The kind of augmentation that is so applicable to automotive is also readily adaptable to many other forms of retail. Lego ¤ is experimenting with in-store kiosks that feature the assembled kit when the respective box is held before the camera. Because Legos are a “kit,” the technology is very applicable in-store; however, I find the eCommerce opportunities much more compelling. Ray-Ban ¤ has developed a “virtual mirror” that lets you try on virtual sunglasses from their website. Holition ¤ is marketing a similar implementation for jewelry and watches. HairArt is a virtual hairstyle simulator developed for FHI Heat ¤, maker of hair-care products and hairstyling tools. While demonstrating potential, some attempts are less successful ¤ than others (edit: I just learned of a better execution of an AR Dressing Room by Fraunhofer Institut). One of the most practical, useful implementations I’ve seen is for the US Post Office ¤— a flat rate shipping box simulator (best seen). These kinds of demonstration and customization applications will soon be pervasive in the eCommerce space and in retail environments. TOK&STOK ¤, a major Brazilian furniture retailer, is using in-store kiosks to view furniture arrangements, though I personally find theirs to be a poor implementation. A better method would be to use the same symbol tags to place the AR objects right into your home, via the camera connected to your PC. And that’s just what one student creative team has proposed as an IKEA ¤ entry for their Future Lions submission at this year’s Cannes Lions Advertising Festival. A quite sophisticated version of this same concept has also been developed by Seac02 ¤ of Italy.
To Tag, or not to Tag? A couple of years ago I wrote here about QR codes. A couple of weeks ago, while attending the Creativity and Technology Expo, I was given a private demo of Nokia’s Point & Find ¤. This is basically the same technology as QR codes, but it uses more advanced image recognition that doesn’t require the code. Candidly, I wasn’t terribly impressed. The interface is poor, and the implementation is so focused on selling to advertisers that they seemed oblivious to how people will actually want to use it, straitjacketing what could be a cool technology. Hopefully future versions will improve. Most implementations of augmented reality rely on one of two techniques— either a high degree of place-awareness, or some form of object recognition. Symbols similar to QR codes are most often used when the device is not place-aware, though some, like Nokia’s Point & Find, don’t require a symbol tag. Personally, even if the technology no longer requires it, I feel the symbol or tag-code is a better implementation when used for marketing. We are still far from a point where everything is tagged, so people won’t know to inspect an item if a tag-code is not present. Furthermore, the codes placed around on posters and printed material help build awareness for the technology itself. Everything covered here thus far has been recognition-based augmented reality.
Through the Looking Glass Location-aware augmented reality usually refers to some form of navigational tool. This is particularly noteworthy with new applications coming to market for smartphones. As BlackBerry hits back at the iPhone, Android’s list of licensees grows, and Palm brings a genuine contender back to the table with the new Palm Pre, there is huge momentum in the smartphone market that not even the recession can slow down. I personally think the name “smartphone” is misleading, as these devices are far beyond being a mere ‘phone’. Even a very smart one. They are full-on computers that, among many other features, happen to include a phone. In my prior article on augmented reality I focused on the iPhone’s addition of a magnetometer (digital compass). This gave the iPhone the final piece of spatial self-awareness needed to develop AR applications like those coming fast and furious to the Android platform. Think of it like this— the GPS makes the phone aware of its own longitudinal and latitudinal coordinates on the earth, the compass tells it which direction it is facing, and the accelerometer (digital level-meter) determines the phone’s angle from perpendicular to the ground (this is what lets the phone’s browser know whether to be in portrait or landscape mode). Through this combination of measures, the device can determine precisely where in the world it is looking. There is already a fierce race to market in this highly competitive space. Applications like Mobilizy’s Wikitude ¤ (Android) and Layar ¤ (Android), and other proofs of concept seeking funding, like Enkin ¤ (Android) and SekaiCamera ¤ (iPhone), are jockeying for the mindshare of early adopters. Others have developed proprietary AR navigational apps, such as IBM’s Seer ¤ (Android) for the 2009 Wimbledon games.
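For the technically inclined, the sensor combination described above reduces to simple trigonometry. The sketch below is not from any of the apps mentioned; it is a minimal, hypothetical illustration of how a compass heading and an accelerometer-derived pitch yield the direction the camera is looking, which an AR app would then intersect with map data.

```python
import math

def view_direction(heading_deg, pitch_deg):
    """Unit vector the camera is pointing along, in a local
    East-North-Up (ENU) frame.

    heading_deg: compass bearing (0 = north, 90 = east), from the magnetometer
    pitch_deg:   tilt above the horizon (0 = level), from the accelerometer
    """
    heading = math.radians(heading_deg)
    pitch = math.radians(pitch_deg)
    east = math.cos(pitch) * math.sin(heading)   # horizontal east component
    north = math.cos(pitch) * math.cos(heading)  # horizontal north component
    up = math.sin(pitch)                         # vertical component
    return (east, north, up)

# Facing due east, level with the horizon:
print(view_direction(90.0, 0.0))
```

Combined with the GPS fix for the device’s position, this vector is all a location-aware AR app needs to decide which points of interest fall inside the camera’s field of view.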
Two months ago, when Nine Inch Nails released their NIN Access ¤ iPhone app, there was no iPhone on the market with a built-in compass, so the capability for this level of augmentation was not yet available; but a look at the application’s “nearby” feature gives a hint of the kind of utility and community that could be built around a band or a brand using this kind of AR. View a demo of Loopt ¤, and only a little imagination is needed to see how social networking can be enhanced by place awareness; add person-specific augmentation tied to a profile, and the creepy stalker potential is brought to full fruition, depending on your perspective. There are also well-established players in the automotive navigation space with a high potential for crossover. The addition of a compass to the iPhone paved the way for an app version of TomTom ¤. Not to be outdone, Navigon ¤ has announced in a press release that they too have an iPhone app in development. How long before location-aware automotive navigation developers choose to enter the pedestrian navigation space?
Some Assembly Required It seems everyone wants some AR business from IKEA. Another spec project, by a student at the University of Singapore, proposes an assembly instruction manual for IKEA ¤ furniture. In a more sophisticated application along the same lines, BMW ¤ is experimenting with augmented reality automotive maintenance and repair technology. Note in that video that he is not doing this in front of his laptop camera, nor is he holding his smartphone in front of his face. He’s wearing special AR eyewear. The potential for hands-free instruction and tutorial is as obvious as it is unlimited. Consider any product you purchase that comes with instructions (you do read the instructions, right?). A municipal construction crew repairing a broken water pipe could effectively have X-ray vision, seeing where all the pipes are under the road, based on schematics supplied to their eyewear from city records.
Seeing is Believing When it comes to Virtual Reality, I’ve had a mantra that none of this will really take off until we’re in there versus looking at there. I believe augmented reality will be the catalyst that pushes digital eyewear into the marketplace. Virtual World applications are, by their nature, not location dependent. In many ways that’s the point— you can be anywhere. And sitting at your computer or game console and looking at a screen is a well-established all-purpose interface. Place-aware augmented reality, on the other hand, is location dependent— walking down the street holding your smartphone in front of your face is not a long-term solution. In just a couple of years, the Bluetooth earpiece ¤ has gone from marking you as the goofy guy walking down the street, apparently talking to himself, to a common everyday accessory, even a fashionable one. What works for your ears is now coming to your eyes— a hands-free visual interface in the form of eyewear. Some variation of this concept has been around for a long time ¤. Slow to improve, even most contemporary models are less fashionable than Geordi La Forge’s visor, but slowly they are improving.
The Vuzix Wrap 920AV (at left), a prototype of which premiered at the 2009 CES in Las Vegas, is the newest consumer-class digital eyewear marketed for augmented reality applications. WIRED Magazine’s Gadget Lab feels its most significant feature “comes from the fact that the company finally hired a designer aware of current aesthetic tastes.” Significant to the 920AV is that: A. it boasts ‘see-thru’ video lenses that readily lend themselves to augmented reality applications, and B. it is stereoscopic (meaning it has a separate video channel for each eye, required for 3D). The glasses are meant to hit the market in the fall and are being pushed as an iPhone-compatible device. If Vuzix is smart, they will do a bundled play with a “killer app” such as SekaiCamera or a similar product. They have the potential to be the ‘must have’ gift for the 2009 holiday season. Not to oversell them— I have not personally demoed them yet, so I don’t know if they will deliver on the hype— but they look as though they will be first to market, and their product will be the leading contender in the immediate future. Here is a demonstration of a prior Vuzix model ¤ (behold the fashion statement). Using symbol-tag based augmented reality, this man places a yacht in his living room.
If the quality of the user experience fails to live up to expectations, Vuzix has many pretenders to the crown. Fast followers like Lumus (at left) and others are trying to get products to market as well. Then there are MyVu, Carl Zeiss, i-O Display Systems and others who have video eyewear products and are likely candidates to come forward with AR offerings. Add to that a technical patent awarded to Apple last year for an AR eyewear solution of their own, and it is clear this could quickly become a crowded and competitive product category. This video, titled Future of Education ¤, while speculative, is a splendidly produced and rather accurate projection of where the technology is going.
Where to From Here? We’re moving in this direction at exponential speed; the pace of progress is only going to keep accelerating. As we see the convergence of augmented reality with mobile, and mobile with ear- and eyewear, there is another set of convergences just over the horizon. We’re on the threshold of realtime language translation ¤. This is an ingredient technology and, like a spell-checker, will soon be baked into all communications devices, the first of which will be our phones. The Nintendo Wii brought motion capture into our homes, and technologies like Microsoft’s Project Natal ¤ are converging motion capture with three-dimensional optical recognition, so no device is needed. And everything, both real and virtual, will soon be integrated into the semantic web. Intelligent agents will assist us with many tasks. While most of this intelligence will occur behind the curtain, as humans we like to personify our technology. It won’t be long before our personal digital assistant could be given the human touch. How human?
NOTE: In the references below, I’ve included a list of firms that have created some of the pieces shown here or the technologies used.
Apple iPhone Apps reports on new iPhone features, attributing credit to an anonymous leak from inside Apple. I would like to focus on one specific feature. They report, with skepticism:
-Revolutionary combination of the camera, GPS, compass, orientation sensor, and Google maps
The camera will work with the GPS, compass, orientation sensor and Google maps to identify what building or location you have taken a picture of. We at first had difficulties believing this ability. However, such a “feature” is technically possible. If the next generation iPhone were to contain a compass, then all of the components necessary to determine the actual plane in space for an image would be present. The GPS would be used to determine the physical location of the device. The compass would be used to determine the direction the camera was facing. And the orientation sensor would be used to determine the orientation of the camera relative to gravity. Additionally, the focal length and focus of the camera could even assist in determining the distance of any focused objects in the picture. In other words, not only would the device know where you are, but it could determine how you are tilting it, and hence it would know EXACTLY where in space your picture was composed. According to our source, Apple will use this information to introduce several groundbreaking features. For example, if you were to take a picture of the Staples Center in Los Angeles, you will be provided with a prompt directing you to information about the building, address, and/or area. This information will include sources such as Wikipedia. This seems like quite an amazing service, and a little hard to believe; however, while the complexity of such a service may seem unrealistic, it is actually feasible with the sensors onboard the next generation iPhone.
And why “unrealistic”? Every piece of this technology already exists in the wild. This is not a great technological leap. This is merely smart convergence.
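To see how little magic is involved, here is a deliberately simplified sketch of the "what am I pointed at?" step: given the GPS fix and the compass heading, compare the heading against the bearings to known landmarks. The landmark list, coordinates, and tolerance are all hypothetical stand-ins for what a real service would pull from a mapping database.

```python
import math

# Hypothetical landmark database: name -> (latitude, longitude).
LANDMARKS = {
    "Staples Center": (34.0430, -118.2673),
    "Griffith Observatory": (34.1184, -118.3004),
}

def bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360

def identify(device_lat, device_lon, compass_deg, tolerance=10.0):
    """Return the landmark whose bearing best matches the compass
    heading, or None if nothing falls within the tolerance."""
    best, best_err = None, tolerance
    for name, (lat, lon) in LANDMARKS.items():
        # Smallest signed angular difference, wrapped to [-180, 180).
        err = abs((bearing(device_lat, device_lon, lat, lon)
                   - compass_deg + 180) % 360 - 180)
        if err < best_err:
            best, best_err = name, err
    return best

# Standing just north of the Staples Center, camera pointed due south:
print(identify(34.0500, -118.2673, 180.0))
```

A production service would of course also weigh distance, the accelerometer’s tilt reading, and computer vision on the image itself, but the core lookup is exactly this kind of geometry.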
There are already two applications on the Google Android platform that have these features. One is a proof of concept called Enkin, developed by Max Braun and Rafael Spring (students of Computational Visualistics from Koblenz, Germany, currently doing robotics research at Osaka University in Japan). The second, Wikitude by Mobilizy (an Austrian company founded by Philip Breuss-Schneeweis and Martin Lechner), is already in full-blown commercial release.
WIKITUDE DEMONSTRATION:
ENKIN, PROOF-OF-CONCEPT:
It is only one short step further to let users geo-tag their photos. Many social photo/map applications available for the iPhone already incorporate such a feature. Building this into the realtime viewfinder would not be a great challenge. For example, the proof of concept for this already exists in the form of Microsoft’s Photosynth (Silverlight browser plugin required).
Social Media apps could tap into this utility to network members in real space. At the most basic level, Facebook and/or LinkedIn apps could overlay members with their names and profile information.
The next logical extension of this will be to place the information directly into your field of vision.
The OOH marketing opportunities are immense. Recent campaigns for General Electric in the US, and the Mini Cooper in Germany, show where this is going. Suddenly the work done by Wayne Piekarski at the University of South Australia’s Wearable Computer Lab is no longer so SciFi (now being commercialized as WorldViz). At January’s CES, Vuzix debuted their new 920AV model of eyewear, which includes an optional stereoscopic camera attachment to combine virtual objects with your real environment. Originally scheduled for a spring release, their ship date has now been pushed back to fall (their main competitor, MyVu, does not yet have an augmented reality model). If the trend finally takes, expect to see more partnerships with eyewear manufacturers.
Initially through the viewfinder of your smartphone, and eventually through the lens of your eyewear, augmentation will be the point of convergence for mobile web, local search, social media, and geo-targeted marketing. Whether Apple makes the full leap in one gesture with the release of their next-gen iPhone, or gets there in smaller steps, depends upon both the authenticity/accuracy of this leak and the further initiative of third-party software and hardware developers to take advantage of it. Innovation and convergence will be the economic drivers that reboot our economy.
EDIT: The only capability Apple actually needs to add to the iPhone in order for this proposed augmented reality to be implemented is a magnetometer (digital compass). Google Android models already have this component. Charlie Sorrel of WIRED Magazine’s Gadget Lab has separately reported this feature through leaks of a developer screen shot, and on May 22nd Brian X. Chen, also reporting for WIRED Magazine’s Gadget Lab, put the probability of a magnetometer being included in the new iPhone at 90%. Once the iPhone has an onboard compass, augmented reality features will begin to appear, whether through Apple’s own implementation or from third party developers.
UPDATE: Since the time of this writing, the iPhone 3GS has been released, and it does indeed include a magnetometer.