Desktop vs traditional supercomputers

It is just amazing how much faster (and cheaper) computers are today compared with just a few years ago. Moore's law, the doubling of processor complexity (and hence, roughly, speed) every 24 months, is still going strong, even though the focus today is no longer on faster clock speeds for single chips but on multi-processor architectures. Microchip manufacturers are racing to be the first to release chips with ever more cores. Today one can buy single-, dual- and quad-core CPUs for desktop computers.

Researchers at the University of Maryland's A. James Clark School of Engineering unveiled this week a prototype desktop supercomputer that consists of a number of chips on a single board. The system is capable of speeds 100 times faster than current desktop systems, thanks both to its large number of processors and to a software architecture that lets those processors share the workload. The researchers claim that their computer is easy to develop software for; that matters because software development for multi-processor systems requires a different way of thinking, and many argue that the tools that would make the task easy are not there yet. The Maryland prototype uses 64 processors mounted on a board the size of a license plate. The team needs help naming the new computer and is soliciting suggestions from the public; you can help here.

In case you are wondering how this 64-processor computer stacks up against IBM's supercomputers, let me point out that this past week Big Blue also announced the new Blue Gene/P supercomputer, which can achieve 3 petaflops. IBM's supercomputer can be configured with anywhere from 4,096 to 884,736 processors; the company will soon install a 294,912-processor system for use by the U.S. Department of Energy.
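As a rough sanity check on those numbers, here is a small sketch that estimates the performance of a given Blue Gene/P configuration. It assumes that the quoted 3 petaflops belongs to the largest configuration and that peak performance scales linearly with processor count; both are my assumptions, not IBM's figures, and real machines never scale perfectly.

```python
# Rough estimate only. Assumptions (not from IBM): the 3-petaflop figure is for
# the largest configuration, and peak performance scales linearly with processors.
MAX_PROCESSORS = 884_736   # largest configuration quoted in the post
MAX_PETAFLOPS = 3.0        # performance quoted in the post


def estimated_petaflops(processors: int) -> float:
    """Linear-scaling estimate of peak performance for a configuration."""
    return MAX_PETAFLOPS * processors / MAX_PROCESSORS


if __name__ == "__main__":
    for n in (4_096, 294_912, 884_736):
        print(f"{n:>7,} processors: about {estimated_petaflops(n):.3f} petaflops")
```

Under these assumptions the 294,912-processor system for the Department of Energy comes out at about 1 petaflop, since it has exactly one third of the processors of the largest configuration.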

One wonders what use the average PC user could have for a desktop supercomputer. I can understand the need for supercomputers in scientific applications, but do people really need a computer with 64 or more processors or cores to read their email and play video games? Then again, we might finally develop home computers that are capable of artificial intelligence, including natural language understanding and face and gesture recognition.

iRobot partners with TASER International for new assault robot

I guess it was due to happen sooner or later. The well-known consumer and military robot manufacturer iRobot announced today that it will be joining forces with TASER International to deliver a Packbot equipped with the TASER X26 gun.

The TASER X26 uses a replaceable cartridge containing compressed nitrogen to deploy two small probes that are attached to the TASER X26 by insulated conductive wires with a maximum length of 35 feet (10.6 meters). The TASER X26 transmits electrical pulses along the wires and into the body affecting the sensory and motor functions of the peripheral nervous system. The energy can penetrate up to two cumulative inches of clothing, or one inch per probe.

Until now, iRobot has supplied remote-controlled robots to the military and law enforcement agencies for passive uses such as the surveillance of hazardous areas and the handling of Improvised Explosive Devices (IEDs); if you have watched a bomb squad remotely detonating a suspicious package, chances are they were using an iRobot Packbot.

As of today, the Packbot will be going on the offensive since from now on it can also be used to subdue a suspect or enemy combatant using TASER technology.

The two companies have said that they have already developed a prototype robot, the Packbot Explorer equipped with the TASER X26; they plan to unveil it to the public in two weeks' time during the TASER Tactical Conference, July 9-10, at the Westin O'Hare in Chicago.

Perps everywhere take notice!

Photos are copyright TASER International and iRobot.

Miniature medical robot developed in Israel

The Jerusalem Post is reporting on a newly developed miniature medical robot that is a huge leap forward in this field. The 1-millimeter robot is the result of a collaboration between Dr. Nir Schwalb of the Judea and Samaria College in Ariel and Oded Solomon of the mechanical engineering department of the Technion-Israel Institute of Technology. The tiny robot is designed to enter a patient's bloodstream and deliver medical treatment as necessary.

It is too early to know what medical uses the robot will have, but the researchers suggest it could be involved in brachytherapy, in which cancer patients are exposed to short-distance radiotherapy from a source placed inside or next to the area requiring treatment. Brachytherapy is commonly used to treat localized prostate cancer and cancers of the head and neck. In addition, numerous robots could be used simultaneously to deal with a large number of metastases (malignant tumors spread through the body).

One interesting aspect of this tiny remote-controlled robot is the way it is powered. Instead of receiving power from an onboard battery, the robot moves using an external magnetic field that does not harm the patient. As a result, the robot can operate for an unlimited amount of time before it must be removed, which makes it suitable for treatments over long periods of time.

I really like the idea behind the development of such medical robots. I can imagine (or at least wish for) a future where tiny robots enter our bodies and fight viruses or destroy cancer cells. Even better, given the proper materials, such robots could be used to reconstruct damaged tissue; imagine the ability to perform surgery without the need to cut people open!

Obviously the robot is still in the development stage but it is a large step forward. Scientists still have to equip the robot with sensors and actuators that it can use to perform the medical procedures they envision for it; and making it a bit smaller probably wouldn't hurt either.

Illustration of the miniature robot is probably copyright the Jerusalem Post.

Spatio-temporal reconstruction from images

Structure from motion (SFM) is the computer vision problem of extracting the 3D structure of a scene from a series of images taken from a variety of camera positions. One of the nicest implementations of SFM is Microsoft's Photosynth. But SFM applications assume that all images are from the same time-frame. What happens if you have collections of images of a city's downtown from different eras covering a span of many decades? How do you visualize the data in a coherent manner?

Researchers at Georgia Tech led by Frank Dellaert are working on adding a time dimension to 3D models of cities, creating what they call 4D cities:

There is a growing need for novel ways to access the exponentially growing archives of historical imagery. It is imperative to go beyond cataloging, indexing, and keyword driven databases, to a paradigm where the computer at least partially understands the content of images. Pushing the state of the art in scene understanding and 3D modeling will enable radical new ways to view and experience historical and/or temporally varying imagery. The research described here aims at building time-varying 3D models that can serve to pull together large collections of images pertaining to the appearance, evolution, and events surrounding one place or artifact over time, as exemplified by the 4D Cities project: the completely automatic construction of a 4D database showing the evolution over time of a single city.

4D cities have applications in many areas including virtual tourism, historic preservation, urban planning and public education. The researchers hope to take Google Earth to the next level in what they like to call 4D Earth.

The team has done an incredible job building a 4D model of the city of Atlanta by fusing old photographs from the Atlanta History Center, photographs from the 1996 Olympics and more recent ones obtained using a specially equipped pickup truck. The photographs span a time-frame of over 100 years, from 1897 to 2006! You can watch a 4D fly-through of the reconstructed city in this video.

Free voice recognition software that works

If you are considering adding voice recognition to your AI or robotics project, then you are probably looking for a free implementation that works well. Your other option would be to purchase a copy of what is currently considered the state of the art in speech recognition, Nuance's Dragon NaturallySpeaking suite. However, before spending any money, you can also try your luck with the open source Sphinx software.

Sphinx is speech recognition software that was developed at Carnegie Mellon University under DARPA funding. A few years ago, CMU released the software for free under a BSD license. The most recent version is currently available on SourceForge here.

Sphinx is a very well designed voice recognition suite that comes in many flavors. There is a non-real-time version for batch processing of voice data and a real-time version for live systems such as telephone spoken dialog systems and robots. As expected, the real-time version is not as accurate, but this is the price you pay for the faster performance. There is also PocketSphinx, a version of the software designed for handheld computers. All versions use Hidden Markov Models, like any modern speech recognition system.
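To give a feel for what "using Hidden Markov Models" means, here is a minimal sketch of Viterbi decoding, the standard way of finding the most likely sequence of hidden states given a sequence of observations. This is not Sphinx code; the two states, the two observation symbols and all the probabilities are made-up toy values chosen only for illustration.

```python
# Toy HMM: all states, symbols and probabilities below are hypothetical.
STATES = ("silence", "speech")
START_P = {"silence": 0.8, "speech": 0.2}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT_P = {
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.2, "loud": 0.8},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


if __name__ == "__main__":
    frames = ["quiet", "loud", "loud", "quiet"]
    print(viterbi(frames, STATES, START_P, TRANS_P, EMIT_P))
```

A real recognizer works with thousands of states (sub-phoneme units), continuous acoustic features and log probabilities to avoid underflow, but the dynamic programming idea is the same.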

I recently used the Java version of Sphinx with a limited vocabulary for a command-and-control type interface for a robotics system. In the past, I have also used IBM's ViaVoice software. Comparing the two, I have to admit that Sphinx performs equally well and doesn't cost a penny, so I have switched completely from ViaVoice to Sphinx.

If you are looking for free voice recognition software that works then Sphinx is your best choice.

Optical topography brain-machine interface

CBC reports on Hitachi's brain-machine interface technology, which has been demonstrated allowing people to control a toy train by doing simple mathematical calculations such as adding two numbers. This is a passive brain-machine interface: the person wears a specially designed cap that records blood flow in the brain. Different mental activities induce different blood flow patterns, which can then be read and interpreted as commands to a computer. The device uses a method called optical topography, which differs from the more popular Electroencephalography (EEG) and Magnetoencephalography (MEG) methods. EEG measures electrical activity in the brain and MEG measures the magnetic fields produced by that electrical activity.

This type of technology has many applications other than entertainment (think of a Wii with a brain-machine interface instead of a remote control). Most importantly, it could be used by amputees to control prosthetic limbs, or by people suffering from Lou Gehrig's disease (amyotrophic lateral sclerosis), such as Stephen Hawking, or from other motor neuron diseases, to control a computer or an assistive device such as a wheelchair. Of course, before any of this happens, the devices have to become more accurate, smaller and much more affordable.

Photo copyright Shizuo Kambayashi/Associated Press.

Robot pet Pleo available for pre-order

Last week, those of us on UGOBE's mailing list received an email asking us to keep checking our spam folders because the company was going to make a huge announcement the following week and wanted to make sure we didn't miss out. Very thoughtful of them. It was clear at the time that the big announcement would have to do with the much delayed release of their robot pet Pleo.

And yes, this very morning, I and everybody else on their mailing list received the news that Pleo is finally available for pre-order. Excellent news for the much anticipated robot toy that is advertised as groundbreaking and life changing. If you want to get one, then you have to hurry because UGOBE is only making available 2,000 of the limited edition First Hatch Pleos until June 27th. In addition to making the robot available for sale, UGOBE has also launched a new website for Pleo owners at PleoWorld.com.

Unfortunately, the announcement is not all good news. For one, the robot is only available in the U.S. Second, at $349 it is a bit more expensive than originally expected. UGOBE says that the higher price is the result of “the higher quality components and features such as a replaceable battery.” The last bit of bad news is that even though one can pre-order Pleo today, the robot will not actually ship until October 15th, 2007! So you still have to wait a bit longer, although the October shipping date makes Pleo suitable as a Christmas present if you want to surprise someone a couple of months later.

Finally, I want to talk about Pleo's battery. Apparently, it is good for only 1 hour and takes 4 hours to recharge. This seems bad to me because I can't imagine a kid who would play with Pleo for 1 hour and then have the patience to wait 4 hours for the battery to recharge. Since the battery is replaceable, chances are that one would have to buy a couple (or more) to swap in and out, which would add to the robot's cost. At the moment, however, there is no information on UGOBE's website about buying a second battery. I hope that it will be possible to buy one before they start shipping the robot in October.

That said, the email I received included the following one-time use First Hatch coupon which I can't use since I live in Canada. So, I make it available here to the first reader who wants to pre-order this robot pet: 5334982759.

You can buy Pleo here.


Images in this post are copyright UGOBE.

NASA releases CLARAty free robot programming software

NASA has released a lite version of their Coupled Layer Architecture for Robotic Autonomy (CLARAty) framework for robot software development. The software is a collaborative effort among a number of institutions including JPL, Ames Research Center, Carnegie Mellon, and the University of Minnesota. The complete software suite includes a large number of software modules for robot programming but at the moment NASA is only releasing a subset of that functionality to the public.

Primary functionality in these modules includes math infrastructure, rotation matrices with Euler angles, quaternions, and coordinate transformations (interoperable homogeneous and quaternion transforms). It also includes the coordinate frame infrastructure that connects transformations and mechanisms with moving parts. Additionally, you will find mechanism models for wheeled, legged and hybrid vehicles. Other modules include device and device group infrastructure with support for generic digital and analog I/O, cameras, and motors. Several modules in this release provide vision infrastructure for images, color images, camera models, 3D point cloud, and surface normal image representations.
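For readers who have not worked with these representations, here is a small sketch of what "interoperable homogeneous and quaternion transforms" boils down to: a unit quaternion is converted into a 3x3 rotation matrix, which is then combined with a translation into a 4x4 homogeneous transform. This is plain Python written for illustration; the function names are my own and have nothing to do with CLARAty's actual API.

```python
import math

# Illustrative only: these helper names are hypothetical, not CLARAty's API.


def quaternion_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), x * s, y * s, z * s)


def rotation_matrix(q):
    """3x3 rotation matrix for a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def homogeneous_transform(q, translation):
    """4x4 homogeneous transform: rotate by q, then translate."""
    r = rotation_matrix(q)
    tx, ty, tz = translation
    return [r[0] + [tx], r[1] + [ty], r[2] + [tz], [0.0, 0.0, 0.0, 1.0]]


def apply_transform(transform, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    p = list(point) + [1.0]
    return tuple(sum(transform[i][j] * p[j] for j in range(4)) for i in range(3))


if __name__ == "__main__":
    # Rotate 90 degrees about the z axis, then shift one unit along x.
    q = quaternion_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
    t = homogeneous_transform(q, (1.0, 0.0, 0.0))
    print(apply_transform(t, (1.0, 0.0, 0.0)))
```

In the example, the point (1, 0, 0) is rotated onto the y axis and then shifted, so it ends up at approximately (1, 1, 0).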

CLARAty is released under the JPL Open Source License, which is a bit different from the well-known open source licenses that most of us are familiar with. JPL's license gives developers the right to create applications for non-commercial use only. This should not be a problem for most people who might want to use CLARAty, since they will be either academics or amateur roboticists.

NASA is releasing the software at a time when there is much competition among robot programming frameworks, including the popular open source Player/Stage, Microsoft's Robotics Studio and Evolution Robotics' ERSP. Obviously all these players realize that robotics is going to be a big thing sooner or later (that is what the futurists tell us every day, after all) and they want to be the ones that develop the platform that will run all of these robots. The problem is that we are getting fragmentation, which for the time being will not help us move forward fast. For example, CLARAty has some nice components for visual tracking, path planning and 6DOF pose estimation. Someone who already has a project in progress on the Player/Stage platform would like to be able to use these same components. It doesn't help that the code must be ported from one platform to the other, which might or might not be a trivial job.

It will be interesting to watch how the market decides on a framework for the upcoming robot age.
Images are copyright NASA.

KUKA's heavy weight robot arm

Last week I talked about Barrett Technology's state-of-the-art cable-driven Whole-arm Manipulator (WAM) and its advantages over the more traditional gear-driven robots. Around the same time, KUKA Robotics introduced the KR 1000 titan 6-axis robot with a payload of 1000 kg, making it the strongest robot arm available for industrial use. The robot is powerful enough to do the work of two current model robots.

The KUKA KR 1000 titan 6-axis robot recently earned its place as the world's strongest robot in the Guinness Book of Records. It has a total of nine motors, which together deliver the power of a mid-sized car. It features a robust steel base frame and a new drive concept. In axes 1 and 3, two motors feed into a single gear unit. Axis 2 is powered by two motors, each with its own gear unit. The KUKA titan can withstand a static torque of 60,000 newton meters (Nm).

Industrial robots have replaced human workers on assembly lines in a number of industries including automotive, pharmaceutical, electronics, food and consumer goods. KUKA also sells a number of much smaller 4-axis robots such as the KR 3 and KR 5, with payloads of 3 and 5 kg respectively. What differentiates industrial from consumer robots is their high accuracy and repeatability of motion. This is necessary because these robot arms must repeat the same motion several hundred if not thousands of times every day for many days without the need for re-calibration. For example, the KR 5 has a positioning accuracy of under 0.02 mm; that's pretty damn accurate!

CMU Tartan Racing is the 2nd team to qualify for the Urban Challenge

Carnegie Mellon's Tartan Racing team successfully passed the qualifying test for the upcoming Urban Challenge. They join Stanford University's team, which qualified last week. The CMU team, led by William "Red" Whittaker, has outfitted a Chevy Tahoe, nicknamed Boss, with all the necessary sensors and smarts to obey the most basic rules of the road, including stopping at an intersection, passing a stopped vehicle and executing a 3-point turn. CMU finished second to Stanford during the Grand Challenge even though they led the way for much of the course; a mechanical failure caused their vehicle to lose ground and eventually be overtaken by Stanford's Stanley. Notably, CMU did much better in this qualifying test than Stanford, which needed a second try to pass the stopped car. The competition for first place is going to be stiff. Besides Whittaker, CMU has a large number of top people on the team, including Martial Hebert, Reid Simmons and Sanjiv Singh.

There are now 28 spots left for 51 teams that will be given a chance to qualify for the final event to take place in early November.

Mitsubishi robot receptionist available for rent

Japan is continuing its push to create robot workers to deal with labor shortages due to an aging population. A couple of days ago, Mitsubishi Heavy Industries Ltd. announced that their mobile robot Wakamaru will now be available to rent for places that require a receptionist "in need of a humanoid touch." The mobile robot has an android upper body with two arms and an expressive head, but it moves using a wheeled base. It can understand about 10,000 words, recognize faces and track people while avoiding obstacles in its path. Interestingly, Wakamaru localizes itself using a map of the ceiling and an upward-looking camera, just like CMU's museum tour guide Minerva, which in the past I have referred to as the tipping point for robotics.

Mitsubishi made Wakamaru available for sale a couple of years ago for a mere $13,000; I don't know how many robots they were able to sell. According to the recent announcement, one can now rent the robot receptionist for $1,000 a day for up to five days. It seems that renting is more expensive than actually buying the robot but if one considers that the rental price comes with a team of engineers who will set it up and fix it if something goes wrong, then the price makes sense.

There is definitely a trend in Japan towards renting service robots. Even Honda's much celebrated humanoid robot ASIMO is available for rent for more than $160,000 a year (2002 estimate). A quick calculation shows that ASIMO would cost about $438 a day, or less than half as much as Wakamaru. This is probably a result of the longer lease contract, as even Mitsubishi will give huge discounts to those who rent the robot for longer periods of time.
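For those who like to see the arithmetic, here is the quick calculation spelled out. It uses only the prices quoted in this post and assumes a 365-day year with ASIMO rented at exactly $160,000 a year.

```python
# Prices quoted in the post (USD). The 365-day year is an assumption.
ASIMO_RENT_PER_YEAR = 160_000      # 2002 estimate
WAKAMARU_RENT_PER_DAY = 1_000
WAKAMARU_PURCHASE_PRICE = 13_000


def per_day(yearly_cost, days_per_year=365):
    """Average daily cost of a yearly lease."""
    return yearly_cost / days_per_year


def break_even_days(purchase_price, rent_per_day):
    """Number of rental days that add up to the purchase price."""
    return purchase_price / rent_per_day


if __name__ == "__main__":
    asimo_daily = per_day(ASIMO_RENT_PER_YEAR)
    print(f"ASIMO: about ${asimo_daily:.0f} a day")
    print(f"ASIMO / Wakamaru daily cost: {asimo_daily / WAKAMARU_RENT_PER_DAY:.2f}")
    days = break_even_days(WAKAMARU_PURCHASE_PRICE, WAKAMARU_RENT_PER_DAY)
    print(f"Renting Wakamaru matches its purchase price after {days:.0f} days")
```

The last figure puts the rental price in perspective: at $1,000 a day, thirteen days of rental would add up to the $13,000 purchase price, well beyond the five-day rental limit.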


What does Stanford's robot car Junior see?

Stanford researchers were the winners of the Defense Advanced Research Projects Agency's (DARPA's) Grand Challenge not long ago. They are now preparing their new vehicle for the upcoming Urban Challenge, which we have talked about in previous posts. Currently, DARPA is visiting qualified teams and testing their vehicles during a second qualifying round, as it selects only the best teams for the final event to take place later this year; there are currently 53 teams still in the competition. DARPA visited Stanford's team last week and tested their autonomous VW passenger car, which qualified as expected; Stanford Racing Team's robotic car passed 3 of the 4 tests given during a 2.5-hour course in a parking lot near Google headquarters in California. You can watch a video of the robot trial at the San Francisco Chronicle website here.

One interesting aspect of the Stanford car is that, in addition to several laser sensors used to judge the distance to other vehicles on the road, it also has a spherical vision camera mounted on the roof. The camera is a Point Grey Research Ladybug 2. As you can see from the image to the right, the Ladybug consists of 6 CCD cameras in a small package capable of simultaneously capturing images that cover 75% of the visible sphere. The camera connects to a computer using an IEEE-1394b interface, allowing the transfer of data to the computer at 30fps. In case you are wondering what spherical vision images look like after the individual images (from each camera) are stitched together, watch the following video:


The video is provided by Point Grey Research and can be found on a demo DVD that can be requested from the company via its sales team.

So what does Stanford's Junior use the data from the Ladybug for? The Stanford team may be using the visual data for mapping and localization when GPS fails. Or the visual data may be used to detect and recognize road signs and traffic lights. The latter is one component of the Urban Challenge, and it would be very hard if not impossible to solve using a time-of-flight sensor. It would also be a bit more expensive to construct a specialized sensor just for detecting traffic lights and road signs.

I am really looking forward to the Urban Challenge competition this coming November.

Big Brother is watching you from behind that billboard

Canadian startup Xuuk Inc. has developed a new camera that can measure the gaze of a random person as far as 10 meters away. The eyebox2 camera operates by taking advantage of the red-eye effect of flash photography. Using an array of infrared LEDs and a 1.3 megapixel camera, the device captures two frames that, when subtracted, yield an image showing only the person's pupils. I found on Wikipedia the following more technical description of how this type of eye tracker works. (Note: I heard about this type of eye tracker a few years ago during an HCI lecture, but I never looked carefully at how they work, so take the following description with a Wikipedia grain of salt in terms of accuracy.)

Most modern eye-trackers use contrast to locate the center of the pupil and use infrared and near-infrared non-collimated light to create a corneal reflection (CR). The vector between these two features can be used to compute gaze intersection with a surface after a simple calibration for an individual.

Two general types of eye tracking techniques are used: Bright Pupil and Dark Pupil. Their difference is based on the location of the illumination source with respect to the optics. If the illumination is coaxial with the optical path, then the eye acts as a retroreflector as the light reflects off the retina creating a bright pupil effect similar to red eye. If the illumination source is offset from the optical path, then the pupil appears dark.
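To make the two-frame trick concrete, here is a minimal sketch of the idea: subtract a frame lit off-axis (dark pupil) from a frame lit on-axis (bright pupil), threshold the difference, and take the centroid of what is left. This is my own toy illustration on tiny hand-made grayscale images, not Xuuk's algorithm, and the threshold value is arbitrary.

```python
# Toy illustration of bright-pupil / dark-pupil differencing.
# Images are lists of rows of grayscale values (0-255); all data is made up.


def difference_image(bright, dark):
    """Per-pixel bright minus dark, clamped at zero."""
    return [
        [max(b - d, 0) for b, d in zip(row_b, row_d)]
        for row_b, row_d in zip(bright, dark)
    ]


def pupil_centroid(diff, threshold):
    """(row, column) centroid of pixels at or above the threshold, or None."""
    hits = [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not hits:
        return None
    rows = sum(r for r, _ in hits) / len(hits)
    cols = sum(c for _, c in hits) / len(hits)
    return (rows, cols)


def make_frame(pupil_value, size=5, background=50):
    """Uniform background with a two-pixel 'pupil' at (2, 2) and (2, 3)."""
    frame = [[background] * size for _ in range(size)]
    frame[2][2] = pupil_value
    frame[2][3] = pupil_value
    return frame


if __name__ == "__main__":
    bright_frame = make_frame(pupil_value=200)  # on-axis LEDs: pupil glows
    dark_frame = make_frame(pupil_value=20)     # off-axis LEDs: pupil is dark
    diff = difference_image(bright_frame, dark_frame)
    print(pupil_centroid(diff, threshold=100))
```

Because the background looks the same under both illuminations, it cancels out in the subtraction and only the pupil pixels survive the threshold. A real system would also have to cope with motion between the two frames, noise and multiple faces.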

The company wants to sell the technology to advertisers who might want to measure the effectiveness of their ads.
Use it to track who's looking at your screen ads in the mall. Obtain detailed statistics on viewing behavior over time on any number of persons within view of a plasma display. Detect when a customer is looking at your product display. See what movie poster your audience is interested in. Or control your home theater or first person shooter games with your eyes.

I will offer another possible application for the camera: a computer interface for people with disabilities, for example those suffering from Lou Gehrig's disease, such as Stephen Hawking. Advertising using plasma displays has been gaining momentum in North America and Europe in what is known as “ambient” advertising.

The company is currently testing their eye tracking system with a 107cm plasma screen that has been outfitted with an eyebox2 sensor and positioned in front of a Tim Hortons restaurant on the Queen's University campus. If the test is successful, the company hopes to get the attention of advertisers who just can't get enough statistics about consumers. The company also claims that because the eyebox2 has a small footprint, it could easily be found tracking shoppers' gazes in supermarket aisles. If you are concerned about your privacy, Xuuk says that no personal data is stored after it has been processed for the eye tracking, but can anyone really guarantee that at some point in the future this technology will not be misused?

The eyebox2 eye tracker sells for just under $1,000 as a special offer until July 1st, 2007. Xuuk is a spin-off company out of the Human Media Laboratory at Queen's University in Ontario, Canada.

The Barrett Technology WAM robot arm is the most advanced of its kind

When it comes to robot arms, Barrett Technology Inc's Whole-Arm Manipulator (WAM) is way ahead of the competition. It was no accident when the Guinness Book of World Records named it the most advanced robotic arm in their special Millennium edition.

Barrett's WAM is a backdrivable, cable-driven robot arm that comes in two configurations: one with 4 degrees of freedom and one with 7. Cable-driven means that this robot arm does not use any gears for manipulating the joints; the gears are replaced by a cable drive, eliminating any backlash problems while allowing for improved speed and stiffness. In addition, the WAM is

the only arm sold in the world with direct-drive capability supported by transparent dynamics between the motors and joints, so its joint-torque control is unmatched. It is built to outperform today's conventional robots by offering extraordinary dexterity, fast dynamics, high bandwidth, zero backlash, and near-zero friction. It is the only practical arm ever built to be inherently joint-torque controllable.

In other words, this incredible marvel of engineering is far more dexterous, speedy and easy to program than traditional robot arms. Pairing the arm's backdrivability with the “Teach and Play” feature of the WAM software, a user can manually move the arm through any trajectory and then play it back with full control over several parameters, including the arm's speed and acceleration. The arm's weight ranges from 25.4 to 27.2 kg and its payload varies from 3 to 4.5 kg depending on the configuration. It has a much smaller footprint than a traditional robot arm and it looks much slicker if you ask me.

Unfortunately, I have heard that at the moment the WAM is rather pricey, so it is mostly targeted for use in research labs and industrial applications. The technology, however, is so good that I would not be surprised if in just a few years the Barrett Technology WAM becomes the standard robot arm for every robot.

Photos of WAM are copyright Barrett Technology Inc.

Braintech releases Volts-IQ visual tracking SDK for robotics applications

Braintech, a Canadian robotics company specializing in visually guided robots, has released a Community Technical Preview (CTP) version of their Volts-IQ Robot Vision software that integrates with the Microsoft Robotics Studio. For the time being, Volts-IQ is capable of performing real-time tracking of a specified object; the system learns an appearance model for the object from a collection of pixels designated by the user.

For the current CTP, the company focused on providing one of the most essential visual capabilities every robot needs, which is the ability to recognize and track an object or distinct pattern in its habitat. The Vi_Tracker™ service which is part of the CTP can be easily trained on various targets by simply drawing a box around the pattern or target of interest in the image. After training, the algorithm begins to continuously recognize and follow the target as it moves around in the image. Examples of objects to recognize and track may include household objects such as a television remote, a book or a beverage container as well as artificial targets marking the placement of points of interest such as the robot's charging station. So long as the target has some distinct texture/appearance, the Vi_Tracker service can recognize and localize it within the image, providing its image coordinates and rotation angle.

But talk is cheap, so here is Braintech's video showcasing their visual tracker:

It is not clear to me from the video what algorithms are used for the tracking, but considering the robust performance of the tracker under occlusion, I would guess that it is CONDENSATION-based, using color or possibly simple point features such as Harris corners. There is not much documentation on the site yet that explains how to optimize the tracker's performance, but hopefully this will follow soon.
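For readers unfamiliar with CONDENSATION, here is a minimal sketch of the select, predict, measure loop at its core, tracking a single coordinate of a target that drifts to the right. To be clear, this is only an illustration of the general technique and has nothing to do with Braintech's actual implementation, which I have not seen; the motion model, noise levels and particle count are all assumptions I picked for the example.

```python
import math
import random

# One-dimensional CONDENSATION-style tracker. All parameters are assumed
# values chosen for illustration, not taken from any real product.


def condensation_step(particles, weights, measurement, rng,
                      drift=1.0, diffusion=1.0, sensor_sigma=2.0):
    """One select / predict / measure iteration."""
    n = len(particles)
    # 1. Select: resample particles in proportion to their current weights.
    selected = rng.choices(particles, weights=weights, k=n)
    # 2. Predict: apply deterministic drift plus random diffusion.
    predicted = [x + drift + rng.gauss(0.0, diffusion) for x in selected]
    # 3. Measure: weight each particle by how well it explains the observation.
    new_weights = [
        math.exp(-0.5 * ((measurement - x) / sensor_sigma) ** 2) for x in predicted
    ]
    total = sum(new_weights)
    if total == 0.0:
        new_weights = [1.0 / n] * n  # no particle explains the data; start over
    else:
        new_weights = [w / total for w in new_weights]
    return predicted, new_weights


def estimate(particles, weights):
    """Weighted mean of the particle set."""
    return sum(x * w for x, w in zip(particles, weights))


def track(measurements, n_particles=500, seed=0):
    """Track a target along one axis; returns one estimate per measurement."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 100.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for z in measurements:
        particles, weights = condensation_step(particles, weights, z, rng)
        estimates.append(estimate(particles, weights))
    return estimates


if __name__ == "__main__":
    observed = [10.0 + t for t in range(20)]  # target moves one pixel per frame
    for z, x in zip(observed, track(observed)):
        print(f"observed {z:5.1f}  estimated {x:5.1f}")
```

The reason this family of trackers copes well with occlusion is that the particle set keeps several hypotheses alive at once: when the measurement is poor for a few frames, the particles simply spread out according to the motion model and lock back on once the target reappears.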

The company expects to receive lots of feedback from end users about the product and intends to listen to it, so fire up your computer and start hacking! Braintech will soon be releasing a number of tutorials demonstrating the capabilities of the tracker. The company also plans to continue enhancing their software suite with additional services; one such service is the soon-to-be-released optimized Vi_Webcam™ service.

The Volts-IQ product looks interesting and I am looking forward to its future development.

Read Braintech's Press Release about Volts-IQ.

Intelligent machine to play poker against humans during the AAAI 2007 conference

CBC is reporting that artificial intelligence researchers from the University of Alberta, Canada, led by professor Jonathan Schaeffer, have challenged two of the best professional poker players to compete against their artificial intelligence software, Polaris. The human players are Phil Laak and Ali Eslami, and they will be competing against a pair of intelligent programs for 2,000 hands of Texas hold 'em. The winner of this battle between man and machine will take home a $50,000 prize and, of course, bragging rights.

The competition will take place in late July during the annual AAAI conference to be held in Vancouver, Canada. It is part of the second annual poker competition which Schaeffer's team easily won last year; the poker competition is between programs developed by researchers from around the world. You can get more information at the event's official website here.

For the man vs machine challenge, the Canadian team's Polaris will be utilizing different programs each using a different strategy.

One is very aggressive, but doesn't take into account the playing style of opponents. Another program assesses the strengths and weaknesses of other players and adjusts its style accordingly.

The team hopes that by studying the opponent's game including his rate of bluffing, the computer will be able to switch among the programs to maximize its returns. It would be interesting to find out if this strategy will actually pay off for the AI team; my guess is that it will but only if the AI's models are correct. The real challenge is to get the model right in the first place or start with something good and then improve it using the data from the tournament.

I am very confident that, if anything, this computer will put up a good fight against its human opponents. Schaeffer's GAMES research group has, for years, been developing intelligent agents capable of playing some of the oldest and most interesting games such as chess, Go, checkers and of course poker. The group's achievements include solving the game of checkers a couple of years ago. In fact, their checkers-playing program Chinook is the only program to win a human world championship and make the Guinness Book of World Records.

I will be attending the AAAI conference and so I will be able to report on location on this competition's progress (and all the other ones happening at the same time such as the Semantic Robot Vision Challenge.) I guess it is about time that I learn how to play poker; luckily, I have plenty of time to familiarize myself with the basics of the game in the next 1.5 months.

PS: This post was mostly an excuse for posting a photo of Data playing poker :)

Software for solving sequential decision making problems

Intelligent agents acting in the world must be able to make complex decisions under uncertainty. In artificial intelligence, solving the problem of deciding on a course of action is called planning. An example of an intelligent agent faced with a sequential decision making problem would be one that has to buy and sell stock on NASDAQ. This agent can observe the current price of stock and the amount being traded. It then has to estimate whether the stock's value will increase or decrease. Given this estimate, the agent must decide on whether to buy or sell. During the course of the day, the trading agent will make several buy and sell decisions. Another example of an agent faced with a sequential decision making problem is the intelligent agent used for assisting people with dementia as we discussed last week.

The frameworks of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are becoming the preferred way of solving such decision problems. MDPs are useful in domains that are fully observable, i.e., the world's state is known to the agent, while POMDPs are useful in domains that are only partially observable; partially observable domains are dominant in robotics. Anthony Cassandra has published a good non-mathematical introduction to MDPs and POMDPs here.

Solving sequential decision problems in stochastic domains and finding optimal solutions is hard. Research in the last few years has yielded methods for finding approximate solutions that may not be optimal but are good enough for the agent to act intelligently. Some of the researchers have made their software available for free. So, here is a list of free software for solving MDPs and POMDPs.
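To make the MDP idea concrete, here is a minimal sketch of value iteration, the textbook way to solve a small, fully observable MDP. The two-state "market" model, its transition probabilities and its rewards are numbers I made up purely for illustration; they are not taken from any of the packages mentioned in this post.

```python
# Hypothetical two-state MDP; every number here is invented for illustration.
# P[state][action] = list of (next_state, probability)
P = {
    "low":  {"hold":  [("low", 1.0)],
             "trade": [("high", 0.8), ("low", 0.2)]},
    "high": {"hold":  [("high", 0.9), ("low", 0.1)],
             "trade": [("low", 1.0)]},
}
# R[state][action] = immediate reward
R = {
    "low":  {"hold": 0.0, "trade": -0.1},
    "high": {"hold": 1.0, "trade": 0.5},
}


def q_value(V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])


def value_iteration(gamma=0.9, tol=1e-6):
    """Repeat Bellman backups until the value function stops changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # The policy picks, in each state, the action with the highest Q-value.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration()
print(policy)
```

In a POMDP the agent cannot look up its state in a table like this; it has to maintain a probability distribution over states instead, which is a large part of what makes POMDPs so much harder to solve.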

SPUDD stands for Stochastic Planning Using Decision Diagrams. It is currently the fastest method for finding optimal and approximate solutions for MDPs and POMDPs. In academia, it is considered the benchmark for testing new MDP solution methods. The software is available for Linux and it is written in C++; the authors have also created a website where anyone can upload their problem described in a specific file format and then have the server compute a solution. The online version is great if you want to try the software without downloading and compiling it for yourself (it would appear that the online version is not available at this time but hopefully will be back soon.)

PERSEUS is the state of the art in finding approximate solutions for POMDPs. PERSEUS is a point-based algorithm that estimates the POMDP value function for a small set of reachable belief points that hopefully generalize well for all important points the agent may encounter. The software is available for MATLAB.

Symbolic PERSEUS implements the PERSEUS algorithm for finding approximate solutions for POMDPs but it uses decision diagrams to represent the problem and value function. It can be more efficient than the original method for large problems with lots of structure that can be exploited using the decision diagrams. Symbolic PERSEUS is written in MATLAB and Java.

HSVI stands for Heuristic Search Value Iteration and it is an alternative method for finding approximate solutions for POMDPs. HSVI works by using a heuristic function to guide the search for a solution. The software is written in C++.
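The "belief points" that point-based solvers such as PERSEUS work with are simply probability distributions over the hidden states. As a rough sketch, here is the standard Bayes update a POMDP agent applies to its belief after taking an action and receiving an observation. The two-state model, the "listen" action and the 0.85 sensor accuracy are hypothetical values chosen for illustration, and none of this code comes from the packages listed above.

```python
def belief_update(belief, action, observation, T, O):
    """Return the new belief after doing `action` and seeing `observation`.

    T[action][s][s2] is the probability of moving from state s to state s2.
    O[action][s2][o] is the probability of observing o after landing in s2.
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Predict: probability of ending up in s2 before looking at the sensor.
        predicted = sum(T[action][s][s2] * belief[s] for s in range(n))
        # Correct: weight the prediction by how well s2 explains the observation.
        unnormalized.append(O[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical model: "listen" leaves the state unchanged and reports the
# true state correctly 85% of the time.
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}

belief = belief_update([0.5, 0.5], "listen", 0, T, O)
print(belief)
```

A point-based method evaluates and improves the value function only at a sampled set of such beliefs, rather than over the whole continuous belief space.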

An intelligent powered wheelchair for older adults with cognitive disabilities

Yesterday, I talked about a collision avoidance system used in the new Lexus LS 600h L sedan for making driving safer. Collisions are an equally serious problem when older adults with cognitive or sensory impairments, e.g., low vision, are given access to powered wheelchairs. Mobility is important for a high quality of life at any age; studies have shown that lack of mobility reduces one's chances of socializing, causing isolation and depression. For this reason, older people with cognitive impairments such as dementia or Alzheimer's disease who are not strong enough to use a standard wheelchair are given access to powered wheelchairs. Such people often live in places where they can receive health care on a daily basis, sharing the facilities with other older adults who may still be able to walk, possibly with the help of a cane or walker. The problem is that operators of powered wheelchairs may not have the cognitive capabilities to use them safely, often causing collisions with other people.

To put the danger of collision into perspective, note that even a minor collision between a powered chair and an elderly adult may result in a fall, and such a fall causes a serious injury 20-30% of the time. Studies show that 5-10% of the time such a fall results in a fracture. For those adults who suffer a hip fracture because of the collision and fall, the outcome can be fatal: in 40% of adults it can lead to death within six months, as a result of complications relating to the broken hip.

Obviously, we want to be able to improve the quality of life for all people, so somehow we must make it possible for powered wheelchairs to be operated safely in care facilities. To that end, researchers at the Intelligent Assistive Technology and Systems Lab (IATSL) at the University of Toronto, Canada, are working on a new collision avoidance system for powered wheelchairs. The group has outfitted a Nimble Rocket powered wheelchair with a laptop computer and a stereo camera sensor made by Point Grey Research (or a Canesta Vision sensor). Software processes the stereo images returned from the camera to construct a 2D occupancy grid representation of the space in front of the chair. A decision theoretic planner formulated as a Partially Observable Markov Decision Process (POMDP) is then used to evaluate the probability of a collision. If a possible collision is detected then the system uses a verbal prompt to warn the user; if the user does not respond then the software stops the chair from moving forward into the object.
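As a rough idea of what "constructing a 2D occupancy grid" from stereo data involves, the sketch below drops 3D points onto a floor-plane grid in front of the chair and marks a cell as occupied once enough points land in it. This is my own simplified illustration and not the IATSL software; the grid size, cell size, height band and hit threshold are all assumed values.

```python
import math


def build_occupancy_grid(points, width_m=2.0, depth_m=3.0, cell_m=0.1,
                         min_height_m=0.05, max_height_m=1.5, min_hits=2):
    """Turn 3D points (x lateral, y height, z forward, in metres) into a 2D grid.

    grid[row][col] is True when at least `min_hits` points fall in that cell.
    Row 0 is the strip closest to the chair; the chair sits at the middle column.
    """
    cols = int(round(width_m / cell_m))
    rows = int(round(depth_m / cell_m))
    hits = [[0] * cols for _ in range(rows)]
    for x, y, z in points:
        # Skip the floor and anything above the assumed height of the chair.
        if not (min_height_m <= y <= max_height_m):
            continue
        col = math.floor((x + width_m / 2.0) / cell_m)
        row = math.floor(z / cell_m)
        if 0 <= row < rows and 0 <= col < cols:
            hits[row][col] += 1
    # Requiring several hits per cell filters out isolated stereo mismatches.
    return [[count >= min_hits for count in row] for row in hits]


# Two points from an obstacle about 1 m ahead, plus one stray point farther away.
points = [(0.05, 0.5, 1.05), (0.06, 0.6, 1.04), (-0.55, 0.5, 2.05)]
grid = build_occupancy_grid(points)
print(grid[10][10], grid[20][4])
```

A planner can then reason over this grid, for example by checking which occupied cells lie along the chair's intended direction of travel.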

There is a video demonstrating the wheelchair anti-collision system on Canada's CTV television station here.

The anti-collision system is definitely not ready for prime time yet but the team has successfully completed the first step of demonstrating the applicability of the approach for the safe operation of powered wheelchairs. It is always refreshing to read about intelligent systems designed to improve our quality of life. These are the applications in which artificial intelligence and robotics can have a real impact. I am sure that as this blog continues to grow, we will be covering more AI and robotics uses in assisted living and medical applications.


Images are copyright IATSL.

The Lexus LS 600h L may be the smartest car you can drive today

We are all keeping an eye on the teams participating in the DARPA Urban Challenge as they try to build intelligent cars that can autonomously navigate congested urban streets. In the meantime, a few lucky drivers will soon have access to the smartest car available today, the special edition 2008 Lexus LS 600h L. This luxury sedan is not only economical on fuel and low on emissions thanks to its hybrid drivetrain, but it also makes driving safer by using sensors to detect road obstacles and intelligent steering control that helps drivers swerve around them.

How does it do that? Well, the car is equipped with two forward-looking infrared cameras for detecting people and animals using their heat signature. At the same time, a radar is used to compute the distance to these obstacles. This seems like an obvious use of forward sensors. What differentiates the LS 600h L is that it has a third camera looking at the driver. This camera is mounted on the steering wheel and it can detect the orientation of the driver's head. So, if the outside sensors detect an obstacle and the inside sensor detects that the driver is not looking at it, then the car flashes a light or sounds a buzzer to warn the driver before automatically applying the brakes to avoid a collision. Better yet, if the car detects that the driver is swerving around obstacles then it switches to a different set of gears in the steering rack, helping the driver steer around the obstacle with ease; most importantly, this happens transparently, with the driver never knowing of the extra help he or she receives. Finally, the car also comes equipped with the Advanced Parking Guidance System which is a hands-free parking system for those who find parallel parking challenging. Get the full scoop on Lexus' Advanced Pre-Collision System by watching the informational video at the company's official website here.

The Lexus LS 600h L uses intelligent technology to make driving safer exactly as I had suggested in an earlier post. The car's intelligent control systems aid the driver by working under the hood, giving him the impression that he is always in control. This is significant as people like being in control. Nobody wants to just sit in the car and be driven around town; at least not yet. Just as people are slowly embracing alternatives to the internal combustion engine by first adopting the gas-electric hybrid engine, they will also come to accept self-driven autonomous cars after a hybrid control transition period.

By the way, the price tag for this luxury sedan is set at more than $100,000 so not many will be able to afford it yet. However, it won't be more than a few years before such intelligent control technologies become commonplace in most cars including economy sedans.

Printing robots with a fabber

Popular Science magazine is running an article this month presenting the work of Cornell University roboticist Hod Lipson who is on a mission to develop an affordable, portable and versatile 3D printer. Such printers are designed to print 3-dimensional objects directly from CAD models.

Lipson is the same person behind the self healing robot that made headlines in November, 2006.

This time, Lipson's group at the Computational Synthesis Lab designed a portable fabrication machine called a "fabber" which can be used to print thousands of different objects including watchbands, bottles, a flashlight, a Darth Vader silicone mask and apparently artificial muscles and, soon enough, a complete and working robot. Better yet, Lipson has started the fabathome.org website making the fabber available to anyone who can afford the $3,000 price tag.

Lipson has great plans for his invention; he says that this technology can be as influential as the Altair 8800. His ultimate goal is to fabricate a complete robot including a power source. Conceivably, future robots could use such devices to print new parts. Lipson also says that if a robot could be fabricated from scratch using a 3D printer then it could be used in space exploration. In such a scenario, a fabber would be transported to a remote planet where it would fabricate a robot for performing scientific exploration and even preparing a base for the coming of human explorers.

In my opinion, such devices could also have a large impact closer to home. Futurists such as Rodney Brooks and Ray Kurzweil speculate that in the future the fusion of man and machine is inevitable. When (or if) that happens, it would be conceivable for people to print replacements or upgrades for their mechanical parts at home using a fabber.

You can read the Popular Science article here.

Where do Google Streetviews come from?

Google's new streetviews have attracted a lot of commentary. All the hype is a hard act to follow but here's my 2 bits.

I find the excitement about privacy issues and whether Google capturing your cat looking out your window is "invasion of privacy" rather funny. But clearly the privacy issue is going to be a serious one. Especially when you consider competitions such as this one for "best images" you can find on the street views. As an example they suggest "citizens flaunting the law". Here's another site of interesting street view images. Luckily for Google, face and license plate detection methods are getting pretty robust so having a fuzzifier blur out those features may allow them to sidestep the privacy issue.

What I am also interested in is where did these images come from? Immersive Media seems to be taking the credit (see June 1, 2007 press release). However, clearly not all the data comes from IMC, since there are numerous "spottings" of the Google camera van in street view reflections (for example) where it looks like some home-grown hexagonal speaker system - clunky, unpolished, and much different from Immersive Media's cute Volkswagen Beetle camera car.

Trading of Immersive Media stock was halted the day the streetview functionality was announced. This was to prepare for the massive stock activity that the announcement of Immersive Media licensing their image data to Google would generate. What we see instead from the stock price is a steep rise a week before the announcement followed by a brief peak and then a decline in stock price. I guess the insiders who bought up stock before the announcement managed to unload it all in a day or two.

I would think that if Google intends to continue using Immersive Media's services, then we wouldn't be seeing this decline. The fact that Immersive stock isn't soaring makes me think this license deal is a one timer to jumpstart the streetview content. But Google probably has other plans for digitizing the planet... such as mentioned previously in this blog. It will be very interesting to see what comes out of that.

When Bill Gates met Steve Jobs

Okay, I am exaggerating a bit with the post's title. Microsoft's Bill Gates and Apple's Steve Jobs met for the first time back in the seventies when the Personal Computer revolution was about to take place; and they were the two most influential people who made it happen. However, last week marked the first time in more than 20 years that the two men shared a stage; the last time was back in 1983. This seminal meeting took place during the fifth annual D: All Things Digital conference organized by The Wall Street Journal.

Walt Mossberg and Kara Swisher interviewed Gates and Jobs, asking questions about the past, present and future of computing. Everything said about the history of Apple and Microsoft was interesting, since the two have been considered bitter rivals from the beginning; although after his second coming to Apple, Jobs made it clear that he no longer considers Microsoft a direct competitor since there is plenty of room for the two companies to coexist. That said, I want to mention something that was said during the interview that I think is very important.

When Mossberg asked them about what they think will be the future of computers, both men agreed that it would be vision; as in computer vision that is. Most computers today are equipped with a camera by default. Gates said that he believes computers in the future will be able to see us and understand our natural gestures; he also mentioned speech recognition as a future technology that will have a major impact on the way we use computers. Specifically, Gates said,

...And as we get natural input, that will cause a change. … Software is doing vision and so, you know, imagine a game machine where you’re just going to pick up the bat and swing it or the tennis racket and swing it...You can’t sit there with your friends and do those natural things. That’s a 3D positional device (my note: he means the Wii controller.) This is video recognition.

It was clear from Gates' responses that Microsoft Research (MR) was busy trying to bring such technologies to market. And this does not surprise me considering that MR has assembled some of the best people in computer vision, machine learning and human-computer interaction. Examples include Andrew Blake, Christopher Bishop, Harry Shum, Rick Szeliski, Eric Horvitz and David Heckerman.

Jobs focused more on post-PC devices, i.e., things such as the iPod and iPhone; he still believes that we are going to interface with these in a different way, but that the change will be more gradual to ease adoption.

Finally, I would like to mention something that Jobs said about hardware and software (this is when he was interviewed separately at an earlier time during the same conference.) He said (with regards to the success of the iPod considering that it was a late entry to the portable mp3 player market):
It’s because Japanese consumer electronics couldn't produce elegant software. And that’s why Apple enjoys the success it does with the iPod. If you look at handsets, the situation is similar. Manufacturers have the hardware down, but they just can’t seem to get the software right. The iPhone is great software wrapped in wonderful hardware, and its software is five years ahead of anything else out there.

This quote made me think about the state of robotics in Japan and Korea versus North America. The engineers in the Far East have been busy developing some excellent hardware but in terms of software, i.e., functionality, they are terrible machines. The work on algorithms and software infrastructures for intelligent agents is mostly the focus of research in North America and Europe. It may just turn out, as with the iPod, that the ultimate robot will be a combination of Japanese/Korean hardware and North American/European software. Wouldn't that be interesting?

You can watch video of the entire Bill Gates and Steve Jobs interview at the official D: All Things Digital conference website here.

Bill Gates and Steve Jobs photograph is copyright The Wall Street Journal.

Point Grey Research releases the Bumblebee XB3 multi-baseline stereo camera

Point Grey Research has finally introduced a new stereo camera with a wide baseline. Actually, the new Bumblebee XB3 has two baselines. The XB3 has 3 CCD cameras but it is not a trinocular stereo system similar to the company's Digiclops camera. The 3 cameras on the XB3 are configured in a line providing stereo processing at 2 different baselines; one is 12cm and the other 24cm. With this configuration, a vehicle equipped with the camera can compute accurate depth images for objects both close and far away. The ability to estimate distances accurately for objects far away from the camera will be welcomed by those researchers working on outdoor applications. Don Murray, director of Point Grey Research, explains:

We received numerous requests from customers developing mobile robot applications, for a longer baseline stereo solution. We designed the XB3 so that it will not only address the long-baseline requirement, but also preserve the 12cm baseline used in existing applications.

Stereo processing for the XB3 is performed in real-time using Point Grey's highly optimized Triclops Stereo SDK. Image processing including rectification and correlation-based stereo is performed on a host PC that receives data from the camera using the IEEE-1394b (or FireWire) interface. The company says that the camera can process data at 15fps at 1280x960 pixels resolution. The Bumblebee XB3 is available in a B/W configuration offering higher resolution and in color for those applications that need it.

Interesting tidbit on Point Grey Research Inc. According to their website, the company was started as a spin-off from the University of British Columbia 10 years ago:
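Why does a longer baseline help with distant objects? For a rectified stereo pair, depth is Z = f·B/d (focal length f in pixels, baseline B, disparity d in pixels), so a fixed disparity error produces a depth error that grows roughly as Z²/(f·B). The sketch below compares the XB3's two baselines under that textbook model. The 1000-pixel focal length and the quarter-pixel matching error are numbers I assumed for illustration; they are not Point Grey specifications, and the code does not use the Triclops SDK.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen with the given disparity (rectified pair)."""
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_err_px=0.25):
    """First-order depth uncertainty caused by a small disparity matching error."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


FOCAL_PX = 1000.0  # assumed focal length in pixels, not a published XB3 value

for baseline_m in (0.12, 0.24):
    for depth_m in (2.0, 10.0, 30.0):
        err = depth_error(depth_m, FOCAL_PX, baseline_m)
        print(f"baseline {baseline_m:.2f} m, depth {depth_m:5.1f} m: +/- {err:.3f} m")
```

Under this model, doubling the baseline halves the depth error at any given range. The trade-off is that nearby objects produce very large disparities on the wide pair, which is presumably why keeping the 12cm baseline for close-range work is useful.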
The Company was founded in January of 1997 as a spin-off from the Laboratory for Computational Intelligence (LCI) at the University of British Columbia (UBC) and the Institute for Robotics and Intelligent Systems (IRIS). The founders and principals of the company are all graduates from UBC.

It is good to see that students in Canada have the opportunity to start a company that can grow to become a market leader.