Apple research paper details LiDAR-based 3D object recognition for autonomous vehicle navigation
Apple researchers are pushing forward with efforts to bring autonomous vehicle systems to public roads, and last week published an academic paper outlining a method of detecting objects in 3D point clouds using trainable neural networks. While still in its early stages, the technology could mature to improve accuracy in LiDAR navigation solutions.
Like other recent scholarly articles published by Apple engineers, the latest entry, "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection" by AI researcher Yin Zhou and machine learning specialist Oncel Tuzel, was made public through the arXiv archive of scientific papers.
In the paper, Apple notes that accurate detection of objects in 3D point clouds, like those generated by LiDAR arrays, is a sticking point in a number of burgeoning real-world applications. From autonomous cars to robotic vacuums, machines that navigate the world around them without the assistance of human operators need to detect critical objects with speed and precision.
Compared to 2D image-based detection, LiDAR technology proves to be a more reliable alternative as it provides depth information to better localize objects in space, Apple says. However, LiDAR point clouds, generated by emitting laser pulses and logging the time it takes for the light to return after bouncing off a solid surface, are sparse and have highly variable point density, thus causing a host of problems.
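The time-of-flight principle described above can be sketched in a few lines. This is a minimal illustration assuming a simple single-return model; the function name is an invention for this example:

```python
# Illustration of the LiDAR time-of-flight principle: distance to a surface
# is half the round-trip travel time of a laser pulse times the speed of light.
C = 299_792_458.0  # speed of light in m/s

def range_from_return_time(round_trip_seconds: float) -> float:
    """Distance in meters to the surface that reflected the pulse."""
    return C * round_trip_seconds / 2.0

# A pulse returning after 200 nanoseconds hit a surface roughly 30 m away.
print(range_from_return_time(200e-9))  # ~29.98 m
```

Real LiDAR returns are noisier than this: a single pulse can produce multiple returns from partially transparent or angled surfaces, which is one source of the sparse, variable-density clouds the paper describes.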
Current state-of-the-art techniques designed to manage data interpretation involve manually creating feature representations for said point clouds. Some methods project point clouds into a bird's eye perspective view, while others transform the data into 3D voxel grids and encode each voxel with certain features. Manually crafting feature representations introduces an "information bottleneck" that restricts such systems from efficiently leveraging 3D shape information, according to Apple.
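For contrast, here is a minimal sketch of the kind of hand-crafted encoding the paper criticizes: projecting a point cloud to a 2D bird's-eye-view occupancy grid. The grid resolution, extent, and function name are illustrative assumptions, not taken from any specific method:

```python
# Hand-crafted bird's-eye-view feature: a flat occupancy grid.
# Dropping the z coordinate is exactly the "information bottleneck" --
# vertical shape information never reaches the downstream detector.
def birds_eye_occupancy(points, cell=0.5, extent=10.0):
    """Return an n x n grid where a cell is 1 if any point lands in it."""
    n = int(extent / cell)
    grid = [[0] * n for _ in range(n)]
    for x, y, _z in points:          # height is discarded entirely
        i, j = int(x // cell), int(y // cell)
        if 0 <= i < n and 0 <= j < n:
            grid[i][j] = 1
    return grid

# Two points at very different heights collapse into the same occupied cell.
grid = birds_eye_occupancy([(1.0, 1.0, 5.0), (1.1, 1.2, 0.0)])
print(sum(map(sum, grid)))  # 1 occupied cell
```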
Instead, Zhou and Tuzel propose the implementation of a trainable deep architecture for point cloud based 3D detection. The framework, called VoxelNet, uses voxel feature encoding (VFE) layers to learn complex features for characterizing 3D shapes. In particular, the technique breaks down the point cloud into 3D voxels, encodes the voxels via stacked VFE layers and renders a volumetric representation.
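The first step of that pipeline, partitioning the cloud into voxels, can be sketched as follows. This only groups points by voxel cell; the learned VFE layers of the actual paper are neural networks and are not reproduced here, and the voxel size and names are illustrative assumptions:

```python
# Voxelization sketch: bucket raw (x, y, z) points into a grid of 3D voxels
# so that per-voxel features can subsequently be learned rather than hand-crafted.
from collections import defaultdict

def voxelize(points, voxel_size=0.4):
    """Group points by the integer (i, j, k) voxel cell they fall in."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels[key].append((x, y, z))
    return voxels

cloud = [(0.1, 0.2, 0.0), (0.15, 0.25, 0.05), (3.0, 1.0, 0.5)]
grid = voxelize(cloud)
print(len(grid))  # first two points share a voxel, the third is alone: 2
```

Because LiDAR clouds are sparse, most voxels end up empty, which is why the paper's approach processes only occupied voxels rather than a dense grid.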
In tests, Apple's methodology showed promise, outperforming current LiDAR-based detection algorithms and image-based approaches "by a large margin," according to evaluations on the KITTI 3D object detection benchmark. VoxelNet was trained to detect three basic objects -- car, pedestrian and cyclist -- in a variety of tests.
Aside from theoretical research, Apple is currently evaluating a self-driving vehicle testbed on the streets of Cupertino, Calif. The company's efforts in autonomous technology began under the "Project Titan" initiative, which sought to build a branded self-driving car from the ground up. After significant investment and multiple employee reassignments, Titan hit a number of snags and was ultimately put on ice in late 2016, though remnants of the initiative, like supporting software and hardware, remain active.
A report in August claimed Apple is looking to parlay the technology into an autonomous shuttle that will ferry employees between its Silicon Valley campuses.
While Apple's research paper focuses heavily on autonomous vehicle navigation, the tech described can also be applied to augmented reality systems that use depth mapping hardware to detect real-world objects. The new iPhone X sports equipment similar to LiDAR arrays in its front-facing TrueDepth camera, which incorporates a miniaturized dot projector for accurate depth mapping operations. If TrueDepth's range is extended, and mounted on the rear of a portable device, it could potentially be paired with advanced software to power an entirely new consumer AR experience.
Comments
Can only work if you're the only one on the road, it seems to me...
Or am I missing something?
That’s a good question.
My guess is that there will be a peer-to-peer/mesh/ad hoc networking capability for AVs in the near future, and that cars within LiDAR range of each other would just negotiate for a narrow time slot to make the scan; it's likely not that bandwidth intensive. For now, we consider only a single vehicle's AV capability, not the swarm's, and that will not be sufficient in the future.
At the same time, vehicles would need to coordinate with each other to manage traffic, and to build the "packets" of vehicles that platoon to near and distant locations. Autonomous vehicles will have to be anything but autonomous once they are in traffic, and let's just call that part SkyNet. At some point in the future, "classic" vehicles with human drivers will have to be retrofitted with transponders giving relevant data about the vehicle's and driver's capabilities to the swarm. Good drivers would have a higher score based on ML that would allow them to interact more closely with the swarm, or with individual AVs. We shouldn't forget all of the ethical issues that have to be solved once we loose AVs on the world in large numbers.
Then, sometime in the future, classic vehicles would be eliminated entirely from major arteries in urban areas, followed by restriction to limited areas of use.
Sadly, the odds of me still being alive when that happens are pretty good, but by then, my age will have me happy to have the assist from AVs.
I do believe fully self-driving cars will come, but the timeframe for widespread use is longer than most expect. I use technology all the time, every day, and none of it is perfected to the level that I would expect self-driving to require.
It does seem like something Apple would do, and I’m sure they’re considering it, but I’d guess their biggest concern about taking “Titan” in that direction is the fact that what they were working on would then be out in the open. Something Apple isn’t generally keen on.
—————————————-
Good point. Maybe the analysts have a good answer for that… analysts? 🏏🏏🏏