The Case for Affordable and Open 3D Mapping to Accelerate Augmented Reality and Autonomy

Introduction

The goal of this blog post is to demonstrate how a detailed 3D point cloud can be built from commodity video and accurately georectified to a real-world spatial coordinate system across a large geographic area, with a particular focus on the cost of production at scale. Before we get to the meat of the result, we are going to take a detour to explain why mapping a large geographic area affordably is an important result. It is a circuitous trip, but one we think is worth taking.

Maps for Humans vs. Maps for Machines

The advent of Google Earth inspired a significant technological push and investment to map the world in 3D. These efforts encompassed Google's work plus Microsoft Virtual Earth, Apple Look Around, Vricon and others. Using a blend of expensive sensors (satellites, aerial cameras and dedicated sensor-laden cars), imagery was collected at massive scale. This aggregate imagery was then processed to optimize photo-realism, allowing humans to explore the Earth virtually. We can think of these as 3D maps for humans.

How to Scale Economically

For almost two decades we've been able to fly anywhere in the world in Google Earth/Maps to see imagery of the planet. One feature that is often overlooked is the small line at the bottom of the map listing all the data sources you are actively viewing.

Google Imagery Sources for Boulder County

Misunderstanding Absolute Accuracy

So, why are we waxing poetic about the virtues of real-world spatial coordinate systems (i.e. latitude, longitude, altitude) and absolute accuracy? There is a strong belief in the maps-for-machines (computer vision) community that you only need relative accuracy and Cartesian coordinate systems for machine navigation. Taken on its own, that statement is true. Many computer vision specialists came into 3D mapping/navigation and saw the adherence to absolute accuracy as archaic: an unnecessary layer of rigor that did not improve engineering performance.
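To make the distinction concrete, here is a minimal sketch (our illustration, not Pixel8's code) of projecting latitude/longitude into a local Cartesian east/north frame. A machine can navigate consistently inside that local frame with only relative accuracy, but the origin's absolute coordinates are what let two such maps, or a map and aerial imagery, be combined.

```python
import math

WGS84_RADIUS_M = 6_378_137.0  # Earth's equatorial radius in meters

def to_local_enu(lat, lon, origin_lat, origin_lon):
    """Flat-earth (equirectangular) projection of geodetic coordinates
    into a local east/north frame in meters -- fine for small areas."""
    east = math.radians(lon - origin_lon) * WGS84_RADIUS_M * math.cos(math.radians(origin_lat))
    north = math.radians(lat - origin_lat) * WGS84_RADIUS_M
    return east, north

# One millidegree of latitude north of a Boulder-ish origin is ~111 m
east_m, north_m = to_local_enu(40.001, -105.27, 40.0, -105.27)
```

Every point cloud aligned only in such a local frame is internally consistent but an island; knowing `origin_lat`/`origin_lon` accurately is what anchors it to everyone else's data.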

Mapping a Large Geography Economically

Let's see if we can provide an alternative to expensive single-sensor mapping. For our 3D mapping work there are two key dimensions to affordability: 1) collection cost and 2) compute cost. Reducing collection cost is the most obvious. Sensor-laden vehicles are expensive. The first Google Street View cars in 2007 required a $45,000 camera and $90,000 for the mount and onboard processing unit, plus another $125 to $700 per mile of video footage in operational costs. More modern HD mapping rigs for autonomy use cases (e.g. AutoNavi) can cost upwards of $1,000,000.
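A quick back-of-the-envelope helper makes the economics tangible: the fixed rig cost amortizes over total miles driven, on top of the per-mile operational cost. The figures below come from the paragraph above; the 10,000-mile rig lifetime is purely an illustrative assumption.

```python
def amortized_cost_per_mile(rig_cost, miles_driven, operational_per_mile):
    """Spread the fixed rig cost over total miles driven and add the
    per-mile operational cost."""
    return rig_cost / miles_driven + operational_per_mile

# 2007 Street View figures from above: $45k camera + $90k mount/processing,
# $125/mile at the low end of operations. Lifetime mileage is a guess.
street_view = amortized_cost_per_mile(45_000 + 90_000, 10_000, 125)
```

Even with generous amortization, the fixed hardware dominates until a rig has driven a very long way, which is exactly why commodity cameras change the equation.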

Pixel8 Video Collection Rigs
Seeding and Growing a Living 3D Map of the Globe
Rough Georegistration of Point Clouds Using GPS from EXIF
Point Cloud Partitioning and Stitching
Visual Bundle Adjustment Comparison to Aerial Imagery
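To give a flavor of the EXIF step above: a camera's GPSInfo tag stores latitude and longitude as degree/minute/second values plus a hemisphere letter, and a rough georegistration seed first converts those to signed decimal degrees. A hedged sketch, not our production code:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert EXIF GPS degree/minute/second values plus an N/S/E/W
    hemisphere reference into signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value

# Example values as they might appear in a photo's EXIF GPSInfo tag
lat = dms_to_decimal(40, 0, 27.0, "N")    # ~40.0075
lon = dms_to_decimal(105, 16, 15.6, "W")  # ~-105.2710
```

Consumer GPS in EXIF is only good to a few meters at best, which is why this stage is a rough seed that later visual bundle adjustment against aerial reference imagery refines.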

The Power of Self-Service Interfaces and APIs

Driving the cost down is only effective if you can maintain quality while doing so. We've covered how we quantify systematic error in our georectification process in previous posts; current errors range between 25 and 50 cm. Combining inexpensive data collection gear with efficient compute to create high-accuracy data really becomes compelling when you allow anyone to do this themselves. To this end we've been working hard on a parallelized scale-out of our infrastructure to handle multiple concurrent jobs.
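For readers who want to reproduce that kind of error figure, the usual recipe is a root-mean-square distance between georectified points and independently surveyed ground control. A minimal sketch (our illustration, not the exact metric from the earlier posts):

```python
import math

def rmse_3d(estimated, surveyed):
    """RMS 3D distance between georectified check points and surveyed
    ground-control coordinates (both in meters)."""
    squared = [
        (ex - sx) ** 2 + (ey - sy) ** 2 + (ez - sz) ** 2
        for (ex, ey, ez), (sx, sy, sz) in zip(estimated, surveyed)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Two check points off by 0.3 m and 0.4 m respectively
error_m = rmse_3d([(0.3, 0.0, 0.0), (0.0, 0.4, 0.0)],
                  [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
```

The important discipline is that the check points are held out of the alignment itself; otherwise the metric measures fit, not accuracy.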

Pixel8 Scale Out Architecture

A Cost Breakdown of Mapping the University of Colorado Boulder

Given the unfortunate circumstances of the COVID-19 pandemic, mapping the University of Colorado campus was a solo affair. In three hours a single person mapped 25 linear kilometers, creating 63 GB of video. The total area collected was roughly 1 square kilometer.
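Those collection figures imply some useful back-of-the-envelope rates for anyone planning a similar collect:

```python
def collection_rates(km, hours, gigabytes):
    """Derived rates for a video mapping run."""
    return {"km_per_hour": km / hours, "gb_per_km": gigabytes / km}

# The CU Boulder collect: 25 linear km in 3 hours producing 63 GB of video
rates = collection_rates(25, 3, 63)  # ~8.3 km/h pace, 2.52 GB of video per km
```

Multiply the GB-per-km figure by a road network's length and you get a first-order storage and upload budget for a whole city before anyone leaves the office.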

Pixel8 Collection Path for the University of Colorado Boulder
25km of Video to Point Cloud Compute Cost

The Results

Now for the fun part: showing off the results of mapping the University of Colorado's campus and sharing the open data. Eagle-eyed readers have probably noticed that our collection of the University of Colorado Boulder campus was done at the end of May, and we are a good way through June now. While we were able to process and align all the data programmatically the day after collection back in May, we found an issue. Since the ground control surface we were using was collected, the university had done a considerable amount of construction. Entirely new complexes had popped up and some older buildings had disappeared. In the most extreme cases this introduced alignment errors in our data. While the University of Colorado case was a bit extreme, it was an edge case we wanted to be able to handle. The good news is the team was able to devise new odometry and pose graph optimization routines that allow SfM reconstructions to bridge temporal gaps in the reference data. In this image we can see how the new method accurately structured and aligned the SfM point cloud even though the building in question was missing from the ground control.
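We can't share the full pose-graph machinery here, but the core idea of tolerating a stale reference surface can be caricatured in a few lines: ground-control matches whose residuals are implausibly large (a building that no longer exists, say) get set aside, and odometry carries the reconstruction across the gap. A deliberately toy sketch with made-up field names:

```python
def split_control_matches(matches, max_residual_m=1.0):
    """Partition ground-control matches into ones consistent with the
    reference surface and ones that look stale (changed construction).
    Only the consistent set would constrain the alignment; odometry
    bridges the regions covered by stale matches."""
    consistent = [m for m in matches if m["residual_m"] <= max_residual_m]
    stale = [m for m in matches if m["residual_m"] > max_residual_m]
    return consistent, stale

matches = [{"id": "quad", "residual_m": 0.3},
           {"id": "new_dorm", "residual_m": 7.5}]  # built after the reference flight
good, bad = split_control_matches(matches)
```

The real routines work over a pose graph rather than a flat list, but the same intuition holds: trust the reference where it agrees, and let relative constraints span where it doesn't.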


We are building a multi-source 3D map of the globe one image at a time.

Pixel8Earth
