There’s no good way to transport autonomous car test data — yet


Behind the scenes, at locations around the world, automakers are running tests on autonomous cars for literally thousands of hours. The industry has poured more than $80 billion into autonomous-vehicle R&D over the last four years, so it is serious about making this happen.

Those of us working on these tests have one overwhelming challenge: how to manage all the data that gets generated during the tests. One eight-hour shift can create more than 100 terabytes of data. In a week of testing multiple cars, we’re talking about petabytes of data. And often — at rural testing centers, for example — Internet bandwidth speeds are simply insufficient to ensure that the data reaches our data centers in North America, Europe and Asia at the end of the test day.
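To see why those rural links fall short, a quick back-of-envelope calculation helps. The 100-terabyte-per-shift figure comes from the article; the link speeds and 80% usable-bandwidth factor are illustrative assumptions:

```python
# Rough estimate of how long it takes to upload one shift's test data
# over a given network link. 100 TB/shift is from the article; the link
# speeds and efficiency factor are illustrative assumptions.

TB = 1e12          # bytes per terabyte (decimal)
SHIFT_DATA_TB = 100

def upload_hours(data_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Hours needed to move data_tb terabytes over a link_gbps link,
    assuming only `efficiency` of the raw line rate is usable."""
    bits = data_tb * TB * 8
    usable_bps = link_gbps * 1e9 * efficiency
    return bits / usable_bps / 3600

for gbps in (0.1, 1, 10):   # rural uplink, gigabit, data-center fiber
    print(f"{gbps:>4} Gbps -> {upload_hours(SHIFT_DATA_TB, gbps):7.1f} hours")
```

Even over a dedicated gigabit link, a single shift's data would take well over 200 hours to move, which is why shipping disks remains competitive.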

Right now, we have two main ways to transport data back to a data center. Both are cumbersome, but each has different pluses and minuses. Until advances in technology make these challenges easier to manage, here’s what we do today:

  • Connect the car to the data center. Test cars generate about 28 terabytes of data per hour, and it takes 30 to 60 minutes to offload that data to the data center over a fiber-optic connection. While this is a time-consuming option, it remains viable in cases where the data gets processed in somewhat smaller increments.
  • Take/ship the media to a special station. In many situations the data loads are too large, and fiber connections unavailable (e.g., at geographically remote test locations such as deserts, ice lakes and rural areas), for data to be uploaded directly from the car to the data center. In these cases we remove a plug-in disk from the car and take it or ship it to a “Smart Ingest Station,” where the data is uploaded to a central data lake. Because it only takes a couple of minutes to swap out the disks, the car stays available for testing. The downside of this option is that we need several sets of disks, so compared to Option 1 we are buying time by spending money.
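It is worth checking what Option 1 implies about the fiber link. Offloading roughly 28 terabytes in 30 to 60 minutes (figures from the article) requires a sustained throughput that a quick sketch makes concrete:

```python
# Sustained throughput implied by Option 1: offloading ~28 TB of data
# in 30-60 minutes. The figures come from the article; the calculation
# itself is an illustrative sketch.

def required_gbps(data_tb: float, minutes: float) -> float:
    """Sustained throughput (Gbit/s) needed to move data_tb in `minutes`."""
    bits = data_tb * 1e12 * 8
    return bits / (minutes * 60) / 1e9

print(f"28 TB in 60 min needs {required_gbps(28, 60):.0f} Gbit/s sustained")
print(f"28 TB in 30 min needs {required_gbps(28, 30):.0f} Gbit/s sustained")
```

That works out to roughly 60 to 125 Gbit/s sustained, i.e., dedicated data-center-grade fiber, which is exactly what remote test sites lack.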

In three to five years we may get to the point where both options are outmoded by advances in technology that make it possible for the computers in the car to run analysis and select the needed data. If the test car could isolate, for example, the video of right-hand turns at a stop light, the need to send terabytes of data back to the main data center would be alleviated, and the testers could send these much smaller data sets over the Internet.
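In outline, that in-car selection step might look like the following sketch. The Segment type, the labels, and the sizes are all invented for illustration; the point is simply that filtering by detected events on board collapses the upload volume:

```python
# Hypothetical sketch of the in-car selection step described above:
# instead of shipping every recorded segment, keep only those whose
# on-board labels match an event of interest (e.g., a right turn at a
# stop light). All names, labels, and sizes here are illustrative.

from dataclasses import dataclass, field

@dataclass
class Segment:
    start_s: float              # offset into the drive, in seconds
    end_s: float
    size_gb: float              # raw sensor data for this segment
    labels: set = field(default_factory=set)   # events detected on board

def select_for_upload(segments, wanted=frozenset({"right_turn", "stop_light"})):
    """Keep only segments carrying all wanted labels."""
    return [s for s in segments if wanted <= s.labels]

drive = [
    Segment(0, 120, 40.0, {"highway"}),
    Segment(120, 150, 10.0, {"right_turn", "stop_light"}),
    Segment(150, 600, 150.0, {"urban"}),
]

keep = select_for_upload(drive)
total = sum(s.size_gb for s in drive)
kept = sum(s.size_gb for s in keep)
print(f"uploading {kept:.0f} GB of {total:.0f} GB recorded")
```

In this toy drive, only 10 GB of 200 GB recorded would need to cross the network, a volume an ordinary Internet link can handle.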

Of course, we’re several years away from having such a capability. In the past year, IBM and Sony have been working on a 330-terabyte tape cartridge that promises faster and more resilient data storage in a form factor that can fit in the palm of your hand. Once such products are commercialized, they should make our lives a bit easier.

Ultimately, we’d like the ability to move our various equipment easily in and out of hotel rooms and carry it on plane trips in our pockets or briefcases. Today, the equipment is often clunky and hard to move around. While technology can help, we have to be realistic and understand that the data challenges surrounding autonomous cars are likely to increase exponentially. The challenges may grow, but at least sometime soon the gear we use won’t be so cumbersome that our muscles ache at the end of the day.


Hanno Borns is a Big Data Solution Architect in the automotive space, focusing on solutions that securely and robustly transport, store and analyze very large amounts of data with high performance. Over his 19-year career with HPE and now DXC, Hanno worked on the HP Corporate B2B Gateway and the HP Service Bus Engineering Team. Since 2011, he has served as a B2B Solutions Consultant at Business Exchange Services (BES), bringing his experience at one of the largest on-premise B2B Gateways in the world to DXC’s clients, in particular with regard to large-scale implementations and communication optimization.

Slawomir Folwarski is a Senior Architect in the DXC Robotic Drive Center of Excellence and DXC Analytics Platform, where he focuses on data workload optimization and big data platform architecture. Slawomir has 17 years of experience in the telco, public sector, automotive, logistics and finance industries with expertise in data warehousing, business intelligence and Hadoop technologies.
