The case for synthetic data
Driver monitoring systems (DMS) that assess alertness behind the wheel are rapidly becoming the leading automotive safety feature across the globe. In the EU, for example, vehicle safety regulator EuroNCAP is requiring all new cars to incorporate a DMS in order to comply with its safety rating.
Amidst this push, startups are benefiting from business opportunities in the DMS space, offering solutions that range from heart metrics to sleep onset detection. Among them, Swedish Devant is tapping the potential of synthetic data.
Launched in 2021, the startup generates synthetic data of lifelike digital humans to support the training, validation, and testing of machine learning networks, such as the ones behind driver monitoring systems. Specifically, it develops 3D simulated humans that are diverse in both appearance and behaviour across different situations.
But how exactly can synthetic data improve DMS? TNW spoke with Richard Bremer, Devant’s co-founder and CEO, to find out more.
The gap synthetic data can fill
Interest in synthetic data started in the early 1990s, and it didn’t take long for the tech industry to realise the technology’s value in accelerating machine learning.
The automotive sector was one of the first proponents of synthetic data, adopting it in the mid-2010s for the development of autonomous vehicles, advanced driver assistance systems (ADAS), and most recently, DMS.
Driver and occupant monitoring systems (DMS and OMS) typically use infrared cameras and sensors to collect real-time information on the driver and the passengers. Thanks to computer vision and machine learning, this information is then analysed, tracking for instance the driver’s gaze or facial expressions, to determine their alertness and attention to the road.
This means that to perform at their best, both DMS and OMS need to be trained on vast amounts of high-quality data, comprising images and recordings that capture as many diverse situations as possible. Think of drivers texting on their phone, drinking at the wheel, or even leaning to the back seats to stop their children from fighting.
While data from cameras and even actor roleplay have fuelled the development of DMS so far, using these sources alone to capture every imaginable situation comes with multiple challenges. It’s expensive, time-consuming, limited in variability, and associated with privacy concerns.
That’s where the value of synthetic data comes in, according to Bremer. “The potential and the interesting part about synthetic data is that you can reduce the time and cost and also increase the performance of the network.”
How Devant’s technology works
The Norrköping-based startup uses a step-by-step process on its platform, combining different kinds of 3D assets to create images and animations. In the automotive case, this content can be 3D cabins and people — supplemented with details such as accessories, garments, or eyewear.
To ensure a high-quality result that doesn’t hamper a machine learning network’s performance, the data’s reliability and accuracy are validated via a series of quality assessment systems throughout the entire process.
“When it comes to what we have built, it’s primarily about making sure that the data has been tested and validated,” Bremer says.
Devant’s aim for its 3D human models is threefold: to align with how things look in the real world, to expand their diversity and offer the widest range of different scenarios possible, and to match customer requirements.
For this reason, the Swedish startup offers a configuration tool for users to select the parameters that correspond to their needs. The adjustments can range from more generic variables (such as age, ethnicity, and sex) to more specific details, including clothing, the frequency of the eyelid movement, or the lighting conditions inside a vehicle.
In June, the company joined forces with Australia-based Seeing Machines, a developer of driver and occupant monitoring systems (DMS and OMS) used by major car manufacturers.
Through the partnership, Seeing Machines will use Devant’s 3D simulations to train and validate its machine learning networks, aiming to further advance its in-cabin monitoring systems and create a large-scale dataset of distracted driver behaviours that align with the EuroNCAP requirements.
Quality just as essential as quantity
To truly tap the potential of synthetic data, it’s not just about hitting a button and generating millions of images within a few days, Bremer explains. It’s also about the data’s quality and accuracy.
The premise is simple. “For any AI network to perform as well as possible, sufficient quantity and sufficient quality are essential.”
The promising aspect about computer-generated data is that “we know exactly down to pixel level what every single image contains thanks to its accompanying metadata,” Bremer says. In contrast, when it comes to real-world data, “you do not have that level of granular control and accuracy as you do with synthetic data.”
But there’s a catch. The more you increase the data’s quality by adding more parameters and realism to cover the vast number of possible scenarios and human behaviours, the more complex the data becomes. This, in turn, increases rendering times.
“That’s why no one before us has taken this quality approach to synthetic data, because it’s so costly in terms of rendering times,” Bremer claims. In fact, Devant struggled for quite some time to solve the puzzle of maintaining quality while optimising speed.
Current limitations
Despite synthetic data’s obvious advantage in quantity and its ability to provide accurate, high-quality simulations, Bremer emphasises that the technology shouldn’t be seen as “a silver bullet.” At least, not yet.
Instead, he says, replacing real-world data with their computer-generated equivalent should be done with a step-by-step, cautious approach.
“I think that the most important thing to remember here is that DMS are life-critical systems,” he notes. And there are still a number of challenges to overcome, which go beyond the need to have thousands of 3D models to ensure sufficient coverage.
The first challenge is establishing a threshold for what constitutes good and bad data, which Devant will explore in collaboration with Seeing Machines. The second is identifying exactly what data the machine learning network will recognise as important enough to use.
The startup is also putting additional effort into covering more aspects of camera optics. “Simulating different camera parameters is very complex, especially when you need to do it within a limited rendering time per image,” Bremer explains.
The way forward
So far, Devant has been working on the various levels of driver distraction, focusing especially on realistically simulating the eye, with its different movements, eyelid behaviours, and varied pupil sizes.
Through the partnership with Seeing Machines, the startup aims to climb the complexity ladder and keep on adding features that will cover the entire EuroNCAP protocol. From there, Bremer sees drowsiness as the “next natural thing,” with intoxication being another interesting possibility on the company’s list.
Devant’s decision to develop human-centric synthetic data for the automotive industry was a targeted one from the outset, driven by the business opportunity presented by the increasing attention to DMS and the impending EU regulations. According to Bremer, it was also about generating actual value and using technology in a way that benefits people.
Beyond the automotive space, the startup envisions other potential industries where its tech could make a positive impact, such as training AI systems to detect signs of diseases at an early stage.