What is sensor fusion?
In automated driving, sensor fusion is the process of combining the outputs of different environmental sensors such as radars, cameras, and LiDARs to obtain more reliable and meaningful data.
Object fusion combines object-level data from camera, radar, and LiDAR sensors using algorithms like Kalman filtering. It provides a unified list of objects where each object contains quantities like position, velocity, size, and class.
Of the two approaches, object fusion has the lowest hardware requirements.
Grid fusion combines data on detection or pixel-level from camera, HD radar, and LiDAR sensors using probabilistic algorithms. It provides a grid map where each grid cell is classified as occupied or free. Dynamic grid fusion also provides velocity and heading information.
Grid fusion provides the highest functional performance.

Get insights from our experts into sensor fusion concepts for automated driving. Dive deep into the topic, including considerations for next-generation sensor fusion systems.
- Safe Autonomous Operation of Commercial Vehicles
- A Sensor Fusion Benchmark of ARM CPUs
- When a machine is driving your car, how does it avoid getting into accidents?
- Why the best way to detect objects is not to detect them - A Comparison of Environmental Model Architectures for Automated Driving
- Next-Generation Sensor Fusion for Next-Generation Sensors and Driving Functions
- Sensor Fusion - It's all about Prediction
- Is DMIPS still of relevance today? Predicting the runtime of software for sensor fusion
- Sensor Models - Key Ingredient for Sensor Fusion in Automated Driving
Strategic decisions about sensor fusion require careful consideration. BASELABS CEO Robin Schubert shares suggestions to enrich your thinking.

Read more about sensor fusion and related topics in our Sensor Fusion Literature section. The suggested works provide a starting point for exploring the abundant literature in our domain. You might know Sebastian Thrun. But do you know all the other inspiring authors?

What is sensor fusion required for?
Sensor fusion is required for
- resolving contradictions between sensors: Deciding which sensor to believe if one reports a detection at a particular position and another does not
- synchronizing sensors: Ensuring the position of a vehicle is correctly calculated even though it moved between the measurement times of two sensors
- providing the basis to predict the future positions of objects: Calculating where the detected vehicles or pedestrians will most likely be a couple of seconds later (see the prediction sketch after this list)
- exploiting the strengths of heterogeneous sensors: For example, combining the longitudinal accuracy of a radar with the lateral accuracy and classification capabilities of a camera
- detecting malfunctions of sensors: Detecting if one sensor systematically provides implausible detections compared to other sensors
- achieving automated driving safety requirements: Ensuring that automated vehicles will operate safely in a broader range of scenarios than a single sensor could cover
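As an illustration of the prediction point above, the following minimal Python sketch propagates a track with a constant-velocity motion model. The state layout, noise intensity, and function name are illustrative assumptions, not a prescribed implementation.

```python
import numpy as np

def predict_constant_velocity(x, P, dt, q=0.5):
    """Predict a track's state and covariance dt seconds ahead.

    x: state vector [px, py, vx, vy] (position in m, velocity in m/s)
    P: 4x4 state covariance
    dt: prediction horizon in seconds
    q: process noise intensity (assumed tuning parameter)
    """
    # Constant-velocity transition: position += velocity * dt
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    # Simple white-noise-acceleration process noise model
    G = np.array([[0.5 * dt**2, 0],
                  [0, 0.5 * dt**2],
                  [dt, 0],
                  [0, dt]])
    Q = q * G @ G.T
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

# Example: a pedestrian at (10 m, 2 m) walking at 1.5 m/s in x,
# predicted 2 seconds ahead -> expected position around (13 m, 2 m).
x = np.array([10.0, 2.0, 1.5, 0.0])
P = np.eye(4)
x_pred, P_pred = predict_constant_velocity(x, P, dt=2.0)
print(x_pred[:2])  # approximately [13. 2.]
```

A constant-velocity model is only one possible choice; the same prediction step works with other motion models.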
How is sensor fusion implemented?
Regardless of their measuring principle, all sensors provide data containing errors such as
- false positives: The sensor falsely reports an object where there is none,
- false negatives: The sensor fails to report an object that is actually there, and
- measurement noise: The sensor reports a value that differs from the true value, e.g. an object reported at 28 m is actually at a distance of 30 m.
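The following minimal Python sketch simulates these three error types for a single range-measuring sensor; the error rates, noise level, and function name are arbitrary, illustrative values rather than properties of any real sensor.

```python
import random

def simulate_range_sensor(true_ranges_m,
                          p_false_negative=0.1,
                          p_false_positive=0.05,
                          noise_std_m=1.0,
                          max_range_m=100.0):
    """Return a list of measured ranges containing all three error types."""
    measurements = []
    for true_range in true_ranges_m:
        # False negative: the object is simply not reported.
        if random.random() < p_false_negative:
            continue
        # Measurement noise: e.g. a 30 m object may be reported at 28 m.
        measurements.append(true_range + random.gauss(0.0, noise_std_m))
    # False positive: a detection is reported although no object exists.
    if random.random() < p_false_positive:
        measurements.append(random.uniform(0.0, max_range_m))
    return measurements

print(simulate_range_sensor([30.0, 55.0]))
```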
All these errors can occur simultaneously, and sensor fusion resolves them using various techniques, including probabilistic algorithms, Dempster-Shafer methods, and convolutional neural networks. Fusion approaches can also be classified by the level at which sensor data is processed:
Object fusion corresponds to combining data at the object level from various sensors, for example, bounding boxes determined from camera images, radar, or LiDAR data. From this potentially heterogeneous data, so-called Multiple Object Tracking (MOT) algorithms determine a unified list of potential objects. For each object, quantities like position, velocity, and length/width are calculated, including confidence metrics.
In object fusion, detections are initially considered potential objects whose existence is yet to be confirmed, so-called tracks. For each track, algorithms like the Kalman filter simultaneously estimate the track's quantities like position and motion as well as its existence probability.
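The following minimal Python sketch shows the correction step of a Kalman filter for one track, assuming a sensor that measures position only; the matrices and noise values are illustrative assumptions, not a specific product implementation. It complements the prediction step sketched above.

```python
import numpy as np

def kalman_update(x, P, z, R):
    """Correct the predicted track state x (covariance P) with measurement z.

    z: measured position [px, py], R: 2x2 measurement noise covariance.
    """
    # Measurement matrix: the sensor observes position only.
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    y = z - H @ x                      # innovation (measurement residual)
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# Example: a radar detection at (12.5 m, 2.1 m) corrects a predicted track.
x_pred = np.array([13.0, 2.0, 1.5, 0.0])
P_pred = np.eye(4)
z = np.array([12.5, 2.1])
R = np.diag([0.5**2, 0.5**2])          # assumed sensor accuracy
x_upd, P_upd = kalman_update(x_pred, P_pred, z, R)
```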
For each measurement cycle of each sensor, the algorithm determines which new detections may belong to already confirmed tracks using a group of techniques called data association, for example, PDA, IPDA, JPDA, or JIPDA methods. Those detections are used to update the track's quantity estimates and its existence probability. New tracks are created from the remaining detections. Once a track has been detected often enough, it is considered a confirmed object.
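As a simplified illustration of data association, the following Python sketch gates detections against predicted tracks with a Mahalanobis distance test and performs a hard nearest-neighbor assignment. PDA- and JIPDA-style methods replace this hard assignment with probabilistic weights; the function name and gate threshold are illustrative assumptions.

```python
import numpy as np

def gate_and_associate(tracks, detections, gate_threshold=9.21):
    """Assign each detection to the closest track inside a chi-square gate.

    tracks: list of (x_pred, S) with predicted position x_pred (2,) and
            innovation covariance S (2x2).
    detections: list of measured positions (2,).
    Returns (assignments, unassigned): assignments maps detection index
    to track index; unassigned detections are candidates for new tracks.
    """
    assignments = {}
    unassigned = []
    for d_idx, z in enumerate(detections):
        best_track, best_dist = None, gate_threshold
        for t_idx, (x_pred, S) in enumerate(tracks):
            y = z - x_pred
            # Squared Mahalanobis distance; 9.21 is the 99% chi-square
            # quantile for 2 degrees of freedom.
            d2 = float(y @ np.linalg.inv(S) @ y)
            if d2 < best_dist:
                best_track, best_dist = t_idx, d2
        if best_track is None:
            unassigned.append(d_idx)   # seeds a new, unconfirmed track
        else:
            assignments[d_idx] = best_track
    return assignments, unassigned
```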
Grid fusion combines data from various sensors at a low level – the cell level. In contrast to object fusion, data with high resolution can be processed, including camera images with pixel-level class annotations as provided by semantic segmentation, HD radar data, and LiDAR point cloud detections. For each cell, the algorithm determines whether an object occupies this cell or whether it represents free space, including confidence and conflict metrics. More sophisticated approaches like dynamic grid fusion also determine the velocity and driving direction of object parts that occupy cells.
Due to false positive, false negative, and noisy sensor measurements, all grid or grid map cells are initially set to an unknown state. When a detection falls into a cell, its occupancy probability is increased. Once enough detections have been associated with a cell, it is considered occupied. Dynamic grid fusion adds so-called particles to each grid cell to estimate the velocity and driving direction of the occupying object using Monte Carlo methods like particle filtering. It also uses further sensor data, such as classification information, to improve the robustness of the estimation.
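The following minimal Python sketch illustrates the per-cell occupancy update in log-odds form: cells start at the unknown state (probability 0.5), and each detection falling into a cell increases its occupancy probability. The inverse sensor model value and grid parameters are illustrative assumptions; free-space updates along the sensor ray and the particle-based velocity estimation of dynamic grid fusion are omitted for brevity.

```python
import numpy as np

def logodds(p):
    """Convert a probability to log-odds representation."""
    return np.log(p / (1.0 - p))

class OccupancyGrid:
    def __init__(self, width_cells, height_cells, cell_size_m=0.2):
        self.cell_size_m = cell_size_m
        # 0.0 in log-odds corresponds to p = 0.5, i.e. "unknown".
        self.logodds_map = np.zeros((height_cells, width_cells))

    def update_with_detections(self, detections_xy_m, p_hit=0.7):
        """Increase occupancy for cells that contain detections,
        using an assumed inverse sensor model value p_hit."""
        hit = logodds(p_hit)
        for x_m, y_m in detections_xy_m:
            col = int(x_m / self.cell_size_m)
            row = int(y_m / self.cell_size_m)
            if 0 <= row < self.logodds_map.shape[0] and \
               0 <= col < self.logodds_map.shape[1]:
                self.logodds_map[row, col] += hit

    def occupancy_probability(self):
        # Convert log-odds back to per-cell probabilities.
        return 1.0 - 1.0 / (1.0 + np.exp(self.logodds_map))

# Example: two nearby lidar hits repeatedly raise the same cell's probability.
grid = OccupancyGrid(100, 100)
grid.update_with_detections([(4.1, 2.0), (4.15, 2.05)])
```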