Next-Generation Sensor Fusion for Next-Generation Sensors and Driving Functions
Typical driver assistance systems or automated driving functions consist of several components: one or multiple sensors, the sensor fusion, the driving function, and the actual vehicle control.
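To make this component chain concrete, here is a minimal sketch of one processing cycle in Python; the sensor, fusion, planning, and control objects and their methods are hypothetical placeholders, not an actual product API.

```python
# Minimal sketch of one processing cycle of the chain described above.
# All objects and method names are illustrative placeholders.
def driving_cycle(sensors, sensor_fusion, driving_function, vehicle_control):
    measurements = [sensor.read() for sensor in sensors]      # 1. sense
    environment = sensor_fusion.update(measurements)          # 2. fuse
    trajectory = driving_function.plan(environment)           # 3. decide
    vehicle_control.execute(trajectory)                       # 4. act
```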
Current-generation ADAS like AEB, ACC, and lane keeping operate in well-structured environments and only need to be aware of a limited set of similar object types in a limited number of scenarios. For this, low-resolution sensors are used:
- Low-resolution radar: These radar sensors provide detections, where each detection consists of a 2D position and the Doppler velocity. At most two to three detections are provided per detected object.
- Low-resolution camera: Independent of the pixel resolution of the internal camera sensor, these sensors provide bounding boxes of detected objects, either in the world frame or the image frame. Often, only a single detection per object is provided (see the sketch after this list).
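As an illustration, the outputs of such low-resolution sensors can be thought of as the following data types; the field names are assumptions for this sketch, not a specific vendor interface.

```python
# Illustrative data types for low-resolution sensor outputs.
from dataclasses import dataclass

@dataclass
class RadarDetection:
    x: float            # 2D position, longitudinal [m]
    y: float            # 2D position, lateral [m]
    doppler: float      # radial (Doppler) velocity [m/s]

@dataclass
class CameraObject:
    x: float            # bounding-box center, world or image frame
    y: float
    width: float
    height: float
    class_id: int       # e.g. car, truck, pedestrian

# A low-resolution radar typically yields only 2-3 RadarDetections per object;
# a low-resolution camera often yields a single CameraObject per object.
```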
To use the data from multiple sensors and extract dynamic objects, current ADAS apply so-called Dynamic Object Fusion approaches with the following general properties:
- They are based on Kalman filtering, which can run on comparatively small CPUs (see the sketch after this list).
- They require low-resolution sensor inputs and thus are a natural fit for the sensors mentioned above.
- They require all object types to be known at design time. There is no support for unknown object types.
- They are well suited for long-distance tracking of dynamic objects. However, they offer only limited performance for close-range and extended objects.
- They support neither the static environment nor free space.
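For illustration, the following is a minimal sketch of the kind of linear Kalman filter such a dynamic object fusion builds on, assuming a constant-velocity motion model and a position-only measurement; all matrices and noise values are illustrative. Note that each measurement is treated as a single point, which is the point-source assumption discussed further below.

```python
# Minimal linear Kalman filter sketch (constant-velocity model,
# position-only measurement). Values are illustrative.
import numpy as np

dt = 0.05                                   # cycle time [s]
F = np.array([[1, 0, dt, 0],                # state transition for [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
H = np.array([[1, 0, 0, 0],                 # only the position is measured
              [0, 1, 0, 0]])
Q = np.eye(4) * 0.1                         # process noise
R = np.eye(2) * 0.5                         # measurement noise

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P
```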
Next-Generation Sensors and Driving Functions
Let’s have a look at next-generation radar sensors. Just like the current ones, they provide detections consisting of a 2D position plus the Doppler velocity.
However, the big difference is the number of detections, which is many times higher than with current sensors. As a result, radar sensors are now also able to image objects in greater detail.
While current-generation cameras provide detections of objects, next-generation cameras often perform so-called semantic annotation.
For each pixel of the image, the class of the underlying object is determined and provided. Again, much more detail of the scene and the objects it contains becomes available to the sensor fusion.
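The contrast between the two camera interfaces can be sketched as follows; the image size, class IDs, and pixel regions are made up purely for illustration.

```python
# Bounding boxes (current generation) vs. a dense per-pixel class map
# (next generation). All values are illustrative.
import numpy as np

H, W = 480, 640
CLASSES = {0: "road", 1: "vehicle", 2: "pedestrian", 3: "background"}

# Current generation: a short list of object boxes (x, y, w, h, class_id).
boxes = [(320, 260, 80, 60, 1), (120, 300, 20, 50, 2)]

# Next generation: one class label per pixel of the image.
semantic_map = np.full((H, W), 3, dtype=np.uint8)    # everything "background"
semantic_map[300:480, :] = 0                         # lower image region: road
semantic_map[260:320, 320:400] = 1                   # a vehicle
# Far more scene detail becomes available to the fusion, e.g. road extent.
```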
Next-generation driving functions come with new requirements and challenges. While current driving functions often operate under highway conditions, more and more driving functions are being developed for urban scenarios.
These urban scenarios include more traffic participant types like pedestrians, bicycles, and wheelchairs. Additionally, dynamic objects that are not known at design time are more likely in these scenarios and must be detected. From these driving function requirements, the following needs for the environmental model can be derived:
- Object extensions: The environment model shall provide the spatial extension of dynamic objects, e.g., for proper lane association.
- Object motion prediction: For each dynamic object, its kinematics should be determined so that its motion can be predicted to a certain extent.
- Static objects: In addition to dynamic objects, static objects should be provided.
- Free space: As soon as the driving function includes automated steering and path planning, explicit free space information is required.
- Unknown object types: Modeling all potentially moving object types upfront is an enormous, if not unsolvable, challenge. Thus, unknown dynamic objects need to be detected and predicted as well (see the sketch after this list).
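A hypothetical environment-model interface covering these needs might look like the sketch below; the class and field names are assumptions, not a concrete product interface.

```python
# Hypothetical environment-model interface reflecting the needs listed above.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DynamicObject:
    position: tuple             # (x, y) [m]
    extent: tuple               # (length, width) [m]  -> lane association
    velocity: tuple             # (vx, vy) [m/s]       -> motion prediction
    class_id: Optional[int]     # None for unknown object types

    def predict(self, dt: float) -> tuple:
        """Constant-velocity prediction of the position dt seconds ahead."""
        return (self.position[0] + self.velocity[0] * dt,
                self.position[1] + self.velocity[1] * dt)

@dataclass
class EnvironmentModel:
    dynamic_objects: list = field(default_factory=list)     # incl. unknown types
    static_objects: list = field(default_factory=list)
    free_space_polygon: list = field(default_factory=list)  # explicit free space
```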
Limitations of Current Sensor Fusion Approaches
Current-generation sensor fusion approaches often combine dynamic object fusion with a static occupancy grid. Dynamic object fusion and its Kalman filter approaches have limited performance at small distances and for extended objects, mainly due to the often heavily violated point-source assumption underlying these algorithms. They require all object types to be known in advance and thus carry an inherent risk of missing objects of unmodeled types.
Furthermore, when used with high-resolution sensors, the data must be clustered before it can be processed by Kalman filters. As this clustering is done at the detection level, clustering errors occur frequently and can hardly be corrected by later processing steps. These clustering errors then result in poor object fusion performance.
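To illustrate, a typical detection-level clustering step might look like the following sketch; DBSCAN is used here purely as an example algorithm, and all parameters and data are illustrative.

```python
# Detection-level clustering of high-resolution radar data before Kalman
# filtering: each cluster becomes one object "measurement".
import numpy as np
from sklearn.cluster import DBSCAN

detections = np.array([[10.2, 1.1], [10.5, 1.3], [10.4, 0.9],   # vehicle A
                       [25.0, -3.2], [25.3, -3.0]])             # vehicle B

labels = DBSCAN(eps=1.0, min_samples=2).fit(detections).labels_

# One centroid measurement per cluster is passed to the object trackers.
# An early clustering error (split or merged objects) cannot be undone later.
measurements = [detections[labels == k].mean(axis=0)
                for k in set(labels) if k != -1]    # -1 marks noise in DBSCAN
```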
Static occupancy grid methods provide static objects as well as free space. While this is sufficient from an interface perspective, static grids also accumulate dynamic objects, which generate the typical incorrect tails and result in false negatives for free space.
To circumvent this effect, measurements from dynamic objects are often excluded before the measurements are entered into the grid. To determine which measurements stem from dynamic objects, the association from the dynamic object fusion is used.
As the quality of the association heavily depends on the clustering mentioned above, errors are propagated from the clustering through the object fusion to the static occupancy grid.
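A simplified sketch of this pattern is shown below: the grid update skips measurements that the object fusion has associated with dynamic objects, so any wrong association directly propagates into the grid. The grid layout, log-odds increment, and association flag are assumptions for illustration.

```python
# Static occupancy grid update that excludes dynamic-object measurements.
import numpy as np

CELL = 0.2                                  # cell size [m]
log_odds = np.zeros((200, 200))             # 40 m x 40 m grid around the ego

def update_grid(measurements, associations):
    """measurements: list of (x, y); associations: dynamic-object flags."""
    for (x, y), is_dynamic in zip(measurements, associations):
        if is_dynamic:                      # exclude dynamic-object returns;
            continue                        # a wrong flag propagates here
        i, j = int(x / CELL) + 100, int(y / CELL) + 100
        if 0 <= i < 200 and 0 <= j < 200:
            log_odds[i, j] += 0.4           # evidence for "occupied"
```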
Built-In Consistency: Objects and Free Space Based on a Single Source
Integrated sensor fusion approaches solve this chicken-and-egg problem by jointly estimating static and dynamic objects as well as free space.
The Dynamic Occupancy Grid, or Dynamic Grid for short, is an integrated low-level sensor fusion approach that does not rely on a clustering step as described above. Thus, it does not suffer from error propagation caused by early decisions.
Instead, the distinction between static and dynamic objects is based on more information than an early clustering decision, resulting in better performance. By design, there are no conflicts between dynamic traffic participants and the static environment, as both are derived from the same model at low latency.
The grid-based approach also supports unknown dynamic object types, which further improves the applicability in urban environments. If a sensor provides object classification such as semantic segmentation information from a camera, this information can be added to the grid.
This improves the robustness of object extraction, provides detailed class information for traffic participants, and can be used to classify free space, e.g., for automated parking applications.
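As a rough illustration of why this works, a dynamic grid cell can be thought of as carrying occupancy evidence, a velocity distribution, and optional class information at the same time; the sketch below uses assumed field names and a simplified particle-style velocity representation, not the actual Dynamic Grid implementation.

```python
# Simplified sketch of a dynamic grid cell: occupancy plus velocity
# hypotheses plus optional class information, all in one place.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class VelocityHypothesis:
    vx: float
    vy: float
    weight: float

@dataclass
class DynamicGridCell:
    occupied: float = 0.0                   # occupancy evidence
    free: float = 0.0                       # free-space evidence
    hypotheses: list = field(default_factory=list)  # velocity distribution
    class_label: Optional[int] = None       # e.g. from camera segmentation

    def is_dynamic(self, v_min: float = 0.5) -> bool:
        """Cell counts as dynamic if most weight lies above v_min [m/s]."""
        moving = sum(h.weight for h in self.hypotheses
                     if (h.vx ** 2 + h.vy ** 2) ** 0.5 > v_min)
        total = sum(h.weight for h in self.hypotheses) or 1.0
        return moving / total > 0.5
```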
To summarize, classic sensor fusion approaches like dynamic object fusion and static occupancy grids have limited performance and applicability for high-resolution sensors and next-generation driving functions.
Low-level sensor fusion approaches like the dynamic grid resolve these limitations by directly processing the high-resolution sensor data in an integrated fashion, providing a consistent environmental model consisting of dynamic and static objects as well as free space plus class information.
Check out this article as a presentation by Marcel Markgraf.