Is DMIPS still of relevance today? Predicting the runtime of software for sensor fusion
When planning and partitioning future hardware and yet-to-be-developed AD software, DMIPS are often used to specify hardware-independent compute budgets for software blocks. DMIPS (Dhrystone MIPS) is the number of iterations per second of the synthetic Dhrystone benchmark, normalized to the score of a 1970s reference machine nominally rated at 1 MIPS, the VAX 11/780. These 'DMIPS budgets' are then compared with the DMIPS figures of specific processor cores and allocated to processes.
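As a minimal sketch of this budgeting idea: the normalization constant of 1757 Dhrystones per second is the commonly cited VAX 11/780 score, while the block names, budget values, and core rating below are purely hypothetical and chosen only for illustration.

```python
# Commonly cited Dhrystone score of the VAX 11/780, the nominal 1 MIPS machine.
VAX_DHRYSTONES_PER_SECOND = 1757.0


def dmips(dhrystones_per_second: float) -> float:
    """Convert a raw Dhrystone score (iterations per second) to DMIPS."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


def remaining_headroom(core_dmips: float, budgets: dict) -> float:
    """Return the DMIPS left on a core after subtracting all block budgets.

    A negative result means the budgets exceed the core's DMIPS rating.
    """
    return core_dmips - sum(budgets.values())


if __name__ == "__main__":
    # HYPOTHETICAL software blocks and budgets, not taken from the article.
    budgets = {"sensor_fusion": 400.0, "planning": 350.0}
    # HYPOTHETICAL core rating in DMIPS.
    headroom = remaining_headroom(1000.0, budgets)
    print(f"Remaining headroom: {headroom:.0f} DMIPS")
```

Whether such a budget holds on real hardware is exactly the question the rest of this article looks at.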
As someone who is interested not only in sensor fusion and the software layer but also in the underlying HSI and hardware, I have always wondered how well this 'DMIPS translation' works for our sensor fusion code. My curiosity stems from the fact that Dhrystone is an integer benchmark, whereas sensor fusion code usually relies heavily on floating-point arithmetic. Also, even when processors share the same instruction set, the microarchitectures of their cores may differ greatly in complexity, yet all of that is boiled down to a single DMIPS figure. Finally, I wanted to know whether the derisive expansion of MIPS, 'Meaningless Information about Processor Speed', applies to our binaries.
Over the last few days, I had the chance to answer this question. One of our teams at BASELABS develops a library for sensor data fusion, BASELABS Create Embedded. While benchmarking the runtime of different sensor configurations (1 radar + 1 camera, 3 radars + 1 camera, 5 radars + 1 camera), we executed these configurations on various hardware. We did not hand-optimize our C code and only enabled the compilers' optimizations. Our team measured the runtime of the sensor fusion on the following cores: an ARM Cortex-A53, an ARM Cortex-A72, a 6th generation Intel Core i7, a TriCore v1.6P in an Infineon AURIX TC277 and in a TC397, and a TriCore v1.6E in the TC277.
I multiplied the measured runtimes by the publicly available DMIPS figures of each core, which gives the DMIPS equivalent of the three sensor fusion configurations on each core ('measured DMIPS'). As the true DMIPS equivalent is not known, I used least squares to estimate a reference value ('reference DMIPS') for each sensor fusion configuration from the measured DMIPS. Finally, I determined the ratio between the 'measured DMIPS' and the 'reference DMIPS' for each combination. A ratio close to 1 means that the DMIPS figure predicts the runtime on that core well. Looking at the numbers, my conclusion is that DMIPS figures can be used to get an idea of how the runtimes of our sensor fusion algorithms translate between different microarchitectures.
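The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the original analysis: all core ratings and runtimes are hypothetical placeholders, and the least-squares model is assumed to be the simplest possible one, a single constant per configuration, for which the least-squares estimate is the arithmetic mean.

```python
from statistics import fmean

# HYPOTHETICAL core ratings in DMIPS (DMIPS/MHz times clock in MHz).
CORE_DMIPS = {"core_a": 2000.0, "core_b": 4000.0, "core_c": 500.0}

# HYPOTHETICAL runtimes in seconds per fusion cycle, per configuration.
RUNTIME_S = {
    "1 radar + 1 camera": {"core_a": 0.0010, "core_b": 0.0006, "core_c": 0.0036},
    "3 radars + 1 camera": {"core_a": 0.0021, "core_b": 0.0011, "core_c": 0.0080},
}


def measured_dmips(runtime_s: dict, core_dmips: dict) -> dict:
    """Step 1: runtime times the core's DMIPS rating, per core.

    The product follows the article's 'measured DMIPS' terminology.
    """
    return {core: runtime_s[core] * core_dmips[core] for core in runtime_s}


def reference_dmips(measured: dict) -> float:
    """Step 2: least-squares fit of one constant r to the measured values.

    Minimizing sum((m_i - r)**2) over r yields the arithmetic mean.
    This constant-only model is an ASSUMPTION of this sketch.
    """
    return fmean(measured.values())


def ratios(measured: dict) -> dict:
    """Step 3: ratio of measured to reference DMIPS, per core."""
    reference = reference_dmips(measured)
    return {core: value / reference for core, value in measured.items()}


if __name__ == "__main__":
    for config, runtimes in RUNTIME_S.items():
        measured = measured_dmips(runtimes, CORE_DMIPS)
        for core, ratio in ratios(measured).items():
            print(f"{config:>20} {core}: ratio {ratio:.2f}")
```

With a constant-only model the ratios of one configuration average to 1 by construction, so the spread around 1 is the quantity of interest.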
Some concluding remarks: Based on the publicly available DMIPS/MHz information for the different cores, I used the following assumptions for my calculations: Cortex-A53: 2.3 DMIPS/MHz; Cortex-A72: 4.2 DMIPS/MHz; TC277 TriCore v1.6E: 1.4 DMIPS/MHz; TC277 TriCore v1.6P: 1.6 DMIPS/MHz; TC397 TriCore v1.6P: 2.17 DMIPS/MHz; 6th gen Core i7: 9 DMIPS/MHz (most uncertain). Furthermore, the purpose of the whole exercise was to assess how well DMIPS translate sensor fusion runtimes between the different microarchitectures and cores, not to judge the relative performance of the hardware. From my point of view, such a judgment is not possible based on the numbers above, as aspects other than the published figures would be relevant. Also, hardware-specific optimization may change the numbers considerably.
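To illustrate how these per-MHz assumptions turn into absolute DMIPS figures and a rough runtime translation, here is a small sketch. The DMIPS/MHz values are the ones listed above; the clock frequencies and the source runtime in the example are hypothetical, and the helper names `core_dmips` and `translate_runtime` are made up for this illustration.

```python
# DMIPS/MHz assumptions as listed in the article.
DMIPS_PER_MHZ = {
    "Cortex-A53": 2.3,
    "Cortex-A72": 4.2,
    "TC277 TriCore v1.6E": 1.4,
    "TC277 TriCore v1.6P": 1.6,
    "TC397 TriCore v1.6P": 2.17,
    "6th gen Core i7": 9.0,  # flagged in the article as most uncertain
}


def core_dmips(core: str, clock_mhz: float) -> float:
    """Absolute DMIPS of a core running at the given clock frequency."""
    return DMIPS_PER_MHZ[core] * clock_mhz


def translate_runtime(runtime_s: float, source_dmips: float,
                      target_dmips: float) -> float:
    """Estimate the runtime on a target core from a runtime on a source core.

    Assumes runtime scales inversely with DMIPS, which is only a first
    approximation.
    """
    return runtime_s * source_dmips / target_dmips


if __name__ == "__main__":
    # HYPOTHETICAL clock frequencies and runtime, chosen for illustration only.
    a53 = core_dmips("Cortex-A53", 1200.0)
    a72 = core_dmips("Cortex-A72", 1500.0)
    estimate = translate_runtime(0.002, a53, a72)
    print(f"A53: {a53:.0f} DMIPS, A72: {a72:.0f} DMIPS")
    print(f"Estimated A72 runtime: {estimate * 1e3:.3f} ms")
```

Such an estimate inherits every uncertainty in the published DMIPS/MHz figures, so it is best read as an order-of-magnitude guide.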