embedded - Performance differences between evaluation boards
Our company is the proud owner of an STM32F4 evaluation board (Cortex-M4F), and we have now received a second evaluation board (an ARM7TDMI board).
Before starting the migration to the ARM7 evaluation board, we want to know whether the hardware is strong enough for us, so that we won't waste any time discovering it later.
Our project utilizes many DSP algorithms (that take advantage of the FPU), makes heavy use of SDIO, and needs around 1 megabyte of memory.
So we are thinking of running the following tests on both evaluation boards and seeing the performance differences between them:
Math: addition, subtraction, division, multiplication, abs, sqrtf, run in a loop (floating-point numbers will be used).
SDIO: read/write a 2 kilobyte buffer in a loop.
Memory: read/write external and internal RAM in a loop.
In your opinion, will the results give an indication of the performance differences, and of what to expect in the "real" project?
Thanks, Michael
I would advise against a new design based on ARM7; it is a legacy ARM architecture. You should check the vendor's part status and planned obsolescence for the part you intend to design in. No vendor is releasing new designs based on ARM7.
I suggest that for DSP algorithms, the DSP features of the Cortex-M4 are more important than floating point. The ARM Cortex-M CMSIS includes a DSP library that takes advantage of these. Either way, fixed-point DSP algorithms will be far more efficient than using floating point.
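To make the fixed-point point concrete, here is a minimal sketch of Q15 arithmetic in plain C (a signed 16-bit value representing raw / 32768). The `q15` type and the function names are hypothetical and written for this example only; this is not the CMSIS DSP library API, just an illustration of the kind of integer-only arithmetic such a library builds on.

```c
#include <stddef.h>
#include <stdint.h>

typedef int16_t q15;   /* hypothetical alias: real value = raw / 32768 */

/* Saturate a wide intermediate result to the Q15 range. */
static q15 q15_saturate(int64_t v)
{
    if (v > 32767) {
        return 32767;
    }
    if (v < -32768) {
        return -32768;
    }
    return (q15)v;
}

/* Multiply two Q15 values: the 32-bit product is Q30, so round and
 * shift right by 15 to get back to Q15. Right-shifting a negative
 * value is assumed to be arithmetic, as on common ARM compilers. */
q15 q15_mul(q15 a, q15 b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return q15_saturate(((int64_t)product + (1 << 14)) >> 15);
}

/* Dot product with a 64-bit accumulator, rescaled once at the end. */
q15 q15_dot(const q15 *x, const q15 *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)x[i] * (int32_t)y[i];
    }
    return q15_saturate((acc + (1 << 14)) >> 15);
}
```

For example, `q15_mul(16384, 16384)` multiplies 0.5 by 0.5 and yields 8192, which is 0.25 in Q15. Only integer multiply, add, and shift are involved, so it runs without any FPU.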
Cortex-M is a far more efficient design than ARM7, achieving 1.2 DMIPS per MHz compared with less than 1.0 DMIPS per MHz. Coupled with DSP instructions, floating point, and separate buses for on-chip flash, RAM, and peripherals, this will make your code faster on Cortex-M.
The Cortex-M architecture defines the SysTick timer and the interrupt controller, whereas on ARM7 these are defined by the chip vendor and vary between vendors, making porting of code between them more difficult.
STM32F4xx parts run at up to 180 MHz; ARM7 parts typically run at 60 MHz or less.
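As a rough back-of-envelope check, the figures above can be combined as in the sketch below. The function names are hypothetical, and the ARM7 inputs are the upper bounds quoted above, so the resulting ratio is a lower bound on the integer-only gap; it says nothing about floating point or DSP instructions.

```c
/* Estimated Dhrystone throughput from DMIPS-per-MHz and clock speed. */
double estimated_dmips(double dmips_per_mhz, double clock_mhz)
{
    return dmips_per_mhz * clock_mhz;
}

/* Ratio of two estimates; returns 0.0 if the baseline is not positive. */
double estimated_speedup(double candidate_dmips, double baseline_dmips)
{
    if (baseline_dmips <= 0.0) {
        return 0.0;
    }
    return candidate_dmips / baseline_dmips;
}
```

With 1.2 DMIPS/MHz at 180 MHz the Cortex-M4 estimate is 216 DMIPS, against at most 60 DMIPS for an ARM7 at 1.0 DMIPS/MHz and 60 MHz, so the Cortex-M4 part comes out at least 3.6 times faster on integer code alone.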
Performing a comparison using floating point is pointless. Floating-point hardware will outperform the software floating point necessary on ARM7 by a factor of 5 to 10 at least. Unless your application can cope with that drop in performance, it is unsuited to ARM7. However, many applications do not need floating point. Integer or fixed-point algorithms can run around 5 times faster than software floating point, and can compete with hardware floating point. Remember that the Cortex-M4 FPU is single precision only.
It would be more reasonable to compare Cortex-M3 with Cortex-M4 to test the sensitivity of your application to the lack of hardware FP and DSP support.
SDIO performance will be limited by the SDIO interface and the SD card (which vary in performance even at the same "speed rating"). The load imposed on the processor will be low, or it will spend most of its time waiting for data if your application busy-waits rather than doing something useful while waiting on the SD card. Use of DMA transfers can make the CPU load more or less negligible.
The following diagram (not included here) illustrates how ARM7 is positioned compared with Cortex-M4. The latter has both higher performance and greater capability. At the same clock frequency, Cortex-M4 sits between ARM9 and ARM11 on the performance scale.
I do not think you need to perform benchmark tests comparing ARM7 and Cortex-M4, since broad performance figures are available. Perhaps you should instead measure the CPU load of your existing application on your current platform. If it is low (perhaps < 20%) and the processor spends most of its time idle, then ARM7 might be feasible. Of course, if your application is not running on an RTOS or a scheduler with an idle task, measuring true CPU load might be difficult.
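One common way to estimate CPU load is an idle counter. The sketch below assumes the idle task (or idle hook) increments a counter, and that the count per measurement window has been calibrated once with no application load. The function name and both parameters are hypothetical, chosen for this example.

```c
#include <stdint.h>

/* Estimate CPU load in percent over one measurement window.
 *   idle_count:          idle-loop iterations seen in this window
 *   idle_count_unloaded: iterations seen in an equal window with no
 *                        application work (the calibration value)
 * Returns 100 if no calibration value is available. */
unsigned cpu_load_percent(uint32_t idle_count, uint32_t idle_count_unloaded)
{
    if (idle_count_unloaded == 0u) {
        return 100u;   /* cannot tell: assume fully loaded */
    }
    if (idle_count >= idle_count_unloaded) {
        return 0u;     /* at least as idle as the calibration run */
    }
    /* 64-bit intermediate avoids overflow in the multiplication */
    uint64_t idle_percent = ((uint64_t)idle_count * 100u) / idle_count_unloaded;
    return 100u - (unsigned)idle_percent;
}
```

For instance, if the calibration run counted 1000 idle iterations per window and the loaded application leaves 800, the estimated load is 20%, right at the threshold suggested above.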