Evaluating MMX technology using DSP and multimedia applications
Proceedings. 31st Annual ACM/IEEE International Symposium on …, 1998•ieeexplore.ieee.org
Many current general-purpose processors use extensions to the instruction set architecture to enhance the performance of digital signal processing (DSP) and multimedia applications. In this paper, we evaluate the x86 architecture's multimedia extension (MMX) instruction set on a set of benchmarks. Our benchmark suite includes kernels (filtering, fast Fourier transforms, and vector arithmetic) and applications (JPEG compression, Doppler radar processing, imaging, and G.722 speech encoding). Each benchmark has at least one non-MMX version in C and an MMX version that makes calls to an MMX assembly library. The versions differ in the implementation of filtering, vector arithmetic, and other relevant kernels. The observed speedup for the MMX versions of the suite ranges from less than 1.0 (that is, a slowdown) to 6.1. In addition to quantifying the speedup, we perform detailed instruction-level profiling using Intel's VTune profiling tool. Using VTune, we profile static and dynamic instructions, microarchitecture operations, and data references to isolate the specific reasons for speedup or lack thereof. This analysis allows one to understand which aspects of native signal processing instruction sets are most useful, what their current limitations are, and how they can be used most efficiently.
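To make the contrast between a non-MMX C kernel and a packed implementation concrete, the following is a minimal sketch of a vector arithmetic kernel (saturating addition of 16-bit samples). It is not taken from the paper or from its MMX assembly library: the function names `saturate_i16`, `vec_add_sat_scalar`, and `vec_add_sat_packed4` are hypothetical, and the "packed" variant is plain portable C that only mimics the shape of the work an MMX packed saturating add (PADDSW) performs on four 16-bit words held in one 64-bit register.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate result to the signed 16-bit range. */
static int16_t saturate_i16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Scalar reference kernel: one 16-bit element per loop iteration. */
void vec_add_sat_scalar(const int16_t *a, const int16_t *b,
                        int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = saturate_i16((int32_t)a[i] + (int32_t)b[i]);
}

/* Illustrative "packed" kernel (hypothetical, not the paper's code):
 * handles four 16-bit lanes per outer step, the amount of data a single
 * 64-bit MMX register holds. A real MMX version would replace the inner
 * lane loop with one packed saturating add instruction. */
void vec_add_sat_packed4(const int16_t *a, const int16_t *b,
                         int16_t *out, size_t n)
{
    size_t i = 0;

    /* Main loop: full groups of four lanes. */
    for (; i + 4 <= n; i += 4) {
        for (size_t lane = 0; lane < 4; lane++)
            out[i + lane] =
                saturate_i16((int32_t)a[i + lane] + (int32_t)b[i + lane]);
    }

    /* Tail: leftover elements when n is not a multiple of four. */
    for (; i < n; i++)
        out[i] = saturate_i16((int32_t)a[i] + (int32_t)b[i]);
}
```

Both functions produce identical output. The sketch only shows where the per-element work collapses into one instruction in a packed implementation, and why leftover elements and data layout can eat into the gain.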