Title
Loop transformations leveraging hardware prefetching
Author
Sioutas, S.
Stuijk, S.
Corporaal, H.
Basten, A.A.
Somers, L.
Publication year
2018
Abstract
Memory-bound applications depend heavily on the bandwidth of the system to achieve high performance. Improving temporal and/or spatial locality through loop transformations is a common way of mitigating this dependency. However, choosing the right combination of optimizations is not trivial, because most of them alter the memory access pattern of the application and consequently interfere with the efficiency of the hardware prefetching mechanisms present in modern architectures. We propose an optimization algorithm that analytically classifies an algorithmic description of a loop nest to decide whether it should be optimized for temporal or spatial locality, while also taking hardware prefetching into account. We implement our technique as a tool for use with the Halide compiler and test it on a variety of benchmarks. We find an average performance improvement of over 40% compared to previous analytical models targeting the Halide language and compiler. © 2018 Association for Computing Machinery. ACM SIGMICRO; ACM SIGPLAN; IEEE Computer Society
Subject
Industrial Innovation
Compiler optimizations
Halide
Loop optimizations
Hardware
Memory architecture
Combination of optimizations
Memory access patterns
Memory-bound applications
Optimization algorithms
Performance improvements
Program compilers
To reference this document use:
http://resolver.tudelft.nl/uuid:4cfadaae-70f4-49a0-87ba-1a2d1a75b910
TNO identifier
842631
Publisher
Association for Computing Machinery, Inc
ISBN
9781450356176
Source
CGO 2018 - Proceedings of the 16th International Symposium on Code Generation and Optimization, 24-28 February 2018, pp. 254-264
Document type
conference paper