The existing implementation of the OpenACC "kernels" construct in GCC cannot cope with many language constructs found in real HPC codes, which generally leads to very poor performance. This talk presents upcoming changes to the "kernels" implementation that improve performance significantly:
- A more unified internal representation of "kernels" and "parallel"
regions as a foundation for the other improvements.
- Data-dependence analysis based on Graphite.
- Improvements to Graphite (e.g. runtime alias checking) to enable its
use on more code.
- Language front-end changes (e.g. delinearization of array accesses for
Fortran) and middle-end changes (e.g. an "omp_data_optimize" pass to
derive synthetic OpenACC "private" clauses on "kernels") that enable
Graphite to analyze more code.
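
As a rough illustration of the kind of code these changes target, the sketch below shows a "kernels" region in C with two of the obstacles named above. The function name "scale_add" and its parameters are hypothetical and do not come from the talk. The scalar "t" is declared outside the region, so the loop is only independent if each iteration gets its own copy, which is what a "private" clause (written by hand or derived by the compiler) expresses. The pointers "a", "b" and "c" may alias as far as the compiler can tell, which is the situation runtime alias checking addresses. Whether a given GCC version actually parallelizes this loop is not claimed here.

```c
/* Hypothetical example: names and data clauses are illustrative only. */
void scale_add(int n, const float *a, const float *b, float *c)
{
    /* Scalar temporary declared outside the region. It is written before
       it is read in every iteration, so it is a candidate for "private". */
    float t;

#pragma acc kernels copyin(a[0:n], b[0:n]) copyout(c[0:n])
    {
        /* The compiler must prove that the iterations are independent.
           Without further information, c could overlap a or b. */
        for (int i = 0; i < n; i++) {
            t = 2.0f * a[i];
            c[i] = t + b[i];
        }
    }
}
```

Compiled without OpenACC support the pragma is ignored and the loop runs serially with the same result, so the function can be checked on the host before trying it with "-fopenacc".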