The existing implementation of the OpenACC "kernels" construct in GCC cannot cope with many language constructs found in real HPC codes, which generally leads to very poor performance. This talk presents upcoming changes to the "kernels" implementation that improve performance significantly:
- A more unified internal representation of "kernels" and "parallel" regions as a foundation for the other improvements.
- Data-dependence analysis based on Graphite.
- Improvements to Graphite (e.g. runtime alias checking) to enable its use on more code.
- Language front-end changes (e.g. delinearization of array accesses for Fortran) and middle-end changes (e.g. an "omp_data_optimize" pass that derives synthetic OpenACC "private" clauses on "kernels" regions), which enable Graphite to analyze more code.
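
To make two of the points above concrete, here is a minimal hand-written sketch in C. It is not GCC's implementation: the helper names (ranges_overlap, scale_shift, scale_shift_checked) are hypothetical and only illustrate the ideas. The loop in scale_shift uses a scalar temporary "t" that every iteration overwrites, the kind of variable for which a synthetic "private" clause could be derived. scale_shift_checked imitates by hand what a runtime alias check does: it tests whether the input and output ranges overlap and only takes the "kernels" version when they do not. Compiled without OpenACC support, the pragma is ignored and the code runs sequentially.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-element float ranges starting
   at a and b overlap in memory, 0 otherwise. */
static int ranges_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa < pb + bytes && pb < pa + bytes;
}

/* Computes y[i] = a * x[i] + 1 inside an OpenACC "kernels" region.
   Assumes x and y do not overlap. */
static void scale_shift(size_t n, float a, const float *x, float *y)
{
    float t; /* overwritten in every iteration: a privatization candidate */
    #pragma acc kernels copyin(x[0:n]) copyout(y[0:n])
    for (size_t i = 0; i < n; i++) {
        t = a * x[i];
        y[i] = t + 1.0f;
    }
}

/* Hand-written stand-in for a runtime alias check: returns 1 if the
   "kernels" version was used, 0 if the sequential fallback ran. */
static int scale_shift_checked(size_t n, float a, const float *x, float *y)
{
    if (ranges_overlap(x, y, n)) {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + 1.0f;
        return 0;
    }
    scale_shift(n, a, x, y);
    return 1;
}
```

With GCC, a file like this would typically be built with -fopenacc to enable the pragma. Whether the loop is actually parallelized depends on the compiler's analysis, which is the subject of the changes listed above.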