Deep Learning (DL) algorithms are evolving rapidly, with new techniques and architectures proposed on a regular basis. This poses a significant challenge for hardware design, because these algorithms often require specialized hardware accelerators for efficient execution. However, the design cycle for such accelerators is complex and time-consuming: it takes significant effort to master the algorithm and implement an appropriate hardware architecture. As new DL algorithms emerge, existing accelerators may become obsolete or unable to integrate the latest optimizations, which leaves a significant gap between newly emerging algorithms and available hardware accelerators. To address this problem, High-Level Synthesis (HLS) is increasingly used to accelerate the design process and bridge the gap between software and hardware design, by describing the desired behavior of the accelerator in a high-level programming language (e.g. C++). We present a methodology that connects the open-source DL framework N2D2 with Catapult HLS to help shorten the design process of hardware accelerators, making it possible to keep pace with new AI algorithms. By proposing a new automatic synchronization mechanism, we were able to balance the execution time of all convolutional layers in MobileNet-v1 and obtain a pipelined hardware architecture capable of handling 500 fps.
PhD - Research Engineer
Nermine Ali has been a research engineer at CEA List (French Alternative Energies and Atomic Energy Commission), France, working in the field of embedded systems and artificial intelligence, since December 2021. She received her PhD degree in Electronics from Université de Bretagne-Sud, France, in 2022. Her current research interests include hardware design for neural network applications and high-level design flows, including High-Level Synthesis tools, to enable fast exploration and hardware generation.