Neural networks are typically developed and trained in a high-performance 32-bit floating-point compute environment. In many cases, however, a custom hardware solution is needed for the inference engine to meet power and real-time requirements. Each neural network and end application may have different performance requirements that dictate the optimal hardware architecture, which makes custom-tailored solutions impractical with hand-coded RTL creation flows. HLS has the unique ability to go quickly from complex algorithms written in C to generated RTL, enabling accurate power and performance profiles for an algorithm's implementation without having to write the RTL by hand. This session steps through a CNN (Convolutional Neural Network) inference engine implementation, highlighting specific architectural choices, and shows how to integrate and test it inside TensorFlow. This demonstrates how an HLS flow can be used to rapidly design custom CNN accelerators.
This webinar is part 3 of the seminar HLS for Vision and Deep Learning Hardware Accelerators.
Michael Fingeroff has worked as an HLS Technologist for the Catapult High-Level Synthesis Platform at Siemens Digital Industries Software since 2002. His areas of interest include machine learning, DSP, and high-performance video hardware. Prior to joining Siemens Digital Industries Software, he worked as a hardware design engineer developing real-time broadband video systems. He received his bachelor's and master's degrees in electrical engineering from Temple University in 1990 and 1995, respectively.