On-Demand Webinar

DUTh: Architectural Improvements for Low-Power and Functional Safety of Dataflow CNN Accelerators Using HLS

Estimated playback time: 64 minutes

DUTh demonstrates using HLS to design CNN accelerators with online checking capabilities, improve power efficiency through optimized data handling in spatial variants of convolution, and effectively implement customized FP operators.

Deep Convolutional Neural Networks (CNNs) are dominant in modern Machine Learning (ML) applications. Accelerating them directly in hardware calls for a multifaceted approach that combines high performance, energy efficiency, and functional safety. These requirements hold for every CNN architecture, including both systolic and spatial dataflow alternatives. In this talk, we focus on the latter case, where CNNs are implemented using a series of dedicated convolution engines, with data streamed from one layer to the next through custom memory buffers.

In this context, we will first present a High-Level Synthesis (HLS) implementation for dataflow CNNs that utilizes the benefits of Catapult HLS and supports a wide variety of architectures. In particular, we will focus on the energy-efficient implementation of non-traditional forms of spatial convolution, such as strided or dilated convolutions, which leverage the decomposition of convolution to eliminate redundant data movement.
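As a simplified illustration of the decomposition idea (a 1D sketch with integer data; the function names and datatypes here are ours, not from the talk): a stride-s convolution can be split into s polyphase stride-1 convolutions over subsampled input and kernel streams, so no input sample is fetched only to have its products discarded.

```cpp
#include <vector>

// Direct strided "valid" convolution: y[m] = sum_k w[k] * x[m*s + k].
std::vector<int> conv_strided(const std::vector<int>& x,
                              const std::vector<int>& w, int s) {
    int N = (int)x.size(), K = (int)w.size();
    int M = (N - K) / s + 1;                  // number of valid outputs
    std::vector<int> y(M, 0);
    for (int m = 0; m < M; ++m)
        for (int k = 0; k < K; ++k)
            y[m] += w[k] * x[m * s + k];
    return y;
}

// Decomposed form: input and kernel are split into s polyphase substreams
// (x_p[i] = x[i*s + p], w_p[j] = w[j*s + p]); each pair runs a stride-1
// convolution, so every fetched sample contributes to an output.
std::vector<int> conv_decomposed(const std::vector<int>& x,
                                 const std::vector<int>& w, int s) {
    int N = (int)x.size(), K = (int)w.size();
    int M = (N - K) / s + 1;
    std::vector<int> y(M, 0);
    for (int p = 0; p < s; ++p)
        for (int m = 0; m < M; ++m)
            for (int j = 0; p + j * s < K; ++j)
                y[m] += w[j * s + p] * x[(m + j) * s + p];
    return y;
}
```

Both functions produce identical outputs; the decomposed form rearranges the same multiply-accumulate work so that each substream can be buffered and consumed like a plain stride-1 convolution.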

Next, we will present an algorithmic approach for online fault detection in CNNs. Here, the checksum of the actual result is checked against a predicted checksum computed in parallel by a hardware checker. Based on a newly introduced invariance condition of convolution, the proposed checker predicts the output checksum implicitly, using only data elements at the border of the input features. In this way, the power required for accumulating the input features is reduced, without requiring large buffers to hold intermediate checksum results.
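The flavor of such a border-based checksum prediction can be sketched in one dimension (a simplified software model, not the talk's 2D hardware checker): for a stride-1 "valid" convolution with kernel length K, the output checksum equals sum(w) times the total input sum, minus correction terms that touch only the first and last K-1 input elements, i.e., the border.

```cpp
#include <vector>

// Output checksum computed the direct way: run the convolution, sum outputs.
long long checksum_direct(const std::vector<int>& x, const std::vector<int>& w) {
    int N = (int)x.size(), K = (int)w.size(), M = N - K + 1;
    long long sum = 0;
    for (int m = 0; m < M; ++m)
        for (int k = 0; k < K; ++k)
            sum += (long long)w[k] * x[m + k];
    return sum;
}

// Predicted checksum: start from sum(w) * sum(x) and subtract corrections
// that involve only border elements (the first and last K-1 inputs), so no
// per-output accumulation or intermediate checksum buffering is needed.
long long checksum_predicted(const std::vector<int>& x, const std::vector<int>& w) {
    int N = (int)x.size(), K = (int)w.size(), M = N - K + 1;
    long long total = 0, wsum = 0;
    for (int v : x) total += v;
    for (int v : w) wsum += v;
    long long pred = wsum * total;
    for (int k = 0; k < K; ++k) {
        long long pre = 0, suf = 0;
        for (int i = 0; i < k; ++i)     pre += x[i];   // left-border elements
        for (int i = k + M; i < N; ++i) suf += x[i];   // right-border elements
        pred -= (long long)w[k] * (pre + suf);
    }
    return pred;
}
```

In a checker, a mismatch between the two values flags a fault; the predicted side never accumulates the interior input elements individually, which is where the power saving comes from.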

Finally, we study customized floating-point HLS operators that support fused dot products with single- or double-width output (e.g., FP8 inputs producing an FP16 result), eliminating redundant rounding and type-conversion steps. We will also discuss floating-point datatypes that support an adjustable bias for tuning the dynamic range.
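A minimal software model of these two ideas (the 8-bit format, bias values, and function names below are illustrative assumptions, not the operators presented in the talk): an 8-bit float with a 4-bit exponent can trade range for resolution simply by changing the decode-time bias, and a fused dot product keeps every product and partial sum in a wide accumulator, rounding once at the end.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode a hypothetical 8-bit float (1 sign, 4 exponent, 3 mantissa bits)
// whose exponent bias is a runtime parameter instead of a fixed constant.
double decode_fp8(std::uint8_t bits, int bias) {
    int sign = (bits >> 7) & 1;
    int exp  = (bits >> 3) & 0xF;
    int man  =  bits       & 0x7;
    double frac = (exp == 0) ? man / 8.0        // subnormal: no hidden one
                             : 1.0 + man / 8.0; // normal: hidden leading one
    int e = (exp == 0) ? 1 - bias : exp - bias;
    double v = std::ldexp(frac, e);             // frac * 2^e
    return sign ? -v : v;
}

// Fused dot product: accumulate all products in a wide (double) accumulator
// and convert to the narrower result type once, instead of rounding and
// type-converting after every multiply-add of a naive chain.
float fused_dot(const std::vector<std::uint8_t>& a,
                const std::vector<std::uint8_t>& b, int bias) {
    double acc = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc += decode_fp8(a[i], bias) * decode_fp8(b[i], bias);
    return static_cast<float>(acc);             // single rounding step
}
```

Raising the bias from 7 to 9, for example, scales every decoded value by 2^-2, sliding the whole representable range toward smaller magnitudes without touching the bit encoding.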

What you will learn

  • Utilizing HLS to design CNN accelerators with online checking capabilities.
  • Improving power efficiency through optimized data handling in spatial variants of convolution.
  • Using HLS effectively to implement customized FP operators.

Who should attend

  • HW designers and researchers interested in High-Level Synthesis designs for ML accelerators and floating-point arithmetic.

About the Presenter

Democritus University of Thrace (DUTh)

Dionysios Filippas

Ph.D. student, Democritus University of Thrace

Dionysios Filippas is a Ph.D. student in Electrical and Computer Engineering at Democritus University of Thrace (DUTh). He received his Diploma and M.Sc. degrees from the same department in 2019 and 2021, respectively. His experience includes the design of networks-on-chip and customized hardware accelerators for data-clustering algorithms. His current research focuses on the design of power-efficient CNN accelerators, as well as customized floating-point arithmetic units, using HLS.

Related Resources

Infineon: HLS Formal Verification Flow Using Siemens Formal Verification
Webinar

High-Level Synthesis (HLS) is a design flow in which design intent is described at a higher level of abstraction, such as SystemC, C++, or MATLAB.

STMicroelectronics: A Common C++ and UVM Verification Flow of High-Level IP
Webinar

STMicro presents a unified way to integrate the definition of RTL and C functional coverage and assertions (reducing the coding effort), and a method to add constraints to the random values generated in UVMF.

CEA: Bridging the Gap Between Neural Network Exploration and Hardware Implementation
Webinar

CEA presents a methodology that bridges the open-source DL framework N2D2 and Catapult HLS to help shorten the design process of hardware accelerators, making it possible to keep pace with new AI algorithms.

High-Level Synthesis & Advanced RTL Power Optimization – Are you still missing out?
Webinar

Discover how C++ & SystemC/MatchLib HLS is more than just converting SystemC to RTL. In the RTL Design space, we will cover our technology for Power Optimization with PowerPro Designer & Optimizer.

Alibaba: Innovating Agile Hardware Development with Catapult HLS
Webinar

At the IP level, an ISP was created within a year using Catapult, a task that would have been impossible with traditional RTL. To reduce dependency on designer experience, Alibaba introduced an AI-assisted DSE tool.

Space Codesign High-Level Synthesis for Hardware/Software Architectural Exploration of an Inferencing Algorithm
Webinar

Space Codesign presents a design flow, including HW/SW co-design and HLS, that allows developers to migrate compute-intensive functions from software running on an embedded processor to a hardware-based accelerator.