This presentation describes Harvard's experience using Catapult HLS in research projects. Since 2018, we have taped out several chips with hardware components designed in SystemC. The HLS design flow has helped shorten the development cycle of accelerators and has given us a way to quickly prototype different research ideas.

The first half of the talk focuses on what we have learned about the HLS tool. Although SystemC lets us design hardware more concisely and efficiently than Verilog RTL, we initially found it difficult to work out which coding styles are appropriate, specifically how to get a design through HLS and achieve a pipeline initiation interval of 1 (“II = 1”) for the critical parts of accelerator designs. Through much trial and error, we have developed intuitions and strategies for SystemC coding. We also make extensive use of the standard hardware IP provided in Mentor Algorithmic C and NVIDIA MatchLib to implement our designs.

In the second half, we share some projects that use Catapult HLS. One of them proposes adaptive floating-point (FP) quantization as a replacement for integer (INT) quantization in neural networks (NNs), especially large-scale NLP models. To evaluate this idea in hardware, we use Catapult HLS and PowerPro to compare the power and area of FP and INT MAC datapath designs. The work was selected as the best research paper at the DAC 2020 conference.

Meet the Speaker

Harvard University

En-Yu Daniel Yang

PhD Student

En-Yu (Daniel) Yang is a PhD student in Computer Science at Harvard. He received his B.S. in Electrical Engineering from National Tsing Hua University in 2018. His research focuses on specialized architectures and hardware design for machine learning applications. He has been designing accelerators in SystemC with the Catapult HLS tool, and some of his work has been accepted at the DAC’20 and ISSCC’21 conferences.

Related Resources

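Optimizing HLS Code for Different FPGA Platforms
White Paper

In this white paper, we examine a simple convolution filter and outline how to use HLS to implement it on different FPGA platforms. We also look at the different optimizations that may be needed to get the best performance on each platform, and at the coding styles that can be used to achieve better performance.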

Rapid Algorithm to HW: Using HLS for Computer Vision and Deep Learning Seminar
Webinar

How HLS helps project teams rapidly and accurately explore the power and performance of algorithms, quickly reach FPGA implementations to create demonstrators and prototypes, and use the same source RTL IP for ASIC implementation.