on-demand webinar

Neural Network Quantization for Low-Power

Estimated Watching Time: 62 minutes

This webinar describes how to use QKeras and High-Level Synthesis to produce a bespoke quantized CNN accelerator, and compares the accuracy, power, performance, and area of different quantizations.

Inference for Convolutional Neural Networks (CNNs) is notoriously compute-intensive. This makes CNNs an ideal candidate for hardware acceleration, which is faster and more power-efficient than running software on general-purpose CPUs. Training and inference are typically done using floating-point representations of the features, weights, and biases. Using a fixed-point representation reduces the size and power of the operators in the accelerator. With a purpose-built accelerator, fixed-point operators can be any bit width; they are not limited to 8 or 16 bits. QKeras, or quantized Keras, is a library built on TensorFlow that allows developers to specify quantized fixed-point operations for each layer. It enables training and inference with reduced-precision representations.

What You Will Learn

How to determine the optimal operand sizing for a hardware accelerator deploying a neural network using QKeras
How to determine the area, performance, and energy of a neural network accelerator
How to compare software performance against hardware accelerated performance, and make informed trade-off decisions
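
The trade-off between operand size and accuracy can be sketched in a few lines of plain Python. This is only an illustration, not the QKeras API or the flow shown in the webinar: the function names, the bit-allocation convention (one sign bit, `integer_bits` integer bits, the rest fractional), and the sample weights are all assumptions made for the example.

```python
def quantize_fixed_point(value, total_bits, integer_bits):
    """Round a value to a signed fixed-point grid and saturate at the range limits.

    Assumed convention (hypothetical, for illustration): 1 sign bit,
    `integer_bits` integer bits, and the remaining bits fractional.
    """
    frac_bits = total_bits - 1 - integer_bits
    if frac_bits < 0:
        raise ValueError("total_bits is too small for the requested integer_bits")
    scale = 2 ** frac_bits
    max_code = 2 ** (total_bits - 1) - 1      # largest representable integer code
    min_code = -(2 ** (total_bits - 1))       # smallest representable integer code
    code = round(value * scale)               # snap to the nearest grid point
    code = max(min_code, min(max_code, code)) # saturate instead of wrapping
    return code / scale


def max_quantization_error(values, total_bits, integer_bits):
    """Worst-case absolute error introduced by quantizing each value."""
    return max(
        abs(v - quantize_fixed_point(v, total_bits, integer_bits)) for v in values
    )


# Made-up weights standing in for one layer of a trained network.
weights = [0.731, -0.052, 0.268, -0.914, 0.005]

# Bit widths need not be 8 or 16 in a purpose-built accelerator.
for bits in (4, 6, 8, 11):
    err = max_quantization_error(weights, bits, integer_bits=0)
    print(f"{bits:2d} bits: max error {err:.5f}")
```

Narrower operands give a coarser grid and a larger worst-case error. That error is what has to be weighed against the area and power saved by smaller operators.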

Who Should Attend

Developers of neural networks that will be deployed on the edge or in other contexts where low power and efficiency are required in addition to high performance.

Meet the speakers

Siemens EDA

Russell Klein

HLS Program Director

Russell Klein is a Program Director at Siemens EDA’s (formerly Mentor Graphics) High-Level Synthesis Division focused on processor platforms. He is currently working on algorithm acceleration through the offloading of complex algorithms running as software on embedded CPUs into hardware accelerators using High-Level Synthesis. He has been with Mentor for over 25 years, holding a variety of engineering, marketing and management positions, primarily focused on the boundary between hardware and software. He holds six patents in the area of hardware/software verification and optimization. Prior to joining Mentor he worked for Synopsys, Logic Modeling, and Fairchild Semiconductor.

Siemens EDA

Ajay Mishra

Senior Product Deployment Manager

Ajay Mishra is a Senior Product Deployment Manager in Siemens EDA’s High-Level Synthesis Division, currently working on physical realization of algorithm acceleration using High-Level Synthesis. Mr. Mishra developed a methodology to determine power estimates based on physical realization. He has 22 years of experience, including semiconductor design work at STMicroelectronics, Intel, and Philips Semiconductors, and 14 years of EDA experience at Mentor Graphics (now Siemens EDA). At Siemens EDA he served as Product Deployment Engineer to facilitate the proliferation and deployment of C-to-GDSII products, flows, and methodologies. Growing IC design complexities cause most IC architects to look for smart and efficient electronic design automation solutions to improve IC performance and time to market while avoiding technological barriers between engineering domains.
