Artificial Intelligence (AI) and Machine Learning (ML) are rapidly transforming many aspects of integrated circuit (IC) design. The high computational demands and rapid evolution of AI/ML workloads are dramatically affecting the architecture, VLSI implementation, and circuit design tradeoffs of hardware accelerators. To keep up with the pace of change in these workloads, NVIDIA Research uses a High-Level Synthesis (HLS) design methodology based on SystemC and libraries such as MatchLib to maximize code reuse and minimize design verification effort. This methodology supports rapid co-optimization of AI algorithms and hardware architecture, and it has enabled NVIDIA Research to tape out a state-of-the-art 5nm deep learning inference accelerator testchip that achieves up to 95.6 TOPS/W with per-vector scaled 4-bit quantization for Transformer neural network inference.
Senior Director of ASIC and VLSI Research
Brucek Khailany joined NVIDIA in 2009 and currently leads the ASIC & VLSI Research group. During his time at NVIDIA, he has contributed to projects within research and product groups on topics spanning computer architecture, unit micro-architecture, and ASIC and VLSI design techniques. Dr. Khailany is also currently the Principal Investigator for an NVIDIA-led team under the DARPA CRAFT project, researching high-productivity design methodology and design tools. Previously, Dr. Khailany was a Co-Founder and Principal Architect at Stream Processors, Inc. (SPI), where he led research and development activities related to highly parallel programmable processor architectures. He received his Ph.D. and Master's degrees in Electrical Engineering from Stanford University and B.S.E. degrees in Electrical Engineering and Computer Engineering from the University of Michigan.