Project 152: Hardware/Software Codesign for TinyML Chips

Contact Information:

Assoc. Prof. Xinfei Guo

Email: xinfei.guo@sjtu.edu.cn

Project Description and Objectives:

The computer engineering community has been actively researching hardware acceleration, and numerous accelerators, such as Google's TPU and Cambricon's AI processors, have been proposed to meet the energy-efficiency demands of AI applications. However, these accelerators are typically designed for training or inference of large-scale models. As machine learning workloads shift toward the edge, customized edge AI chips play a crucial role in optimizing energy efficiency and broadening AI adoption. In this project, we explore a hardware/software co-design approach targeted specifically at edge AI chips. At the software level, we investigate efficient model compression techniques, such as mixed-precision quantization, to achieve high efficiency. At the hardware level, we study novel computing architectures, such as in-memory computing and reconfigurable computing, to further improve the energy efficiency and scalability of AI workloads.
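
At the software level, the core idea of mixed-precision quantization can be sketched in a few lines of PyTorch: each layer's weights are fake-quantized (quantized, then immediately dequantized) at their own bit-width, and the resulting error is inspected. The layer names and the per-layer bit plan below are hypothetical placeholders; a real flow would pick bit-widths via sensitivity analysis or search.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric uniform quantization: round to num_bits, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for 8 bits
    scale = x.abs().max() / qmax                # per-tensor scale factor
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q * scale                            # "fake-quantized" values

# Hypothetical per-layer bit plan for a mixed-precision scheme.
torch.manual_seed(0)
weights = {"conv1": torch.randn(16, 3, 3, 3),
           "conv2": torch.randn(32, 16, 3, 3),
           "fc":    torch.randn(10, 128)}
bit_plan = {"conv1": 8, "conv2": 4, "fc": 8}

for name, w in weights.items():
    err = (w - fake_quantize(w, bit_plan[name])).pow(2).mean().sqrt()
    print(f"{name}: {bit_plan[name]}-bit, RMS quantization error = {err:.4f}")
```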

Eligibility Requirements:

Interested in the intersection of hardware and software for machine learning systems.

Comfortable programming in C, C++, or Python.

Prior experience with Linux systems is required.

Prior knowledge of computer organization and architecture is highly preferred.

Prior experience working with machine learning models is highly preferred.

Prior research experience is highly preferred.

Main Tasks:

Week 1: Read relevant literature on edge intelligence software and hardware, study the basics of quantization and computer architecture, and get familiar with popular machine learning frameworks such as PyTorch and with commonly used edge intelligence models.
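
As a concrete starting point, the sketch below (assuming a recent PyTorch/torchvision install; MobileNetV2 is used only as one representative edge model) inspects a model's parameter count, its ideal fp32 vs. int8 weight footprint, and a dummy forward pass:

```python
import torch
from torchvision import models

# MobileNetV2 as a representative edge backbone (structure only, no weights).
model = models.mobilenet_v2(weights=None).eval()

n_params = sum(p.numel() for p in model.parameters())
print(f"parameters:            {n_params / 1e6:.2f} M")
print(f"fp32 weight footprint: {n_params * 4 / 2**20:.1f} MiB")
print(f"ideal int8 footprint:  {n_params / 2**20:.1f} MiB")

# One forward pass on a dummy input to check shapes end to end.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    y = model(x)
print("output shape:", tuple(y.shape))   # (1, 1000) ImageNet logits
```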

Week 2: Get familiar with inference frameworks such as TVM and STMicroelectronics' X-CUBE-AI, understand the design flow of commonly used systems-on-chip such as the RISC-V-based PULP platform, and be able to run common use cases.
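
For TVM specifically, the classic Relay flow is: trace a PyTorch module, import it into Relay, compile for a target, and run it through the graph executor. A minimal sketch, assuming a TVM build that ships the Relay PyTorch frontend (the tiny model and the input name "input0" are illustrative only):

```python
import torch
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Tiny stand-in model, traced so TVM can import it.
model = torch.nn.Sequential(torch.nn.Linear(64, 10)).eval()
example = torch.randn(1, 64)
scripted = torch.jit.trace(model, example)

# Import into Relay IR and compile for a generic CPU; an edge target
# (e.g. an ARM board) would swap in a different target string.
mod, params = relay.frontend.from_pytorch(scripted, [("input0", (1, 64))])
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Run the compiled module and fetch the output.
dev = tvm.cpu()
rt = graph_executor.GraphModule(lib["default"](dev))
rt.set_input("input0", tvm.nd.array(example.numpy()))
rt.run()
print(rt.get_output(0).shape)   # (1, 10)
```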

Weeks 3-4: Explore quantization and model compression techniques at the software level, test them using the inference frameworks, and collect metrics such as accuracy, inference time, and storage footprint.
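
The metrics for this step can be collected with a few lines of PyTorch. A minimal sketch, assuming post-training dynamic quantization of Linear layers as the compression technique (accuracy evaluation on a real dataset is omitted for brevity):

```python
import io
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Post-training dynamic quantization: Linear weights stored as int8.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def size_mib(m: nn.Module) -> float:
    """Serialized state_dict size as a proxy for storage footprint."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 2**20

def latency_ms(m: nn.Module, x: torch.Tensor, iters: int = 100) -> float:
    with torch.no_grad():
        m(x)                                  # warm-up
        t0 = time.perf_counter()
        for _ in range(iters):
            m(x)
    return (time.perf_counter() - t0) / iters * 1e3

x = torch.randn(1, 256)
print(f"fp32: {size_mib(model):.3f} MiB, {latency_ms(model, x):.3f} ms/inference")
print(f"int8: {size_mib(qmodel):.3f} MiB, {latency_ms(qmodel, x):.3f} ms/inference")
```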

Weeks 5-6: At the hardware level, deploy the investigated algorithms on relevant platforms such as an STM32 development board or the PULP SoC development environment, and collect actual hardware cost metrics such as power consumption and memory usage. Wrap up the project and write the final report.
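
Power and memory numbers require the actual board, but the arithmetic a deployed int8 kernel performs can be prototyped on the host first. Below is a sketch of integer-only inference for one matrix-vector product (int8 operands, int32 accumulation, as on Cortex-M or PULP RISC-V cores), assuming NumPy; shapes and data are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64)).astype(np.float32)   # layer weights
x = rng.standard_normal(64).astype(np.float32)         # input activations

def quantize(v: np.ndarray, num_bits: int = 8):
    """Symmetric per-tensor quantization to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(v).max() / qmax
    q = np.clip(np.round(v / scale), -qmax, qmax).astype(np.int8)
    return q, scale

W_q, w_scale = quantize(W)
x_q, x_scale = quantize(x)

# int8 x int8 products accumulated in int32, as MCU kernels do it.
acc = W_q.astype(np.int32) @ x_q.astype(np.int32)

# Dequantize the accumulator; on-device this would be a fixed-point
# multiply plus shift rather than a float multiply.
y_int8_path = acc * (w_scale * x_scale)
y_float_ref = W @ x
print("max abs error vs. float reference:", np.abs(y_int8_path - y_float_ref).max())
```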

Website:

Lab: https://sites.ji.sjtu.edu.cn/icas/ 

School: https://www.ji.sjtu.edu.cn/