
Omnitek: Algorithms and Novel Hardware Architectures for AI Acceleration

Dr Roger Fawcett, CEO, Omnitek

Artificial Intelligence and Machine Learning algorithms are being applied to a rapidly growing range of tasks including object recognition, language translation and beating world champions at the board game Go.

These algorithms are predominantly based on neural network architectures and are not well suited to traditional "Von Neumann" CPU compute engines. This is because a Von Neumann architecture comprises a small number of processing units that access a single external memory, holding both data and program instructions, through a relatively low-bandwidth interface. Neural networks, by contrast, map far better onto massively parallel DSP compute engines with distributed local memory offering high aggregate bandwidth.
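The bandwidth argument can be made concrete with a back-of-envelope calculation. All figures below are illustrative assumptions, not measurements of any particular device: at batch size 1, streaming the weights of a single fully connected layer through one external memory interface takes far longer than the arithmetic itself.

```python
# Back-of-envelope: why a single external memory interface bottlenecks
# inference of a fully connected layer at batch size 1.
# All figures are illustrative assumptions, not measurements.
M, N = 4096, 4096          # layer dimensions (assumed)
bytes_per_weight = 4       # fp32 storage
macs = M * N               # one multiply-accumulate per weight
weight_bytes = macs * bytes_per_weight

mem_bw = 25e9              # bytes/s, roughly one DDR channel (assumed)
peak_macs = 1e12           # MAC/s the parallel compute units could sustain (assumed)

t_compute = macs / peak_macs       # time if arithmetic were the limit
t_memory = weight_bytes / mem_bw   # time just to fetch the weights once
# t_memory exceeds t_compute by two orders of magnitude here:
# the engine sits idle waiting on memory, which is the case for
# distributed on-chip memory with high aggregate bandwidth.
```

With these assumed numbers the weight fetch takes milliseconds while the arithmetic takes microseconds, which is the imbalance the paragraph above describes.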

GPUs have provided a short-term solution but do not represent the optimum architecture. This has led to a growing movement in both industry and academia looking at alternative compute engine architectures on silicon.

This seminar will give an overview of the range of new compute engine architectures being used to run neural network algorithms, including dedicated ASICs, massively parallel arrays of simple DSP processors with local memory, and FPGAs (Field Programmable Gate Arrays). Omnitek has recently developed a world-leading FPGA-based Deep Learning Processing Unit (DPU). The company will shortly release a live demonstration of its performance and present the silicon architecture of the DPU.

Novel compute architectures go hand in hand with algorithmic optimisations of neural network designs that reduce compute complexity. For example, many operations can be carried out at reduced precision without loss of accuracy: reduced-precision floating point, block floating point, integer arithmetic, and even binary networks. Transforms such as the Winograd and Fourier transforms can also accelerate the underlying computation. Taking this a stage further, novel AI algorithms can be designed with the capabilities of the compute engine in mind, for both neural network and non-neural-network processing.
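Two of these ideas can be sketched in a few lines of NumPy. This is a minimal illustration, not Omnitek's DPU implementation: symmetric int8 quantisation of a weight tensor with a single per-tensor scale, and the 1-D Winograd transform F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of the six a direct computation needs.

```python
import numpy as np

# Symmetric int8 quantisation: map floats onto [-127, 127] with one
# per-tensor scale, so matrix arithmetic can run in integer hardware.
def quantize_int8(w):
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale      # dequantise with q * scale

# Winograd F(2,3) transform matrices (standard minimal-filtering form).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=float)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 filter outputs,
    using only 4 elementwise multiplies in the transform domain."""
    return AT @ ((G @ g) * (BT @ d))

# Reference: direct sliding-window evaluation of the same filter.
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -0.25])
direct = np.array([d[0:3] @ g, d[1:4] @ g])
```

In a convolutional layer the filter transform `G @ g` is computed once and reused across the whole input, which is where the multiplication savings compound.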

As this field advances rapidly, it is difficult to predict what processing capabilities future AI algorithms will require in order to succeed. Perhaps neural network topologies will be designed by self-optimising techniques such as genetic algorithms. Given this uncertainty, we believe fully programmable FPGA solutions are not only the most powerful platforms but also a future-proof strategy.

The Omnitek Oxford University Research Scholarship seeks to fund DPhil students to research these exciting areas and to work in collaboration with our own industry-focussed research to help shape the future of AI compute engines and the algorithms that run on them.
