Automating Tensor Program Partitioning on Accelerator Systems with PartIR

Dimitrios Vytiniotis (DeepMind)

The rapid rise in demand for training large neural networks has brought into focus the need for partitioning across systems of accelerator devices. Implementing various forms of partitioning is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype of an automated partitioning system that integrates into existing compilers and existing user workflows. Our system relies on layering functional loop abstractions – that return or reduce over chunks of arrays – on top of an arbitrary array “dialect” (following the MLIR terminology) such as XLA. We use rewrite rules, reminiscent of the rules used in stream fusion, to express various forms of propagation of partitioning information across a program. Our system compiles functional loops to SPMD abstractions in a lower-level dialect whose types capture distributed arrays and which includes explicit array redistribution commands. This dialect can then be lowered, compiled, and executed using the “native” backend compiler and runtime (e.g. XLA) in a device-agnostic manner. We will present the design of a search environment that controls the actions of our rewrite engine and specifically aims to tame the size of the search space by (a) mimicking the way expert programmers would attempt to partition their programs and (b) exploiting high-level model structure already available in popular libraries for neural networks. We show promising initial results, such as the ability to automatically recover good partitioning strategies for important neural network architectures, and we outline remaining challenges.
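To make the idea of functional loops that "return or reduce over chunks of arrays" concrete, here is a minimal sketch in plain Python. It is not PartIR code: the names `tile_loop`, `sum_loop`, `chunk_rows`, and `chunk_cols` are hypothetical and chosen only for illustration. It shows a matrix product expressed once as a loop that returns row chunks of the result and once as a loop that reduces partial products over the contracting dimension; both agree with the unpartitioned product.

```python
# Illustrative sketch only: all names below are hypothetical, not the PartIR API.

def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def chunk_rows(m, n, i):
    """Return the i-th of n equal row chunks of m."""
    size = len(m) // n
    return m[i * size:(i + 1) * size]

def chunk_cols(m, n, i):
    """Return the i-th of n equal column chunks of m."""
    size = len(m[0]) // n
    return [row[i * size:(i + 1) * size] for row in m]

def tile_loop(n, body):
    """Functional loop that returns chunks: concatenates body(i) along rows."""
    out = []
    for i in range(n):
        out.extend(body(i))
    return out

def sum_loop(n, body):
    """Functional loop that reduces over chunks: elementwise sum of body(i)."""
    acc = body(0)
    for i in range(1, n):
        part = body(i)
        acc = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(acc, part)]
    return acc

a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0], [0, 1], [2, 3], [4, 5]]

reference = matmul(a, b)

# Partition the row dimension of `a`: each iteration yields a slice of the result.
row_partitioned = tile_loop(2, lambda i: matmul(chunk_rows(a, 2, i), b))

# Partition the contracting dimension: each iteration yields a partial sum.
contract_partitioned = sum_loop(
    2, lambda i: matmul(chunk_cols(a, 2, i), chunk_rows(b, 2, i))
)
```

In an SPMD setting, each loop iteration would correspond to the work of one device. The first form needs no communication to assemble the result beyond gathering the chunks, while the second needs a cross-device reduction.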

Speaker bio

Dimitrios Vytiniotis is a research scientist who leads research in programming languages and machine learning systems at DeepMind. He holds a PhD from the University of Pennsylvania (2008) and was a researcher with Microsoft Research Cambridge until 2018. His interests span functional programming and type systems, and more broadly language design and implementation, with applications in areas like systems and machine learning.

 
