Accelerating distributed DNN training

Neural network training with approximate tensor operations enables faster training by reducing the number of computations and communications without substantial accuracy loss.
There are many methods that trade accuracy for speed via approximation.

Our approach reduces the number of computations in the core computation kernel: tensor products.

The basis for our method is Column-Row Sampling:
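As a rough illustration of the idea, here is a minimal NumPy sketch of column-row sampling for a matrix product (the function name and interface are my own, not from the paper): A @ B is written as a sum of outer products of columns of A with matching rows of B, and only k of those products are sampled, with probability proportional to their norms, and rescaled so the estimate stays unbiased.

```python
import numpy as np

def crs_matmul(A, B, k, rng=None):
    # Column-Row Sampling sketch: approximate A @ B by summing k sampled
    # outer products A[:, i] * B[i, :]. Index i is drawn with probability
    # proportional to ||A[:, i]|| * ||B[i, :]||, and each sampled term is
    # rescaled by 1 / (k * p_i) so the estimator is unbiased in expectation.
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    idx = rng.choice(n, size=k, replace=True, p=p)
    # Scale the sampled columns, then contract with the sampled rows.
    return (A[:, idx] / (k * p[idx])) @ B[idx, :]
```

With k much smaller than the inner dimension n, the cost drops from O(m*n*q) to O(m*k*q), at the price of sampling noise that shrinks as k grows.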


In the paper we show how to extend this method to tensor products and convolutions, and analyze its theoretical properties; in particular, we show that it does not affect convergence under reasonable assumptions. We then show how to apply it during training while minimizing the loss of accuracy.
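To make the training use case concrete, here is a hedged sketch, under my own simplifying assumptions, of applying the same sampling idea to one product that appears in training a linear layer: the weight gradient dW = X^T @ dY, sampled over the batch dimension. The function name and the choice of where to apply the approximation are illustrative, not a definitive account of the paper's scheme.

```python
import numpy as np

def approx_weight_grad(X, dY, k, rng=None):
    # Illustrative sketch: approximate the linear-layer weight gradient
    # dW = X.T @ dY by column-row sampling over the batch dimension.
    # Example i is drawn with probability proportional to
    # ||X[i, :]|| * ||dY[i, :]|| and rescaled by 1 / (k * p_i), so the
    # expected value of the sampled gradient equals the exact gradient.
    rng = np.random.default_rng() if rng is None else rng
    b = X.shape[0]
    p = np.linalg.norm(X, axis=1) * np.linalg.norm(dY, axis=1)
    p = p / p.sum()
    idx = rng.choice(b, size=k, replace=True, p=p)
    # Rescale each sampled example's contribution, then contract.
    return (X[idx, :] / (k * p[idx])[:, None]).T @ dY[idx, :]
```

Because the estimator is unbiased, the extra sampling noise behaves like additional gradient noise on top of minibatch noise, which is one intuition for why convergence can be preserved for a suitable choice of k.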