This afternoon Microsoft announced Brainwave, an FPGA-based system for ultra-low-latency deep learning in the cloud. Early benchmarking indicates that, running on Intel Stratix 10 FPGAs, Brainwave can sustain 39.5 teraflops on a large gated recurrent unit without any batching.
Microsoft has been pouring resources into FPGAs for a while now, deploying large clusters of field-programmable gate arrays in its data centers. Algorithms are written directly into the FPGA hardware, which makes them quite efficient, and the chips can be easily reprogrammed. This specialization makes them well suited to machine learning, particularly its highly parallel computations.
Building on this work, Microsoft has synthesized DPUs, or DNN processing units, into its FPGAs. The hope is that by focusing on deep neural nets, Microsoft can adapt its infrastructure faster to keep up with research and offer near real-time processing.