Transcript
Machine Learning on FPGAs
Jason Cong
Chancellor's Professor, UCLA
Director, Center for Domain-Specific Computing
[email protected]
http://cadlab.cs.ucla.edu/~cong
Impacts of deep learning for many applications
◆ Unmanned vehicles
◆ Speech & audio
◆ Text & language
◆ Genomics
◆ Image & video
◆ Multimedia
ImageNet Competition
◆ 1,200,000 training images
§ With 50,000 validation & 100,000 test images
◆ 1,000 categories of objects [1]
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems (NIPS), 2012.
ImageNet Competition Results
[Figure: winning error rate by year, 2009-2015 (y-axis 0-30%). Traditional methods: NEC-UIUC (2010) and Xerox Research Centre Europe (2011). Deep learning algorithms emerge in 2012 with AlexNet (University of Toronto), followed by Clarifai (2013) and GoogLeNet (Google) & VGG (Oxford) (2014).]
Convolutional Neural Network (CNN)
[Figure: a CNN pipeline from the input image through successive layers of feature maps to the output category.]
◆ Inference: a feedforward computation from input feature maps to output feature maps
◆ Max-pooling is optional (a sketch follows below)
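As a concrete illustration of the pooling step mentioned above, here is a minimal 2x2 max-pooling sketch in C; the function name, array names, and dimensions are illustrative assumptions, not taken from the slides.

    /* Minimal 2x2 max-pooling over one feature map.
       Names and dimensions are illustrative assumptions. */
    #define H 8   /* input feature-map height (assumed) */
    #define W 8   /* input feature-map width (assumed)  */

    void max_pool_2x2(const float in[H][W], float out[H/2][W/2]) {
        for (int r = 0; r < H/2; r++) {
            for (int c = 0; c < W/2; c++) {
                float m = in[2*r][2*c];
                if (in[2*r][2*c+1]   > m) m = in[2*r][2*c+1];
                if (in[2*r+1][2*c]   > m) m = in[2*r+1][2*c];
                if (in[2*r+1][2*c+1] > m) m = in[2*r+1][2*c+1];
                out[r][c] = m;   /* keep the max of each 2x2 window */
            }
        }
    }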
Backward propagation
[Figure: a network with input layer (X[0], X[1], X[2]), hidden layer, and output layer. Each output error Diff(Y[i], golden[i]) = delta[i] is propagated backward to adjust the weights (W1ij + ∆, W2ij + ∆), moving from a random start point toward a local minimum of the error surface.]
◆ Optimization target: minimize the inference error rate
◆ Feedforward (inference), then backward (gradient descent algorithm)
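To make the update rule concrete, below is a minimal gradient-descent sketch in C for a single linear neuron with squared-error loss; the weight w, learning rate lr, and training data are illustrative assumptions, not from the slides.

    #include <stdio.h>

    /* Gradient descent for one linear neuron y = w * x.
       Loss per sample: 0.5 * (y - golden)^2, so dLoss/dw = (y - golden) * x.
       All names and constants here are illustrative assumptions. */
    int main(void) {
        const float x[3]      = {1.0f, 2.0f, 3.0f};   /* inputs X[0..2]     */
        const float golden[3] = {2.0f, 4.0f, 6.0f};   /* targets (w* = 2)   */
        float w  = 0.1f;                              /* random start point */
        float lr = 0.05f;                             /* learning rate      */

        for (int epoch = 0; epoch < 100; epoch++) {
            for (int i = 0; i < 3; i++) {
                float y     = w * x[i];               /* feedforward        */
                float delta = y - golden[i];          /* Diff(Y, golden)    */
                w -= lr * delta * x[i];               /* backward update    */
            }
        }
        printf("learned w = %f\n", w);                /* approaches 2.0     */
        return 0;
    }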
Real-life CNNs
◆ AlexNet [1]: winner of the ImageNet 2012 classification task

Network     Neurons       Layers   Parameters
AlexNet     650,000       8        60 Million
VGG16       14,000,000    16       140 Million
GoogLeNet   8,300,000     22       4 Million
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems (NIPS), 2012.
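To see where parameter counts at this scale come from: a convolutional layer with N input feature maps, M output feature maps, and K x K kernels has K*K*N*M weights plus M biases. The sketch below applies this formula to AlexNet's first convolutional layer (11x11 kernels, 3 input channels, 96 output maps, per the published architecture); the helper function is hypothetical.

    #include <stdio.h>

    /* Parameters of one convolutional layer: K*K*N*M weights + M biases.
       The example numbers are AlexNet's published first conv layer
       (11x11 kernels, 3 input channels, 96 output maps), not from the slides. */
    static long conv_params(long K, long N, long M) {
        return K * K * N * M + M;
    }

    int main(void) {
        printf("AlexNet conv1: %ld parameters\n", conv_params(11, 3, 96));
        /* prints 34944 = 11*11*3*96 weights + 96 biases */
        return 0;
    }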
Distributed Deep Learning System
◆ Distributed machine learning
§ Google, Baidu, Facebook [3]
§ A cluster of thousands of servers [2]
[2] Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in Neural Information Processing Systems (NIPS), 2012.
[3] Li, Mu, et al. "Scaling distributed machine learning with the parameter server." Proc. OSDI, 2014.
An Example of a High-Performance GPU Cluster [ICML'13]
◆ Deep learning with COTS HPC systems
§ Stanford University
§ A cluster of 12 GPUs
◆ High performance
§ Trains a 1-billion-parameter network in a couple of days
§ Comparable to a CPU cluster of 1,000 machines
◆ Cost effective
§ $20,000
§ A CPU cluster with comparable performance costs $1 million
FPGA acceleration of the feedforward phase
◆ In many applications, the neural network is trained on back-end CPU or GPU clusters
◆ FPGA: very suitable for latency-sensitive, real-time inference jobs
§ Unmanned vehicles
§ Speech recognition
§ Audio surveillance
§ Multimedia
◆ Related work
§ [LeCun'09], [Farabet'10], [Aysegui'13], [Gokhale'15], [Zhang'15], etc.
Inference (or feedforward computation)
[Figure: a K x K convolution kernel sliding over the input feature maps to produce the output feature maps.]
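The slide's code fragment is truncated after "for(row=0; row". Below is a reconstruction of the standard convolutional-layer loop nest this slide introduces, in the form popularized by [Zhang'15]; the loop bounds R, C, M, N, K, the stride S, and the array names are the conventional ones, assumed here rather than recovered from the slide.

    /* Reconstruction of the standard convolutional-layer loop nest
       (the original fragment is truncated after "for(row=0; row").
       R x C: output feature-map size; M/N: output/input feature maps;
       K: kernel size; S: stride. Conventional names, assumed here. */
    #define R 6
    #define C 6
    #define M 4
    #define N 3
    #define K 3
    #define S 1

    float input_fm[N][R*S + K][C*S + K];  /* input feature maps  */
    float weights[M][N][K][K];            /* convolution kernels */
    float output_fm[M][R][C];             /* output feature maps */

    void conv_layer(void) {
        for (int row = 0; row < R; row++)
            for (int col = 0; col < C; col++)
                for (int to = 0; to < M; to++)
                    for (int ti = 0; ti < N; ti++)
                        for (int i = 0; i < K; i++)
                            for (int j = 0; j < K; j++)
                                output_fm[to][row][col] +=
                                    weights[to][ti][i][j] *
                                    input_fm[ti][S*row + i][S*col + j];
    }

All six loops are fully regular, and only the accumulation carries a dependence, which is what makes this kernel a natural target for the loop pipelining, unrolling, and tiling optimizations studied in the FPGA work cited above.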