Machine learning is one of the buzzwords of the last few years, but it is hardly a new phenomenon and can be traced back to the 1950s. There are in fact a number of related terms, including artificial intelligence (AI) and deep learning, which are often erroneously used interchangeably by the unfamiliar. The term AI was coined in the mid-1950s and was intended to mean generalised human intelligence being exhibited by machines. Machine learning is the practice of using algorithms to parse raw data, generate a set of rules from that analysis and then make a determination or a prediction about new inputs based on the knowledge that has been ‘learnt’. This is usually against a narrow, well-defined set of tasks. Deep learning is really just a set of techniques for implementing more nuanced and sophisticated machine learning methods, primarily based around layered artificial neural networks (graphs). What is clear is that a machine learning revolution is underway and that soon very few areas of technical endeavour will not have some level of AI associated with them.
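That ‘learn rules from data, then predict’ loop can be made concrete with a toy sketch. The example below is a one-nearest-neighbour classifier written from scratch (the `fit`/`predict` names are purely illustrative, not any particular library's API):

```python
# A minimal illustration of "learning from data, then predicting":
# a one-nearest-neighbour classifier on labelled (value, label) pairs.

def fit(examples):
    # "Training" here is simply memorising the labelled examples.
    return list(examples)

def predict(model, x):
    # Predict by returning the label of the closest seen example.
    closest = min(model, key=lambda ex: abs(ex[0] - x))
    return closest[1]

model = fit([(1.0, "low"), (2.0, "low"), (10.0, "high")])
print(predict(model, 9.0))  # -> "high" (10.0 is the nearest example)
```

Real systems replace the memorised list with a fitted statistical model, but the shape of the workflow is the same.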
NVIDIA as a company has very much been at the forefront of this new wave of machine learning, which has been driven by wide-scale adoption by the ‘Super Seven’ (Google, Amazon, Microsoft, Facebook, Alibaba, Baidu and Tencent). The current techniques used to train machine learning systems are demanding and to date rely heavily on the parallel processing power of GPUs (principally NVIDIA P100s). While inferencing is less amenable to a GPU architecture, NVIDIA has directly addressed this with ‘Volta’, their next-generation GPU (which will ship in Q3 this year in DGX-1 boxes).
By any metric, Volta is a beast of a device, featuring more than 21.1 billion transistors and an area of 815mm² on a 12nm TSMC process node. Performance has been significantly boosted across the board (by between 40 and 60% for many typical HPC benchmarks), but the biggest departure is probably the addition of the 672 TensorCores, which are specifically geared for machine learning workloads (tensor products – essentially 4×4 matrix multiply-accumulates) and can deliver a theoretical peak of 120 TFLOPS of mixed-precision performance. According to NVIDIA, Volta is good for a 12x increase in training performance and more than 6x for inferencing. There’s understandably a lot of detail that wasn’t given at GTC, which I imagine will help us better understand the strengths and weaknesses of the approach.
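The ‘mixed precision’ part is worth unpacking: each TensorCore computes D = A·B + C, where A and B are small half-precision (FP16) matrices and the accumulation happens at single precision (FP32). A numpy sketch of that primitive (the function name is mine, not NVIDIA's API, and this says nothing about how the hardware actually schedules the work):

```python
import numpy as np

def tensor_core_mma(a_fp16, b_fp16, c_fp32):
    # Sketch of the TensorCore primitive D = A @ B + C:
    # FP16 inputs, with products accumulated in FP32 to limit the
    # precision loss of doing everything at half precision.
    return a_fp16.astype(np.float32) @ b_fp16.astype(np.float32) + c_fp32

a = np.random.randn(4, 4).astype(np.float16)  # half-precision operands
b = np.random.randn(4, 4).astype(np.float16)
c = np.zeros((4, 4), dtype=np.float32)        # single-precision accumulator
d = tensor_core_mma(a, b, c)                  # result stays FP32
```

Larger matrix multiplies – the core of neural network training – are then tiled into these 4×4 fused multiply-accumulate operations.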
The addition of what amounts to machine-learning-specific acceleration to their mainstream GPU architecture speaks to both how computationally demanding the space is and how much of a market there is to serve. In addition, the decision to open-source NVIDIA’s deep learning accelerator (DLA) IP, which will have its first outing on the Xavier SoC targeted at the autonomous vehicle market, can be seen as a move to address Google’s TPU strategy and build a deep learning ecosystem around NVIDIA IP.
NVIDIA have also looked to serve the machine learning market with the DGX-1 (what they are calling an AI supercomputer) and the NVIDIA deep learning software stack. Both of these are really an attempt to make picking up and using machine learning more of a turnkey or appliance solution which, by bundling an ecosystem and support, enables profit margins to remain healthy.
What’s clear is that machine learning is going to be a battleground for all the silicon heavyweights and that the architecture and marketing types are going to be busy trying to capture market share in what is sure to be one of the major growth areas for the foreseeable future.
#GTC17 #ai #machinelearning #deeplearning #volta #xavier #dla