  • Original article: Getting started with TensorFlow - IBM
  • Author: Vinay Rao
  • Translated by: the Gold Miner Translation Project
  • Permalink to this translation: github.com/xitu/gold-m…
  • Translator: JohnJiangLA
  • Proofreaders: CACppuccino & Atuooo

An IBM engineer’s guide to getting started with TensorFlow

In the world of machine learning, a tensor is a multidimensional array used in the mathematical models that describe neural networks. In other words, a tensor is usually a generalization of a matrix or a vector to higher dimensions.

By using the rank of an array as a simple way of expressing dimensionality, tensors can represent complex n-dimensional vectors and hyper-shapes as n-dimensional arrays. A tensor has two properties: a data type and a shape.
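
To make those two properties concrete, here is a minimal sketch using the TensorFlow 1.x Python API (the values are hypothetical) that inspects the data type and shape of a rank-2 tensor:

```python
import tensorflow as tf

# A rank-2 tensor (a matrix) with an explicit data type and shape.
m = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]], dtype=tf.float32)

print(m.dtype)  # <dtype: 'float32'>
print(m.shape)  # (2, 3), a rank-2 tensor, that is, a 2-dimensional array
```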

About TensorFlow

TensorFlow is an open source deep learning framework released in late 2015 under the Apache 2.0 license. Since then, it has become one of the most widely adopted deep learning frameworks in the world (as measured by the number of GitHub projects based on it).

TensorFlow descends from DistBelief, a deep learning system developed by the Google Brain project and owned by Google. Google designed it from the ground up for distributed processing, and it runs optimally in Google's data centers on custom application-specific integrated circuits (ASICs) known as Tensor Processing Units (TPUs). This design lends itself to efficient deep learning applications.

The framework can run on CPUs, GPUs, or TPUs and can be used on servers, desktops, or mobile devices. Developers can deploy TensorFlow on multiple operating systems and platforms, both on premises and in the cloud. Many developers argue that TensorFlow supports distributed processing better and offers more flexibility and performance for business applications than similar deep learning frameworks such as Torch and Theano, which also support hardware acceleration and are widely used in academia.

Deep learning neural networks typically consist of many layers. They pass data between layers, or perform operations on it, using multidimensional arrays. A tensor "flows" between the layers of a neural network; hence the name TensorFlow.
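
As a rough sketch of that flow (TensorFlow 1.x API; the layer sizes are hypothetical), note how the output tensor of one layer becomes the input tensor of the next:

```python
import tensorflow as tf

# Placeholder for a batch of 4-feature input vectors.
x = tf.placeholder(tf.float32, shape=[None, 4])

# Each layer consumes the tensor produced by the previous layer:
# the tensor "flows" through the network.
hidden = tf.layers.dense(x, units=8, activation=tf.nn.relu)  # shape (?, 8)
output = tf.layers.dense(hidden, units=2)                    # shape (?, 2)
```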

TensorFlow's main programming language is Python. Application programming interfaces (APIs) for C++, Java®, and Go are available but not guaranteed to be stable, as are third-party bindings for C#, Haskell, Julia, Rust, Ruby, Scala, R, and even PHP. Google recently released TensorFlow Lite, a library optimized for mobile devices, so that TensorFlow applications can run on Android.

This tutorial provides an overview of the TensorFlow system, including the benefits of the framework, supported platforms, installation considerations, and supported languages and bindings.

Advantages of TensorFlow

TensorFlow offers a number of benefits to developers:

  • Dataflow graph model. TensorFlow uses a dataflow graph, known as a directed graph, to represent its computational model. This lets developers easily and directly use native tools to see what is happening between the layers of a neural network and to adjust parameters and configurations interactively to improve the network architecture. (A minimal sketch of building and running such a graph appears after this list.)
  • Easy-to-use APIs. Python developers can build their own models with TensorFlow's native low-level (core) API, or they can use the high-level API libraries for built-in models. TensorFlow has many built-in and community libraries, and higher-level deep learning frameworks such as Keras can also sit on top of it as a high-level API.
  • Flexible architecture. A major benefit of TensorFlow is its modular, extensible, flexible design. Developers can move models between CPU, GPU, and TPU processors with only a few code changes. Although it was originally designed for large-scale distributed training and inference, developers can also use TensorFlow to experiment with other machine learning models and to optimize existing models at the system level.
  • Distributed processing. Google designed TensorFlow from the ground up for distributed execution on its custom ASIC TPUs. In addition, TensorFlow runs on a variety of NVIDIA GPU cores. Developers can take advantage of Intel Xeon and Xeon Phi-based x64 CPU architectures or ARM64 CPU architectures. TensorFlow runs on multi-architecture and multi-core systems as well as in a distributed fashion, farming out compute-intensive processing as worker tasks. Developers can create clusters of TensorFlow servers and distribute the computational graph across those clusters for training. TensorFlow can perform distributed training synchronously or asynchronously, both within and across graphs, and can share common data in memory among a cluster's compute nodes.
  • Performance. Performance is often a contentious topic, but most developers understand that any deep learning framework depends on the underlying hardware to run optimally and achieve high performance with low power consumption. Typically, any framework performs best on its native development platform. TensorFlow performs well on Google TPUs, but, even better, it achieves high performance on a wide variety of platforms: servers and desktops, embedded systems and mobile devices. The framework also supports a surprising number of programming languages. Although another framework running natively on its own platform (such as IBM Watson® on IBM hardware) can sometimes outperform TensorFlow, it remains a developer favorite because machine learning projects often span platforms and programming languages, target many end-use applications, and must produce consistent results throughout.
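
The sketch promised above illustrates the dataflow-graph model and the device flexibility described in this list. It is a minimal TensorFlow 1.x example, not production code; swapping '/cpu:0' for '/gpu:0' (where a GPU is available) is the kind of small change the flexible architecture allows:

```python
import tensorflow as tf

# Build a dataflow graph: nodes are operations, edges are tensors.
graph = tf.Graph()
with graph.as_default():
    # Pin these operations to a device. Changing '/cpu:0' to '/gpu:0'
    # is often the only edit needed to move the work to a GPU.
    with tf.device('/cpu:0'):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
        product = tf.matmul(a, b)

# Nothing has run yet; a session executes the graph.
with tf.Session(graph=graph) as sess:
    print(sess.run(product))  # [[1. 3.], [3. 7.]]
```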

TensorFlow applications

This section looks at the applications at which TensorFlow excels. Obviously, because Google uses its proprietary version of TensorFlow for text and voice search, language translation, and image search applications, TensorFlow's main strengths are classification and inference. For example, Google implemented RankBrain, the engine that ranks Google search results, on TensorFlow.

TensorFlow can be used to improve speech recognition and speech synthesis: for example, distinguishing between multiple voices, filtering speech out of high-noise environments, and mimicking speech patterns during text-to-speech conversion for more natural-sounding results. In addition, it can handle sentence structure in different languages to produce better translations. It can also be used for image and video recognition and for the classification of objects, landmarks, people, sentiments, or activities, which has led to major improvements in image and video search.

Because of its flexible, extensible, modular design, TensorFlow does not restrict developers to specific models or applications. Developers have used TensorFlow to implement not only machine learning and deep learning algorithms but also statistical and general-purpose computational models. For more information about applications and community models, see Using TensorFlow.
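
As a small illustration of that generality, here is a sketch (TensorFlow 1.x, hypothetical data) that uses the framework for a plain statistical computation with no neural network involved:

```python
import tensorflow as tf

# Hypothetical sample data; compute its mean and variance.
samples = tf.constant([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean, variance = tf.nn.moments(samples, axes=[0])

with tf.Session() as sess:
    print(sess.run([mean, variance]))  # [5.0, 4.0]
```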

Which platforms support TensorFlow?

TensorFlow is supported on any platform with a supported Python development environment. However, to access a supported GPU, TensorFlow depends on additional software, such as the NVIDIA CUDA Toolkit and cuDNN. Prebuilt Python binaries for TensorFlow (version 1.3 at the time of writing) are available for several operating systems, including Ubuntu, macOS, and Windows.

Note: GPU-accelerated support on Ubuntu or Windows requires CUDA Toolkit 8.0 and cuDNN 6 or later, plus a GPU card with CUDA Compute Capability 3.0 or higher that is compatible with those versions. As of version 1.2, TensorFlow no longer supports GPU acceleration on macOS.

For details, see Installing TensorFlow.
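
After installing, one quick way to confirm the build and see which devices (including any GPU) TensorFlow can use is the sketch below. Note that device_lib is an internal module rather than a guaranteed-stable API, so treat this as a convenience check:

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)                   # for example, '1.3.0'
print(device_lib.list_local_devices())  # lists the CPU and any visible GPUs
```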

Build TensorFlow from source code

Bazel is the official build tool for TensorFlow on Ubuntu and macOS. Building TensorFlow from source with Bazel on Windows, or with CMake on Windows, is still experimental.

To make PowerAI suitable for deep learning, IBM used NVIDIA NVLink interconnects to join two POWER8 processors and four NVIDIA Tesla P100 GPUs in its S822LC high-performance computing systems. Developers can build TensorFlow on IBM Power Systems running OpenPOWER Linux. For more information, see Deep Learning on OpenPOWER: Building TensorFlow on an OpenPOWER Linux System.

Many community- or vendor-supported builds are also available.

How does TensorFlow use hardware acceleration?

To support TensorFlow on a broader range of processor and non-processor architectures, Google offers vendors a new abstract interface for implementing hardware back ends for Accelerated Linear Algebra (XLA), a domain-specific compiler for linear algebra computations that can optimize TensorFlow calculations.
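
Because XLA is experimental, just-in-time (JIT) compilation is opt-in. The following minimal TensorFlow 1.x sketch shows one documented way to turn on global JIT compilation for a session; which operations actually get fused and compiled depends on the build and the hardware:

```python
import tensorflow as tf

# Opt in to XLA's experimental global JIT compilation for this session.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

with tf.Session(config=config) as sess:
    x = tf.random_normal([1000, 1000])
    y = tf.matmul(x, x)
    sess.run(y)  # eligible linear algebra ops may be compiled by XLA
```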

CPU

Currently, because XLA is still experimental, TensorFlow is supported, tested, and built only on x64 and ARM64 CPU architectures. On CPU architectures, TensorFlow uses vector-processing extensions to accelerate linear algebra calculations.

CPU-centric Intel HPC architectures, such as the Intel Xeon and Xeon Phi families, accelerate linear algebra calculations through deep neural network primitives in the Intel Math Kernel Library (Intel MKL). Intel also offers a prebuilt, optimized Python distribution with optimized linear algebra libraries.

Other vendors, such as Synopsys and CEVA, use mapping and profiling tools to convert TensorFlow graphs and generate optimized code that runs on their platforms. Developers who use this approach must migrate, profile, and tune the resulting code.

GPU

TensorFlow supports specific NVIDIA GPUs that are compatible with the relevant versions of the CUDA Toolkit and that meet the associated performance requirements. Although several community efforts have run TensorFlow on OpenCL 1.2-compatible GPUs (such as AMD’s), official OpenCL support is still a work in progress.

TPU

According to Google, TPU-based graph execution delivers 15 to 30 times the performance of CPU or GPU execution and is far more energy-efficient. Google designed the TPU as an external accelerator that fits into a serial ATA hard disk slot and connects to its host through a PCI Express Gen3 x16 interface for high-bandwidth throughput.

The Google TPU is a matrix processor rather than a vector processor, and it exploits the fact that neural networks do not need high-precision mathematics, relying instead on massively parallel, low-precision integer operations. Not surprisingly, the matrix unit (MXU) architecture packs 65,536 8-bit multipliers and pushes data through a systolic array, pulsing the way blood flows through a heart.

This design is a complex instruction set computing (CISC) architecture that, although single-threaded, lets one high-level instruction trigger multiple low-level operations on the MXU, potentially executing 128,000 operations per cycle without accessing memory.

As a result, TPUs can deliver significant gains in performance and energy efficiency compared with GPU arrays or multiple-instruction, multiple-data CPU HPC clusters. By evaluating, in each cycle, every node in the TensorFlow graph that is ready to execute, the TPU greatly reduces deep learning neural network training time relative to other architectures.

TensorFlow installation considerations

In general, TensorFlow runs on any platform with a 64-bit Python development environment. That environment is sufficient for training and testing most simple examples and tutorials. Most experts, however, strongly recommend an HPC platform for research or professional development.
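
For a sense of scale, the kind of simple example such an environment handles comfortably looks like the following sketch: a tiny linear regression written against the TensorFlow 1.x API, with hypothetical data:

```python
import tensorflow as tf

# Hypothetical data: learn y = 3x - 1 from four points.
x_data = [[0.0], [1.0], [2.0], [3.0]]
y_data = [[-1.0], [2.0], [5.0], [8.0]]

x = tf.placeholder(tf.float32, shape=[None, 1])
y = tf.placeholder(tf.float32, shape=[None, 1])

w = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
prediction = tf.matmul(x, w) + b

loss = tf.reduce_mean(tf.square(prediction - y))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(train_op, feed_dict={x: x_data, y: y_data})
    print(sess.run([w, b]))  # approaches [[3.0]] and [-1.0]
```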

Processor and memory performance requirements

Because deep learning is computationally heavy, the accepted standard for deep learning work is a fast multi-core CPU with vector extensions plus one or more high-end CUDA-capable GPUs. Most experts also recommend paying attention to CPU and GPU cache, because memory-transfer operations are energy-intensive and hurt performance.

There are two modes of deep learning performance that need to be considered:

  • Development mode. Typically, in this mode, training time, performance, and sample and data set sizes determine processing performance and memory requirements. These elements set the limits on achievable neural network computation performance and training time.
  • Application mode. Typically, in this mode, the processing performance and memory of a trained neural network determine real-time classification or inference performance. Convolutional networks need more low-precision compute power, whereas fully connected networks need more memory.

Virtual machine options

Virtual machines (VMs) for deep learning are currently best suited to CPU-centric hardware with multiple cores available. Because the host operating system controls the physical devices, such as CPUs and GPUs, implementing acceleration on a virtual machine is complicated. Two methods are known:

  • GPU passthrough:
      ◦ Runs only on Type 1 hypervisors, such as Citrix Xen, VMware ESXi, Kernel-based Virtual Machine, and IBM PowerVM.
      ◦ Passthrough overhead varies with the particular combination of CPU, chipset, hypervisor, and operating system; in general, overhead is much lower on the latest generation of hardware.
      ◦ A given hypervisor and operating system combination supports only specific NVIDIA GPU cards.
  • GPU virtualization:
      ◦ Supported by all of the major GPU vendors: NVIDIA (GRID), AMD (MxGPU), and Intel (GVT-g).
      ◦ The newest versions support OpenCL on specific newer GPUs (TensorFlow has no official OpenCL support).
      ◦ The latest version of NVIDIA GRID supports CUDA and OpenCL on specific newer GPUs.

Docker installation options

Running TensorFlow in a Docker container or a Kubernetes container-cluster system has many advantages. TensorFlow can distribute a graph as execution tasks across a cluster of TensorFlow servers that maps onto a container cluster. An added advantage of using Docker is that TensorFlow servers can access physical GPU cores (devices) and assign them specific tasks.
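
A rough sketch of what that mapping looks like in code follows. The host names are hypothetical and would correspond to containers in a Docker or Kubernetes deployment (TensorFlow 1.x distributed API):

```python
import tensorflow as tf

# Hypothetical cluster: one parameter server and two workers, where each
# address would map to a container in a Docker or Kubernetes deployment.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# Each container starts one server announcing its role in the cluster.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Variables can be pinned to the parameter server, while compute ops
# run on the workers that evaluate the graph.
with tf.device("/job:ps/task:0"):
    weights = tf.Variable(tf.zeros([784, 10]))
```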

Developers can also deploy TensorFlow in a Kubernetes container cluster on PowerAI OpenPOWER servers by installing a community-built Docker image; see, for example, TensorFlow training using PowerAI’s Kubernetes system on OpenPOWER servers.

Cloud installation options

Several options are available for running TensorFlow in the cloud:

  • Google Cloud TPU. For researchers, Google offers an alpha version of the TensorFlow Research Cloud, which provides online TPU instances.
  • Google Cloud. Google offers custom TensorFlow machine instances in specific regions with access to one, four, or eight NVIDIA GPU devices.
  • IBM Cloud Data Science and Management. IBM offers a Python environment with Jupyter Notebooks and Spark, with TensorFlow preinstalled.
  • Amazon Web Services (AWS). Amazon offers AWS Deep Learning Amazon Machine Images (AMIs) with optional NVIDIA GPU support that run on a wide variety of Amazon Elastic Compute Cloud instance types. TensorFlow, Keras, and other deep learning frameworks are preinstalled. The AMIs can support up to 64 CPU cores and up to 8 NVIDIA GPUs (K80).
  • Azure. You can set up TensorFlow on a Docker instance through the Azure Container Service or on an Ubuntu server. Azure machine instances can support up to 24 CPU cores and up to 4 NVIDIA GPUs (M60 or K80).
  • IBM Cloud Kubernetes clusters. Kubernetes clusters on IBM Cloud can run TensorFlow, and a community-built Docker image is available. GPU support is provided by PowerAI servers.

Which programming languages does TensorFlow support?

Although Google implemented the TensorFlow core in C++, its main programming language is Python, and the Python API is the most complete, the most robust, and the easiest to use. For more information, see the Python API documentation. The Python API also has the most extensive documentation and extensibility options, plus broad community support.

In addition to Python, TensorFlow supports APIs for the following languages, although their stability is not guaranteed:

  • C++. The TensorFlow C++ API is the next most robust API for building and executing dataflow graphs, and it also supports TensorFlow Serving. For more information about the C++ API, see the C++ API documentation. For more information about the C++ Serving API, see the TensorFlow Serving API reference.
  • Java. Although the Java API is experimental, the newly announced Android Oreo support for TensorFlow may make it more prominent. For more information, see tensorflow.org.
  • Go. This API is a highly experimental binding for the Google Go language. For more information, see Package tensorflow.

Third-party bindings

Google has defined a foreign function interface (FFI) to support additional language bindings. This interface exposes the TensorFlow C++ core functionality through a C API. Because the FFI is new, existing third-party bindings may or may not use it.

A survey of GitHub reveals community- or vendor-developed third-party TensorFlow bindings for the following languages: C#, Haskell, Julia, Node.js, PHP, R, Ruby, Rust, and Scala.

Android

There is now a new, optimized TensorFlow Lite Android library for running TensorFlow applications. For more information, see What’s New in Android: O Developer Preview 2 & More.

Simplify TensorFlow with Keras

Keras layers and models are fully compatible with pure TensorFlow tensors, which makes Keras a good model-level companion for TensorFlow. Developers can even use Keras alongside other TensorFlow libraries. For more information, see Keras as a simplified interface to TensorFlow: a tutorial.
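
In the spirit of that tutorial, here is a minimal sketch assuming standalone Keras with the TensorFlow backend; Keras layers consume and return plain TensorFlow tensors:

```python
import tensorflow as tf
from keras.layers import Dense  # standalone Keras, TensorFlow backend assumed

# A plain TensorFlow placeholder...
img = tf.placeholder(tf.float32, shape=[None, 784])

# ...passed straight through Keras layers. The results are ordinary
# TensorFlow tensors that other TensorFlow code can use directly.
hidden = Dense(128, activation='relu')(img)
preds = Dense(10, activation='softmax')(hidden)
```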

Conclusion

TensorFlow is just one of many open source software libraries for machine learning. However, judging by the number of GitHub projects based on it, it has become one of the most widely adopted deep learning frameworks. In this tutorial, you got an overview of TensorFlow, learned which platforms support it, and reviewed installation considerations.

If you’re ready to work through some examples, check out the developer code patterns on using machine learning to speed up training and on image-recognition training with PowerAI Notebooks.



The Gold Miner Translation Project is a community that translates high-quality technical articles from around the internet, sharing English-language articles on Juejin. Its content covers Android, iOS, React, front end, back end, product, design, and other fields. To read more high-quality translations, please follow the Gold Miner Translation Project, its official Weibo account, and its Zhihu column.