How to set up a deep learning Python 3 environment with an NVIDIA GPU

Dr. George Jen

If your computer has an NVIDIA GPU (chances are it does) and runs Ubuntu, the following steps set up:

· Latest version of CUDA

· NVIDIA GPU Driver

· cuDNN (NVIDIA CUDA Deep Neural Network library)

· Anaconda3 Python 3.6

· Tensorflow (for GPU)

· Keras (for GPU)

· CNTK (for GPU)

· Theano

· PyTorch

· along with other required software libraries

You then have an environment for machine learning on the GPU, which can typically be more than 10 times faster than running on the CPU.

This demo environment runs Ubuntu 18.04. If you have a different version of Ubuntu, there may be slight differences in the setup; on other Linux distributions, such as Red Hat or CentOS, the procedure can differ more.

1. Check OS release:

cat /etc/*release*

DISTRIB_ID=Ubuntu

DISTRIB_RELEASE=18.04

DISTRIB_CODENAME=bionic

DISTRIB_DESCRIPTION="Ubuntu 18.04 LTS"

NAME="Ubuntu"

VERSION="18.04 LTS (Bionic Beaver)"

ID=ubuntu

ID_LIKE=debian

PRETTY_NAME="Ubuntu 18.04 LTS"

VERSION_ID="18.04"

HOME_URL="https://www.ubuntu.com/"

SUPPORT_URL="https://help.ubuntu.com/"

BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"

PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"

VERSION_CODENAME=bionic

UBUNTU_CODENAME=bionic
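If you would rather check the release from Python than read the output by eye, the KEY=value lines above can be parsed as in the sketch below. The function names parse_os_release and is_ubuntu_1804 are made up for this illustration, and reading /etc/os-release directly is an assumption about where your system keeps these fields.

```python
def parse_os_release(text):
    """Parse KEY=value lines (as printed by cat /etc/*release*) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')  # drop surrounding double quotes
    return info


def is_ubuntu_1804(info):
    """True when the parsed fields describe Ubuntu 18.04."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "18.04"


if __name__ == "__main__":
    # Assumption: /etc/os-release exists, as it does on Ubuntu 18.04.
    try:
        with open("/etc/os-release") as f:
            info = parse_os_release(f.read())
        print(info.get("PRETTY_NAME"), "-> Ubuntu 18.04:", is_ubuntu_1804(info))
    except OSError as err:
        print("Could not read /etc/os-release:", err)
```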

2. Confirm that the NVIDIA GPU is plugged in, using the command lspci -v to show it:

lspci -v | grep VGA

02:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1) (prog-if 00 [VGA controller])
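The same check can be scripted. The sketch below filters lspci output for NVIDIA display controllers; find_nvidia_controllers is a hypothetical helper name, and also matching "3D controller" lines is an assumption meant to cover cards that lspci does not report as VGA.

```python
import shutil
import subprocess


def find_nvidia_controllers(lspci_output):
    """Return lspci lines that describe an NVIDIA VGA or 3D controller."""
    hits = []
    for line in lspci_output.splitlines():
        is_display = "VGA compatible controller" in line or "3D controller" in line
        if is_display and "nvidia" in line.lower():
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    if shutil.which("lspci") is None:
        print("lspci not found; it is provided by the pciutils package")
    else:
        out = subprocess.run(["lspci"], stdout=subprocess.PIPE,
                             universal_newlines=True).stdout
        found = find_nvidia_controllers(out)
        print("\n".join(found) if found else "No NVIDIA controller found")
```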

3. Prerequisites:

Developer tools such as g++, gcc, git and make should already be installed. If not, you can install them:

sudo apt update

sudo apt upgrade

sudo apt install build-essential

sudo apt install git
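To see which of these tools are still missing before moving on, a small check like the following may help. It is only a sketch: the tool list mirrors the names mentioned in this step, and missing_tools is a name invented for the example.

```python
import shutil

# Tools named in this step; adjust the list to your needs.
REQUIRED_TOOLS = ["gcc", "g++", "make", "git"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All developer tools found")
```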

4. Clean up/remove previous NVIDIA software including drivers:

sudo apt-get remove --purge 'nvidia*' cuda-drivers libcuda1-396 cuda-runtime-9-2 cuda-9.2 cuda-demo-suite-9-2 cuda
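After the purge, you may want to confirm that nothing is left behind. The sketch below asks dpkg-query for the status and name of every package and keeps the fully installed ones whose names start with nvidia, cuda or libcuda. The prefix list and the helper name installed_matches are assumptions made for this example, not part of the original procedure.

```python
import shutil
import subprocess


def installed_matches(listing, prefixes=("nvidia", "cuda", "libcuda")):
    """From 'status package' lines, return installed packages matching a prefix."""
    names = []
    for line in listing.splitlines():
        parts = line.split()
        # "ii" is dpkg's abbreviation for a package that is fully installed.
        if len(parts) == 2 and parts[0] == "ii" and parts[1].startswith(prefixes):
            names.append(parts[1])
    return names


if __name__ == "__main__":
    if shutil.which("dpkg-query") is None:
        print("dpkg-query not found; is this a Debian-based system?")
    else:
        out = subprocess.run(
            ["dpkg-query", "-W", "-f", "${db:Status-Abbrev} ${Package}\n"],
            stdout=subprocess.PIPE, universal_newlines=True).stdout
        leftovers = installed_matches(out)
        print("Still installed:", leftovers if leftovers else "nothing")
```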

5. Install the latest CUDA toolkit:

For details about CUDA, see:

https://developer.nvidia.com/cuda-zone

sudo apt install nvidia-cuda-toolkit

Become root to make /usr/local/cuda/bin/nvcc available for the GPU driver install:

sudo -i

cd /usr/local

mkdir cuda

cd cuda

mkdir bin

cd bin

ln -s /usr/bin/nvcc nvcc
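To confirm that nvcc is reachable and to see which CUDA release it belongs to, you can parse the output of nvcc --version. The sketch below assumes the usual "release X.Y" wording in that output; parse_nvcc_release is a name chosen for this example.

```python
import os
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract (major, minor) from text containing 'release X.Y', else None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Prefer the path set up in this step, then fall back to PATH.
    nvcc = "/usr/local/cuda/bin/nvcc"
    if not os.path.exists(nvcc):
        nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found")
    else:
        out = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE,
                             universal_newlines=True).stdout
        print("nvcc at", nvcc, "reports CUDA release", parse_nvcc_release(out))
```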

6. Install the latest NVIDIA GPU driver:

sudo add-apt-repository ppa:graphics-drivers/ppa

sudo apt update

ubuntu-drivers devices

sudo ubuntu-drivers autoinstall

Reboot to load the installed NVIDIA driver. When the machine comes back up, run the command "nvidia-smi" to verify that the NVIDIA GPU is enabled.
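If you want to do this verification from a script, nvidia-smi can print selected fields as CSV. The following is a minimal sketch that asks for the GPU name and driver version; parse_gpu_csv is a hypothetical helper, and the script simply reports a missing nvidia-smi rather than failing.

```python
import shutil
import subprocess


def parse_gpu_csv(csv_text):
    """Turn 'name, driver_version' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 2 and all(fields):
            gpus.append({"name": fields[0], "driver_version": fields[1]})
    return gpus


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; the driver may not be installed")
    else:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version",
             "--format=csv,noheader"],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True)
        gpus = parse_gpu_csv(result.stdout)
        print(gpus if gpus else "No GPU reported: " + result.stderr.strip())
```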

Install the CUDA SDK, which includes sample source code:

mkdir ~/cuda

cd ~/cuda

wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64

sudo dpkg -i ./cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64

sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub

sudo apt update

sudo apt install cuda

Become root:

sudo -i

cd /usr/local/cuda-10.0/samples

Build all the samples:

make

Reboot. After the machine comes back up, run deviceQuery:

/usr/local/cuda-10.0/samples/bin/x86_64/linux/release/deviceQuery
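deviceQuery ends its report with a "Result = PASS" line when it can talk to the GPU. A scripted check might look like the sketch below, which assumes the binary was built at the path used in this step; device_query_passed is a name invented for the example.

```python
import os
import subprocess

# Path from this step; change it if your samples were built elsewhere.
DEVICE_QUERY = "/usr/local/cuda-10.0/samples/bin/x86_64/linux/release/deviceQuery"


def device_query_passed(output):
    """True when the deviceQuery output contains a 'Result = PASS' line."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


if __name__ == "__main__":
    if not (os.path.isfile(DEVICE_QUERY) and os.access(DEVICE_QUERY, os.X_OK)):
        print("deviceQuery not found at", DEVICE_QUERY)
    else:
        out = subprocess.run([DEVICE_QUERY], stdout=subprocess.PIPE,
                             universal_newlines=True).stdout
        print("deviceQuery passed:", device_query_passed(out))
```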

7. Install cuDNN (The NVIDIA CUDA Deep Neural Network library):

For details, see:

https://developer.nvidia.com/cudnn

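To download cuDNN, you first need to register for an NVIDIA developer membership, which is free. Then download the deb files below and run the following commands to install them. The file names shown here are for cuDNN 7.1.4 built against CUDA 9.2; pick the packages that match your installed CUDA version.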

sudo dpkg -i libcudnn7_7.1.4.18-1+cuda9.2_amd64.deb

sudo dpkg -i libcudnn7-dev_7.1.4.18-1+cuda9.2_amd64.deb

sudo dpkg -i libcudnn7-doc_7.1.4.18-1+cuda9.2_amd64.deb

cp -r /usr/src/cudnn_samples_v7/ $HOME

cd $HOME/cudnn_samples_v7/mnistCUDNN

make clean && make

Test MNIST digit recognition using cuDNN:

./mnistCUDNN

cudnnGetVersion() : 7104 , CUDNN_VERSION from cudnn.h : 7104 (7.1.4)

Host compiler version : GCC 7.3.0

There are 1 CUDA capable devices on your machine :

device 0 : sms 1 Capabilities 3.5, SmClock 954.0 Mhz, MemSize (Mb) 2000, MemClock 900.0 Mhz, Ecc=0, boardGroupID=0

Using device 0

Testing single precision

Loading image data/one_28x28.pgm

Performing forward propagation ...

Testing cudnnGetConvolutionForwardAlgorithm ...

Fastest algorithm is Algo 2

Testing cudnnFindConvolutionForwardAlgorithm ...

^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.054144 time requiring 100 memory

^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.056032 time requiring 0 memory

^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.123040 time requiring 57600 memory

^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.474048 time requiring 207360 memory

^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 1.869824 time requiring 2057744 memory

Resulting weights from Softmax:

0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000

Loading image data/three_28x28.pgm

Performing forward propagation ...

Resulting weights from Softmax:

0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000

Loading image data/five_28x28.pgm

Performing forward propagation ...

Resulting weights from Softmax:

0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!
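The number 7104 printed by cudnnGetVersion() above is the version 7.1.4 packed as major * 1000 + minor * 100 + patch level. As a sketch, the same value can be recovered from the cuDNN header. The header location /usr/include/cudnn.h is an assumption about where the deb packages put it, and both function names are made up for this example.

```python
import re

# Assumption: the -dev package installs the header here; adjust if needed.
CUDNN_HEADER = "/usr/include/cudnn.h"


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the CUDNN_* defines, or None."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


def cudnn_version_number(version):
    """Pack (major, minor, patch) the way cudnnGetVersion() reports it."""
    major, minor, patch = version
    return major * 1000 + minor * 100 + patch


if __name__ == "__main__":
    try:
        with open(CUDNN_HEADER) as f:
            version = parse_cudnn_version(f.read())
        print("cuDNN version:", version)
    except OSError as err:
        print("Could not read", CUDNN_HEADER, "-", err)
```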

8. Install Anaconda Python (3.6.5)

Download Anaconda Python 3 from:

https://www.anaconda.com/download/

Once downloaded, run the .sh installer, reload your shell settings so that conda is on the PATH, then update conda:

bash ./Anaconda3-5.2.0-Linux-x86_64.sh

source ~/.bashrc

conda update conda
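Before installing the frameworks, it is worth confirming that the python on your PATH is the Anaconda one. The sketch below prints the interpreter path and version; the "looks like Anaconda" test is only a heuristic (it searches the path and version string for "conda" or "anaconda"), and interpreter_summary is a name invented here.

```python
import sys


def interpreter_summary():
    """Describe the running interpreter and guess whether it is Anaconda's."""
    executable = sys.executable or ""
    looks_like_anaconda = ("conda" in executable.lower()
                           or "anaconda" in sys.version.lower())
    return {
        "executable": executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "looks_like_anaconda": looks_like_anaconda,  # heuristic only
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(key + ":", value)
```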

9. Install TensorFlow (GPU version) using the conda command line. (TensorFlow™ is an open source software library for high performance numerical computation.)

conda install -c anaconda tensorflow-gpu

Test it with:

python -c "import tensorflow"

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
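A successful import does not by itself prove that TensorFlow will use the GPU. As a hedged sketch, the function below lists the GPU devices TensorFlow can see. It uses tf.config.list_physical_devices when the installed version offers it and falls back to device_lib.list_local_devices on older releases; tensorflow_gpu_devices is a name made up for this example.

```python
def tensorflow_gpu_devices():
    """Return names of GPUs visible to TensorFlow, or None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # Newer TensorFlow releases.
        return [device.name for device in config.list_physical_devices("GPU")]
    # Older releases: enumerate all local devices and keep the GPUs.
    from tensorflow.python.client import device_lib
    return [device.name for device in device_lib.list_local_devices()
            if device.device_type == "GPU"]


if __name__ == "__main__":
    devices = tensorflow_gpu_devices()
    if devices is None:
        print("TensorFlow is not installed in this environment")
    elif devices:
        print("TensorFlow sees GPU(s):", devices)
    else:
        print("TensorFlow is installed but sees no GPU")
```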

10. Install Keras (GPU version). (Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.)

conda install -c anaconda "keras-gpu"

Test it:

python -c "import keras"

..anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.

from ._conv import register_converters as _register_converters

Using TensorFlow backend.

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
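The "Using TensorFlow backend." line above tells you which backend Keras picked. If you want that answer programmatically, something like the sketch below should work; keras_backend_name is a hypothetical helper, and it returns None when Keras cannot be imported.

```python
def keras_backend_name():
    """Return the name of the active Keras backend, or None if unavailable."""
    try:
        from keras import backend as K
    except ImportError:
        return None
    return K.backend()


if __name__ == "__main__":
    name = keras_backend_name()
    print("Keras backend:", name if name else "Keras is not installed")
```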

11. Install Theano (a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays) using the conda command line:

conda install -c anaconda theano

Verify installation by:

python -c "import theano"

12. Install PyTorch (tensors and dynamic neural networks in Python with strong GPU acceleration) using the conda command line:

conda install -c anaconda pytorch

Verify installation by:

python -c "import torch"
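Importing torch only shows that the package is present. To check that PyTorch can actually reach the GPU, a sketch along these lines may be useful; torch_cuda_summary is a name chosen for this example, and it returns None when PyTorch is not installed.

```python
def torch_cuda_summary():
    """Summarize PyTorch's view of CUDA, or return None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    available = torch.cuda.is_available()
    names = []
    if available:
        names = [torch.cuda.get_device_name(i)
                 for i in range(torch.cuda.device_count())]
    return {
        "version": torch.__version__,
        "cuda_available": available,
        "devices": names,
    }


if __name__ == "__main__":
    summary = torch_cuda_summary()
    print(summary if summary else "PyTorch is not installed in this environment")
```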

13. Install CNTK (Microsoft Cognitive Toolkit), GPU version:

sudo apt-get install openmpi-bin

pip install cntk-gpu

Build and install the Open MPI library (Open Message Passing Interface library) from source:

Get the installation sources:

wget https://www.open-mpi.org/software/ompi/v1.10/downloads/openmpi-1.10.3.tar.gz

tar -xzvf ./openmpi-1.10.3.tar.gz

cd openmpi-1.10.3

./configure --prefix=/usr/local/mpi

make -j all

sudo make install

Verify installation:

python -c "import cntk"

14. Other libraries and modules:

Protobuf (Protocol Buffers) is used for serialization. To install it, follow these steps:

cd ~

sudo apt-get install autoconf automake libtool curl make g++ unzip

wget https://github.com/google/protobuf/archive/v3.1.0.tar.gz

tar -xzf v3.1.0.tar.gz

cd protobuf-3.1.0

./autogen.sh

./configure CFLAGS=-fPIC CXXFLAGS=-fPIC --disable-shared --prefix=/usr/local/protobuf-3.1.0

make -j $(nproc)

sudo make install

15. Install zlib1g-dev:

sudo apt-get install zlib1g-dev

16. Install libzip:

wget http://nih.at/libzip/libzip-1.1.2.tar.gz

tar -xzvf ./libzip-1.1.2.tar.gz

cd libzip-1.1.2

./configure

make -j all

sudo make install

17. Boost Library

The Boost Library is a prerequisite for building the Microsoft Cognitive Toolkit.

sudo apt-get install libbz2-dev

sudo apt-get install python-dev

wget -q -O - https://sourceforge.net/projects/boost/files/boost/1.60.0/boost_1_60_0.tar.gz/download | tar -xzf -

cd boost_1_60_0

./bootstrap.sh --prefix=/usr/local/boost-1.60.0

sudo ./b2 -d0 -j"$(nproc)" install
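Several of the source builds above install into their own prefix under /usr/local. The sketch below checks that those directories exist after the builds finish. The prefixes are the ones passed to configure and bootstrap.sh in this guide (libzip is left out because it was configured without a prefix); missing_prefixes is a name invented for the example.

```python
import os

# Install prefixes used by the source builds in this guide.
PREFIXES = {
    "Open MPI": "/usr/local/mpi",
    "Protobuf": "/usr/local/protobuf-3.1.0",
    "Boost": "/usr/local/boost-1.60.0",
}


def missing_prefixes(prefixes=PREFIXES):
    """Return the names whose install directory does not exist, sorted."""
    return sorted(name for name, path in prefixes.items()
                  if not os.path.isdir(path))


if __name__ == "__main__":
    missing = missing_prefixes()
    print("Missing installs:", missing if missing else "none")
```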

18. Install h5py, which is required for saving models:

conda install -c anaconda h5py
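To confirm that h5py works, you can write a small array to an HDF5 file and read it back, which is roughly what happens when model weights are saved. This is only a sketch: the dataset name dense/kernel and the temporary file are placeholders chosen for the example, and h5py_roundtrip is a made-up function name.

```python
import os
import tempfile


def h5py_roundtrip():
    """Write an array to HDF5 and read it back; None if h5py/numpy are missing."""
    try:
        import h5py
        import numpy as np
    except ImportError:
        return None
    weights = np.arange(6, dtype="float32").reshape(2, 3)
    path = os.path.join(tempfile.mkdtemp(), "weights.h5")
    with h5py.File(path, "w") as f:
        # Intermediate groups ("dense") are created automatically.
        f.create_dataset("dense/kernel", data=weights)
    with h5py.File(path, "r") as f:
        restored = f["dense/kernel"][...]
    return bool((restored == weights).all())


if __name__ == "__main__":
    result = h5py_roundtrip()
    print("h5py round trip OK" if result else "h5py check did not pass: %r" % result)
```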

19. Now you can test importing all of these modules:

python -c "import torch"

python -c "import tensorflow"

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally

python -c "import theano"

python -c "import keras"

.../anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.

from ._conv import register_converters as _register_converters

Using TensorFlow backend.

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally

python -c "import cntk"
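Instead of running each import by hand, the checks above can be collected into one script. The sketch below tries every module in turn and records either its version or the error that stopped the import; the module list follows this guide, and import_report is a name chosen for the example.

```python
import importlib

# Modules installed in this guide.
MODULES = ["torch", "tensorflow", "theano", "keras", "cntk", "h5py"]


def import_report(modules=MODULES):
    """Map each module name to its version string or a failure message."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown version")
        except Exception as err:  # a broken install can raise more than ImportError
            report[name] = "FAILED: %s" % err
    return report


if __name__ == "__main__":
    for name, status in import_report().items():
        print("%-12s %s" % (name, status))
```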

Your machine is now a Python workhorse for machine learning on an NVIDIA GPU, which can have hundreds or thousands of CUDA cores depending on how much you pay for the GPU card(s).