How to set up a deep learning Python 3 environment with an NVIDIA GPU
Dr. George Jen
If your computer has an NVIDIA GPU (chances are it does) and runs Ubuntu, the following steps set up:
· Latest version of CUDA
· NVIDIA GPU Driver
· cuDNN (NVIDIA CUDA Deep Neural Network library)
· Anaconda3 Python 3.6
· Tensorflow (for GPU)
· Keras (for GPU)
· CNTK (for GPU)
· Theano
· PyTorch
· along with other required software libraries
You then have an environment for running machine learning on the GPU, which can typically be more than 10 times faster than running on the CPU.
This demo environment runs Ubuntu 18.04. If you have a different version of Ubuntu, there may be slight differences in the setup; on other Linux distributions, such as Red Hat or CentOS, the procedure can differ more.
1. Check OS release:
cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04 LTS"
NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
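The same check can be scripted. The sketch below is a minimal illustration, not part of any tool used in this guide: the helper names `parse_release` and `is_ubuntu_1804` are hypothetical. It parses `KEY=value` lines like the ones printed above and reports whether they match the Ubuntu 18.04 release assumed here.

```python
def parse_release(text):
    """Parse KEY=value lines (as printed by `cat /etc/*release*`) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not KEY=value
        key, _, value = line.partition("=")
        info[key] = value.strip('"')  # values may or may not be quoted
    return info


def is_ubuntu_1804(info):
    """True when the parsed fields match the release used in this guide."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "18.04"


# A few lines copied from the output shown above.
sample = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="18.04"\nPRETTY_NAME="Ubuntu 18.04 LTS"'
release = parse_release(sample)
print(release["PRETTY_NAME"], "matches this guide:", is_ubuntu_1804(release))
```

On a real machine, the text passed to `parse_release` would come from reading `/etc/os-release` rather than from the sample string.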
2. Confirm that the NVIDIA GPU is plugged in, using the command lspci -v to show it:
lspci -v | grep VGA
02:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1) (prog-if 00 [VGA controller])
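For scripted setups, the same filter can be applied in Python. This is a rough sketch under the assumption that the GPU appears as a "VGA compatible controller" line, as it does in the output above; `find_nvidia_gpus` is a hypothetical helper name.

```python
def find_nvidia_gpus(lspci_output):
    """Return the descriptions of NVIDIA VGA controllers found in lspci output."""
    gpus = []
    for line in lspci_output.splitlines():
        if "VGA compatible controller" in line and "NVIDIA" in line:
            # Everything after "controller:" is the vendor and model description.
            gpus.append(line.split("controller:", 1)[1].strip())
    return gpus


# The line printed by `lspci -v | grep VGA` in this guide.
sample = ("02:00.0 VGA compatible controller: NVIDIA Corporation GK208 "
          "[GeForce GT 710B] (rev a1) (prog-if 00 [VGA controller])")
for gpu in find_nvidia_gpus(sample):
    print("Found:", gpu)
```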
3. Prerequisites:
Developer tools such as g++, gcc, git and make should already be installed. If not, install them:
sudo apt update
sudo apt upgrade
sudo apt install build-essential
sudo apt install git
4. Clean up/remove previous NVIDIA software including drivers:
sudo apt-get remove --purge 'nvidia*' cuda-drivers libcuda1-396 cuda-runtime-9-2 cuda-9.2 cuda-demo-suite-9-2 cuda
5. Install the latest CUDA toolkit:
For details about CUDA, see:
https://developer.nvidia.com/cuda-zone
sudo apt install nvidia-cuda-toolkit
Become root and make /usr/local/cuda/bin/nvcc available for the GPU driver install:
sudo -i
mkdir -p /usr/local/cuda/bin
ln -s /usr/bin/nvcc /usr/local/cuda/bin/nvcc
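To confirm which CUDA release the installed compiler reports, a short Python check like the one below may help. It is a sketch only: the helper names are hypothetical, and it assumes that `nvcc --version` prints a line containing "release X.Y", which is the usual format. The sample string in the code is illustrative rather than output captured on this machine.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Return the CUDA release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    # stdout=PIPE with universal_newlines keeps this compatible with Python 3.6.
    result = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return parse_nvcc_release(result.stdout)


# Illustrative input in the assumed format (not captured from this machine).
print(parse_nvcc_release("Cuda compilation tools, release 10.0, V10.0.130"))
print("nvcc on this machine:", installed_cuda_release())
```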
6. Install latest NVIDIA GPU driver:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
Reboot to load the newly installed NVIDIA driver. When the machine comes back, run the command "nvidia-smi" to verify that the NVIDIA GPU is enabled.
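The nvidia-smi check can also be run from Python, for example in a provisioning script. The sketch below uses nvidia-smi's `--query-gpu` and `--format=csv,noheader` options. The helper names and the sample row are hypothetical, and the function simply returns an empty list when nvidia-smi is missing or fails.

```python
import shutil
import subprocess

QUERY = ["--query-gpu=name,driver_version,memory.total", "--format=csv,noheader"]


def parse_gpu_rows(csv_text):
    """Turn 'name, driver, memory' CSV rows into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.rsplit(",", 2)]
        if len(fields) != 3:
            continue  # ignore lines that do not have the three expected columns
        rows.append(dict(zip(("name", "driver", "memory"), fields)))
    return rows


def query_gpus():
    """Return one dict per GPU reported by nvidia-smi, or [] if it is unavailable."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return []
    result = subprocess.run([smi] + QUERY, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


# Hypothetical row, used only to show the parsed shape.
print(parse_gpu_rows("GeForce GT 710, 410.48, 2001 MiB"))
print("GPUs on this machine:", query_gpus())
```

An empty list from `query_gpus()` after the reboot suggests that the driver did not load.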
Install the CUDA SDK kit, which includes sample source code (place the downloaded CUDA repo .deb file in ~/cuda first):
mkdir ~/cuda
cd ~/cuda
sudo dpkg -i ./cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt update
sudo apt install cuda
Become root:
sudo -i
cd /usr/local/cuda-10.0/samples
Build all the sample code:
make
Reboot. When the machine comes back, run deviceQuery:
/usr/local/cuda-10.0/samples/bin/x86_64/linux/release/deviceQuery
7. Install cuDNN (The NVIDIA CUDA Deep Neural Network library):
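For details, see:
https://developer.nvidia.com/cudnn
Download cuDNN. You need to register for an NVIDIA developer membership, which is free, before you can download the deb files below. Choose the cuDNN build that matches your installed CUDA version (the file names below are the CUDA 9.2 builds). Once downloaded, run the following to install: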
sudo dpkg -i libcudnn7_7.1.4.18-1+cuda9.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.1.4.18-1+cuda9.2_amd64.deb
sudo dpkg -i libcudnn7-doc_7.1.4.18-1+cuda9.2_amd64.deb
cp -r /usr/src/cudnn_samples_v7/ $HOME
cd $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
Test MNIST digit recognition using cuDNN:
./mnistCUDNN
cudnnGetVersion() : 7104 , CUDNN_VERSION from cudnn.h : 7104 (7.1.4)
Host compiler version : GCC 7.3.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 1 Capabilities 3.5, SmClock 954.0 Mhz, MemSize (Mb) 2000, MemClock 900.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.054144 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.056032 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.123040 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.474048 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 1.869824 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
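Two details of this output are worth decoding. The version integer 7104 corresponds to 7.1.4, and each "Resulting weights from Softmax" row holds ten scores, one per digit, where the index of the largest score is the predicted digit. The sketch below illustrates both. The helper names are hypothetical, and the version encoding (major * 1000 + minor * 100 + patch) is inferred from the 7104 / 7.1.4 pair printed above.

```python
def decode_cudnn_version(value):
    """Split an integer such as 7104 into (major, minor, patch)."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def classify(softmax_row):
    """Return the index of the largest weight, i.e. the predicted digit."""
    weights = [float(w) for w in softmax_row.split()]
    return max(range(len(weights)), key=weights.__getitem__)


# The three softmax rows copied from the output above.
rows = [
    "0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000",
    "0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000",
    "0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006",
]
print(decode_cudnn_version(7104))        # (7, 1, 4)
print([classify(row) for row in rows])   # [1, 3, 5]
```

The predicted digits match the "Result of classification: 1 3 5" line in the program output.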
8. Install Anaconda Python (3.6.5)
Download Anaconda Python 3 from:
https://www.anaconda.com/download/
Once downloaded, run the .sh file, then reload the shell configuration so that conda is on the PATH before updating it:
bash ./Anaconda3-5.2.0-Linux-x86_64.sh
source ~/.bashrc
conda update conda
9. Install TensorFlow, GPU version (TensorFlow™ is an open source software library for high-performance numerical computation), using the conda command line:
conda install -c anaconda tensorflow-gpu
Test it with:
python -c "import tensorflow"
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
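A successful import does not by itself show that TensorFlow sees the GPU. The sketch below is one way to list the devices TensorFlow reports, using `device_lib.list_local_devices()` from `tensorflow.python.client`. The wrapper name `tensorflow_devices` is hypothetical, and it returns None when TensorFlow is not installed so that the script still runs on a machine without it.

```python
import importlib.util


def tensorflow_devices():
    """Return [(device_type, name), ...] seen by TensorFlow, or None if it is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    from tensorflow.python.client import device_lib
    return [(d.device_type, d.name) for d in device_lib.list_local_devices()]


devices = tensorflow_devices()
if devices is None:
    print("TensorFlow is not installed in this environment")
else:
    has_gpu = any(kind == "GPU" for kind, _ in devices)
    print("Devices:", devices)
    print("GPU visible to TensorFlow:", has_gpu)
```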
10. Install Keras, GPU version (Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano):
conda install -c anaconda "keras-gpu"
Test it:
python -c "import keras"
..anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
11. Install Theano (a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays) using the conda command line:
conda install -c anaconda theano
Verify the installation with:
python -c "import theano"
12. Install PyTorch (tensors and dynamic neural networks in Python with strong GPU acceleration) using the conda command line:
conda install -c anaconda pytorch
Verify the installation with:
python -c "import torch"
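To go one step beyond the import test, the following sketch asks PyTorch whether it can use CUDA, using `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. The wrapper name `torch_gpu_status` is hypothetical, and the function reports "not installed" rather than failing when PyTorch is absent.

```python
import importlib.util


def torch_gpu_status():
    """Report whether PyTorch is installed and whether it can use a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "cuda": False, "device": None}
    import torch
    cuda = torch.cuda.is_available()
    # Only ask for the device name when CUDA is usable; device 0 is the first GPU.
    device = torch.cuda.get_device_name(0) if cuda else None
    return {"installed": True, "cuda": cuda, "device": device}


print(torch_gpu_status())
```

If "installed" is True but "cuda" is False, a CPU-only PyTorch build or a driver problem is a likely cause.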
13. Install CNTK (Microsoft Cognitive Toolkit), GPU version:
sudo apt-get install openmpi-bin
pip install cntk-gpu
Install the Open MPI library (Open Message Passing Interface) from source.
Get the installation sources and build them:
wget https://www.open-mpi.org/software/ompi/v1.10/downloads/openmpi-1.10.3.tar.gz
tar -xzvf ./openmpi-1.10.3.tar.gz
cd openmpi-1.10.3
./configure --prefix=/usr/local/mpi
make -j all
sudo make install
Verify the installation:
python -c "import cntk"
14. Other libraries and modules:
Protobuf (Protocol Buffers) is used for serialization. To install it, follow these steps:
cd ~
sudo apt-get install autoconf automake libtool curl make g++ unzip
wget https://github.com/google/protobuf/archive/v3.1.0.tar.gz
tar -xzf v3.1.0.tar.gz
cd protobuf-3.1.0
./autogen.sh
./configure CFLAGS=-fPIC CXXFLAGS=-fPIC --disable-shared --prefix=/usr/local/protobuf-3.1.0
make -j $(nproc)
sudo make install
15. Install zlib1g-dev:
sudo apt-get install zlib1g-dev
16. Install libzip:
wget http://nih.at/libzip/libzip-1.1.2.tar.gz
tar -xzvf ./libzip-1.1.2.tar.gz
cd libzip-1.1.2
./configure
make -j all
sudo make install
17. Boost Library
The Boost Library is a prerequisite for building the Microsoft Cognitive Toolkit.
sudo apt-get install libbz2-dev
sudo apt-get install python-dev
wget -q -O - https://sourceforge.net/projects/boost/files/boost/1.60.0/boost_1_60_0.tar.gz/download | tar -xzf -
cd boost_1_60_0
./bootstrap.sh --prefix=/usr/local/boost-1.60.0
sudo ./b2 -d0 -j"$(nproc)" install
18. Install h5py, which is required for saving models:
conda install -c anaconda h5py
19. Now you can test importing all these modules:
python -c "import torch"
python -c "import tensorflow"
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
python -c "import theano"
python -c "import keras"
.../anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
python -c "import cntk"
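The individual import commands above can be combined into one report. The sketch below is a minimal illustration with hypothetical names (`MODULES`, `check_modules`): it tries to import each module and records either its `__version__` attribute or the error that occurred.

```python
import importlib

# Module names imported in step 19, plus h5py from step 18.
MODULES = ["torch", "tensorflow", "theano", "keras", "cntk", "h5py"]


def check_modules(names):
    """Try to import each module; map its name to a version string or an error."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or a failure inside the library's own import
            report[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
        else:
            report[name] = getattr(module, "__version__", "unknown version")
    return report
```

Calling `check_modules(MODULES)` and printing each item gives a one-screen summary of the environment; any entry that starts with "FAILED" points at the step to revisit.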
Your machine is now a Python workhorse for machine learning on an NVIDIA GPU, which can have hundreds or thousands of CUDA cores depending on how much you pay for the GPU card(s).