Basic environment

  1. First, find out which operating system and kernel version your server is running:
    1. Run cat /etc/issue or cat /etc/lsb-release to view the operating system version, and uname -r to view the kernel version.

    2. View the graphics card information of the server:

      1. lspci | grep -i nvidia: list all NVIDIA graphics cards.
      2. nvidia-smi: use this command if a graphics driver is already installed.
      3. cat /proc/driver/nvidia/version: view the version of the installed graphics driver.

      Install the graphics card driver based on the OS version.
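
The checks above can be collected into one small script. This is only a minimal sketch: the function names show_os_release and show_gpu_info are made up for illustration, and each tool is skipped when it is not present, so the script should also run on a machine without an NVIDIA card or driver.

```shell
#!/bin/sh
# Sketch: gather the OS and NVIDIA details described above.
# show_os_release and show_gpu_info are hypothetical helper names.

show_os_release() {
    # Print whichever release files exist on this distribution.
    for f in /etc/issue /etc/lsb-release; do
        if [ -r "$f" ]; then
            cat "$f"
        fi
    done
    # The kernel release is reported by uname.
    echo "kernel: $(uname -r)"
}

show_gpu_info() {
    if command -v lspci >/dev/null 2>&1; then
        lspci | grep -i nvidia || echo "lspci lists no NVIDIA device"
    else
        echo "lspci not found (on Ubuntu it is in the pciutils package)"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi || echo "nvidia-smi failed; the driver may not be loaded"
    else
        echo "nvidia-smi not found; no NVIDIA driver seems to be installed"
    fi
    if [ -r /proc/driver/nvidia/version ]; then
        cat /proc/driver/nvidia/version
    fi
    return 0
}

show_os_release
show_gpu_info
```

Saving the output of this script before changing anything makes it easier to compare the state of the machine before and after the driver installation.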

Multiple versions of GCC and g++

GCC and g++ are the compilers that many driver installation steps depend on. A compiler version that does not match is a frequent cause of puzzling installation errors. In my experience, even CUDA 10.1 still works with GCC 4.8, so a version between 4.8 and 5.4 is the safest choice for compatibility. The steps for installing multiple versions of GCC and g++ are as follows:

  1. Check your current gcc and g++ versions: gcc --version and g++ --version
  2. To install a new version or multiple versions of GCC and g++:
    1. sudo add-apt-repository ppa:ubuntu-toolchain-r/test: start by adding this repository so that the compiler packages are easy to install and update.
    2. sudo apt-get update: refresh the package lists so that the packages from the new repository can be found.
    3. sudo apt-get install gcc-4.9 and sudo apt-get install g++-4.9: install the corresponding versions of gcc and g++. Change the version number to the one you need.
    4. sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20

      sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20

      These two lines register the newly installed gcc and g++ as selectable alternatives for /usr/bin/gcc and /usr/bin/g++. Repeat them for every version you install, so that each new gcc and g++ version is registered with the system.
    5. update-alternatives --config gcc

      update-alternatives --config g++: select the active version. After entering the command, complete the selection by following the prompts. If permission is denied, prefix the command with sudo.
    6. Generally, we use a compiler version between 4.8 and 5.4. If errors persist, the first thing to try is uninstalling the graphics driver and reinstalling it. This method has the highest success rate.
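
To see quickly whether the active compilers fall inside the 4.8 to 5.4 range recommended above, something like the following could be used. It is a rough sketch: version_in_range is a hypothetical helper written for this example, and it compares only the major and minor numbers.

```shell
#!/bin/sh
# Sketch: check whether a compiler version falls in the 4.8-5.4 range suggested above.
# version_in_range is a hypothetical helper; it compares major.minor only.

version_in_range() {
    # $1 = version to test, $2 = lowest allowed, $3 = highest allowed
    awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        split(v, a, "."); split(lo, l, "."); split(hi, h, ".")
        cur = a[1] * 1000 + a[2]
        exit !(cur >= l[1] * 1000 + l[2] && cur <= h[1] * 1000 + h[2])
    }'
}

for cc in gcc g++; do
    if command -v "$cc" >/dev/null 2>&1; then
        ver=$("$cc" -dumpversion)
        if version_in_range "$ver" 4.8 5.4; then
            echo "$cc $ver is within 4.8-5.4"
        else
            echo "$cc $ver is outside 4.8-5.4; consider switching with update-alternatives"
        fi
    else
        echo "$cc is not installed"
    fi
done
```

Running it again after update-alternatives --config shows whether the switch took effect.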

Driver installation

First remove any NVIDIA driver that is already installed with sudo apt-get remove --purge 'nvidia*'. Then disable the nouveau driver: open the blacklist file with sudo vim /etc/modprobe.d/blacklist.conf and append the following lines to the end of the file:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

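After saving the file, regenerate the initramfs with sudo update-initramfs -u and reboot so that the change takes effect. Then check whether it worked: if lsmod | grep nouveau prints nothing, nouveau has been disabled.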
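
The edit to the blacklist file can also be scripted instead of done by hand in vim. The following is a minimal sketch under that assumption: add_nouveau_blacklist is a hypothetical helper, and the demonstration writes to a scratch file rather than to /etc/modprobe.d/blacklist.conf.

```shell
#!/bin/sh
# Sketch: append the nouveau blacklist entries to a modprobe config file, once.
# add_nouveau_blacklist is a hypothetical helper name.

add_nouveau_blacklist() {
    conf="$1"
    # Skip the append when the file already blacklists nouveau.
    if grep -qs '^blacklist nouveau$' "$conf"; then
        echo "nouveau is already blacklisted in $conf"
        return 0
    fi
    cat >> "$conf" <<'EOF'
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
}

# Demonstration on a scratch file, so nothing under /etc is touched.
scratch=$(mktemp)
add_nouveau_blacklist "$scratch"
add_nouveau_blacklist "$scratch"   # the second call changes nothing
cat "$scratch"
rm -f "$scratch"
```

On a real system the function would be run as root with /etc/modprobe.d/blacklist.conf as its argument. The guard against duplicates makes it safe to run the script more than once.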

Installing the graphics card driver itself is relatively simple. Download the driver that matches your card directly from the official NVIDIA website, then run the installer:

sudo sh ./NVIDIA-Linux-x86_64-430.34.run

When the installer finishes, verify the driver with:

nvidia-smi

CUDA installation and CUDNN installation

CUDA is NVIDIA's platform for GPU computing. Install the CUDA version that your own work requires.

  1. Download the CUDA version you need. The page that lists all CUDA versions is hard to find, so it is worth bookmarking.

    I downloaded version 10.0 myself. Upload the downloaded file to the server and install it:

    sudo sh cuda_10.0.130_410.48_linux.run

    If anything in the installer is unclear, a quick web search will help.

    Options during installation:

    1. NVIDIA Accelerated Graphics Driver: answer n, because the driver is already installed.
    2. Answer yes to everything else.
    3. Symbolic link. One of the prompts offers to create a symbolic link: it creates /usr/local/cuda, which points to the real installation directory, /usr/local/cuda-10.0. This symbolic link is what makes it possible to manage multiple CUDA versions.

      In /usr/local, cuda is the symbolic link and the cuda-xx.x directories are the installed CUDA versions. Repointing the link changes the active CUDA version.

After the installation is complete, run nvcc -V to check it. If the command is not found, configure the environment variables. Here they are configured against the cuda symbolic link described above: open ~/.bashrc (vim ~/.bashrc) and add the following lines:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=/usr/local/cuda

Then reload the configuration with source ~/.bashrc and run nvidia-smi to confirm that the driver still responds.
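
Whether the new variables are actually in effect can be checked with a short script. This is a sketch only: list_contains is a hypothetical helper, and /usr/local/cuda is assumed to be the symbolic link configured above.

```shell
#!/bin/sh
# Sketch: verify that the CUDA directories from ~/.bashrc are on the search paths.
# list_contains is a hypothetical helper; /usr/local/cuda is the symbolic link used above.

list_contains() {
    # $1 = colon-separated list, $2 = directory to look for
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if list_contains "${PATH:-}" /usr/local/cuda/bin; then
    echo "PATH contains /usr/local/cuda/bin"
else
    echo "PATH is missing /usr/local/cuda/bin"
fi

if list_contains "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64; then
    echo "LD_LIBRARY_PATH contains /usr/local/cuda/lib64"
else
    echo "LD_LIBRARY_PATH is missing /usr/local/cuda/lib64"
fi

if command -v nvcc >/dev/null 2>&1; then
    nvcc -V || echo "nvcc was found but failed to run"
else
    echo "nvcc not found on PATH"
fi
```

If either path is reported missing in a new terminal, the export lines were probably not saved in ~/.bashrc or the file was not reloaded.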

  1. Managing multiple CUDA versions. This is actually very simple: it comes down to managing the symbolic link. For example, when I need a different CUDA version, my environment variables already point at the symbolic link, so I only have to delete the old link and create a new one to switch between CUDA versions.
sudo rm -rf /usr/local/cuda  # delete the old symbolic link
sudo ln -s /usr/local/cuda-9.1 /usr/local/cuda  # create the new symbolic link; the first path is the installation directory of the required CUDA version
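
The two commands above can be wrapped into one helper that refuses to switch to a version that is not installed. This is a minimal sketch: switch_cuda is a hypothetical name, and the demonstration runs in a scratch directory that mimics /usr/local instead of touching the real one.

```shell
#!/bin/sh
# Sketch: repoint the cuda symbolic link to another installed version.
# switch_cuda is a hypothetical helper; $1 is the install prefix, $2 the version.

switch_cuda() {
    prefix="$1"
    version="$2"
    target="$prefix/cuda-$version"
    if [ ! -d "$target" ]; then
        echo "no CUDA installation at $target" >&2
        return 1
    fi
    # -s: symbolic, -f: replace an existing link, -n: do not follow the old link
    ln -sfn "$target" "$prefix/cuda"
}

# Demonstration in a scratch directory that mimics /usr/local.
demo=$(mktemp -d)
mkdir "$demo/cuda-9.1" "$demo/cuda-10.0"
switch_cuda "$demo" 10.0
readlink "$demo/cuda"
switch_cuda "$demo" 9.1
readlink "$demo/cuda"
rm -rf "$demo"
```

On a real machine the prefix would be /usr/local and the function would need root privileges. Because the environment variables point at the link, no change to ~/.bashrc is needed after switching.
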
  1. Install the matching cuDNN. Choose the cuDNN release that corresponds to your CUDA version.

    The deb installation requires downloading three deb packages for your version of the operating system, so it is simpler to download the single library archive instead. The downloaded file is named cudnn-10.0-linux-x64-v7.3.0.29.solitairetheme8, but it is really a tgz archive: change the suffix to .tgz and decompress it.

cp cudnn-10.0-linux-x64-v7.3.0.29.solitairetheme8 cudnn-10.0-linux-x64-v7.3.0.29.tgz  # change the suffix
tar -xvf cudnn-10.0-linux-x64-v7.3.0.29.tgz  # decompress

After unpacking, you get a cuda folder. Run the following commands to install cuDNN. (cuDNN should be copied directly into the real installation directory of the matching CUDA version, so that the cuDNN files are found whenever the symbolic link points to that version.)

sudo cp cuda/include/cudnn.h    /usr/local/cuda-xx.x/include # Fill in the CUDA path for the corresponding version
sudo cp cuda/lib64/libcudnn*    /usr/local/cuda-xx.x/lib64   # Fill in the CUDA path for the corresponding version
sudo chmod a+r /usr/local/cuda-xx.x/include/cudnn.h   /usr/local/cuda-xx.x/lib64/libcudnn*

If you prefer the deb installation, use the three deb packages mentioned above instead. If something still goes wrong, it is recommended to uninstall the graphics driver and try again.
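
To confirm which cuDNN version ended up in the CUDA directory, the version macros in the copied header can be read back. This is a sketch under one assumption: the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn.h, as they are in the 7.x release used here. cudnn_version is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read the cuDNN version from the header copied in the step above.
# cudnn_version is a hypothetical helper; it assumes the version macros live in cudnn.h.

cudnn_version() {
    header="$1"
    if [ ! -r "$header" ]; then
        echo "cannot read $header" >&2
        return 1
    fi
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Check the header behind the cuda symbolic link, if it exists.
header=/usr/local/cuda/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
else
    echo "$header not found; copy the cuDNN files first"
fi
```

Running it after switching the symbolic link shows whether the CUDA version now in use has its own copy of cuDNN.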

Anaconda overview and installation

Anaconda is a very efficient solution for Python environment management. Download the installer for the version you need from the repository and run it.

bash Anaconda3-5.0.1-Linux-x86_64.sh
source ~/.bashrc

  1. Create a new environment for yourself: conda create -n your_name python=your_version
  2. Activate the new environment: source activate your_name

PyTorch installation

Go to the PyTorch website and install the PyTorch build that matches your CUDA version. Then verify the installation:

python  # enter the Python interpreter
import torch  # import the installed PyTorch package
torch.cuda.is_available()  # check whether CUDA is usable

If torch.cuda.is_available() returns False, there is a problem with the driver or with the CUDA installation, most likely the driver. To solve the problem, uninstall and reinstall the graphics card driver.
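
When the check fails, it helps to see a little more than True or False. The following sketch prints the PyTorch version, the CUDA version the build was compiled for, and the number of visible GPUs in one step. check_torch_cuda is a hypothetical helper, and it assumes the interpreter of the active conda environment is available as python3.

```shell
#!/bin/sh
# Sketch: report the PyTorch, CUDA, and GPU status in one step.
# check_torch_cuda is a hypothetical helper name; python3 is assumed to be the
# interpreter of the active conda environment.

check_torch_cuda() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found on PATH"
        return 0
    fi
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    print("torch version:", torch.__version__)
    print("built for CUDA:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
    print("visible GPUs:", torch.cuda.device_count())
EOF
}

check_torch_cuda
```

If "built for CUDA" shows None, a CPU-only build of PyTorch was installed, and reinstalling the driver will not help; install a CUDA-enabled build instead.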