Given a new server, how would I set up CUDA?

👋 Hands-on practice only, no theory: one article covers it all 👋

🎉 Statement: as a blogger in the AI field, I aim to live up to your time and to you ❤️

When you first set up a deep learning environment, you will often find that your TensorFlow or PyTorch build does not match the installed CUDA or cuDNN version.

As a result, installing several CUDA versions on a single server is often unavoidable.

This article shows how I set up CUDA on a brand-new server so that both study and production work go more smoothly.

📔 The server belongs to the team or the project, but your CUDA install can be your own!

Part one: installing CUDA 10.0 on Ubuntu

Operating system: Ubuntu 18.04.5

🟧 1 Download from the official CUDA website

# Show the Linux version of the current system
cat /proc/version

🟨 2 Installation

# Make the installer cuda_10.0.130_410.48_linux.run executable
chmod 755 cuda_10.0.130_410.48_linux.run
# Do not run the installer with sudo
sh cuda_10.0.130_410.48_linux.run

The installer is interactive. Press the space bar to page through the license agreement, then answer the prompts that follow: decline the driver and install the toolkit into a directory you own (this article uses /home/zhijian/usr/local/cuda10).

Note: I did not install a new driver here, because:

1. The driver already installed by the root user is new enough to run CUDA 10.0.
2. Updating or installing a driver requires root permission (a Linux server can have only one NVIDIA kernel driver installed at a time), and I have no permission to update the graphics driver on the team's server.
3. As long as the server's driver is new enough to support both CUDA 10 and CUDA 9, the CUDA 10.0 installed here will work.
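
The driver check behind these three points can be scripted. The sketch below is an illustration, not part of the original setup. It assumes nvidia-smi is on the PATH and takes 410.48, the driver version embedded in the installer's file name, as the minimum for CUDA 10.0. The helper name version_ge is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort), which ships with Ubuntu.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 410.48 is the driver bundled with cuda_10.0.130_410.48_linux.run.
MIN_DRIVER="410.48"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the installed driver for its version (first GPU only).
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "driver $installed is new enough for CUDA 10.0"
    else
        echo "driver $installed is older than $MIN_DRIVER; ask the administrator to upgrade it"
    fi
else
    echo "nvidia-smi not found; cannot check the driver version"
fi
```

If the driver turns out to be too old, a user-level CUDA install will not help, because only root can replace the driver.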

🟦 3 Configure environment variables

cd /home/zhijian
vim .bashrc

Add the newly installed CUDA path at the bottom:
---
export PATH="/home/zhijian/usr/local/cuda10/bin:$PATH"
export LD_LIBRARY_PATH="/home/zhijian/usr/local/cuda10/lib64:$LD_LIBRARY_PATH"
---

After saving, make the configuration take effect:
source .bashrc

Run nvcc -V on the command line to check the CUDA version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

CUDA 10.0 is now installed, and this account no longer uses the server's shared CUDA.

📕 Configure cuDNN (the version used here is 7.6.0)

Reference: TensorFlow 1.2 to 2.1 GPU builds and their corresponding CUDA versions

🔴 1 Download from the official website

The download requires registering and logging in with an email address; the password is easy to forget, which is very annoying…

🔵 2 Extract cuDNN

The cuDNN download for Linux arrives as a .solitairetheme8 file, which is really a .tgz archive. Copy it to a .tgz name and extract it (this step caught me off guard too):

cp cudnn-10.0-linux-x64-v7.6.0.64.solitairetheme8 cudnn-10.0-linux-x64-v7.6.0.64.tgz
tar -zxvf cudnn-10.0-linux-x64-v7.6.0.64.tgz

🟣 3 Installation and Configuration

# Copy the cuDNN header into the user-level CUDA install
cp cuda/include/cudnn.h /home/zhijian/usr/local/cuda10/include/

# Copy the libraries; -P keeps the libcudnn.so symlinks as symlinks
cp -P cuda/lib64/libcudnn.s* /home/zhijian/usr/local/cuda10/lib64/

chmod 755 /home/zhijian/usr/local/cuda10/include/cudnn.h

# View the cuDNN version
grep -A 2 CUDNN_MAJOR /home/zhijian/usr/local/cuda10/include/cudnn.h

The output should report CUDNN_MAJOR 7, CUDNN_MINOR 6 and CUDNN_PATCHLEVEL 0, that is, cuDNN 7.6.0.
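
To get the version as a single string, the three macros can be read with awk. This is a minimal sketch: cudnn_version is a hypothetical helper name, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL directly, as the cuDNN 7.x cudnn.h does (from cuDNN 8 onward these macros live in cudnn_version.h instead).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cuDNN version in a header file as MAJOR.MINOR.PATCH.
cudnn_version() {
    local header="$1"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Example call, using the install path from this article:
#   cudnn_version /home/zhijian/usr/local/cuda10/include/cudnn.h
```

For the install described in this article, the call should print 7.6.0.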

📙 Friendly tip

I installed CUDA 10.0 and cuDNN 7.6.0 because the code I train requires tensorflow-gpu 2.0. Install whichever CUDA and cuDNN versions your own projects require.
TensorFlow 2.0 GPU training, here we go…
🍊 # tensorflow-gpu installation and testing in one article

--- Key takeaways ---

📙 Can CUDA 9 and CUDA 10 coexist on a Linux server?

Yes, they can coexist:

For example, if a newer CUDA version and its matching driver were installed first, that driver also supports older CUDA versions, so you can switch freely between CUDA versions by editing the configuration file.

However, if the system had CUDA 9 and its matching driver installed first and you then want to install CUDA 10 as a non-root user, it will not work: without root permission you cannot update the driver, and the older driver cannot run CUDA 10.

📙 So, given a new server, how would I set up CUDA?

As root, install the latest driver on the server, for example the one that ships with CUDA 11.
Ordinary users can then each install their own CUDA, according to their project needs.

The default CUDA install location and configuration look like this:

export PATH=/usr/local/cuda-10.0/bin:$PATH  
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda

An ordinary user can keep several CUDA install directories, but only one version is active at a time. Point the configuration at whichever directory you need, for example a cuda9 install instead of cuda10.

For example, I have only one user-level CUDA configured; when I comment out its configuration, the system's default CUDA is used instead.

#export PATH="/home/moli/usr/local/cuda10/bin:$PATH" 
#export LD_LIBRARY_PATH="/home/moli/usr/local/cuda10/lib64:$LD_LIBRARY_PATH"

The configuration file is the .bashrc file in each user's home directory.

Edit it as follows:
cd ~
vim .bashrc

Then make the configuration take effect:
source .bashrc

📙 How do I switch CUDA versions?

For now, installing CUDA 10.x or CUDA 11.x for your own user is usually enough.
If different projects really do need different CUDA versions, you may have to install several of them yourself.
You then need to edit ~/.bashrc and configure the different CUDA versions there.
As shown below, for deep learning Python projects you only need to uncomment the configuration of the CUDA you want each time you switch versions.
For C++ projects, the CUDA configuration can be set in CMakeLists.txt.


# >>> CUDA 10.0 installed by the server's root user
# export PATH=/usr/local/cuda-10.0/bin:$PATH
# export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
# export CUDA_HOME=/usr/local/cuda


# >>> CUDA 11 path added by user ml
export PATH=/home/ml/usr/mycuda/bin:$PATH
export LD_LIBRARY_PATH=/home/ml/usr/mycuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/home/ml/usr/mycuda

You can also configure CUDA 9 or CUDA 8 in the same way if necessary.
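
Instead of commenting and uncommenting lines by hand, the switch can be wrapped in a small shell function kept in ~/.bashrc. The following is a sketch under assumptions, not the method used in this article: use_cuda is a hypothetical name, and each CUDA install is assumed to have the usual bin and lib64 subdirectories.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make the CUDA install under $1 the active one for this shell.
use_cuda() {
    local root="$1"
    if [ ! -d "$root/bin" ]; then
        echo "use_cuda: $root/bin does not exist" >&2
        return 1
    fi
    export CUDA_HOME="$root"
    # Prepend, so this install is found before any other CUDA on the search paths.
    export PATH="$root/bin:$PATH"
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example calls, using paths that appear in this article:
#   use_cuda /home/ml/usr/mycuda      # user-level CUDA 11
#   use_cuda /usr/local/cuda-10.0     # CUDA 10.0 installed by root
#   nvcc -V                           # confirm which version is now active
```

Each call prepends to the search paths rather than replacing them, so the most recent call wins. Opening a new shell restores whatever ~/.bashrc sets by default.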
