Introduction to NVIDIA-SMI

  • nvidia-smi (NVSMI for short) is a cross-platform tool that supports all standard NVIDIA drivers on Linux distributions and on 64-bit Windows starting with Windows Server 2008 R2. It ships with the NVIDIA graphics driver, so installing the driver also installs the tool.

  • On Windows, the program is installed at C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe; on Linux, it is /usr/bin/nvidia-smi. Because the program's location has already been added to the PATH, you can run nvidia-smi directly.

nvidia-smi commands in detail

nvidia-smi

Displays the current status of all GPUs.

The fields in the table it prints are:

Fan: fan speed (0-100%). N/A means the GPU has no fan.

Temp: GPU temperature. (If the temperature gets too high, the GPU clock frequency is reduced.)

Perf: performance state, from P0 (maximum performance) to P12 (minimum performance).

Pwr: GPU power consumption.

Persistence-M: persistence mode state. (Persistence mode uses more power, but new GPU applications start faster.)

Bus-Id: the GPU's PCI bus ID, in the form domain:bus:device.function.

Disp.A: Display Active, i.e. whether a display is initialized on the GPU.

Memory-Usage: video memory usage.

Volatile GPU-Util: GPU utilization.

ECC: whether error checking and correction is enabled (0/DISABLED, 1/ENABLED).

Compute M.: compute mode (0/DEFAULT, 1/EXCLUSIVE_PROCESS, 2/PROHIBITED).
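The Bus-Id format described above (domain:bus:device.function, each part in hexadecimal) can be split into its parts with a few lines of Python. This is only a sketch: the function name parse_bus_id and the sample value are made up for illustration and do not come from nvidia-smi itself.

```python
import re

# Bus-Id as printed in the table: domain:bus:device.function, all hexadecimal.
BUS_ID_RE = re.compile(
    r"^(?P<domain>[0-9A-Fa-f]{4,8}):(?P<bus>[0-9A-Fa-f]{2}):"
    r"(?P<device>[0-9A-Fa-f]{2})\.(?P<function>[0-9A-Fa-f])$"
)


def parse_bus_id(bus_id: str) -> dict:
    """Split a PCI bus ID string into integer fields (hypothetical helper)."""
    match = BUS_ID_RE.match(bus_id.strip())
    if match is None:
        raise ValueError(f"not a PCI bus ID: {bus_id!r}")
    # Each captured group is a hexadecimal number.
    return {name: int(value, 16) for name, value in match.groupdict().items()}


# Hypothetical value in the format shown in the Bus-Id column.
print(parse_bus_id("00000000:3B:00.0"))
```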

Additional options:

  • nvidia-smi -i XXX
    • Specify a GPU
  • nvidia-smi -l XXX
    • Refresh the information continuously (every 5 seconds by default). Press Ctrl+C to stop. XXX sets the refresh interval in seconds
  • nvidia-smi -f XXX
    • Write the queried information to the specified file instead of displaying it on the terminal
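When calling nvidia-smi from a script, it can help to assemble the options above as an argument list rather than a shell string. The sketch below uses only the -i, -l and -f flags listed here. The helper names build_smi_command and run_once are hypothetical, and run_once assumes nvidia-smi is on the PATH.

```python
import subprocess
from typing import List, Optional, Union


def build_smi_command(gpu: Union[int, str, None] = None,
                      loop_seconds: Optional[int] = None,
                      output_file: Optional[str] = None) -> List[str]:
    """Build an nvidia-smi argument list from the basic options (hypothetical helper)."""
    cmd = ["nvidia-smi"]
    if gpu is not None:
        cmd += ["-i", str(gpu)]            # select one GPU
    if loop_seconds is not None:
        if loop_seconds <= 0:
            raise ValueError("loop_seconds must be positive")
        cmd += ["-l", str(loop_seconds)]   # refresh interval in seconds
    if output_file is not None:
        cmd += ["-f", output_file]         # write to a file instead of the terminal
    return cmd


def run_once(cmd: List[str]) -> str:
    """Run the command and return its standard output (needs nvidia-smi installed)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


print(build_smi_command(gpu=0, output_file="gpu.log"))
```

Do not pass a command that includes -l to run_once: it keeps refreshing until interrupted, so the call would never return.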

nvidia-smi -q

Queries detailed current information about all GPUs.

Additional options:

  • nvidia-smi -q -u
    • Display unit attributes instead of GPU attributes
  • nvidia-smi -q -i XXX
    • Query only the specified GPU or unit
  • nvidia-smi -q -f XXX
    • Write the queried information to the specified file instead of displaying it on the terminal
  • nvidia-smi -q -x
    • Output the queried information as XML
  • nvidia-smi -q -d XXX
    • Display only certain information about the GPU. XXX can be MEMORY, UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK, COMPUTE, PIDS, PERFORMANCE, SUPPORTED_CLOCKS, PAGE_RETIREMENT, or ACCOUNTING
  • nvidia-smi -q -l XXX
    • Refresh the information continuously; press Ctrl+C to stop. XXX sets the refresh interval in seconds
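The XML produced by nvidia-smi -q -x is convenient to process in a script. The sketch below parses a small hand-written sample with Python's standard library. The element names (nvidia_smi_log, gpu, product_name, fb_memory_usage, temperature) are assumptions about the XML layout and may differ between driver versions, and all sample values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abridged sample of `nvidia-smi -q -x` output.
SAMPLE_XML = """<nvidia_smi_log>
  <driver_version>535.00</driver_version>
  <attached_gpus>1</attached_gpus>
  <gpu id="00000000:01:00.0">
    <product_name>Example GPU</product_name>
    <fb_memory_usage>
      <total>8192 MiB</total>
      <used>1024 MiB</used>
    </fb_memory_usage>
    <temperature>
      <gpu_temp>45 C</gpu_temp>
    </temperature>
  </gpu>
</nvidia_smi_log>
"""


def summarize_gpus(xml_text: str) -> list:
    """Return one small dict per <gpu> element; missing fields become None."""
    root = ET.fromstring(xml_text)
    summary = []
    for gpu in root.findall("gpu"):
        summary.append({
            "id": gpu.get("id"),
            "name": gpu.findtext("product_name"),
            "memory_used": gpu.findtext("fb_memory_usage/used"),
            "memory_total": gpu.findtext("fb_memory_usage/total"),
            "temperature": gpu.findtext("temperature/gpu_temp"),
        })
    return summary


print(summarize_gpus(SAMPLE_XML))
```

On a real machine, the text would come from running nvidia-smi -q -x and capturing its standard output.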

nvidia-smi --query-gpu=gpu_name,gpu_bus_id,vbios_version --format=csv

Selective query: specify exactly which attributes to display.

Queryable attributes include timestamp, driver_version, pci.bus, pcie.link.width.current, and others. (Run nvidia-smi --help-query-gpu to see which attributes are available.)
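CSV output from a selective query is easy to load into a script. The sketch below parses a hand-written sample with Python's csv module. The header names and values in the sample are illustrative assumptions, so check them against what your driver actually prints. The function name parse_query_csv is hypothetical.

```python
import csv
import io

# Hypothetical sample of the CSV printed by a --query-gpu ... --format=csv call.
SAMPLE_CSV = """name, pci.bus_id, vbios_version
Example GPU, 00000000:01:00.0, 90.00.00.00.01
"""


def parse_query_csv(text: str) -> list:
    """Turn CSV text into a list of dicts keyed by the header row."""
    # Fields are separated by ", ", so skip the space after each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [row for row in reader if row]
    if not rows:
        return []
    header = [column.strip() for column in rows[0]]
    return [dict(zip(header, (value.strip() for value in row))) for row in rows[1:]]


for gpu in parse_query_csv(SAMPLE_CSV):
    print(gpu)
```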

Device modification options

These options let you change GPU settings manually.

  • nvidia-smi -pm 0/1
    • Set persistence mode: 0/DISABLED, 1/ENABLED
  • nvidia-smi -e 0/1
    • Toggle ECC support: 0/DISABLED, 1/ENABLED
  • nvidia-smi -p 0/1
    • Reset ECC error counts: 0/VOLATILE, 1/AGGREGATE
  • nvidia-smi -c
    • Set the compute mode: 0/DEFAULT, 1/EXCLUSIVE_PROCESS, 2/PROHIBITED
  • nvidia-smi -r
    • Reset the GPU
  • nvidia-smi -vm
    • Set the GPU virtualization mode
  • nvidia-smi -ac XXX,XXX
    • Set the GPU application clocks as memory clock,graphics clock in MHz, e.g. nvidia-smi -ac 2000,800
  • nvidia-smi -rac
    • Reset the application clocks to their default values
  • nvidia-smi -acp 0/1
    • Toggle the permission requirements for -ac and -rac: 0/UNRESTRICTED, 1/RESTRICTED
  • nvidia-smi -pl
    • Specify the maximum power management limit (in watts)
  • nvidia-smi -am 0/1
    • Enable or disable accounting mode: 0/DISABLED, 1/ENABLED
  • nvidia-smi -caa
    • Clear all process IDs recorded in the accounting buffer
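Because these options change device state (and typically need administrator rights), a script may want to validate values and build the commands before running anything. The sketch below only constructs argument lists for -pm, -pl and -ac, optionally scoped to one GPU with -i. The function settings_commands and its parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def settings_commands(gpu: Optional[int] = None,
                      persistence: Optional[bool] = None,
                      power_limit_watts: Optional[float] = None,
                      app_clocks_mhz: Optional[Tuple[int, int]] = None) -> List[List[str]]:
    """Build one nvidia-smi argument list per requested setting (nothing is executed)."""
    base = ["nvidia-smi"]
    if gpu is not None:
        base += ["-i", str(gpu)]  # apply the setting to a single GPU
    commands = []
    if persistence is not None:
        commands.append(base + ["-pm", "1" if persistence else "0"])
    if power_limit_watts is not None:
        if power_limit_watts <= 0:
            raise ValueError("power limit must be positive")
        commands.append(base + ["-pl", f"{power_limit_watts:g}"])
    if app_clocks_mhz is not None:
        memory, graphics = app_clocks_mhz  # order: memory clock, graphics clock
        commands.append(base + ["-ac", f"{int(memory)},{int(graphics)}"])
    return commands


for command in settings_commands(gpu=0, persistence=True, app_clocks_mhz=(2000, 800)):
    print(" ".join(command))
```

Printing the commands first works as a dry run. Each list can then be passed to subprocess.run once the values have been checked against what the GPU supports.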

nvidia-smi dmon

Device monitoring command: displays GPU device statistics in a scrolling format.

GPU statistics are displayed one line per sample in a scrolling format, and the metrics to monitor can be adjusted to fit the width of the terminal window. At most four GPUs are monitored. If no GPU is specified, GPU0-GPU3 (GPU indexes start at 0) are monitored by default.

Additional options:

  • nvidia-smi dmon -i XXX
    • Specify the GPUs to monitor as a comma-separated list of GPU indexes, PCI bus IDs, or UUIDs
  • nvidia-smi dmon -d XXX
    • Specify the refresh interval in seconds (default: 1 second)
  • nvidia-smi dmon -c XXX
    • Display the specified number of samples and then exit
  • nvidia-smi dmon -s XXX
    • Specify which metrics to display (puc by default), where:
      • p: power usage and temperature (pwr: power consumption, temp: temperature)
      • u: GPU utilization (sm: streaming multiprocessors, mem: video memory, enc: encoder, dec: decoder)
      • c: processor and memory clock frequencies (mclk: memory clock, pclk: processor clock)
      • v: power and thermal violations
      • m: FB memory and BAR1 memory
      • e: ECC error counts and PCIe replay errors
      • t: PCIe read/write throughput
  • nvidia-smi dmon -o D/T
    • Prepend the date and/or time to each line: D for the date (YYYYMMDD), T for the time (HH:MM:SS)
  • nvidia-smi dmon -f XXX
    • Write the queried information to the specified file instead of displaying it on the terminal
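The scrolling dmon output can be turned into records by reading the column names from its first "#" header line. The sketch below parses a hand-written sample. The exact columns depend on the -s selection and the driver version, so the sample text is an assumption, and parse_dmon is a hypothetical helper.

```python
# Hypothetical sample of `nvidia-smi dmon -c 2` output with the default metrics.
SAMPLE_DMON = """\
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    43    35     -     0     0     0     0  2505  1075
    0    45    36     -    12     3     0     0  2505  1350
"""


def _to_number(token: str):
    """Convert a column value: '-' means not available, otherwise int/float/str."""
    if token == "-":
        return None
    try:
        return int(token)
    except ValueError:
        try:
            return float(token)
        except ValueError:
            return token


def parse_dmon(text: str) -> list:
    """Return one dict per sample line, keyed by the first header line's names."""
    names = None
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first comment line holds column names; later ones (units,
            # repeated headers) are skipped.
            if names is None:
                names = line.lstrip("#").split()
            continue
        if names is None:
            raise ValueError("data line seen before the header")
        samples.append({n: _to_number(v) for n, v in zip(names, line.split())})
    return samples


for sample in parse_dmon(SAMPLE_DMON):
    print(sample["gpu"], sample["pwr"], sample["sm"])
```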

nvidia-smi pmon

Process monitoring command: displays GPU process statistics in a scrolling format.

The tool lists statistics for all GPU processes, one line per process in a scrolling format, and the metrics to monitor can be adjusted to fit the width of the terminal window. At most four GPUs are monitored. If no GPU is specified, GPU0-GPU3 (GPU indexes start at 0) are monitored by default.

Additional options:

  • nvidia-smi pmon -i XXX
    • Specify the GPUs to monitor as a comma-separated list of GPU indexes, PCI bus IDs, or UUIDs
  • nvidia-smi pmon -d XXX
    • Specify the refresh interval in seconds (default: 1 second, maximum: 10 seconds)
  • nvidia-smi pmon -c XXX
    • Display the specified number of samples and then exit
  • nvidia-smi pmon -s XXX
    • Specify which metrics to display (u by default), where:
      • u: GPU utilization
      • m: FB memory usage
  • nvidia-smi pmon -o D/T
    • Prepend the date and/or time to each line: D for the date (YYYYMMDD), T for the time (HH:MM:SS)
  • nvidia-smi pmon -f XXX
    • Write the queried information to the specified file instead of displaying it on the terminal
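Several pmon samples can be combined per process, for example to see which process keeps the GPU busiest. The sketch below parses a hand-written sample and averages the sm column per PID. The column layout in the sample is an assumption that may differ between driver versions, and parse_pmon and average_sm_by_pid are hypothetical names.

```python
from collections import defaultdict

# Hypothetical sample of `nvidia-smi pmon` output collected over two refreshes.
SAMPLE_PMON = """\
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       1234     C    52    10     -     -   python
    0       5678     G     -     -     -     -   Xorg
    0       1234     C    48    12     -     -   python
"""


def parse_pmon(text: str) -> list:
    """Return one dict of raw strings per process line."""
    names = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if names is None:  # first comment line = column names
                names = line.lstrip("#").split()
            continue
        if names is None:
            raise ValueError("data line seen before the header")
        # The command name comes last; maxsplit keeps it in a single field.
        values = line.split(None, len(names) - 1)
        rows.append(dict(zip(names, values)))
    return rows


def average_sm_by_pid(rows: list) -> dict:
    """Average the sm utilization per PID, ignoring samples reported as '-'."""
    readings = defaultdict(list)
    for row in rows:
        if row.get("sm", "-") != "-":
            readings[int(row["pid"])].append(int(row["sm"]))
    return {pid: sum(values) / len(values) for pid, values in readings.items()}


print(average_sm_by_pid(parse_pmon(SAMPLE_PMON)))
```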