1. Common Linux commands

1) File operations

cat, vi, vim, ls, mkdir, touch, cp, mv

Check whether a file with a given name exists:

find / -name mysql

cat: displays the entire contents of a file at once

2) Logs

tail -f /var/www/MOB_logs/catalina.2018-05-18.out

Search for keywords:

cat catalina.2019-03-20.out | grep "Return to respData"
grep -i "Return to respData" catalina.2018-06-11.out
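
When a keyword appears many times, a count is often more useful than the raw matches. A minimal sketch, assuming a hypothetical helper named count_matches and a throwaway demo log:

```shell
# count_matches KEYWORD FILE: print how many lines contain KEYWORD
# (case-insensitive). The function name is illustrative, not part of any tool.
count_matches() {
  grep -ic -- "$1" "$2"
}

# Demo on a throwaway log file.
demo_log=$(mktemp)
printf '%s\n' 'INFO start' 'INFO Return to respData ok' 'WARN return to respdata slow' > "$demo_log"
count_matches "Return to respData" "$demo_log"
```

To see where the matches are, grep -n adds line numbers and grep -C 3 prints three lines of context around each match.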

3) Decompression

tar -zxvf filename.tar.gz
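
It is often safer to list an archive before unpacking it, and to unpack into a dedicated directory. A small sketch, assuming a hypothetical helper named extract_to and a demo archive built on the spot:

```shell
# extract_to ARCHIVE DIR: unpack a .tar.gz archive into DIR, creating DIR if
# needed. The helper name is hypothetical.
extract_to() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Demo: build a small archive, list its contents, then extract it elsewhere.
work=$(mktemp -d)
mkdir "$work/src" && echo hello > "$work/src/a.txt"
tar -czf "$work/demo.tar.gz" -C "$work" src
tar -tzf "$work/demo.tar.gz"        # list contents without extracting
extract_to "$work/demo.tar.gz" "$work/out"
cat "$work/out/src/a.txt"
```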

4) Find the process

ps -aux|grep java

5) System, memory, disk, and network

top: displays memory and CPU information

du and df: show disk usage and file sizes

du -s -h /data/

ping and curl: check whether the network is reachable

6) Permissions

chmod: changes the permissions of a file

chown (change owner): changes the owner of a file or directory

chattr: changes file attributes at a lower level than chmod and can lock a file (for example, chattr +i makes it immutable)

2. Why a system becomes slow or suddenly hangs

  • Too many Full GCs
  • CPU usage is too high
  • An interface is slow, or too many HTTP requests slow down responses. (Classic)
  • Deadlock
  • A thread stays in WAITING (sleep or wait) for too long and appears hung.

High CPU usage, frequent Full GCs, high memory usage, and insufficient disk space can all make the system run slowly.

This raises two related concepts: CPU utilization and CPU load.

CPU utilization is the percentage of CPU time that programs use in real time while running; it reflects how busy the CPU is. Note that a process holding CPU time may actually be waiting on I/O without yet having been switched into a wait state.

CPU load is the number of processes using CPU time plus the number of processes waiting for CPU time over a given period. Here, processes waiting for CPU time are those waiting to be woken up and scheduled, not those in a wait state.

High CPU usage does not necessarily mean a heavy CPU load; the two are not directly related.

What should you do about a high CPU load?

Run ps -axjf and check whether any process shows D in the STAT column.

For example:

[root@VM-8-8-centos proc]# ps -axjf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
    0     2     0     0 ?           -1 D        0   0:00 [kthreadd]
    2     4     0     0 ?           -1 S<       0   0:00  \_ [kworker/0:0H]
    2     6     0     0 ?           -1 D        0   0:10  \_ [ksoftirqd/0]
    2     7     0     0 ?           -1 S        0   0:00  \_ [migration/0]

The D state is uninterruptible sleep. A process in this state cannot be killed and does not exit on its own. The only remedies are to restore the resource it is waiting on or to restart the system.
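
To pick out only the D-state processes, the ps output can be filtered. A minimal sketch, assuming a hypothetical filter named list_d_state that reads `ps -eo pid,stat,comm` output from standard input:

```shell
# list_d_state: read `ps -eo pid,stat,comm` output on stdin and print the PID
# and command of every process whose STAT starts with D (uninterruptible
# sleep). The function name is illustrative.
list_d_state() {
  awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Typical use on a live system:
#   ps -eo pid,stat,comm | list_d_state
# Demo with canned input so the filter can be checked anywhere:
printf '%s\n' \
  '  PID STAT COMMAND' \
  '    2 D    kthreadd' \
  '    4 S<   kworker/0:0H' \
  '    6 D    ksoftirqd/0' | list_d_state
```

Reading from standard input keeps the filter independent of the exact ps invocation, as long as PID, STAT, and the command are the first three columns.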

High load is like a highway traffic jam on a holiday: the road is full of cars, and many more are queued outside the toll station. The fix is either to build more road (improve the server's hardware) or to find the tasks stuck waiting on I/O and handle them manually.

Common reasons for high load are:

  • Too many disk read/write requests
  • MySQL deadlocks or slow queries
  • A faulty disk, so read/write requests cannot obtain resources

If CPU usage is high, see the next section.

3. CPU usage in production spikes to nearly 100%: how do you troubleshoot it?

1) Use the top command

Then press 1 to show each CPU separately (on a multi-core machine, check each CPU individually).

Example:

[root@VM-8-8-centos ~]# top
top - 23:17:16 up  7:54,  1 user,  load average: 1.73, 1.70, 1.71
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
%Cpu(s): 50.0 us, 50.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1882056 total,    69588 free,  1255116 used,   557352 buff/cache
KiB Swap:        0 total,        0 free,        0 used.   478816 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1953 root      20   0  101080   2248   1732 S  0.3  0.1   0:01.89 YDLive
 2310 root      20   0 2369316 246988  13760 S  0.3 13.1   0:22.47 java
 5082 root      20   0  154808  10500   3248 S  0.3  0.6   0:11.14 YDService
    1 root      20   0   43444   3872   2580 S  0.0  0.2   0:01.27 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
    4 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    5 root      20   0       0      0      0 S  0.0  0.0   0:00.07 kworker/u2:0
    6 root      20   0       0      0      0 S  0.0  0.0   0:00.02 ksoftirqd/0
    7 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0


See the upper right corner:

load average: 1.73, 1.70, 1.71

The three values after load average are the average load over the last 1, 5, and 15 minutes, that is, the average number of processes in the runnable or uninterruptible state. The higher the numbers, the greater the load.

If the load is less than the number of CPUs × the number of cores per CPU, it is reasonable. For example, my server has 1 CPU with only 1 core, so a load of 1.73 is above that threshold.
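
The comparison can be scripted. A minimal sketch, assuming a hypothetical helper named load_status; the labels "ok" and "overloaded" are illustrative, and nproc reports logical processors, which is used here as a stand-in for CPUs × cores:

```shell
# load_status LOAD CORES: print "ok" when LOAD is at most CORES,
# otherwise "overloaded". Name and labels are illustrative choices.
load_status() {
  awk -v load="$1" -v cores="$2" \
    'BEGIN { print ((load + 0 <= cores + 0) ? "ok" : "overloaded") }'
}

load_status 1.73 1     # the 1-minute load from the top output, on 1 core

# On a live Linux system, feed it the real values:
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
  load_status "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
fi
```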

How do you check the number of CPUs and cores?

Check the CPUs (one line is printed per logical processor):

cat /proc/cpuinfo | grep "model name"

Check the number of cores per CPU:

cat /proc/cpuinfo | grep "cpu cores"

Example:

[root@VM-8-8-centos ~]# cat /proc/cpuinfo | grep "model name"
model name      : AMD EPYC 7K62 48-Core Processor
[root@VM-8-8-centos ~]# cat /proc/cpuinfo | grep "cpu cores"
cpu cores       : 1
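
To get counts rather than raw lines, the same file can be tallied. A sketch with two hypothetical helpers; it assumes the x86 layout of /proc/cpuinfo, where each logical processor has a "processor" line and a "physical id" line (some virtual machines and other architectures omit the latter):

```shell
# Both helpers accept an optional cpuinfo-style file (default: /proc/cpuinfo).
# Helper names are hypothetical.
count_logical_cpus() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}
count_physical_cpus() {
  grep '^physical id' "${1:-/proc/cpuinfo}" | sort -u | wc -l | tr -d ' '
}

# Demo on a canned sample: two logical processors on one physical CPU.
sample=$(mktemp)
printf 'processor\t: 0\nphysical id\t: 0\ncpu cores\t: 2\n\nprocessor\t: 1\nphysical id\t: 0\ncpu cores\t: 2\n' > "$sample"
count_logical_cpus "$sample"
count_physical_cpus "$sample"
```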

2) Press P to sort by CPU usage

In top, press P (uppercase) to sort by CPU usage, which is also top's default order (lowercase x only highlights the sort column), and find the PID with excessive CPU usage. Take PID 19505 as an example.

Then look at the threads of this PID:

ps -mp 19505 -o THREAD,tid,time

Example:

[root@VM_0_12_centos ~]# ps -mp 19505 -o THREAD,tid,time   
USER     %CPU PRI SCNT WCHAN  USER SYSTEM   TID     TIME
root      0.0   -    - -         -      -     - 04:03:21
root      0.0  19    - futex_    -      - 19505 00:00:00
root      0.0  19    - futex_    -      - 19507 00:00:08
root      0.0  19    - futex_    -      - 19508 00:00:01
root      0.0  19    - futex_    -      - 19509 00:47:56
root      0.0  19    - futex_    -      - 19510 00:00:00
root      0.0  19    - futex_    -      - 19511 00:00:00
root      0.0  19    - futex_    -      - 19512 00:00:00
root      0.0  19    - futex_    -      - 19513 00:07:45
root      0.0  19    - futex_    -      - 19514 00:00:00
root      0.0  19    - futex_    -      - 19515 00:00:00
root      0.0  19    - futex_    -      - 19516 00:00:00
root      0.0  19    - futex_    -      - 19517 00:00:00
root      0.0  19    - futex_    -      - 19518 00:01:33
root      0.0  19    - futex_    -      - 19519 00:01:21
root      0.0  19    - futex_    -      - 19520 00:00:00
root      0.0  19    - futex_    -      - 19521 02:23:05
root      0.0  19    - futex_    -      - 19539 00:00:00
root      0.0  19    - futex_    -      - 19540 00:00:00
root      0.0  19    - futex_    -      - 19576 00:05:10

Alternatively, use the following command:

top -Hp 19505 -d 1 -n 1

Both show the same per-thread information.

3) Convert the TID (thread ID) to hexadecimal

The following uses TID 19507 as an example.

printf "%x\n" tid

Example:

[root@VM_0_12_centos ~]# printf "%x\n" 19507
4c33
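
When several threads look suspicious, converting them in one pass saves time. A small sketch, assuming a hypothetical filter named tids_to_hex that reads one decimal TID per line:

```shell
# tids_to_hex: read decimal thread IDs on stdin, one per line, and print each
# next to its hexadecimal form (the form jstack shows as nid=0x...).
# The function name is illustrative.
tids_to_hex() {
  while read -r tid; do
    printf '%s 0x%x\n' "$tid" "$tid"
  done
}

# Typical use: ps -L -o tid= -p 19505 | tids_to_hex
# Demo with three of the thread IDs listed earlier:
printf '%s\n' 19507 19509 19521 | tids_to_hex
```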

4) View the stack of thread 4c33

Show only the 30 lines that follow the match (replace tid with the hexadecimal value):

jstack 19505 | grep tid -A 30

Example:

[root@VM_0_12_centos ~]# jstack 19505 |grep 4c33 -A 30 
"DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x00007fbb3800a000 nid=0x4c33 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"http-nio-8848-Acceptor-0" #34 daemon prio=5 os_prio=0 tid=0x00007fbb3820e800 nid=0x4cb2 runnable [0x00007fbaff268000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        - locked <0x00000000f2a67c30> (a java.lang.Object)
        at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:448)
        at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:70)
        at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:95)
        at java.lang.Thread.run(Thread.java:748)

"http-nio-8848-ClientPoller-0" #33 daemon prio=5 os_prio=0 tid=0x00007fbb38f21000 nid=0x4cb1 runnable [0x00007fbaff369000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000000f2a67e60> (a sun.nio.ch.Util$3)
        - locked <0x00000000f2a67e70> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000f2a67e18> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:743)
        at java.lang.Thread.run(Thread.java:748)

"http-nio-8848-exec-10" #32 daemon prio=5 os_prio=0 tid=0x00007fbb38229800 nid=0x4cb0 waiting on condition [0x00007fbaff46a000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000f2a68030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
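
As the output shows, grep -A 30 also prints the threads that happen to follow the match. A sketch that prints only the matching thread, assuming a hypothetical filter named thread_stack and relying on jstack separating threads with blank lines:

```shell
# thread_stack HEX_NID: read a jstack dump on stdin and print only the stanza
# of the thread whose native ID matches. The function name is hypothetical.
thread_stack() {
  awk -v key="nid=0x$1 " 'BEGIN { RS = "" } index($0, key) { print }'
}

# Typical use: jstack 19505 | thread_stack 4c33
# Demo with a trimmed two-thread dump:
printf '%s\n' \
  '"DestroyJavaVM" #36 prio=5 os_prio=0 tid=0x00007fbb3800a000 nid=0x4c33 waiting on condition [0x0000000000000000]' \
  '   java.lang.Thread.State: RUNNABLE' \
  '' \
  '"http-nio-8848-Acceptor-0" #34 daemon prio=5 os_prio=0 tid=0x00007fbb3820e800 nid=0x4cb2 runnable [0x00007fbaff268000]' \
  '   java.lang.Thread.State: RUNNABLE' | thread_stack 4c33
```

Setting RS to the empty string puts awk in paragraph mode, so each blank-line-separated thread is one record.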

4. Check garbage collection (GC) status, including the number and duration of Full GCs

1) Find the Java process

ps -aux | grep java

Suppose the PID is 19505.

2) Use jstat -gc or jstat -gcutil to view space usage

[root@VM_0_12_centos ~]# jstat  -gc 19505
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT   
 0.0   1024.0  0.0   1024.0 72704.0   8192.0   57344.0    45449.8   73168.0 70119.8 8708.0 8169.9    214    7.855   0      0.000    7.855
[root@VM_0_12_centos ~]# jstat  -gcutil 19505
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT   
  0.00 100.00  12.68  79.26  95.83  93.82    214    7.855     0    0.000    7.855

Column descriptions:

  • S0: percentage of Survivor space 0 in use
  • S1: percentage of Survivor space 1 in use
  • E: percentage of Eden space in use
  • O: percentage of Old space in use
  • M: percentage of Metaspace in use (Java 8 and later; older JVMs show P, the Perm space, instead)
  • CCS: percentage of compressed class space in use
  • YGC: number of Young GCs from application startup to sampling
  • YGCT: time spent in Young GC from application startup to sampling, in seconds
  • FGC: number of Full GCs from application startup to sampling
  • FGCT: time spent in Full GC from application startup to sampling, in seconds
  • GCT: total time spent in garbage collection from application startup to sampling, in seconds
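
To pull a single value such as FGC out of the table, the header can be used to locate the column. A minimal sketch, assuming a hypothetical filter named jstat_field that reads jstat output from standard input:

```shell
# jstat_field NAME: read `jstat -gcutil` (or -gc) output on stdin and print
# the value under the column called NAME. The function name is illustrative.
jstat_field() {
  awk -v name="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i }
    NR == 2 && col { print $col }'
}

# Typical use: jstat -gcutil 19505 | jstat_field FGC
# Demo with the sample output shown earlier:
sample='  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  12.68  79.26  95.83  93.82    214    7.855     0    0.000    7.855'
printf '%s\n' "$sample" | jstat_field FGC
printf '%s\n' "$sample" | jstat_field O
```

Looking the column up by name avoids hard-coding positions, which differ between -gc and -gcutil.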

Context switches

Frequent context switching can cause performance problems.

5. Check memory usage

On Linux/Unix systems, a high memory-usage percentage is not by itself a cause for concern. Usage above 90% is often normal, because the kernel uses otherwise idle memory for buffers and cache.

1) Use free to check memory usage

[root@VM_0_12_centos ~]#  free -h
              total        used        free      shared  buff/cache   available
Mem:           1.8G        862M         69M        600K        906M        806M
Swap:            0B          0B          0B

2) Free memory

  • Run sync
[root@VM_0_12_centos ~]# sync

Description: the sync command runs the sync subroutine. If you must stop the system, run sync first to preserve file system integrity. The sync command writes all unwritten system buffers to disk, including modified inodes, delayed block I/O, and read/write memory-mapped files.

  • Modify the drop_caches parameter

The options for drop_caches are as follows:

A. To free pagecache: clears the page cache

echo 1 > /proc/sys/vm/drop_caches

B. To free dentries and inodes: clears directory entries and inodes

echo 2 > /proc/sys/vm/drop_caches

C. To free pagecache, dentries and inodes: clears both of the above

echo 3 > /proc/sys/vm/drop_caches

Running it here:

echo 3 > /proc/sys/vm/drop_caches

Then check memory again:

[root@VM_0_12_centos ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.8G        862M        904M        600K         71M        856M
Swap:            0B          0B          0B

Result: free and available are larger and buff/cache is smaller, so the buffers and cache were indeed released.
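
For monitoring, the available figure is usually the one to watch, since it already accounts for reclaimable cache. A sketch that computes it as a percentage, assuming a hypothetical helper named mem_available_pct and the MemTotal and MemAvailable fields of /proc/meminfo:

```shell
# mem_available_pct [FILE]: print MemAvailable as a whole-number percentage
# of MemTotal, read from a meminfo-style file (default: /proc/meminfo).
# The helper name is hypothetical.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Demo on canned values (in kB, as /proc/meminfo reports them).
sample=$(mktemp)
printf 'MemTotal:  2000000 kB\nMemFree:  100000 kB\nMemAvailable:  500000 kB\n' > "$sample"
mem_available_pct "$sample"
```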

6. Check the hard disk usage

df

[root@VM_0_12_centos ~]# df -hl
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G   14G   34G  29% /
devtmpfs        909M     0  909M   0% /dev
tmpfs           920M     0  920M   0% /dev/shm
tmpfs           920M  620K  919M   1% /run
tmpfs           920M     0  920M   0% /sys/fs/cgroup
tmpfs           184M     0  184M   0% /run/user/0

du

[root@VM_0_12_centos ~]# du -h heap 
147M    heap

Show only the total size of a directory, without listing its subdirectories, for a quick overview:

[root@VM_0_12_centos ~]#  du -s -h /root
1.3G    /root

Once you know how much space a directory occupies, you can decide what to clean up.
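
To find what to clean up first, it helps to rank the entries of a directory by size. A minimal sketch, assuming a hypothetical helper named largest_entries (hidden files are skipped because of the shell glob):

```shell
# largest_entries DIR N: print the N largest entries directly under DIR,
# biggest first, with sizes in KiB. The helper name is hypothetical.
largest_entries() {
  du -sk -- "$1"/* 2>/dev/null | sort -rn | head -n "$2"
}

# Demo on a throwaway directory with one large and one small file.
demo=$(mktemp -d)
yes | head -c 1048576 > "$demo/big.log"
echo x > "$demo/small.log"
largest_entries "$demo" 1
```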

7. How to kill processes?

Normally, press Ctrl + C to terminate a foreground process. Use the kill command to terminate a background process: first obtain the PID with ps, top, or a similar command, then pass it to kill.

For example:

ps -aux | grep java

Find the PID of the Java process, then kill it:

kill -9 3827
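
Because kill -9 (SIGKILL) cannot be caught, the process gets no chance to clean up. A gentler sketch, assuming a hypothetical helper named stop_process that tries SIGTERM first and only falls back to SIGKILL after a timeout:

```shell
# stop_process PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds
# (default 5) for the process to exit, then fall back to SIGKILL.
# The function name is illustrative.
stop_process() {
  pid=$1
  timeout=${2:-5}
  kill "$pid" 2>/dev/null || return 0       # nothing to signal
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    kill -0 "$pid" 2>/dev/null || return 0  # exited after SIGTERM
    sleep 1
    waited=$((waited + 1))
  done
  kill -9 "$pid" 2>/dev/null
}

# Demo: start a background sleep and stop it.
sleep 60 &
stop_process "$!" 3
```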

8. Tuning kernel parameters on a Linux VM

1) CPU

Use uptime to check the system load

[root@VM_0_12_centos ~]# uptime
 17:03:41 up 307 days,  1:31,  3 users,  load average: 0.00, 0.01, 0.05

Run vmstat to check CPU usage

[root@VM_0_12_centos ~]# vmstat 2 10    # print every 2 seconds, 10 times in total
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
0  0      0 131104 199740 1341608    0    0     0     0  137  301  0  0 99  0  0
0  0      0 131104 199740 1341612    0    0     0    26  162  342  0  0 99  1  0
0  0      0 131140 199740 1341612    0    0     0     0  135  301  0  0 99  0  0
0  0      0 130892 199740 1341616    0    0     0     0  188  463  1  1 99  0  0
0  0      0 130912 199740 1341620    0    0     0    68  145  284  1  0 99  0  0

Explanation:

procs columns

r: the length of the run queue, that is, the number of processes running or waiting to run;

b: the number of processes in uninterruptible sleep, that is, blocked processes;

memory columns

swpd: the amount of virtual memory (swap) in use. If it is greater than 0, the machine has run short of physical memory.

free: the amount of free physical memory.

buff: memory used as buffers, for example for directory contents and permissions.

cache: memory used as file cache. The larger the value, the more likely reads hit the cache and the less often the disk is accessed.

swap columns

si: the amount of memory swapped in from disk per second. If this value stays above 0, physical memory is insufficient or there is a memory leak. My machine has plenty of memory, so everything is fine.

so: the amount of memory swapped out to disk per second; interpret it the same way as si.

io columns

bi: the number of blocks received from block devices per second (reads). Block devices are all disks and other block devices in the system. The default block size is 1024 bytes.

bo: the number of blocks sent to block devices per second (writes). For example, writing a file makes bo greater than 0. bi and bo should generally be close to 0; otherwise I/O is too frequent, I/O wait time grows, and tuning is needed.

system columns

in: the number of interrupts per second, including clock interrupts.

cs: the number of context switches per second.

The higher these two values are, the more CPU time the kernel consumes.

cpu columns

us: the percentage of CPU time spent in user processes. A high value means user processes consume a lot of CPU time; if it stays high for a long time, check the application.

sy: the percentage of CPU time spent in the kernel. A high value means the kernel is consuming a lot of CPU resources.

id: the percentage of time the CPU is idle (on kernels before 2.5.41 this also included I/O wait time).

wa: the percentage of CPU time spent waiting for I/O.

st: the percentage of CPU time stolen from this virtual machine by the hypervisor.
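
To watch a single column over time, the header row can be used to locate it. A minimal sketch, assuming a hypothetical filter named vmstat_column that reads vmstat output from standard input:

```shell
# vmstat_column NAME: read vmstat output on stdin and print every sample of
# the column called NAME (for example cs, wa, si). The name is hypothetical.
vmstat_column() {
  awk -v name="$1" '
    NR == 2 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
    NR > 2 && col { print $col }'
}

# Typical use: vmstat 2 10 | vmstat_column wa
# Demo with two samples taken from the output above:
printf '%s\n' \
  'procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----' \
  ' r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st' \
  ' 0  0      0 131104 199740 1341608    0    0     0     0  137  301  0  0 99  0  0' \
  ' 0  0      0 131104 199740 1341612    0    0     0    26  162  342  0  0 99  1  0' | vmstat_column cs
```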

2) Ports

We can adjust the TCP kernel parameters of Linux to allow the system to release TIME_WAIT connections more quickly.

[root@VM_0_12_centos ~]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
CLOSE_WAIT 1
ESTABLISHED 5
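
On systems where netstat is not installed, ss from iproute2 reports the same information, with the state in the first column. A sketch of the same tally, assuming a hypothetical filter named count_tcp_states and canned demo addresses:

```shell
# count_tcp_states: read `ss -tan` output on stdin and print each TCP state
# with its connection count, sorted by state name. The name is illustrative.
count_tcp_states() {
  awk 'NR > 1 { ++count[$1] } END { for (s in count) print s, count[s] }' | sort
}

# Typical use: ss -tan | count_tcp_states
# Demo with canned input (addresses are made up):
printf '%s\n' \
  'State      Recv-Q Send-Q Local Address:Port  Peer Address:Port' \
  'LISTEN     0      128    0.0.0.0:22          0.0.0.0:*' \
  'ESTAB      0      0      10.0.0.5:22         10.0.0.9:51234' \
  'ESTAB      0      0      10.0.0.5:8848       10.0.0.9:51240' \
  'TIME-WAIT  0      0      10.0.0.5:8848       10.0.0.9:51111' | count_tcp_states
```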

Edit the configuration:

vim /etc/sysctl.conf

Modify three parameters:

  1. net.ipv4.tcp_syncookies = 1 enables SYN cookies. When the SYN wait queue overflows, cookies are used to handle new connections, which protects against small-scale SYN flood attacks. A value of 0 disables the feature.

  2. net.ipv4.tcp_tw_reuse = 1 enables reuse, allowing TIME-WAIT sockets to be reused for new TCP connections. A value of 0 disables reuse.

  3. net.ipv4.tcp_tw_recycle = 1 enables fast recycling of TIME-WAIT sockets. A value of 0 disables it. Note that this parameter was removed in Linux 4.12 and is known to break connections from clients behind NAT, so it only applies to older kernels.

View the range of available ports:

[root@VM_0_12_centos ~]# cat /proc/sys/net/ipv4/ip_local_port_range
32768   60999

To widen the range, add the following line to sysctl.conf:

net.ipv4.ip_local_port_range = 1024 65535
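
To see how much the wider range gains, count the ports in each range. A small sketch, assuming a hypothetical filter named port_count that reads the "LOW HIGH" format of ip_local_port_range:

```shell
# port_count: read "LOW HIGH" on stdin (the format of ip_local_port_range)
# and print how many ports the range contains. The name is illustrative.
port_count() {
  awk '{ print $2 - $1 + 1 }'
}

echo "32768 60999" | port_count   # range reported by the server above
echo "1024 65535"  | port_count   # widened range

# After editing /etc/sysctl.conf, the new values are loaded with: sysctl -p
```

The reported range holds 28232 ports and the widened one 64512, so the change more than doubles the number of outgoing connections a single local address can hold open to one destination.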

3) Use scheduled tasks to clean junk files from temporary directories and to archive logs

4) Lock key system files to prevent tampering and privilege escalation

5) Remove unnecessary system accounts
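
The scheduled cleanup in item 3 could look roughly like this. A minimal sketch; the helper name purge_old, the script path, and the cron schedule are all hypothetical:

```shell
# purge_old DIR DAYS: delete regular files under DIR that were last modified
# more than DAYS days ago. The helper name is hypothetical.
purge_old() {
  find "$1" -type f -mtime +"$2" -exec rm -f {} +
}

# Demo on a throwaway directory: one stale file, one fresh file.
demo=$(mktemp -d)
touch -t 202001010000 "$demo/stale.tmp"
touch "$demo/fresh.tmp"
purge_old "$demo" 7
ls "$demo"

# A crontab entry could run such a script nightly at 03:00, for example:
#   0 3 * * * /usr/local/bin/purge_tmp.sh
```

Running the find command without the -exec part first shows what would be deleted.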

9. How to search effectively

In other words, make good use of find's options.

1) Find files larger than 10 MB under /software

find /software -type f -size +10240k

[root@VM_0_12_centos /]# find /software -type f -size +10240k
/software/mysql-5.6.33-linux-glibc2.5-x86_64.tar.gz
/software/mysql/lib/libmysqlclient.a
/software/mysql/lib/libmysqld-debug.a
/software/mysql/lib/libmysqld.a

2) Find files in a directory that have not been accessed in the last 365 days

find /software \! -atime -365

[root@VM_0_12_centos /]# find /software \! -atime -365
/software
/software/mysql-5.7.20-linux-glibc2.12-x86_64.tar.gz

3) Find files in a directory that were last modified more than 365 days ago

find /home -mtime +365

[root@VM-8-8-centos ~]# find /home -mtime +365
/home
/home/HaC
/home/HaC/HaC.pub
/home/HaC/HaC
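
The results above include directories (/software, /home) because no type filter was given. Tests can also be combined, since find joins them with a logical AND by default. A sketch, assuming a hypothetical helper named stale_big_files:

```shell
# stale_big_files DIR MIN_KB MIN_DAYS: list regular files under DIR that are
# larger than MIN_KB KiB and were last modified more than MIN_DAYS days ago.
# The function name is hypothetical.
stale_big_files() {
  find "$1" -type f -size +"$2"k -mtime +"$3"
}

# Demo: only big_old.bin satisfies both conditions.
demo=$(mktemp -d)
head -c 3000 /dev/zero > "$demo/big_old.bin"
touch -t 202001010000 "$demo/big_old.bin"
head -c 3000 /dev/zero > "$demo/big_new.bin"
echo x > "$demo/small_old.txt"
touch -t 202001010000 "$demo/small_old.txt"
stale_big_files "$demo" 1 365
```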

Linux directory structure

Common:

  • /bin: Bin is short for Binaries, the directory that houses the most frequently used commands.
  • /boot: stores the core files used to boot Linux, including some link files and image files.
  • /dev: dev is short for Device. This directory stores Linux external devices. In Linux, devices are accessed in the same way as files.
  • /etc: etc stands for Etcetera. This directory stores all the configuration files and subdirectories needed for system management.
  • /home: the user’s home directory. In Linux, each user has a directory named after the user’s account, such as Alice, Bob, and Eve in the figure above.
  • /lib: Lib is short for Library. This directory houses the system’s most basic dynamically linked shared libraries, which function like DLL files in Windows. Almost all applications need these shared libraries.