Good system administrators are distinguished by their efficiency. If an efficient administrator can finish in 10 minutes a task that takes someone else 2 hours, that administrator should be rewarded (and paid more) for saving the company time, because time is money. Here are some tips for saving time. Even if they do not get you paid more for being productive, they will at least give you more free time.

Tip 1: Unmount an unresponsive DVD drive

A newcomer observes that when he presses the Eject button on the DVD drive of a server running a certain Redmond-based operating system, the tray pops out immediately. He then complains that on most enterprise Linux servers, the disk will not eject if a process is running in the mounted directory. For too long as a Linux administrator, I would simply reboot the machine to get my disk back when I could not figure out what was running and why it would not release the DVD drive. But that is inefficient.

Here’s how to find the process that is holding the DVD drive and eject the disk with ease. First, simulate the problem: put a disk in the DVD drive, open a terminal, and mount the drive:

# mount /media/cdrom
# cd /media/cdrom
# while [ 1 ]; do echo "All your drives are belong to us!"; sleep 30; done

Now open a second terminal and try to eject the DVD drive:

# eject

You get the following message:

umount: /media/cdrom: device is busy

Before freeing the device, let’s find out who is using it:

# fuser /media/cdrom

The output shows that a process is running there, and it is indeed our own fault that the disk cannot be ejected. If you are root, you can terminate the process:

# fuser -k /media/cdrom

Now you can finally unmount the drive:

# eject

fuser is a good tool to have.
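To make concrete what "holding" a mount point means, here is a minimal sketch that lists the processes whose current working directory lies under a given directory, which is the situation simulated above. It is Linux-only because it reads the /proc/<pid>/cwd symlinks, and it covers only a subset of what fuser reports (fuser also sees open files, for example). The function name cwd_holders is hypothetical and not part of any tool mentioned in this article.

```shell
#!/usr/bin/env bash
# List PIDs whose current working directory is at or below the given directory.
# Linux-only: relies on the /proc/<pid>/cwd symlinks.
# cwd_holders is an illustrative name, not an existing utility.
cwd_holders() {
    local target entry cwd
    # Resolve symlinks so the comparison matches what /proc reports
    target=$(cd "$1" && pwd -P) || return 1
    for entry in /proc/[0-9]*; do
        # readlink fails for processes we are not allowed to inspect; skip them
        cwd=$(readlink "$entry/cwd" 2>/dev/null) || continue
        case "$cwd" in
            "$target"|"$target"/*) echo "${entry#/proc/}" ;;
        esac
    done
}
```

Running cwd_holders /media/cdrom while the loop from the first terminal is active should list the PID of that shell. Reviewing the list before reaching for fuser -k is a reasonable habit on a production machine.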

Tip 2: Recover the broken screen

Try the following:

# cat /bin/cat

Beware: the terminal now looks like garbage, and everything you type comes out garbled. So what do you do?

Type reset. But wait, you say, typing reset is too close to typing reboot or shutdown. Your palms start to sweat, especially if you are doing this on a production machine.

Rest assured, the machine will not reboot when you do this. Go ahead:

# reset

The screen is now back to normal. This is much better than closing the window and logging in again, especially if you had to hop through five machines over SSH to reach this one.

Tip 3: Screen collaboration

David, a high-maintenance user from product engineering, calls and asks, “Why can’t I compile Supercode.c on these new machines you’re deploying?”

You ask him, “Which machine are you on?”

David replies, “Posh.” (Yes, the company named its five production servers in honor of the Spice Girls.) Now it is your move. On another machine, become David:

# su - david

Then go over to posh:

# ssh posh

Once there, run the following command:

# screen -S foo

Then call David: “David, run the command screen -x foo from your terminal.”

At this point, your session and David’s are joined together in the same Linux shell. You can type and he can type, and each of you sees what the other is doing. This saves you a walk to another floor and gives both sides equal control. A further advantage is that David can watch your troubleshooting skills and see exactly how you solve the problem.

Finally, you can see the problem: David’s compile script hardcodes an old directory that does not exist on the new server. Mount it, recompile, and the problem is solved. David gets back to work, and you can go back to whatever you were doing.

One caveat about this technique is that both parties need to be logged in as the same user. The screen command can also provide multiple windows and split screens. Read the man page for more information.

I have one final tip for screen sessions. To detach from a session and leave it open, type Ctrl-a d: hold down the Ctrl key and press a, then release both and press d. You can then reattach by running the screen -x foo command again.

Tip 4: Retrieve the root password

If you forget the root password, you might think you have to reinstall the entire machine. Sadly, many people do exactly that. But booting the machine and changing the password is simple. This does not work in all cases (for example, if you set a GRUB password and forgot that too), but here is a CentOS Linux example that shows the general procedure.

Restart the system first. On reboot, the GRUB screen shown in Figure 1 appears. Press an arrow key so that you stay on this screen instead of continuing to a normal boot.



Figure 1. GRUB screen after reboot

Then use the arrow keys to select the kernel to boot, and press e to edit its entry. You should then see the screen shown in Figure 2:



Figure 2. Ready to edit the kernel line

Use the arrow keys again to highlight the line that starts with kernel, and press e to edit the kernel parameters. When you reach the screen shown in Figure 3, append the number 1 to the existing parameters:



Figure 3. Append the number 1 to the parameter

Then press Enter, followed by b, and the kernel boots into single-user mode. Once there, run the passwd command to change the password of the root user:

sh-3.00# passwd
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully

You can now reboot and the machine will start with the new password.

Tip 5: SSH backdoor

Many times I have been on a site that needed remote support from someone who was blocked on the outside by the company firewall. Few people realize that if you can get out through a firewall, it is relatively easy to let the outside world back in. This is often called “punching a hole in the firewall.” I call it an SSH backdoor. To use it, you need a machine on the Internet that acts as an intermediary. In this example, that machine is called blackbox.example.com. The machine behind the company firewall is called ginger, and the machine that technical support is on is called tech. Figure 4 illustrates the setup.



Figure 4. Punching a hole in the firewall

Here are the steps:

Check that what you are doing is allowed, and make sure you ask the right people. Most people will cringe at the idea of opening up the firewall, but they do not realize that the connection is fully encrypted. Moreover, someone would have to break into your outside machine before getting into the company. Alternatively, you may belong to the school of asking for forgiveness rather than permission. Either way, use your own judgment, and do not blame anyone else if it does not go your way.

SSH from ginger to blackbox.example.com with the -R flag. Assume that you are the root user on ginger and that tech will need the root user ID to help you with the system. With the -R flag, you forward port 2222 on blackbox to port 22 on ginger. This sets up the SSH tunnel. Note that only SSH traffic can get into ginger: you are not putting ginger out on the unprotected Internet. You can do this with the following syntax:

~# ssh -R 2222:localhost:22 thedude@blackbox.example.com

Once you are on blackbox, you just need to stay logged in. I usually enter a command like this:

thedude@blackbox:~$ while [ 1 ]; do date; sleep 300; done

This keeps the session busy so that it is not dropped for being idle. Then minimize the window.
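As an alternative to the date loop, the OpenSSH client can keep an otherwise idle tunnel alive by itself. The following is a minimal sketch, assuming an OpenSSH client on ginger: -N and ServerAliveInterval are standard OpenSSH options, while the helper name tunnel_cmd, its argument order, and the 300-second interval are illustrative choices. The login thedude@blackbox.example.com is assembled from the prompt and host name used in this tip. The helper only prints the command so that it can be reviewed before anyone runs it.

```shell
#!/usr/bin/env bash
# Print (do not run) an ssh command that opens a reverse tunnel and keeps it alive.
# tunnel_cmd and its argument order are hypothetical, for illustration only.
tunnel_cmd() {
    local remote_port=$1 local_port=$2 target=$3
    # -N: open the tunnel without running a remote command
    # ServerAliveInterval: send a keepalive probe after this many idle seconds
    echo "ssh -N -o ServerAliveInterval=300 -R ${remote_port}:localhost:${local_port} ${target}"
}

tunnel_cmd 2222 22 thedude@blackbox.example.com
```

With this form there is no interactive shell on blackbox to keep busy; the tunnel stays up for as long as the ssh process runs.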

Now instruct your friends on tech to SSH to blackbox without any special SSH flags. You will have to give them your password on blackbox, though:

root@tech:~# ssh thedude@blackbox.example.com

Once tech is on blackbox, they can SSH to ginger with the following command:

thedude@blackbox:~$ ssh -p 2222 root@localhost

Tech will then be prompted for a password; ginger’s root password should be entered. Now you and the support person on tech can work together and solve the problem. You may even want to use screen together! (See Tip 3.)

Tip 6: Remote VNC sessions through an SSH tunnel

VNC, or virtual network computing, has been around for a long time. Typically, I only need VNC when some graphical program is available only on the remote server.

For example, suppose that ginger from Tip 5 is a storage server. Many storage devices come with a GUI program to manage the storage controllers. These GUI management tools often need a direct network connection to the storage, sometimes over a dedicated subnet. As a result, the GUI can only be reached through ginger.

You can try connecting to ginger over SSH with the -X option and launching the GUI that way, but this is bandwidth hungry and painfully slow. VNC is a much more network-friendly tool and is available for nearly all operating systems.

Assume the setup is the same as in Tip 5, but you want tech to get VNC access instead of SSH. In this case, you do something similar but forward the VNC port instead. Perform the following steps:

Start a VNC server session on Ginger. Run the following command:

root@ginger:~# vncserver -geometry 1024x768 -depth 24 :99

These options tell the VNC server to start with a resolution of 1024x768 and a color depth of 24 bits per pixel. On a slower connection, a depth of 8 may be a better choice. The :99 specifies the display, which determines the port on which the VNC server can be reached. VNC ports start at 5900, so :99 means the server is reachable on port 5999.
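The display-to-port rule is simple enough to check in the shell. This is a small sketch of that arithmetic; vnc_port is a hypothetical helper name, not a command that ships with VNC.

```shell
#!/usr/bin/env bash
# Map a VNC display number to the TCP port the server listens on.
# vnc_port is an illustrative helper name.
vnc_port() {
    local display=${1#:}            # accept ":99" as well as "99"
    # 10# forces base 10 so that a value such as "08" is not read as octal
    echo $(( 5900 + 10#$display ))
}

vnc_port :99   # the display used in this tip -> 5999
vnc_port 2     # display :2 -> 5902
```

The port number is what goes into the ssh forwarding options in the following steps, while vncviewer itself is given the display number.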

When you start the session, you are asked to set a password. The user ID is the same as that of the user who started the VNC server (root in this case).

Open an SSH connection from ginger to blackbox.example.com that forwards port 5999 on blackbox to ginger. On ginger, run the following command:

root@ginger:~# ssh -R 5999:localhost:5999 thedude@blackbox.example.com

After running this command, you need to keep this SSH session open so that the port stays forwarded to ginger. At this point, if you were on blackbox, you could access the VNC session on ginger by running the following command:

thedude@blackbox:~$ vncviewer localhost:99

This would forward the port through SSH to ginger. But we want tech to get VNC access to ginger. For that, you need another tunnel. From tech, open an SSH tunnel that forwards port 5999 on tech to port 5999 on blackbox. This is done by running the following command:

root@tech:~# ssh -L 5999:localhost:5999 thedude@blackbox.example.com

This time the SSH flag is -L, which pulls port 5999 from blackbox instead of pushing it there. Once you are on blackbox, you need to keep this session open. Now you are ready to use VNC from tech!

In Tech, run the following command to connect VNC to Ginger:

root@tech:~# vncviewer localhost:99

Tech now has a VNC session directly to ginger. Setting it up is a bit of a hassle, but it beats running around trying to fix the storage array. Practice it a few times and it gets easier.

I’ll add one more thing to this tip: if tech is running Windows® and does not have a command-line SSH client, tech can use PuTTY. PuTTY can be set up to forward SSH ports through the options in its sidebar. If the port were 5902 instead of the 5999 used in this example, you would enter what is shown in Figure 5.



Figure 5. PuTTY can forward SSH ports for use as tunnels

With this set up, tech can point VNC at localhost:2 just as if tech were running a Linux operating system.

Tip 7: Check bandwidth

Imagine this: Company A has a storage server named ginger whose shared file system is NFS-mounted by a client node named beckham. Company A has decided that it needs more bandwidth out of ginger because a large number of nodes NFS-mount ginger’s shared file system.

The most common and cheapest way to do this is to bond two Gigabit Ethernet NICs together. It is cheapest because you usually have a spare NIC and a spare port available.

So you take this approach. The question now is: how much bandwidth do you really have?

Gigabit Ethernet has a theoretical limit of 128 MB/s. Where does this number come from? Take a look at this calculation:

1 Gb = 1024 Mb; 1024 Mb / 8 = 128 MB ("b" = bits, "B" = bytes)
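The same calculation can be done with shell arithmetic. This is a minimal sketch; mbits_to_mbytes is a hypothetical helper name. It also shows the result when a gigabit is counted with decimal prefixes (1000 megabits) instead of the 1024 used above, which gives a slightly lower ceiling.

```shell
#!/usr/bin/env bash
# Convert a link speed in megabits per second to megabytes per second.
# mbits_to_mbytes is a hypothetical helper name used only for illustration.
mbits_to_mbytes() {
    # 8 bits per byte; shell arithmetic is integer-only, so the result is truncated
    echo $(( $1 / 8 ))
}

mbits_to_mbytes 1024   # 1 Gb counted as 1024 Mb -> 128
mbits_to_mbytes 1000   # 1 Gb counted as 1000 Mb -> 125
```

Either way, the figure is a ceiling; measured throughput is lower because of protocol overhead.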

But what do you actually see, and what is a good way to measure it? One tool I recommend is iperf. You can get iperf like this:

# wget http://dast.nlanr.net/Projects/Iperf2.0/iperf-2.0.2.tar.gz

You need to install it on a shared file system that both ginger and beckham can see, or compile and install it on both nodes. I’ll compile it in the home directory of the user bob, which is visible on both nodes:

tar zxvf iperf*gz
cd iperf-2.0.2
./configure --prefix=/home/bob/perf
make
make install

On ginger, run:

# /home/bob/perf/bin/iperf -s -f M

This makes the machine act as the server and report throughput in MB/s.

On the Beckham node, run:

# /home/bob/perf/bin/iperf -c ginger -P 4 -f M -w 256k -t 60

The output on both screens tells you what the speed is. On a normal server with a Gigabit Ethernet adapter, you will probably see about 112 MB/s. This is normal, because some bandwidth is lost in the TCP stack and the physical cables. By connecting two servers back to back, each with two bonded Ethernet cards, I got about 220 MB/s.

In practice, NFS over a bonded network reaches about 150-160 MB/s. This still indicates that bonding is working as expected. If you see something much lower, you should check for a problem.

I recently ran into a case where the bonding driver was used to bond two NICs that used different drivers. The performance was extremely poor, with a bandwidth of about 20 MB/s, less than you would get without bonding the Ethernet cards at all!

Tip 8: Command-line scripts and utilities

Linux system administrators become more efficient when they use command-line scripting with confidence. This includes crafting loops and knowing how to parse data with utilities such as awk, grep, and sed. In general, this reduces keystrokes and lowers the chance of user error.

For example, suppose you need to generate a new /etc/hosts file for a Linux cluster that you are about to install. The long way is to add the IP addresses by hand in vi or another text editor. Instead, you can take the existing /etc/hosts file and append the new entries to it by running the following on the command line:

# P=1; for i in $(seq -w 200); do echo "192.168.99.$P n$i"; P=$(expr $P + 1); done >> /etc/hosts

This creates 200 host names (n001 through n200) with IP addresses 192.168.99.1 through 192.168.99.200. Populating such a file by hand risks creating duplicate IP addresses or host names, so this is a good example of using the built-in command line to eliminate user error. Note that this is done in the Bash shell, the default in most Linux distributions.
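Because the one-liner appends straight to /etc/hosts, it can help to preview the lines first. The sketch below wraps the same idea in a function that writes to standard output; gen_hosts and its two arguments are illustrative names, and printf's %03d does the zero padding that seq -w provides in the one-liner.

```shell
#!/usr/bin/env bash
# Print "IP hostname" pairs for a cluster to stdout instead of appending to /etc/hosts.
# gen_hosts and its arguments are illustrative names, not from the original article.
gen_hosts() {
    local prefix=$1 count=$2 i
    for (( i = 1; i <= count; i++ )); do
        # %03d zero-pads the node number, matching the n001..n200 naming
        printf '%s.%d n%03d\n' "$prefix" "$i" "$i"
    done
}

# Preview a short run before generating the full list
gen_hosts 192.168.99 3
```

Once the preview looks right, the full list can be appended with gen_hosts 192.168.99 200 >> /etc/hosts.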

As another example, suppose you want to check that the memory size is the same on every compute node in a Linux cluster. In most cases of this sort, a distributed or parallel shell is the best practice. For the sake of illustration, though, here is a way to do it with SSH. Assume that SSH is set up to authenticate without a password. Then run:

# for num in $(seq -w 200); do ssh n$num free -tm | grep Mem | awk '{print $2}'; done | sort | uniq

This command line is pretty terse. (It would be even worse if you put regular expressions in it.) Let’s break it down and discuss the parts in detail.

First, the loop goes from 001 to 200. The -w option of the seq command pads the numbers with leading zeros. The num variable is then substituted to form the name of the host to connect to over SSH. Once you have the target host, you issue the command to it. In this case:

free -tm | grep Mem | awk '{print $2}'

1. Use the free command to get the memory size in megabytes.

2. Take the output of that command and use grep to select the line containing the string Mem.

3. Take that line and use awk to print the second field, which is the total memory in the node.

After the command has been executed on every node, the entire output from all 200 nodes is piped (|) to the sort command so that all the memory values are sorted. Finally, the uniq command eliminates the duplicates. This command results in one of the following cases:

1. If all the nodes (n001 through n200) have the same memory size, only one number is displayed. This is the amount of memory that each operating system sees.

2. If the nodes have different memory sizes, you will see several memory size values.

3. Finally, if SSH fails on a node, you will see some error messages.

This command is not perfect. If you find a memory value that differs from what you expect, you will not know which node is at fault or how many such nodes there are. You need to issue another command to find out.
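One possible follow-up is to keep the node name next to each value and report only the nodes that differ from the most common size. The sketch below assumes its input is one "node memory" pair per line; odd_nodes is a hypothetical name, and the sample values are invented stand-ins for real cluster output.

```shell
#!/usr/bin/env bash
# Read "node memory_mb" pairs on stdin and print the nodes whose memory
# differs from the most common value. odd_nodes is an illustrative name.
odd_nodes() {
    awk '
        { mem[$1] = $2; count[$2]++ }
        END {
            # find the most common memory size
            for (m in count) if (count[m] > best) { best = count[m]; common = m }
            # report every node that does not match it
            for (n in mem) if (mem[n] != common) print n, mem[n]
        }
    ' | sort
}

# Made-up sample data standing in for the real cluster output
printf '%s\n' 'n001 16042' 'n002 16042' 'n003 8010' 'n004 16042' | odd_nodes
```

On a real cluster, the input could be produced by a loop such as for num in $(seq -w 200); do echo "n$num $(ssh n$num free -m | awk '/Mem/ {print $2}')"; done, piped into odd_nodes.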

This technique provides a quick way to view something, and you can immediately know if something is wrong. Its value lies in quick inspection.

Tip 9: Console reconnaissance

Some software prints error messages to the console that do not necessarily show up in your SSH session. You can use the vcs devices to check for them. From within an SSH session, run the following command on the remote server:

# cat /dev/vcs1

This shows the contents of the first console. You can also look at the other virtual terminals by using 2, 3, and so on. If a user is typing on the remote system, you will see what he typed.
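The vcs devices store the screen as one continuous run of characters with no line breaks, so the raw dump can be hard to read. A minimal sketch for wrapping it into rows follows; wrap_console is a hypothetical name, and the default width of 80 columns is an assumption that should be adjusted to the actual console geometry.

```shell
#!/usr/bin/env bash
# Break a raw console dump into screen rows using fold.
# wrap_console is an illustrative name; 80 columns is an assumed default.
wrap_console() {
    fold -w "${1:-80}"
}

# Tiny demonstration with a 4-column "screen"
printf 'row1row2' | wrap_console 4
```

On the remote server, this would be used as wrap_console < /dev/vcs1.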

In most data farms, a remote terminal server, KVM, or even Serial Over LAN is the best way to view this information; it also gives you the benefit of out-of-band viewing. The vcs devices provide a fast in-band method that can save you a trip to the machine room to look at the console.

Tip 10: Random system information collection

In Tip 8, you saw an example of using the command line to get information about the total memory in the system. In this tip, I’ll introduce several other methods for gathering important information from systems that need validation, troubleshooting, or remote support.

First, gather information about the processor. This is easily implemented with the following command:

# cat /proc/cpuinfo

This command gives information about the speed, number, and model of the processors. In many cases you can use grep to pull out the value you want. One check I often make is to count the processors in the system. So if I have bought a dual-processor, quad-core server, I can run the following command:

# cat /proc/cpuinfo | grep processor | wc -l

I then expect to see a value of 8. If I do not, I call the vendor and tell them to send me another processor.
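The same check can be wrapped so that it compares the count against what was ordered. This is a sketch under assumptions: count_cpus and check_cpus are hypothetical names, and the optional file argument exists only so that the functions can be tried against a saved copy of cpuinfo.

```shell
#!/usr/bin/env bash
# Count logical processors listed in a cpuinfo-style file.
# count_cpus and check_cpus are illustrative names; the file defaults to /proc/cpuinfo.
count_cpus() {
    # grep -c counts matching lines directly, so cat and wc are not needed;
    # anchoring on "^processor" skips other lines that merely mention the word
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Compare the count with the number of processors that was expected.
check_cpus() {
    local expected=$1 actual
    # grep exits non-zero when nothing matches or the file is missing
    actual=$(count_cpus "$2") || actual=0
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual processors"
    else
        echo "mismatch: expected $expected, found $actual"
        return 1
    fi
}
```

For the server in this example, check_cpus 8 would print an ok line or a mismatch line, and its exit status can be used in a script that sweeps many nodes.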

Another piece of information I often need is disk information. You can get this with the df command. I usually add the -h flag so that the output is shown in gigabytes or megabytes. Running # df -h also shows how the disk is partitioned.

To end the list, here is a way to look at the firmware of your system: a method for getting the BIOS level and the firmware on the NIC.

To check the BIOS version, run the dmidecode command. Unfortunately, you cannot easily grep for the information, so paging through the output with less is not a very efficient way to do it. On my Lenovo T61 laptop, the output looks like this:

# dmidecode | less
...
BIOS Information
        Vendor: LENOVO
        Version: 7LET52WW (1.22)
        Release Date: 08/27/2007
...
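Since a plain grep does not isolate the block, one option is a small awk filter that prints everything from the BIOS Information header up to the next blank line. This is a sketch under the assumption that dmidecode separates its records with blank lines, as in the output above; bios_block is a hypothetical name and is not part of dmidecode.

```shell
#!/usr/bin/env bash
# Print only the "BIOS Information" block from dmidecode-style text on stdin.
# bios_block is an illustrative name, not part of dmidecode.
bios_block() {
    awk '
        /^BIOS Information/ { show = 1; print; next }   # start of the block
        show && /^[[:space:]]*$/ { exit }               # a blank line ends it
        show { print }
    '
}
```

It would be used as dmidecode | bios_block. The dmidecode command also has a -s option that prints a single value, for example dmidecode -s bios-version, which may be simpler when only one field is needed.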

Running dmidecode is much more efficient than rebooting the machine and watching the POST output. To check the driver and firmware versions of your Ethernet adapter, run ethtool:

# ethtool -i eth0
driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.3-0

Conclusion

There are many tips you can learn from someone who is proficient at the command line. The best ways to learn are:

1. Work with others. Share screen sessions and watch how others work — you’ll discover new ways of doing things. You may need to be humble and let others guide you, but you can usually learn a lot.

2. Read the man pages. Read them carefully, even for commands you know well, to gain deeper insight. For example, you probably did not know that you can do network programming with awk.

3. Solve problems. As a system administrator, you always have to solve problems, whether they are caused by you or someone else. This is experience, and experience can make you better and more efficient.

The best administrators are laid back because they find the fastest way to get things done and get them done quickly enough to maintain a casual life.