We will also learn about file manipulation in PHP through a series of articles. Today we’re going to start with an extension that very few people have ever used, or even know about, and that is slightly different from our daily file operations. However, these differences are not obvious to the naked eye. They are mainly due to the balance between business requirements and performance.

What is Direct IO

Direct IO is actually a concept in the Linux operating system. It means to manipulate the file stream directly. Why direct? In fact, when our operating system does file operation, it is not directly on the disk to read and write the file, there is a layer of page cache in the middle. Since it’s a cache, it certainly provides a performance boost, but it’s not absolute. Direct operation is to ignore this layer of caching operations, directly read and write to the files on the disk. As we all know, there is a huge gap between the processing speed of disks, even SSDs, and the CPU and memory, and the default page cache is designed to bridge this gap. While page caching can increase CPU operations and memory usage, direct operations do not have this problem, but it is not as fast as a cached file reading operation.

The above is a simple understanding of Direct IO. For a more detailed explanation, please refer to the second link in the reference document at the end of this article and further study. In PHP, you can download the Direct IO extension directly in PECL and install it as an extension would normally install.

Create write file

Since this is a file operation, let’s first create and write some file data.

$fd = dio_open("./test", O_RDWR | O_CREAT);

echo dio_write($fd, "This is Test.I'm ZyBlog.Show me the money4i"), PHP_EOL;
// 43

print_r(dio_stat($fd));
// Array
// (
//     [device] => 64768
//     [inode] => 652548
//     [mode] => 35432
//     [nlink] => 1
//     [uid] => 0
//     [gid] => 0
//     [device_type] => 0
//     [size] => 43
//     [block_size] => 4096
//     [blocks] => 8
//     [atime] => 1602643459
//     [mtime] => 1602656963
//     [ctime] => 1602656963
// )

dio_close($fd);

And f series function is similar, we need to use a dio_open () function to open a file, O_RDWR | O_CREAT parameters mean open a read/write files, and if the file does not exist, create it. These two constants correspond to the constants associated with direct manipulation files in Linux, and you can also see an explanation of these constants in the link at the end of this article.

The same can be done with dio_write(), which returns the length of the write, 43 characters in this case.

Dio_stat () is the information that returns the current file handle. We can see the device number, uid, gid, atime, mtime, etc. They are similar to the information we can see in Linux, which is just some simple information about the file.

Read the file

Reading the file can be done with a very simple function.

$fd = dio_open("./test", O_RDWR | O_CREAT);

echo dio_read($fd), PHP_EOL;
// This is Test.I'm ZyBlog.Show me the money4i

dio_close($fd);

The dio_read() function also takes an additional argument to read the content at a specified byte length, as we’ll see in an example later.

File operations

During the process of reading a file, it is possible to read only a part of the file, or to start from a certain location. The following operation functions operate on both aspects.

$fd = dio_open("./test", O_RDWR | O_CREAT);

var_dump(dio_truncate ($fd , 20)); 
// bool(true)
echo dio_read($fd), PHP_EOL;
// This is Test.I'm ZyB

dio_seek($fd, 3); 

echo dio_read($fd), PHP_EOL;
// s is Test.I'm ZyB

dio_close($fd);

As the name suggests, dio_truncate() is used to truncate the contents of a file. Here we truncate from the 20th character and then use dio_read() to read only the first 20 characters.

Dio_seek () specifies which character to start reading from. After we specify the starting character position of 3, the first three characters will not be read. Note that dio_truncate() modifies the contents of the original file, while dio_seek() does not.

Other Settings

$fd = dio_open('./test', O_RDWR | O_NOCTTY | O_NONBLOCK); dio_fcntl($fd, F_SETFL, O_SYNC); dio_tcsetattr($fd, array( 'baud' => 9600, 'bits' => 8, 'stop' => 1, 'parity' => 0 )); while (($data = dio_read($fd, 4))! =false) { echo $data, PHP_EOL; } // This // is // Test // .I'm // ZyB dio_close($fd);

The dio_fcntl() function is called from the C library to perform some specified operation on the file descriptor. This operation is also fixed with some constants. Here we use F_SETFL, which means to set the file descriptor flag to the specified value. This O_SYNC means that if the descriptor is set, the write to the file will not end until the data is written to disk. Of course, this function can also set a lot of other operators, you can refer to the official PHP documentation for further study.

Dio_tcsetattr () is used to set the terminal properties and baud rate of the open file. Baud is the baud rate, bits is the bits, stop is the stop bits, and parity is the bits. This requires some knowledge from Principles of Computer Composition and Operating Systems, which I don’t know very well, so I won’t explain in detail. From this, we can see that the basic courses in college are really very important. I believe that the students who have studied these basic courses will understand the function of this function immediately.

Finally, we use the second parameter in dio_read() to read the contents of the file based on the byte length, and you can see that what is read out is the output in paragraphs of 4 characters.

conclusion

The function is relatively simple to learn, but the core is to know what business scenarios the extension is more suitable for use. In the introduction at the beginning of this article, we explained some of the differences between direct file manipulation and normal file manipulation. Direct manipulation is more CPU – and memory-friendly for self-caching applications or for transferring very large amounts of data. In other cases, we’ll just stick with the default file handling method. In fact, in most cases, we can hardly tell the difference between them. So in practical application, the same sentence, combined with the actual business situation, choose the best scheme.

Test code:

https://github.com/zhangyue0503/dev-blog/blob/master/php/202010/source/4.PHP DirectIO straight operation. The use of file extensions to PHP

Reference documents:

https://www.php.net/manual/zh/book.dio.php

https://www.ibm.com/developerworks/cn/linux/l-cn-directio/

Search for “Hardcore Project Manager” on all media platforms