1. The program:

#include<cstdio> #include<cstdlib> #include<sys/time.h> #include<sys/ resource-. h> // Include the time slice used by other processes and the time spent by the process in blocking (e.g., waiting for an I/O operation to complete). Void getTime (double * CPU) {struct rusage ru; if(cpu ! = NULL) { getrusage(RUSAGE_SELF, &ru); *cpu = ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec * 1e-6; } } int main(int argc,char* argv[]) { int count = 20000; double cpu0,cpu1,cpu2; int* arr = (int*)malloc(sizeof(int) * count * count); int i,j; gettime(&cpu0); // For (I =0; i<count; i++) { for(j=0; j<count; j++) { arr[i * count + j]=1; //printf("%d-%d ",i,j); } } // printf("\n"); gettime(&cpu1); // For (I =0; i<count; i++) { for(j=0; j<count; j++) { arr[j * count + i]=1; // printf("%d-%d ",j,i); } } // printf("\n"); gettime(&cpu2); Printf (" CPU time difference in traversing a 2D array by row: %lf\n",cpu1-cpu0); Printf (" CPU time difference in traversing a two-dimensional array by column: %lf\n",cpu2-cpu1); return 0; }

Test Machine: Processor: Apple M1 Memory: 8G

2. The result:





3. The instructions

The first for loop accesses a two-dimensional array on a row basis (line 1, line 1, line 2… The second line, the first…) , the second for loop accesses a two-dimensional array by column (row 1, row 2, row 1… The second line in the first line… It can be seen from the experimental results that, with the increase of the amount of array data, the speed difference between the two ways of accessing two-dimensional arrays is getting bigger and bigger

4. Principle analysis

The memory address of a two-dimensional array is contiguous, and the end of the current line is adjacent to the beginning of the next line. In modern computers, there is another storage mechanism between the CPU and memory, called the CPU cache. The capacity of the CPU cache is much smaller than that of memory but the swap speed is much faster than that of memory. Caching is mainly to solve the contradiction between CPU’s computing speed and memory’s reading and writing speed. Because CPU’s computing speed is much faster than memory’s reading and writing speed, it will make CPU spend a long time waiting for data to arrive or to write data to memory. When an array element is accessed, the CPU does not read from memory one element at a time, but from a region of elements. Assuming that the size of the two-dimensional array is (10 x 10), the CPU will also read the first element when it accesses its adjacent elements. Since the array is small, the CPU can cache all the elements at once, so whether the array is accessed by row or by column, the CPU will access the same amount of main memory. As the array gets larger and larger, the CPU cache can only read less than one row of the array at a time. As a result, accessing the elements by column requires memory access for each element, which makes it much slower.