Article source: blog.csdn.net/greencacti/…

Recently a friend argued that a concurrent program performs better the more threads it starts: the system already runs hundreds of threads, so the more threads the program creates, the more likely its threads are to be given the CPU. I disagreed, so let's test it.

Summary: prefer (number of CPU cores * 2) threads, or (number of CPU cores * 2 + 2) threads.
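The rule of thumb above can be written down as a small helper. This is only a sketch: the function names are made up for illustration, and `detect_cpus()` assumes that `sysconf(_SC_NPROCESSORS_ONLN)` is available (glibc and most Unix-like systems provide it, but POSIX does not require it). Note that it reports logical CPUs, which on a hyper-threaded machine is already twice the number of physical cores, so pass the physical core count to the two rule functions.

```c
#include <assert.h>
#include <unistd.h>

// Number of logical CPUs currently online, or 1 if it cannot be determined.
// Assumption: _SC_NPROCESSORS_ONLN is supported on this platform.
long detect_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

// Rule of thumb from the text: cores * 2.
long threads_2n(long cores)
{
    assert(cores > 0);
    return cores * 2;
}

// Variant from the text: cores * 2 + 2.
long threads_2n_plus_2(long cores)
{
    assert(cores > 0);
    return cores * 2 + 2;
}
```

For the 8-core test machine described below, `threads_2n(8)` gives 16 and `threads_2n_plus_2(8)` gives 18.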

Test scenario:

Hardware: HP G6 (8 cores, 16 pipelines, i.e. 16 hardware threads)

Operating system: SUSE 10 SP2

 

How the test program works:

Define an array of n long entries. With m threads, thread 1 repeatedly increments elements 0 to (n/m - 1), thread 2 increments elements n/m to (2n/m - 1), and so on; the last thread also takes any leftover elements. Each thread keeps looping over its own slice until it is told to stop after 300 seconds. Finally, sum all entries of the array: the total is the number of increment operations performed on the array by all threads together, and it serves as the throughput measure.
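The partitioning described above can be sketched as a standalone function. The names `struct slice` and `slice_for` are hypothetical; the arithmetic mirrors what the test program does, except that threads are numbered from 0 here rather than from 1.

```c
#include <assert.h>

// Inclusive index range [start, end] handled by one thread.
struct slice
{
    int start;
    int end;
};

// Slice of an n-element array for thread k (0-based) out of m threads.
// Every thread gets n / m elements; the last one also takes the remainder.
struct slice slice_for(int n, int m, int k)
{
    assert(n > 0 && m > 0 && m <= n && k >= 0 && k < m);
    int size = n / m;
    struct slice s;
    s.start = k * size;
    s.end = (k == m - 1) ? n - 1 : s.start + size - 1;
    return s;
}
```

With n = 10 and m = 3 the slices are [0, 2], [3, 5] and [6, 9]: the last thread picks up the one element that does not divide evenly.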

 

Test results:

Thread Num     Result1            Result2            Result3
2 Threads:     186,793,032,169    186,697,212,695
  Threads:     373,926,225,686    373,407,260,760
8 Threads:     742,705,309,211    746,706,667,311    744,690,656,181
16 Threads:    794,363,499,559
20 Threads:    794,068,481,342    793,703,559,674    794,294,479,450
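To compare the rows, it helps to average the runs and express each row relative to a baseline. The sketch below does that; the array and function names are made up for illustration, and the figures are copied from the results above (the row without a thread count is left out).

```c
#include <assert.h>

// Totals copied from the results table (one value per run).
const double RUNS_2[]  = {186793032169.0, 186697212695.0};
const double RUNS_8[]  = {742705309211.0, 746706667311.0, 744690656181.0};
const double RUNS_16[] = {794363499559.0};
const double RUNS_20[] = {794068481342.0, 793703559674.0, 794294479450.0};

// Arithmetic mean of the given runs.
double mean(const double *runs, int count)
{
    assert(count > 0);
    double sum = 0.0;
    for (int i = 0; i < count; i++)
        sum += runs[i];
    return sum / count;
}

// How many times larger value is than baseline.
double speedup(double baseline, double value)
{
    assert(baseline > 0.0);
    return value / baseline;
}
```

Relative to the 2-thread average, `speedup(mean(RUNS_2, 2), mean(RUNS_8, 3))` comes to about 3.99 and the 16-thread run to about 4.25, so going from 8 to 16 threads adds only around 7%. The 20-thread average is within 0.1% of the 16-thread figure.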

Test Conclusions:

1. System performance peaks when the number of active threads equals the number of CPU pipelines (16 on this machine); going from 16 to 20 threads brings no further gain.

2. Of course, in a real system that performs disk I/O and network operations, the bottleneck is often the I/O rather than the CPU.

3. When designing a system, count the active threads rather than the total number of threads: many threads may be blocked at any moment, so a thread count equal to the number of pipelines is not necessarily the best choice.
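Conclusion 3 can be made a little more concrete with a sizing heuristic that is often quoted for thread pools: threads = hardware threads * (1 + wait time / compute time). This formula is not derived from the test above; it is brought in here as an assumption for illustration, and the function name is hypothetical. Treat the result as a starting point for measurement, not as a guarantee.

```c
#include <assert.h>

// Heuristic thread count for a workload that spends wait_time blocked
// for every compute_time on the CPU (same unit for both).
// hw_threads is the number of hardware threads ("pipelines" in the text).
// Assumption: the ratio wait_time / compute_time is known from profiling.
int threads_with_blocking(int hw_threads, double wait_time, double compute_time)
{
    assert(hw_threads > 0 && wait_time >= 0.0 && compute_time > 0.0);
    double ideal = hw_threads * (1.0 + wait_time / compute_time);
    return (int)(ideal + 0.5); // round to the nearest whole thread
}
```

For a purely CPU-bound workload like the test program (no waiting), this gives 16 threads on the test machine; if each thread waited on I/O as long as it computed, it would suggest 32.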

Test program:

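#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h> // sleep()

#define ARRAY_SIZE 1000000

// inclusive index range [start, end] of the slice owned by one thread
struct range
{
    int start;
    int end;
};

// shared counter array; each thread increments only its own slice
long array[ARRAY_SIZE];

// set to 1 by main() to stop the workers; volatile makes the worker
// loop re-read the flag on every iteration
volatile int threadStopFlag = 0;

void *updateArray(void *ptrRange);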

int main(int argc, char *argv[])
{
    struct range *rangeArray[100] = {NULL};
    pthread_t threadIdArray[100];
    int numOfThreads = 0;
    int i = 0;
    int size = 0;
    int error = 0;
    long result = 0;

    // check the number of parameters
    if (2 != argc)
    {
        printf("usage: a.out threadnumber\n");
        return -1;
    }

    // validate the parameter (0 would cause a division by zero below)
    numOfThreads = atoi(argv[1]);
    if (numOfThreads < 1 || numOfThreads > 100)
    {
        printf("The number of threads must be between 1 and 100\n");
        return -1;
    }

    // create the worker threads
    size = ARRAY_SIZE / numOfThreads;
    for (i = 0; i < numOfThreads; i++)
    {
        // calculate the start and end value
        rangeArray[i] = (struct range *)malloc(sizeof(struct range));
        if (NULL == rangeArray[i])
        {
            return -1;
        }
        rangeArray[i]->start = i * size;
        if (i != numOfThreads - 1)
        {
            rangeArray[i]->end = rangeArray[i]->start + size - 1;
        }
        else
        {
            // the last thread also takes the remainder
            rangeArray[i]->end = ARRAY_SIZE - 1;
        }

        // create the thread
        error = pthread_create(&threadIdArray[i], NULL, updateArray, (void *)rangeArray[i]);
        if (error != 0)
        {
            printf("pthread is not created.\n");
            return -1;
        }
    }

    // let the workers run for 300 seconds, then ask them to stop
    sleep(300);
    threadStopFlag = 1;

    // wait until every worker has seen the flag and exited
    for (i = 0; i < numOfThreads; i++)
    {
        pthread_join(threadIdArray[i], NULL);
    }

    // free the malloc memory
    for (i = 0; i < numOfThreads; i++)
    {
        free(rangeArray[i]);
        rangeArray[i] = NULL;
    }

    // calculate the total number
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        result += array[i];
    }
    printf("The total number is %ld\n", result);

    return 0;
}

void *updateArray(void *ptrRange)
{
    struct range *arrayRange = (struct range *)ptrRange;
    int start = arrayRange->start;
    int end = arrayRange->end;
    int pointer = start;

    // keep incrementing the elements of this thread's slice, wrapping
    // around at the end, until main() raises the stop flag
    while (1)
    {
        if (pointer > end)
        {
            pointer = start;
        }

        array[pointer] += 1;
        pointer++;

        if (1 == threadStopFlag)
        {
            pthread_exit(NULL);
        }
    }
}

Copyright notice: This is the blogger's original article and may not be reproduced without the blogger's permission.
