Radiology

The basic framework of the radiology service

The services on each server follow the uniform structure above.

Total process time statistics of the test environment

Stage                        | Average time | Percentage
-----------------------------|--------------|-----------
The whole process            | 950s         | 100%
Waiting to pull data         | 630s         | 66%
PACS module receives data    | 19s          | 2%
Back-end processing          | 27s          | 3%
Algorithm queueing           | 189s         | 20%
Algorithm module calculation | 85s          | 9%

Data pull module

Overall architecture diagram

Time consumption analysis of the existing scheme (illustrated with the default configuration below)

Find: periodic check against the DB (every 3 minutes)

Move: periodic check against the DB (every 1 minute) → delay (10 minutes)

(Receiving and compressing DICOM each take seconds and are ignored here.)

The overall delay is typically 10 to 11 minutes. (An incorrect server clock affects this value.)

New Design (first draft)

New design logic

The Find cycle queries PACS, and if the number of images under a studyUID remains unchanged on the next query, we assume PACS has finished receiving that study. (Note: PACS itself has no way of knowing when a study is complete.)

Find and Move use RabbitMQ as the messaging channel, which gives more timely responsiveness and lower CPU and IO consumption than periodically polling the database.

Running the Find against PACS with the service back end's original filtering conditions not only reduces RIS interconnection but also retrieves data on demand, instead of batch-pulling and then filtering out unwanted data (which wastes bandwidth, disk IO, disk storage, and CPU).

The interface between the data pull module and the service back end changes from a RESTful API exposed by the data pull module to MQ as the boundary; the service back end consumes messages according to its own processing capacity.

Deferred image compression. The existing scheme compresses images immediately after they are pulled, which costs the business back end and the algorithm module an extra decompression step (typically 3 seconds of CPU time) when processing them. The new scheme compresses images only after the algorithm module has finished.

The most optimistic time analysis of the new design

Find: periodic check (every 1 minute)

If the instance count of a study stays the same between checks, publish a message to RabbitMQ to trigger the Move.

The most optimistic time is about 1.5 minutes (the actual time depends on hospital operation).
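The stability check above can be sketched as a small state tracker. The publish step stands in for a RabbitMQ publish (e.g. via a pika channel); the class and method names are illustrative, not the actual module's API.

```python
from typing import Callable, Dict

class StabilityChecker:
    """Tracks per-study instance counts across Find polls and fires
    once a study's count stops changing (assumed: PACS is done receiving)."""

    def __init__(self, publish: Callable[[str], None]):
        self._last_counts: Dict[str, int] = {}
        self._publish = publish  # e.g. a wrapper around channel.basic_publish

    def on_find_result(self, study_uid: str, instance_count: int) -> None:
        previous = self._last_counts.get(study_uid)
        if previous == instance_count:
            # Unchanged since the last poll: hand the study to the Move side.
            self._publish(study_uid)
            del self._last_counts[study_uid]  # avoid publishing twice
        else:
            self._last_counts[study_uid] = instance_count
```

With a 1-minute Find interval, a study whose count is first seen at minute 0 and unchanged at minute 1 is published at minute 1, which is where the roughly 1.5-minute optimistic total comes from.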

Pull by series

In the test environment, PACS takes 20 seconds to receive the data. Note that the test environment uses a single series.

A typical study includes anteroposterior thin/thick series, mediastinal-window thin/thick series, and localizer images. Excluding the very small localizer images, a study usually has 2 to 4 series; we take 3 as the average here.

Data is normally received at study level, which takes about 60 seconds (3 series × 20 seconds).

If only the target series is pulled, about 40 seconds can be saved.

The business back end

On average, the back end takes 1 minute 30 seconds from receiving the data to sending the request to the algorithm module (depending on slice thickness; this is the average over the Xiamen University data). It mainly does two things: (1) data archiving and (2) series screening. Most of the time is spent on data archiving.

Data archiving reads the DICOM files one by one and puts images with the same seriesUID into the same folder. Meanwhile, some key header information (patient information, examination information, series information, image information, etc.) is stored in the database.

We also do some fault tolerance: deleting duplicate images within a series (same instanceNumber) and deleting images with inconsistent pixel dimensions.
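The archiving and fault-tolerance rules can be sketched roughly as below. Real code would parse the headers with a DICOM library such as pydicom; the `DicomHeader` record and the folder layout are illustrative assumptions, not the actual implementation.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class DicomHeader:
    series_uid: str       # stands in for SeriesInstanceUID
    instance_number: int  # stands in for InstanceNumber
    rows: int
    cols: int

def archive(headers, root):
    """Assign each image a path under its series folder, dropping
    duplicate instanceNumbers and size mismatches within a series."""
    kept = {}
    dims = {}  # first-seen pixel dimensions per series
    for h in headers:
        dims.setdefault(h.series_uid, (h.rows, h.cols))
        if (h.rows, h.cols) != dims[h.series_uid]:
            continue  # inconsistent pixel dimensions within the series
        key = (h.series_uid, h.instance_number)
        if key in kept:
            continue  # duplicate instanceNumber within the series
        kept[key] = os.path.join(root, h.series_uid, f"{h.instance_number}.dcm")
    return kept
```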

What can be adjusted:

  1. Handle archiving directly in pacsInterface. Since pacsInterface already parses the DICOM header information when it receives data, it can archive by series and save the header information to the database on the spot, just as PACS servers do.

Algorithm engineering

The small pulmonary nodule CT model drops the conversion to FLOAT16, reducing the average time by 10 seconds per case.

The front-end UI

Front-end performance optimization

Nginx-based DICOM download (28% faster, 4 fewer CPU cores used)

With lossless compression, image volume drops by nearly 50% and page loading time is halved. (100M bandwidth, 1100-slice CT: 55 seconds to load before compression, only 26 seconds after lossless compression.)

Operations module

Desensitization and packaging

Main design (secondary logic omitted)

Efficiency improvement

LRU cache: use lru_cache to avoid recomputing the volume estimation (lru_cache does not support list arguments, so only the DB query can be cached).

Process pool: use a process pool to desensitize images (desensitization is CPU-bound, so processes are chosen over threads; a pool reuses processes and avoids the cost of repeatedly spawning them).
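A sketch of that pattern with the standard library's `multiprocessing.Pool`. The PHI field list and the header-dict worker are illustrative stand-ins for the real DICOM scrubbing.

```python
from multiprocessing import Pool

# Illustrative PHI fields; the real tag list depends on the product's policy.
PHI_FIELDS = ("PatientName", "PatientID", "PatientBirthDate")

def desensitize(headers: dict) -> dict:
    """CPU-bound per-image work: blank PHI fields (stand-in for the real
    header and pixel scrubbing done with a DICOM library)."""
    return {k: ("" if k in PHI_FIELDS else v) for k, v in headers.items()}

def desensitize_batch(items, workers: int = 4):
    # One reusable pool amortizes process startup across the whole batch;
    # processes (not threads) sidestep the GIL for CPU-bound scrubbing.
    with Pool(processes=workers) as pool:
        return pool.map(desensitize, items)

if __name__ == "__main__":
    print(desensitize_batch([{"PatientName": "DOE^JOHN", "Modality": "CT"}]))
```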

Query only necessary DB fields: previously the whole document was crudely pulled from the DB; for large documents such as study, pulling only the required fields is faster.

Memory filesystem: after desensitization, the DICOM images awaiting packaging are written to an in-memory filesystem, reducing disk IO.

Archive only: DICOM images barely shrink under traditional compression algorithms, so only archiving (packing without compression) is needed, reducing the CPU cost of both packing and unpacking.
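The last two items can be sketched together: build the archive in memory rather than on disk, and use the ZIP "stored" method, which archives without compressing. The in-memory buffer here stands in for an in-memory filesystem path such as /dev/shm on a Linux server; file names and contents are illustrative.

```python
import io
import zipfile

def pack_stored(files: dict) -> bytes:
    """Pack {name: bytes} into a ZIP archive with no compression.

    DICOM pixel data barely shrinks under DEFLATE, so ZIP_STORED skips
    the compression CPU cost on both the packing and unpacking side.
    The archive is assembled in memory (BytesIO) instead of on disk.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()
```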

OCR Server

First, deduplication on the plug-in client

After taking a screenshot of the GUI, the plug-in hashes the image. If the image has not changed, it does not send it to the OCR Server, which greatly reduces the load on the OCR Server.
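That client-side filter amounts to comparing the current frame's content hash against the previous one. The source does not state which hash the plug-in uses; SHA-256 here is an assumption, as are the names.

```python
import hashlib

class ScreenshotDeduper:
    """Client-side filter: only forward a screenshot to the OCR server
    when its content hash differs from the previous frame's."""

    def __init__(self):
        self._last_digest = None

    def should_send(self, image_bytes: bytes) -> bool:
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest == self._last_digest:
            return False  # unchanged frame: skip the OCR request
        self._last_digest = digest
        return True
```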

Replace the float-based best model with an integer-based fast model

In the past, to pursue the best recognition rate, the best model was used, and even a fused multi-model setup.

The OCR recognition rate of the integer-based fast model is almost the same as that of the float-based best model, so the fast model is now selected by default.

Compared with the best model, the fast model saves 1/2 to 2/3 of the running time.

Basics

The database

Any table field used in find queries must have an index.
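As an illustration of this rule (using SQLite here; the production database and schema may differ), an index on the queried field turns a full-table scan into an index search, which is visible in the query plan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE study (id INTEGER PRIMARY KEY, study_uid TEXT, patient_id TEXT)"
)
conn.execute("CREATE INDEX idx_study_uid ON study (study_uid)")

# EXPLAIN QUERY PLAN shows whether the lookup uses the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM study WHERE study_uid = ?", ("1.2.3",)
).fetchall()
print(plan)  # the detail column mentions idx_study_uid
```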