Thread safety

A class is thread-unsafe when code using it from multiple threads concurrently can produce incorrect or unpredictable results.

The three main thread safety concerns are: atomicity, visibility, and ordering.

Atomicity

An operation on a shared variable is atomic if it is indivisible from the perspective of every thread other than the one performing it.

  1. Two prerequisites for atomicity problems (shared variables + multiple threads)

    • Atomicity only matters for operations on shared variables. Local variables live in a thread's own stack frame, so no other thread can access them and atomicity is irrelevant for them.
    • Atomicity only matters in a multithreaded environment; a single thread cannot observe its own partial results, so there is no thread-safety issue.
  2. What “indivisible” means for an atomic operation

    • From the perspective of any thread other than the operating thread, the operation either has not yet started or has already finished; no thread can observe its intermediate results.
    • Accesses to the same set of shared variables cannot be interleaved.
  3. There are two ways to implement atomic operations: 1. locks; 2. the processor's CAS (compare-and-swap) instruction.

    Locks are typically implemented at the software level; CAS is implemented at the hardware level.

  4. In the Java language, writes to the two basic types long and double are not guaranteed to be atomic, while writes to the other six primitive types are. Declaring a long/double variable volatile makes writes to it atomic.
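A minimal sketch of points 3 and 4 in Java (class and counter names are illustrative): a plain `count++` is a read-modify-write sequence and can lose updates under contention, while a lock or a CAS-based `AtomicLong` keeps the increment atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicityDemo {
    private long unsafeCount = 0;                         // count++ is read-modify-write: not atomic
    private long lockedCount = 0;
    private final AtomicLong casCount = new AtomicLong(); // CAS, implemented in hardware

    void unsafeIncrement()              { unsafeCount++; }              // lost updates possible
    synchronized void lockedIncrement() { lockedCount++; }              // way 1: lock (software level)
    void casIncrement()                 { casCount.incrementAndGet(); } // way 2: CAS instruction

    public static void main(String[] args) throws InterruptedException {
        AtomicityDemo demo = new AtomicityDemo();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                demo.lockedIncrement();
                demo.casIncrement();
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        // Both protected counters always reach 200000; running unsafeIncrement
        // the same way would usually fall short because increments interleave.
        System.out.println(demo.lockedCount + " " + demo.casCount.get());
    }
}
```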

Visibility

In a multithreaded environment, when one thread updates a shared variable, threads that subsequently access the variable may not obtain the updated value immediately, or may never obtain it at all. This is the visibility problem.

  1. The processor does not read and write main memory directly; reads and writes pass through components such as registers, the write buffer, the cache, and the invalidation queue.

    CPU ==> write buffer ==> cache ==> invalidation queue
            (the caches of different processors are kept in sync by the cache coherence protocol)
  2. Cache synchronization: although one processor cannot directly read another processor's cache contents, it can, via the cache coherence protocol (e.g. MESI), read data from another processor's cache and load the result into its own cache. This process is called cache synchronization.

  3. The causes of visibility problems

    • A shared variable may be allocated to a processor register. Each processor has its own registers, whose contents are inaccessible to other processors. If two threads run on different processors and each keeps the shared variable in its own register, one thread will never see the other thread's updates, and a visibility problem arises.
    • Even if the shared variable is stored in main memory, the processor reads it through its cache. When processor A updates the shared variable, the result may first land only in A's write buffer; until it is written out to A's cache and propagated to the other caches, processor B accessing the shared variable sees a stale value, and the same visibility problem appears.
    • A visibility problem also arises when the updated value does reach another processor, but that processor places the update in its invalidation queue instead of applying it immediately, so the value it reads from its own cache remains stale.
  4. How visibility assurance is implemented

    • Flushing the processor cache: when a processor updates a shared variable, the update must eventually be written out of the write buffer to the cache or main memory.
    • Refreshing the processor cache: when a processor operates on a shared variable that may have been updated by other processors, it must perform cache synchronization so that it reads the latest value from the cache or main memory.
  5. The effect of volatile

    • It prompts the JIT compiler that the volatile-decorated variable may be shared by multiple threads, preventing optimizations that would otherwise cause incorrect behavior.
    • Before reading a volatile variable, the processor cache is refreshed; after writing a volatile variable, the processor cache is flushed.
  6. A single processor runs multiple threads via context switching. When a switch occurs, the register contents are saved away and are not accessible to the incoming thread, so visibility problems can still occur when shared variables are kept in registers.
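A sketch of the volatile behavior described in points 4 and 5 (class name is illustrative): the volatile write to `ready` flushes the processor cache, and the volatile read in the loop refreshes it, so the reader is guaranteed to eventually see the flag, and with it the value of `data` written before the flag.

```java
public class VisibilityDemo {
    // Without volatile, the update to `ready` could linger in a register or
    // write buffer and the reader thread might spin forever on a stale value.
    private static volatile boolean ready = false;
    private static int data = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { }        // volatile read: refreshes the processor cache
            System.out.println(data); // sees 1: data was written before the volatile write
        });
        reader.start();
        data = 1;     // plain write, made visible by the volatile write below
        ready = true; // volatile write: flushes the processor cache
        reader.join();
    }
}
```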

Ordering

  1. The concept of reordering: operations are performed in an order different from the order specified by the source code. It can show up in three ways:

    • the instruction order in the bytecode produced by the compiler differs from the source order;
    • the order in which the bytecode instructions are executed differs from the compiled order;
    • the code executes in order, but other processors perceive the execution order incorrectly. For example, processor A performs operation O1 before operation O2, yet processor B perceives A as performing O2 first; this is a perception error.

    Reordering generally falls into two kinds: instruction reordering and storage subsystem reordering. Reordering is an optimization of memory access operations; it does not affect the correctness of single-threaded programs, but it can affect the correctness of multithreaded programs.

  2. Instruction reordering: for performance, the compiler adjusts the order of instruction execution without affecting the (single-threaded) correctness of the program, so the execution order differs from the source order. The Java platform has two kinds of compiler:

    • the static compiler (javac), which translates Java source code into bytecode files (.class); little instruction reordering happens at this stage;
    • the dynamic compiler (JIT), which compiles bytecode into machine code at run time; instruction reordering often happens here.

    For efficiency, modern processors often do not execute instructions strictly in program order; instead they dynamically adjust the order and execute whichever instruction becomes ready first. This is called out-of-order execution. Before an instruction's result is written to a register or main memory, it is placed in a reorder buffer, which then commits results in program order. Out-of-order execution therefore does not affect the result of a single thread of execution, but it can produce unexpected results in a multithreaded environment.

  3. Storage subsystem reordering (memory reordering)

    Processor-0               Processor-1
    data = 1;     // S1
    ready = true; // S2
                              while (!ready) { }        // L3
                              System.out.println(data); // L4

    Even when neither processor-0 nor processor-1 reorders instructions, processor-0 executes in the order S1, S2, yet processor-1 may perceive S2 as taking effect first. Processor-1 can then pass L3 and execute L4 without ever perceiving S1, so the program prints data = 0, which is a thread-safety problem.

    In this situation, S1 and S2 have been memory-reordered.

  4. As-if-serial semantics: reordering is not a random shuffling of instructions and memory operations by the compiler and processor; it follows certain rules. By following these rules, the compiler/processor gives a single-threaded program the “illusion” of sequential execution, known as serial-like (as-if-serial) semantics. To preserve it, statements with data dependencies are never reordered, while statements without data dependencies may be. For example, statement ③ below depends on statements ① and ②, so it cannot be reordered before them, but ① and ② have no data dependency on each other and may be reordered.

    float price = 59.0f;               // statement ①
    short quantity = 5;                // statement ②
    float subTotal = price * quantity; // statement ③

    Statements with only control dependencies may be reordered. In the following code, flag and count++ have a control dependency, yet they can be reordered: even before the value of flag is known, count++ may be executed speculatively for efficiency.

    if (flag) {
      count++;
    }
  5. Whether a uniprocessor system is affected by reordering

    • Reordering at static compile time can affect the results on a uniprocessor system.

      Thread-0                  Thread-1
      data = 1;     // S1
      ready = true; // S2
                                while (!ready) { }        // L3
                                System.out.println(data); // L4

      After S1 and S2 are reordered at compile time, Thread-0 becomes:

      ready = true; // S2
      data = 1;     // S1

      If a context switch from Thread-0 to Thread-1 happens right after S2 executes, this reordering clearly causes an unexpected result (L4 prints data = 0), i.e. a thread-safety problem.

    • Run-time reordering (JIT dynamic compilation, memory reordering) does not affect the results on a uniprocessor system.

      When these reorderings occur, the affected instructions have not yet fully executed, and the system does not context-switch until the reordered instructions finish executing and commit. So a reordering inside one thread has no effect on another thread after the switch.

Context switch

The overhead required for context switching

Direct costs include:

  • The overhead required by the operating system to save and restore context, which is primarily processor time.
  • The overhead of thread scheduling by the thread scheduler (for example, following certain rules to determine which thread will occupy the processor).

Indirect costs include:

  • The overhead of reloading the processor cache. A thread that is switched out may later resume on a different processor. Since that processor may never have run the thread before, the variables the thread needs must be reloaded into that processor's cache, either from main memory or from other processors via the cache coherence protocol. This takes time.
  • A context switch may also cause the entire level-1 cache to be flushed, i.e. its contents are written out to the next level (such as the level-2 cache) or to main memory (RAM).