No modern programmer would dare claim they have never used generics. The type parameter T can stand for any type you like, which feels like real magic, and plenty of people are so used to it that they never think about it. What is interesting is how T is actually realized underneath, and I don't know how many people understand those mechanics. In this article I'll try to share what I know, though not necessarily all of it, ha…

One: before generics

In .NET Core 3.1 and the latest .NET Framework 4.8, the much-criticized ArrayList has long since fallen out of favor, but it was precisely this type that pushed the C# team to make a clean break and go back to the drawing board. A simplified version of it looks like this:

        public class ArrayList
        {
            // Simplified sketch: fixed capacity, no growth logic.
            private object[] items;
            private int index = 0;

            public ArrayList()
            {
                items = new object[10];
            }

            public void Add(object item)
            {
                items[index++] = item;
            }
        }

So that Add can accept values of any type (for example int, double, or a class), the design falls back on the ancestor type object. That trick introduces two major problems: boxing and unboxing, and the loss of type safety.

1. Boxing and unboxing

This follows directly from using the ancestor type: whenever you Add a value type, it is automatically boxed, as in the following code:

            ArrayList arrayList = new ArrayList();
            arrayList.Add(3);
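
To make the boxing step concrete, here is a minimal sketch. The class and method names (BoxingDemo, BoxesShareIdentity, UnboxAfterMutation) are made up for illustration. It shows that each boxing creates a separate heap object holding a copy of the value.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes the same int twice and reports whether both boxes are one object.
    public static bool BoxesShareIdentity(int value)
    {
        object first = value;   // box: new heap object holding a copy of value
        object second = value;  // box again: another, separate heap object
        return object.ReferenceEquals(first, second);
    }

    // The box holds a copy, so changing the original does not change the box.
    public static int UnboxAfterMutation()
    {
        int number = 3;
        object boxed = number;  // box
        number = 42;            // only the local variable changes
        return (int)boxed;      // unbox: still the value captured at boxing time
    }

    public static void Main()
    {
        Console.WriteLine(BoxesShareIdentity(3));  // False
        Console.WriteLine(UnboxAfterMutation());   // 3
    }
}
```

Every call to Add(object) with an int goes through the same kind of allocation and copy.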

<1> Takes up more space

I'll use WinDbg to look at this one. You probably know that an int takes 4 bytes, so how many bytes does it take once it has been boxed onto the heap? I was curious 😄.

The original code and IL code are as follows:

        public static void Main(string[] args)
        {
            var num = 10;
            var obj = (object)num;
            Console.Read();
        }

    IL_0000: nop
    IL_0001: ldc.i4.s 10
    IL_0003: stloc.0
    IL_0004: ldloc.0
    IL_0005: box [mscorlib]System.Int32
    IL_000a: stloc.1
    IL_000b: call int32 [mscorlib]System.Console::Read()
    IL_0010: pop
    IL_0011: ret

You can clearly see the box instruction at IL_0005, so the value is definitely boxed. Next, grab a dump file and run the following commands.

~0s -> !clrstack -l -> !do 0x0000018300002d48

    0:000> ~0s
    ntdll!ZwReadFile+0x14:
    00007ff9`fc7baa64 c3              ret
    0:000> !clrstack -l
    OS Thread Id: 0xfc (0)
            Child SP               IP Call Site
    0000002c397fedf0 00007ff985c808f3 ConsoleApp2.Program.Main(System.String[]) [C:\dream\Csharp\ConsoleApp1\ConsoleApp2\Program.cs @ 28]
        LOCALS:
            0x0000002c397fee2c = 0x000000000000000a
            0x0000002c397fee20 = 0x0000018300002d48

    0000002c397ff038 00007ff9e51b6c93 [GCFrame: 0000002c397ff038]
    0:000> !do 0x0000018300002d48
    Name:        System.Int32
    MethodTable: 00007ff9e33285a0
    EEClass:     00007ff9e34958a8
    Size:        24(0x18) bytes
    File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    00007ff9e33285a0  40005a0        8         System.Int32  1 instance               10 m_value

Look at the Size: 24(0x18) bytes entry in the !do output. The arithmetic is 8 (sync block index) + 8 (method table pointer) + 4 (the int itself) = 20 bytes, but on x64 objects are aligned to 8 bytes, so the size is rounded up to a multiple of 8: 8 + 8 + 8 = 24 bytes. A value that originally needed only 4 bytes has ballooned to 24 because of boxing. If you box 10,000 values, isn't that a terrible memory footprint?
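
The arithmetic above can be written down as a small sketch. It only models the calculation described in this article (8-byte sync block, 8-byte method table pointer, 8-byte alignment on x64) and does not measure anything. The names BoxedSizeEstimate, AlignUp and EstimateBoxedSize are hypothetical.

```csharp
using System;

public static class BoxedSizeEstimate
{
    // Figures taken from the article's x64 calculation (assumptions of this model).
    private const int SyncBlockBytes = 8;
    private const int MethodTableBytes = 8;
    private const int Alignment = 8;

    // Rounds size up to the next multiple of alignment.
    public static int AlignUp(int size, int alignment)
    {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus payload, rounded up to the alignment boundary.
    public static int EstimateBoxedSize(int payloadBytes)
    {
        return AlignUp(SyncBlockBytes + MethodTableBytes + payloadBytes, Alignment);
    }

    public static void Main()
    {
        int oneBox = EstimateBoxedSize(sizeof(int));
        Console.WriteLine(oneBox);           // 24
        Console.WriteLine(oneBox * 10000);   // 240000
    }
}
```

By this model, 10,000 boxed ints occupy 240,000 bytes on the heap, against 40,000 bytes for the raw values.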

<2> Moving a value from the stack to the heap is like shipping a parcel: packing, transport, after-sales service, and final disposal all carry significant labor and machine costs

2. No type safety

Because the parameter type is object, the ancestor of every type, nothing prevents programmers from putting values of all sorts of types into the same list. It may be unintentional, but the compiler cannot catch it.


            ArrayList arrayList = new ArrayList();
            arrayList.Add(3);
            arrayList.Add(new Action<int>((num) => { }));
            arrayList.Add(new object());
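
As a rough illustration of what that costs later, the sketch below uses the real System.Collections.ArrayList. The code compiles without complaint and only fails when an element is read back and cast. TypeSafetyDemo and SumThrows are names invented for this example.

```csharp
using System;
using System.Collections;

public static class TypeSafetyDemo
{
    // Returns true if summing the list as ints fails at run time.
    public static bool SumThrows()
    {
        ArrayList list = new ArrayList();
        list.Add(3);
        list.Add(new object());   // accepted by the compiler: the parameter is object

        try
        {
            int sum = 0;
            foreach (object item in list)
            {
                sum += (int)item; // unboxing cast; the second element is not an int
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumThrows());   // True
    }
}
```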

Faced with these two awkward problems, the C# team decided to design a new kind of type that would solve them once and for all, and that's where generics came in.

Two: the emergence of generics

1. The savior

Generics let a single definition, List<T>, serve any element type: List<int>, List<double>, List<string>, and so on. How this is implemented underneath is the focus of this article.


        public static void Main(string[] args)
        {
            List<double> list1 = new List<double>();
            List<string> list3 = new List<string>();
            // ...
        }
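
For comparison with the ArrayList sketch earlier, here is a minimal generic counterpart. SimpleList<T> is a hypothetical type written for this example and is not how List<T> is implemented in the framework. The point is only that the element type becomes part of the signature.

```csharp
using System;

// Hypothetical generic counterpart of the simplified ArrayList.
public class SimpleList<T>
{
    private T[] items = new T[10];
    private int count = 0;

    public int Count
    {
        get { return count; }
    }

    public void Add(T item)
    {
        // Grow the backing array when it is full.
        if (count == items.Length)
        {
            Array.Resize(ref items, items.Length * 2);
        }
        items[count++] = item;
    }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
            {
                throw new ArgumentOutOfRangeException("index");
            }
            return items[index];
        }
    }
}

public static class SimpleListDemo
{
    public static void Main()
    {
        SimpleList<int> numbers = new SimpleList<int>();
        numbers.Add(3);
        // numbers.Add("three");  // would not compile: string is not int
        int first = numbers[0];   // no cast and no unboxing needed
        Console.WriteLine(first); // 3
    }
}
```

Because the backing store is a T[] rather than an object[], an int is stored inline in the array, and a wrong element type is rejected at compile time.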

Three: generics principle exploration

The question is this: when List<T> becomes List<int>, where does the T -> int substitution actually happen? You probably know that C# code goes through several compilation stages: the C# compiler first turns source code into MSIL, and the JIT compiler then turns MSIL into machine code at run time.




As you can see from these stages, the substitution must happen in one of two places: either in the MSIL, or during JIT compilation…


        public static void Main(string[] args)
        {
            List<double> list1 = new List<double>();
            List<int> list2 = new List<int>();
            List<string> list3 = new List<string>();
            List<int[]> list4 = new List<int[]>();

            Console.ReadLine();
        }

1. Explore in the first stage

Because the first stage is MSIL code, use ILSpy to take a look at the intermediate code.

    IL_0000: nop
    IL_0001: newobj instance void class [mscorlib]System.Collections.Generic.List`1<float64>::.ctor()
    IL_0006: stloc.0
    IL_0007: newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
    IL_000c: stloc.1
    IL_000d: newobj instance void class [mscorlib]System.Collections.Generic.List`1<string>::.ctor()
    IL_0012: stloc.2
    IL_0013: newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32[]>::.ctor()
    IL_0018: stloc.3
    IL_0019: call string [mscorlib]System.Console::ReadLine()
    IL_001e: pop
    IL_001f: ret

    .class public auto ansi serializable beforefieldinit System.Collections.Generic.List`1<T>
        extends System.Object
        implements class System.Collections.Generic.IList`1<!T>,
                   class System.Collections.Generic.ICollection`1<!T>,
                   class System.Collections.Generic.IEnumerable`1<!T>,
                   System.Collections.IEnumerable,
                   System.Collections.IList,
                   System.Collections.ICollection,
                   class System.Collections.Generic.IReadOnlyList`1<!T>,
                   class System.Collections.Generic.IReadOnlyCollection`1<!T>

From the IL above you can see that the class definition is still System.Collections.Generic.List`1<T>, so at the intermediate-code stage the T -> int substitution has not yet been carried out.
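
The same observation can be made from managed code with reflection, as a rough sketch: the metadata holds one open definition, List<>, and every constructed type points back to it. GenericDefinitionDemo and SharesDefinition are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class GenericDefinitionDemo
{
    // True if the constructed type was built from the single List<> definition.
    public static bool SharesDefinition(Type constructed)
    {
        return constructed.IsGenericType
            && constructed.GetGenericTypeDefinition() == typeof(List<>);
    }

    public static void Main()
    {
        Console.WriteLine(typeof(List<>).IsGenericTypeDefinition);     // True
        Console.WriteLine(SharesDefinition(typeof(List<int>)));        // True
        Console.WriteLine(SharesDefinition(typeof(List<string>)));     // True
        Console.WriteLine(typeof(List<int>) == typeof(List<double>));  // False
    }
}
```

There is one definition in the assembly, while List<int> and List<double> are distinct types at run time, which is why the next stage is the place to look.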

2. Explore in the second stage

Seeing the JIT-compiled code is not hard. Every object carries a method table pointer, which points to the method table of its type, and the method table lists all of the final generated methods for that type. I'll draw a picture in case that's not clear.

!dumpheap -stat finds the four List objects on the managed heap.

    0:000> !dumpheap -stat
    Statistics:
                  MT    Count    TotalSize Class Name
    00007ff9e3314320        1           32 Microsoft.Win32.SafeHandles.SafeViewOfFileHandle
    00007ff9e339b4b8        1           40 System.Collections.Generic.List`1[[System.Double, mscorlib]]
    00007ff9e333a068        1           40 System.Collections.Generic.List`1[[System.Int32, mscorlib]]
    00007ff9e3330d58        1           40 System.Collections.Generic.List`1[[System.String, mscorlib]]
    00007ff9e3314a58        1           40 System.IO.Stream+NullStream
    00007ff9e3314510        1           40 Microsoft.Win32.Win32Native+InputRecord
    00007ff9e3314218        1           40 System.Text.InternalEncoderBestFitFallback
    00007ff985b442c0        1           40 System.Collections.Generic.List`1[[System.Int32[], mscorlib]]
    00007ff9e338fd28        1           48 System.Text.DBCSCodePageEncoding+DBCSDecoder
    00007ff9e3325ef0        1           48 System.SharedStatics

You can see the four List objects on the managed heap. I'll pick the simplest one, System.Collections.Generic.List`1[[System.Int32, mscorlib]]. The 00007ff9e333a068 in front of it is its method table address.

!dumpmt -md 00007ff9e333a068

    0:000> !dumpmt -md 00007ff9e333a068
    EEClass:         00007ff9e349b008
    Module:          00007ff9e3301000
    Name:            System.Collections.Generic.List`1[[System.Int32, mscorlib]]
    mdToken:         00000000020004af
    File:            C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
    BaseSize:        0x28
    ComponentSize:   0x0
    Slots in VTable: 77
    Number of IFaces in IFaceMap: 8
    --------------------------------------
    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff9e3882450 00007ff9e3308de8 PreJIT System.Object.ToString()
    00007ff9e389cc60 00007ff9e34cb9b0 PreJIT System.Object.Equals(System.Object)
    00007ff9e3882090 00007ff9e34cb9d8 PreJIT System.Object.GetHashCode()
    00007ff9e387f420 00007ff9e34cb9e0 PreJIT System.Object.Finalize()
    00007ff9e38a3650 00007ff9e34dc6e8 PreJIT System.Collections.Generic.List`1[[System.Int32, mscorlib]].Add(Int32)
    00007ff9e4202dc0 00007ff9e34dc7f8 PreJIT System.Collections.Generic.List`1[[System.Int32, mscorlib]].Insert(Int32, Int32)

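In the method table above, Add now takes an Int32, which shows that the T -> int substitution has finally been carried out at this stage (List<T> -> List<int>). Now let's dump the method table of List<double> in the same way.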

    0:000> !dumpmt -md 00007ff9e339b4b8
    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff9e3882450 00007ff9e3308de8 PreJIT System.Object.ToString()
    00007ff9e389cc60 00007ff9e34cb9b0 PreJIT System.Object.Equals(System.Object)
    00007ff9e3882090 00007ff9e34cb9d8 PreJIT System.Object.GetHashCode()
    00007ff9e387f420 00007ff9e34cb9e0 PreJIT System.Object.Finalize()
    00007ff9e4428730 00007ff9e34e4170 PreJIT System.Collections.Generic.List`1[[System.Double, mscorlib]].Add(Double)
    00007ff9e3867a00 00007ff9e34e4280 PreJIT System.Collections.Generic.List`1[[System.Double, mscorlib]].Insert(Int32, Double)

So what happens if T is a reference type?

    0:000> !dumpmt -md 00007ff9e3330d58
    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff9e3890060 00007ff9e34eb058 PreJIT System.Collections.Generic.List`1[[System.__Canon, mscorlib]].Add(System.__Canon)

    0:000> !dumpmt -md 00007ff985b442c0
    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff9e3890060 00007ff9e34eb058 PreJIT System.Collections.Generic.List`1[[System.__Canon, mscorlib]].Add(System.__Canon)

These are the method tables of List<string> and List<int[]>. For both, System.__Canon is used as a stand-in for the reference type. The reason is that all instantiations over reference types can share the same code, which saves space and memory. If you don't believe me, look at the Entry column: both show the same address, 00007ff9e3890060.

    0:000> !u 00007ff9e3890060
    preJIT generated code
    System.Collections.Generic.List`1[[System.__Canon, mscorlib]].Add(System.__Canon)
    Begin 00007ff9e3890060, size 4a
    >>> 00007ff9`e3890060 57              push    rdi
    00007ff9`e3890061 56              push    rsi
    00007ff9`e3890062 4883ec28        sub     rsp,28h
    00007ff9`e3890066 488bf1          mov     rsi,rcx
    00007ff9`e3890069 488bfa          mov     rdi,rdx
    00007ff9`e389006c 8b4e18          mov     ecx,dword ptr [rsi+18h]
    00007ff9`e389006f 488b5608        mov     rdx,qword ptr [rsi+8]
    00007ff9`e3890073 3b4a08          cmp     ecx,dword ptr [rdx+8]
    00007ff9`e3890076 7422            je      mscorlib_ni+0x59009a (00007ff9`e389009a)
    00007ff9`e3890078 488b4e08        mov     rcx,qword ptr [rsi+8]
    00007ff9`e389007c 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e389007f 448d4201        lea     r8d,[rdx+1]
    00007ff9`e3890083 44894618        mov     dword ptr [rsi+18h],r8d
    00007ff9`e3890087 4c8bc7          mov     r8,rdi
    00007ff9`e389008a ff152088faff    call    qword ptr [mscorlib_ni+0x5388b0 (00007ff9`e38388b0)] (JitHelp: CORINFO_HELP_ARRADDR_ST)
    00007ff9`e3890090 ff461c          inc     dword ptr [rsi+1Ch]
    00007ff9`e3890093 4883c428        add     rsp,28h
    00007ff9`e3890097 5e              pop     rsi
    00007ff9`e3890098 5f              pop     rdi
    00007ff9`e3890099 c3              ret
    00007ff9`e389009a 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e389009d ffc2            inc     edx
    00007ff9`e389009f 488bce          mov     rcx,rsi
    00007ff9`e38900a2 90              nop
    00007ff9`e38900a3 e8c877feff      call    mscorlib_ni+0x577870 (00007ff9`e3877870) (System.Collections.Generic.List`1[[System.__Canon, mscorlib]].EnsureCapacity(Int32), mdToken: 00000000060039e5)
    00007ff9`e38900a8 ebce            jmp     mscorlib_ni+0x590078 (00007ff9`e3890078)

Now look back at List<int> and List<double>. Their Entry columns hold different addresses, which shows that List<int>.Add and List<double>.Add are two completely separate methods. If you can read assembly, have a look…

    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff9e38a3650 00007ff9e34dc6e8 PreJIT System.Collections.Generic.List`1[[System.Int32, mscorlib]].Add(Int32)
    00007ff9e4428730 00007ff9e34e4170 PreJIT System.Collections.Generic.List`1[[System.Double, mscorlib]].Add(Double)

    0:000> !u 00007ff9e38a3650
    preJIT generated code
    System.Collections.Generic.List`1[[System.Int32, mscorlib]].Add(Int32)
    Begin 00007ff9e38a3650, size 50
    >>> 00007ff9`e38a3650 57              push    rdi
    00007ff9`e38a3651 56              push    rsi
    00007ff9`e38a3652 4883ec28        sub     rsp,28h
    00007ff9`e38a3656 488bf1          mov     rsi,rcx
    00007ff9`e38a3659 8bfa            mov     edi,edx
    00007ff9`e38a365b 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e38a365e 488b4e08        mov     rcx,qword ptr [rsi+8]
    00007ff9`e38a3662 3b5108          cmp     edx,dword ptr [rcx+8]
    00007ff9`e38a3665 7423            je      mscorlib_ni+0x5a368a (00007ff9`e38a368a)
    00007ff9`e38a3667 488b5608        mov     rdx,qword ptr [rsi+8]
    00007ff9`e38a366b 8b4e18          mov     ecx,dword ptr [rsi+18h]
    00007ff9`e38a366e 8d4101          lea     eax,[rcx+1]
    00007ff9`e38a3671 894618          mov     dword ptr [rsi+18h],eax
    00007ff9`e38a3674 3b4a08          cmp     ecx,dword ptr [rdx+8]
    00007ff9`e38a3677 7321            jae     mscorlib_ni+0x5a369a (00007ff9`e38a369a)
    00007ff9`e38a3679 4863c9          movsxd  rcx,ecx
    00007ff9`e38a367c 897c8a10        mov     dword ptr [rdx+rcx*4+10h],edi
    00007ff9`e38a3680 ff461c          inc     dword ptr [rsi+1Ch]
    00007ff9`e38a3683 4883c428        add     rsp,28h
    00007ff9`e38a3687 5e              pop     rsi
    00007ff9`e38a3688 5f              pop     rdi
    00007ff9`e38a3689 c3              ret
    00007ff9`e38a368a 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e38a368d ffc2            inc     edx
    00007ff9`e38a368f 488bce          mov     rcx,rsi
    00007ff9`e38a3692 90              nop
    00007ff9`e38a3693 e8a8e60700      call    mscorlib_ni+0x621d40 (00007ff9`e3921d40) (System.Collections.Generic.List`1[[System.Int32, mscorlib]].EnsureCapacity(Int32), mdToken: 00000000060039e5)
    00007ff9`e38a3698 ebcd            jmp     mscorlib_ni+0x5a3667 (00007ff9`e38a3667)
    00007ff9`e38a369a e8bf60f9ff      call    mscorlib_ni+0x53975e (00007ff9`e383975e) (mscorlib_ni)
    00007ff9`e38a369f cc              int     3

    0:000> !u 00007ff9e4428730
    preJIT generated code
    System.Collections.Generic.List`1[[System.Double, mscorlib]].Add(Double)
    Begin 00007ff9e4428730, size 5a
    >>> 00007ff9`e4428730 56              push    rsi
    00007ff9`e4428731 4883ec20        sub     rsp,20h
    00007ff9`e4428735 488bf1          mov     rsi,rcx
    00007ff9`e4428738 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e442873b 488b4e08        mov     rcx,qword ptr [rsi+8]
    00007ff9`e442873f 3b5108          cmp     edx,dword ptr [rcx+8]
    00007ff9`e4428742 7424            je      mscorlib_ni+0x1128768 (00007ff9`e4428768)
    00007ff9`e4428744 488b5608        mov     rdx,qword ptr [rsi+8]
    00007ff9`e4428748 8b4e18          mov     ecx,dword ptr [rsi+18h]
    00007ff9`e442874b 8d4101          lea     eax,[rcx+1]
    00007ff9`e442874e 894618          mov     dword ptr [rsi+18h],eax
    00007ff9`e4428751 3b4a08          cmp     ecx,dword ptr [rdx+8]
    00007ff9`e4428754 732e            jae     mscorlib_ni+0x1128784 (00007ff9`e4428784)
    00007ff9`e4428756 4863c9          movsxd  rcx,ecx
    00007ff9`e4428759 f20f114cca10    movsd   mmword ptr [rdx+rcx*8+10h],xmm1
    00007ff9`e442875f ff461c          inc     dword ptr [rsi+1Ch]
    00007ff9`e4428762 4883c420        add     rsp,20h
    00007ff9`e4428766 5e              pop     rsi
    00007ff9`e4428767 c3              ret
    00007ff9`e4428768 f20f114c2438    movsd   mmword ptr [rsp+38h],xmm1
    00007ff9`e442876e 8b5618          mov     edx,dword ptr [rsi+18h]
    00007ff9`e4428771 ffc2            inc     edx
    00007ff9`e4428773 488bce          mov     rcx,rsi
    00007ff9`e4428776 90              nop
    00007ff9`e4428777 e854fbffff      call    mscorlib_ni+0x11282d0 (00007ff9`e44282d0) (System.Collections.Generic.List`1[[System.Double, mscorlib]].EnsureCapacity(Int32), mdToken: 00000000060039e5)
    00007ff9`e442877c f20f104c2438    movsd   xmm1,mmword ptr [rsp+38h]
    00007ff9`e4428782 ebc0            jmp     mscorlib_ni+0x1128744 (00007ff9`e4428744)
    00007ff9`e4428784 e8d50f41ff      call    mscorlib_ni+0x53975e (00007ff9`e383975e) (mscorlib_ni)
    00007ff9`e4428789 cc              int     3

Maybe you’re confused. Let me draw a picture.

Four: summary

The real substitution of the generic T is carried out during JIT compilation. The four List<T> instantiations produced four types with the corresponding concrete type arguments, so there is no boxing or unboxing problem, and type checking is enforced for us ahead of time by compiler tooling such as Visual Studio.

It’s late at night, let’s have a rest! Hope you found this article helpful.