Unity3d underlying data transfer analysis

Author: Fan Songyang, senior game client development engineer at Tencent. For commercial reprints, please contact Tencent WeTest for authorization; for non-commercial reprints, please credit the source. Original link: wetest.qq.com/lab/view/37…

WeTest takeaway

This article analyzes how the unmanaged heap, the runtime, and the managed heap relate to one another under the Mono framework, and how calls cross between them. On the memory side, it covers marshaling and the similarities and differences between classes and structs.

1. Managed Interaction (Interop)

The embedding principle is described in Mono's official documentation. The Unity3D engine is written in C++, while C# code is compiled to CIL (Common Intermediate Language). The Mono runtime is the technology that connects the two parts. The C++ portion is usually called unmanaged code, and the CIL/.NET portion is called managed code.

2. Marshaling

In C#, a string is passed to C++ through an internal call as a MonoString*, which is a pointer to a string object on the managed heap. This conversion is called marshaling.

Specifically, marshaling is the process of transforming the in-memory representation of an object into a format suitable for storage or transmission.

For simple data types such as integers and floating-point numbers, marshaling is implicitly a bitwise copy (blitting). Another case that needs no marshaling is passing by pointer: when a struct is passed by reference to unmanaged code, only the pointer to the struct is copied. Marshaling behavior can also be customized with the MarshalAs attribute.
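
A minimal C++ sketch of what blitting looks like from the unmanaged side: a type whose bytes mean the same thing on both sides of the boundary can cross it as a plain byte copy. The `Vec3` struct and the `blit` helper are hypothetical names used only for illustration, and `std::is_trivially_copyable` serves here as a rough C++ analogue of "blittable", not as Mono's own definition.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical plain-data struct: only fixed-size numeric fields,
// so its bytes mean the same thing on both sides of the boundary.
struct Vec3 {
    float x;
    float y;
    float z;
};

// Rough C++ analogue of "blittable": the type may be copied byte for byte.
static_assert(std::is_trivially_copyable<Vec3>::value,
              "Vec3 must be safe to copy bitwise");

// Copies the raw bytes of `src` into `dst`; no per-field conversion happens.
template <typename T>
void blit(T* dst, const T* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be blitted");
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, sizeof(T));
}
```

Calling `blit(&b, &a)` on two `Vec3` values leaves `b` with the same three floats as `a`, without any field-by-field translation.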

Keep in mind that these two regions of memory are completely separate. Managed memory is allocated on the GC heap, while unmanaged memory is controlled entirely by the C++ layer's own code. Therefore, when C++ accesses an object on the managed heap, there is a good chance that the GC will move or reclaim that object in the meantime. To prevent this, you can use the fixed keyword in C# to pin the variable for the duration of the block.

P/Invoke does not use fixed. Instead, it relies on another common managed-to-unmanaged marshaling scheme:

The runtime allocates a block of unmanaged memory.
The managed class data is copied into the newly allocated unmanaged memory.
The unmanaged method is invoked with this unmanaged copy instead of the original managed memory, so the data stays valid even if a GC occurs during the call.
The unmanaged memory is copied back to managed memory.
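
The four steps above can be simulated in plain C++. This is only a sketch of the copy-in/copy-out idea, not Mono's actual implementation: `PlayerData`, `native_level_up`, and `call_with_marshaled_copy` are hypothetical names, and an ordinary C++ object stands in for the managed one.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical data carried by a managed class.
struct PlayerData {
    int level;
    int score;
};

// Stand-in for an unmanaged function that modifies the data it receives.
inline void native_level_up(PlayerData* p) {
    assert(p != nullptr);
    p->level += 1;
    p->score += 100;
}

// Simulates the four steps: allocate, copy in, call, copy back.
template <typename T, typename Fn>
void call_with_marshaled_copy(T& managed_obj, Fn native_fn) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles plain data");

    // 1. Allocate a block of unmanaged memory.
    void* raw = std::malloc(sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    T* unmanaged = static_cast<T*>(raw);

    // 2. Copy the managed data into the unmanaged block.
    std::memcpy(unmanaged, &managed_obj, sizeof(T));

    // 3. Call the unmanaged function on the copy, not on the original.
    native_fn(unmanaged);

    // 4. Copy the (possibly modified) data back, then free the block.
    std::memcpy(&managed_obj, unmanaged, sizeof(T));
    std::free(raw);
}
```

Because the unmanaged function only ever sees the temporary block, any pointer it keeps after the call would dangle once step 4 frees that block.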

Since we cannot know when memory on the managed heap becomes invalid, unmanaged code should never cache data passed in from managed code.

The other case is return values. Unmanaged code cannot return a class by value, only a pointer to it. Because the two heaps cannot share their contents, the following steps occur when control returns to managed code:

Managed code calls unmanaged code, which returns a pointer to a structure in unmanaged memory.
The runtime finds and instantiates the corresponding managed class, then marshals the unmanaged content into it.
The unmanaged memory is freed with Marshal.FreeCoTaskMem().

To avoid this allocation and copy, return an IntPtr and use the methods of the Marshal class to work with the pointer. Classes and structs are discussed in more detail later.
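
As an illustration of the pointer-based approach, the following C++ sketch shows what the unmanaged side might expose when managed code only holds an opaque pointer (an IntPtr on the C# side). All names here (`Session`, `session_create`, `session_send`, `session_destroy`) are hypothetical and not taken from the original article.

```cpp
#include <cassert>
#include <new>

// Hypothetical unmanaged object; managed code never sees its layout.
struct Session {
    int id;
    int packets_sent;
};

extern "C" {

// Returns an opaque pointer; the managed side would store it as an IntPtr.
Session* session_create(int id) {
    return new (std::nothrow) Session{id, 0};
}

// Operates on the handle without copying the object across the boundary.
int session_send(Session* s) {
    assert(s != nullptr);
    return ++s->packets_sent;
}

// Must be called exactly once per handle; unmanaged code owns the memory.
void session_destroy(Session* s) {
    delete s;
}

}  // extern "C"
```

With this shape, nothing is marshaled except the pointer value itself, but the managed wrapper becomes responsible for calling `session_destroy` at the right time.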

3. Cross-domain invocation

Managed code can call C++ in two ways: P/Invoke and internal calls (embedding).

P/Invoke

To use P/Invoke, the C++ function must be exported as a public symbol, for example:
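
The original listing is missing here, so the following is only a stand-in sketch of an exported function. The `EXPORT_API` macro and the `sum_scores` function are hypothetical; the macro assumes MSVC on Windows and GCC or Clang elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Export macro: an assumption for this sketch, covering MSVC and GCC/Clang.
#if defined(_WIN32)
#define EXPORT_API extern "C" __declspec(dllexport)
#else
#define EXPORT_API extern "C" __attribute__((visibility("default")))
#endif

// extern "C" disables C++ name mangling, so the symbol is exported
// under the plain name "sum_scores" and can be looked up by that name.
EXPORT_API int32_t sum_scores(const int32_t* scores, int32_t count) {
    assert(scores != nullptr || count == 0);
    int32_t total = 0;
    for (int32_t i = 0; i < count; ++i) {
        total += scores[i];
    }
    return total;
}
```

Fixed-width types such as `int32_t` are used so that the parameter sizes do not depend on the compiler, which makes the matching managed declaration easier to get right.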

Then add the following declaration to the C# layer:

The __Internal library name tells Mono to look for the function in the currently running unmanaged executable. P/Invoke marshals a wide range of data types automatically, which makes it the simplest interop approach.

Internal calls

Internal calls are registered in C++ and access managed objects directly, which gives full control over marshaling. For example, to return a string, we first explicitly register the interface in C++:

Then declare the following function in C#:

Implement this function in C++:

MonoString and mono_string_new together complete the marshaling of the string.

4. Memory allocation

Classes and structures

Classes and structs are passed differently between managed and unmanaged code.

1. Passing classes

Classes are allocated on the managed heap, so they cannot be passed to unmanaged code by value, only by reference. Consider an example.

Given the following unmanaged code:

a class wrapper can be written as:
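
The original listing was lost, so the following C++ sketch is a hypothetical stand-in for such unmanaged code: a plain struct and two functions that receive it by pointer. The names `UnmanagedRect`, `rect_area`, and `rect_move` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unmanaged type; the managed wrapper must declare the
// same fields in the same order.
struct UnmanagedRect {
    int32_t x;
    int32_t y;
    int32_t width;
    int32_t height;
};

// Receives the object by pointer: a managed class is always passed
// as a reference, never by value.
extern "C" int32_t rect_area(const UnmanagedRect* rect) {
    assert(rect != nullptr);
    return rect->width * rect->height;
}

// Writes through the pointer; whether the managed object observes the
// change depends on the copy direction chosen on the managed side.
extern "C" void rect_move(UnmanagedRect* rect, int32_t dx, int32_t dy) {
    assert(rect != nullptr);
    rect->x += dx;
    rect->y += dy;
}
```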

In managed code, we need to specify the memory layout of the class, which is LayoutKind.Auto by default. With Auto, the runtime chooses whatever layout it considers appropriate, so the layout is not known to the outside world. We can use LayoutKind.Sequential or LayoutKind.Explicit to specify the layout instead. For example, the managed definition could be written as follows:
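
To see why the layout must be pinned down, it helps to look at where a C++ compiler actually places each field. The sketch below is an illustration under assumptions: `PacketHeader` and `describe_header_layout` are hypothetical names, and the exact offsets depend on the platform ABI. A managed declaration with a sequential or explicit layout has to reproduce these offsets for the two sides to agree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical unmanaged struct with mixed field sizes.
struct PacketHeader {
    uint8_t  kind;     // 1 byte, usually followed by padding
    int32_t  length;   // typically 4-byte aligned
    uint16_t flags;
};

// Reports where the compiler placed each field.
struct HeaderLayout {
    std::size_t kind_offset;
    std::size_t length_offset;
    std::size_t flags_offset;
    std::size_t total_size;
};

inline HeaderLayout describe_header_layout() {
    // The first member of a standard-layout struct always sits at offset 0.
    assert(offsetof(PacketHeader, kind) == 0);
    return HeaderLayout{
        offsetof(PacketHeader, kind),
        offsetof(PacketHeader, length),
        offsetof(PacketHeader, flags),
        sizeof(PacketHeader),
    };
}
```

The reported size is usually larger than the sum of the field sizes because of alignment padding, which is exactly the kind of detail an automatically chosen managed layout is free to ignore.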

In addition, method parameters have their own marshaling rules. As mentioned earlier, most data crosses the boundary through marshaling. To control the direction of the copy, apply the [In], [Out], or [In, Out] attributes, as shown below:

When these attributes are not specified, the copy behavior is determined by the kind of data type (value or reference).

For example, when a reference type (class, array, string, interface) is passed by value, it is treated as [In] for performance reasons. This is the default, and it means nothing is copied back from unmanaged to managed memory.

2. Passing structs

Structs differ from classes in two ways:

Structs are allocated on the runtime stack.
LayoutKind.Sequential is the default, so no additional attributes are needed when a struct is used by unmanaged code.

When a struct is passed to unmanaged code, no memory copy is made in the following cases:

The struct is passed by value, lives on the stack, and consists only of blittable types.
The struct is passed by reference.

In these cases you do not need to specify [Out]. On the other hand, if a struct contains non-blittable types such as System.Boolean, System.String, or arrays, you must specify the marshaling yourself.
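
System.Boolean is a typical example of a non-blittable type: by default P/Invoke marshals it as a 4-byte Win32 BOOL, while a C++ `bool` usually occupies one byte. The sketch below shows one way the unmanaged side might be declared so that the sizes agree under that default. `RenderOptions` and `is_vsync_enabled` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unmanaged struct. The flag is declared as a 4-byte integer
// so that it lines up with a Boolean marshaled as a 4-byte BOOL.
struct RenderOptions {
    int32_t width;
    int32_t vsync_enabled;  // 0 = false, non-zero = true
};

static_assert(sizeof(RenderOptions) == 8,
              "two 4-byte fields, no padding expected");

// Converts the 4-byte flag to a normalized 0/1 value at the boundary.
extern "C" int32_t is_vsync_enabled(const RenderOptions* opts) {
    assert(opts != nullptr);
    return opts->vsync_enabled != 0 ? 1 : 0;
}
```

If the unmanaged struct used a one-byte `bool` instead, the managed side would need an explicit marshaling attribute on that field to match it.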

Based on the unmanaged code defined above, the wrapper can be:

Unmanaged functions can return a struct by value, but not through ref or out. So to return a pointer to a struct, you must either use IntPtr or declare the extern method as unsafe. If you use IntPtr as the return value, the Marshal.PtrToStructure family of methods converts the pointer into a managed struct.

Member variables

For member variables of classes and structs, the prudent choice is not to pass classes or structs that contain reference types (such as classes) to unmanaged code. Unmanaged code cannot safely manipulate managed references, and the runtime does not necessarily marshal such members. It is therefore best not to include arrays, and especially strings, in a wrapped class. If this cannot be avoided, you need to customize the marshaling.

For example:

Or:

Note that this usage requires the memory to be allocated in managed code, for example:

5. GC safety

Because marshaling works by copying data, GC safety is rarely a concern for the marshaled values themselves. As mentioned above, however, IntPtr and unsafe code are typically used to avoid marshaled copies, and with raw pointers you must take care that the owning object is not garbage collected while the function is running. For example:

After the call to c.m(), the instance of C becomes eligible for collection and the GC may reclaim it. Yet the unmanaged function C.OperateOnHandle may well still be using _handle, because the call has crossed the boundary and managed code has no way of knowing this. The solution is to use HandleRef instead of IntPtr in this case; it guarantees that the managed object is not collected until the unmanaged call has returned. In .NET 2.0 and later, SafeFileHandle or SafeWaitHandle can also be used; see the documentation (www.mono-project.com/d…).

Since managed code holds unmanaged resources, it is also responsible for releasing them. A simple way to do this is to give every wrapper class a release function and call it when the resource is no longer needed. If you do not want to wait for the GC to finalize the object, call GC.SuppressFinalize in the release function to keep the object out of the finalization queue and reclaim the resources directly.

If you are not comfortable calling the release function manually, you can wrap the object in a using block, which ensures that it is released automatically at the end of the block. The code looks like this:

As a final note, because finalization promotes objects to older GC generations, wrapper classes should avoid virtual functions and should be declared sealed. If a member of the finalizable object is an ArrayList that contains other objects, then the list, the elements it contains, and everything those elements recursively reference are all promoted as well. The older the generation, the less often it is collected. So the optimal strategy is for every class with a finalizer to be a leaf node, with the rest of the object graph built from such leaves, which do not reference one another.

6. Summary

This article analyzed how the unmanaged heap, the runtime, and the managed heap relate to one another under the Mono framework, and how calls cross between them. On the memory side, it covered marshaling and the similarities and differences between classes and structs. I had planned to include some analysis specific to Unity3D, but the article is already long enough that I fear nobody would read it, so I will split it into parts. I hope the series does not end up unfinished.

References:

www.mono-project.com/d…

en.wikipedia.org/wiki…

www.mono-project.com/d…

docs.go-mono.com/index… :System.Runtime.InteropServices.StructLayoutAttribute

docs.microsoft.com/zh…

msdn.microsoft.com/zh…

docs.microsoft.com/en…

www.uml.org.cn/c++/201…

docs.go-mono.com/index… :System.Runtime.InteropServices.HandleRef

docs.go-mono.com/index… :System.Runtime.InteropServices.LayoutKind.Auto

