1. Referencing the Adobe PDF Reader from WinForms

When writing WinForms programs, I often reference third-party components, including COM components. For one desktop program I needed to display PDFs. The open-source components I looked at did not handle PDFs very compatibly, so I referenced the Adobe PDF Reader directly for display, and it coped well with the different PDFs I tested. So how do you reference it?

  • Open "Choose Items" in the Toolbox

  • Add the COM component

On the COM Components tab, find Adobe PDF Reader, check it, and click OK to add the component to the Toolbox.

  • Use the COM component

Create a new form or user control and drag the Adobe PDF Reader component you just added onto it, just as you would any WinForms control.

An AxAcroPDFLib.AxAcroPDF control is generated in the form class. Navigating into the control class shows the methods it provides, including LoadFile for loading and displaying a PDF, gotoFirstPage, and other page-navigation methods.

The control's parent class is AxHost. The summary comment on AxHost reads:

Wraps ActiveX controls and exposes them as fully featured Windows Forms controls.

That raised a lot of questions for me: what ActiveX controls are, how COM components are used, how the AxAcroPDFLib.AxAcroPDF class is generated, and how WinForms and COM interoperate. So I did some searching and reading.

2. ActiveX control

ActiveX control technology rests on a foundation of COM, connectable objects, compound documents, property pages, OLE Automation, object persistence, and system-provided font and picture objects. A control is essentially a COM object that exposes the IUnknown interface, through which clients can obtain pointers to its other interfaces. Controls can support licensing through IClassFactory2, as well as self-registration. In other words, an ActiveX control is built on COM objects and uses COM so that controls written in different languages can call one another. ActiveX controls can be written with ATL or MFC, but I have used neither and have never written one, so I will skip that part and just take away the concept. Since ActiveX is based on COM, let's look at what COM is.

3. COM technology

The Microsoft Component Object Model (COM) defines a binary interoperability standard for creating reusable software libraries that interact at run time. You can use COM libraries without having to compile them into your application. COM is the basis for many Microsoft products and technologies, such as Windows Media Player and Windows Server.

COM defines a binary standard that applies to many operating systems and hardware platforms. For network computing, COM defines a standard wire format and protocol for interactions between objects running on different hardware platforms. COM is independent of implementation language, which means you can create COM libraries in different programming languages, such as C++ and the .NET Framework languages.

The COM specification provides all the basic concepts that support cross-platform software reuse:

  • A binary standard for function calls between components.
  • A provision for grouping functions into strongly typed interfaces.
  • A base interface that provides polymorphism, feature discovery, and object lifetime tracking.
  • A mechanism that uniquely identifies components and their interfaces.
  • A component loader that creates component instances from a deployment.

COM has several parts that work together to create applications built from reusable components:

  • A host system that provides a runtime environment conforming to the COM specification.
  • Interfaces that define feature contracts, and components that implement those interfaces.
  • Servers that provide components to the system, and clients that use the functionality the components provide.
  • A registry that tracks where components are deployed on local and remote hosts.
  • A Service Control Manager that finds components on local and remote hosts and connects servers to clients.
  • A structured storage protocol that defines how to navigate the contents of files on the host file system.

Enabling code reuse across hosts and platforms is central to COM. A reusable interface implementation is called a component, a component object, or a COM object. A component implements one or more COM interfaces. You define a custom COM library by designing the interfaces that the library implements. Users of the library can discover and use its functionality without knowing the details of its deployment and implementation.

That is the official definition. More details are available at docs.microsoft.com/zh-cn/windo…, including how COM is defined and implemented, objects and interfaces, interface implementation, the IUnknown interface, and so on.

But how does it actually work in practice? Here is an entertaining summary:

COM is primarily a set of interfaces for C/C++, although Microsoft also pushed it to VB, Delphi, and a host of other odd platforms. Its main purpose is to publish object-based interfaces through DLLs. As we know, the DLL interface was designed for C, and the functions a DLL exports are basically C functions. In principle, after a DLL is loaded into memory it tells you the addresses of a set of functions, and you call them yourself.

For C++ this is a headache. Suppose you have a class. The first step in using a class is to create it: new MyClass(). With new, the compiler computes the size of MyClass in order to allocate memory, but if the library is upgraded, the class may gain new members and its size changes, so memory allocated from the old definition cannot be used with the new library. To solve this, the DLL must export a CreateObject function that replaces the constructor and returns an interface. The definition of the interface can also change between versions, so to stay compatible with earlier versions while offering new functionality, the object needs to be able to return different versions of the interface. An interface is really a C++ class with only pure virtual functions, adjusted a little to be compatible with C and other languages. After this change, destruction (~MyClass() or delete) also needs rethinking: the same object may have handed out many interfaces, some of them still in use, and deleting one would invalidate all the others. So reference counting is introduced, allowing many users to share the same object. So far none of this is very strange. In C++ we also sometimes use factory methods instead of constructors to get special polymorphism, reference counting, and so on.

The odd part of COM is that Microsoft was imaginative enough to create an operating-system-level factory, which requires every interface to be identified by a UUID. From then on, a UUID is all you need to get any interface you want, without even linking against a specific DLL. It is as if a COM programmer on Windows who wants to call another library only needs to open the spellbook, find the strange inscription “Excel = {xxx-xxx-xxxx…}”, and shout into the air: “Excel! CoCreateInstance, {xxx-xxx-xxxx…}”. A monster then springs out of the magic circle. We cannot see what it looks like at all, because at this point its type is IUnknown, another highly imaginative Microsoft design: the base of all interfaces. We need it in a form we can control, so we shout the next command: “Transform, Excel 2003 form! QueryInterface, {xxx-xxx-xxxx…}”. QueryInterface uses different UUIDs to stand for different versions of an interface. The monster then turns into the Excel 2003 interface we need, although we do not know whether it is really 2003, 2007, or something later. When we are done with the summoned beast, we say to it: “Go back, summoned beast! Release!” It is not always obedient, though: orders given to it may not have finished, and it will faithfully wait until they are done before going back. Of course, we do not care about those details. (Source: www.zhihu.com/question/49…)
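The factory-plus-reference-counting idea from the summary can be sketched in portable C++. This is only an illustration and not real COM code: ICalculator, Calculator, CreateCalculator, and LiveObjects are hypothetical names invented here, and CreateCalculator stands in for the function a DLL would export in place of a constructor.

```cpp
#include <cassert>

// Hypothetical interface: only pure virtual functions, so callers never
// depend on the size or layout of the concrete class behind it.
struct ICalculator {
    virtual int AddRef() = 0;           // register one more holder
    virtual int Release() = 0;          // one holder is done; may destroy the object
    virtual int Add(int a, int b) = 0;  // illustrative payload method
protected:
    ~ICalculator() = default;           // callers must use Release(), never delete
};

namespace {
int g_live_objects = 0;  // demo-only counter so destruction can be observed

// Concrete class, invisible to callers. It can gain members in a later
// version without breaking code compiled against ICalculator.
class Calculator final : public ICalculator {
public:
    Calculator() { ++g_live_objects; }
    int AddRef() override { return ++refs_; }
    int Release() override {
        int remaining = --refs_;
        if (remaining == 0) delete this;  // last holder gone
        return remaining;
    }
    int Add(int a, int b) override { return a + b; }
private:
    ~Calculator() { --g_live_objects; }
    int refs_ = 1;  // the factory hands out the first reference
};
}  // namespace

// Stand-in for the exported factory function that replaces "new MyClass()".
ICalculator* CreateCalculator() { return new Calculator(); }

int LiveObjects() { return g_live_objects; }

// Usage: two holders share one object; it is destroyed on the last Release().
void DemoSharedLifetime() {
    ICalculator* first = CreateCalculator();  // reference count: 1
    ICalculator* second = first;
    second->AddRef();                         // reference count: 2
    assert(second->Add(2, 3) == 5);
    first->Release();                         // count 1, object still alive
    assert(LiveObjects() == 1);
    second->Release();                        // count 0, object deletes itself
    assert(LiveObjects() == 0);
}
```

Because the caller only ever sees ICalculator and never calls new or delete on the concrete class, the library side stays free to change the class layout between versions, which is the problem the summary describes.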

From this summary we can see that every COM interface derives from IUnknown. Once we hold an IUnknown pointer, we need to convert it to the type we actually want to use. A forced cast could produce the wrong type, and Microsoft considers it unsafe for callers to cast directly, so a unique identifier is needed to pin down each class. That identifier is the GUID: a class ID is called a CLSID, an interface ID is called an IID, and the conversion function is QueryInterface. QueryInterface, a pure virtual function in IUnknown, does one simple thing: it determines whether the object can be converted to the interface a given GUID identifies. If not, it returns E_NOINTERFACE; if so, it returns S_OK and hands back the converted pointer through an output parameter. A COM component needs a name, namely a UUID, and because we always work with the interfaces inside a component rather than with the component directly, each interface needs a UUID as well. With this much machinery, the COM architecture clearly needs a middle layer, a ferryman of sorts: the COM Library (a set of DLLs) plus the registry. Application A calls the COM Library and passes in the UUIDs of the component and the interface it wants. The COM Library loads the DLL for the component provided by application B and returns an interface pointer to application A. That pointer leads to a set of function pointers through which A can call the functions in B.
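The lookup flow described above (class ID to factory, then interface ID to pointer) can be mimicked with a small portable mock. Everything here is an assumption made for illustration: the string identifiers, IUnknownLike, IViewer, PdfViewer, Registry, and CreateInstance are invented names, an in-memory map stands in for the Windows registry, and the result constants merely play the roles of S_OK, E_NOINTERFACE, and a "class not registered" error.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

using Result = long;
const Result kOk = 0;                  // plays the role of S_OK
const Result kNoInterface = -1;        // plays the role of E_NOINTERFACE
const Result kClassNotRegistered = -2; // plays the role of a registration error

// Made-up identifiers; real COM uses 128-bit GUIDs here.
const std::string IID_Unknown = "{iid-unknown}";
const std::string IID_Viewer = "{iid-viewer}";
const std::string IID_Printer = "{iid-printer}";
const std::string CLSID_PdfViewer = "{clsid-pdf-viewer}";

struct IUnknownLike {
    virtual Result QueryInterface(const std::string& iid, void** out) = 0;
    virtual int AddRef() = 0;
    virtual int Release() = 0;
protected:
    ~IUnknownLike() = default;
};

struct IViewer : IUnknownLike {
    virtual int PageCount() = 0;
protected:
    ~IViewer() = default;
};

class PdfViewer final : public IViewer {
public:
    // Checked conversion: succeed only for interfaces this class implements.
    Result QueryInterface(const std::string& iid, void** out) override {
        if (iid == IID_Unknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == IID_Viewer) {
            *out = static_cast<IViewer*>(this);
        } else {
            *out = nullptr;
            return kNoInterface;
        }
        AddRef();  // the caller now owns one reference
        return kOk;
    }
    int AddRef() override { return ++refs_; }
    int Release() override {
        int remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }
    int PageCount() override { return 3; }  // fixed value for the demo
private:
    ~PdfViewer() = default;
    int refs_ = 1;  // reference held by whoever constructed the object
};

using Factory = std::function<IUnknownLike*()>;

// In-memory stand-in for the registry: class ID -> how to create the class.
std::map<std::string, Factory>& Registry() {
    static std::map<std::string, Factory> registry = {
        {CLSID_PdfViewer, [] { return static_cast<IUnknownLike*>(new PdfViewer()); }},
    };
    return registry;
}

// Mock of the "middle layer": find the class, create it, ask for an interface.
Result CreateInstance(const std::string& clsid, const std::string& iid, void** out) {
    *out = nullptr;
    auto entry = Registry().find(clsid);
    if (entry == Registry().end()) return kClassNotRegistered;
    IUnknownLike* object = entry->second();            // reference count: 1
    Result result = object->QueryInterface(iid, out);  // count 2 on success
    object->Release();  // drop the creation reference; destroys the object on failure
    return result;
}

// Usage: create by class ID, use the interface, then release it.
void DemoCreateAndQuery() {
    void* raw = nullptr;
    Result created = CreateInstance(CLSID_PdfViewer, IID_Viewer, &raw);
    assert(created == kOk);
    IViewer* viewer = static_cast<IViewer*>(raw);
    assert(viewer->PageCount() == 3);
    viewer->Release();
}
```

The caller never names the concrete PdfViewer class or the module that contains it; the two identifiers are enough, which is the point the summary makes about the operating-system-level factory.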

Note: UUID and GUID refer to the same kind of value; GUID is Microsoft's name for a UUID.

4. aximp.exe (Windows Forms ActiveX Control Importer)

With ActiveX controls and COM components introduced, let's return to how we imported the ActiveX control at the start. The ActiveX Control Importer converts the type definitions in an ActiveX control's COM type library into a Windows Forms control. Windows Forms can only host Windows Forms controls, that is, classes derived from Control. Aximp.exe generates a wrapper class that lets an ActiveX control be hosted on a Windows Form, so you get the same design-time support and programming model as with other Windows Forms controls. To host an ActiveX control, you must generate a wrapper control derived from AxHost. The wrapper contains an instance of the underlying ActiveX control and knows how to communicate with it, but it appears as a Windows Forms control. The generated control hosts the ActiveX control and exposes its properties, methods, and events as its own.

This suggests that when we add a COM component to the Toolbox, Visual Studio runs the importer implicitly and generates the AxAcroPDFLib.AxAcroPDF wrapper control. AxAcroPDFLib is the generated interop assembly; it acts as a go-between in much the same way as the COM Library described in section 3.

5. Verification

Since AxAcroPDFLib is the go-between (the interop assembly), the project should reference it, and the reference list does indeed show this assembly.

If there is an interop assembly, it has to call the COM component, and to call the COM component it needs the UUID. Decompiling the assembly in dnSpy shows the ClsidAttribute value {ca8a9780-280d-11cf-a24d-444553540000}, and the same UUID appears in the constructor.

Then we open the registry and search for that value to see how the component is registered.
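To make the link between the CLSID string and the registry lookup concrete, here is a minimal portable sketch. The Guid struct mirrors the usual four-field GUID layout, but Guid, ParseGuid, and ClsidRegistryKey are hypothetical helper names written for this note. The key path follows the convention that COM classes are registered under HKEY_CLASSES_ROOT\CLSID\{...}; the code only builds the path string and does not read the registry.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Same field layout as the familiar GUID structure: 32 + 16 + 16 + 8*8 bits.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t data4[8];
};

// Parses the registry-style form "{8-4-4-4-12}" (hex digits, 38 characters).
// Returns false and leaves *out untouched if the text has any other shape.
bool ParseGuid(const std::string& text, Guid* out) {
    if (text.size() != 38 || text.front() != '{' || text.back() != '}') return false;
    for (std::size_t i = 1; i < 37; ++i) {
        bool dash_expected = (i == 9 || i == 14 || i == 19 || i == 24);
        bool ok = dash_expected
                      ? text[i] == '-'
                      : std::isxdigit(static_cast<unsigned char>(text[i])) != 0;
        if (!ok) return false;
    }
    auto hex = [&text](std::size_t pos, std::size_t len) {
        return std::stoul(text.substr(pos, len), nullptr, 16);
    };
    out->data1 = static_cast<std::uint32_t>(hex(1, 8));
    out->data2 = static_cast<std::uint16_t>(hex(10, 4));
    out->data3 = static_cast<std::uint16_t>(hex(15, 4));
    out->data4[0] = static_cast<std::uint8_t>(hex(20, 2));
    out->data4[1] = static_cast<std::uint8_t>(hex(22, 2));
    for (std::size_t i = 0; i < 6; ++i) {
        out->data4[2 + i] = static_cast<std::uint8_t>(hex(25 + 2 * i, 2));
    }
    return true;
}

// Builds the key path to search for in the registry editor.
std::string ClsidRegistryKey(const std::string& clsid) {
    return "HKEY_CLASSES_ROOT\\CLSID\\" + clsid;
}

// Usage with the CLSID found in the decompiled interop assembly.
void DemoParseClsid() {
    Guid guid{};
    bool parsed = ParseGuid("{ca8a9780-280d-11cf-a24d-444553540000}", &guid);
    assert(parsed);
    assert(guid.data1 == 0xCA8A9780u);
    assert(guid.data2 == 0x280D && guid.data3 == 0x11CF);
    assert(guid.data4[0] == 0xA2 && guid.data4[1] == 0x4D);
}
```

For an in-process component, the InprocServer32 subkey under that path normally holds the location of the DLL that implements the class.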

6. Summary

Working through the concepts above and verifying my guesses gave me a basic understanding of COM's design and ideas, as well as of how an ActiveX control is called.

  1. An ActiveX control is an implementation built on COM.
  2. Adding the ActiveX control through the Visual Studio Toolbox runs aximp.exe.
  3. aximp.exe generates the interop assembly AxAcroPDFLib, including the AxHost-derived wrapper control that lets the ActiveX control be hosted on a Windows Form.
  4. Calling a method on AxAcroPDF invokes the corresponding function of the underlying control through the COM component.