Takeaway: Baidu Reputation Certification is a systematic verification service platform for enterprises, institutions, individuals, and other entities, built on technical capabilities such as artificial intelligence and big data. It aims to provide products and integrated solutions for identity verification, anti-fraud, information verification, and similar needs across industries and business areas. It covers two core services, risk control and credit enhancement, and supplies compliant customers and credible content mainly to information distribution platforms, interactive entertainment platforms, industry verticals, knowledge verticals, and commercial products. Its business goal is to build a Baidu Reputation Certification ecosystem around producers and the content and services they provide, so that users can obtain information and services from Baidu with confidence.

The full text is 5460 words and the expected reading time is 12 minutes.

= = =

I. Background

The business goal of Baidu Reputation Certification is to build a certification ecosystem around producers and the content and services they provide, so that users can obtain information and services from Baidu with confidence. It does this through two capabilities. Risk control sets admission thresholds to reduce business risk, answering the business pain point of "who is this, and who is accountable when a problem arises?" Credit enhancement vouches for the producer: for example, identity and occupation certification for individual users and official certification for institutional users attest to a producer's professional and social standing, while interest-enthusiast certification attests to the domain of the producer's content.

Figure (1)

II. Business Architecture Overview

Baidu Reputation Certification is a verification service platform for enterprises, institutions, individuals, and other entities, built on artificial intelligence, enterprise big data, and related technical capabilities. It provides product capabilities and integrated solutions for identity verification, anti-fraud, information verification, and similar needs across industries and business areas. By assessing a producer along dimensions such as objective identity, domain, production capacity, and influence, it outputs a certification result and gives businesses risk control functions such as admission screening. Disclosing certification information also makes content and service providers more credible and helps users make choices and decisions. The overall architecture is shown in Figure (2) below:

Figure (2)

(1) Access layer: unified authentication and identification shield the system from the business parties' multiple account systems. The layer provides a unified system authentication capability and minimizes integration cost for business parties.

(2) Certification schemes: to meet business parties' different certification needs, we offer a range of schemes to choose from, such as qualification, industry, and website certification. If no existing scheme meets a business party's requirements, finer-grained certification capabilities can be combined into a new scheme.

(3) Capability map: over the years the certification center has accumulated many certification capabilities. Each can serve enterprises or individuals on its own, or be wired into a more complex certification scheme through process engine configuration. As network technology and information keep changing, the capability map keeps growing.

(4) Shared technology: as the capability map grows, the certification middle platform also accumulates reusable technical components. Reusing them greatly improves development efficiency when onboarding new businesses.

(5) Infrastructure: the bottom layer consists of Baidu's big data and infrastructure (such as BOS and BDRP), which form the foundation of the certification center.

Details of the Baidu Reputation Certification products are shown in Figure (3):

Risk control capability is subdivided into qualification certification, real-name certification, field certification, and so on. The variety of certification methods meets business parties' admission needs.

Credit enhancement capability covers identity and occupation certification, official certification, interest-domain certification, V certification, and so on. Once certification is passed, the result is displayed publicly.

Figure (3)

= = =

III. Certification Middle Platform Technical Architecture

To keep up with the rapid, ongoing growth of the business and to make onboarding more efficient for every business party, the R&D team rebuilt the certification platform's technical architecture as a middle platform. The first phase of the certification middle platform was completed in 2019, and it has since accumulated an increasingly rich set of certification capabilities and certification schemes. The overall technical architecture is shown in Figure (4):

Figure (4)

(1) Capability management: the goal is service abstraction and business decoupling based on existing and likely future business, including defining capability standards, registering and discovering capabilities, and controlling and measuring capabilities. Capabilities accumulated so far include corporate certification, qualification certification, identity and occupation certification, and others.

(2) Scheme management: capabilities are the foundation, and certification schemes are what we ultimately deliver to front-end businesses. Scheme construction accumulates scenario-specific schemes for the certification business; capability orchestration and self-service scheme assembly happen in this layer. The platform currently offers qualification certification schemes for institutions, electronic business license certification schemes, and real-name certification schemes for individuals.

(3) Shared data: certification data and results are mutually recognized and shared, which reduces duplicate development on both the business side and the platform side, so that one certification can be reused many times.

(4) Visual central control platform: currently hosted by the Mobile Developer Center. As the capability portal it carries scheme introductions and capability views; it is also the service access portal, where business parties apply for specific service interfaces and configure QPS.

(5) Stability assurance: the middle platform can distribute, rate-limit, and even degrade services so that businesses at different levels do not affect one another. Traffic monitoring shows all kinds of traffic in real time, such as message traffic and interface call traffic. Service calls can leave data inconsistent because of network failures or downtime; the business consistency platform provides operational tools for configuring which data of which business to monitor or locate. The goal of all these measures is to keep the middle platform's stability at or above the required standard.
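The rate limiting and degradation described in (5) can be illustrated with a small sketch. This is not the platform's implementation: the `TokenBucket` and `Gateway` classes, the token-bucket algorithm itself, and the per-business QPS quotas are all assumptions chosen for illustration. The idea shown is only that each business line gets its own budget, so one caller exceeding its quota is degraded without affecting the others.

```python
import time


class TokenBucket:
    """Allows roughly `qps` calls per second; refills continuously over time."""

    def __init__(self, qps, clock=time.monotonic):
        self.qps = float(qps)
        self.clock = clock
        self.tokens = float(qps)   # start with one second of budget
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at one second of budget.
        self.tokens = min(self.qps, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Hypothetical entry point: one bucket per business line, with fallback."""

    def __init__(self, quotas, clock=time.monotonic):
        self.buckets = {biz: TokenBucket(qps, clock) for biz, qps in quotas.items()}

    def call(self, biz, handler, fallback):
        bucket = self.buckets.get(biz)
        if bucket is None or not bucket.allow():
            return fallback()      # over quota or unknown caller: degrade
        return handler()
```

Passing the clock in as a parameter keeps the limiter testable without sleeping; a production limiter would also need to be safe under concurrent calls, which this sketch does not address.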

IV. Core Certification Capabilities

4.1 Unified Authentication

In 2020, in line with company strategy, all MEG content producers were to open accounts and distribute content through a unified data stream, under a single B-side brand, Baijiahao. This covers unifying the account system (Baijiahao), nicknames, certification, homepages, and data flows.

Baidu Reputation Certification provides unified certification services for all B-side content producers. The first phase offers general-purpose certification services to businesses: real-name certification, identity and occupation certification, official certification, interest-domain certification, and V certification.

In this context, the certification middle platform R&D team provides a unified certification solution for all B-side businesses. The goal is to minimize each business party's integration cost and the cost of joint development between teams, so that businesses can go live quickly. The resulting unified certification architecture is shown in Figure (5) below:

Figure (5)

(1) Data acquisition and conversion: the data behind each certification service (such as identity and occupation certification or interest-domain certification) comes from the various business parties. We support several data access methods (synchronous pull, offline pull, DTS, and so on) to spare business parties unnecessary development work, and we preprocess each business party's data according to configurable processing rules.

(2) Core computation logic: this step converts metadata (for example 1, 2, 3) into values for each computed dimension (for example B, C, D) according to configured rules, then aggregates those dimensions under each certification service's admission and revocation rules (for example B > C > D) to decide whether the user currently qualifies for admission or should have certification revoked.

(3) Unified inspection: a certification result is not permanent. If a user who has already been certified later meets the revocation rules, the certification is revoked promptly. Unified inspection carries out these checks quickly and accurately, sets up circuit-breaker and offline mechanisms, and produces revocation reports and emails.

(4) Data change processing: certification results are delivered asynchronously through BigPIPE; a business party only needs to subscribe to the corresponding BP message to receive them.
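The core computation in step (2) can be sketched as follows. The article does not spell out what "B > C > D" means, so this sketch assumes it is a priority order in which the first dimension with an opinion decides the outcome. The dimension rules, metadata keys (`violations`, `license_verified`, `followers`), and thresholds are all hypothetical.

```python
# Hypothetical conversion rules. Each maps raw metadata to one dimension's
# verdict: True (supports admission), False (supports revocation), or
# None (this dimension has no opinion).
DIMENSION_RULES = {
    "B": lambda m: False if m.get("violations", 0) > 0 else None,
    "C": lambda m: True if m.get("license_verified", 0) == 1 else None,
    "D": lambda m: m.get("followers", 0) >= 1000,
}

# Assumed reading of "B > C > D": B outranks C, which outranks D.
PRIORITY = ["B", "C", "D"]


def evaluate(metadata, priority=PRIORITY, rules=DIMENSION_RULES):
    """Return (decision, deciding_dimension) for one user's metadata."""
    for name in priority:
        verdict = rules[name](metadata)
        if verdict is not None:
            return ("admit" if verdict else "revoke", name)
    # No dimension had an opinion: be conservative.
    return ("revoke", None)


decision = evaluate({"license_verified": 1, "followers": 10})
```

Keeping the rules and the priority list as data rather than code is what would let each certification service configure its own admission and revocation behavior.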

Unified certification now covers Baidu B-side scenarios such as account-opening admission and access to rights and benefits, offers a diverse set of certification services, and has been rolled out in Baijiahao, Haokan, Zhidao, the automotive business, and other businesses.

4.2 Qualification Certification Center

The first and most important capability incubated by the certification middle platform is the Qualification Certification Center. Qualification certification is the most important, most used, and most changeable certification capability, with the most complex business logic. About three quarters of the related capabilities involve qualifications, and a single medical project can involve more than 50 qualifications.

The qualification certification process is as follows: 1. the certification center gives business parties a way to submit various types of qualifications; 2. after a business party submits, the certification center completes a machine or manual audit; 3. the certification result is synchronized back to the business party. The process itself is fairly simple. The complexity comes from the growing number of product lines: businesses differ more and more, each has its own submission mode and interaction mode, and these factors interact in complicated ways. As a middle platform, we do not want to do heavy custom development for simple functions; that wastes time and manpower and runs against the direction a middle platform should take.

The questions then are: how do we meet these business requirements? How do we cope with limited manpower? How do we reduce the cost of business understanding for R&D staff? And how do we let users certify more efficiently and with a better experience?

As the software engineering classic The Mythical Man-Month says, there is no silver bullet for the fundamental problems of software engineering. The key is to find the most suitable architecture design and to implement it well.

Since customization requirements keep growing, can we abstract out templates of minimal granularity? Can those templates be assembled quickly and flexibly into a submission module, like pieces of a jigsaw puzzle? In other words: product solution = combination of different certification capabilities + reuse.

The design idea is shown in Figure (6) below:

Figure (6)

(1) Data abstraction and process abstraction: we abstract certification data down to the minimum granularity. For example, a business license is one complete certification item with three fields: number, photo, and validity period. Each field has a data type (the number is text, for instance), and attributes can be set on each field according to its type, such as minimum and maximum length for a text field.

(2) Data reuse and automatic rendering: when a user has already submitted a certification item in one business and another business needs the same item, the item does not have to be defined or submitted again, which saves development time and improves the user experience. Automatic rendering means that once the fields, types, and attributes of a certification item are defined as in (1), the front end generates the corresponding submission template by convention, with no custom front-end development.

(3) Platformization and self-service: product managers can define, combine, and configure all certification items on the platform themselves, and can publish them or take them offline on their own, with no R&D involvement at any point.
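The data abstraction and automatic rendering ideas above might look roughly like this in code. It is a sketch under stated assumptions, not the platform's schema: the `FieldSpec` and `AuthItem` names, the widget names, and the length limits on the license number are all illustrative.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FieldSpec:
    name: str
    type: str              # "text", "image", or "date_range" in this sketch
    min_len: int = 0       # only checked for text fields
    max_len: int = 10_000


@dataclass
class AuthItem:
    key: str
    label: str
    fields: List[FieldSpec]

    def validate(self, data: Dict[str, str]) -> List[str]:
        """Check a submission against the declared fields and attributes."""
        errors = []
        for f in self.fields:
            value = data.get(f.name)
            if not value:
                errors.append(f"{f.name}: required")
            elif f.type == "text" and not (f.min_len <= len(value) <= f.max_len):
                errors.append(f"{f.name}: length must be {f.min_len}-{f.max_len}")
        return errors

    def render(self) -> List[Dict[str, Any]]:
        """Derive a front-end form description purely from the definition."""
        widgets = {"text": "input", "image": "uploader", "date_range": "date-picker"}
        return [{"name": f.name, "widget": widgets[f.type]} for f in self.fields]


# The business license item from the text: number, photo, validity period.
BUSINESS_LICENSE = AuthItem(
    key="business_license",
    label="Business license",
    fields=[
        FieldSpec("number", "text", min_len=15, max_len=18),  # illustrative limits
        FieldSpec("photo", "image"),
        FieldSpec("validity", "date_range"),
    ],
)
```

Because validation and rendering are both derived from the same definition, adding a new certification item is a configuration change rather than new code, which is what makes self-service by product managers plausible.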

Guided by this idea, we built a general qualification certification platform. The platform has three parts, as shown in Figure (7) below:

Figure (7)

(1) Certification configuration management module: manages certification data in one place, handles configuration, and automatically generates the qualification submission page template.

(2) Product certification logic engine module: dispatches each product's certification logic to the appropriate service. General logic goes straight through the general certification flow. Logic that the general flow cannot handle is integrated by implementing a unified service interface, so a small amount of code is enough to plug product-specific logic into the overall system. Listener interfaces are provided before and after the certification service runs, and more complex certification logic is implemented in them. For example, in corporate account verification, users submit account information through the general submission and retrieval logic, but after submission the system must also interact with Baifubao so that Baifubao can make a payment; this payment logic is implemented in the post-processing listener. In short, interfaces keep the whole framework unified.

(3) General service module: extracts common logic, turns data interfaces into templates, and reduces duplicate development.
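The logic engine described in (2) combines three ideas: a general flow, a unified service interface for product-specific logic, and listeners that run before and after the service. A minimal sketch follows. All class, method, and product names here are hypothetical, and `request_payment` merely stands in for the interaction with the payment system.

```python
class AuthService:
    """Unified service interface; the base class is the general flow."""

    def process(self, submission):
        return {"status": "submitted", **submission}


class Engine:
    def __init__(self):
        self.services = {}   # product -> custom AuthService
        self.before = {}     # product -> listeners run before the service
        self.after = {}      # product -> listeners run after the service
        self.general = AuthService()

    def register(self, product, service):
        self.services[product] = service

    def on_before(self, product, listener):
        self.before.setdefault(product, []).append(listener)

    def on_after(self, product, listener):
        self.after.setdefault(product, []).append(listener)

    def handle(self, product, submission):
        for listener in self.before.get(product, []):
            listener(submission)
        # Products without custom logic fall through to the general flow.
        service = self.services.get(product, self.general)
        result = service.process(submission)
        for listener in self.after.get(product, []):
            listener(result)
        return result


payment_requests = []


def request_payment(result):
    # Stand-in for asking the payment system to pay the submitted account.
    payment_requests.append(result["account"])
    result["status"] = "awaiting_payment"


engine = Engine()
engine.on_after("corporate_account", request_payment)
```

With this shape, corporate account verification reuses the general submission flow unchanged and adds only the post-processing listener, which matches the "small amount of code" goal stated above.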

Automatically generating certification content is essentially system design for an abstracted process, a shift from requirement-oriented development to role-oriented development. The system's use cases extend from the end user alone to product designers, UI interaction designers, data template designers, audit template designers, trigger rule designers, inspection rule designers, and finally end users. By combining the scaffolding that the qualification center provides, these roles can quickly assemble a qualification product.

4.3 Process Engine

This scaffolding yields all kinds of solutions, but a solution does not become a working system by itself. Developers have to keep a great deal of business process knowledge in their heads, and in many projects process problems outweigh software problems. How can middle platform developers be freed from this sea of business knowledge? We designed a workflow engine to orchestrate certification capabilities. Its core components are decision nodes and routers, a process configuration center, and a plug-in container.

Process designers model a flow in stages, beginning with (1) sorting out the business process and then (2) design and modeling. The engine runs as an ODP extension, which is friendlier to PHP developers. Its configuration is based on ZK (ZooKeeper), so processes can be upgraded and delivered at runtime. Nodes reserve synchronous and asynchronous triggers so that multiple instances can execute cooperatively.

The design idea of the process engine is shown in Figure (8) below:

Figure (8)

(1) Each service is an independent component that can be developed and deployed separately;

(2) Common nodes, decision nodes, aggregation nodes, and other node types are supported to meet various business needs;

(3) The engine drives the flow automatically to deliver a complete certification solution, reducing the coupling caused by too much business logic in code.
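To make the three node types concrete, here is a toy flow runner. It is only a sketch of the general idea and assumes nothing about the real engine's API: the class names, the context dictionary, and the example flow (route by recognition confidence to machine checks or manual review) are all hypothetical.

```python
class CommonNode:
    """Runs one action, then moves to a fixed next node."""

    def __init__(self, action, next_node=None):
        self.action, self.next_node = action, next_node

    def execute(self, ctx):
        self.action(ctx)
        return self.next_node


class DecisionNode:
    """Chooses the next node from the current context."""

    def __init__(self, router):
        self.router = router

    def execute(self, ctx):
        return self.router(ctx)


class AggregationNode:
    """Combines several capability checks into a single result."""

    def __init__(self, checks, key, next_node=None):
        self.checks, self.key, self.next_node = checks, key, next_node

    def execute(self, ctx):
        ctx[self.key] = all(check(ctx) for check in self.checks)
        return self.next_node


def run_flow(nodes, start, ctx, max_steps=100):
    """Follow node-to-node transitions until a node returns None."""
    trace, current = [], start
    while current is not None:
        if len(trace) >= max_steps:
            raise RuntimeError("flow did not terminate")
        trace.append(current)
        current = nodes[current].execute(ctx)
    return trace


# Hypothetical flow definition; in a real system this would be configuration.
FLOW = {
    "collect": CommonNode(lambda ctx: ctx.setdefault("confidence", 0.0), "route"),
    "route": DecisionNode(
        lambda ctx: "machine_checks" if ctx["confidence"] >= 0.9 else "manual_review"
    ),
    "machine_checks": AggregationNode(
        [lambda ctx: ctx.get("license_valid", False),
         lambda ctx: ctx.get("name_matches", False)],
        key="passed",
        next_node="finish",
    ),
    "manual_review": CommonNode(lambda ctx: ctx.update(passed=None), "finish"),
    "finish": CommonNode(lambda ctx: ctx.update(done=True)),
}
```

Since the flow is a plain mapping from node names to nodes, changing the process means editing that mapping rather than the business code, which is the decoupling the design aims for.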

4.4 Intelligent Authentication

Since 2017, as the number of customers in the content ecosystem grew explosively, we realized that the existing certification scheme would become a bottleneck for business growth, and we began trying to renew it with AI. We used a PaddlePaddle classification model together with OCR recognition of text content. This was during the rapid build-out of the Xiongzhang brand; we applied the classification and recognition models to Xiongzhang's machine audit, which improved audit efficiency to some extent and reduced the audit workload.

For users, however, the application process did not change in any fundamental way: they still submitted qualifications and waited for the result of a manual audit. In addition, too few qualification categories were supported, and the models generalized poorly.

The R&D team went on to ask: what is the essence of certification? Put simply, certification means collecting information about business qualifications, business personnel, and business premises in order to establish authenticity. We had previously built many MIS systems for this purpose. These certification processes essentially follow clear rules, which makes them a natural direction for applying AI. To collect diverse and complex certification information, the certification client needs capabilities on par with a native mobile app, plus better portability. While Baidu's mini program platform was still in internal testing, we developed the first version of the intelligent certification solution on top of mini programs.

Real-world conditions such as printing quality, lighting, and occlusion affect recognition. At the start, we compared the in-house IDL qualification-template recognition model with recognition models from outside the company. For the most common qualifications, overall recall in the production environment was below 80%, which could not support field promotion. Analyzing the bad cases that were not recalled, we found that a person can infer an unclear character accurately because of background knowledge plus reasoning. Although we cannot give a machine the knowledge system a human has, we can use enterprise credit big data to compensate for the gaps in recognition.

Figure (9)

We therefore went on to launch an intelligent qualification certification service, as shown in Figure (10):

Figure (10)

Intelligent qualification certification is a solution for common qualifications, built on enterprise credit big data and image recognition. It spans qualification recognition, classification, and verification against official data. For integration, it offers a backend-facing machine audit API and a user/customer-facing intelligent certification mini program.

We index the full set of enterprise information in ES (Elasticsearch), run similarity retrieval on the recognized content, and then perform secondary verification and official comparison on the similar records. Overall qualification accuracy reaches 98%; certification efficiency has improved greatly, and the service is widely used across many product lines.
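The idea of using known enterprise records to compensate for imperfect recognition can be shown with a small sketch. Note the substitutions: Python's standard `difflib` stands in for ES similarity retrieval here, and the registry contents, the `verify` function, and the 0.8 cutoff are all made up for illustration.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical official records: enterprise name -> license number.
REGISTRY = {
    "Beijing Example Technology Co., Ltd.": "91110000EXAMPLE001",
    "Shanghai Sample Trading Co., Ltd.": "91310000EXAMPLE002",
}


def verify(ocr_name, ocr_number, cutoff=0.8):
    """Match a noisy OCR result against the registry.

    Step 1 (similarity retrieval): find the closest registered name.
    Step 2 (secondary verification): the recognized number must also be
    close to the official number for that enterprise.
    Returns the official name on success, or None.
    """
    candidates = get_close_matches(ocr_name, list(REGISTRY), n=1, cutoff=cutoff)
    if not candidates:
        return None
    name = candidates[0]
    number_score = SequenceMatcher(None, ocr_number, REGISTRY[name]).ratio()
    return name if number_score >= cutoff else None


# OCR misread two letters "l" as the digit "1"; the registry still resolves it.
matched = verify("Beijing Examp1e Techno1ogy Co., Ltd.", "91110000EXAMPLE001")
```

Requiring two independent fields to agree with the same official record is what keeps a fuzzy name match from turning into a false acceptance.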

V. Future Development and Reflections

On the business side, we will keep deepening content certification and explore service certification, so that users can obtain more accurate and reliable content and services from Baidu. On the technology side, the certification middle platform R&D team will improve certification efficiency by subdividing certification nodes and by other means, further strengthen intelligent machine audit, and reduce the cost of manual certification. On label coverage, we will move from passively waiting for users to actively discovering potential users and encouraging them to become quality content producers, who in turn contribute more and better content to Baidu and serve its users.
