The common-sense concept atlas is a knowledge graph built around common-sense concepts as entities and the relations between them, with a focus on Meituan scenarios. This article introduces the schema of the Meituan common-sense concept atlas, the challenges encountered and the algorithms used during its construction, and finally some of its current business applications.

1. Introduction

In natural language processing, a recurring question is how to understand natural language well. When we humans read a piece of text, we usually start from the information at hand, associate it with related information stored in our brains, and only then arrive at an understanding. Take the sentence “he doesn’t like apples, but he likes ice cream”. To understand it, people draw on what they already know: apples are sweet and a bit crunchy; ice cream is sweeter than apples, soft and cold, and relieves the heat in summer; children prefer sweets and ice cream. With this knowledge, one can infer several reasons for preferring ice cream. However, much current work on natural language understanding still operates at the level of surface information. It resembles a Bayesian probability estimate: given the known training text, it looks for the text that most probably satisfies the conditions.

Understanding text the way a human does is the ultimate goal of natural language processing, so more and more research introduces additional knowledge to help machines understand natural language text. Plain text only expresses external objective facts, whereas knowledge is the induction and summary of those facts based on the text. Adding auxiliary knowledge to natural language processing therefore improves natural language understanding.

Building a body of knowledge is a direct way to understand natural language more accurately. The knowledge graph was proposed around this idea, in the hope that machines given explicit knowledge can reason like humans. In 2012, Google formally proposed the concept of the Knowledge Graph, originally to optimize the results returned by its search engine and to improve users’ search quality and experience.

2. Introduction to the common-sense concept atlas

The common-sense concept atlas establishes relationships between concepts to help understand natural language text. Our atlas focuses on Meituan scenarios and helps improve search, recommendation, feeds and other services there.

From the perspective of understanding requirements, the atlas needs to provide three kinds of understanding capability:

  1. What it is: what a concept is and which concepts are the core ones. For “repair the washing machine”, for example, we need to know what “repair” is and what “washing machine” is.
  2. What it is like: an attribute of the core concept that refines one aspect of it. In “restaurant with terrace”, “parent-child amusement park” and “fruit mille crepe cake”, the modifiers “with terrace”, “parent-child” and “fruit mille crepe” each describe one aspect of the core concept, so the core concept must be linked to the corresponding attributes and attribute values.
  3. What to give: bridging the gap between the concept a user searches for and the concept that undertakes (fulfils) it. Searches such as “reading”, “shopping” and “walking the baby” have no clearly corresponding supply concept, so we build an association network between search concepts and supply concepts to solve this kind of problem.

In addition, POI (Point of Interest), SPU (Standard Product Unit) and group deals (Tuandan) are instances in Meituan scenarios, and they need to be connected with the concepts in the atlas.

Based on these objectives, the construction of the common-sense concept atlas is broken down into three types of nodes and four types of relationships, described below.

2.1 Three types of nodes

Taxonomy node: Taxonomy nodes are predefined types and come in two kinds. One kind can itself serve as a core category in Meituan scenarios. The other qualifies a core category, for example color, manner or style. Both kinds are defined to support understanding in search, recommendation and similar scenarios. The Taxonomy nodes are defined as follows:

Atomic concept node: a node that forms a minimum semantic unit of the atlas, that is, a word of the smallest granularity with independent semantics, such as Internet celebrity, dog café, face and hydration. Every atomic concept is attached to a defined Taxonomy node.

Composite concept node: a concept node composed of atomic concepts and their corresponding attributes, such as facial hydration. A composite concept needs a hypernym-hyponym (upper-lower) relationship with the concept of its core word.

2.2 Four types of relationships

Synonym/hypernym-hyponym relationship: semantic synonymy or hypernymy (upper-lower relationship) between concepts, for example a synonym (syn) relationship on “facial hydration”.

Concept-attribute relationship: a typical concept-property-value (CPV) relationship that describes and defines a concept along various attribute dimensions, such as hotpot – taste – not spicy and hotpot – specification – single, as shown in the following examples:

There are two types of conceptual attribute relationships.

Pre-defined concept attributes: currently, we have pre-defined the following typical concept attributes:

Open concept attributes: in addition to the public concept attributes we define ourselves, we also mine specific attribute words from text to supplement them, for example posture, theme, comfort and word of mouth.

Concept undertaking relationship: this relationship links the concept a user searches for with the Meituan concept that undertakes (fulfils) it, such as spring – place – botanical garden and stress relief – item – boxing.

With the “event” at its core, the undertaking relationship defines supply concepts such as “place”, “item”, “crowd”, “time” and “effect” that can meet the user’s need. Take the event “whitening” as an example: as a user need, it can be met by different supply concepts, such as beauty salons and water-light injections. The undertaking relationship types currently defined are as follows:

POI/SPU-concept relationship: POIs and SPUs are instances in Meituan scenarios, and the instance-concept relationship is the last stop in the knowledge graph, often where its business value is realized most directly. In search, recommendation and other business scenarios, the ultimate goal is to show POIs that meet users’ needs. Establishing the POI/SPU-concept relationship is therefore an important part of the common-sense concept atlas for Meituan scenarios, and it is also highly valuable data.
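To make the schema concrete, the following is a minimal sketch of how the three node types and four relationship types could be represented in code. The class names, field names and enum values are illustrative assumptions, not the representation actually used at Meituan; the example triples are taken from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    TAXONOMY = "taxonomy"
    ATOMIC_CONCEPT = "atomic_concept"
    COMPOSITE_CONCEPT = "composite_concept"


class RelationType(Enum):
    SYNONYM_HYPERNYM = "synonym_or_hypernym"
    ATTRIBUTE = "concept_attribute"        # concept-property-value (CPV)
    UNDERTAKING = "concept_undertaking"    # search concept -> supply concept
    INSTANCE = "poi_spu_concept"           # POI/SPU instance -> concept


@dataclass(frozen=True)
class Triple:
    head: str               # concept the relationship starts from
    label: str              # attribute or slot name (hypothetical field)
    tail: str               # attribute value or supply concept
    relation: RelationType


# Example triples quoted in the article.
TRIPLES = [
    Triple("hotpot", "taste", "not spicy", RelationType.ATTRIBUTE),
    Triple("hotpot", "specification", "single", RelationType.ATTRIBUTE),
    Triple("spring", "place", "botanical garden", RelationType.UNDERTAKING),
]


def tails_of(head, relation, triples=TRIPLES):
    """Return (label, tail) pairs for one head concept and relation type."""
    return [(t.label, t.tail) for t in triples
            if t.head == head and t.relation is relation]


print(tails_of("hotpot", RelationType.ATTRIBUTE))
```

Keeping the relationship type on each triple lets the same lookup serve attribute queries and undertaking queries alike.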

3. Construction of the common-sense concept atlas

The overall framework of atlas construction is shown in the figure below:

3.1 Concept Mining

The relationships in the common-sense concept atlas are all built around concepts, so mining these concepts is the first step of construction. Atomic concepts and composite concepts are mined with different methods.

3.1.1 Atomic concept mining

Atomic concept candidates are the smallest fragments obtained by word segmentation of text such as Query, User Generated Content (UGC) and group deals (Tuandan). A candidate is judged to be an atomic concept if it satisfies three criteria: popularity, meaningfulness and completeness.

  1. Popularity. A concept should be a word that is popular in one or more corpora, which is mainly measured by frequency features. For example, the word “desk Ben Kill” has very low search volume and very low frequency in the UGC corpus, so it does not meet the popularity requirement.
  2. Meaningfulness. A concept should be a meaningful word, which is mainly measured by semantic features. For example, “Cat” and “Dog” used as plain names usually carry no other actual meaning.
  3. Completeness. A concept should be a complete word, which is mainly measured by the proportion of independent searches (the number of searches in which the word is the whole Query / the total number of searches for Queries that contain the word). For example, “child set” is a candidate produced by wrong word segmentation: it has a high frequency in UGC, but its proportion of independent searches is low.

Based on the above characteristics, we train an XGBoost classification model to judge whether an atomic concept candidate is reasonable, using training data that combines manual annotation with samples constructed automatically by rules.
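As an illustration of the popularity and completeness criteria, the sketch below computes frequency features and the proportion of independent searches for one candidate. The function name, the input layout (a list of query/count pairs and a list of UGC tokens) and the substring test for "contains the word" are assumptions made for this example; the resulting feature dictionary is the kind of input a classifier such as XGBoost could consume.

```python
from collections import Counter


def atom_concept_features(candidate, query_log, ugc_tokens):
    """Compute simple popularity and completeness features.

    query_log  -- list of (query_text, search_count) pairs (assumed layout)
    ugc_tokens -- list of tokens from segmented UGC text (assumed layout)
    """
    # Searches where the candidate is the entire query.
    exact = sum(count for query, count in query_log if query == candidate)
    # Searches for any query that contains the candidate.
    containing = sum(count for query, count in query_log if candidate in query)
    ugc_freq = Counter(ugc_tokens)[candidate]
    return {
        "query_freq": containing,
        "ugc_freq": ugc_freq,
        "independent_ratio": exact / containing if containing else 0.0,
    }


query_log = [
    ("hot pot", 80),
    ("hot pot near me", 20),
    ("child set", 1),
    ("child set meal", 99),
]
ugc_tokens = ["hot pot", "child set", "child set", "child set"]

print(atom_concept_features("hot pot", query_log, ugc_tokens))
print(atom_concept_features("child set", query_log, ugc_tokens))
```

In this toy data “child set” is frequent in UGC but is almost never searched on its own, which is the pattern the completeness criterion is meant to catch.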

3.1.2 Composite concept mining

Composite concept candidates come from combinations of atomic concepts, and because combination is involved, judging them is more complicated than judging atomic concepts. A composite concept must be semantically complete and also have some recognition on Meituan. Given this type of problem, we adopt the Wide&Deep model structure: the Deep side is responsible for semantic judgment, and the Wide side introduces on-platform information.

Two characteristics of the model structure help it judge the reasonableness of composite concepts more accurately:

  1. Wide&Deep model structure: discrete features are combined with a deep model to judge whether a composite concept is reasonable.
  2. Graph Embedding features: association information between phrase collocations is introduced; for example, “food” can be collocated with “crowd”, “cooking method” and “quality”.
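The way a Wide&Deep model combines its two sides can be sketched in a few lines of plain Python: a linear model over discrete features (wide) and a small feed-forward network over a dense vector (deep) each produce a logit, and the sum goes through a sigmoid. The feature names, layer sizes and weights below are made up for illustration and say nothing about the features or architecture used in production.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def wide_logit(active_features, wide_weights):
    # Wide side: linear model over discrete (one-hot) features.
    return sum(wide_weights.get(name, 0.0) for name in active_features)


def deep_logit(embedding, hidden_w, hidden_b, out_w, out_b):
    # Deep side: one ReLU hidden layer over a dense semantic vector.
    hidden = [
        max(0.0, sum(x * w for x, w in zip(embedding, row)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return sum(h * w for h, w in zip(hidden, out_w)) + out_b


def wide_and_deep_score(active_features, embedding, params):
    # The two logits are summed before the final sigmoid.
    logit = wide_logit(active_features, params["wide"]) + deep_logit(
        embedding,
        params["hidden_w"], params["hidden_b"],
        params["out_w"], params["out_b"],
    )
    return sigmoid(logit)


# Hypothetical parameters; in practice they are learned jointly.
params = {
    "wide": {"searched_on_platform": 1.5, "appears_in_ugc": 0.5},
    "hidden_w": [[0.5, -0.2], [0.1, 0.3]],
    "hidden_b": [0.0, 0.0],
    "out_w": [1.0, 1.0],
    "out_b": -1.0,
}

with_wide = wide_and_deep_score({"searched_on_platform"}, [1.0, 1.0], params)
without_wide = wide_and_deep_score(set(), [1.0, 1.0], params)
print(with_wide, without_wide)
```

With the same semantic vector, the candidate that also has on-platform evidence receives the higher score, which is the role the Wide side plays in the description above.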

3.2 Mining relationships between concepts

Relationships between concepts answer the “what is it” question. They cover two parts: the concept-Taxonomy relationship and the hypernym-hyponym relationship between concepts.

3.2.1 Concept-Taxonomy relationship

The concept-Taxonomy relationship describes what type a concept is in terms of the manually defined Taxonomy. A concept may have several types in the Taxonomy system; for example, “lime fish” is an “animal” and also a “food”. We treat this as an Entity Typing task, whose input is the concept together with its corresponding context and whose output is the concept’s types.

3.2.2 Hypernym-hyponym relationships between concepts

The manually defined types let the knowledge system understand what a concept is, but such types are always limited. If a hypernym is not among the manually defined types, the corresponding hypernym-hyponym relationship cannot be understood. For example, the Taxonomy can assign types to “Western musical instrument”, “erhu” and “musical instrument”, but it cannot tell that “Western musical instrument” and “erhu” are both kinds of “musical instrument”. To address this, we mine hypernym-hyponym relationships between concepts with the following two methods:

Lexical rule-based method: this mainly handles the hypernym-hyponym relationship between atomic concepts and composite concepts. Candidate pairs with a lexical inclusion relationship, such as “Western musical instrument – musical instrument”, are mined as hypernym-hyponym pairs.
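A lexical inclusion rule of this kind can be sketched as a suffix check: if a known concept is a proper word-level suffix of a longer concept, the pair is emitted as a hypernym-hyponym candidate. The function name and the choice of a suffix test (rather than any other inclusion test) are assumptions for this example; the article does not spell out the exact rule.

```python
def lexical_hypernym_pairs(concepts):
    """Return (hyponym, hypernym) pairs found by word-level suffix inclusion."""
    vocabulary = set(concepts)
    pairs = []
    for concept in concepts:
        words = concept.split()
        # Try every proper suffix, longest first.
        for start in range(1, len(words)):
            suffix = " ".join(words[start:])
            if suffix in vocabulary:
                pairs.append((concept, suffix))
    return pairs


concepts = ["musical instrument", "western musical instrument", "erhu"]
print(lexical_hypernym_pairs(concepts))
```

The rule finds “western musical instrument – musical instrument” but says nothing about “erhu”, which shares no words with its hypernym.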

Context-based method: lexical rules can only judge candidate pairs in which lexical inclusion exists. For a pair without lexical inclusion, such as “erhu – musical instrument”, we first need to discover the relationship, that is, extract “erhu – musical instrument” as a candidate, and then judge whether it is a reasonable hypernym-hyponym relationship. When people explain an object, they usually introduce the type it belongs to; an explanation of “erhu”, for example, may say “erhu is a kind of traditional musical instrument”. From such explanatory text we can both extract a candidate pair like “erhu – musical instrument” and judge whether the candidate is reasonable. Context-based mining therefore has two parts: candidate relationship description extraction and hypernym-hyponym relationship classification.

  1. Candidate relationship description extraction: sentences in which two concepts co-occur (for example, explanatory text such as “erhu is a kind of traditional musical instrument”) are extracted as candidate relationship descriptions, yielding candidate relation pairs.
  2. Hypernym-hyponym relationship classification: once a candidate relationship description is obtained, we judge from the context whether the hypernym-hyponym relationship is reasonable. Special markers are inserted at the start and end positions of the two concepts in the text, and the vectors at the two start markers are concatenated as the representation of the relationship, which is then classified. The vectors are taken from the output of BERT, and the detailed model structure is shown in the figure below:

When constructing training data, sentences that explicitly express a hypernym-hyponym relationship are very sparse, and a large number of co-occurrence sentences do not state whether the candidate pair has such a relationship. Building training data by distant supervision from existing hypernym-hyponym pairs is therefore not workable, so we train the model directly on a manually annotated training set. Because manual annotations are limited, on the order of thousands, we further improve the model with UDA (Unsupervised Data Augmentation), a semi-supervised learning algorithm from Google, and the final Precision reaches 90%+. See Table 1 for detailed indicators:

3.3 Mining concept-attribute relationships

The attributes of a concept can be divided into public attributes and open attributes according to whether they are common. Public attributes are manually defined attributes that most concepts have, such as price, style and quality. Open attributes are attributes that only certain concepts have; for example, “hair transplant”, “eyelash beauty” and “playbook” have the open attributes “density”, “curl” and “logic” respectively. There are far more open attributes than public ones. The two kinds of attribute relationships are mined with the following two methods respectively.

3.3.1 Mining public attribute relationships based on composite concepts

Because public attributes are universal, the Value and the Concept of a CPV relationship usually appear together in the form of a composite concept, such as low-price shopping mall, Japanese cuisine and red movie HD. We therefore turn relationship mining into a dependency analysis task plus a fine-grained NER task (see “Exploration and Practice of NER Technology in Meituan Search”). Dependency analysis identifies the core entity and its modifiers in a composite concept, and fine-grained NER determines the specific attribute of each modifier. For example, given the composite concept “red movie HD”, dependency analysis identifies “movie” as the core concept and “red” and “HD” as its attribute values, and fine-grained NER predicts that their attributes are “style” and “quality evaluation” respectively.

Dependency analysis and fine-grained NER carry mutually useful information. In “graduation dummy”, for example, the entity types are “Time” and “Product” and the dependency information marks “dummy” as the core word; each kind of information helps learn the other, so the two tasks are learned jointly. However, the degree of correlation between the two tasks is unclear and there is considerable noise, so Meta-LSTM is used to turn feature-level joint learning into function-level joint learning, replacing hard sharing with dynamic sharing and reducing the impact of noise between the two tasks.

The overall architecture of the model is as follows:

At present, the overall accuracy of the concept modification relationships is about 85%.

3.3.2 Mining specific attribute relationships based on open attribute words

Open attribute word and attribute value mining

Open attribute relationships require mining the attributes and attribute values specific to different concepts, and the difficulty lies in recognizing open attributes and open attribute values. Observation of the data shows that some common attribute values (e.g., good, bad, high, low, much, little) usually appear together with attributes (e.g., good environment, high temperature, high traffic). We therefore adopt a template-based Bootstrapping method to mine attributes and attribute values automatically from user comments. The mining process is as follows:
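The bootstrapping idea can be illustrated with a deliberately small sketch: starting from a few seed attribute values, it alternates between finding attributes that follow a known value and finding new values that precede a known attribute. The single “value attribute” bigram template, the whitespace tokenization and the absence of any frequency or confidence filtering are simplifying assumptions for this example, not the templates used in the actual system.

```python
def bootstrap_attributes(phrases, seed_values, rounds=2):
    """Alternately expand attribute and value sets from 'value attribute' bigrams."""
    values = set(seed_values)
    attributes = set()
    # Collect adjacent word pairs from every phrase.
    bigrams = [
        (left, right)
        for phrase in phrases
        for left, right in zip(phrase.split(), phrase.split()[1:])
    ]
    for _ in range(rounds):
        # A word following a known value is taken as an attribute.
        attributes |= {right for left, right in bigrams if left in values}
        # A word preceding a known attribute is taken as a value.
        values |= {left for left, right in bigrams if right in attributes}
    return attributes, values


phrases = [
    "good environment",
    "high temperature",
    "quiet environment",
    "low temperature",
    "high traffic",
]
attributes, values = bootstrap_attributes(phrases, seed_values={"good", "high"})
print(sorted(attributes), sorted(values))
```

On real comments such unfiltered expansion drifts quickly, so a practical version would need to score and prune candidates in each round.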

After the open attribute words and attribute values are mined, mining of open attribute relationships is divided into “concept-attribute” pair mining and “concept-attribute-attribute value” triple mining.

Concept-attribute mining

Mining “concept-attribute” pairs means judging whether a concept (Concept) has an attribute (Property). The steps are as follows:

  • Based on the co-occurrence of concepts and attributes in UGC, a TF-IDF variant is used to mine the typical attributes of each concept as candidates.
  • Each candidate concept-attribute pair is turned into a simple natural-language sentence, a language model scores the fluency of the sentence, and the pairs with high fluency are retained.
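The first step can be sketched with plain TF-IDF, treating each concept as a “document” whose “terms” are the attributes that co-occur with it in UGC. The article only says that a TF-IDF variant is used, so the exact weighting here, the input layout and the function name are assumptions; the co-occurrence counts are invented toy data.

```python
import math
from collections import Counter


def typical_attributes(cooccurrence, top_k=2):
    """Rank attributes per concept by TF-IDF over concept-attribute co-occurrence.

    cooccurrence -- {concept: Counter({attribute: count})} (assumed layout)
    """
    n_concepts = len(cooccurrence)
    # Document frequency: in how many concepts each attribute appears.
    doc_freq = Counter(attr for counts in cooccurrence.values() for attr in counts)
    ranked = {}
    for concept, counts in cooccurrence.items():
        total = sum(counts.values())
        scores = {
            attr: (count / total) * math.log(n_concepts / doc_freq[attr])
            for attr, count in counts.items()
        }
        ranked[concept] = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return ranked


cooccurrence = {
    "natatorium": Counter({"water temperature": 8, "environment": 10}),
    "hair transplant": Counter({"density": 6, "environment": 4}),
    "hot pot": Counter({"taste": 9, "environment": 12}),
}
print(typical_attributes(cooccurrence, top_k=1))
```

“environment” co-occurs with every concept, so its inverse document frequency is zero and the concept-specific attributes rise to the top.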

Concept-attribute-attribute value mining

After the “concept-attribute” pairs are obtained, the corresponding attribute values are mined in the following steps:

  • Seed mining. Seed triples are mined from UGC based on co-occurrence features and a language model.
  • Template mining. The seed triples are used to build suitable templates from UGC (e.g., “Suitable water temperature is an important criterion for choosing a natatorium.”).
  • Relationship generation. The templates are filled with seed triples to train a masked language model, which then generates relationships.

At present, the accuracy of concept-attribute relationships in the open domain is about 80%.

3.4 Mining concept undertaking relationships

The concept undertaking relationship links the concept a user searches for with the Meituan concept that undertakes it. For example, when a user searches for “hiking”, the real intent is to find “a good place for hiking”, so the platform responds with concepts such as “country park” and “botanical garden”. This relationship has to be mined from scratch, so different algorithms are designed for the focus of each stage. There are three stages: (1) seed mining in the early stage; (2) mining with a deep discriminative model in the middle stage; (3) relationship completion in the late stage. Details are as follows.

3.4.1 Mining seed data based on co-occurrence features

To solve the cold-start problem in relation extraction, a common choice is Bootstrapping, which expands data automatically from a corpus starting from a few manually set seeds and templates. However, Bootstrapping is limited by the quality of its templates, and it also has inherent limitations in Meituan scenarios. The main source of Meituan’s corpus is user comments, which are highly colloquial and varied, so it is difficult to design general and effective templates. We therefore abandoned the template-based approach and built a triplet contrastive learning network on co-occurrence features and category features among entities, which automatically mines potential relationship information between entities from unstructured text.

Specifically, we observed that the distribution of entities in user comments differs significantly across merchant categories. For example, UGC under the food category often mentions “dinner party”, “ordering food” and “restaurant”, while UGC under the fitness category often mentions “weight loss”, “personal trainer” and “gym”. General entities such as “decoration” and “hall” appear under every category. The triplet contrastive learning network therefore pulls the representations of user comments under the same category close together and pushes those of comments under different categories apart. As with Word2Vec and other pre-trained word vector systems, the word vector layer learned with this contrastive strategy naturally contains rich relationship information. At prediction time, a batch of high-quality seed data is obtained by computing the semantic similarity between any user search concept and all undertaking concepts, supplemented by statistical features from the search business.
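The training signal of such a triplet network can be illustrated with the standard margin-based triplet loss: an anchor comment, a positive comment from the same category and a negative comment from a different category. The article does not state which loss or distance is used, so the Euclidean distance, the margin value and the toy vectors below are assumptions for this sketch.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss on comment representations.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-d representations (hypothetical values).
food_comment_a = [0.0, 0.0]
food_comment_b = [0.0, 1.0]      # same category as the anchor
fitness_comment = [3.0, 4.0]     # different category

well_separated = triplet_loss(food_comment_a, food_comment_b, fitness_comment)
badly_separated = triplet_loss(food_comment_a, fitness_comment, food_comment_b)
print(well_separated, badly_separated)
```

Minimizing this loss over many triplets is what pushes same-category comments together and different-category comments apart in the embedding space.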

3.4.2 Training a deep model based on seed data

Pre-trained language models have made great progress in NLP over the past two years, and fine-tuning a large pre-trained model on a downstream task has become a very popular practice. In the middle stage of relationship mining we therefore adopt a BERT-based relationship discrimination model (see “Meituan BERT’s Exploration and Practice”), using the large amount of linguistic knowledge that BERT learns during pre-training to help the relation extraction task.

The model structure is shown in the figure below. First, candidate entity pairs are obtained from the co-occurrence features between entities, and user comments containing the candidate pairs are recalled. Then, following the entity marking method of the MTB paper, special symbols are inserted at the start and end positions of the two entities. After BERT encoding, the vectors of the special symbols at the start positions of the two entities are concatenated as the relationship representation. Finally, the relationship representation is fed into a Softmax layer to determine whether a relationship exists between the entities.
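The entity marking step can be sketched as a small text preprocessing function that wraps each entity in a pair of marker tokens before the sentence is passed to the encoder. The marker strings `[E1]`, `[/E1]`, `[E2]`, `[/E2]`, the function name and the restriction to the first, non-overlapping occurrence of each entity are assumptions for this example.

```python
def mark_entities(text, head, tail):
    """Wrap the first occurrence of each entity in marker tokens."""
    head_start = text.find(head)
    tail_start = text.find(tail)
    if head_start < 0 or tail_start < 0:
        raise ValueError("both entities must occur in the text")
    spans = [
        (head_start, head_start + len(head), "[E1]", "[/E1]"),
        (tail_start, tail_start + len(tail), "[E2]", "[/E2]"),
    ]
    # Insert from right to left so earlier offsets stay valid.
    for start, end, open_tag, close_tag in sorted(spans, reverse=True):
        text = (
            text[:start] + open_tag + " " + text[start:end] + " " + close_tag + text[end:]
        )
    return text


marked = mark_entities("we went hiking in the country park", "hiking", "country park")
print(marked)
```

After encoding, the hidden states at the `[E1]` and `[E2]` positions would be concatenated and passed to the classification layer described above.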

3.4.3 Completing relationships based on the existing atlas structure

Through the two stages above, a preliminary atlas of concept undertaking relationships has been built from unstructured text. However, because of the limitations of the semantic model, the current atlas is missing many triples. To enrich the atlas and fill in the missing relationships, we apply the TransE algorithm together with a graph neural network to complete the existing atlas.

A Relational Graph Attention Network (RGAT) is used to model the structural information of the known graph. RGAT uses a relational attention mechanism to overcome the inability of traditional GCN and GAT to model edge types, which makes it better suited to heterogeneous networks such as the concept atlas. After obtaining dense entity embeddings with RGAT, we use TransE as the loss function. TransE treats $r$ in a triple $(h, r, t)$ as a translation vector from $h$ to $t$ and requires $h + r \approx t$. This method is widely used for knowledge graph completion and shows strong robustness and extensibility.

The details are shown in the figure below. In RGAT, the features of a node at each layer are a weighted mean of the features of its neighboring nodes and of its adjacent edges, and the relational attention mechanism gives different nodes and edges different weight coefficients. After obtaining the node and edge features of the last layer, we use TransE as the training objective: for each triple $(h, r, t)$ in the training set, we minimize $\lVert h + r - t \rVert$. At prediction time, for each head entity and each relationship, all nodes of the graph are taken as candidate tail entities and their distances are computed to obtain the final tail entity.
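The TransE scoring and prediction step can be sketched as follows: the distance of a triple is the norm of h + r − t, and tail prediction ranks every node by that distance. The embeddings below are hand-picked toy values rather than RGAT outputs, the L2 norm is an assumption (TransE is also used with L1), and the function names are hypothetical.

```python
import math


def transe_distance(h, r, t):
    """L2 norm of h + r - t; smaller means the triple is more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, entity_emb, relation_emb):
    """Rank all graph nodes as candidate tails for (head, relation, ?)."""
    h = entity_emb[head]
    r = relation_emb[relation]
    scored = [(transe_distance(h, r, emb), name) for name, emb in entity_emb.items()]
    return [name for _, name in sorted(scored)]


# Toy embeddings (hypothetical values).
entity_emb = {
    "hiking": [0.0, 0.0],
    "country park": [1.0, 1.0],
    "gym": [5.0, 0.0],
}
relation_emb = {"place": [1.0, 1.0]}

ranking = rank_tails("hiking", "place", entity_emb, relation_emb)
print(ranking)
```

In training, the same distance would be minimized for observed triples, typically against corrupted triples with a margin.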

At present, the overall accuracy rate of concept undertaking relationship is about 90%.

3.5 POI/SPU-concept relationship construction

To associate atlas concepts with Meituan instances, information from several dimensions is used, such as POI/SPU names, categories and user reviews. The difficulty lies in extracting the information related to an atlas concept from such diverse sources. We therefore use synonyms to recall all semantically related clauses and then use a discriminant model to judge how relevant each clause is to the concept. The specific process is as follows:

  • Synonym clustering. For the concept to be tagged, its various surface forms are obtained from the atlas synonym data.
  • Candidate clause generation. Based on the result of synonym clustering, candidate clauses are recalled from multiple sources such as merchant names, group deal names and user reviews.
  • Discriminant model. The concept-text association discriminant model (as shown in the figure below) judges whether a concept and a clause match.
  • Tagging results. The final result is obtained by applying an adjusted threshold to the discriminant score.
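The four steps above can be sketched as one small pipeline. The function names, the substring-based recall and, in particular, the `toy_scorer` standing in for the concept-text discriminant model are all hypothetical; a real scorer would be a trained model returning a match probability.

```python
def tag_clauses(concept, synonyms, clauses, scorer, threshold=0.5):
    """Recall clauses mentioning the concept or a synonym, then filter by score."""
    # Step 1: synonym clustering, i.e. collect all surface forms of the concept.
    surface_forms = {concept, *synonyms.get(concept, ())}
    # Step 2: candidate clause generation by surface-form recall.
    candidates = [c for c in clauses if any(form in c for form in surface_forms)]
    # Steps 3 and 4: score each candidate and keep those above the threshold.
    return [c for c in candidates if scorer(concept, c) >= threshold]


def toy_scorer(concept, clause):
    # Stand-in for the discriminant model: penalize simple negation.
    return 0.1 if clause.startswith("no ") else 0.9


synonyms = {"facial hydration": ["face hydration"]}
clauses = [
    "great facial hydration package",
    "they offer face hydration too",
    "no facial hydration here",
    "nice parking",
]
matched = tag_clauses("facial hydration", synonyms, clauses, toy_scorer)
print(matched)
```

Raising or lowering `threshold` trades precision against recall, which is the adjustment mentioned in the last step.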

4. Application and practice

4.1 Building category word atlases for the comprehensive business

Meituan’s comprehensive business covers a wide range of knowledge fields, including parent-child, education, medical beauty, and leisure and entertainment, and each field contains many smaller sub-fields. Building knowledge atlases for these fields can therefore assist search recall, filtering, recommendation and other services.

Besides common-sense concept data, the common-sense concept atlas also contains Meituan scenario data and accumulated basic algorithm capabilities, so these capabilities can be used to help build category word atlas data for the comprehensive business.

With the help of the common-sense atlas, missing category word data is supplemented and a reasonable category word atlas is built, which improves search recall through query rewriting, POI tagging and other means. In the education field, the atlas has so far grown from 1000+ nodes to 2000+, and synonyms have grown from the order of thousands to 20,000+, with good results.

The construction process of the category word atlas is shown in the figure below:

4.2 Review search suggestion guidance

SUG recommendation in review search guides users’ cognition while reducing the time users need to complete a search, which improves search efficiency. SUG recommendation therefore focuses on two objectives: (1) enriching users’ cognition, extending it from POI and category search in reviews to natural-text search; (2) refining users’ search needs, so that when a user searches for a fairly general category word, the suggestions help narrow the need down.

The common-sense concept atlas contains a rich set of concepts together with the relationships between them and their attributes and attribute values, so a relatively general Query can be expanded into corresponding refined Queries. For example, cake can produce strawberry cake and cheesecake through the taste attribute, and 6-inch cake and pocket cake through the specification attribute.
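Generating refined queries from CPV triples can be sketched as a grouped lookup. The triple layout, the function name and the simple “value + concept” concatenation are assumptions for this example; real suggestions would presumably also be ranked and filtered.

```python
def refine_query(query, cpv_triples):
    """Group refined queries for a general query by attribute."""
    suggestions = {}
    for concept, attribute, value in cpv_triples:
        if concept == query:
            suggestions.setdefault(attribute, []).append(f"{value} {concept}")
    return suggestions


cpv_triples = [
    ("cake", "taste", "strawberry"),
    ("cake", "taste", "cheese"),
    ("cake", "specification", "6-inch"),
    ("hotpot", "taste", "not spicy"),
]
print(refine_query("cake", cpv_triples))
```

Grouping by attribute makes it easy to present one row of suggestions per refinement dimension.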

An example of the suggested Queries produced is shown in the figure below:

4.3 Content tagging for medical beauty in the comprehensive business

When browsing medical beauty content, users are usually interested in one specific medical beauty service, so the product provides different service labels to help users filter for exactly the content they need. However, when labels are associated with medical beauty content, association errors are common, and users often see content that does not match their needs after filtering. Improving tagging accuracy helps users focus on what they need.

Tag-content accuracy can be improved with the atlas’s concept-POI tagging capability and concept-UGC tagging relationships. With atlas-based tagging, both accuracy and recall improved significantly.

  • Accuracy: compared with keyword matching, the concept-content tagging algorithm improves accuracy from 51% to 91%.
  • Recall: concept synonym mining improves recall from 77% to 91%.

5. Summary and outlook

This article has given a detailed introduction to the construction of the common-sense concept atlas and its applications in Meituan scenarios. According to business needs, the atlas contains three types of nodes and four types of relationships, and we introduced the concept mining algorithms and the mining algorithms for the different relationship types in turn.

At present, our common-sense concept atlas contains 2 million+ concepts and 3 million+ relationships between concepts, including hypernym-hyponym, synonym, attribute and undertaking relationships and excluding POI-concept relationships. The overall accuracy of the relationships is currently about 90%, and the algorithms are still being optimized to expand the relationships and improve accuracy. We will keep improving the common-sense concept atlas, aiming to make it both precise and complete.

About the authors

Zong Yu, Jun Jie, Hui Min, Fu Bao, Xu Jun, Xie Rui, Wu Wei and others are from the NLP Center of Meituan’s Search and NLP Department.

This article was produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication; please mark it “Content reprinted from Meituan Technical team”. This article may not be reproduced or used commercially without permission. For any commercial activity, please send an email to [email protected] for authorization.
