Visit the NetEase Cloud Community to learn more about NetEase's experience in technology, products, and operations.
Here is an in-depth perspective on how to manage forum ads.
In the early stage of a product, users generate little content on their own and every piece of data is precious, so anti-spam is rarely a priority. As the product grows, it attracts the attention of the gray and black markets, and spam advertising of every kind follows. The first type of risk comes from regulators and covers illegal content such as political, pornographic, and violent material. In recent years, national regulators have monitored online content more closely than ever, and the number of companies punished with shutdowns or ordered to make corrections has risen every year. The second challenge facing the industry is spam advertising, which clings to a site like psoriasis. In a sense, the more traffic a product has, the more attractive it is to the black market.
The conventional response to spam is to filter text by keywords and then review it manually; image detection is almost entirely manual. Keywords illustrate a common dilemma. First, fatal words (content is deleted on a hit) are simple and crude, and they sacrifice user experience. Second, suspect words (content goes to manual review on a hit) bring risk and cost: manual review is slower than a machine, and as content volume grows, more and more reviewers are needed. A simple example: "oral sex" is a common term in pornographic content, and the conventional approach is to block the word automatically or send matches to a pending queue. A machine, however, matches it out of context. In Chinese, the same two characters (口交) also occur in perfectly normal text, such as "interface transition" or "24-port switch" (24口交换机), and these produce false matches.
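The dilemma above can be reproduced with a few lines of Python. This is a minimal sketch of a two-tier keyword filter, not any product's actual implementation: the word lists, the `classify` function, and the sample posts are all hypothetical, chosen only to show how plain substring matching produces the false positive described in the text.

```python
# Hypothetical word lists for illustration only.
FATAL_WORDS = {"口交"}     # a hit means the post is deleted automatically
SUSPECT_WORDS = {"兼职"}   # a hit means the post is queued for manual review

def classify(text: str) -> str:
    """Return 'delete', 'review', or 'pass' using plain substring matching."""
    if any(word in text for word in FATAL_WORDS):
        return "delete"
    if any(word in text for word in SUSPECT_WORDS):
        return "review"
    return "pass"

# A harmless post about a 24-port network switch is deleted,
# because the fatal word appears across a word boundary.
print(classify("新到一台24口交换机"))   # delete (false positive)
print(classify("周末兼职,日结"))        # review
print(classify("今天天气不错"))         # pass
```

Moving the word from the fatal list to the suspect list avoids the wrongful deletion, but it sends every post about switches to a human reviewer, which is exactly the cost trade-off described above.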
Where there is a problem, there is a solution; finding it is the core skill of anti-spam operations. Risk and cost are at the heart of the work. Dealing with risk depends on analyzing and understanding it deeply, so that you can generalize from one case to others and control them in advance. For illegal content, for example, operators need high sensitivity and a good sense of proportion. They need a basic understanding of laws and regulations and must be able to break them down into objective, enforceable standards. They should carry a ruler in their heads: what can be posted, what cannot, and how far a post may go. Where to draw that line depends on the balance between risk and product traffic, which we will not go into here. Similarly, for advertising, the operational challenge is sample variation. Here is the tip of the iceberg:
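· Homophones: different Chinese characters with the same pinyin. The most typical example is writing the word for "part-time job" with a different character that sounds the same.
· Look-alike characters: Chinese characters with similar shapes. For example, one character of "part-time job" (jian zhi) is replaced by a similar-looking character. The two look alike but are pronounced differently, so the homophone approach does not apply.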
· Character splitting: a Chinese character is split into its radicals and components to evade matching. For example, 职 is written as 耳 ("ear") followed by 只 ("only").
· Interference characters: extra characters are inserted in the middle of a keyword to evade ordinary fuzzy matching. For example, a character is inserted between the two characters of the word for "part-time job".
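Two of the variations listed above, character splitting and interference characters, can be countered by normalizing text before matching. The sketch below is one possible approach under stated assumptions, not the method of any particular product: the keyword list, the `MERGE_TABLE` mapping, and the function names are hypothetical, and the filter assumes Chinese-only keywords because it discards every non-Chinese character. Homophones and look-alike characters would need pinyin and glyph-similarity tables, which are omitted here.

```python
import re

# Hypothetical sample data: one keyword and one split-character mapping.
KEYWORDS = ["兼职"]             # "part-time job"
MERGE_TABLE = {"耳只": "职"}    # components -> original character

# Anything outside the basic CJK Unified Ideographs block.
NON_CJK = re.compile(r"[^\u4e00-\u9fff]")

def normalize(text: str) -> str:
    """Strip non-Chinese characters, then re-join split characters."""
    text = NON_CJK.sub("", text)
    for parts, whole in MERGE_TABLE.items():
        text = text.replace(parts, whole)
    return text

def gap_pattern(keyword: str, max_gap: int = 1):
    """Match the keyword's characters with up to max_gap characters between them."""
    gap = ".{0,%d}" % max_gap
    return re.compile(gap.join(re.escape(ch) for ch in keyword))

PATTERNS = [gap_pattern(k) for k in KEYWORDS]

def is_suspect(text: str) -> bool:
    """True if any keyword survives normalization and gap-tolerant matching."""
    cleaned = normalize(text)
    return any(p.search(cleaned) for p in PATTERNS)

print(is_suspect("兼*职"))      # symbol inserted as interference
print(is_suspect("兼 耳只"))    # character split into components
print(is_suspect("兼个职"))     # Chinese character inserted as interference
print(is_suspect("今天天气不错"))
```

The gap tolerance is a deliberate trade-off: a larger `max_gap` catches more interference but also matches more ordinary text, so hits from this kind of rule are better treated as suspect than as fatal.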
The quality of an anti-spam system lies in how deep and how wide its coverage is. The most direct measure is how many variant samples an attacker must try before one gets through; behind that lies the ability to prevent and control new samples as fully as possible.
To deal with such advertisements, NetEase Cloud Security (ESHIELD) provides a range of content security cloud services, such as text detection and image detection, built on NetEase's 20 years of technical experience and security big data.