Background

When crawling a website with Selenium + Chromedriver, people often assume that the site's anti-crawler mechanism will not detect them. However, many browser parameters differ from those of a real browser, so a site can easily tell that Selenium + Chromedriver is simulating the browser. Among these parameters,

window.navigator.webdriver

is a very important one.

Investigating the problem

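In a browser opened normally, window.navigator.webdriver evaluates to undefined.

In a browser opened by Selenium, it evaluates to true. The Selenium-driven browser is launched with code like this: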

ChromeOptions options = null;
IWebDriver driver = null;
try
{
    options = new ChromeOptions();
    // Let pages with invalid certificates load anyway.
    options.AddArguments("--ignore-certificate-errors");
    options.AddArguments("--ignore-ssl-errors");
    // options.AddExcludedArgument("enable-automation");
    // options.AddAdditionalCapability("useAutomationExtension", false);

    // CookieHelp is a project-specific helper, not part of Selenium.
    var listCookie = CookieHelp.GetCookie();
    if (listCookie != null)
    {
        // options.AddArgument("headless");
    }

    // string ss = @"{ ""source"": ""Object.defineProperty(navigator, 'webdriver', { get: () => undefined})""}";
    // options.AddUserProfilePreference("Page.addScriptToEvaluateOnNewDocument", new ssss() { source = "Object.defineProperty(navigator, 'webdriver', { get: () => undefined })" });

    // Start chromedriver from the current directory and hide its console window.
    ChromeDriverService service = ChromeDriverService.CreateDefaultService(System.Environment.CurrentDirectory);
    service.HideCommandPromptWindow = true;
    driver = new ChromeDriver(service, options, TimeSpan.FromSeconds(120));

    // session.Page.AddScriptToEvaluateOnNewDocument(new OpenQA.Selenium.DevTools.Page.AddScriptToEvaluateOnNewDocumentCommandSettings()
    // {
    //     Source = @"Object.defineProperty(navigator, 'webdriver', { get: () => undefined })"
    // });

So if a site reads this parameter with JavaScript, a return value of undefined (or false, depending on the browser version) indicates a normal browser, while a return value of true indicates a browser driven by Selenium.
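To make the check concrete, here is a minimal sketch of what such a detection script might look like. The function name looksAutomated and the mock navigator objects are hypothetical, introduced only for illustration so the snippet runs outside a browser (for example in Node.js). A real site would likely combine this flag with other signals.

```javascript
// Hypothetical detection helper; the name and shape are assumptions,
// not code taken from any particular site.
function looksAutomated(nav) {
  // Under WebDriver control the flag is true; a regular browser reports
  // undefined or false, depending on its version.
  return nav.webdriver === true;
}

// Mock navigator objects so the sketch runs outside a browser.
const regularBrowser = { userAgent: "Mozilla/5.0" };
const seleniumBrowser = { userAgent: "Mozilla/5.0", webdriver: true };

console.log(looksAutomated(regularBrowser));  // false
console.log(looksAutomated(seleniumBrowser)); // true
```

In a real page, the site would pass window.navigator instead of a mock object.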

The solution

So during crawler development, how do you stop this parameter from telling the site that the browser is being simulated? Execute a piece of JavaScript that changes its value.

// Redefine navigator.webdriver on the current page so that it reads as undefined.
IJavaScriptExecutor js = (IJavaScriptExecutor)driver;
string returnjs = (string)js.ExecuteScript("Object.defineProperties(navigator, {webdriver:{get:()=>undefined}});");
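The effect of the injected statement can be sketched outside a browser as well. The example below is an illustration under assumptions: navigatorProto and fakeNavigator are hypothetical stand-ins for the real navigator object, built so that the flag is inherited from a prototype getter, roughly as it is in Chrome. The Object.defineProperties call is the same one passed to ExecuteScript above.

```javascript
// Mock of a navigator whose prototype exposes webdriver = true, similar in
// shape to a browser under WebDriver control (an assumption for illustration).
const navigatorProto = {
  get webdriver() {
    return true;
  },
};
const fakeNavigator = Object.create(navigatorProto);

// Before the override, the value is inherited from the prototype getter.
const before = fakeNavigator.webdriver;

// Same statement as the injected script, applied to the mock object.
Object.defineProperties(fakeNavigator, {
  webdriver: { get: () => undefined },
});

// After the override, the own getter shadows the prototype getter.
const after = fakeNavigator.webdriver;

console.log(before, after); // true undefined
```

Because the override is an own property of one object, it does not carry over to a fresh navigator object. In a browser each newly loaded page gets its own, so the script would presumably need to run again after every navigation. That may be why the commented-out lines in the launch code experiment with Page.addScriptToEvaluateOnNewDocument.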

Result

Perfect: the desired effect is achieved.