This is the first day of my participation in Gwen Challenge

Using XPath in Python crawlers and browser automation
Whether you are writing a crawler or automating a browser, you first have to locate elements, and XPath is a very efficient way to do that.

First, a brief introduction: XPath is a language for finding information in XML documents. It can traverse the elements and attributes of an XML document. Put simply, XPath locates specific elements in a document by their position in the node tree and by their attributes. HTML and XML are both text-based markup languages with a tree structure, so XPath can also be used to extract information from HTML.

Main syntax of XPath
  1. A single slash (/)

    A leading slash '/' means the path starts at the root of the whole document.

    For example: /html/body/div/ul/li[1]. The [1] is the position of the element among its siblings, similar to an array subscript, except that XPath positions start at 1, not 0.

  2. A double slash (//)

    A double slash '//' selects matching nodes anywhere in the document, regardless of where they sit in the tree.

    For example: //div[@class='table-con']/ul/li[3]/input finds the div wherever it is, instead of spelling out the whole path from the root html element.

  3. The '@' sign selects attributes, such as href, class, or id.
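The three pieces of syntax above can be tried out with a short sketch. It assumes the third-party lxml library (the original text does not name a library), and the sample HTML and variable names are made up for illustration.

```python
from lxml import html

# Made-up sample page, used only to exercise the expressions above.
PAGE = """
<html>
  <body>
    <div class="table-con">
      <ul>
        <li><a href="/first">first</a></li>
        <li><a href="/second">second</a></li>
        <li><input name="identity" type="text"></li>
      </ul>
    </div>
  </body>
</html>
"""

tree = html.fromstring(PAGE)

# 1. Absolute path from the document root; li[1] is the first <li>.
first_li = tree.xpath("/html/body/div/ul/li[1]")

# 2. Relative path: find the div anywhere in the document, then walk down.
inputs = tree.xpath("//div[@class='table-con']/ul/li[3]/input")

# 3. '@' selects attributes; here it returns the href values as strings.
hrefs = tree.xpath("//a/@href")

print(first_li[0].text_content())
print(inputs[0].get("name"))
print(hrefs)
```

With lxml, xpath() always returns a list for node and attribute selections, so an expression that matches nothing yields an empty list rather than raising an error.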

Example: this is how I obtain XPaths in my project. To get the node of an input box, first find its parent node and check whether it is unique. If it is, start a relative path from it with // directly, then validate the XPath in the browser console (for example, Chrome DevTools accepts $x("//div[@class='table-con']")).
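The same check can be sketched in Python. This is only an illustration of the workflow described above, not the project's actual code: it assumes the lxml library, the sample HTML is invented, and is_unique is a hypothetical helper.

```python
from lxml import html

# Invented page with two input boxes, so "//input" alone is ambiguous.
PAGE = """
<html><body>
  <div class="nav"><ul><li><input name="search"></li></ul></div>
  <div class="table-con"><ul>
    <li>row 1</li>
    <li>row 2</li>
    <li><input name="identity"></li>
  </ul></div>
</body></html>
"""

tree = html.fromstring(PAGE)


def is_unique(tree, xpath):
    """Hypothetical helper: True when the expression matches exactly one node."""
    return len(tree.xpath(xpath)) == 1


parent_xpath = "//div[@class='table-con']"

# Only build the relative path once the parent is known to be unique.
if is_unique(tree, parent_xpath):
    target = tree.xpath(parent_xpath + "/ul/li[3]/input")[0]
    print(target.get("name"))
```

Anchoring on a unique parent keeps the expression short and less sensitive to layout changes higher up the page than a full path from the root.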

Several commonly used XPath functions
  1. contains(): //div[contains(@id,'in')] selects div nodes whose id contains 'in'
  2. text(): the text of a node, such as 'baidu', is not an attribute, so use the text() function to match it: //a[text()='baidu']. It can be combined with contains(): //a[contains(text(),'baidu')]
  3. starts-with(): //div[starts-with(@id,'in')] selects div nodes whose id starts with 'in'
  4. not(): //input[@name='identity' and not(contains(@class,'a'))] selects input nodes whose name is 'identity' and whose class does not contain 'a'; //input[@id] selects input nodes that have an id attribute, and //input[not(@id)] selects those that do not
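A minimal sketch of these functions in action, again assuming the lxml library. The sample HTML, ids, and variable names are invented so that each expression matches a different set of nodes.

```python
from lxml import html

# Invented sample page: three divs, two inputs, two links.
PAGE = """
<html><body>
  <div id="input-area">
    <input id="user" name="identity" class="b">
    <input name="identity" class="a">
  </div>
  <div id="main">
    <a href="https://www.baidu.com">baidu</a>
    <a href="https://www.baidu.com/s">baidu search</a>
  </div>
  <div id="footer"></div>
</body></html>
"""

tree = html.fromstring(PAGE)

# contains(): ids that contain 'in' anywhere ("input-area" and "main").
contains_ids = [d.get("id") for d in tree.xpath("//div[contains(@id,'in')]")]

# starts-with(): ids that begin with 'in' (only "input-area").
starts_ids = [d.get("id") for d in tree.xpath("//div[starts-with(@id,'in')]")]

# text(): exact match on the link text versus a substring match.
exact_links = tree.xpath("//a[text()='baidu']/@href")
partial_links = tree.xpath("//a[contains(text(),'baidu')]/@href")

# not(): name is 'identity' and class does not contain 'a'.
not_a = tree.xpath("//input[@name='identity' and not(contains(@class,'a'))]")

# Attribute presence: inputs with an id, and inputs without one.
with_id = tree.xpath("//input[@id]")
without_id = tree.xpath("//input[not(@id)]")

print(contains_ids, starts_ids)
print(exact_links, partial_links)
print(len(not_a), len(with_id), len(without_id))
```

Note that contains(@class,'a') is a plain substring test, so it would also match a class such as 'main'; it does not compare whole class names.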

I have not used XPath axes yet; I will add a section on them if I need them in the future.