Background

A friend asked me to build a ticket-grabbing style tool. I had just finished reading a few Python 3 books, so I agreed as a chance to practice. Some context: the target is the customer reservation function in a company's CRM system. Financial products with a quota under 2 million yuan are very hard to reserve (products with a 5 million yuan quota need no grabbing), and each product has only a few reservation slots nationwide. My friend joined the company only recently, and products under 2 million yuan are her main source of income and performance. I am writing this article simply to record the points involved and the pitfalls I hit, hoping it helps beginners.

  1. Using switch_to() in Selenium
  2. Simulating a logged-in user with Requests
  3. Combining Selenium and Requests for an all-purpose workflow
  4. Communication between PyQt and QML files
  5. Fixing a frozen UI with multithreading

Development iterations

Version 1: Slow automation with Selenium

At first I thought the job was just about clicking speed, so I used Selenium to simulate human actions. The tool was quick to write, only about 100 lines of code. Pitfalls encountered:

  • The web page uses a frameset structure, so the switch_to() method is needed. Frame switches are relative to the current frame, so pay attention to where you are.
iframes = self.driver.driver.find_elements_by_tag_name('iframe')
iframe1 = iframes[1]
print('Get booking page address:' + iframe1.get_attribute('src'))
self.driver.driver.switch_to.frame(iframe1)  # Switch to the product reservation page iframe
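  • Work around the Chrome error caused by its XSS auditor
from selenium.webdriver.chrome.options import Options
from splinter import Browser  # assumption: Browser is splinter's browser wrapper

chrome_opt = Options()
chrome_opt.add_argument('--disable-xss-auditor')
self.driver_name = 'chrome'
self.driver = Browser(driver_name=self.driver_name, chrome_options=chrome_opt)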

In the real grab the tool still came up empty. It was much faster than a person, but evidently it needed to be faster still, which made me think of Requests.

Version 2: Fast but manual with Requests

As expected, this version is very fast. However, because of how the CRM system works, someone outside the software field cannot operate it: the cookies, the product search URL, and the product reservation URL all have to be obtained by hand through the browser's Inspect Element tool.

import json
import sys
import time

# s (a requests session), search_prod (the search URL), product_name and
# header_cookies are defined elsewhere in the script.

# Prepare the search request
postdata = {
    'start': '0',
    'Search': '1',
    'Key': product_name,
    'loadStore': 'true',
    'extResponse': 'true',
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive',
    'X-Requested-With': 'XMLHttpRequest',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'Cookie': header_cookies,
}
search_count = 0
while True:
    rep = s.post(search_prod, data=postdata, headers=headers)
    search_count = search_count + 1
    if search_count % 10 == 1:
        print('(%d) search underway...' % search_count)
    try:
        # Parse inside the try block so that a non-JSON reply is actually caught
        product_json = json.loads(rep.text)
        results = product_json['results']
        if results > 1:
            print('Error: Multiple products found, please exit and re-enter product name')
            break
        elif results == 1:  # Found the product
            proudct_id = product_json['records'][0]['id']
            proudct_CPJC = product_json['records'][0]['CPJC']
            print('>>> Found the product to reserve: id:' + proudct_id + ' CPJC:' + proudct_CPJC)
            break
        elif results == 0:
            # Nothing yet, poll again
            time.sleep(0.001)
    except json.decoder.JSONDecodeError:
        print('Incorrect parameters')
        sys.exit(0)
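
The decision logic of the loop above can be tried without the CRM by separating it from the network call. The following is only a sketch under assumptions: classify(), poll(), and the max_attempts limit are hypothetical names that do not appear in the original script, and the fake replies merely imitate the 'results' and 'records' fields used above.

```python
import json
import time


def classify(response_text):
    """Map one search response to ('found', record), ('retry', None) or ('ambiguous', None)."""
    data = json.loads(response_text)  # raises json.JSONDecodeError on non-JSON replies
    count = data['results']
    if count == 1:
        return 'found', data['records'][0]
    if count == 0:
        return 'retry', None
    return 'ambiguous', None


def poll(fetch, interval=0.001, max_attempts=1000):
    """Call fetch() until a product is found, the result is ambiguous, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, record = classify(fetch())
        if status != 'retry':
            return status, record, attempt
        time.sleep(interval)
    return 'timeout', None, max_attempts


# Fake responses standing in for s.post(...).text: two empty results, then a hit
replies = iter([
    '{"results": 0, "records": []}',
    '{"results": 0, "records": []}',
    '{"results": 1, "records": [{"id": "42", "CPJC": "demo"}]}',
])
status, record, attempts = poll(lambda: next(replies))
print(status, record['id'], attempts)
```

In the real tool, fetch would wrap the s.post(...) call and return rep.text. The attempt limit is a design choice of this sketch; the original loop runs until a product appears.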

However, the URLs of this CRM system are dynamic too: they contain an OperateID that differs from page to page and changes over time with no pattern I could find. The matching URL and header information therefore have to be copied from Inspect Element, and they stay valid only for a short time. Besides, I cannot be the one doing the grabbing for her every time, hence the idea of Selenium + Requests.

Version 3: The all-purpose combination, Selenium + Requests

Selenium first obtains the cookies, plus the OperateID and Token carried in the page URLs; Requests then searches for the product and submits the reservation.

Get cookies through Selenium’s get_cookies()

cookies = self.driver.driver.get_cookies()
for cookie in cookies:
    if cookie['name'] == 'JSESSIONID':
        self.jsessionid = cookie['value']
        break

print('cookie information:')
print('jsessionid:' + self.jsessionid)
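
To hand the browser session over to Requests, the cookie has to end up in the Cookie header that the Version 2 code sends as header_cookies. A minimal sketch of that step follows. The helper name cookie_header() and the sample values are assumptions for illustration; the only thing relied on is that get_cookies() returns a list of dicts with 'name' and 'value' keys.

```python
def cookie_header(cookies, names=None):
    """Build a Cookie header value from Selenium-style cookie dicts.

    cookies: list of dicts with at least 'name' and 'value' keys,
             the shape returned by WebDriver.get_cookies().
    names:   optional iterable of cookie names to keep; None keeps all.
    """
    wanted = None if names is None else set(names)
    pairs = [
        f"{c['name']}={c['value']}"
        for c in cookies
        if wanted is None or c['name'] in wanted
    ]
    return '; '.join(pairs)


# Hypothetical sample data standing in for self.driver.driver.get_cookies()
sample = [
    {'name': 'JSESSIONID', 'value': 'ABC123', 'path': '/'},
    {'name': 'theme', 'value': 'dark', 'path': '/'},
]
header_cookies = cookie_header(sample, names=['JSESSIONID'])
print(header_cookies)  # JSESSIONID=ABC123
```

The resulting string can then be placed under the 'Cookie' key of the headers dict, as in the Version 2 search request.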

Get the OperateID and Token from the page URL using Selenium's switch_to():

import time
import urllib.parse

iframe = self.driver.driver.find_element_by_tag_name('iframe')
self.driver.driver.switch_to.frame(iframe)  # Switch to the lower half of the home page iframe
self.driver.click_link_by_text("Product Reservation")
time.sleep(1)
self.driver.driver.switch_to.parent_frame()
iframes = self.driver.driver.find_elements_by_tag_name('iframe')
iframe1 = iframes[1]
# print('Booking page address: ' + iframe1.get_attribute('src'))
self.driver.driver.switch_to.frame(iframe1)  # Switch to the product reservation page iframe

self.driver.click_link_by_id('ext-gen32')  # Click to open the search page
time.sleep(1)
self.driver.driver.switch_to.parent_frame()
iframes = self.driver.driver.find_elements_by_tag_name('iframe')
iframe2 = iframes[2]
search_url = iframe2.get_attribute('src')
# print('search url: ' + search_url)
parsed_search_url = urllib.parse.urlparse(search_url)
# print(parsed_search_url)
query_str = parsed_search_url.query
query_parms = query_str.split('&')

dict_query = self._parseQuery(query_parms)  # Turn the URL parameters into a dict
token = dict_query['Token']
operateid = dict_query['OperateID']
self.Token = token
# self.SearchOperateID = operateid
self.YuyueOperateID = operateid
print('Get Token:' + self.Token)
print('Get the booking page opcode:' + self.YuyueOperateID)
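
The _parseQuery helper is not shown in the article. One way to get the same kind of dict is the standard library's urllib.parse.parse_qs. This is a sketch, not the author's implementation: the function name parse_query() and the example URL (host, path, and values) are invented for illustration.

```python
from urllib.parse import urlparse, parse_qs


def parse_query(url):
    """Return the query parameters of url as a {name: value} dict.

    parse_qs maps each name to a list of values; the first value is kept
    for each name, which is enough when names are not repeated.
    """
    query = urlparse(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical iframe src; the real host, path and values are not shown in the article
search_url = 'http://crm.example.com/search.do?OperateID=12345&Token=abcdef&x=1'
params = parse_query(search_url)
print(params['Token'], params['OperateID'])
```

Compared with splitting on '&' by hand, parse_qs also decodes percent-escaped characters in the values.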

This version is close to perfect: automated login (the verification code still has to be typed in by hand), automatic product search, and automatic booking as soon as a product is released. In testing, all 4 products were booked successfully. However, the login username, password, customer's phone number, reservation amount, and product to book were all hard-coded in a Python file. Asking her to edit even a few words there was not realistic (expecting her to write code, I give up), so I planned to solve it with a configuration file. I had not tried building a UI in Python before (I used C + GTK a long time ago), so I decided to try PyQt + Qt Creator, mainly because I wanted to design the interface visually.
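
For reference, the configuration-file route could look roughly like the sketch below, using the standard library's configparser. Everything specific here is an assumption: the section names, key names, and sample values are invented, and the article does not show what the tool's own config step reads.

```python
import configparser

# Hypothetical file contents; section and key names are invented for illustration
SAMPLE = """
[login]
username = alice
password = secret

[reservation]
customer_phone = 13800000000
amount = 1000000
product_name = Demo Product
"""


def load_settings(text):
    """Parse INI text into a flat dict of the values the tool needs."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        'username': parser['login']['username'],
        'password': parser['login']['password'],
        'customer_phone': parser['reservation']['customer_phone'],
        'amount': parser.getint('reservation', 'amount'),
        'product_name': parser['reservation']['product_name'],
    }


settings = load_settings(SAMPLE)
print(settings['product_name'], settings['amount'])
```

With a real file on disk, parser.read('settings.ini') would replace read_string(); the file name is again only an example.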

Version 4: Wrapping the program in a shell with PyQt + Qt Creator

Qt Creator is designed to let developers who use Qt as their application framework complete development tasks quickly and easily. It includes a project generation wizard, an advanced C++ code editor, tools for browsing files and classes, integration with Qt Designer, Qt Assistant, and Qt Linguist, a graphical GDB debugging front end, integration with the qmake build tool, and more. Qt Creator can create many kinds of projects. I chose a QML-based one and had the QML file ready quickly, but how does it communicate with Python? See doc.qt.io/qtforpython…

  1. To call Python when an interface event fires, declare the Python method as a slot with the @pyqtSlot() decorator and register the object as a context property so that it can be called directly from the QML file.
import sys
from PyQt5.QtCore import QObject, pyqtSlot  # assuming PyQt5
from PyQt5.QtGui import QGuiApplication
from PyQt5.QtQml import QQmlApplicationEngine

class Reserve(QObject):
    @pyqtSlot()  # makes begin() callable from QML
    def begin(self):
        # code omitted
        pass

if __name__ == '__main__':
    app = QGuiApplication(sys.argv)
    qml = QQmlApplicationEngine('ui.qml')
    rootObject = qml.rootObjects()[0]

    instance = Reserve(qml.rootContext(), rootObject)  # the reservation object
    qml.rootContext().setContextProperty('con', instance)  # expose it to the QML file as 'con'

    sys.exit(app.exec())

In the QML file, bind the event so that it calls the Python slot function:

Connections {
    target: button_start
    onClicked: con.begin()
}
  2. How do you actively change values in the UI from Python? For example, logs were previously written with print(); now they all need to appear in the interface. Define custom functions in the QML file, with JavaScript-like syntax:
function updatelog(log) {// Define the function
    textArea.append(log)
}
function clearlog() {
    textArea.clear()
}

Then they can be called directly from Python through the root object:

def __init__(self, context, parent=None):
    super(Reserve, self).__init__(parent)
    self.win = parent
    self.ctx = context

    # QML functions can be called directly on the root object
    self.win.updatelog("Test log")

Now the interface is in place and the program is packaged, so let's grab some quotas. What happened? The interface froze. I had learned earlier that long-running work has to be moved off the interface thread. Start by creating a thread class:

from PyQt5.QtCore import QThread, pyqtSignal  # assuming PyQt5

class WorkThread(QThread):
    signal = pyqtSignal(str)     # carries one log message
    clearsignal = pyqtSignal()   # asks the UI to clear the log
    message = ""
    yuyue = ""

    def __init__(self, parent=None):
        super(WorkThread, self).__init__(parent)

    def __del__(self):
        self.wait()

    # Store the object that does the actual work
    def setup(self, instance):
        self.yuyue = instance

    # Output information for internal/external (thread) use
    def log(self, message):
        self.signal.emit(message)

    def run(self):
        self.yuyue.config()
        self.yuyue.login()
        self.yuyue.start()
        # Signal after execution
        self.log("Operation complete")

In the main class's __init__ method, the thread is created and its two signals are connected: one for printing the log and one for clearing it (once the log reaches a certain size).

    def __init__(self, context, parent=None):
        super(Reserve, self).__init__(parent)
        self.win = parent
        self.ctx = context

        chrome_opt = Options()
        chrome_opt.add_argument('--disable-xss-auditor')  # Work around the Chrome error caused by XSS
        self.driver_name = 'chrome'
        self.driver = Browser(driver_name=self.driver_name, chrome_options=chrome_opt)

        # Create the thread and connect its signals to the slots below
        self.thread = WorkThread()
        self.thread.signal.connect(self.callbacklog)
        self.thread.clearsignal.connect(self.callbackclear)
        # Send a message through the signal
        self.thread.log("Initialization completed")
        # Keep one Requests session
        self.s = requests.session()

    # Slot function (receiving end of the signal)
    def callbacklog(self, log):
        self.win.updatelog(log)  # Call a function defined in QML

    def callbackclear(self):
        self.win.clearlog()  # Call a function defined in QML

Then replace every print() call with self.thread.log(). The interface no longer freezes at all.
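
The idea behind this fix can be shown without Qt at all. The sketch below is a standard-library analogue, not PyQt code: threading.Thread plays the role of QThread, and a queue.Queue plays the role of the signal that carries log messages back to the interface. The function names worker() and run_ui_loop() are made up for the illustration.

```python
import queue
import threading
import time


def worker(log_queue):
    """Long-running job: reports progress through the queue, never touches the UI."""
    for step in ('config', 'login', 'start'):
        time.sleep(0.01)  # stands in for slow browser or network work
        log_queue.put(f'{step} done')
    log_queue.put(None)  # sentinel: no more messages


def run_ui_loop(log_queue):
    """Stand-in for the UI thread: stays free between messages and displays each one."""
    shown = []
    while True:
        try:
            message = log_queue.get(timeout=0.1)
        except queue.Empty:
            continue  # a real event loop would handle clicks and repaints here
        if message is None:
            return shown
        shown.append(message)


log_queue = queue.Queue()
thread = threading.Thread(target=worker, args=(log_queue,))
thread.start()
lines = run_ui_loop(log_queue)
thread.join()
print(lines)
```

The slow work never runs on the thread that draws the interface, so that thread only receives short messages and stays responsive.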

Conclusion: for a Python novice there was a lot of ground to cover, but also a lot gained. Many of my friends have always seen me as some kind of wizard (though I know these are just small tricks). When I finished the second version, my friend said something that made me very happy:

You made my dream come true. That’s amazing.

Of course, there is still room for improvement. For example, the verification code still has to be entered manually; a machine learning model could be trained on the verification codes to recognize them automatically. But there is really no need for that when she is the only user.