I'm a programmer, currently working as a team lead on a startup team. Our main technology stack covers Android, Python, Java, and Go. Contact: [email protected]

In my last article I looked at a simple Python scheduler and was surprised to find that its core library is a single file with fewer than 700 lines of code. It's great learning material. To my further surprise, the library was written by the author of Python Tricks, a book I recently read! Now let's see how the master implemented it.

0x00 Preparation

Project address

Github.com/dbader/sche…

Check out the code locally.

Environment

PyCharm+venv+Python3

0x01 Usage

This was covered in the last article, and it is very simple:

import time

import schedule

# Define the function that needs to be executed
def job():
    print("a simple scheduler in python.")

# Set the scheduling parameters; in this case, run every 2 seconds
schedule.every(2).seconds.do(job)

if __name__ == '__main__':
    while True:
        schedule.run_pending()
        time.sleep(1)  # avoid a busy loop between checks

# Execution result
a simple scheduler in python.
a simple scheduler in python.
a simple scheduler in python.
...

The library is also well documented; you can browse schedule.readthedocs.io for an overview of its usage.

0x02 Project structure

(venv) ➜ schedule git:(master) tree -L 2
.
├── requirements-dev.txt
├── schedule
│   └── __init__.py
├── setup.py
├── test_schedule.py
├── tox.ini
└── venv
    ├── bin
    ├── include
    ├── pip-selfcheck.json
    └── pyvenv.cfg

8 directories, 18 files
  • The schedule directory contains only one file, __init__.py, and that is what we need to focus on.
  • setup.py is the configuration file for packaging and publishing the project.
  • test_schedule.py is the unit test file. Besides the documentation, the unit tests are a good place to start to see how the library is used.
  • requirements-dev.txt lists the development dependencies. The core library needs no third-party packages, but the unit tests do.
  • venv is the virtual environment I created after checking out the code; it is not part of the original project.

0x03 schedule

We know that __init__.py is the file that marks a directory as a regular Python package. Functions and classes defined in this file become attributes of the package, so a project can use them as soon as it imports the package.
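As a rough illustration of that point, the sketch below builds a throwaway package in a temporary directory and imports it. The package name demo_pkg and its hello function are made up for this example and have nothing to do with schedule.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical package: a directory containing only __init__.py
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(
        "def hello():\n    return 'hello from demo_pkg'\n"
    )

    # Make the temporary directory importable, then import the package
    sys.path.insert(0, tmp)
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        # Names defined in __init__.py are attributes of the package
        message = demo_pkg.hello()
    finally:
        sys.path.remove(tmp)

print(message)
```

This is the same mechanism that lets import schedule expose every, run_pending, and the other names defined in schedule/__init__.py.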

The schedule source code

schedule uses the following modules, all of which are part of the Python standard library.

import collections
import datetime
import functools
import logging
import random
import re
import time

logger = logging.getLogger('schedule')

The last line creates the module-level logger.

Next come the module's three exception classes, which all derive from Exception and form a small hierarchy: ScheduleError, ScheduleValueError, and IntervalError.

class ScheduleError(Exception):
    """Base schedule exception"""
    pass

class ScheduleValueError(ScheduleError):
    """Base schedule value error"""
    pass

class IntervalError(ScheduleValueError):
    """An improper interval was used"""
    pass

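One practical consequence of this hierarchy is that a caller can catch the base class and still handle the more specific errors. The sketch below repeats the three classes so that it runs on its own; check_interval is a hypothetical helper written for this example, not a function of the library.

```python
class ScheduleError(Exception):
    """Base schedule exception"""


class ScheduleValueError(ScheduleError):
    """Base schedule value error"""


class IntervalError(ScheduleValueError):
    """An improper interval was used"""


def check_interval(interval):
    # Hypothetical validation helper, for illustration only
    if interval < 1:
        raise IntervalError("interval must be at least 1")
    return interval


try:
    check_interval(0)
except ScheduleError as exc:
    # Catching the base class also catches IntervalError
    caught = type(exc).__name__

print(caught)
```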

A CancelJob class is also defined. A job can return it to cancel its own further scheduling.

class CancelJob(object):
    """ Can be returned from a job to unschedule itself. """
    pass


For example, returning the CancelJob class from the function being scheduled turns it into a one-time task:

# Define the function that needs to be executed
def job():
    print("a simple scheduler in python.")
    # Returning CancelJob stops the scheduler from running this job again
    return schedule.CancelJob

Next come the library's two core classes, Scheduler and Job.

class Scheduler(object):
    """
    Objects instantiated by the :class:`Scheduler <Scheduler>` are
    factories to create jobs, keep record of scheduled jobs and
    handle their execution.
    """

class Job(object):
    """
    A periodic job as used by :class:`Scheduler`.

    :param interval: A quantity of a certain time unit
    :param scheduler: The :class:`Scheduler <Scheduler>` instance that
                      this job will register itself with once it has
                      been fully configured in :meth:`Job.do()`.

    Every job runs at a given fixed time interval that is defined by:

    * a :meth:`time unit <Job.second>`
    * a quantity of `time units` defined by `interval`

    A job is usually created and returned by :meth:`Scheduler.every`
    method, which also defines its `interval`.
    """

Scheduler is the scheduler implementation class; it is responsible for creating scheduled jobs and executing them.

A Job is an abstraction of the task that needs to be performed.

These two classes are the core of the library, and we'll look at them in more detail later. Next come the default scheduler, default_scheduler, and the task list, jobs.

# The following methods are shortcuts for not having to
# create a Scheduler instance:

#: Default :class:`Scheduler <Scheduler>` object
default_scheduler = Scheduler()

#: Default :class:`Jobs <Job>` list
jobs = default_scheduler.jobs  # todo: should this be a copy, e.g. jobs()?

default_scheduler is created as soon as import schedule runs. The constructor of Scheduler is:

def __init__(self):
    self.jobs = []

When it is initialized, the scheduler creates an empty task list.

At the end of the file, the module defines some functions that support chained calls. They make the library very user-friendly and are worth learning from. These functions are defined at module level and simply wrap calls to the default_scheduler instance.

def every(interval=1):
    """Calls :meth:`every <Scheduler.every>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    return default_scheduler.every(interval)


def run_pending():
    """Calls :meth:`run_pending <Scheduler.run_pending>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    default_scheduler.run_pending()


def run_all(delay_seconds=0):
    """Calls :meth:`run_all <Scheduler.run_all>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    default_scheduler.run_all(delay_seconds=delay_seconds)


def clear(tag=None):
    """Calls :meth:`clear <Scheduler.clear>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    default_scheduler.clear(tag)


def cancel_job(job):
    """Calls :meth:`cancel_job <Scheduler.cancel_job>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    default_scheduler.cancel_job(job)


def next_run():
    """Calls :meth:`next_run <Scheduler.next_run>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    return default_scheduler.next_run


def idle_seconds():
    """Calls :meth:`idle_seconds <Scheduler.idle_seconds>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    return default_scheduler.idle_seconds
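The pattern used here, a module-level default instance plus thin wrapper functions, can be sketched in isolation. The names Registry, default_registry, and register below are hypothetical and chosen only for this example; they are not part of schedule.

```python
class Registry:
    """Stand-in for a class like Scheduler that owns a list of items."""

    def __init__(self):
        self.items = []

    def register(self, item):
        self.items.append(item)
        return item


# Created once, when the module is first imported
default_registry = Registry()

# Same list object as default_registry.items, not a copy
items = default_registry.items


def register(item):
    """Module-level shortcut that delegates to the default instance."""
    return default_registry.register(item)


# Callers can use the shortcut without creating an instance ...
register("a")
register("b")

# ... or create an independent instance when they need isolation
own_registry = Registry()
own_registry.register("x")

print(items)
print(own_registry.items)
```

The shortcut functions keep simple scripts short, while the class stays available for code that needs more than one independent scheduler.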

Take a look at the entry point, run_pending(). As the demo at the beginning of this article shows, it is the function that drives the scheduler. It simply calls the method of the same name on default_scheduler:

default_scheduler.run_pending()

So let's focus on the corresponding method of the Scheduler class:

def run_pending(self):
    """
    Run all jobs that are scheduled to run.

    Please note that it is *intended behavior that run_pending()
    does not run missed jobs*. For example, if you've registered a job
    that should run every minute and you only call run_pending()
    in one hour increments then your job won't be run 60 times in
    between but only once.
    """
    runnable_jobs = (job for job in self.jobs if job.should_run)
    for job in sorted(runnable_jobs):
        self._run_job(job)

This method first filters the jobs list, keeping only the jobs whose should_run is true (runnable_jobs is a generator expression rather than a list), then sorts them and runs each one in turn through the internal _run_job(job) method:

def _run_job(self, job):
    ret = job.run()
    if isinstance(ret, CancelJob) or ret is CancelJob:
        self.cancel_job(job)

The _run_job method calls the run method of the Job instance and decides from the return value whether the job needs to be canceled.
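The filter, sort, run, and cancel steps described above can be tried out with a small self-contained sketch. MiniJob and MiniScheduler are simplified stand-ins written for this example, and the sketch assumes that jobs are ordered by their next run time. Unlike the real Job, MiniJob does not reschedule itself after running.

```python
import datetime


class CancelJob:
    """Sentinel a job can return to unschedule itself."""


class MiniJob:
    def __init__(self, name, next_run, func):
        self.name = name
        self.next_run = next_run
        self.func = func

    @property
    def should_run(self):
        # A job is due once its next run time has been reached
        return datetime.datetime.now() >= self.next_run

    def __lt__(self, other):
        # Assumption: jobs sort by the time they are due
        return self.next_run < other.next_run

    def run(self):
        return self.func()


class MiniScheduler:
    def __init__(self):
        self.jobs = []

    def run_pending(self):
        runnable = (job for job in self.jobs if job.should_run)
        # sorted() builds a new list, so removing from self.jobs is safe
        for job in sorted(runnable):
            ret = job.run()
            if isinstance(ret, CancelJob) or ret is CancelJob:
                self.jobs.remove(job)


order = []


def once():
    order.append("once")
    return CancelJob  # run a single time, then unschedule


def repeat():
    order.append("repeat")


now = datetime.datetime.now()
scheduler = MiniScheduler()
scheduler.jobs.append(MiniJob("repeat", now - datetime.timedelta(seconds=1), repeat))
scheduler.jobs.append(MiniJob("once", now - datetime.timedelta(seconds=5), once))
scheduler.jobs.append(MiniJob("later", now + datetime.timedelta(hours=1), repeat))

scheduler.run_pending()
remaining = [job.name for job in scheduler.jobs]
print(order, remaining)
```

The job named later is not due yet, so it is skipped; the job named once runs first because it was due earliest, and it is removed because it returned CancelJob.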

At this point we need to look at the implementation logic of the Job class.

The first thing we need to look at is when the Job was created. So let’s start with the code in the Demo

schedule.every(2).seconds.do(job)

The schedule.every() method is executed first

def every(interval=1):
    """Calls :meth:`every <Scheduler.every>` on the
    :data:`default scheduler instance <default_scheduler>`.
    """
    return default_scheduler.every(interval)

It delegates to the every method of the Scheduler class:

def every(self, interval=1):
    """
    Schedule a new periodic job.

    :param interval: A quantity of a certain time unit
    :return: An unconfigured :class:`Job <Job>`
    """
    job = Job(interval, self)
    return job

A Job is created here, with the interval and the scheduler instance passed to its constructor, and the Job instance is returned so that calls can be chained.

Now jump to the Job constructor:

def __init__(self, interval, scheduler=None):
    self.interval = interval  # pause interval * unit between runs
    self.latest = None  # upper limit to the interval
    self.job_func = None  # the job job_func to run
    self.unit = None  # time units, e.g. 'minutes', 'hours', ...
    self.at_time = None  # optional time at which this job runs
    self.last_run = None  # datetime of the last run
    self.next_run = None  # datetime of the next run
    self.period = None  # timedelta between runs, only valid for
    self.start_day = None  # Specific day of the week to start on
    self.tags = set()  # unique set of tags for the job
    self.scheduler = scheduler  # scheduler to register with

It mainly initializes the interval, the function to be executed, the time unit, and the other scheduling attributes.

After every, the seconds property is accessed:

@property
def seconds(self):
    self.unit = 'seconds'
    return self

This sets the time unit, here to seconds. There are similar properties for minutes, hours, days, and so on.
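The chaining works because every returns a job object and each property returns self. Here is a minimal sketch of that idea; MiniJob is a cut-down stand-in written for this example and omits everything the real Job does beyond storing its configuration.

```python
class MiniJob:
    def __init__(self, interval):
        self.interval = interval
        self.unit = None
        self.job_func = None

    @property
    def seconds(self):
        # Accessing the property records the unit and returns the
        # same object, so another call can be chained onto it
        self.unit = "seconds"
        return self

    @property
    def minutes(self):
        self.unit = "minutes"
        return self

    def do(self, job_func):
        self.job_func = job_func
        return self


def every(interval=1):
    """Entry point of the chain: create and return an unconfigured job."""
    return MiniJob(interval)


job = every(2).seconds.do(print)
print(job.interval, job.unit)
```

Because seconds is a property rather than a method, the chain reads almost like a sentence: every 2 seconds, do this.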

Finally, the do method is executed

def do(self, job_func, *args, **kwargs):
    """
    Specifies the job_func that should be called every time the
    job runs.

    Any additional arguments are passed on to job_func when
    the job runs.

    :param job_func: The function to be scheduled
    :return: The invoked job instance
    """
    self.job_func = functools.partial(job_func, *args, **kwargs)
    try:
        functools.update_wrapper(self.job_func, job_func)
    except AttributeError:
        # job_funcs already wrapped by functools.partial won't have
        # __name__, __module__ or __doc__ and the update_wrapper()
        # call will fail.
        pass
    self._schedule_next_run()
    self.scheduler.jobs.append(self)
    return self

Here functools.partial wraps our custom function, together with any arguments, into a callable object, and update_wrapper copies the original function's metadata onto it.
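To see what these two functools calls do on their own, here is a small standalone sketch. The greet function and its arguments are invented for this example.

```python
import functools


def greet(greeting, name="world"):
    """Return a greeting."""
    return f"{greeting}, {name}!"


# Bind the arguments now; the result can later be called with none
job_func = functools.partial(greet, "hello", name="schedule")

# Copy __name__, __doc__ and similar metadata from the original
# function onto the partial object
functools.update_wrapper(job_func, greet)

result = job_func()
print(job_func.__name__, result)
```

Binding the arguments up front means the scheduler can later invoke every job the same way, with no arguments, regardless of the original function's signature.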

Next, do calls the _schedule_next_run method, which works out when the job should run next from its interval, unit, and other settings; the ordering of jobs by time happens later, in run_pending. I think this method is the most technical part of the project. The logic is a little involved, but you can follow it if you read carefully. For reasons of space, I won't post the code here.
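The basic idea can still be shown with a heavily simplified sketch. It assumes the next run time is just the current time plus interval times unit, and it ignores the at_time, latest, and start_day settings that the real method also handles. The function name compute_next_run is hypothetical.

```python
import datetime

VALID_UNITS = ("seconds", "minutes", "hours", "days", "weeks")


def compute_next_run(interval, unit, now=None):
    """Simplified: next run = now + interval * unit."""
    if unit not in VALID_UNITS:
        raise ValueError("invalid unit: %r" % (unit,))
    if now is None:
        now = datetime.datetime.now()
    # timedelta accepts the unit name as a keyword argument,
    # e.g. timedelta(seconds=2)
    period = datetime.timedelta(**{unit: interval})
    return now + period


start = datetime.datetime(2020, 1, 1, 12, 0, 0)
next_run = compute_next_run(2, "seconds", now=start)
print(next_run)
```

Storing the unit as a string such as 'seconds' pays off here, because the same string can be used directly as a timedelta keyword.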

This completes the addition of the job. The task can then be executed by calling the run_pending method.

0x04 Summary

The schedule library defines two core classes, Scheduler and Job. When the package is imported, a default Scheduler object is created and its task list is initialized. The schedule module provides an interface for chained calls: as the chain configures the scheduling parameters, a Job object is created and then added to the task list. Finally, our custom function is called when run_pending executes. The core idea of this library is to use object-oriented design to model the problem accurately. Its overall logic is not complex, which makes it a good codebase for learning to read source code.

0x05 Learning Materials

  • Github.com/dbader/sche…
  • schedule.readthedocs.io