Scheduled task scenario

A scheduled task generally takes one of two forms. One triggers a task at a specific point in time, such as every morning or every Saturday at 2 PM. The other triggers a task at a fixed interval or frequency, such as once every hour.

In our actual work, there are many scenarios for using scheduled tasks, such as:

  1. Collect some business data and generate reports every day
  2. Cancel an order if the user fails to pay within 30 minutes of placing it
  3. Send a message to the user at a specific time (a greeting SMS, etc.)
  4. Run a compensation mechanism that periodically scans databases and logs, compares the data, and corrects any discrepancies
  5. And so on…

Precisely because the application scenarios are so broad, our predecessors worked hard to create many practical tools and frameworks for us. Today we can stand on the shoulders of giants, see farther, and move faster.

Here’s a list of these methods and tools.

crontab

Strictly speaking, crontab is not a Java tool. It comes with Linux and periodically executes a shell script or command.

However, crontab is widely used in actual development, and crontab expressions are similar to the cron expressions of the other scheduling frameworks introduced later, so crontab is introduced here first.

A crontab entry has the following form:

crontabExpression command

First, command can be a Linux command (e.g. echo 123), a shell script (e.g. test.sh), or a combination of the two (e.g. cd /tmp; sh test.sh).

crontabExpression looks something like this:

# run the command at the 5th minute of every hour
5 * * * * command
# run the command once a day at 6:30 p.m.
30 18 * * * command
# run the command at 7:30 a.m. on the 8th of each month
30 7 8 * * command
# run the command once a year at 5:30 a.m. on June 8
30 5 8 6 * command
# run the command every Sunday at 6:30 a.m.
30 6 * * 0 command

crontabExpression contains five columns with the following meanings:

  1. The first column is the minute, with values from 0 to 59
  2. The second column is the hour, with values from 0 to 23
  3. The third column is the day of the month, with values from 1 to 31
  4. The fourth column is the month, with values from 1 to 12
  5. The fifth column is the day of the week, with values from 0 to 7 (both 0 and 7 mean Sunday)

In addition, each column can contain special characters such as * (any value), - (a range), a comma (a list) and / (a step). Their specific meanings are summarized well in this article, so I will not repeat them here.
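To make the five columns concrete, here is a minimal Java sketch that splits a crontab expression into named fields and range-checks plain numeric values against the standard crontab ranges (minute 0 to 59, hour 0 to 23, day of month 1 to 31, month 1 to 12, day of week 0 to 7). The class CrontabFields and its parse method are hypothetical names made up for this illustration; they are not part of cron or of any library, and special characters are simply passed through without validation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of cron or any library.
class CrontabFields {

    private static final String[] NAMES = {"minute", "hour", "day-of-month", "month", "day-of-week"};
    private static final int[] MIN = {0, 0, 1, 1, 0};
    private static final int[] MAX = {59, 23, 31, 12, 7};

    /** Splits an expression such as "30 18 * * *" into named fields. */
    static Map<String, String> parse(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length != 5) {
            throw new IllegalArgumentException("expected 5 fields but got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            // Only plain numbers are range-checked; *, ranges, lists and steps pass through untouched.
            if (parts[i].matches("\\d+")) {
                int value = Integer.parseInt(parts[i]);
                if (value < MIN[i] || value > MAX[i]) {
                    throw new IllegalArgumentException(
                            NAMES[i] + " must be between " + MIN[i] + " and " + MAX[i] + ": " + value);
                }
            }
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Prints the five fields of "every day at 6:30 p.m." in column order.
        System.out.println(parse("30 18 * * *"));
    }
}
```

With this sketch, an expression such as 30 24 * * * is rejected because 24 is outside the hour range.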

timer

java.util.Timer and java.util.TimerTask are classes provided in the JDK.

TimerTask represents a specific task, while Timer schedules the task.

Here’s a simple example:

import java.util.Timer;
import java.util.TimerTask;

public class TimerTest extends TimerTask {

    private String jobName = "";

    public TimerTest(String jobName) {
        super();
        this.jobName = jobName;
    }

    @Override
    public void run() {
        System.out.println("execute " + jobName);
    }

    public static void main(String[] args) {
        Timer timer = new Timer();

        // Job1: start after 1 second, then run every 1 second
        long delay1 = 1 * 1000;
        long period1 = 1000;
        timer.schedule(new TimerTest("job1"), delay1, period1);

        // Job2: start after 2 seconds, then run every 2 seconds
        long delay2 = 2 * 1000;
        long period2 = 2000;
        timer.schedule(new TimerTest("job2"), delay2, period2);
    }
}

Of course, Timer is not recommended in production environments. Because a Timer runs all of its tasks on a single thread, it has two problems:

  1. When a task throws an exception, the entire Timer stops running. For example, if job1 throws an exception, job2 will not run again.
  2. If one task takes a long time to process, the scheduling of the other tasks is affected. For example, if job1 takes 60 seconds to process, job2 effectively runs only once every 60 seconds.

For these reasons, Timer is generally not used anymore.
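The first problem is easy to reproduce. The following is a minimal sketch, not production code: it schedules one task that throws, waits briefly, and then tries to schedule a second task on the same Timer. The class and method names (TimerFailureDemo, timerDiesAfterException) are made up for this illustration, and the 500 ms wait is an arbitrary assumption that is long enough for the timer thread to terminate.

```java
import java.util.Timer;
import java.util.TimerTask;

// Minimal sketch: one failing TimerTask takes the whole Timer down.
class TimerFailureDemo {

    /** Returns true if the Timer refuses new tasks after one task threw an exception. */
    static boolean timerDiesAfterException() {
        Timer timer = new Timer("demo-timer", true); // daemon thread, so the JVM can still exit
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                throw new IllegalStateException("job1 failed");
            }
        }, 0);

        sleepQuietly(500); // give the timer thread time to run job1 and terminate

        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    System.out.println("execute job2");
                }
            }, 0);
            return false; // job2 was accepted, so the Timer is still alive
        } catch (IllegalStateException e) {
            return true; // the Timer counts as cancelled, so job2 can never run
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("timer dead after exception: " + timerDiesAfterException());
    }
}
```

Running it also prints the stack trace of job1 to standard error, because nothing catches the exception on the timer thread.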

ScheduledExecutorService

ScheduledExecutorService is one of the thread pool types provided by the JDK.

Judging from its API, it looks as though it was designed to replace Timer, and it can replace it completely. I don't know why Timer has still not been marked as deprecated; there must be some remaining use cases.

First, ScheduledExecutorService does everything a Timer can do;

Secondly, ScheduledExecutorService solves the two problems mentioned above:

  1. When a task throws an exception, the pool itself keeps working, so the other scheduled tasks do not stop. Note, however, that later runs of the task that threw are suppressed unless the exception is caught inside the task.
  2. Tasks are handled by different threads in the pool, so as long as the pool has enough threads, a long-running task does not delay the others.
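A minimal sketch of this behavior, assuming a pool of two threads and a 100 ms period (both arbitrary choices for the demo): job1 throws on its first run and job2 keeps running. The class name ScheduledPoolDemo and the method runBothJobs are made up for this illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ScheduledPoolDemo {

    /** Runs a failing job1 and a healthy job2 side by side; returns {job1 runs, job2 runs}. */
    static int[] runBothJobs() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        AtomicInteger job1Runs = new AtomicInteger();
        AtomicInteger job2Runs = new AtomicInteger();

        // job1 throws on its first run, so its later runs are suppressed.
        pool.scheduleAtFixedRate(() -> {
            job1Runs.incrementAndGet();
            throw new IllegalStateException("job1 failed");
        }, 0, 100, TimeUnit.MILLISECONDS);

        // job2 is unaffected and keeps firing every 100 ms.
        pool.scheduleAtFixedRate(job2Runs::incrementAndGet, 0, 100, TimeUnit.MILLISECONDS);

        try {
            Thread.sleep(550);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return new int[] {job1Runs.get(), job2Runs.get()};
    }

    public static void main(String[] args) {
        int[] runs = runBothJobs();
        System.out.println("job1 runs: " + runs[0] + ", job2 runs: " + runs[1]);
    }
}
```

job1 should run exactly once, while job2 runs several times. If job1 needs to survive its own failures, wrap its body in a try/catch so the exception never reaches the pool.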

Of course, ScheduledExecutorService is not a panacea. For example, a requirement such as "run this code every Saturday at 2 PM" is somewhat cumbersome to implement with ScheduledExecutorService.
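To show where the extra work comes from, here is a rough sketch of one way to do it: compute the delay until the next Saturday 14:00 with java.time, then schedule with a 7-day period. The class WeeklySchedule and its methods are hypothetical names for this illustration, and the fixed 7-day period is a simplifying assumption that ignores daylight-saving shifts.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.temporal.TemporalAdjusters;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class WeeklySchedule {

    /** Time from 'now' until the next Saturday 14:00 in the same time zone. */
    static Duration untilNextSaturday2pm(ZonedDateTime now) {
        ZonedDateTime next = now
                .with(TemporalAdjusters.nextOrSame(DayOfWeek.SATURDAY))
                .withHour(14).withMinute(0).withSecond(0).withNano(0);
        if (!next.isAfter(now)) {
            next = next.plusWeeks(1); // 14:00 has already passed this Saturday
        }
        return Duration.between(now, next);
    }

    /** Schedules the task weekly; the 7-day period is an assumption that ignores DST changes. */
    static ScheduledFuture<?> scheduleWeekly(ScheduledExecutorService pool, Runnable task) {
        long initialDelay = untilNextSaturday2pm(ZonedDateTime.now()).toMillis();
        return pool.scheduleAtFixedRate(task, initialDelay,
                TimeUnit.DAYS.toMillis(7), TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        System.out.println("next run in " + untilNextSaturday2pm(ZonedDateTime.now()));
    }
}
```

A caller would pass its own pool, for example scheduleWeekly(Executors.newSingleThreadScheduledExecutor(), task). All of the calendar arithmetic lives in application code here, which is exactly the part a cron-style trigger takes over.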

ScheduledExecutorService is better suited to scheduling simple tasks at a fixed frequency. For everything else, it is time for the famous Quartz.

quartz

In the Java world, Quartz is the undisputed authority on task scheduling. Most of the open source scheduling frameworks on the market are based directly or indirectly on it.

Let’s start with a simple example of Quartz.

Use the cron expression to make Quartz perform a task every 10 seconds:

Let’s start with maven dependencies:

<!-- https://mvnrepository.com/artifact/org.quartz-scheduler/quartz -->
<dependency>
    <groupId>org.quartz-scheduler</groupId>
    <artifactId>quartz</artifactId>
    <version>2.3.2</version>
</dependency>

Write code:

import com.alibaba.fastjson.JSON;
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

import static org.quartz.CronScheduleBuilder.cronSchedule;
import static org.quartz.JobBuilder.newJob;
import static org.quartz.TriggerBuilder.newTrigger;

public class QuartzTest implements Job {

    @Override
    public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        System.out.println("Here are your scheduled tasks:" + JSON.toJSONString(jobExecutionContext.getJobDetail()));
    }

    public static void main(String[] args) {
        try {
            // Get the default scheduler; StdScheduler is actually a proxy for QuartzScheduler
            Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

            // Start the scheduler
            scheduler.start();

            // Create a new Job, specify QuartzTest as the execution class, and attach a key/value pair of data
            JobDetail job = newJob(QuartzTest.class)
                    .usingJobData("jobData", "test")
                    .withIdentity("myJob", "group1")
                    .build();

            // Create a new Trigger that represents the scheduling plan of the JobDetail;
            // this cron expression fires every 10 seconds
            Trigger trigger = newTrigger()
                    .withIdentity("myTrigger", "group1")
                    .startNow()
                    .withSchedule(cronSchedule("0/10 * * * * ?"))
                    .build();

            // Hand the job and its trigger to the scheduler
            scheduler.scheduleJob(job, trigger);

            // Keep the main thread alive so the job can fire
            // scheduler.shutdown();
            Thread.sleep(10000000);
        } catch (SchedulerException se) {
            se.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

This simple example already contains some of Quartz's core components:

Scheduler - Can be understood as a scheduler instance, which is used to schedule tasks.
Job - An interface that represents the task to be executed, similar to TimerTask.
JobDetail - Defines an instance of a Job; it further wraps and extends a specific Job.
Trigger - Defines the scheduling plan of a JobDetail.
JobBuilder - Used to define/build JobDetail instances.
TriggerBuilder - Used to define/build Trigger instances.

1. Scheduler

Scheduler is an interface that has four implementations:

JBoss4RMIRemoteMBeanScheduler
RemoteMBeanScheduler
RemoteScheduler
StdScheduler

Our example above uses StdScheduler, which represents scheduling directly locally (the others all have the word remote, obviously referring to remote calls).

Take a look at the comments and constructors for StdScheduler

/**
 * <p>
 * An implementation of the <code>Scheduler</code> interface that directly
 * proxies all method calls to the equivalent call on a given <code>QuartzScheduler</code>
 * instance.
 * </p>
 * 
 * @see org.quartz.Scheduler
 * @see org.quartz.core.QuartzScheduler
 *
 * @author James House
 */
public class StdScheduler implements Scheduler {

    /**
     * <p>
     * Construct a <code>StdScheduler</code> instance to proxy the given
     * <code>QuartzScheduler</code> instance, and with the given <code>SchedulingContext</code>.
     * </p>
     */
    public StdScheduler(QuartzScheduler sched) {
        this.sched = sched;
    }
}

It turns out that StdScheduler is merely a proxy; it simply calls the methods of the org.quartz.core.QuartzScheduler class.

The other three implementations, such as RemoteScheduler, are likewise just proxies for QuartzScheduler.

So obviously, the core of Quartz is the QuartzScheduler class.

So take a look at QuartzScheduler’s Javadoc comment:

/**
 * <p>
 * This is the heart of Quartz, an indirect implementation of the <code>{@link org.quartz.Scheduler}</code>
 * interface, containing methods to schedule <code>{@link org.quartz.Job}</code>s,
 * register <code>{@link org.quartz.JobListener}</code> instances, etc.
 * </p>
 * 
 * @see org.quartz.Scheduler
 * @see org.quartz.core.QuartzSchedulerThread
 * @see org.quartz.spi.JobStore
 * @see org.quartz.spi.ThreadPool
 * 
 * @author James House
 */
public class QuartzScheduler implements RemotableQuartzScheduler {
	...
}

QuartzScheduler is the heart of Quartz. According to its Javadoc, it indirectly implements the org.quartz.Scheduler interface and contains methods for scheduling Jobs, registering JobListeners, and so on.

Although the Javadoc calls it an indirect implementation of Scheduler, its inheritance diagram shows that it has no type relationship with the Scheduler interface at all. It is completely independent of that interface, and almost all scheduler-related logic is implemented in it.

As can be seen from RemotableQuartzScheduler in this inheritance diagram, QuartzScheduler inherently supports remote scheduling (with RMI triggering the scheduling remotely, the management and the execution of scheduling can be separated).

Of course, most applications use the simplest setup, as our example does: scheduling is triggered locally and tasks are executed locally.

2. Job, JobDetail

Job is an interface that defines only one execute method, which represents the logic of task execution.

public interface Job {
    void execute(JobExecutionContext context)
        throws JobExecutionException;
}

The default implementation of JobDetail is JobDetailImpl. A JobDetail specifies the implementation class of the Job and adds some new parameters:

  1. name and group are combined into a JobKey object, which serves as the unique ID of this JobDetail
  2. JobDataMap passes extra parameters to the Job
  3. A job can be persisted to the database, which is where Quartz differs from an ordinary Timer

As you can see, JobDetail is actually an enhancement of Job. Job represents the execution logic of a task, while JobDetail is more concerned with managing the Job.

3. Trigger

The Trigger interface is arguably the core feature of Quartz. Quartz is a scheduled task scheduling framework, and the scheduling logic of scheduled tasks is implemented in Trigger.

Take a look at the Trigger implementation classes, which at first glance seem quite numerous. But only the ones circled in red are concrete implementation classes; the others are interfaces or abstract classes:

In fact, the ones we use most are SimpleTriggerImpl and CronTriggerImpl. The former represents simple scheduling logic, such as executing once every minute. The latter uses cron expressions to specify more complex scheduling logic.

Obviously, the simple example above used CronTriggerImpl.

Note that Quartz cron expressions differ from Linux crontab expressions: they start with an extra seconds field, so schedules can be specified down to the second. The fields are, in order:

  1. Seconds
  2. Minutes
  3. Hours
  4. Day-of-month
  5. Month
  6. Day-of-week
  7. Year (optional)

For example, "0 0 12 ? * WED" means "every Wednesday at 12:00 p.m.".
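The practical consequence is that a five-field crontab expression cannot be pasted into Quartz unchanged. The sketch below only labels the fields of a Quartz-style expression by position and rejects expressions with the wrong number of fields. QuartzCronFields and label are hypothetical names for this illustration; Quartz does its own parsing in the org.quartz.CronExpression class, which this sketch does not use or imitate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; Quartz does its real parsing in org.quartz.CronExpression.
class QuartzCronFields {

    private static final String[] NAMES =
            {"seconds", "minutes", "hours", "day-of-month", "month", "day-of-week", "year"};

    /** Labels the 6 or 7 fields of a Quartz-style cron expression by position. */
    static Map<String, String> label(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("expected 6 or 7 fields but got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Every Wednesday at 12:00 p.m.
        System.out.println(label("0 0 12 ? * WED"));
    }
}
```

Calling label("30 18 * * *") throws, because the crontab form lacks the leading seconds field.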

When using CronTrigger, it is error-prone to write cron expressions directly, so it is a good idea to have a tool to verify that your cron expressions are written correctly and that the firing time is as expected.

Such tools already exist, for example the following website:

tool.lu/crontab/

The actual effect is as follows:

So that’s a primer for Quartz. But it’s really just a primer. In fact, Quartz is far more complex and powerful than this example suggests.

Such as:

  1. Quartz can be configured in cluster mode, which provides failover, load balancing and other functions, improving computing power and system availability.
  2. Quartz also supports JTA transactions, so some jobs can run inside a transaction.
  3. In theory, Quartz can run tens of thousands of jobs as long as the server resources allow it.
  4. And so on…

Of course, Quartz is not without its faults. The whole framework focuses on “scheduling” and neglects other aspects, such as interaction and performance.

  1. In terms of interaction, Quartz simply provides the scheduler.scheduleJob(job, trigger) API. It does not provide any management interface, which is not user-friendly.

  2. Quartz has no native sharding support, so a large task can take a very long time to run. For example, a job that has to process 100 million member records might not finish within a day. With sharding this is much easier: the 100 million members can be split across multiple instances that run in parallel, which gives better performance.

On these two points, some other frameworks do a better job.

elastic-job and xxl-job

elastic-job and xxl-job are two excellent distributed task scheduling frameworks. They are at least the top two of all the distributed scheduling frameworks I’ve used (because these are the two I’ve used, hahaha).

The two frameworks each have their own characteristics, and they have these in common: distributed, lightweight, and user-friendly interaction.

elastic-job

elastic-job is a powerful distributed scheduling framework that Dangdang developed on top of Quartz and open-sourced. In my experience, two things make elastic-job great: job sharding and elastic scaling.

  1. Job sharding, as mentioned above, divides a large task into multiple sub-tasks that are processed by multiple job nodes, which shortens the job time.
  2. Elastic scaling is closely related to job sharding. Simply put, when a job node is added or removed, the shards are redistributed so that every shard has a node to process it and every node has shards to process.
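The idea behind job sharding can be sketched in a few lines of plain Java. This is only an illustration of modulo-based partitioning and does not use the elastic-job API: the class ShardingSketch, the method idsForShard, and the parameter names shardingTotalCount and shardingItem are assumptions chosen for the example. Each node is told the total number of shards and which shard it holds, and it processes only the member ids that fall into that shard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of modulo-based sharding; this is not the elastic-job API.
class ShardingSketch {

    /** Returns the member ids that the node holding 'shardingItem' should process. */
    static List<Long> idsForShard(List<Long> memberIds, int shardingTotalCount, int shardingItem) {
        List<Long> mine = new ArrayList<>();
        for (long id : memberIds) {
            // An id belongs to a shard when its remainder matches the shard number.
            if (Math.floorMod(id, (long) shardingTotalCount) == shardingItem) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> members = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        int shardingTotalCount = 3;
        for (int item = 0; item < shardingTotalCount; item++) {
            System.out.println("shard " + item + " -> " + idsForShard(members, shardingTotalCount, item));
        }
    }
}
```

With ten ids and three shards, shard 0 gets 3, 6 and 9, shard 1 gets 1, 4, 7 and 10, and shard 2 gets 2, 5 and 8. In a real job the filter would normally be pushed into the database query rather than applied in memory.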

For more information on the principles behind elastic-job, please refer to the official website.

xxl-job

xxl-job is another widely used distributed task scheduling framework. Early versions of xxl-job were also built on Quartz, but Quartz has since been gradually removed and replaced with a self-developed scheduling module.

I prefer xxl-job to elastic-job, because it has the following advantages:

  1. xxl-job supports almost all functions supported by elastic-job. I tried to take a screenshot of them, but it turned out I couldn't.
  2. xxl-job truly separates scheduling from execution. I don't like having scheduling and execution mixed together and embedded in the business system.
  3. xxl-job has a rich and flexible management console. One of my favorite features is that you can see the log of each task execution in the console.

Also, because the official documentation is very detailed, I cannot introduce it better than the official website does. For more features and principles, see the official website.

conclusion

From simple to complex, this article has introduced six approaches to scheduled tasks. In production environments, elastic-job or xxl-job is generally recommended. However, for a simple task, plain crontab is perfectly acceptable; I often use crontab for business-related scheduled tasks.

Of course, with the increasing amount of data and the rapid development of big data technology, many excellent scheduling frameworks have emerged in ecosystems such as Hadoop and Spark. But that is beyond the scope of this article.