Preface

Crawlab is an open-source distributed crawler management platform. The V0.6 beta was released recently, with many optimizations in performance, stability, ease of use, and other areas. How does it differ from the previous V0.5 release? What are the major optimizations? Should existing Crawlab users migrate to the new version? This article introduces the main differences between Crawlab V0.5 and V0.6, as well as how to install and use the new version.

New Features

The new V0.6 release includes a lot of low-level optimization, but this article will not cover the underlying principles; those will be the subject of a later article. This section introduces the main new features of the release from a usage perspective.

File Upload Optimization

Before Crawlab V0.6, there were two ways to upload crawler files: as a ZIP package or via the CLI tool. ZIP uploads proved error-prone, especially for Scrapy projects: scrapy.cfg had to be packed at the root of the archive, or Crawlab's built-in Scrapy support would fail to recognize the project, and inconsistent directory hierarchies could cause runtime errors. Similar directory hierarchy problems could occur even with CLI uploads. The new version keeps the CLI upload mode, removes the ZIP upload mode, and adds drag-and-drop, directory selection, file selection, and other non-ZIP upload modes.
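For reference, below is a minimal Scrapy project layout with scrapy.cfg at the top level, which is the structure an uploaded Scrapy project should preserve. The names myproject and example_spider here are placeholders, not anything Crawlab requires:

    myproject/
    ├── scrapy.cfg              # must sit at the root of the uploaded directory
    └── myproject/
        ├── __init__.py
        ├── items.py
        ├── settings.py
        └── spiders/
            ├── __init__.py
            └── example_spider.py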

Drag and drop to upload

Drag-and-drop uploading is very simple: just as in an IDE, drag files from an operating system folder onto a directory in the file editor's navigation sidebar to upload them.

Select a directory to upload

The new version of Crawlab also lets you upload an entire crawler directory. The operation is simple: click the file upload button, choose the directory option, click the upload button, and select the directory to upload; the whole crawler project is then uploaded to Crawlab.

Select a file to upload

In addition to drag-and-drop and directory uploads, Crawlab also lets you select and upload individual files. The operation is similar to uploading a directory.

File editor optimization

The older (V0.5) file editor handled basic file editing, but it was far less polished than mainstream code editors and was frankly barely usable. The new version of Crawlab makes major improvements to the file editor, amounting to a near-complete rewrite. The goal is to let users work with files comfortably in the web interface, instead of falling back on the clumsy "edit locally, upload, then run" workflow. Crawler engineers often need to debug code promptly, and this optimization aims to make file operations in the interface convenient enough to remove the dependence on local editing. Beyond the improved upload methods mentioned earlier, the Crawlab file editor now supports more modern code editor features, including file tabs, right-click context menus, drag-and-drop moves, smart icons, and more.

File tabs

Editing code sometimes requires working on multiple files at once. Mainstream code editors support this, and so does the new Crawlab version.

You can double-click files to open them in multiple tabs.

You can also click a file tab to switch back and forth between files.

More advanced operations are also supported, such as drag-and-drop and right-click menu actions, described below.

Right-click menu operation

In the Crawlab file editor, you can right-click a file or directory in the left sidebar for more operations: creating a file, creating a directory, renaming, copying, deleting, and more.

Drag and drop to move

The Crawlab file editor also supports dragging and dropping files or directories to move them, just like mainstream IDEs.

Smart icons

Many developers are used to the automatic icons for different file and directory types in IDEs such as JetBrains IDEA or VS Code. The Crawlab file editor supports this feature as well, letting developers see at a glance what each file or directory is for.

The Crawlab file editor’s smart icons are implemented based on atom-material-icons.

Table function optimization

Tables are also enhanced in Crawlab’s new release. Tables are an effective way to display lists; the crawler list, task list, and others are all presented in table form. In the new version, Crawlab lets users perform advanced operations on tables, such as searching, filtering, customizing columns, and more.

Search

You can search a Crawlab table for matching items by clicking the filter button in the table header.

Filtering

Users can also perform filtering operations in Crawlab tables.

Custom columns

You can customize which columns to show or hide in a Crawlab table, and adjust their order.

How to install

Some readers may be itching to try the new features outlined above. Although V0.6 has not been officially released yet, it is easy to get a taste of the latest beta, which is deployed with Docker in much the same way as previous versions. Below is a demo of a local pseudo multi-node deployment (multiple nodes on a single machine). For other deployment modes, including production deployment, see the official documentation; its Quick Start guide is also a useful reference.

The pseudo-multi-node deployment procedure is as follows.

  1. Install Docker and Docker Compose

  2. Create docker-compose.yml and configure it as follows:

    version: '3.3'
    services:
      master:
        image: crawlabteam/crawlab:latest
        environment:
          CRAWLAB_NODE_MASTER: "Y"
          CRAWLAB_NODE_NAME: "Master Node"
          CRAWLAB_MONGO_HOST: "mongo"
          GOPROXY: https://goproxy.cn,direct
        ports:
          - "8080:8080"
        depends_on:
          - mongo
      worker01:
        image: crawlabteam/crawlab:latest
        environment:
          CRAWLAB_NODE_MASTER: "N"
          CRAWLAB_NODE_NAME: "Worker Node 01"
          CRAWLAB_GRPC_ADDRESS: "master"
          CRAWLAB_FS_FILER_URL: "http://master:8080/api/filer"
          GOPROXY: https://goproxy.cn,direct
        depends_on:
          - master
      worker02:
        image: crawlabteam/crawlab:latest
        environment:
          CRAWLAB_NODE_MASTER: "N"
          CRAWLAB_NODE_NAME: "Worker Node 02"
          CRAWLAB_GRPC_ADDRESS: "master"
          CRAWLAB_FS_FILER_URL: "http://master:8080/api/filer"
          GOPROXY: https://goproxy.cn,direct
        depends_on:
          - master
      mongo:
        image: mongo:4
        restart: always
  3. Run docker-compose pull to pull the Docker images

  4. Run docker-compose up -d to start the Docker containers

After starting the containers, you can access the main Crawlab interface by opening http://localhost:8080 in your browser. The default login username and password are both admin.
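If the interface does not come up, a couple of standard Docker Compose commands, run from the directory containing docker-compose.yml, can help verify the deployment:

    # List the containers and confirm master, worker01, worker02, and mongo are all up
    docker-compose ps

    # Tail the master node's logs to watch startup progress and spot any errors
    docker-compose logs -f master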

Conclusion

This article has taken a quick look at some of the new features in the Crawlab V0.6 Beta, illustrating several very practical usability improvements: file uploads, the file editor, and advanced table functions. Of course, the optimizations and new features in the new release go beyond ease of use; future articles will cover more of them, including improvements to robustness, scalability, and maintainability, such as the plugin framework.

Community

If you find Crawlab helpful for your work or study, you can add the author on WeChat (TikazyQ1) to exchange ideas with Crawlab users and developers in the technical discussion group. And if you find Crawlab easy to use, please share it with friends who may need a crawler management platform; your share could be of great value to them.