In “Architecture-Oriented Programming”, I explained my view of the relationship between architecture and code: “Code needs to reflect architecture!”

This article tests that view by designing and implementing the core functions of a file service. The design process blends use-case-driven design and domain-driven design.

This and subsequent articles will design and develop several real-world systems while attempting to summarize a suitable architecture design and development process. Discussion is welcome!

Functions

The file service has two core functions: “file upload” and “file download”. Upload may need to support resumable upload (continuing from a breakpoint) and chunked upload. Download may need protection. For example, downloads may be allowed only from specified clients.
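To make resumable, chunked upload concrete, here is a minimal sketch of the bookkeeping it needs: the server remembers which chunks have arrived so that a client can ask what is still missing after an interruption. The class `ChunkTracker` and its methods are hypothetical names for illustration and are not part of the file service described in this article.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Tracks which chunks of one upload have arrived (hypothetical helper). */
class ChunkTracker {
    private final int totalChunks;
    private final BitSet received;

    ChunkTracker(int totalChunks) {
        if (totalChunks <= 0) {
            throw new IllegalArgumentException("totalChunks must be positive");
        }
        this.totalChunks = totalChunks;
        this.received = new BitSet(totalChunks);
    }

    /** Records a chunk; receiving the same chunk twice is harmless. */
    synchronized void markReceived(int index) {
        if (index < 0 || index >= totalChunks) {
            throw new IndexOutOfBoundsException("chunk " + index);
        }
        received.set(index);
    }

    /** Chunks the client still has to send, for example after a dropped connection. */
    synchronized List<Integer> missingChunks() {
        List<Integer> missing = new ArrayList<>();
        for (int i = received.nextClearBit(0); i < totalChunks; i = received.nextClearBit(i + 1)) {
            missing.add(i);
        }
        return missing;
    }

    synchronized boolean isComplete() {
        return received.cardinality() == totalChunks;
    }
}
```

A client that resumes an upload would first ask for `missingChunks()` and send only those chunks; once `isComplete()` returns true the server can merge the chunks into the final file.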

In addition to these two core functions, there is usually one more function: “conversion”. Conversions include:

  • Image size conversion: an image is converted into several different sizes
  • Watermarking: a watermark is added to an image or video
  • Format conversion:
      • File format conversion: Office to PDF, PDF to Word, PDF to image, Office to image, etc.
      • Video format conversion: MP4 to M3U8, bit rate conversion, etc.

In addition to the business functions above, there are the following non-functional constraints:

  • Security: Whether authentication is required before uploading or downloading
  • Scalability: Supports scaling out to handle increased traffic
  • Availability: As a basic service, availability must be at least four nines (99.99%, roughly 52.6 minutes of downtime per year)
  • Configurability: Provides configurability for conversion mode, upload and download mode
  • Extensibility: Easy to extend functions, such as the extension of transformation mode

Preliminary process

  • Upload process

  • Download process

Preliminary module division

According to functions, it can be divided into the following functional modules:

  • Upload module (core module) : Handles file uploads
  • Download module (core module) : Handles file downloads
  • Conversion module: Handles file type conversion
  • Configuration module: Configures the file service
  • Security module: Security protection for file services

Architecture design

First, the modules are divided roughly using a layered architecture, following the layering of domain-driven design:

  • Application layer: configuration module, security module
  • Domain layer: upload module, download module, conversion module

As you can see from the flow above, the “upload module” depends on the “conversion module”, as follows:

However, the “upload module” is a core module, while the “conversion module” is a non-core module. The functions of core modules are relatively stable, while those of non-core modules are relatively unstable. Making a stable module depend on an unstable module makes the stable module unstable too, so the dependency needs to be “inverted”.
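As a rough illustration of the inversion, the upload module can own a small interface that the conversion module implements, so the compile-time dependency points from the unstable module to the stable one. The names `UploadedFileHandler`, `UploadService` and `ConversionService` are assumptions made for this sketch, not the classes of the actual project.

```java
import java.util.ArrayList;
import java.util.List;

// Upload module (core, stable): it owns the abstraction.
interface UploadedFileHandler {
    void onUploaded(String filePath);
}

class UploadService {
    private final List<UploadedFileHandler> handlers;

    UploadService(List<UploadedFileHandler> handlers) {
        this.handlers = handlers;
    }

    String upload(String fileName) {
        String filePath = "/files/" + fileName; // storing the bytes is omitted
        for (UploadedFileHandler handler : handlers) {
            handler.onUploaded(filePath);       // only the interface is known here
        }
        return filePath;
    }
}

// Conversion module (non-core, unstable): it depends on the upload module's interface.
class ConversionService implements UploadedFileHandler {
    final List<String> converted = new ArrayList<>();

    @Override
    public void onUploaded(String filePath) {
        converted.add(filePath + ".converted"); // the real conversion is omitted
    }
}
```

With this shape, changing or replacing `ConversionService` never forces a change in `UploadService`.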

Dependency inversion solves the module dependency problem. However, conversion is very time-consuming. For example, if a user uploads a video and no conversion is performed, the response arrives as soon as the upload completes. If the conversion runs synchronously, the response may take two, three or even four times as long, which makes for a very bad experience. In general, an uploaded file does not need to be viewable immediately, so the conversion should be an asynchronous process.

Asynchronous execution can be done in many ways, such as events or custom threads. Here it is handled with events (for domain events, see “Domain Design: Domain Events”).

An UploadEvent is created when a file is uploaded. UploadListener listens for UploadEvent, and when it receives one it performs the conversion.
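The idea can be sketched without any framework. The following JDK-only example is an illustration under assumptions, not the Spring-based implementation used later in this article: `SimpleEventBus` and `AsyncUploadDemo` are hypothetical names, and the `UploadEvent` here is a simplified stand-in that carries only a file path.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class UploadEvent {
    final String filePath;

    UploadEvent(String filePath) {
        this.filePath = filePath;
    }
}

/** Minimal in-process bus: publish() returns at once, listeners run on a worker thread. */
class SimpleEventBus {
    private final List<Consumer<UploadEvent>> listeners = new CopyOnWriteArrayList<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    void subscribe(Consumer<UploadEvent> listener) {
        listeners.add(listener);
    }

    void publish(UploadEvent event) {
        for (Consumer<UploadEvent> listener : listeners) {
            worker.submit(() -> listener.accept(event));
        }
    }

    /** Stops accepting events and waits for the queued listeners to finish. */
    boolean shutdownAndWait(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

class AsyncUploadDemo {
    public static void main(String[] args) {
        SimpleEventBus bus = new SimpleEventBus();
        // This listener stands in for UploadListener and the conversion it triggers.
        bus.subscribe(e -> System.out.println("converting " + e.filePath));
        bus.publish(new UploadEvent("/files/maps.mp4"));
        System.out.println("upload response sent"); // not delayed by the conversion
        bus.shutdownAndWait(5);
    }
}
```

The upload path only pays the cost of queueing the event; the slow work happens on the worker thread.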

Once the conversion is asynchronous, how is the client informed of the conversion result? There are several options:

  • After the upload is complete, the file service returns a token. The business system later uses the token to obtain the converted URL. This option requires two requests from the business system.
  • After the conversion is complete, the file service stores the result in a database, and the business system reads the result from the database. This option also requires a second request from the business system, and the implementation varies with business needs.
  • After the conversion is complete, the file service calls back the business system. This option may require implementing a different callback interface for each business system.
  • The file service returns a pre-generated URL that returns a specific status code while the conversion is not yet complete and returns the file once the conversion is complete. In some scenarios the URL cannot be generated in advance. For example, in Office-to-image conversion one document is converted into multiple images, and the image URLs cannot be known before the conversion.

At present, the first option is the mainstream approach, but to keep the file service widely applicable it needs to support multiple options. Therefore, the notification after conversion is also event-based: after the conversion, a corresponding event is created, and the objects listening for that event perform the corresponding processing. One possible process is as follows:

  • After the upload is complete, the file service returns the original file address and a token. The business system creates a listener for this token in Redis
  • After the conversion is complete, the file service creates a conversion event. The conversion event listener receives this event and sends a notification to Redis
  • The business system receives the notification and updates the URL

In addition, downloads are in practice served directly by a web server such as Nginx, so the download module can stand on its own as an independent module.

For configuration modules, there are two types of configuration:

  • Configuration information required by the file service itself, for example the directory in which uploaded files are stored. This is “static configuration”
  • The individual configuration required by each calling system. For example, some systems need 100*100 images, while others need 200*200 images. This is “dynamic configuration”

“Static configuration” can be kept in a properties file. “Dynamic configuration” differs from one calling system to another. Therefore, a configuration class is created for each resource type, such as images and videos, and a repository (named Respository in the code) builds the configuration objects dynamically based on the parameters.

The overall structure is as follows:

Process adjustment

Based on the above design, the process needs to be adjusted accordingly.

  • Upload process

The download process remains the same. There is a new process for obtaining the link to the converted file:

Module adjustment

The modules have been adjusted accordingly. A message module has been added to handle sending and listening for messages. These messages are domain events, so the module is placed in the domain layer.

Architecture verification

Business process validation

Upload process:

  • The client uploads files
  • The request is authenticated by the “security module”. If authentication fails, an authentication failure message is returned
  • If authentication succeeds, the file is uploaded through the “upload module”
  • The “upload module” builds the upload event and adds it to the message bus
  • The upload completes and a message is returned to the user. The message contains the original file URL and, if conversion is required, the token corresponding to the conversion
  • The “conversion module” listens for the upload event and converts the file according to the settings in the “configuration module”
  • The “conversion module” builds the conversion event and adds it to the message bus
  • The corresponding listener receives the conversion event and performs subsequent processing, for example storing the information or notifying the business system

Download process:

  • The client downloads the file
  • The request is authenticated by the “security module”. If authentication fails, an authentication failure message is returned
  • If authentication succeeds, the file is downloaded through the “download module”

Process for getting the real link:

  • The client sends a token to obtain the real link
  • The “download module” uses the token to check whether the file has been converted successfully
  • If the conversion has succeeded, the converted URL is returned
  • Otherwise, a status code is returned
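The token lookup in the steps above can be sketched as follows. This is a simplified illustration: the in-memory `ConversionResultStore`, the class names, and the choice of 200 and 202 as status codes are all assumptions made for the example, since the article does not fix the concrete status code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the stored conversion results. */
class ConversionResultStore {
    private final Map<String, String> urlByToken = new ConcurrentHashMap<>();

    void saveConverted(String token, String convertedUrl) {
        urlByToken.put(token, convertedUrl);
    }

    Optional<String> findUrl(String token) {
        return Optional.ofNullable(urlByToken.get(token));
    }
}

class RealUrlResponse {
    final int status;  // HTTP-style status code (the values are assumptions)
    final String url;  // null until the conversion has finished

    RealUrlResponse(int status, String url) {
        this.status = status;
        this.url = url;
    }
}

class RealUrlService {
    static final int OK = 200;
    static final int NOT_READY = 202; // assumed code for "still converting"

    private final ConversionResultStore store;

    RealUrlService(ConversionResultStore store) {
        this.store = store;
    }

    /** Returns the converted URL if the conversion is done, otherwise only a status code. */
    RealUrlResponse realUrl(String token) {
        return store.findUrl(token)
                .map(url -> new RealUrlResponse(OK, url))
                .orElseGet(() -> new RealUrlResponse(NOT_READY, null));
    }
}
```

The business system can poll `realUrl` with the token it received at upload time until the status indicates that the converted URL is available.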

Validation of non-functional constraints

  • Security: Guaranteed by the security module
  • Scalability: Downloads can be served through a CDN. For uploads, the file service itself is stateless, which makes it easy to scale out
  • Availability: Multi-point deployment is supported and common failover methods can be used
  • Configurability: Guaranteed by configuration modules
  • Extensibility: Functions are extended by adding event response objects based on event handling

For example, suppose we now need to add an “instant upload” feature: if a file already exists on the server, no upload is performed and the file URL is returned directly. The following extensions are needed:

  • Add storage logic that stores the mapping between file hashes and file addresses
  • Add a new interface that checks a file hash and returns the file URL if the hash already exists, or false otherwise
  • Add a synchronous listener for UploadEvent. When a file is uploaded successfully, hash the file and save the data to the storage created above

The above changes do not require any changes to the existing process.
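A minimal sketch of the hash index behind this extension might look like the following. It is an illustration under assumptions: the article does not name a hash algorithm, so SHA-256 is used here, the map is an in-memory stand-in for the real storage, and `FileHashIndex` is a hypothetical name. The lookup returns an empty `Optional` where the text above says “false”.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class FileHashIndex {
    private final Map<String, String> urlByHash = new ConcurrentHashMap<>();

    /** SHA-256 of the content as lowercase hex (the algorithm is an assumption). */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Would be called by the synchronous UploadEvent listener after a successful upload. */
    void record(byte[] content, String fileUrl) {
        urlByHash.putIfAbsent(sha256Hex(content), fileUrl);
    }

    /** The hash check interface: the URL if the file already exists, otherwise empty. */
    Optional<String> findByHash(String hash) {
        return Optional.ofNullable(urlByHash.get(hash));
    }
}
```

The client computes the hash locally and calls the check interface first; only when the result is empty does it go through the normal upload process.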

Technology selection

  • The company's core language is Java, so Java is preferred for development
  • The framework is SpringBoot, based on the following considerations:
      • SpringBoot is the de facto standard framework for JavaEE development today
      • It can be deployed independently or upgraded to SpringCloud-based microservices, which eases migration to a microservices architecture
  • Configuration information is stored in properties files rather than in a database, based on the following considerations:
      • Static configuration does not need to be modified
      • Dynamic configuration is unlikely to change, and SpringBoot itself supports real-time configuration updates if necessary
      • A microservice deployment can be combined with a distributed configuration server to achieve dynamic configuration
      • There is no need to deploy a database or design table structures, which saves deployment and design time. For extensibility, however, the configuration logic needs to be abstracted so that other persistence methods can be supported
  • The conversion result information is stored in files, based on the following considerations:
      • The result information is read infrequently, typically only once
      • As a file service, it makes sense to use file storage
      • There is no need to deploy a database or design table structures, which saves deployment and design time

Implementation

The code structure is consistent with the architecture diagram

Event implementation

Events link the whole upload process together:

  • A file upload triggers UploadEvent
  • UploadListener listens for UploadEvent and delegates file processing to each Converter
  • ConvertEvent is fired after the conversion is complete
  • ConvertListener listens for ConvertEvent and processes the conversion information

Since most of these are internal events, Spring events are used to handle them. The code logic is as follows:

    // Spring's default thread pool has no size limit. If tasks block, this can cause an OOM
    @Bean("eventThread")
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        // Set the number of core threads. Conversion is time-consuming, so tasks are queued
        executor.setCorePoolSize(1);
        // Set the maximum number of threads
        executor.setMaxPoolSize(1);
        // Set the queue capacity
        executor.setQueueCapacity(100);
        // Set the thread keep-alive time (in seconds)
        executor.setKeepAliveSeconds(60);
        // Set the default thread name prefix
        executor.setThreadNamePrefix("eventThread-");
        // Set the rejection policy
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        // Wait for all tasks to finish before closing the thread pool
        executor.setWaitForTasksToCompleteOnShutdown(true);
        return executor;
    }

    /**
     * Internal message bus
     */
    @Service
    @EnableAsync
    public class EventBus implements ApplicationEventPublisherAware {
        private ApplicationEventPublisher publisher;

        @Override
        public void setApplicationEventPublisher(ApplicationEventPublisher applicationEventPublisher) {
            this.publisher = applicationEventPublisher;
        }

        public void add(ApplicationEvent event) {
            publisher.publishEvent(event);
        }
    }

    public class UploadEvent extends ApplicationEvent {
        public UploadEvent(Object source) {
            super(source);
        }
    }

    public class ConvertEvent extends ApplicationEvent {
        public ConvertEvent(Object source) {
            super(source);
        }
    }

    @Component
    public class UploadListener {
        @EventListener
        @Async("eventThread") // Use the custom thread pool
        public void process(UploadEvent event) {
        }
    }

    @Component
    public class ConvertListener {
        @EventListener
        @Async("eventThread")
        public void process(ConvertEvent event) {
        }
    }
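To see what the single thread, the bounded queue and CallerRunsPolicy mean in practice, here is a rough plain-JDK counterpart of the thread pool settings above. It is a sketch, not the Spring bean itself: `BoundedEventPool` is a hypothetical name, and an `ArrayBlockingQueue` is used here as the bounded queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedEventPool {
    /** One worker thread, a bounded queue, and CallerRunsPolicy as back-pressure. */
    static ThreadPoolExecutor create(int queueCapacity) {
        return new ThreadPoolExecutor(
                1, 1,                                        // core and maximum pool size
                60, TimeUnit.SECONDS,                        // keep-alive time for idle threads
                new ArrayBlockingQueue<>(queueCapacity),     // tasks wait here while the worker is busy
                new ThreadPoolExecutor.CallerRunsPolicy());  // a full queue makes the submitter run the task
    }

    /** Lets the queued tasks finish, then reports whether the pool terminated in time. */
    static boolean shutdownAndWait(ThreadPoolExecutor pool, long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With these settings, conversions run one at a time and up to `queueCapacity` of them wait in memory. When the queue is full, the thread that published the event runs the conversion itself, which slows new submissions down instead of letting pending tasks accumulate without limit.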

Configuration management implementation

To improve the flexibility of the file service, the conversion logic is configurable. If nothing is configured, no conversion is performed.

The following four classes hold the configuration for each file type:

  • ImageConfig: size of the generated image
  • OfficeConfig: conversion type, whether to get the page count
  • PdfConfig: conversion type, whether to get the page count
  • VideoConfig: conversion type, whether to get the duration, whether to capture frames

Each has a corresponding Respository class, which is the repository that stores and restores it:

  • ImageConfigRespository
  • OfficeConfigRespository
  • PdfConfigRespository
  • VideoConfigRespository

These use properties-based configuration (see “Technology selection” for the reasons). Take VideoConfigRespository as an example:

    @Configuration
    @ConfigurationProperties(prefix = "fileupload.config")
    public class VideoConfigRespository {
        private List<VideoConfig> videoConfigList;

        /**
         * Find the video configurations for the given group
         */
        public List<VideoConfig> find(String group) {
            if (videoConfigList == null) {
                return new ArrayList<>();
            } else {
                return videoConfigList.stream()
                        .filter(it -> it.getGroup().equals(group))
                        .collect(Collectors.toList());
            }
        }

        public List<VideoConfig> getVideoConfigList() {
            return videoConfigList;
        }

        public void setVideoConfigList(List<VideoConfig> videoConfigList) {
            this.videoConfigList = videoConfigList;
        }
    }

Spring's @ConfigurationProperties annotation binds the entries in the properties file to videoConfigList.

    # video configuration
    # default configuration
    fileupload.config.videoConfigList[0].group=GROUP1
    # converted to webm
    fileupload.config.videoConfigList[1].group=GROUP2
    fileupload.config.videoConfigList[1].type=webm
    # take the picture of the third second
    fileupload.config.videoConfigList[1].frameSecondList[0]=3

Conversion result implementation

The conversion result is represented by ConvertResult and ConvertFileInfo:

  • ConvertResult contains the source file information and multiple conversion results. ConvertFileInfo represents one conversion result
  • ConvertResult is an entity and ConvertFileInfo is a value object (VO)
  • ConvertResult and ConvertFileInfo have a one-to-many relationship
  • Together they form an aggregate, with ConvertResult as the aggregate root (see “Domain Design: Aggregates and Aggregate Roots” for details)
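The relationship described above can be sketched roughly as follows. The field names `token`, `filePath`, `fileType` and `convertFileInfoList` are taken from the JSON returned by the service; everything else, including the reduced set of fields and the method `addConvertFileInfo`, is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Value object: no identity of its own, immutable, compared by its fields. */
final class ConvertFileInfo {
    private final String fileType;
    private final String filePath;

    ConvertFileInfo(String fileType, String filePath) {
        this.fileType = fileType;
        this.filePath = filePath;
    }

    String getFileType() { return fileType; }
    String getFilePath() { return filePath; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConvertFileInfo)) {
            return false;
        }
        ConvertFileInfo other = (ConvertFileInfo) o;
        return Objects.equals(fileType, other.fileType) && Objects.equals(filePath, other.filePath);
    }

    @Override
    public int hashCode() { return Objects.hash(fileType, filePath); }
}

/** Entity and aggregate root: identified by its token, owns its ConvertFileInfo list. */
final class ConvertResult {
    private final String token;
    private final String filePath; // the source file
    private final List<ConvertFileInfo> convertFileInfoList = new ArrayList<>();

    ConvertResult(String token, String filePath) {
        this.token = Objects.requireNonNull(token);
        this.filePath = filePath;
    }

    String getToken() { return token; }
    String getFilePath() { return filePath; }

    /** Every change to the children goes through the root. */
    void addConvertFileInfo(ConvertFileInfo info) {
        convertFileInfoList.add(Objects.requireNonNull(info));
    }

    /** Read-only view, so callers cannot bypass the root. */
    List<ConvertFileInfo> getConvertFileInfoList() {
        return Collections.unmodifiableList(convertFileInfoList);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConvertResult && token.equals(((ConvertResult) o).token);
    }

    @Override
    public int hashCode() { return token.hashCode(); }
}
```

Two ConvertResult objects with the same token are the same entity even if their contents differ, while two ConvertFileInfo objects are equal only when all of their fields match.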

ConvertResultRespository is the repository for this aggregate and is used to save and restore it. Instead of using a database, it saves the aggregate directly as a text file (see “Technology selection” for the reasons).

    @Component
    public class ConvertResultRespository {
        ......

        /**
         * Save the conversion result as a JSON file named after its token
         *
         * @param result
         */
        public void save(ConvertResult result) {
            Path savePath = Paths.get(tokenPath, result.getToken());
            try {
                if (!Files.exists(savePath.getParent())) {
                    Files.createDirectories(savePath.getParent());
                }
                Files.write(savePath, gson.toJson(result).getBytes(UTF8_CHARSET));
            } catch (IOException e) {
                logger.error("save ConvertResult[{}] error!", result, e);
            }
        }

        /**
         * Find the conversion result by token
         *
         * @param token
         * @return the conversion result, or null if it does not exist or cannot be read
         */
        public ConvertResult find(String token) {
            Path findPath = Paths.get(tokenPath, token);
            try {
                if (Files.exists(findPath)) {
                    String result = new String(Files.readAllBytes(findPath), UTF8_CHARSET);
                    return gson.fromJson(result, ConvertResult.class);
                }
            } catch (IOException e) {
                logger.error("find ConvertResult by token[{}] error!", token, e);
            }
            return null;
        }
    }

Conversion service implementation

The conversion service delegates the actual work to the utility class that matches the configuration (the code is omitted):

  • Video is converted using FFmpeg
  • PDF is converted using PDFBox
  • Office documents are converted using LibreOffice
Security implementation

  • Security is implemented through Spring interceptors
  • Interceptors are added as required

Usage

The service provides two interfaces:

    /**
     * get the converted file information by token
     */
    @ResponseBody
    @GetMapping(value = "/realUrl/{token}")
    public ResponseEntity realUrl(@PathVariable String token) {
        .....
    }

    /**
     * upload file
     */
    @ResponseBody
    @PostMapping(value = {"/partupload/{group}"})
    public ResponseEntity upload(HttpServletRequest request, @PathVariable String group) {
        .....
    }
  • Files are uploaded through the upload interface, which also supports chunked upload

  • After the upload is complete, the upload result is returned as follows:

    {
      "code": 1,
      "message": "maps.mp4",
      "token": "key_286400710002612",
      "group": "GROUP1",
      "fileType": "VIDEO",
      "filePath": "www.abc.com/15561725229…"
    }

  • Here filePath is the path of the original file

  • With the token, the realUrl interface can be used to obtain the converted file information. The structure is as follows:

    {
      "token": "key_282816586380196",
      "group": "SHILU",
      "fileType": "IMAGE",
      "filePath": "www.abc.com/SHILU/1/155…",
      "convertFileInfoList": [
        {
          "fileLength": 0,
          "fileType": "IMAGE",
          "filePath": null,
          "imgPaths": ["www.abc.com/SHILU/1/155…"]
        }
      ]
    }

Configuration

    ## file server configuration
    fileupload.server.name=http://www.abc.com
    # libreoffice home
    office.home=/snap/libreoffice/115/lib/libreoffice
    # file upload save path
    fileupload.upload.root=/home/files

    ## dynamic configuration
    # image configuration, cut images to 100*200
    fileupload.config.imageConfigList[0].group=group1
    fileupload.config.imageConfigList[0].width=100
    fileupload.config.imageConfigList[0].height=200

    ## video configuration
    # default configuration, convert to m3u8
    fileupload.config.videoConfigList[0].group=group1
    # convert to webm, take a frame at second 3
    fileupload.config.videoConfigList[1].group=group2
    fileupload.config.videoConfigList[1].type=webm
    fileupload.config.videoConfigList[1].frameSecondList[0]=3

    ## office configuration
    # default, convert to PNG
    fileupload.config.officeConfigList[0].group=group1
    # convert to PDF
    fileupload.config.officeConfigList[0].type=PDF

    ## PDF configuration
    # convert to PNG
    fileupload.config.pdfConfigList[0].group=group1

    # upload file size, set when the front end does not support chunked upload
    spring.servlet.multipart.max-file-size=1024MB
    spring.servlet.multipart.max-request-size=1024MB

Conclusion

This article presents a relatively complete architecture design and implementation process for a file service. The whole architecture design process is as follows:

  • Sort out the business functions
  • Sort out the use-case flows
  • Divide the system into preliminary modules based on the business functions
  • Design the architecture in combination with the use-case flows. The modules and flows may need to be adjusted in turn
  • Validate the architecture:
      • Business process validation: apply the use cases to the architecture
      • Non-functional constraint validation: simulate the non-functional constraint scenarios
  • Select the technology (the architecture design itself is technology-agnostic)
  • Implement and test the code following the architecture design (the architecture may be adjusted along the way)
  • Verify the complete process through the usage instructions

Throughout the process, a decision is made and verified for each constraint. The code structure matches the architecture design, so the code logic can be understood by following the architecture diagram.

If anything here is wrong or has been overlooked, corrections and discussion are welcome!