Metadata for MinIO

Data storage

MinIO has no metadata database, and every operation works at object-level granularity. This approach has the following advantages:

  • The failure of an individual object does not spread into a larger system failure.

  • Strong consistency is easy to implement. This property matters for machine learning and big data processing.

Data management

Metadata and data are stored together on disk: data shards are stored on the disks, and metadata is stored in plain text in a metadata file (xl.json). Assume that the object name is obj-with-metadata, the bucket name is bucket_name, and disk is the path of any disk in the erasure set where the object resides. Then the directory

disk/bucket_name/obj-with-metadata

records information about this object on this disk. Its contents are as follows:

xl.json

xl.json is the metadata file for this object. Its content is a JSON string of the following form:

Field descriptions
The format field

This field specifies that the format of the object is xl. MinIO stores data in two main formats: xl and fs. MinIO started with the following command uses the fs storage format:

This mode is mainly used for testing, and many of its object storage APIs are stub functions that are not actually implemented. In production, the storage format is xl for local distributed cluster deployment, federated deployment, and CSA deployment.

part.1: the first data shard of the object
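The per-disk layout described above (an object directory that holds xl.json and part.1) can be sketched in a few lines of Java. This is only an illustration: the class name, method names, and the /mnt/disk1 mount point are hypothetical, and the sketch assumes that further parts follow the same part.N naming.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch of the per-disk object layout; not part of any MinIO API. */
final class ObjectLayoutSketch {

    // Directory that holds one object's files on one disk: disk/bucket/object
    static Path objectDir(Path disk, String bucket, String object) {
        return disk.resolve(bucket).resolve(object);
    }

    // Plain-text metadata file of the object on this disk.
    static Path metadataFile(Path disk, String bucket, String object) {
        return objectDir(disk, bucket, object).resolve("xl.json");
    }

    // Shard file for the given part number (1-based), for example part.1.
    static Path partFile(Path disk, String bucket, String object, int partNumber) {
        if (partNumber < 1) {
            throw new IllegalArgumentException("part numbers start at 1");
        }
        return objectDir(disk, bucket, object).resolve("part." + partNumber);
    }

    public static void main(String[] args) {
        Path disk = Paths.get("/mnt/disk1"); // hypothetical mount point
        System.out.println(metadataFile(disk, "bucket_name", "obj-with-metadata"));
        System.out.println(partFile(disk, "bucket_name", "obj-with-metadata", 1));
    }
}
```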

The stat field

Records the state of this object, including its size and modification time, as shown below:

The erasure field

This field records the erasure-coding information for this object, as shown in the following figure:

  • algorithm indicates that this object uses the erasure-code implementation by Klaus Post, whose generator matrix is a Vandermonde matrix.

  • data, parity: the number of data disks and parity disks in the erasure set.

  • blockSize: the size of the blocks into which the object is divided. The default value is 5 MB (see data distribution and balancing in the previous section).

  • index: the number of the current disk within the erasure set.

  • distribution: the number of data disks and parity disks in each erasure set is fixed, but the order in which the shards of different objects are written to the disks of the set differs. This field records that order.

  • checksum: the number of entries under this field depends on the number of shards of this object. In older versions of MinIO, the checksum that the hash function computed for each shard was recorded here in the metadata file. Recent versions of MinIO write the checksum directly into the first 32 bytes of the shard file (part.1, etc.).

highwayhash256S indicates that the checksum value is written into the shard file.
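To make the distribution and blockSize fields more concrete, here is a minimal Java sketch. It assumes that the write order is a rotation of 1..n whose starting offset comes from a CRC32 hash of the object name, and that each block is split evenly across the data shards. The class and method names are hypothetical, and this is an illustration under those assumptions rather than MinIO's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Illustrative sketch of per-object shard placement; names are hypothetical. */
final class ErasureLayoutSketch {

    // Assumption: hash the object name to pick a starting offset, then rotate
    // the disk numbers 1..n from that offset. Every object gets a permutation
    // of the same disks, but different objects start at different disks.
    static int[] distribution(String objectName, int disksInSet) {
        if (disksInSet <= 0) {
            throw new IllegalArgumentException("erasure set must contain at least one disk");
        }
        CRC32 crc = new CRC32();
        crc.update(objectName.getBytes(StandardCharsets.UTF_8));
        int start = (int) (crc.getValue() % disksInSet);
        int[] order = new int[disksInSet];
        for (int i = 1; i <= disksInSet; i++) {
            order[i - 1] = 1 + ((start + i) % disksInSet);
        }
        return order;
    }

    // Size of one shard of a full block: the block is divided evenly across
    // the data shards, rounding up when it does not divide exactly.
    static long shardSize(long blockSize, int dataShards) {
        if (blockSize < 0 || dataShards <= 0) {
            throw new IllegalArgumentException("invalid block size or shard count");
        }
        return (blockSize + dataShards - 1) / dataShards;
    }

    public static void main(String[] args) {
        // Example erasure set with 4 data disks and 2 parity disks (6 in total).
        System.out.println(Arrays.toString(distribution("obj-with-metadata", 6)));
        System.out.println(shardSize(5L * 1024 * 1024, 4));
    }
}
```

Because the order depends only on the object name and the size of the set, any node can recompute it without consulting a central metadata service, which fits the database-free design described earlier.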

Integrating the MinIO Java client

Our file server uses MinIO, but it had not been extracted into an independent microservice or starter. This section therefore tests the integration and extracts a starter: create a Spring Boot project, integrate MinIO, and verify that a file uploads successfully.

Maven POM dependency

<dependency>
    <groupId>io.minio</groupId>
    <artifactId>minio</artifactId>
    <version>6.0.11</version>
</dependency>

Spring YAML configuration:

minio:
  endpoint: http://192.168.8.50:9000
  accessKey: admin
  secretKey: 123123123

Configuration class MinioProperties:

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;

@Data
@ConfigurationProperties(prefix = "minio")
public class MinioProperties {
    // Connection URL
    private String endpoint;
    // Access key (user name)
    private String accessKey;
    // Secret key (password)
    private String secretKey;
}
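A misconfigured endpoint usually only surfaces when the first request fails, so it can help to check the value at startup. The following is a minimal sketch using only the JDK; the class and method names are hypothetical and are not part of the MinIO SDK or of Spring.

```java
import java.net.URI;

/** Illustrative fail-fast check for the configured endpoint; names are hypothetical. */
final class EndpointCheckSketch {

    // Returns true when the endpoint is an absolute http or https URL with a host.
    static boolean isValidEndpoint(String endpoint) {
        if (endpoint == null || endpoint.trim().isEmpty()) {
            return false;
        }
        try {
            URI uri = URI.create(endpoint.trim());
            String scheme = uri.getScheme();
            boolean httpScheme = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return httpScheme && uri.getHost() != null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps syntax errors in IllegalArgumentException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidEndpoint("http://192.168.8.50:9000"));
        System.out.println(isValidEndpoint("192.168.8.50:9000"));
    }
}
```

Such a check could be called from the configuration class before the client bean is created, so that a missing scheme is reported with a clear message.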

Utility class MinioUtil

import cn.hutool.core.util.StrUtil;
import com.team.common.core.constant.enums.BaseResultEnum;
import com.team.common.core.exception.BusinessException;
import io.minio.MinioClient;
import lombok.AllArgsConstructor;
import lombok.SneakyThrows;
import org.springframework.stereotype.Component;
import org.springframework.web.multipart.MultipartFile;
import java.io.InputStream;
@AllArgsConstructor
@Component
public class MinioUtil {
    private final MinioClient minioClient;
    private final MinioProperties minioProperties;

    /**
     * HTTP file upload.
     * @param bucketName bucket name
     * @param file file to upload
     * @return access address
     */
    public String putFile(String bucketName,MultipartFile file) {
        return this.putFile(bucketName,null,file);
    }

    /**
     * HTTP file upload (with a root folder prefix).
     * @param bucketName bucket name
     * @param folder folder prefix, may be null or empty
     * @param file file to upload
     * @return access address
     */
    public String putFile(String bucketName,String folder,MultipartFile file) {
        String originalFilename = file.getOriginalFilename();
        if (StrUtil.isNotEmpty(folder)){
            originalFilename = folder.concat("/").concat(originalFilename);
        }
        try {
            InputStream in = file.getInputStream();
            String contentType= file.getContentType();
            minioClient.putObject(bucketName, originalFilename, in, null, null, null, contentType);
        } catch (Exception e) {
            e.printStackTrace();
           throw new BusinessException(BaseResultEnum.SYSTEM_EXCEPTION.getCode(),"File upload failed");
        }
        String url = minioProperties.getEndpoint().concat("/").concat(bucketName).concat("/").concat(originalFilename);
        return url;
    }

    /**
     * Create a bucket.
     * @param bucketName bucket name
     */
    public void createBucket(String bucketName){
        try {
            minioClient.makeBucket(bucketName);
        } catch (Exception e) {
            e.printStackTrace();
            throw new BusinessException(BaseResultEnum.SYSTEM_EXCEPTION.getCode(),"Failed to create bucket");
        }
    }

    @SneakyThrows
    public String getBucketPolicy(String bucketName){
        return minioClient.getBucketPolicy(bucketName);
    }
}
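The putFile method builds the access address by plain string concatenation, which produces an invalid URL when a file name contains spaces or other reserved characters. The sketch below shows one way to build the object key and a percent-encoded URL with only the JDK. The class and method names are hypothetical, and the sketch assumes path-style addressing (endpoint/bucket/key) as used in the utility class above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustrative helper for object keys and access URLs; names are hypothetical. */
final class ObjectUrlSketch {

    // Joins an optional folder prefix and a file name into an object key.
    static String objectKey(String folder, String fileName) {
        if (folder == null || folder.isEmpty()) {
            return fileName;
        }
        String prefix = folder.endsWith("/") ? folder.substring(0, folder.length() - 1) : folder;
        return prefix + "/" + fileName;
    }

    // Percent-encodes one path segment. URLEncoder targets form encoding and
    // writes spaces as '+', so those are rewritten to %20 for use in a path.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds endpoint/bucket/key, encoding each segment but keeping the slashes.
    static String objectUrl(String endpoint, String bucket, String key) {
        String base = endpoint.endsWith("/") ? endpoint.substring(0, endpoint.length() - 1) : endpoint;
        StringBuilder url = new StringBuilder(base);
        url.append('/').append(encodeSegment(bucket));
        for (String segment : key.split("/")) {
            url.append('/').append(encodeSegment(segment));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String key = objectKey("2020", "my report.pdf");
        System.out.println(objectUrl("http://192.168.8.50:9000", "docs", key));
    }
}
```

Note that the returned address is only directly reachable if the bucket policy allows anonymous reads; otherwise a presigned URL from the client would be needed.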

Auto-configuration class:

import io.minio.MinioClient;
import io.minio.errors.InvalidEndpointException;
import io.minio.errors.InvalidPortException;
import lombok.AllArgsConstructor;
import org.springframework.boot.autoconfigure.condition.ConditionalOnBean;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@AllArgsConstructor
@Configuration
@EnableConfigurationProperties(MinioProperties.class)
public class MinioAutoConfiguration {
    private final MinioProperties minioProperties;

    @Bean
    public MinioClient minioClient() throws InvalidPortException, InvalidEndpointException {
        MinioClient  client = new MinioClient(minioProperties.getEndpoint(),minioProperties.getAccessKey(),minioProperties.getSecretKey());
        return  client;
    }

    @ConditionalOnBean(MinioClient.class)
    @Bean
    public MinioUtil minioUtil(MinioClient minioClient,MinioProperties minioProperties) {
        return new MinioUtil(minioClient,minioProperties);
    }
}

The spring.factories configuration file

Remove the main entry class and the application.properties configuration file from the starter project (create a separate Spring Boot project for testing and put the configuration there). The most important remaining step is to create a META-INF/spring.factories file under resources and add the classes you want auto-configured to it:

org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
  com.*(your path).MinioAutoConfiguration

Demo:

import com.team.common.core.web.Result;
import com.team.common.minio.MinioUtil;
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import io.swagger.annotations.ApiParam;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PutMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@Api(value = "uploadFile", tags = "uploadFile")
@RequestMapping("uploadFile")
@RestController
public class UploadFileController {

    @Autowired
    private MinioUtil minioUtil;

    @ApiOperation(value = "Common file upload")
    @PutMapping("/upload")
    public Result uploadFile(@ApiParam("Bucket name") String bucketName, @ApiParam("File") MultipartFile file) {
        String url = null;
        try {
           url =  minioUtil.putFile(bucketName,file);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return Result.success(url);
    }
}

For local testing, you can install the starter into your local Maven repository, create a Spring Boot project, and add the starter dependency to its POM.

<dependency>
            <groupId>com.jxwy</groupId>
            <artifactId>minio-starter</artifactId>
            <version>0.0.1-SNAPSHOT</version>
</dependency>

Comparison with other object storage services

Vendor support

Many vendors in China use Ceph or sell self-developed storage products based on it. If you run into problems with Ceph (sometimes you even need to modify, enhance, or re-implement parts of Ceph itself), you can turn to these vendors for support. Internationally, Inktank, the company behind Ceph, was acquired by Red Hat, which was later acquired by IBM.

MinIO is developed and supported only by MinIO, Inc. Because of its modern architecture and implementation language, MinIO's code is easy to read and modify. Hiring Go programmers to maintain MinIO costs less than hiring C++ programmers to maintain Ceph.

Multi-language client SDKs

Both have clients for common programming languages such as Python and Java. MinIO's SDKs also support the purely functional language Haskell.

Technical documentation

There is almost no documentation of MinIO's internal implementation. To understand the internals, you have to go to minio.slack.com/ and talk to the MinIO developers directly, or read the code yourself. Ceph's implementation and algorithm documentation is very rich. In this respect Ceph is much more mature than MinIO.

Ceph vs. MinIO

MinIO and Ceph are the typical representatives of open source object storage software. This section compares the features of the two systems to help you select an appropriate product.

MinIO advantages

Extremely simple to deploy

The MinIO server consists of a single executable file, minio, which depends on almost no shared libraries or RPM/APT packages. MinIO has very few configuration items (mostly system-level settings such as kernel parameters) and can run without any of them. Searching Baidu, Google, or Bing turns up hardly any pages about MinIO deployment problems, and in practice few users run into such problems.

In contrast, Ceph's modules are spread across many RPM and APT packages with many configuration items, which makes Ceph difficult to deploy and tune. The Ceph packages in some Linux distributions are even buggy, so users have to modify Ceph's Python scripts by hand to complete the installation.

Easy secondary development

Apart from a small amount of assembly code, MinIO is written in Go. Ceph is written in C++, a language that is notoriously difficult to learn and use. Go is a newer language that has absorbed many lessons from other languages, especially C++, and its features are relatively modern.

As a result, maintaining MinIO and doing secondary development on it are comparatively easy.

Gateway mode supports multiple storage backends

Through gateway mode, MinIO can use various existing storage types as its back end, such as NAS systems, Microsoft Azure Blob Storage, Google Cloud Storage, HDFS, Alibaba OSS, and Amazon S3, which helps enterprises reuse existing resources. In this way, enterprises can move from existing systems to object storage smoothly and at low cost (the hardware cost is about zero, and you only need to deploy the MinIO object storage software).

Ceph advantages

Ceph offers richer data redundancy policies: it supports both replication and erasure coding, while MinIO supports only erasure coding. Ceph object storage is therefore better suited to organizations that require high data reliability.
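The trade-off between the two redundancy policies comes down to simple arithmetic: replication stores every byte N times, while erasure coding stores (data + parity) / data bytes per usable byte. The following Java sketch works through that calculation. The class name, method names, and the 3-replica and 8+4 figures are illustrative assumptions, not defaults of Ceph or MinIO.

```java
/** Illustrative arithmetic for redundancy overhead; all figures are examples. */
final class RedundancyOverheadSketch {

    // Raw bytes stored per usable byte when every object is kept as N full copies.
    static double replicationOverhead(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("need at least one copy");
        }
        return replicas;
    }

    // Raw bytes stored per usable byte with erasure coding over data + parity shards.
    static double erasureOverhead(int dataShards, int parityShards) {
        if (dataShards < 1 || parityShards < 0) {
            throw new IllegalArgumentException("invalid shard counts");
        }
        return (double) (dataShards + parityShards) / dataShards;
    }

    public static void main(String[] args) {
        // 3 replicas survive the loss of 2 copies at 3.0x raw storage.
        System.out.println(replicationOverhead(3));
        // An 8 + 4 erasure set survives the loss of 4 shards at 1.5x raw storage.
        System.out.println(erasureOverhead(8, 4));
    }
}
```

Erasure coding is cheaper in raw capacity, while replication is simpler to read from and to repair, which is why a system that offers both lets operators choose per workload.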

Reference hardware

MinIO conforms to the software-defined storage (SDS) model. It is compatible with mainstream x86 servers and ARM/Phytium platforms, and can be ported to hardware platforms such as Sunway (Alpha architecture) and Loongson (MIPS architecture).

The following industry-standard, widely used servers have been tested and optimized by MinIO, Inc. and perform well with MinIO object storage software:

Conclusion

From the above discussion, MinIO and Ceph are both excellent object storage software, and each has its own advantages. Users should choose between them based on their requirements and in-house technical expertise.
