Traditional SAN storage devices use a dual-controller architecture in which the two controllers back each other up, and two switches connect the controllers to the front-end servers. This dual-controller architecture has the following disadvantages:

1. Network bandwidth easily becomes the bottleneck of overall storage performance.

2. If one controller fails, system performance drops sharply, affecting normal use of the storage.

The limitations of the traditional storage architecture are as follows:

1. Poor horizontal scalability

Because a storage device's service capability is bounded by its front-end controllers, vertically adding more disks cannot improve it. At the same time, the horizontal scalability of front-end controllers is very limited: in practice, products scale out to only a handful of controllers at most. The front-end controller therefore becomes the bottleneck of overall storage performance.

2. Management differences among devices from different vendors

The management and usage models of devices from different vendors vary. Because of constraints such as tight software-hardware coupling and incompatible management interfaces, unified resource management and elastic resource scheduling cannot be achieved, and storage utilization stays low. The coexistence of heterogeneous storage thus reduces both the convenience and the utilization of storage.

Distributed storage uses a distributed system architecture: multiple storage servers share the storage load, and location servers index where data is stored. This not only improves the reliability, availability, and access efficiency of the system, but is also easy to scale and mitigates the instability of commodity hardware. The advantages are as follows:

1. High performance

A high-performance distributed storage system typically manages read and write caches efficiently and supports automatic storage tiering. Distributed storage improves system response time by mapping data in hotspot areas to high-speed storage; once those areas are no longer hot, the system moves them back out of high-speed storage. Write caching can likewise significantly improve the performance of the whole system: according to a configured policy, data is first written to high-speed storage and then destaged (flushed) to slower media at an appropriate time.
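
The following Python sketch illustrates these two mechanisms in miniature: promoting hot blocks to a fast tier, and write-back caching with deferred destaging. It is purely illustrative, not any vendor's implementation; the capacity and threshold constants are assumed values chosen for demonstration.

```python
# Illustrative sketch only: hot-block promotion plus a write-back cache.
from collections import defaultdict

FAST_TIER_CAPACITY = 2   # blocks the fast tier can hold (tiny, for demonstration)
HOT_THRESHOLD = 3        # reads before a block counts as "hot"

fast_tier, slow_tier = {}, {}      # block_id -> data
read_counts = defaultdict(int)
dirty = set()                      # cached writes not yet destaged

def read(block_id):
    read_counts[block_id] += 1
    if block_id in fast_tier:
        return fast_tier[block_id]
    data = slow_tier[block_id]
    if read_counts[block_id] >= HOT_THRESHOLD:      # promote a hot block
        if len(fast_tier) >= FAST_TIER_CAPACITY:    # demote the coldest first
            coldest = min(fast_tier, key=lambda b: read_counts[b])
            slow_tier[coldest] = fast_tier.pop(coldest)
            dirty.discard(coldest)                  # demotion also destages it
        fast_tier[block_id] = data
    return data

def write(block_id, data):
    # Write-back: land in high-speed storage first, destage later.
    fast_tier[block_id] = data
    dirty.add(block_id)

def destage():
    # Run periodically ("at an appropriate time") to flush dirty blocks.
    for block_id in list(dirty):
        slow_tier[block_id] = fast_tier[block_id]
        dirty.discard(block_id)
```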

2. Support for tiering

Because nodes are loosely coupled over the network, distributed storage allows high-speed and low-speed storage to be deployed separately or mixed in any proportion. Tiered storage pays off most in unpredictable business environments and agile applications. It also addresses the biggest problem of tiered cache storage: when a read misses the performance pool, the granularity of data fetched from the cold pool is too large, causing high latency and overall performance jitter.
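
The granularity point can be made concrete with a small sketch: on a performance-pool miss, fetch only the small chunk containing the requested range rather than the whole object. The chunk size here is an assumed, tunable parameter, not a value from any particular product.

```python
# Illustrative read path: fine-grained fetches from the cold pool on a miss.
CHUNK_SIZE = 64 * 1024   # 64 KiB fetch granularity (assumption)

perf_pool = {}   # (object_id, chunk_index) -> bytes
cold_pool = {}   # object_id -> bytes (complete object)

def read(object_id, offset, length):
    out = bytearray()
    end = offset + length
    while offset < end:
        idx = offset // CHUNK_SIZE
        chunk = perf_pool.get((object_id, idx))
        if chunk is None:
            # Miss: pull one small chunk, not the whole object, keeping
            # latency bounded and avoiding performance jitter.
            start = idx * CHUNK_SIZE
            chunk = cold_pool[object_id][start:start + CHUNK_SIZE]
            perf_pool[(object_id, idx)] = chunk
        lo = offset - idx * CHUNK_SIZE
        hi = min(end - idx * CHUNK_SIZE, CHUNK_SIZE)
        out += chunk[lo:hi]
        offset = idx * CHUNK_SIZE + hi
    return bytes(out)
```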

3. Consistency of multiple copies

Unlike traditional storage architectures, which use RAID to ensure data reliability, distributed storage uses a multi-copy backup mechanism. Before storing data, distributed storage shards it and places the shards on cluster nodes according to fixed rules. To keep the copies consistent, distributed storage uses a strong-consistency scheme in which data is written through one copy and read from multiple copies, combined with mirroring, striping, and distributed parity to meet tenants' reliability requirements. If a read fails, the system reads the data from another copy and rewrites the failed copy to recover it, keeping the total number of copies fixed. If data remains inconsistent for an extended period, the system restores it automatically, and tenants can set bandwidth limits on recovery traffic to minimize the impact on running services.
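
The sketch below shows one possible realization of this behaviour: a deterministic placement rule spreads copies across nodes, a write completes only once every copy is in place, and a read that finds a missing copy repairs it from a healthy one so the copy count stays fixed. The three-replica layout and in-memory "nodes" are assumptions for illustration only.

```python
# Toy model of multi-copy placement, strong-consistency writes, and read repair.
import hashlib

NODES = [dict() for _ in range(5)]   # five storage nodes, each a key -> value map
REPLICAS = 3

def placement(key):
    # Deterministic rule mapping a key to REPLICAS distinct nodes.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return [(h + i) % len(NODES) for i in range(REPLICAS)]

def write(key, value):
    # The write is acknowledged only after all copies are written.
    for n in placement(key):
        NODES[n][key] = value

def read(key):
    copies = placement(key)
    for n in copies:
        value = NODES[n].get(key)
        if value is not None:
            # Read repair: restore any missing or stale copy from a good one,
            # keeping the total number of copies fixed.
            for m in copies:
                if NODES[m].get(key) != value:
                    NODES[m][key] = value
            return value
    raise IOError(f"all {REPLICAS} copies of {key!r} lost")
```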

4. Disaster recovery and backup

In distributed storage, an important disaster recovery (DR) method is multi-point-in-time snapshot technology, which lets the production system save a version of its data at fixed intervals. In particular, it supports extracting several snapshots and recovering them simultaneously, which is very useful for locating faults caused by logic errors: if the user has multiple servers or virtual machines available for recovery, comparing and analyzing the recovered snapshots can quickly isolate the point in time at which the fault occurred, reducing both the difficulty and the time of fault location. The same capability is also very useful for reproducing failures for analysis and research, helping avoid future disasters. Together, multi-copy technology, striped data placement, multi-point-in-time snapshots, and periodic incremental replication give distributed storage its high reliability.
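
A toy model of that fault-location workflow, under the assumption that snapshots are full copies and corruption can be tested with a predicate:

```python
# Keep periodic snapshots, then scan them in time order to bracket the
# interval in which a logic error first corrupted the data.
import copy

snapshots = []   # list of (timestamp, dataset_copy), oldest first

def take_snapshot(timestamp, dataset):
    snapshots.append((timestamp, copy.deepcopy(dataset)))

def first_bad_snapshot(is_corrupt):
    for i, (ts, data) in enumerate(snapshots):
        if is_corrupt(data):
            prev = snapshots[i - 1][0] if i else None
            return prev, ts   # the fault occurred between these two timestamps
    return None

# Usage: snapshot every interval, then locate a fault after the fact.
db = {"balance": 100}
take_snapshot(1, db)
db["balance"] = -42               # a logic error corrupts the data
take_snapshot(2, db)
print(first_bad_snapshot(lambda d: d["balance"] < 0))   # -> (1, 2)
```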

5. Elastic scaling

With a well-designed distributed architecture, distributed storage can scale compute, storage capacity, and performance predictably and flexibly. Horizontal scaling of distributed storage has the following features (a small sketch follows the list):

1) After a node is added, existing data is automatically migrated to the new node to balance load and avoid hotspots on any single node;

2) Scaling out only requires connecting the new node to the same network as the existing cluster; the whole process does not affect running services;

3) When a node joins the cluster, the overall capacity and performance of the cluster grow linearly. Afterwards, the new node's resources are taken over by the management platform for allocation or reclamation.
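
One common technique behind feature 1) is consistent hashing: when a node joins, only the keys that now map to the new node migrate, so capacity grows without reshuffling the whole cluster. The node names and key count below are arbitrary, and real systems use many virtual points per node; this is a minimal sketch.

```python
# Minimal consistent-hashing ring showing partial (not total) data migration.
import hashlib
from bisect import bisect

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)

    def owner(self, key):
        positions = [p for p, _ in self.points]
        i = bisect(positions, h(key)) % len(self.points)
        return self.points[i][1]

keys = [f"block-{i}" for i in range(10000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])   # scale out by one node

moved = sum(before.owner(k) != after.owner(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys migrate")   # only keys now owned by node-d move
```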

6. Storage System Standardization

With the development of distributed storage, standardization in the storage industry is advancing. Distributed storage preferentially uses industry-standard interfaces (SMI-S or OpenStack Cinder) for storage access. At the platform level, heterogeneous storage resources are abstracted so that device-level operations become resource-oriented operations. This simplifies operating a heterogeneous storage infrastructure, enables centralized management of storage resources, and automates the whole storage lifecycle: creation, modification, reclamation, and so on. Based on this heterogeneous storage integration, users can implement DR across different brands and media, for example using low-end storage arrays as DR targets for high-end arrays, or disk arrays as DR targets for flash arrays, reducing storage procurement and management costs.
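
As one example of such a standard interface, the hedged snippet below drives the Cinder block-storage API through the openstacksdk library; the same calls work regardless of which vendor's backend Cinder is configured with. The cloud name "mycloud" and the volume parameters are assumptions, and a matching clouds.yaml entry is required.

```python
# Vendor-neutral volume lifecycle via the standard OpenStack Cinder API.
import openstack

conn = openstack.connect(cloud="mycloud")   # credentials come from clouds.yaml

# Operations are resource-oriented, not device-oriented:
vol = conn.block_storage.create_volume(name="demo-vol", size=10)   # size in GiB
vol = conn.block_storage.wait_for_status(vol, status="available")
print(vol.id, vol.status)

# ... use the volume (attach it to a server, snapshot it, etc.) ...

conn.block_storage.delete_volume(vol, ignore_missing=False)        # reclaim it
```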

Compared with traditional SAN and NAS devices, distributed storage has the following advantages:

1. Performance: Once a distributed storage system reaches a certain scale, its performance exceeds that of traditional SAN and NAS devices. A large number of disks and nodes, combined with an appropriate data distribution strategy, can deliver very high aggregate bandwidth. Traditional SAN and NAS devices have performance ceilings: once their maximum scale is reached, performance stops improving and may even degrade. A rough calculation below illustrates the aggregate-bandwidth point.
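
This back-of-the-envelope estimate uses assumed, illustrative numbers: each node contributes its disks' bandwidth, and nodes add up until the network, rather than any controller, becomes the limit.

```python
# Aggregate bandwidth estimate (all figures are assumptions for illustration).
nodes = 20
disks_per_node = 12
mb_per_disk = 150            # sequential MB/s per HDD (assumption)
nic_gbps_per_node = 25       # network cap per node (assumption)

disk_bw = nodes * disks_per_node * mb_per_disk / 1000   # GB/s across all disks
net_bw = nodes * nic_gbps_per_node / 8                  # GB/s across all NICs
print(f"disk-limited: {disk_bw:.1f} GB/s, network-limited: {net_bw:.1f} GB/s")
print(f"achievable aggregate is roughly {min(disk_bw, net_bw):.1f} GB/s")
```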

2. Price: Traditional SAN and NAS devices are relatively expensive. SAN devices in particular carry the high cost of a Fibre Channel network, and expansion cabinets must be added later, driving costs further up. Distributed storage requires only an IP network and several x86 servers with built-in hard disks, so the initial cost is relatively low. Expansion is also very convenient: just add a server.

3. Scalability: The expansion capacity of traditional SAN and NAS is limited; a single controller head can attach at most a few hundred disks. For shared storage beyond the petabyte scale, distributed storage is the only realistic choice, and scaling ceases to be a worry.

Disadvantages:

1. It requires users with strong technical, operations and maintenance, and even development capabilities. Traditional storage works out of the box: the vendor supplies the hardware together with complete documentation and services. Many distributed systems, however, are open source, or are backed by companies offering support services on top of open-source systems; versions iterate quickly, and users may have to solve problems themselves.

2. Data consistency. For application scenarios that demand strict data consistency, such as Oracle RAC, distributed storage may perform worse: because of its distributed structure, data synchronization is a hard problem. Although the technology keeps improving, it is not yet as dependable as traditional storage devices.

3. Stability. Distributed storage depends heavily on the network environment and bandwidth; an IP conflict, for example, can render the entire distributed storage system inaccessible. Traditional storage uses dedicated SAN or IP networks, which are more stable.

Hyper-converged architecture is growing rapidly because it offers significant advantages and delivers high customer value. It manages and schedules computing, storage, and network resources in a unified manner, provides flexible scalability, and gives data centers optimal efficiency, flexibility, scale, cost, and data protection. A hyper-converged compute-and-storage platform replaces the traditional server-plus-centralized-storage architecture, making the overall architecture clearer and simpler and greatly simplifying the design of complex IT systems.

From the perspective of users, the reasons for choosing hyper-converged architecture are as follows:

(1) Performance

Business scale, data availability, business continuity, and performance requirements are growing rapidly, and traditional IT architectures either cannot meet them or cost too much. Hyper-converged architectures can readily achieve hundreds of thousands of IOPS, and with all-flash hyper-convergence the performance is far better than that of a typical SAN array.

(2) Cost

Traditional IT architectures are too expensive for the same level of performance. Cost is not hyper-convergence's biggest advantage, but it can still save investment compared with traditional solutions.

(3) Reuse of existing hardware

Reusing existing equipment is not the primary purpose of hyper-convergence, but it is a real need. Hyper-convergence runs on common, standard x86 server hardware and can therefore be deployed on existing servers, protecting prior investment.