The previous chapter covered the basic concepts behind the "big data introduction"; in this chapter we learn HDFS. If there are any errors...
Namespace: groups multiple tables for unified management. Table: a table consists of one or more column families; data attributes such as timeout...
On September 15, 2018, Deng Jie, a senior big data engineer in the Data Platform Department of Ping An Technology, delivered a talk titled "HBase Application and...
Hadoop plays an important role in the big data technology stack and is its foundation. This is an article documenting my own...
Thank you for reading the 13th article from the "Meitu Data Technology team". Follow us for continued access to the latest data technology trends...
With demand for big data development positions growing, salaries are rising, and many programmers choose to switch to big data development when they hit a career bottleneck...
Copyright notice: This technical column series is a summary and distillation of the day-to-day work of the author (Qin Kaixin), extracting cases from real business environments to...
This article introduces the HDFS architecture and its execution process and gives a programming example of read and write operations, hoping to have a...
The Hadoop ecosystem is a large and fully functional ecosystem, but it still revolves around the distributed system infrastructure named Hadoop. Its core components are...
A brief review of the HDFS write process and of MapReduce basics and mechanisms; more details can be found in the MapReduce section after my home...
In fact, big data technology is an innovative application of distributed technology in the field of data processing. Its essence is to use more computers...
Objectives: understand the background and definition of HDFS; the advantages and disadvantages of HDFS; the component architecture of HDFS; shell operations of HDFS. 1 HDFS Overview 1.1...
The DataNode loss exceeded the specified loss percentage, so HDFS automatically entered safe mode. Why did this happen? Because I lost power earlier.
This is a distributed file system written in C and open-sourced by Taobao architects. FastDFS is tailor-made for the Internet. The functions...
This is probably the most detailed analysis of erasure coding on the whole web. If you are confused about the advantages of the Hadoop 3.x version...
Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources into...
We've finished the ZooKeeper installment of the high-concurrency-from-scratch series, but the earlier ZooKeeper articles did not bring big data into the description. On the one...
A brief summary of HDFS, covering storage policy, architecture evolution, metadata management, the double-buffering mechanism, and more. There are two articles about HDFS,...
Spark is an open-source, general-purpose parallel framework similar to Hadoop MapReduce, developed by the UC Berkeley AMP Lab. It is a fast, general-purpose big data processing...