Tag: hdfs

What is HDFS? You know what? I don’t know what I’m telling you.

January 15, 2024

by 周琬婷

No Comments

The previous chapter explained the basic concepts and knowledge related to the "big data introduction"; in this chapter we learn HDFS. If there are any errors...

CarbonData, the power of Huawei in China

January 15, 2024

by Dana Davis

No Comments

In this article, we take a closer look at what is new in the evolution of storage formats. Huawei opened the Parquet column...

HBase Doesn’t Sleep chapter 5 – HBase Internal Adventures

January 15, 2024

by Victoria Johnson

No Comments

Namespace: groups multiple tables for unified management. Table: a table consists of one or more column families; data attributes such as timeout...

How to ensure the efficiency and stability of HBase clusters in the face of massive data storage

January 15, 2024

by Michael Horn

No Comments

On September 15, 2018, Deng Jie, a senior big data engineer in the Data Platform Department at Ping An Technology, delivered a speech titled "HBase Application and...

Hadoop pseudo-distributed mode learning notes

January 15, 2024

by Robert Scott

No Comments

Hadoop plays an important role in the big data technology stack and serves as its foundation. This is an article documenting my own...

Meitu offline ETL practice

January 15, 2024

by Christopher Marsh

No Comments

Thank you for reading the 13th article from the "Meitu Data Technology team". Follow us to keep up with the latest data technology trends...

The code of life

To become a top big data programmer, first get through the following questions! (Answers and analysis attached)

January 13, 2024

by Amber Maldonado

No Comments

As demand for big data development positions grows, salaries are rising, and many programmers choose to switch to big data when they hit a career bottleneck....

Big data is the way of God

January 13, 2024

by Francesca Savage-Hope

No Comments

Many people know that big data is a hot field with good job prospects and high salaries, and they want to move in that direction. But...

Flink Distributed Cache Application Case

January 13, 2024

by Judith Bennett

No Comments

Copyright notice: this technical column series is a summary and distillation of the day-to-day work of the author (Qin Kaixin), extracting cases from real business environments to...

Hadoop basics: an introduction to HDFS

January 9, 2024

by Nayantara Deshpande

No Comments

This article introduces the HDFS architecture and its execution flow, and gives a programming example of read and write operations, hoping to have a...

Getting started with Hadoop (2): HDFS in detail

January 9, 2024

by 王惠如

No Comments

The Hadoop ecosystem is a large and fully functional ecosystem, but it still revolves around the distributed system infrastructure named Hadoop. Its core components are...

Big Data (3) — MapReduce introduction

January 8, 2024

by 邵雅涵

No Comments

A brief review of the HDFS write process, plus MapReduce basics and how its mechanism works; more details can be found in the MapReduce section after my home...

This section describes the architecture and working principles of HDFS, MapReduce, and YARN

January 8, 2024

by Bradley Norman

No Comments

In fact, big data technology is an innovative application of distributed technology in the field of data processing. Its essence is to use more computers...

Third, a preliminary understanding of HDFS

January 6, 2024

by 王志偉

No Comments

Objectives: understand the background and definition of HDFS; the advantages and disadvantages of HDFS; the component architecture of HDFS; shell operations of HDFS. 1 HDFS Overview 1.1...

Big Data Systems and Large-scale Data Analysis: introduction to HDFS and HBase programming

January 6, 2024

by Kiaan Bir

No Comments

Hello everyone, I am [Bean dried peanut]. This time I bring an introduction to HDFS and HBase programming for big data, which can be said to be very specific,...

HDFS can’t get up today

January 2, 2024

by Michelle Miller

No Comments

The share of lost DataNodes exceeded the configured threshold, so HDFS automatically entered safe mode. Why did this happen? Because I lost power earlier.

What is FastDFS

December 30, 2023

by Michael Brown

No Comments

This is an open-source distributed file system written in C by Taobao architects. FastDFS is tailor-made for the Internet. The functions...

Hadoop - HDFS process analysis

December 26, 2023

by Christopher White

No Comments

The client invokes the DS module to send the NameNode a request to upload the file. Assume that the file size is 200 MB. The client requests to...

Java integrates the FastDFS file service

December 23, 2023

by Hazel Sanghvi

No Comments

1. FastDFS server setup 2. Java code part

The code of life

Hadoop learning notes – 02 HDFS theoretical basis and read/write process

December 20, 2023

by Justine Twidle

No Comments

This paper introduces the storage model, architecture design and read and write process of HDFS in detail. As the core of divide and conquer and...

Artificial intelligence (ai)

Technical analysis of erasure code in Hadoop 3.x

December 20, 2023

by 孫飛

No Comments

This is probably the most detailed analysis of erasure coding technology on the whole web. If you are confused about the advantages of the Hadoop 3.x version...

Artificial intelligence (ai)

On the design ideas behind HDFS

December 20, 2023

by Arnav Dube

No Comments

If you are new to big data and are first coming into contact with HDFS, this article will be your first choice to get started...

Artificial intelligence (ai)

This section describes HDFS high availability and federation

December 20, 2023

by Robert Maddox

No Comments

What are high availability (HA) and federation in HDFS? A mind map is included to help you understand them better.

HDFS: From RAID to HDFS, a look at the birth of the king of big data storage

December 20, 2023

by 解琬婷

No Comments

From RAID to HDFS: vertical scaling has a limit, but horizontal scaling does not. RAID is over, and HDFS has become the de facto...

Big Data (2) – HDFS read and write process and some important policies

December 19, 2023

by Noah Castillo

No Comments

The DistributedFileSystem makes a remote procedure call (RPC) to the NameNode through its open method. The open method is used to retrieve the locations of...

Introduction to Apache Flume

December 19, 2023

by Donald Coleman

No Comments

Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources into...

Getting you hooked on big data (1) – HDFS basic concepts

December 19, 2023

by Leslie Allen

No Comments

We've finished the ZooKeeper part of the high concurrency from scratch series, and that ZooKeeper coverage didn't bring big data into the discussion. On the one...

Basic summary and architecture evolution of HDFS

December 19, 2023

by 李淑娟

No Comments

A brief summary of HDFS, covering storage policy, architecture evolution, metadata management, the double-buffering mechanism, and more. There are two articles about HDFS content,...

Artificial intelligence (ai)

What’s wrong with HDFS storing lots of small files? How to store a large number of small files?

December 19, 2023

by Fiona Atkinson

No Comments

How does HDFS store lots of small files? Do you only know Archive and SequenceFile? Check it out, and here's a list of solutions that...

Hello Spark! | Spark, from beginner to master

December 19, 2023

by Mishti Sibal

No Comments

Spark is an open-source, general-purpose parallel framework similar to Hadoop MapReduce, developed by UC Berkeley's AMP Lab. It is a fast and general-purpose big data processing...