Preface

Git is the most popular version control system in the world.

Intended audience

This article is intended for readers who have no experience with any version control tool, especially those who are not computer professionals.

Use folders for backups

Let’s take a look at how you can do version control without relying on Git. Once you understand this approach, Git itself becomes easy to understand.

The original way

Li Lei and Lin Tao are writing a paper together. Li Lei creates a folder called papers on his desktop and puts the paper and its related materials in it. The folder looks something like this:

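│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx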

Partway through, Li Lei planned fairly large changes to the paper and its architecture diagram, so the originals needed backing up first. Li Lei was lazy: he just held down the Ctrl key and dragged each file to make a copy. After three such backups the folder looked like this:

│  my thesis - copy (2).docx
│  my thesis - copy (3).docx
│  my thesis - copy.docx
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram - copy (2).vsdx
│      architecture diagram - copy (3).vsdx
│      architecture diagram - copy.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx

The growing number of files with similar names became dizzying. The hard drive was big anyway, so why not just back up the entire directory? The desktop then turned into the following structure:

├─papers
├─papers - copy
├─papers - copy (2)
├─papers - copy (3)
└─papers - copy (4)

As more and more folders piled up on the desktop, he created a backup directory and dragged all the copies into it, which looked like this:

├─backup
│  ├─papers - copy
│  ├─papers - copy (2)
│  ├─papers - copy (3)
│  └─papers - copy (4)
└─papers

That is much tidier.

Refine folder description

Later, Li Lei regretted some changes and needed to go back to an older backup. The folder names gave no hint of what had changed in each version, so he added a summary of the change to each directory name, like this:

├─backup
│  ├─papers - refined the algorithm description
│  ├─papers - improved the ending
│  ├─papers - added some references
│  └─papers - fixed some data errors
└─papers

It was still unclear which version was newer and which was older, so he added the date as well:

├─backup
│  ├─papers - 2022-01-01 added some references
│  ├─papers - 2022-01-01 fixed some data errors
│  ├─papers - 2022-01-01 refined the algorithm description
│  └─papers - 2022-01-01 improved the ending
└─papers

There may be several changes on the same day, so he added a version number as well. The structure became:

├─backup
│  ├─papers - v1 2021-12-24 added some references
│  ├─papers - v2 2022-01-01 fixed the data errors
│  ├─papers - v3 2022-01-01 refined the algorithm description
│  └─papers - v4 2022-01-31 improved the ending
└─papers

Later, to record who made each change, the structure evolved into:

├─backup
│  ├─v1-2021-12-24 added some references by Li Lei
│  ├─v2-2022-01-01 fixed the data errors by Lin Tao
│  ├─v3-2022-01-01 refined the algorithm description by Li Lei
│  └─...
└─papers
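The naming convention reached at this point (version, date, description, author) can be automated. The following is a minimal Python sketch of that idea, not something the article's characters actually used; the function names `archive_name` and `archive` are hypothetical, and the exact name layout is an assumption based on the folder names shown above.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_name(version: int, day: date, description: str, author: str) -> str:
    """Build a folder name such as 'v2-2022-01-01 fixed the data errors by Lin Tao'."""
    return f"v{version}-{day.isoformat()} {description} by {author}"


def archive(source: Path, backup_root: Path, version: int, day: date,
            description: str, author: str) -> Path:
    """Copy the whole source directory into backup_root under a descriptive name."""
    target = backup_root / archive_name(version, day, description, author)
    # copytree refuses to overwrite an existing folder, so an old backup
    # cannot be clobbered by reusing its name.
    shutil.copytree(source, target)
    return target
```

Calling `archive(Path("papers"), Path("backup"), 2, date(2022, 1, 1), "fixed the data errors", "Lin Tao")` would create one more full copy of the folder, which is exactly why this approach uses so much disk space.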

Fast backup

Later he found that Word was unstable and sometimes crashed, so backups had to be made more often. Sometimes he also changed something in the papers directory by mistake. He wanted something like the "quick save" in a game, so he created a temporary directory: when a change was not a milestone worth archiving, a quick backup into the temporary directory was enough.

Li Lei developed the habit of saving after every few changes and then, without a second thought, copying the contents of the papers directory into the temporary directory, overwriting the files there. Whenever he archived to the backup directory, he copied the temporary directory and renamed the copy instead of copying the papers directory, because he did not trust the papers directory to be in a reliable state.

The directory structure evolved to:

├─backup
│  ├─v1-2021-12-24 added some references by Li Lei
│  ├─v2-2022-01-01 fixed the data errors by Lin Tao
│  ├─v3-2022-01-01 refined the algorithm description by Li Lei
│  └─...
├─temporary
└─papers
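The two habits described in this section, a frequent quick save into the temporary directory and an occasional archive made from that temporary copy, can be sketched in Python roughly as follows. The function names `quick_save` and `archive_from_temp` are hypothetical, and the sketch assumes Python 3.8 or later for the `dirs_exist_ok` parameter.

```python
import shutil
from pathlib import Path


def quick_save(work_dir: Path, temp_dir: Path) -> None:
    """Copy the working directory over the temporary directory, overwriting files."""
    # dirs_exist_ok=True lets the copy land on top of an earlier quick save.
    # Files deleted from work_dir since the last save stay behind in temp_dir.
    shutil.copytree(work_dir, temp_dir, dirs_exist_ok=True)


def archive_from_temp(temp_dir: Path, backup_dir: Path, name: str) -> Path:
    """Archive the temporary copy (not the working directory) under a new name."""
    target = backup_dir / name
    shutil.copytree(temp_dir, target)  # fails if this archive name already exists
    return target
```

Readers who already know Git may notice that the temporary directory sits between the working files and the archived versions, much like a staging area.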

Adding a Recovery Record

Li Lei found that even the files in the backup folder could still be modified by accident. So when backing up, he compressed each folder into a RAR archive and ticked the "Add recovery record" option in WinRAR, so that accidental changes to the files were no longer a worry. The directory structure became:

├─backup
│  ├─v1-2021-12-24 added some references by Li Lei.rar
│  ├─v2-2022-01-01 fixed the data errors by Lin Tao.rar
│  ├─v3-2022-01-01 refined the algorithm description by Li Lei.rar
│  └─...
├─temporary
└─papers

More advanced: snapshots

Later, as the work became more fine-grained, Li Lei felt that a folder name was not enough to describe the difference between two backups. He also wanted to save disk space and needed a way to verify files, so he came up with the following method:

Heads up: the method is complicated, but if you follow it all the way through, you will already understand half of Git!

The backup process

Li Lei first created a .git directory inside the papers directory to hold the backups. The structure of the .git directory is as follows:

.git
│  HEAD
│  index
├─objects
└─refs
    └─heads
            master

Then he put the paper's files in the root directory, so that it looked like this:

├─.git
│  │  HEAD
│  │  index
│  ├─objects
│  └─refs
│      └─heads
│              master
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx

Then he backed up the files one by one:

SHA-1 is a secure hash algorithm that is more secure than MD5. Li Lei used a SHA-1 calculator to compute the SHA-1 hash value of each file, added the hash value to the file name, and stored the renamed copy in the .git/objects directory as the backup. Two different files almost never produce the same hash value, so the file names practically never collide. The directory structure became something like this:

├─.git
│  │  HEAD
│  │  index
│  ├─objects
│  │      my thesis_c2ed167304265c64979820f5f2221a27dc408d.docx (new file)
│  │      Zhang San's paper_cf272986596cd1aed9a37eec3f7145771b8f57.pdf (new file)
│  │      Li Si's paper_c739bcb758ff50cb73bbfee27952d57066a4a4.pdf (new file)
│  │      interaction diagram_201f0ef48dba6682f2d2fe655ea3c08fbfbef2.vsdx (new file)
│  │      architecture diagram_5393ab7dcca7cb8c33ab8db10f40afec4d9119.vsdx (new file)
│  │      experiment data_9de29bb2d1d6434b8b29ae775ad8c2e48c5391.xlsx (new file)
│  └─refs
│      └─heads
│              master
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx
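What Li Lei does by hand with a SHA-1 calculator can be sketched in a few lines of Python. This follows the article's simplified scheme (original name, underscore, hash, extension), not Git's real storage format: actual Git hashes a short header together with the content and stores compressed objects named by the hash alone. The function names `sha1_of_file` and `store_object` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of_file(path: Path) -> str:
    """Return the SHA-1 digest of a file's content as 40 hex characters."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_object(path: Path, objects_dir: Path) -> str:
    """Copy a file into objects_dir as '<stem>_<sha1><suffix>' and return that name."""
    name = f"{path.stem}_{sha1_of_file(path)}{path.suffix}"
    objects_dir.mkdir(parents=True, exist_ok=True)
    target = objects_dir / name
    if not target.exists():  # unchanged content maps to the same name: stored once
        shutil.copy2(path, target)
    return name
```

Because the name depends only on the content, backing up an unchanged file a second time adds nothing to the objects directory, which is where the disk space saving comes from.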

Why are the files in .git/objects stored flat, with no directory structure? Because the directory structure may be rearranged between backups, and flat storage maximizes the reuse of files.

So how is the directory structure of each backup recorded? Li Lei took the following steps:

Create a new file named resources directory.txt with the following contents:

Zhang San's paper_cf272986596cd1aed9a37eec3f7145771b8f57.pdf
Li Si's paper_c739bcb758ff50cb73bbfee27952d57066a4a4.pdf

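He computed the SHA-1 value of this file as well, renamed it accordingly, and stored it in the .git/objects directory. The other directories were handled in the same way. The root directory then got a listing file of its own, with the following contents: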

resources directory_aad636258f59ab361053de3b1057cc3e0bf6fd42.txt
images directory_86db1b97672601980f892bf96c8c57652171193a.txt
data directory_9316d4771f71435c71272f339d2a991b873921f9.txt
my thesis_c2ed167304265c64979820f5f2221a27dc408d.docx

He computed the SHA-1 value of this file, renamed it accordingly (say, to root directory_f3c56785ec4f1186892a2e09cb40c01cc5b387f8.txt), and stored it in the .git/objects directory.

The directory tree becomes:

├─.git
│  │  HEAD
│  │  index
│  ├─objects
│  │      my thesis_c2ed167304265c64979820f5f2221a27dc408d.docx
│  │      Zhang San's paper_cf272986596cd1aed9a37eec3f7145771b8f57.pdf
│  │      Li Si's paper_c739bcb758ff50cb73bbfee27952d57066a4a4.pdf
│  │      interaction diagram_201f0ef48dba6682f2d2fe655ea3c08fbfbef2.vsdx
│  │      architecture diagram_5393ab7dcca7cb8c33ab8db10f40afec4d9119.vsdx
│  │      experiment data_9de29bb2d1d6434b8b29ae775ad8c2e48c5391.xlsx
│  │      resources directory_aad636258f59ab361053de3b1057cc3e0bf6fd42.txt (new file)
│  │      images directory_86db1b97672601980f892bf96c8c57652171193a.txt (new file)
│  │      data directory_9316d4771f71435c71272f339d2a991b873921f9.txt (new file)
│  │      root directory_f3c56785ec4f1186892a2e09cb40c01cc5b387f8.txt (new file)
│  └─refs
│      └─heads
│              master
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx
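The directory listing files can be produced recursively: store everything inside a directory first, then store a text file that lists the stored names. The sketch below is one possible way to do that in Python under the article's naming scheme; `store_bytes` and `store_directory` are hypothetical names, and sorting the entries is an added assumption so that the same directory always yields the same listing and therefore the same hash.

```python
import hashlib
from pathlib import Path


def store_bytes(label: str, suffix: str, data: bytes, objects_dir: Path) -> str:
    """Write data to objects_dir as '<label>_<sha1><suffix>' and return the name."""
    name = f"{label}_{hashlib.sha1(data).hexdigest()}{suffix}"
    objects_dir.mkdir(parents=True, exist_ok=True)
    (objects_dir / name).write_bytes(data)
    return name


def store_directory(directory: Path, objects_dir: Path, label: str) -> str:
    """Store every entry of directory, then store a listing of the stored names."""
    lines = []
    for entry in sorted(directory.iterdir()):
        if entry.name == ".git":
            continue  # the backup area itself is never backed up
        if entry.is_dir():
            # A subdirectory is represented by the name of its own listing file.
            lines.append(store_directory(entry, objects_dir, f"{entry.name} directory"))
        else:
            lines.append(store_bytes(entry.stem, entry.suffix, entry.read_bytes(), objects_dir))
    listing = "\n".join(lines).encode("utf-8")
    return store_bytes(label, ".txt", listing, objects_dir)
```

Calling `store_directory(Path("papers"), Path("papers/.git/objects"), "root directory")` returns the name of the root listing. If any file anywhere below changes, its hash changes, so the listing that mentions it changes, and so on up to the root.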

To back up a snapshot, do this:

First, change the content of the .git/index file so that it matches the file root directory_f3c56785ec4f1186892a2e09cb40c01cc5b387f8.txt.

Then create a new file, snapshot.txt, with the following contents:

Change description: improved the ending so that it echoes the abstract.
Directory: f3c56785ec4f1186892a2e09cb40c01cc5b387f8
Previous version: none

The Directory field holds the digest value of the content of the .git/index file.

He computed the SHA-1 value of snapshot.txt, renamed the file accordingly, and stored it in the .git/objects directory, which became:

├─.git
│  │  HEAD
│  │  index
│  ├─objects
│  │      my thesis_c2ed167304265c64979820f5f2221a27dc408d.docx
│  │      Zhang San's paper_cf272986596cd1aed9a37eec3f7145771b8f57.pdf
│  │      Li Si's paper_c739bcb758ff50cb73bbfee27952d57066a4a4.pdf
│  │      interaction diagram_201f0ef48dba6682f2d2fe655ea3c08fbfbef2.vsdx
│  │      architecture diagram_5393ab7dcca7cb8c33ab8db10f40afec4d9119.vsdx
│  │      experiment data_9de29bb2d1d6434b8b29ae775ad8c2e48c5391.xlsx
│  │      resources directory_aad636258f59ab361053de3b1057cc3e0bf6fd42.txt
│  │      images directory_86db1b97672601980f892bf96c8c57652171193a.txt
│  │      data directory_9316d4771f71435c71272f339d2a991b873921f9.txt
│  │      root directory_f3c56785ec4f1186892a2e09cb40c01cc5b387f8.txt
│  │      snapshot_724b578ff18799ee98f439833e13ff5f38297a80.txt (new file)
│  └─refs
│      └─heads
│              master
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx

Then change the content of the .git/HEAD file to 724b578ff18799ee98f439833e13ff5f38297a80.

The next time he changes the architecture diagram and the paper and backs up again, a new snapshot is generated. The Previous version field of its snapshot.txt should be filled with 724b578ff18799ee98f439833e13ff5f38297a80. The .git/objects directory then ends up with a structure like the following:

├─.git
│  │  HEAD
│  │  index
│  ├─objects
│  │      my thesis_c2ed167304265c64979820f5f2221a27dc408d.docx
│  │      Zhang San's paper_cf272986596cd1aed9a37eec3f7145771b8f57.pdf
│  │      Li Si's paper_c739bcb758ff50cb73bbfee27952d57066a4a4.pdf
│  │      interaction diagram_201f0ef48dba6682f2d2fe655ea3c08fbfbef2.vsdx
│  │      architecture diagram_5393ab7dcca7cb8c33ab8db10f40afec4d9119.vsdx
│  │      architecture diagram_b975318497fca65e447d310fba597637619b8c48.vsdx (new file)
│  │      experiment data_9de29bb2d1d6434b8b29ae775ad8c2e48c5391.xlsx
│  │      resources directory_aad636258f59ab361053de3b1057cc3e0bf6fd42.txt
│  │      images directory_86db1b97672601980f892bf96c8c57652171193a.txt
│  │      images directory_8b9d9e9f14e312b609f9bb31ad3ef5753e1fafb9.txt (new file)
│  │      data directory_9316d4771f71435c71272f339d2a991b873921f9.txt
│  │      root directory_f3c56785ec4f1186892a2e09cb40c01cc5b387f8.txt
│  │      snapshot_724b578ff18799ee98f439833e13ff5f38297a80.txt
│  │      snapshot_e6238652872781a640e9c3a4fc62346d14443611.txt (new file)
│  └─refs
│      └─heads
│              master
│  my thesis.docx
├─resources
│      Zhang San's paper.pdf
│      Li Si's paper.pdf
├─images
│      interaction diagram.vsdx
│      architecture diagram.vsdx
└─data
        experiment data.xlsx

Then change the content of the .git/HEAD file to e6238652872781a640e9c3a4fc62346d14443611.
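Taking a snapshot comes down to three small steps: read the current HEAD, write a snapshot file that points at the root listing and at the previous snapshot, and move HEAD to the new snapshot. A minimal Python sketch of that follows. The function name `commit_snapshot` is hypothetical, the field labels are assumed English renderings of the snapshot.txt fields, and the sketch leaves the .git/index update out.

```python
import hashlib
from pathlib import Path


def commit_snapshot(git_dir: Path, root_listing_hash: str, description: str) -> str:
    """Store a snapshot file pointing at a root listing and move HEAD to it."""
    head = git_dir / "HEAD"
    # An absent or empty HEAD means this is the very first snapshot.
    previous = head.read_text(encoding="utf-8").strip() if head.exists() else ""
    text = (
        f"Change description: {description}\n"
        f"Directory: {root_listing_hash}\n"
        f"Previous version: {previous or 'none'}\n"
    )
    data = text.encode("utf-8")  # hash and write the same bytes
    snapshot_hash = hashlib.sha1(data).hexdigest()
    objects_dir = git_dir / "objects"
    objects_dir.mkdir(parents=True, exist_ok=True)
    (objects_dir / f"snapshot_{snapshot_hash}.txt").write_bytes(data)
    head.write_text(snapshot_hash, encoding="utf-8")
    return snapshot_hash
```

Because each snapshot records the hash of the one before it, the snapshots form a chain that can be followed backwards from HEAD.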

The restore process

How do you know which snapshot is the current backup? Look at the SHA-1 value in the .git/HEAD file, e6238652872781a640e9c3a4fc62346d14443611. Find the file snapshot_e6238652872781a640e9c3a4fc62346d14443611.txt in the .git/objects directory and open it to see which root directory listing it points to, then find that listing in the .git/objects directory as well. Following the listings downward level by level gives you the entire backup. To restore, copy the corresponding files back to the root directory under their original names, overwriting what is there.

What if you want to restore the previous snapshot? The Previous version field of snapshot_e6238652872781a640e9c3a4fc62346d14443611.txt records 724b578ff18799ee98f439833e13ff5f38297a80, which leads you to the previous snapshot. Restore it with the same process as above.
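The restore steps can be sketched in Python as well. This is only an illustration of the article's scheme under some assumptions: the snapshot fields are labelled "Directory" and "Previous version", listing files are recognized by a name ending in " directory" plus the .txt extension (so a real file with such a name would be misread), and all function names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def read_snapshot(objects_dir: Path, snapshot_hash: str) -> Dict[str, str]:
    """Parse the 'Key: value' lines of snapshot_<hash>.txt into a dictionary."""
    text = (objects_dir / f"snapshot_{snapshot_hash}.txt").read_text(encoding="utf-8")
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def history(git_dir: Path) -> List[str]:
    """Return snapshot hashes from HEAD back to the first snapshot."""
    current = (git_dir / "HEAD").read_text(encoding="utf-8").strip()
    chain = []
    while current and current != "none":
        chain.append(current)
        current = read_snapshot(git_dir / "objects", current).get("Previous version", "none")
    return chain


def restore_listing(objects_dir: Path, listing_name: str, target: Path) -> None:
    """Recreate the directory described by a listing object under target."""
    target.mkdir(parents=True, exist_ok=True)
    listing = (objects_dir / listing_name).read_text(encoding="utf-8")
    for name in listing.splitlines():
        if not name.strip():
            continue
        stored = Path(name)
        original_stem, _, _ = stored.stem.rpartition("_")  # drop the '_<sha1>' part
        if stored.suffix == ".txt" and original_stem.endswith(" directory"):
            subdir = original_stem[: -len(" directory")]
            restore_listing(objects_dir, name, target / subdir)
        else:
            shutil.copy2(objects_dir / name, target / (original_stem + stored.suffix))


def restore(git_dir: Path, snapshot_hash: str, target: Path) -> None:
    """Restore the tree recorded by one snapshot into target."""
    objects_dir = git_dir / "objects"
    root_hash = read_snapshot(objects_dir, snapshot_hash)["Directory"]
    restore_listing(objects_dir, f"root directory_{root_hash}.txt", target)
```

Restoring an older version is then a matter of picking an entry from `history(git_dir)` and passing it to `restore`.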

How to verify

Compute the SHA-1 value of a file and compare it with the value in its file name to see whether the file has been modified by accident. For the underlying principle, look up message digest algorithms; this article will not go into it.
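The check described above is easy to automate: recompute each object's hash and compare it with the hash embedded in its name. The following Python sketch assumes the article's '<name>_<sha1><extension>' naming and reads each file fully into memory; `corrupted_objects` is a hypothetical name.

```python
import hashlib
from pathlib import Path
from typing import List


def corrupted_objects(objects_dir: Path) -> List[str]:
    """Return names of objects whose content no longer matches the hash in the name."""
    damaged = []
    for path in sorted(objects_dir.iterdir()):
        if not path.is_file():
            continue
        _, _, recorded = path.stem.rpartition("_")  # the part after the last '_'
        actual = hashlib.sha1(path.read_bytes()).hexdigest()
        if actual != recorded:
            damaged.append(path.name)
    return damaged
```

An empty result means every stored file still matches its recorded digest.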

Summary

As this article shows, the simplest approach is to back up the entire directory every time. To keep the backups small and to support file verification, each file is instead renamed to include its message digest and stored in a flat directory, and the directory structure is recorded in listing files. A SHA-1 value is then enough to locate a specific directory or file, so each snapshot only has to deal with what changed, which gives the effect of multi-version snapshot backups. Understanding this approach helps you understand how Git works. With this chapter as a foundation, the following chapters will explain Git in more depth. Stay tuned!