An in-depth understanding of Git and use of GitHub hosting services

Source code management systems (SCM) and version control

Version control is a system that records changes to one or more files over time so that specific versions can be reviewed or restored later.

Local version control system

Many people make a habit of copying the entire project directory to save different versions, perhaps adding the backup time to the name. The only advantage of this approach is its simplicity. The drawbacks are that it is easy to lose track of which directory you are working in, and that there is no reliable way to recover after writing to or overwriting the wrong file. To solve this problem, local version control systems were developed long ago. Most of them use a simple database to record the differences between successive versions of each file.

Centralized version control system

Then came the question of how to let developers working on different systems collaborate. Centralized Version Control Systems (CVCS) were created for this purpose. These systems, such as CVS and Subversion, have a single central server that holds every revision of every file, and the people working together connect to this server through clients to retrieve the latest files or submit updates. This was standard practice for version control for many years.

This approach has many advantages, especially over local version control systems. Everyone can see, to some extent, what everyone else in the project is doing, and administrators can easily control what each developer is allowed to do and review each person’s commit record. The biggest drawback is that the central server is a single point of failure. If it goes down for an hour, no one can commit updates or collaborate during that hour. If the disk on the central server fails and has not been backed up, or not backed up recently enough, data may be lost. In the worst case, the entire change history of the project is lost, apart from whatever snapshots happen to exist on client machines, and there is no guarantee that those snapshots are complete.

Local version control systems have a similar problem: as long as the entire history of a project is kept in a single location, there is a risk of losing all of it.

Distributed version control system

Distributed version control systems, such as Git, Mercurial, Bazaar, and Darcs, are designed to remove this single point of failure. The client does not just check out the latest snapshot of the files; it mirrors the entire original repository. If any server used for collaboration fails, it can be restored afterwards from any of the mirrored local repositories, because every clone is in fact a complete backup of the code repository. Furthermore, such systems can work with several different remote repositories at once. As a result, you can collaborate with different groups of people on the same project and set up different collaboration workflows as needed.

Differences between Git and SVN and other common version control software

Git is a fast, scalable distributed version control system with an extremely rich command set that provides both high-level operations and full access to its internals.

Git was born in 2005 after the company behind BitKeeper, the version control system then used by the open source Linux kernel project, withdrew free use of the tool. This forced the Linux open source community (and Linux creator Linus Torvalds in particular) to learn a lesson: developing their own version control system was the only way to prevent this from happening again. Their goals for the new system were as follows:

Speed, a simple design, strong support for non-linear development (allowing thousands of parallel branches), a fully distributed architecture, and the ability to manage very large projects such as the Linux kernel efficiently (in both speed and data size).

Through years of continuous improvement, Git has kept the following characteristics:

1. Record snapshots directly, not differences.

The main difference between Git and other version control systems is that Git cares about whether the file data as a whole has changed, while most other systems track the specific differences in the file content. Those systems (CVS, Subversion, etc.) record which files were updated in each revision and which lines of those files changed.

Git does not store its data as a list of such differences. Instead, Git works more like taking snapshots of the files and recording them in a tiny file system. Each time an update is committed, Git checks the fingerprint (checksum) of every file, takes a snapshot of the files as they are at that moment, and stores a reference to that snapshot. To improve performance, Git does not store a file again if it has not changed. Instead, it simply links to the snapshot that was saved last time.

This is an important difference between Git and other systems. It overturns the traditional approach to version control and leads to a new design for almost every part of the implementation. Git is more like a small file system with a set of powerful tools built on top of it than a simple VCS.

2. Offline work is supported: almost all operations are performed locally, and local commits can be pushed to a server later.

3. Data integrity is preserved at all times.

4. Most operations only add data.
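The first characteristic above (snapshots rather than differences) can be observed in a throwaway repository. The following is only a sketch: the temporary directory, the file names a.txt and b.txt, and the example identity are invented for illustration. It makes two commits, changing only one file in the second, and then compares the IDs of the stored file objects.

```shell
#!/bin/sh
# Sketch: a file that did not change is not stored again; the new commit
# points at the same stored object. All names below are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'stable\n' > a.txt
printf 'v1\n' > b.txt
git add a.txt b.txt
git commit -q -m "first snapshot"

printf 'v2\n' > b.txt            # only b.txt changes
git add b.txt
git commit -q -m "second snapshot"

# "git rev-parse <commit>:<path>" prints the ID of the stored file object.
a_first=$(git rev-parse HEAD~1:a.txt)
a_second=$(git rev-parse HEAD:a.txt)
b_first=$(git rev-parse HEAD~1:b.txt)
b_second=$(git rev-parse HEAD:b.txt)

echo "a.txt: $a_first -> $a_second"   # same ID in both commits
echo "b.txt: $b_first -> $b_second"   # different ID after the change
```

The ID of a.txt is identical in both commits, which is the "link to the last snapshot" behaviour described above, while b.txt gets a new ID because its content changed.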

Three states of a file

For any file, there are only three states in Git: committed, modified, and staged. Committed means that the file is safely stored in the local database. Modified means that the file has been changed but the change has not yet been committed. Staged means that the modified file has been marked for inclusion in the next commit. These states correspond to the three areas a file moves through when Git manages a project: the working directory, the staging area, and the local repository (the Git directory).

Every project has a .git directory. (If git clone --bare is used, the new directory is itself the Git directory.) The Git directory is where Git stores the metadata and the object database. This directory is very important: whenever a repository is cloned, it is the data in this directory that is actually copied.

The working directory is the set of files and directories checked out from one version of the project so that you can work on them. These files are extracted from the compressed object database in the Git directory, and you can then edit them in the working directory.

The staging area is nothing more than a simple file, usually stored in the Git directory. This file is sometimes called the index, but the standard term is still the staging area.
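As a rough illustration of the difference between an ordinary repository and a bare clone, the sketch below creates both in temporary directories and asks Git which is which. The paths, the README file, and the example identity are assumptions made up for this demonstration.

```shell
#!/bin/sh
# Sketch: an ordinary repository keeps its Git directory in .git,
# while a bare clone *is* the Git directory. Paths are hypothetical.
src=$(mktemp -d)
cd "$src" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > README
git add README
git commit -q -m "initial commit"

bare="$(mktemp -d)/project.git"
git clone -q --bare "$src" "$bare"

normal_is_bare=$(git -C "$src" rev-parse --is-bare-repository)
bare_is_bare=$(git -C "$bare" rev-parse --is-bare-repository)

echo "ordinary repository is bare: $normal_is_bare"
echo "bare clone is bare:          $bare_is_bare"
ls "$bare"    # objects, refs, HEAD and so on sit directly in this directory
```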

The basic Git workflow is as follows:

1. Modify some files in the working directory.

2. Take a snapshot of the modified files and save it to the staging area.

3. Commit the update, which permanently stores the file snapshots saved in the staging area in the Git directory.

If a particular version of a file is stored in the Git directory, it is committed. If it has been modified and placed in the staging area, it is staged. If it has been modified since it was last checked out but has not been placed in the staging area, it is modified.
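The three states and the basic workflow can be followed with git status. The sketch below is a minimal example under stated assumptions: the repository lives in a temporary directory, and notes.txt and the identity are hypothetical. It uses the --porcelain option, whose two-column status code shows the staging area in the first column and the working directory in the second.

```shell
#!/bin/sh
# Sketch: walk one file through committed -> modified -> staged -> committed.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "add notes"
state_committed=$(git status --porcelain)   # empty: nothing differs from the last commit

echo "second line" >> notes.txt
state_modified=$(git status --porcelain)    # " M notes.txt": changed but not staged

git add notes.txt
state_staged=$(git status --porcelain)      # "M  notes.txt": staged for the next commit

git commit -q -m "update notes"
state_after=$(git status --porcelain)       # empty again: the change is committed

printf '[%s]\n' "$state_committed" "$state_modified" "$state_staged" "$state_after"
```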

Using Git

Git supports several data transfer protocols: local transport, git://, http(s)://, and SSH transport in the form user@server:/path.git. All protocols except HTTP require Git to be installed and working on the server.

First, using a local repository

For local project management, the article Git Basics explains in detail how each Git operation is used and what it does. Since the main purpose of this article is to show how GitHub can be used to take part in open source projects, I will not go into the details here, but studying these operations closely is well worth the effort for the productivity gain.

With a local repository there is only one user, so collaboration problems do not arise, and however much you experiment, things rarely go wrong. With remote repositories, things become a lot more complicated and interesting.

Second, using a remote repository

To collaborate on any Git project, you must understand how to manage remote repositories. Remote repositories are project repositories hosted on the network. There may be several of them, some of which you can only read and some of which you can also write to. When collaborating with others, you need to manage these remote repositories so that you can push or pull data and share your progress. Managing remote repositories includes adding remotes, removing obsolete ones, managing the various remote branches, defining whether or not to track those branches, and so on.

Suppose I create a project on GitHub, which is very popular today. In essence, this creates a new server-side repository, much like running git init on the server. To develop locally, I git clone the repository to my machine. After some development, I can git push my local changes to the server repository. As the project grows and someone else wants to work on it, he can fork my project on GitHub. The fork is his own copy, to which he has write access, so he can save his work on GitHub’s servers. To follow my work, he adds my repository as a remote in his local development environment with git remote add. He publishes his own commits to his fork with git push remotename master and then sends me a pull request; if I accept it, his work is merged into the main branch. Since we are developing in parallel, he can run git pull remotename at any time to pull the changes I have made into his local repository, which is very convenient.

The paragraph above describes the general process of using a remote repository and collaborating with others. The operations involved are as follows:

1. View the currently configured remote repository

You can use git remote -v to see which remote repositories have been added to your current project.

origin is usually your own remote repository on the server, and the others are repositories that belong to other people.

2. Add a new remote repository

To add a new remote repository and give it a short name for future reference, run git remote add [shortname] [url]:

git remote add pb git://github.com/paulboone/ticgit.git
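Adding a remote only records a name and a URL in the local configuration; nothing is contacted until you fetch or push. The sketch below, run in a hypothetical empty repository in a temporary directory, adds the pb remote from the example above and lists the result.

```shell
#!/bin/sh
# Sketch: register a remote and list what is configured. No network access
# is needed, because "git remote add" only writes to the local configuration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

git remote add pb git://github.com/paulboone/ticgit.git

git remote -v                                  # lists pb with its fetch and push URLs
remotes=$(git remote)                          # short names only
pb_url=$(git config --get remote.pb.url)       # the stored URL

echo "configured remotes: $remotes"
echo "pb points to:       $pb_url"
```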

3. Fetch data from a remote repository

git fetch [remote-name]

This command contacts the remote repository and pulls any data that you do not yet have in your local repository. Once it finishes, you can access all branches of the remote repository locally, merge one of them into a local branch, or simply check one out to see what is going on. When you clone a repository, Git automatically adds the source as a remote named origin. So git fetch origin fetches all updates that have been pushed to the remote repository since you cloned it (or since your last fetch). It is important to remember that git fetch only downloads the remote data into the local repository. It does not merge anything into your current working branch; you do that manually when you are ready.
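The point that fetching does not merge can be checked without any server. The sketch below is an illustration only: two local directories named upstream and work stand in for the remote repository and your clone, the identity is invented, and the first branch is explicitly named master to match the text.

```shell
#!/bin/sh
# Sketch: git fetch updates origin/master but leaves the local branch alone.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)
git init -q "$base/upstream"
cd "$base/upstream" || exit 1
git symbolic-ref HEAD refs/heads/master   # name the first branch master
echo one > file.txt
git add file.txt
git commit -q -m "first"

git clone -q "$base/upstream" "$base/work"   # the clone gets a remote named origin

echo two >> file.txt                         # new work appears upstream after the clone
git commit -q -am "second"
upstream_head=$(git rev-parse HEAD)

cd "$base/work" || exit 1
before=$(git rev-parse HEAD)
git fetch -q origin
after=$(git rev-parse HEAD)                  # local branch: unchanged by the fetch
fetched=$(git rev-parse origin/master)       # remote-tracking branch: updated

git log --oneline master..origin/master      # the commit that was fetched but not merged
```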

4. Grab and merge information from remote repositories

git pull [remote-name]

You can use the git pull command to fetch the data and then automatically merge the remote branch into the current branch of the local repository. It is used all the time in daily work because it is quick and convenient. In fact, by default the git clone command automatically creates a local master branch that tracks the master branch of the remote repository (assuming the remote repository has a master branch). Therefore, we usually run git pull to fetch data from the repository that was originally cloned and merge it into the current branch in the working directory.

5. Push data to a remote repository

git push [remote-name] [branch-name]

When you reach a point in the project where you want to share your work with others, you can push data from your local repository to a remote repository. The command for this is simple: git push [remote-name] [branch-name]. To push the local master branch to the origin server (again, cloning sets up the default master and origin names automatically), run the following command:
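For comparison with a plain fetch, the sketch below shows a pull bringing both the local branch and the working directory up to date. As before, this is a local stand-in under stated assumptions: the upstream and work directories, file.txt, and the identity are hypothetical, and the first branch is explicitly named master.

```shell
#!/bin/sh
# Sketch: after a clone, master tracks origin/master, so a bare "git pull"
# fetches and merges in one step.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)
git init -q "$base/upstream"
cd "$base/upstream" || exit 1
git symbolic-ref HEAD refs/heads/master   # name the first branch master
echo one > file.txt
git add file.txt
git commit -q -m "first"

git clone -q "$base/upstream" "$base/work"

echo two >> file.txt                      # new work appears upstream after the clone
git commit -q -am "second"
upstream_head=$(git rev-parse HEAD)

cd "$base/work" || exit 1
tracking=$(git rev-parse --abbrev-ref 'master@{upstream}')   # branch that master tracks
git pull -q                               # fetch from origin, then merge into master
local_head=$(git rev-parse HEAD)
line_count=$(grep -c '' file.txt)         # number of lines now in the working copy

echo "master tracks $tracking; file.txt has $line_count lines"
```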

git push origin master

This command succeeds only if you have write permission on the server you cloned from and no one else has pushed in the meantime. If someone else pushes updates before you do, your push is rejected. You have to fetch their updates and merge them into your own work before you can push again.

6. View information about a remote repository
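The rejection described here can be reproduced locally. In the sketch below, a bare repository named server.git plays the server and two clones named alice and bob play the collaborators; all of these names, the files, and the identity are invented for illustration. Bob's first push is refused because Alice pushed first, and it goes through once he has pulled her work.

```shell
#!/bin/sh
# Sketch: a push is rejected when the remote branch has moved on,
# and accepted again after fetching and merging the new work.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)
git init -q "$base/seed"
cd "$base/seed" || exit 1
git symbolic-ref HEAD refs/heads/master   # name the first branch master
echo base > file.txt
git add file.txt
git commit -q -m "base"

git clone -q --bare "$base/seed" "$base/server.git"   # the shared "server"
git clone -q "$base/server.git" "$base/alice"
git clone -q "$base/server.git" "$base/bob"

cd "$base/alice" || exit 1
echo alice > alice.txt
git add alice.txt
git commit -q -m "work by alice"
git push -q origin master                 # alice pushes first

cd "$base/bob" || exit 1
echo bob > bob.txt
git add bob.txt
git commit -q -m "work by bob"
if git push -q origin master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

git pull -q --no-rebase --no-edit origin master   # fetch and merge the work by alice
if git push -q origin master 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

echo "first push: $first_push, second push: $second_push"
```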

git remote show [remote-name]

7. Delete and rename remote repositories

git remote rename [old-name] [new-name]

git remote rm [remote-name]
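Renaming and removing remotes are also purely local operations. The sketch below, run in a hypothetical scratch repository, renames the pb remote used earlier to paul (an invented name) and then removes it.

```shell
#!/bin/sh
# Sketch: rename a remote, then delete it, checking the list after each step.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git remote add pb git://github.com/paulboone/ticgit.git

git remote rename pb paul
after_rename=$(git remote)     # the remote is now listed as paul

git remote rm paul
after_remove=$(git remote)     # no remotes are left

echo "after rename: [$after_rename]"
echo "after remove: [$after_remove]"
```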

Git and GitHub

GitHub is a free code hosting service built on Git (similar to the older SourceForge), and it hosts many well-known projects.

Depending on how Git is used, there are two ways to participate in open source projects on GitHub.

The first is for the project’s creator to add you to the project’s list of collaborators, so that you can push code directly to the project.

The second is to fork the project into your own space, where you have push permission. If development goes well, the project creator can add your fork as a remote repository and fetch your code into his repository to merge when he sees fit, or you can send the creator a request to merge your code (a pull request). GitHub promotes this approach to collaborative development.

Using the PHP-Daemon project as an example, here is how to take part in an open source project hosted on GitHub.

Install and configure Git

1. Start with a GitHub account.

2. Pick a project you like and Fork it.

3. Create a Local Repo.

You can copy the project to your machine with the following command, using either the SSH or the HTTP address shown on the project page.

git clone git@github.com:cocowool/PHP-Daemon.git

4. Configure the source project address.

Once the clone is complete, there is by default a remote called “origin” that points to my fork on GitHub rather than to the original project. To get updates from the original project, we need to add another remote, named “upstream”.

git remote add upstream github.com/shaneharter…

git fetch upstream
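The origin/upstream arrangement can be rehearsed locally before trying it on GitHub. In the sketch below, the directories original, fork.git, and local are hypothetical stand-ins for the original project, the fork on GitHub, and the local clone; the file daemon.txt and the identity are likewise invented. After the fetch, the new commit from the original project is visible as upstream/master.

```shell
#!/bin/sh
# Sketch: a clone of a fork has origin pointing at the fork; adding a second
# remote named upstream makes the original project's new work available.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)
git init -q "$base/original"
cd "$base/original" || exit 1
git symbolic-ref HEAD refs/heads/master   # name the first branch master
echo v1 > daemon.txt
git add daemon.txt
git commit -q -m "original work"

git clone -q --bare "$base/original" "$base/fork.git"   # stands in for forking on GitHub
git clone -q "$base/fork.git" "$base/local"             # origin now points at the fork

echo v2 >> daemon.txt                     # the original project moves on
git commit -q -am "new upstream work"

cd "$base/local" || exit 1
git remote add upstream "$base/original"
git fetch -q upstream

git remote -v                             # origin (the fork) and upstream (the original)
behind=$(git rev-list --count HEAD..upstream/master)
echo "local master is $behind commit(s) behind upstream/master"
```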

5. What you can do next.

Push commits

A few tips

Mac OS X Lion ships with Git’s command line and graphical tools, although the graphical interface is rather poor.

On the Mac there is also gitk, a graphical tool for browsing history. It is written in Tcl/Tk and is basically a visual version of the git log command, with the git log options also available in gitk.

To see this, type /Developer/usr/bin/gitk in the project directory.

