A Layman’s Introduction to Version Control System | by Gaurav Goel | Jan, 2021



Playing with Git


“There is no such thing as Truth. Everyone has his own version of it”

Have you ever saved files with date-time stamps in their names? Most of us have done this to keep a version of an existing file. It is useful when we need to make changes but also want to keep a history of the original contents, so that we can go back to them at any point. You may also want to keep a note of why each change was made.

For example, I have a file in which I maintain my expenses. I can keep a separate copy for each day by adding the date to the file name.

This is a crude form of “version control”: we keep a backup of the original file and save changes to a new file. This kind of version maintenance is fine for small personal work such as a class project. However, on a software project where multiple people work together, you need a better mechanism for handling changes made to multiple files (source code files, config files, or any other kind of file) by multiple people. This is where a Version Control System (VCS) comes to the rescue. Version control software provides a mechanism to track changes in files over time.

By using a VCS, we can know what, when and by whom, changes were made.

Let’s look at the basic terminologies and actions that you do with any Version Control System:

1. A VCS will provide a database or a storage location where you can store your files. This is called a repository or repo. The files contained in the repo will be tracked by VCS. The computer where the repository is hosted is called the “server” while the computer which connects to the Repository is called the “client”. So, your computer will be the client machine while the computer where the Version Control System is running is the server.

2. On your machine or local computer, the place or folder where you keep your files and make changes is called the “Working Copy”.

3. In the repository, the primary place where you keep your file is called the “Trunk” or “master branch”.

4. When you upload your file to the repo for the first time, it’s called “Add”, i.e., you are asking the VCS to start tracking your file.

5. The VCS will assign a version number to your file. This is called “Revision”.

6. If you decide to make a change in the file which has been stored on the repo, you will “check out” that file. It means you are now downloading a file from the repo.

7. You can upload the file to the repository after making the change. This is called “Check-in”. Along with Check-in, you can provide the comments explaining why these changes are being done.

8. A VCS also provides a list of changes that have been done. This is called “Change Log”.

9. If a file or folder is copied for private use, it is called Branching.

10. If the changes from one file are applied to another, it’s called Merging or Patching.

11. If a change contradicts another change to the same file, it’s called a “Conflict”. The user has to “Resolve” the conflict before proceeding.

Any good version control system should provide the following features:

1. Backup and Restore

2. Track Changes

3. Branching

4. Merging

Let’s now discuss a VCS system — GIT

GIT is a very popular Version Control System created in 2005 by Linus Torvalds. It’s free, open-source software with a distributed architecture. Distributed architecture means that each person contributing to the repository has a full copy of the repository on their own machine. Because most operations then run locally, without contacting a server, they are really fast.

You can download GIT from the below link:

https://git-scm.com/downloads

Note that “SCM” in the URL stands for Source Code Management. It’s just another way of saying “Version Control” of source code files.

Before looking at the basic GIT functionalities, just remember that the purpose of any Version Control System (like GIT) is to track the following information about files:

What changes are made?

Who made the changes?

When were the changes made?

Once you have installed GIT, the very first task can be to create a new repository so that GIT can start tracking your files. To start this, you need to provide basic configuration information to GIT like who you are. This can be achieved in GIT as below:

Adding Configuration Information — GIT CONFIG
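As a minimal sketch, the configuration step could look like the commands below. The e-mail address and name are placeholders chosen for illustration, so substitute your own values.

```shell
# Tell Git who you are. "--global" stores the values in your per-user
# configuration, so they apply to every repository you work with.
# The e-mail address and name below are placeholders, not real values.
git config --global user.email "you@example.com"
git config --global user.name "Your Name"

# Read the values back to confirm they were saved
git config --global --get user.email
git config --global --get user.name
```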

The “--global” option of “git config” means that the given values of user.email and user.name will be used for all repositories that we work with.

Creating a new repository

The next step is to create a new repository. For this, we can create a new folder and create a GIT repository inside it using the “git init” command.
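A sketch of this step is shown here. Starting in a throwaway directory created with mktemp is an addition for safety, not part of the original walkthrough; in practice you would run the commands wherever you keep your projects.

```shell
# Start in a throwaway parent directory so nothing existing is touched
cd "$(mktemp -d)"

# Create the folder and turn it into a Git repository
mkdir first_repo
cd first_repo
git init

# List everything, including hidden entries, to see the new ".git" directory
ls -a
```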

first_repo is a folder under which we have created a new repository. Remember that the repository is the place where we will keep all the files that we want GIT to track.

The init command creates an empty git repository under the folder “first_repo”. If we check the contents of the first_repo folder, including hidden entries, we will see a directory named “.git”.

This is called the Git directory. It is the database that stores the changes and keeps the change history. There are a number of files and directories inside “.git”.

These directories and files should not be edited directly. Whenever a repository is cloned, it is this “.git” directory that is copied.

The area outside the git directory is called the “working tree”. It is the place where current or new files are kept. You have to start tracking a new file by using the “git add” command.

Adding new files to track

Currently, our working tree or working directory is empty. We will copy a file “expenses.xlsx” that we want GIT to track.

The content of this excel file is as follows:

We now have a file in the working tree. We will use the “git add” command to ask GIT to start tracking it.
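The staging step might look like the sketch below. It runs in a throwaway repository, and a small placeholder text file with made-up items stands in for the real spreadsheet, which is an assumption for illustration only.

```shell
# Throwaway repository for the demonstration
cd "$(mktemp -d)"
git init --quiet

# Placeholder text with made-up items stands in for the real spreadsheet
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx

# Ask Git to start tracking the file by adding it to the staging area
git add expenses.xlsx

# List the files currently recorded in the staging area (index)
git ls-files
```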

The “git add” command adds the file to something called the “staging area”. The staging area (also known as the index) is a file maintained by GIT that records which files and which changes are going to be part of the next commit. This is a very important concept. A file in GIT can be in any of the following 3 states:

“Modified” — It means that the file has been changed but has not been committed yet.

“Staged” — It means that the file has been added to the staging area. It will be committed.

“Committed” — It means that the file has been stored in the GIT database.

So a basic GIT workflow consists of 3 steps:

Step 1: You modify files in Working Tree

Step 2: You stage those files (or changes) which you want to be the part of the next commit

Step 3: You perform a commit that takes the files from Staging Area and stores them permanently in the Git repository.
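The three steps above can be sketched end to end as follows. The identity values, the file contents, and the throwaway directory are assumptions made for illustration; the short status codes in the comments are what “git status --short” reports at each stage.

```shell
# Throwaway repository with a placeholder identity for commits
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"
git config user.name "Your Name"

# Step 1: create or modify a file in the working tree
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git status --short    # "??" marks a file Git is not tracking yet

# Step 2: stage the file for the next commit
git add expenses.xlsx
git status --short    # "A " marks a newly staged file

# Step 3: commit the staged snapshot to the repository
git commit --quiet -m "Adding Expenses"
git status --short    # prints nothing: the working tree is clean
```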

You can check the status of your changes by the “git status” command

Checking status of files

“git status” is used to check the current status of files

It shows that our file “expenses.xlsx” is marked to be committed. To commit it, we will run the “git commit” command.

Committing the changes to the repo

“git commit” — On running this command, GIT opens a text editor where we can enter a commit message. During the installation of GIT software, you will get to select which editor you want GIT to use as default. I had selected Notepad++ and hence it is opened by default.

I added a line on top “Adding Expenses….”

On saving the message, our file gets committed to the repository.

We just committed our first file in GIT!!

Tracking of Files in GIT

You can think of GIT as a representation of your project. A project is nothing but a collection of some files (source code files, config files, image files, data files, etc). Each time we make a commit, GIT takes the snapshot of your project (i.e all the files) at that point in time. So whenever you modify any files, stage them and do a commit, GIT takes a snapshot. If you look at these snapshots, you look at the history of your project.

So let’s look at the status of our project by running the git status command:

This shows that there are no changes to commit and the working tree is clean. Let’s modify our file and do some changes.

I just added one more expense item “Tomatoes” and saved the file.

Let us run the git status command again.

Now it clearly tells us that expenses.xlsx has been modified and there are “Changes not staged for commit”

Let us run the “git add” command to stage the changes.

This marks our file to be committed in the next commit.

This time, we will pass the commit message directly on the command line by using the “-m” option of “git commit”.
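A sketch of committing with an inline message is given here. The commit messages, file contents, and identity are placeholders, and the first commit exists only so that the example has something to modify.

```shell
# Throwaway repository with one initial commit
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# Modify the file, stage the change, and commit with an inline message
printf 'Tomatoes,3\n' >> expenses.xlsx
git add expenses.xlsx
git commit -m "Added Tomatoes"            # hypothetical message
```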

We can check the logs of commits by issuing “git log” command.

The results clearly show what changes were made, by whom, and at what time. “HEAD” is the pointer to the branch that is currently checked out, so “HEAD -> master” means that we are on the master branch.

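The 40-character hexadecimal string “b3c9b9af054f02f09968beb8c6f54dc9eba6b659” is the hash value (checksum) that identifies the commit. GIT computes it from the commit’s contents and metadata, and it stores every object, including each version of every file, under such a hash value.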
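To look at the history and the commit hash yourself, something like the following sketch should work. It builds a throwaway repository with one placeholder commit, so the hash it prints will differ from the one quoted in this article.

```shell
# Throwaway repository with one commit
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# Show the history: hash, author, date, and message of each commit
git log

# Print only the full hash of the latest commit, then its length
commit_hash="$(git rev-parse HEAD)"
echo "$commit_hash"
echo "${#commit_hash}"
```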

Undoing Changes

There may be instances where we made some changes to a file but we want to undo them before committing. I added 1 more line to my expenses.xlsx file.

If we check the git status now, it will show that the file has been modified but the changes have not been committed.

Let’s say we want to “undo” these changes. For doing this, we can “checkout” the last committed snapshot from the git repository by using the “git checkout” command.
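Discarding an uncommitted change might look like the sketch below. The unwanted line and the file contents are made up for illustration, and the repository is a throwaway one.

```shell
# Throwaway repository with one commit
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# Make an unwanted change in the working tree
printf 'Oops,999\n' >> expenses.xlsx
git status --short    # " M" marks a modified file that is not staged

# Discard the change by restoring the last committed version of the file
git checkout -- expenses.xlsx
git status --short    # prints nothing: the change is gone
```

Note that the discarded edit cannot be recovered from GIT afterwards, because it was never committed.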

It will revert the changes that we had made. If I open the expenses.xlsx file, it will show the below:

What if you have already staged the changes? Can you undo that? Yes. We can use the “git reset” command to unstage them; the edits themselves stay in the working tree until they are discarded.
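A sketch of unstaging a change with “git reset” follows. The file contents and the unwanted line are placeholders, and the final “git checkout” is optional: it is only needed if the edit should also be thrown away.

```shell
# Throwaway repository with one commit
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# Make a change and stage it
printf 'Oops,999\n' >> expenses.xlsx
git add expenses.xlsx
git status --short    # "M " marks a staged modification

# Take the change out of the staging area; the file itself keeps the edit
git reset HEAD expenses.xlsx
git status --short    # " M" marks a modification that is no longer staged

# Optionally discard the edit from the working tree as well
git checkout -- expenses.xlsx
```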

What if you have already committed the changes? Can you revert them? Yes. We can use the “git revert” command, which creates a new commit that undoes the changes of an earlier commit.
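Reverting the most recent commit could be sketched like this. The two commits and their messages are placeholders; “--no-edit” accepts the default revert message instead of opening an editor.

```shell
# Throwaway repository with two commits
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"
printf 'Tomatoes,3\n' >> expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Added Tomatoes"

# Undo the latest commit by creating a new commit that reverses it
git revert --no-edit HEAD

# The history keeps both the original commit and the revert
git log --oneline
```

Because the revert is itself a commit, the history still shows what was done and undone.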

Branching and Merging

In a project where multiple people are working, it’s very common that someone would like to work on some changes while at the same time another person would like to work on some other changes. In such a case, these individuals can make individual copies of the project. Each person can work on their own copies (or branches) and finally there should be a way to merge and commit the changes done by these individuals to the main project.

When a new repository is created, its default branch is called “master” (newer GIT installations may be configured to name it “main” instead). When we committed the “expenses.xlsx” file, we did it in the master branch.

Let’s say we have another person also noting down the expenses. To test the branching features of GIT, we can make a new branch for this additional person.

“git branch” is used to display all the current branches.

Right now, we do not have any other branch except the master. We can create a new branch with the name “new-expenses”.

Now, this new person can switch to the “new-expenses” branch by using the “git checkout” command.

Note that the asterisk (*) sign now appears in front of the new-expenses branch.
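Listing, creating, and switching branches might look like the sketch below, run in a throwaway repository with one placeholder commit (a branch can only be created once at least one commit exists).

```shell
# Throwaway repository with one commit
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# List the branches; "*" marks the one currently checked out
git branch

# Create the new branch, then switch to it
git branch new-expenses
git checkout new-expenses

# List again: the asterisk has moved to new-expenses
git branch
```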

I have created a new file “expenses2.xlsx”. Let’s add it to the repository.

Let us check the log using “git log”

Note that now the HEAD is pointing to the new-expenses branch.

Now, let’s merge this new branch with master. To do that, you need to switch back to the master branch and use the “git merge” command
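The whole branch-and-merge round trip can be sketched as follows. The file contents and commit messages are placeholders, and the name of the default branch is read from GIT rather than assumed, since it may be “master” or “main” depending on the installation.

```shell
# Throwaway repository with one commit on the default branch
cd "$(mktemp -d)"
git init --quiet
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > expenses.xlsx
git add expenses.xlsx
git commit --quiet -m "Adding Expenses"

# Remember the default branch name ("master" or "main")
default_branch="$(git rev-parse --abbrev-ref HEAD)"

# Create the new branch and switch to it in one step
git checkout -b new-expenses

# Commit a second file on the new branch
printf 'Item,Amount\nBread,1\n' > expenses2.xlsx
git add expenses2.xlsx
git commit --quiet -m "Adding second expenses file"

# Switch back and merge the new branch into the default branch
git checkout "$default_branch"
git merge new-expenses

# Both files are now present on the default branch
ls
```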

GitHub is a web-based repository hosting service. So instead of setting up our own GIT server, we can use this service to share and access repositories on the web. Other members of our team can clone them and merge changes as required. GitLab and Bitbucket are similar web-based repository hosting services.

So basically, GitHub provides a free GIT server. You can host public and private repositories on it at no cost, and paid plans add further features.

Creating a repository on GitHub is super easy. You should have an account on https://github.com/

Once you create an account, you can start by creating a new repository from the GitHub user interface itself.

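You can clone the GitHub repository to your local machine by using the “git clone” command with the repository’s URL; it downloads a copy of the repository to your local machine. A public repository can be cloned without credentials. For a private one you have to authenticate, and since August 2021 GitHub accepts a personal access token or an SSH key rather than an account password for GIT operations.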
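Cloning can be tried without a network connection, as in this sketch. A local directory named hosted_repo stands in for the repository on GitHub, which is an assumption made so the example is self-contained; with a real GitHub repository you would pass its URL to “git clone” instead of the local path.

```shell
# Throwaway working area
work_dir="$(mktemp -d)"
cd "$work_dir"

# Stand-in for a repository hosted on GitHub (hypothetical name and content)
git init --quiet hosted_repo
git -C hosted_repo config user.email "you@example.com"
git -C hosted_repo config user.name "Your Name"
printf 'Item,Amount\nMilk,2\n' > hosted_repo/expenses.xlsx
git -C hosted_repo add expenses.xlsx
git -C hosted_repo commit --quiet -m "Adding Expenses"

# Clone it; the copy includes the files and the full history
git clone hosted_repo local_copy
git -C local_copy log --oneline
```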
