Git Guide
In this guide we are going to show you how to use Gitlab. However, before we get going setting things up, it's good to get an understanding of how the tool that powers Gitlab (and Github) works. So let's dive in and take a look.
So what is Git?
At its core, git is a system for storing and versioning your code, also known as a version control system (VCS). It was developed in 2005 by the creator of the Linux kernel, Linus Torvalds.
The system also allows developers to work on code simultaneously, as each developer checks out a local copy of the repository to make their changes on. These changes can then be committed and pushed back to the central repository, and other developers can pull these updates into their local copy. These are all terms you'll get familiar with as we go through this guide.
Developers can branch, fork and merge repositories, allowing flexible workflows. We'll cover these concepts in the documents and videos that follow.
Structure of a repo (Stages)
To understand all these terms, let's first look at a git repository and its stages. This is how code gets from the developer's working directory to being synchronised with the remote repository.
We can think of this in four main areas: your working directory, a staging area, the local repository and the remote repository. Whether the remote repository already exists determines whether you run a git init or a git clone command, and these commands kind of do what they sound like. Init sets up the necessary files to start tracking changes in a new repository on the local file system, whereas clone copies all the current content from the remote repository into your local repository and sets up your working directory too. If you want to get changes from the remote repository that someone else has uploaded, you can run git pull to bring those changes into your local repository and working directory.
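As a rough sketch of the difference between the two commands, the snippet below runs both in a scratch directory. The folder names (new-project, remote.git, my-copy) are made up for illustration, and a local bare repository stands in for a real remote such as one hosted on Gitlab.

```shell
set -eu
# Scratch directory so nothing outside it is touched (all names here are illustrative).
work=$(mktemp -d)
cd "$work"

# git init: start tracking a brand-new project on the local file system.
git init -q new-project
ls -a new-project              # the hidden .git directory holds the repository data

# Stand-in for a hosted remote: a bare repository in a local folder.
git init -q --bare remote.git

# git clone: copy an existing repository and set up a working directory for it.
git clone -q remote.git my-copy 2>/dev/null
git -C my-copy remote -v       # "origin" points back at the repository we cloned
```

With a real project you would replace remote.git with the URL shown on the repository's page.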
The staging area comes into play as an intermediate space that sits between the working directory and the local repository. A developer adds changes to the staging area and, if they are happy with the changes, commits them to the local repository. When they are ready, they can push those commits to the remote repository. Let's try and visualise this in the diagram below.
flowchart TD
A[Working Directory] -->|git add| B(Staging Area)
B --> |git reset| A
B --> |git commit| C(Local Repository)
C -->|git push| D[Remote Repository]
D --> |git fetch| C
D --> |git pull| A
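To make the arrows in the diagram concrete, here is a minimal sketch that walks one file through each stage. It assumes a throwaway local bare repository as the remote, and the file names and the "Example Dev" identity are placeholders so that it runs even before you have configured git.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"                 # stand-in remote repository
git clone -q "$work/remote.git" "$work/project" 2>/dev/null
cd "$work/project"
git symbolic-ref HEAD refs/heads/main                 # call the first branch "main"

echo "# demo" > README.md
git add README.md
git commit -q -m "Initial commit"

echo "<h1>Hello</h1>" > header.html
echo "scratch notes" > notes.txt
git add header.html notes.txt     # working directory -> staging area
git reset -q notes.txt            # take notes.txt back out of the staging area
git status --short                # "A" marks a staged file, "??" an untracked one
git commit -q -m "Add header"     # staging area -> local repository
git push -q origin main           # local repository -> remote repository
```

Note that git reset only unstages notes.txt here. The file itself stays in your working directory.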
Sync Issues
When working with other developers you're all going to be cloning and pulling the code from the remote repository to start with, and then committing and pushing your changes back. Now this is where it can get messy. Let's say there's a file called header.html in your repository, and you and another developer both change that file and push it back to the remote repository. This could end up in what's known as a merge conflict. Basically you have both created a different version of the file from the initial clone, so what happens now? Which one should be the one stored? Well, this is where git is really clever. It accepts the first person's commit and push without issues. The second person's push, however, is rejected because the remote now contains commits they don't have, and when they pull those commits git tries to merge the two versions. If both people changed the same lines, git reports a conflict and leaves a version of header.html in the working directory that marks both users' contributions. You can then resolve this conflict by creating a merged version of both of your files, or by accepting only their changes or only your changes.
The way to read a merge conflict is to look in the affected file(s) for the merge markers <<<<<<<, ======= and >>>>>>>, which highlight the conflicting sections. You decide which changes to keep, remove or modify, delete the markers, save the file, and then add and commit it to finish the merge.
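If you'd like to see a conflict safely before you hit one for real, the sketch below stages one in a scratch directory. The two clones named alice and bob are hypothetical developers, the remote is a local bare repository, and the identity is a placeholder. It is only one way a conflict can arise, not the only one.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# The first developer creates header.html and pushes it.
git clone -q remote.git alice 2>/dev/null
( cd alice && git symbolic-ref HEAD refs/heads/main \
  && echo "<h1>Welcome</h1>" > header.html && git add header.html \
  && git commit -q -m "Add header" && git push -q origin main )

# The second developer clones the same starting point.
git clone -q remote.git bob

# Both change the same line. The first developer pushes first.
( cd alice && echo "<h1>Welcome to the site</h1>" > header.html \
  && git commit -q -am "Reword heading" && git push -q origin main )

cd bob
echo "<h1>Hello there</h1>" > header.html
git commit -q -am "Friendlier heading"
git push -q origin main 2>/dev/null || echo "push rejected: the remote has commits we lack"

# Pulling merges the remote commits and stops at the conflict.
git pull --no-rebase origin main || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "Files in conflict: $conflicted"
cat header.html                   # shows the <<<<<<<, ======= and >>>>>>> markers

# Resolve by writing the version we want, then add, commit and push.
echo "<h1>Hello there, welcome to the site</h1>" > header.html
git add header.html
git commit -q -m "Merge remote changes to header.html"
git push -q origin main
```

The --no-rebase flag asks git pull to merge rather than rebase, which matches the behaviour described above.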
A good way to minimise this happening to you is to frequently run git pull in your working directory to keep up to date with others' work. However, whilst this works for small teams, larger teams will want to consider something more robust so they don't keep tripping over each other. This is where branches come in!
Branches
In a git repository you have a main branch. This is where you keep the current code, normally the current working code. When you develop new features, team members often work on these in parallel to each other and, you guessed it, they don't want to be dealing with loads of merge conflict issues. Luckily, there's a way to avoid this by using a feature called branches. A branch lets you take a point-in-time copy of the repository and work on the code, adding, committing and pushing files as you please, without breaking things for others. When you are ready to move your code into the main branch you do something called a merge. On Gitlab and Github you usually ask for this through a merge request (MR) or pull request (PR), which are the same thing under different names. What git does here is bring your changes into the main branch, and if there are any merge conflicts on files that already existed you'll get the options to handle them like before. You can create and switch between branches using the branch and checkout commands.
Let's try and visualise this in the diagram below, showing a main branch and two feature branches called A and B. These branches are taken at different times from the main branch. We can also see that Feature A gets merged back into the main branch. You'll also notice an ID next to each dot in the diagram. These are the UIDs (commit hashes) for the commits you make. These are super important because it means you can roll back to a previous version of the code by using that UID at any time!
gitGraph
commit
commit
branch feature-A
checkout feature-A
commit
commit
commit
checkout main
commit
branch feature-B
checkout feature-B
commit
commit
commit
checkout main
commit
commit
commit
merge feature-A
commit
commit
Using branches to manage your code is often referred to as the Pull Request Workflow. It's great for teams, as it shares knowledge and encourages code reviews from other team members, but that can introduce delays in getting your feature merged.
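Here is a small sketch of the branch, checkout and merge commands working together on a throwaway repository. The branch name feature-A mirrors the diagram, while the file names and the identity are placeholders chosen for illustration.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main     # call the first branch "main"

echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial commit on main"

git branch feature-A                      # create the branch at the current commit
git checkout -q feature-A                 # switch the working directory to it
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature A"

git checkout -q main                      # main has not been touched by the feature work
echo "version 2 on main" > app.txt
git commit -q -am "Update app on main"

git merge -q --no-edit feature-A          # bring the feature commits into main
git log --oneline --graph                 # short commit IDs appear beside each commit
```

Because the two branches touched different files, the merge completes without a conflict and git records a merge commit joining them.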
HEAD
The most recent commit on the currently checked-out branch is indicated by something referred to as HEAD. It is a pointer that can reference any UID within the repository, and when you make a new commit HEAD moves to it. This is how git knows what to compare your changes against when it builds a diff.
If you have a branch checked out, HEAD points to the latest commit on that branch.
You can also end up in something called a detached HEAD state, which is where HEAD points directly at a specific commit instead of at a branch, but we'll deal with that later.
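If you're curious where HEAD is pointing at any moment, the sketch below asks git directly. It uses a throwaway repository with placeholder file names and identity, and briefly checks out an older commit just to show what a detached HEAD looks like.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main

echo "first draft" > file.txt
git add file.txt
git commit -q -m "First"
echo "second, longer draft" > file.txt
git commit -q -am "Second"

git symbolic-ref --short HEAD       # the branch HEAD is attached to
git rev-parse --short HEAD          # the commit that branch currently points at

first=$(git rev-parse HEAD~1)       # UID of the commit before the latest one
git checkout -q "$first"            # checking out a commit directly detaches HEAD
state=$(git symbolic-ref -q HEAD || echo "detached")
echo "HEAD is: $state"

git checkout -q main                # re-attach HEAD to the branch
```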
Tags
You can also take those UIDs and tag them. You usually use this feature to mark a particular milestone on the main branch, such as the release of v1 or v1.1 and so on. A tag marks a particular UID, meaning that someone is able to check out the code that went into a v1 package, for example. We can visualise this with the diagram below.
gitGraph
commit
commit tag: "RC_1"
commit
commit tag: "v1"
commit
commit
commit tag: "v1.1"
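A minimal sketch of tagging in practice follows, reusing the RC_1 and v1 names from the diagram. The repository, file contents and identity are placeholders for illustration.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main

echo "release candidate" > app.txt
git add app.txt
git commit -q -m "Prepare release"
git tag RC_1                              # lightweight tag on the current commit

echo "version 1" > app.txt
git commit -q -am "Release v1"
git tag -a v1 -m "Version 1 release"      # annotated tag with its own message

echo "work in progress" > app.txt
git commit -q -am "Start on next version"

git tag --list                            # show all tags
git checkout -q v1                        # check out exactly what went into v1
cat app.txt
git checkout -q main                      # back to the latest code
```

Tags are local until you push them, for example with git push origin v1.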
Forks
Forks are kind of like branches, but instead of just making a slight deviation from main you actually get a full working repository of your own. Making commits and pushes will only ever affect your own repo, however there is a tie back to the original repository that allows you to pull updates from upstream. Developers can then make merge/pull requests back upstream to the main project. This is particularly useful if you want to make contributions to some code that doesn't belong to your team, so you can't create a branch on the main repo. This is known as the Forking Workflow.
There are other workflows too, such as the Gitflow Workflow, which involves having several long-lived branches such as main, develop and release branches. You can also completely ignore all these features and do something called Trunk-Based Development, where everyone commits to main. This is good for rapid iteration but can give you merge conflict nightmares!
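The tie back to the original repository is usually just a second remote, conventionally named upstream. The sketch below imitates that with local bare repositories: upstream.git stands in for the original project, my-fork.git for your fork, and maintainer is a hypothetical clone owned by the project's maintainers. On Gitlab or Github the fork itself is created through the web interface rather than with git clone --bare.

```shell
set -eu
# Placeholder identity so commits work before git is configured (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# Stand-in for the original project.
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/main
git clone -q upstream.git maintainer 2>/dev/null
( cd maintainer && git symbolic-ref HEAD refs/heads/main \
  && echo "version 1" > README.md && git add README.md \
  && git commit -q -m "Initial commit" && git push -q origin main )

# Stand-in for pressing "Fork": a full copy of the repository that you own.
git clone -q --bare upstream.git my-fork.git

# Work against your fork, with a second remote pointing at the original.
git clone -q my-fork.git my-work
cd my-work
git remote add upstream "$work/upstream.git"
git remote -v

# The original project moves on...
( cd ../maintainer && echo "version 2 with fixes" > README.md \
  && git commit -q -am "Update README" && git push -q origin main )

# ...so pull those updates from upstream, then push them to your fork.
git pull -q --no-rebase upstream main
git push -q origin main
```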
Getting Set Up
Right, that's enough theory, let's get you set up! The commands I'm about to show you will work on macOS, Linux or Windows. You first need to make sure you have git installed on your system, which you can get from here. On macOS and Linux you can then use your native terminal to run the git command, and on Windows you'll want to open the Git Bash program.
Username and Email
When you pull a private repository or push to a repository you have two options: HTTPS or SSH. You should use SSH where possible. HTTPS can be used when you are cloning someone else's work to use but don't intend to push back to that repository.
Now hopefully you'll have a login to Gitlab or Github. If not, go and set one up now.
The first thing you'll need to do is set up your name, so git can show others who committed the code. Simply run the following in your terminal.
git config --global user.name "Ric Harvey"
Now let's go ahead and set up your email address, which git attaches to every commit you make so you can be identified as the author.
git config --global user.email ric@rics-superdomain.com
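If you want to double check what git has stored, the sketch below reads the values back. So that it can be run without overwriting any settings you already have, it points HOME at a throwaway directory first, which is an assumption of this example only. On your own machine you would skip the first two commands.

```shell
set -eu
# Throwaway home directory so this sketch does not touch your real ~/.gitconfig.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

git config --global user.name "Ric Harvey"
git config --global user.email ric@rics-superdomain.com

git config --global --get user.name      # prints the stored name
git config --global --list               # prints every global setting
cat "$HOME/.gitconfig"                   # the plain-text file behind those settings
```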
Right, now we need to generate you an SSH key. These come in two parts, private and public. You are going to want to protect your private key and not let anyone else get a copy of it, as this is your key to push code to Gitlab or your repository.
In your terminal again run:
ssh-keygen -t rsa -b 2048 -C "<comment>"
Press Enter at the next prompt to accept the default file location. The output will look like this:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/user/.ssh/id_rsa):
Now enter a passphrase for your key. Alternatively you can leave this blank, but best practice says you should protect it!
Enter passphrase (empty for no passphrase):
Enter same passphrase again: