Guidance for the item(s) below:
Given this is a first course in SE, tradition demands that we start by defining the subject. However, let's not spend a lot of time going through lengthy/formal definitions of SE. Instead, let's look at an extract from the very first chapter of a very famous SE book, with the aim of providing some inspiration, but also an appreciation of the challenges ahead.
The following description of the Joys of the Programming Craft was taken (and emphasis added) from Chapter 1 of the famous book The Mythical Man-Month, by Frederick P. Brooks.
Why is programming fun? What delights may its practitioner expect as his reward?
First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God's delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.
Second is the pleasure of making things that are useful to other people. Deep within, you want others to use your work and to find it helpful. In this respect the programming system is not essentially different from the child's first clay pencil holder "for Daddy's office."
Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.
Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.
Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by the exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures....
Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.
Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities you have in common with all men.
Not all is delight, however, and knowing the inherent woes makes it easier to bear them when they appear.
First, one must perform perfectly. The computer resembles the magic of legend in this respect, too. If one character, one pause, of the incantation is not strictly in proper form, the magic doesn't work. Human beings are not accustomed to being perfect, and few areas of human activity demand it. Adjusting to the requirement for perfection is, I think, the most difficult part of learning to program.
Next, other people set one's objectives, provide one's resources, and furnish one's information. One rarely controls the circumstances of his work, or even its goal. In management terms, one's authority is not sufficient for his responsibility. It seems that in all fields, however, the jobs where things get done never have formal authority commensurate with responsibility. In practice, actual (as opposed to formal) authority is acquired from the very momentum of accomplishment.
The dependence upon others has a particular case that is especially painful for the system programmer. He depends upon other people's programs. These are often maldesigned, poorly implemented, incompletely delivered (no source code or test cases), and poorly documented. So he must spend hours studying and fixing things that in an ideal world would be complete, available, and usable.
The next woe is that designing grand concepts is fun; finding nitty little bugs is just work. With any creative activity come dreary hours of tedious, painstaking labor, and programming is no exception.
Next, one finds that debugging has a linear convergence, or worse, where one somehow expects a quadratic sort of approach to the end. So testing drags on and on, the last difficult bugs taking more time to find than the first.
The last woe, and sometimes the last straw, is that the product over which one has labored so long appears to be obsolete upon (or before) completion. Already colleagues and competitors are in hot pursuit of new and better ideas. Already the displacement of one's thought-child is not only conceived, but scheduled.
This always seems worse than it really is. The new and better product is generally not available when one completes his own; it is only talked about. It, too, will require months of development. The real tiger is never a match for the paper one, unless actual use is wanted. Then the virtues of reality have a satisfaction all their own.
Of course the technological base on which one builds is always advancing. As soon as one freezes a design, it becomes obsolete in terms of its concepts. But implementation of real products demands phasing and quantizing. The obsolescence of an implementation must be measured against other existing implementations, not against unrealized concepts. The challenge and the mission are to find real solutions to real problems on actual schedules with available resources.
This then is programming, both a tar pit in which many efforts have floundered and a creative activity with joys and woes all its own. For many, the joys far outweigh the woes....
Guidance for the item(s) below:
Now, let's switch our focus to the project management aspect of SE.
Broadly speaking, there are two approaches to doing a software project. Those two approaches are also highly relevant to the way this course is run, and how it is different from most SE courses elsewhere.
Let's learn about those two approaches early so that we can better understand how this course works.
Software development goes through different stages such as requirements, analysis, design, implementation and testing. These stages are collectively known as the software development lifecycle (SDLC). There are several approaches, known as software development lifecycle models (also called software process models), that describe different ways to go through the SDLC. Each process model prescribes a 'roadmap' for the software developers to manage the development effort. The roadmap describes the aims of the development stages, the outcome of each stage, and the workflow i.e., the relationship between stages.
The sequential model, also called the waterfall model, views software development as a linear process, in which the project is seen as progressing through the development stages. The name waterfall stems from how the model is drawn to look like a waterfall (see below).
When one stage of the process is completed, it produces some artifacts to be used in the next stage. For example, the requirements stage produces a comprehensive list of requirements, to be used in the design phase.
A strict sequential model project moves only in the forward direction i.e., each stage is completed before starting the next. For example, once the requirements stage is over, there is no provision for revising the requirements later.
This model can work well for a project that produces software to solve a well-understood problem, in which case the requirements can remain stable and the effort can be estimated accurately. Furthermore, as each stage has a well-defined outcome, it is easy to track the progress of the project because one can gauge the project progress by monitoring which stage the project is in.
However, real-world projects often tackle problems that are not well-understood at the beginning, making them unsuitable for this model. For example, target users of a software product may not be able to state their requirements accurately at the start of the project, if they have not used a similar product before.
The iterative model advocates producing the software by going through several iterations. Each of the iterations could potentially go through all the stages of the SDLC, from requirements gathering to deployment.
Each iteration produces a new version of the product, building upon the version produced in the previous iteration. Feedback from each iteration is factored into the subsequent iterations. For example, if an implementation task took longer than expected, the effort estimate for similar tasks in future iterations can be adjusted accordingly. Similarly, if a feature introduced in the current iteration was not well-received by target users, it can be removed or tweaked in the next iteration.
The iterative model can follow a breadth-first or a depth-first approach.
Taking a Minesweeper game as an example,
A project can be done as a mixture of breadth-first and depth-first iterations i.e., an iteration can contain some breadth-first work as well as some depth-first work, or, some iterations can be breadth-first while others are depth-first.
Follow up notes for the item(s) above:
Scanning a TLDR version of a topic: As mentioned on the 'Using this Website' page, the more important layer of information is given in bold text. For example, you can quickly scan the essential points of a topic by reading the bold text only (this could be useful when you want to quickly recap a previous topic, or to get an idea of what a topic covers without reading all the details).
Guidance for the item(s) below:
Next, let's resume our Git Learning Trial, covering a few more tours. The first two focus on working with GitHub, while the other two focus on getting more out of the Git revision history.
Target Usage: To back up a Git repository on a cloud-based Git service such as GitHub.
Motivation: One of several benefits of maintaining a copy of a repo on a cloud server is that it acts as a safety net (e.g., against the folder becoming inaccessible due to a hardware fault).
Lesson plan:
T2L1. Remote Repositories
T2L2. Preparing to use GitHub
T2L3. Creating a Repo on GitHub
T2L4. Linking a Local Repo With a Remote Repo
T2L5. Updating the Remote Repo
T2L6. Omitting Files from Revision Control
To back up your Git repo on the cloud, you’ll need to use a remote repository service, such as GitHub.
A repo you have on your computer is called a local repo. A remote repo is a repo hosted on a remote computer and allows remote access. Some use cases for remote repositories:
It is possible to set up a Git remote repo on your own server, but an easier option is to use a remote repo hosting service such as GitHub.
To use GitHub, you need to sign up for an account, and configure related tools/settings first.
GitHub is a web-based service that hosts Git repositories and adds collaboration features on top of Git. Two other similar platforms are GitLab and Bitbucket. While Git manages version control locally, such platforms provide additional features such as shared access to repositories, issue tracking, code reviews, and permission controls. They are widely used in software development projects, for both open-source software (OSS) and closed-source software projects.
On GitHub, a Git repo can be put in one of two spaces:
Every GitHub user must have a user account, even if they primarily work within an organisation.
PREPARATION: Create a GitHub account
Create a personal GitHub account as described in GitHub Docs → Creating an account on GitHub, if you don't have one yet.
Choose a sensible GitHub username as you are likely to use it for years to come in professional contexts e.g., in job applications.
[Optional, but recommended] Set up your GitHub profile, as explained in GitHub Docs → Setting up your profile.
Before you can interact with GitHub from your local Git client, you need to set up authentication. In the past, you could simply enter your GitHub username and password, but GitHub no longer accepts passwords for Git operations. Instead, you’ll use a more secure method — such as a Personal Access Token (PAT) or SSH keys — to prove your identity.
A Personal Access Token (PAT) is essentially a long, random string that acts like a password, but it can be scoped to specific permissions (e.g., read-only or full access) and revoked at any time. This makes it more secure and flexible than a traditional password.
Git supports two main protocols for communicating with GitHub: HTTPS and SSH.
PREPARATION: Set up authentication with GitHub
Set up your computer's GitHub authentication, as described in the se-edu guide Setting up GitHub Authentication.
GitHub associates a commit to a user based on the email address in the commit metadata. When you push a commit, GitHub checks if the email matches a verified email on a GitHub account. If it does, the commit is shown as authored by that user. If the email doesn’t match any account, the commit is still accepted but won’t be linked to any profile.
GitHub provides a no-reply email (e.g., 12345678+username@users.noreply.github.com) that you can use as your Git user.email to hide your real email while still associating commits with your GitHub account.
PREPARATION: [Optional] Configure user.email to use the no-reply email from GitHub
If you prefer not to include your real email address in commits, you can do the following:
Find your no-reply email provided by GitHub: Navigate to the email settings of your GitHub account and select the option to Keep my email address private. The no-reply address will then be displayed, typically in the format ID+USERNAME@users.noreply.github.com.
Update your user.email with that email address e.g.,
git config --global user.email "12345678+username@users.noreply.github.com"
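To confirm which email address Git actually records, you can inspect the author email stored in a commit. The following is a minimal sketch, assuming a throwaway repo in a temporary folder and the example no-reply address used above; the user name is a made-up placeholder, and the setting is applied per-repo (without --global) so your real configuration is left alone.

```shell
# Throwaway repo in a temporary folder (hypothetical location).
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Repo-local settings (no --global), so your real configuration is untouched.
git config user.name "Example User"   # placeholder name
git config user.email "12345678+username@users.noreply.github.com"

echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

# %ae prints the author email stored in the commit metadata.
recorded_email="$(git log -1 --format=%ae)"
echo "$recorded_email"
```

The printed address is the one GitHub would try to match against the verified emails of an account when the commit is pushed.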
GitHub offers its own clients to make working with GitHub more convenient.
The GitHub CLI (gh) brings GitHub-specific commands to your terminal, letting you perform operations on GitHub from your command line.
If you are using Git-Mastery exercises (strongly recommended), you need to install and configure GitHub CLI because it is needed by Git-Mastery exercises involving GitHub.
PREPARATION: Set up GitHub CLI
1. Install GitHub CLI, following the step for your operating system:
Windows: Download and run the installer from the GitHub CLI releases page. This is the file named GitHub CLI {version} windows {chip variant} installer.
macOS: Install GitHub CLI using Homebrew:
brew install gh
Linux: Install GitHub CLI, as explained in the GitHub CLI Linux installation guide for your distribution.
2. Authenticate yourself to your GitHub account:
gh auth login
When prompted, choose the protocol (i.e., HTTPS or SSH) you used previously to set up your GitHub authentication.
3. Verify the setup by checking the status of your GitHub CLI with your GitHub account.
gh auth status
You should see confirmation that you’re logged in.
4. Verify that GitHub and GitHub CLI are set up for Git-Mastery:
gitmastery check github
5. [Optional, Recommended] Ask Git-Mastery to switch on the 'progress sync' feature.
# cd into the gitmastery-exercises folder first
gitmastery progress sync on
What happens when you switch on the Git-Mastery 'progress sync' feature?
The first step of backing up a local repo on GitHub: create an empty repository on GitHub.
You can create a remote repository based on an existing local repository, to serve as a remote copy of your local repo. For example, suppose you created a local repo and worked with it for a while, but now you want to upload it onto GitHub. The first step is to create an empty repository on GitHub.
1 Log in to your GitHub account and choose to create a new repo.
2 In the next screen, provide a name for your repo. Refer to the screenshot below for guidance on how to provide the required information.
Click the Create repository button to create the new repository.
If you enable any of the three Add _____ options shown above, GitHub will not only create a repo, but will also initialise it with some initial content. That is not what we want here. To create an empty remote repo, keep those options disabled.
3 Note the URL of the repo. It will be of the form https://github.com/{your_user_name}/{repo_name}.git
e.g., https://github.com/johndoe/foobar.git (note the .git at the end)
done!
EXERCISE: remote-control
The second step of backing up a local repo on GitHub: link the local repo with the remote repo on GitHub.
A Git remote is a reference to a repository hosted elsewhere, usually on a server like GitHub, GitLab, or Bitbucket. It allows your local Git repo to communicate with another remote copy — for example, to upload locally-created commits that are missing in the remote copy.
By adding a remote, you give the local repo the details of a remote repo it can communicate with, for example, where the repo exists and what name to use to refer to it.
The URL you use to connect to a remote repo depends on the protocol — HTTPS or SSH:
HTTPS: the URL starts with https://github.com/ (for GitHub users). e.g., https://github.com/username/repo-name.git
SSH: the URL starts with git@github.com: e.g., git@github.com:username/repo-name.git
A Git repo can have multiple remotes. You simply need to specify different names for each remote (e.g., upstream, central, production, other-backup ...).
Add the empty remote repo you created on GitHub as a remote of a local repo you have.
1 In a terminal, navigate to the folder containing the local repo things you created earlier.
2 List the current remotes using the git remote -v command, for a sanity check. No output is expected if there are no remotes yet.
3 Add a new remote repo using the git remote add <remote-name> <remote-url> command.
i.e., if using HTTPS, git remote add origin https://github.com/{YOUR-GITHUB-USERNAME}/things.git
e.g.,
git remote add origin https://github.com/johndoe/things.git # using HTTPS
git remote add origin git@github.com:johndoe/things.git # using SSH
4 List the remotes again to verify the new remote was added.
git remote -v
origin https://github.com/johndoe/things.git (fetch)
origin https://github.com/johndoe/things.git (push)
The same remote will be listed twice, to show that you can do two operations (fetch and push) using this remote. You can ignore that for now. The important thing is the remote you added is being listed.
1 Open the local repo in Sourcetree.
2 Open the dialog for adding a remote, as follows:
Choose Repository → Repository Settings menu option.
Choose Repository → Repository Settings... → Choose Remotes tab.
3 Add a new remote to the repo with the following values.
Remote name: the name you want to assign to the remote repo i.e., origin
URL/path: the URL of your remote repo
https://github.com/{YOUR-GITHUB-USERNAME}/things.git
Username: your GitHub username
4 Verify the remote was added by going to Repository → Repository Settings again.
5 Add another remote, to verify that a repo can have multiple remotes. You can use any name (e.g., backup) and any URL for this.
done!
EXERCISE: link-me
DETOUR: Managing Details of a Remote
To change the URL of a remote (e.g., origin), use git remote set-url <remote-name> <new-url> e.g.,
git remote set-url origin https://github.com/user/repo.git
To rename a remote, use git remote rename <old-name> <new-name> e.g.,
git remote rename origin upstream
To delete a remote from your Git repository, use git remote remove <remote-name> e.g.,
git remote remove origin
To check the current remotes and their URLs, use:
git remote -v
The third step of backing up a local repo on GitHub: push a copy of the local repo to the remote repo.
You can push content of one repository to another, usually from your local repo to a remote repo. Pushing transfers recorded Git history (such as past commits), but it does not transfer unstaged changes or untracked files.
You can configure Git to track a pairing between a local branch and a remote branch, so in future you can push from the same local branch to the corresponding remote branch without needing to specify them again. For example, you can set your local master branch to track the master branch on the remote repo origin i.e., local master branch will track the branch origin/master.
In the revision graph above, you see a new type of ref (origin/master). This is a remote-tracking branch ref that represents the state of a corresponding branch in a remote repository (if you previously set up the branch to 'track' a remote branch). In this example, the master branch in the remote origin is also at the commit C3 (which means you have not created new commits after you pushed to the remote).
If you now create a new commit C4, the state of the revision graph will be as follows:
Explanation: When you create C4, the current branch master moves to C4, and HEAD moves along with it. However, the master branch in the remote origin remains at C3 (because you have not pushed C4 yet). That is, the remote-tracking branch origin/master is one commit behind the local branch master (or, the local branch is one commit ahead). The origin/master ref will move to C4 only after you push your local branch to the remote again.
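The lag between master and origin/master described above can be reproduced without GitHub. The following is a minimal sketch, assuming a local bare repository stands in for the remote; the folder names and the identity values are made up for illustration. It counts the commits that are on master but not yet on origin/master, before and after a push.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"      # stand-in for the repo on GitHub
git init -q "$work/things"
cd "$work/things"
git symbolic-ref HEAD refs/heads/master    # name the first branch master, as in the text
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

git remote add origin "$work/remote.git"
git push -q -u origin master               # origin/master now matches master

echo "figs" >> fruits.txt
git commit -q -am "Add figs to fruits.txt" # master moves; origin/master stays put

# Number of commits reachable from master but not from origin/master
ahead="$(git rev-list --count origin/master..master)"
echo "ahead by $ahead"

git push -q                                # tracking lets a plain push find the target
after_push="$(git rev-list --count origin/master..master)"
echo "ahead by $after_push"
```

The first count should be 1 and the second 0, matching the explanation that origin/master catches up only after a push.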
Preparation: Use a local repo that is connected to an empty remote repo e.g., the things repo from previous hands-on practicals.
1 Push the master branch to the remote. Also instruct Git to track this branch pair.
Use the git push -u <remote-repo-name> <local-branch-name> command to push the commits to a remote repository.
git push -u origin master
Explanation:
push: the Git sub-command that pushes the current local repo content to a remote repo
origin: name of the remote
master: branch to push
-u (or --set-upstream): the flag that tells Git to record that the local master branch tracks the origin/master branch
Click the Push button on the buttons ribbon at the top.
In the next dialog, ensure the settings are as follows, ensure the Track option is selected, and click the Push button on the dialog.
2 Observe the remote-tracking branch origin/master is now pointing at the same commit as the master branch.
Use the git log --oneline --graph command to see the revision graph.
* f761ea6 (HEAD -> master, origin/master) Add colours.txt, shapes.txt
* 2bedace Add figs to fruits.txt
* d5f91de Add fruits.txt
Click History to see the revision graph.
The HEAD ref may not be shown -- it is implied that the HEAD ref points to the same commit as the currently active branch ref.
If the remote-tracking branch ref (origin/master) is not showing up, you may need to enable the Show Remote Branches option.
done!
The push command can be used repeatedly to send further updates to another repo e.g., to update the remote with commits you created since you pushed the first time.
Target: Add a few more commits to the same local repo, and push those commits to the remote repo.
1 Commit some changes in your local repo.
Use the git commit command to create commits, as you did before.
Optionally, you can run the git status command, which should confirm that your local branch is 'ahead' by one commit (i.e., the local branch has commits that are not present in the corresponding branch in the remote repo).
git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working tree clean
You can also use the git log --oneline --graph command to see where the branch refs are. Note how the remote-tracking branch origin/master is one commit behind the local master.
e60deae (HEAD -> master) Update fruits list
f761ea6 (origin/master) Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt
Create commits as you did before.
Before pushing the new commit, Sourcetree will indicate that your local branch is 'ahead' by one commit (i.e., the local branch has one new commit that is not in the corresponding branch in the remote repo).
2 Push the new commits to the remote repo on GitHub.
To push the newer commit(s) to the remote, any of the following commands should work:
git push origin master
git push origin (Git will assume you are pushing the master branch)
git push (Git will assume you are pushing to the remote origin and to the branch master i.e., origin/master)
After pushing, the revision graph should look something like the following (note how both local and remote-tracking branch refs are pointing to the same commit again).
e60deae (HEAD -> master, origin/master) Update fruits list
f761ea6 Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt
To push, click the Push button on the top buttons ribbon, ensure the settings are as follows in the next dialog, and click the Push button on the dialog.
After pushing the new commit to the remote, the remote-tracking branch ref should move to the new commit:
done!
Note that you can push between two repos only if those repos have a shared history among them (i.e., one should have been created by copying the other).
EXERCISE: push-over
DETOUR: Pushing to Multiple Repos
You can push to any number of repos, as long as the target repos and your repo have a shared history.
1 Add the target repo as a remote, giving it a suitable name (e.g., upstream, central, production, backup ...), if you haven't done so already.
2 Push to the target repo e.g., git push backup master
Git allows you to specify which files should be omitted from revision control.
You can specify which files Git should ignore from revision control. While you can always omit files from revision control simply by not staging them, having an 'ignore-list' is more convenient, especially if there are files inside the working folder that are not suitable for revision control (e.g., temporary log files) or files you want to avoid accidentally including in a commit (e.g., files containing confidential information).
A repo-specific ignore-list of files can be specified in a .gitignore file, stored in the root of the repo folder.
The .gitignore file itself can be either revision controlled or ignored.
To revision-control it (e.g., so that you can track how the .gitignore file changes over time), simply commit it as you would commit any other file.
To ignore it, add its name to the .gitignore file itself.
The .gitignore file supports file patterns e.g., adding temp/*.tmp to the .gitignore file prevents Git from tracking any .tmp files in the temp directory.
SIDEBAR: .gitignore File Syntax
Blank lines: Ignored and can be used for spacing.
Comments: Begin with # (lines starting with # are ignored). A # starts a comment only at the beginning of a line, so comments must be on their own lines, not after a pattern.
# This is a comment
Write the name or pattern of files/directories to ignore.
# Ignores a file named log.txt
log.txt
Wildcards:
* matches any number of characters, except / (i.e., for matching a string within a single directory level):
# Ignores all .tmp files in abc directory
abc/*.tmp
** matches any number of characters (including /)
# Ignores all foo.tmp files in any directory
**/foo.tmp
? matches a single character
# Ignores config1.yml, configA.yml, etc.
config?.yml
[abc] matches a single character (a, b, or c)
# Ignores file1.txt, file2.txt, file3.txt
file[123].txt
Directories:
End a pattern with / to match directories.
# Ignores the logs directory
logs/
Patterns without / match files/folders recursively.
# Ignores all .bak files anywhere
*.bak
Patterns with a / at the beginning or in the middle are relative to the .gitignore location.
# Only ignores secret.txt in the root directory
/secret.txt
Negation: Use ! at the start of a line to not ignore something.
# Ignores all .log files
*.log
# Except important.log
!important.log
Example:
# Ignore all log files
*.log
# Ignore node_modules folder
node_modules/
# Don’t ignore main.log
!main.log
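One way to check how Git interprets such patterns is to ask it which untracked files it still sees. The following is a minimal sketch, assuming a throwaway repo that contains the example .gitignore above plus a few made-up files (app.log, main.log, notes.txt, node_modules/lib.js).

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# The example .gitignore from above; the quoted 'EOF' keeps the text verbatim.
cat > .gitignore <<'EOF'
# Ignore all log files
*.log
# Ignore node_modules folder
node_modules/
# Don’t ignore main.log
!main.log
EOF

# Made-up files to test the patterns against.
mkdir node_modules
touch app.log main.log notes.txt node_modules/lib.js

# Untracked files that are NOT ignored
visible="$(git ls-files --others --exclude-standard)"
echo "$visible"

# Untracked files that ARE ignored
ignored="$(git ls-files --others --ignored --exclude-standard)"
echo "$ignored"
```

The first list should contain .gitignore, main.log and notes.txt, while app.log and node_modules/lib.js should appear only in the second: the negation rule rescues main.log from the *.log pattern.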
1 Add a file into your repo's working folder that you presumably do not want to revision-control e.g., a file named temp.txt. Observe how Git has detected the new file.
Add a few other files with .tmp extension.
2 Configure Git to ignore those files:
Create a file named .gitignore in the working directory root and add the text temp.txt into it.
echo "temp.txt" >> .gitignore
temp.txt
Observe how temp.txt is no longer detected as 'untracked' by running the git status command (but now it will detect the .gitignore file as 'untracked').
Update the .gitignore file as follows:
temp.txt
*.tmp
Observe how .tmp files are no longer detected as 'untracked' by running the git status command.
The file should be currently listed under Unstaged files. Right-click it and choose Ignore... Then choose Ignore exact filename(s) and click OK.
Also take note of other options available e.g., Ignore all files with this extension etc. They may be useful in future.
Note how temp.txt is no longer listed under Unstaged files. Observe that a file named .gitignore has been created in the working directory root and has the following line in it. This new file is now listed under Unstaged files.
temp.txt
Right-click on any of the .tmp files you added, and choose Ignore... as you did previously. This time, choose the option Ignore files with this extension.
Note how .tmp files are no longer shown as unstaged files, and the .gitignore file has been updated as given below:
temp.txt
*.tmp
3 Optionally, stage and commit the .gitignore file.
done!
Files recommended to be omitted from version control (examples: *.class, *.jar, *.exe, .idea/)
EXERCISE: ignoring-somethings
DETOUR: Ignoring Previously-Tracked Files
Adding a file to the .gitignore file is not enough if the file was already being tracked by Git in previous commits. In such cases, you need to do both of the following:
1 Untrack the file(s) using the git rm --cached <file(s)> command e.g.,
git rm --cached data/ic.txt
2 Add the file(s) to the .gitignore file, as usual.
At this point: You should now be able to create a copy of your repo on GitHub, and keep it updated as you add more commits to your local repo. If something goes wrong with your local repo (e.g., disk crash), you can now recover the repo using the remote repo (this tour did not cover how exactly you can do that -- it will be covered in a future tour).
What's next: Tour 3: Working Off a Remote Repo
Target Usage: To work with an existing remote repository.
Motivation: Often, you will need to start with an existing remote repository. In such cases, you may have to create your own copies of that repository, and keep those copies updated when more changes appear in the remote repository.
Lesson plan:
T3L1. Duplicating a Remote Repo on the Cloud
T3L2. Creating a Local Copy of a Repo
T3L3. Downloading Data Into a Local Repo
GitHub allows you to create a remote copy of another remote repo, called forking.
A fork is a copy of a remote repository created on the same hosting service such as GitHub, GitLab, or Bitbucket. On GitHub, you can fork a repository from another user or organisation into your own space (i.e., your user account or an organisation you have sufficient access to). Forking is particularly useful if you want to experiment with a repo but don’t have write permissions to the original -- you can fork it and work on your own remote copy without affecting the original repository.
Preparation: Create a GitHub account if you don't have one yet.
1 Go to the GitHub repo you want to fork e.g., samplerepo-things
2 Click on the Fork button in the top-right corner. In the next step, untick the [ ] Copy the master branch only option, so that you get copies of other branches (if any) in the repo.
done!
Forking is not a Git feature, but a feature provided by hosted Git services like GitHub, GitLab, or Bitbucket.
GitHub does not allow you to fork the same repo more than once to the same destination. If you want to re-fork, you need to delete the previous fork.
EXERCISE: fork-repo
The next step is to create a local copy of the remote repo, by cloning the remote repo.
You can clone a repository to create a full copy of it on your computer. This copy includes the entire revision history, branches, and files of the original, so it behaves just like the original repository. For example, you can clone a repository from a hosting service like GitHub to your computer, giving you a complete local version to work with.
Cloning a repo automatically creates a remote named origin which points to the repo you cloned from.
The repo you cloned from is often referred to as the upstream repo.
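The behaviour described above can be tried without a GitHub account. The following is a minimal sketch in which a throwaway local repo stands in for the remote repo; the folder names upstream-demo and my-copy, and the Demo User identity, are made up for illustration.

```shell
# Sketch only: a throwaway local repo stands in for a repo on GitHub.
set -e
work=$(mktemp -d)
cd "$work"

# Create the repo to clone from, with one commit.
git init -q upstream-demo
cd upstream-demo
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"
echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

# Clone it into a folder with a name of our choosing.
cd "$work"
git clone -q upstream-demo my-copy
cd my-copy

remote_name=$(git remote)                  # names of the remotes configured in the clone
commit_count=$(git rev-list --count HEAD)  # number of commits copied over by the clone
echo "remote: $remote_name, commits: $commit_count"
```

The clone should report a single remote named origin, and the commit history should have come over along with the files.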
1 Clone the remote repo to your computer. For example, you can clone the samplerepo-things repo, or the fork you created from it in a previous lesson.
Note that the URL of the GitHub project is different from the URL you need to clone a repo in that GitHub project. e.g.
https://github.com/se-edu/samplerepo-things # GitHub project URL
https://github.com/se-edu/samplerepo-things.git # the repo URL
You can use the git clone <repository-url> [directory-name] command to clone a repo.
<repository-url>: The URL of the remote repository you want to copy.
[directory-name] (optional): The name of the folder where you want the repository to be cloned. If you omit this, Git will create a folder with the same name as the repository.
git clone https://github.com/se-edu/samplerepo-things.git # if using HTTPS
git clone git@github.com:se-edu/samplerepo-things.git # if using SSH
git clone https://github.com/foo/bar.git my-bar-copy # also specifies a dir to use
For exact steps for cloning a repo from GitHub, refer to this GitHub document.
File
→ Clone / New ...
and provide the URL of the repo and the destination directory.
File
→ New ...
→ Choose as shown below → Provide the URL of the repo and the destination directory in the next dialog.
2 Verify the clone has a remote named origin
pointing to the upstream repo.
Use the git remote -v
command that you learned earlier.
Choose Repository
→ Repository Settings
menu option.
done!
EXERCISE: clone-repo
When there are new changes in the remote, you need to pull those changes down to your local repo.
There are two steps to bringing over changes from a remote repository into a local repository: fetch and merge.
1 Clone the repo se-edu/samplerepo-finances. It has 3 commits. Your clone now has a remote origin
pointing to the remote repo you cloned from.
2 Change the remote origin
to point to samplerepo-finances-2. This remote repo is a copy of the one you cloned, but it has two extra commits.
git remote set-url origin https://github.com/se-edu/samplerepo-finances-2.git
Go to Repository
→ Repository settings ...
to update remotes.
3 Verify the local repo is unaware of the extra commits in the remote.
git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
The revision graph should look like the below:
If it looks like the below, it is possible that Sourcetree is auto-fetching data from the repo periodically.
4 Fetch from the new remote.
Use the git fetch <remote>
command to fetch changes from a remote. If the <remote>
is not specified, the default remote origin
will be used.
git fetch origin
remote: Enumerating objects: 8, done.
... # more output ...
afbe966..cc6a151 master -> origin/master
* [new tag] beta -> beta
Click on the Fetch
button on the top menu:
5 Verify the fetch worked i.e., the local repo is now aware of the two missing commits. Also observe how the local branch ref of the master
branch, the staging area, and the working directory remain unchanged after the fetch.
Use the git status
command to confirm the repo now knows that it is behind the remote repo.
git status
On branch master
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
nothing to commit, working tree clean
Now, the revision graph should look something like the below. Note how the origin/master
ref is now two commits ahead of the master
ref.
6 Merge the fetched changes.
Use the git merge <remote-tracking-branch>
command to merge the fetched changes. Check the status and the revision graph to verify that the branch tip has now moved by two more commits.
git merge origin/master
git status
git log --oneline --decorate
To merge the fetched changes, right-click on the latest commit on the origin/master branch and choose Merge.
In the next dialog, choose as follows:
The final result should be something like the below (same as the repo state before we started this hands-on practical):
Note that merging the fetched changes can get complicated if there are multiple branches or the commits in the local repo conflict with commits in the remote repo. We will address them when we learn more about Git branches, in a later lesson.
done!
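The fetch-then-merge sequence can also be rehearsed entirely on your computer. The sketch below is an illustration under assumptions, not the exact setup used above: a local folder named remote-demo (a made-up name) plays the role of the remote repo, and two commits are added to it after cloning.

```shell
# Sketch only: a local repo (remote-demo) stands in for the remote on GitHub.
set -e
work=$(mktemp -d)
cd "$work"

git init -q remote-demo
cd remote-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"

cd "$work"
git clone -q remote-demo local-demo

# Two new commits appear in the 'remote' after the clone was made.
cd "$work/remote-demo"
echo "second" >> notes.txt
git commit -q -am "Second commit"
echo "third" >> notes.txt
git commit -q -am "Third commit"

cd "$work/local-demo"
git fetch -q origin                        # updates origin/master only
behind_after_fetch=$(git rev-list --count master..origin/master)
git merge -q origin/master                 # moves master (a fast-forward here)
behind_after_merge=$(git rev-list --count master..origin/master)
echo "behind after fetch: $behind_after_fetch, after merge: $behind_after_merge"
```

After the fetch, master is two commits behind origin/master; after the merge, the two refs point to the same commit.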
Pull is a shortcut that combines fetch and merge: it fetches the latest changes from the remote and immediately merges them into your current branch. In practice, Git users typically use pull rather than a separate fetch followed by a merge.
pull = fetch + merge
1 Similar to the previous hands-on practical, clone the repo se-edu/samplerepo-finances (to a new location).
Change the remote origin
to point to samplerepo-finances-2.
2 Pull the newer commits from the remote, instead of a fetch-then-merge.
Use the git pull <remote> <branch>
command to pull changes.
git pull origin master
The following works too. If the <remote>
and <branch>
are not specified, Git will pull to the current branch from the remote branch it is tracking.
git pull
Click on the Pull
button on the top menu:
3 Verify the outcome is the same as the fetch + merge steps you did in the previous hands-on practical.
done!
You can pull from any number of remote repos, provided the repos involved have a shared history. This can be useful when the upstream repo you forked from has some new commits that you wish to bring over to your copies of the repo (i.e., your fork and your local repo).
Preparation Fork se-edu/samplerepo-finances to your GitHub account.
Clone your fork to your computer.
Now, let's pretend that there are some new commits in the upstream repo that you would like to bring over to your fork and your local repo. Here are the steps:
1 Add the upstream repo se-edu/samplerepo-finances as a remote named upstream in your local repo.
Adding remotes was covered in Lesson T2L4. Linking a Local Repo With a Remote Repo
2 Pull from the upstream repo. If there are new commits (in this case, there will be none), those will come over to your local repo. For example:
git pull upstream master
3 Push to your fork. Any new commits you pulled from the upstream repo will now appear in your fork as well. For example:
git push origin master
The method given above is the more 'standard' method of synchronising a fork with the upstream repo. In addition, platforms such as GitHub can provide other ways (example: GitHub's Sync fork feature).
4 For good measure, let's pull from another repo. Add the repo se-edu/samplerepo-finances-2 as a remote named other-upstream in your local repo, pull from it, and push to your fork.
git remote add other-upstream https://github.com/se-edu/samplerepo-finances-2.git
git pull other-upstream master
git push origin master
done!
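The upstream-to-fork flow above can be sketched offline as well. In the sketch below, which rests on stated assumptions rather than the actual GitHub setup, upstream-repo stands in for the upstream repo, a bare clone named my-fork.git stands in for your fork, and my-local is your clone of the fork; all three names are hypothetical.

```shell
# Sketch only: local folders stand in for the upstream repo and your fork.
set -e
work=$(mktemp -d)
cd "$work"

git init -q upstream-repo
cd upstream-repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "rent" > expenses.txt
git add expenses.txt
git commit -q -m "Add expenses.txt"

cd "$work"
git clone -q --bare upstream-repo my-fork.git   # plays the role of the fork
git clone -q my-fork.git my-local               # your local clone of the fork

# A new commit appears in the upstream repo.
cd "$work/upstream-repo"
echo "food" >> expenses.txt
git commit -q -am "Add food expense"

cd "$work/my-local"
git remote add upstream "$work/upstream-repo"
git pull -q upstream master     # bring the new commit into the local repo
git push -q origin master       # then pass it on to the fork

fork_tip=$(git -C "$work/my-fork.git" rev-parse master)
upstream_tip=$(git -C "$work/upstream-repo" rev-parse master)
echo "fork: $fork_tip"
echo "upstream: $upstream_tip"
```

Once the pull and the push are done, the fork and the upstream repo should be at the same commit.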
EXERCISE: fetch-and-pull
DETOUR: Pulling from Multiple Remotes
You can pull from any number of repos, provided the repos involved have a shared history.
1 Add each repo as a remote under a suitable name (e.g., upstream, central, production, backup ...), if you haven't done so already.
2 Pull from the intended remote e.g., git pull backup master
Similar to before, but remember to choose the intended remote to pull from.
At this point: Now you can create your own remote and local copies of any repo on GitHub, and update your copy when there are new changes in the upstream repo.
What's next: Tour 4: Using the Revision History of a Repo
Target Usage: To make use of the revision history stored by Git.
Motivation: Having put in effort to record the revision history of the working folder, it only makes sense that we use the revision history to our benefit. For example, to be able to answer questions such as "What did I change in this file since last Monday?"
Lesson plan:
T4L1. Examining a Commit covers that part.
T4L2. Tagging Commits covers that part.
T4L3. Comparing Points of History covers that part.
T4L4. Traversing to a Specific Commit covers that part.
T4L5. Rewriting History to Start Over covers that part.
T4L6. Reverting a Specific Commit covers that part.
It is useful to be able to see what changes were included in a specific commit.
When you examine a commit, normally what you see is the 'changes made since the previous commit'. This does not mean that a Git commit contains only the changes made since the previous commit. As you recall, a Git commit contains a full snapshot of the working directory. However, tools used to examine commits typically show only the changes, as that is the more informative part.
Git shows changes included in a commit by dynamically calculating the difference between the snapshots stored in the target commit and the parent commit. This is because Git commits store snapshots of the working directory, not changes themselves.
Although each commit represents a copy of the entire working directory, Git uses space efficiently in two main ways:
To address a specific commit, you can use its SHA (e.g., e60deaeb2964bf2ebc907b7416efc890c9d4914b). In fact, the first few characters of the SHA are enough to address a commit (e.g., e60deae), provided the partial SHA is long enough to uniquely identify the commit (i.e., only one commit has that partial SHA).
Naturally, a commit can be addressed using any ref pointing to it too (e.g., HEAD
, master
).
Another related technique is to use the <ref>~<n>
notation (e.g., HEAD~1
) to address the commit that is n
commits prior to the commit pointed by <ref>
i.e., "start with the commit pointed by <ref>
and go back n
commits".
A related alternative notation is HEAD~
, HEAD~~
, HEAD~~~
, ... to mean HEAD~1
, HEAD~2
, HEAD~3
etc.
HEAD or master
HEAD~1 or master~1 or HEAD~ or master~
HEAD~2 or master~2
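To see that these notations resolve to the same commits, you can ask Git to translate each of them into a full SHA using git rev-parse. The following is a small sketch on a throwaway repo with three commits; the repo name refs-demo and the commit messages are made up for illustration.

```shell
# Sketch only: compare different ways of addressing the same commit.
set -e
work=$(mktemp -d)
cd "$work"
git init -q refs-demo
cd refs-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "line $n" >> notes.txt
  git add notes.txt
  git commit -q -m "Commit $n"
done

two_back=$(git rev-parse HEAD~2)          # two commits before HEAD
two_back_tildes=$(git rev-parse HEAD~~)   # same commit, tilde-only notation
two_back_branch=$(git rev-parse master~2) # same commit, via the branch ref
oldest_subject=$(git log -1 --format=%s HEAD~2)

short_sha=$(git rev-parse --short HEAD)   # abbreviated SHA of the latest commit
full_sha=$(git rev-parse "$short_sha")    # Git expands it back to the full SHA
echo "HEAD~2 is '$oldest_subject'; $short_sha expands to $full_sha"
```

All three ways of writing "two commits back" should print the same SHA, and the abbreviated SHA should expand to the full SHA of the latest commit.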
Git uses the diff format to show file changes in a commit. The diff format was originally developed for Unix. It was later extended with headers and metadata to show changes between file versions and commits. Here is an example diff showing the changes to a file.
diff --git a/fruits.txt b/fruits.txt
index 7d0a594..f84d1c9 100644
--- a/fruits.txt
+++ b/fruits.txt
@@ -1,6 +1,6 @@
-apples
+apples, apricots
bananas
cherries
dragon fruits
-elderberries
figs
@@ -20,2 +20,3 @@
oranges
+pears
raisins
diff --git a/colours.txt b/colours.txt
new file mode 100644
index 0000000..55c8449
--- /dev/null
+++ b/colours.txt
@@ -0,0 +1 @@
+a file for colours
A Git diff can consist of multiple file diffs, one for each changed file. Each file diff can contain one or more hunk i.e., a localised group of changes within the file — including lines added, removed, or left unchanged (included for context).
Given below is how the above diff is divided into its components:
File diff for fruits.txt
:
diff --git a/fruits.txt b/fruits.txt
index 7d0a594..f84d1c9 100644
--- a/fruits.txt
+++ b/fruits.txt
Hunk 1:
@@ -1,6 +1,6 @@
-apples
+apples, apricots
bananas
cherries
dragon fruits
-elderberries
figs
Hunk 2:
@@ -20,2 +20,3 @@
oranges
+pears
raisins
File diff for colours.txt
:
diff --git a/colours.txt b/colours.txt
new file mode 100644
index 0000000..55c8449
--- /dev/null
+++ b/colours.txt
Hunk 1:
@@ -0,0 +1 @@
+a file for colours
Here is an explanation of the diff:
Part of Diff | Explanation |
---|---|
diff --git a/fruits.txt b/fruits.txt | The diff header, indicating that it is comparing the file fruits.txt between two versions: the old (a/ ) and new (b/ ). |
index 7d0a594..f84d1c9 100644 | Shows the abbreviated hashes of the file before (7d0a594) and after (f84d1c9) the change, and the file mode (100 means a regular file, 644 are file permission indicators). |
--- a/fruits.txt +++ b/fruits.txt | Marks the old version of the file (a/fruits.txt ) and the new version of the file (b/fruits.txt ). |
@@ -1,6 +1,6 @@ | This hunk header shows that lines 1-6 (i.e., starting at line 1 , showing 6 lines) in the old file were compared with lines 1–6 in the new file. |
-apples +apples, apricots | Removed line apples and added line apples, apricots . |
bananas cherries dragon fruits | Unchanged lines, shown for context. |
-elderberries | Removed line: elderberries . |
figs | Unchanged line, shown for context. |
@@ -20,2 +20,3 @@ | Hunk header showing that lines 20-21 in the old file were compared with lines 20–22 in the new file. |
oranges +pears raisins | Unchanged line. Added line: pears. Unchanged line. |
diff --git a/colours.txt b/colours.txt | The usual diff header, indicates that Git is comparing two versions of the file colours.txt : one before and one after the change. |
new file mode 100644 | This is a new file being added. 100644 means it’s a normal, non-executable file with standard read/write permissions. |
index 0000000..55c8449 | The usual SHA hashes for the two versions of the file. 0000000 indicates the file did not exist before. |
--- /dev/null +++ b/colours.txt | Refers to the "old" version of the file (/dev/null means it didn’t exist before), and the new version. |
@@ -0,0 +1 @@ | Hunk header, saying: “0 lines in the old file were replaced with 1 line in the new file, starting at line 1.” |
+a file for colours | Added line |
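To connect the explanation above with real output, here is a minimal sketch that produces a one-hunk diff and picks out its parts. It uses a throwaway repo (diff-demo, a made-up name) with a three-line fruits.txt, so the numbers in the hunk header differ from the longer example above.

```shell
# Sketch only: generate a small diff and extract the hunk header and changed lines.
set -e
work=$(mktemp -d)
cd "$work"
git init -q diff-demo
cd diff-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

printf 'apples\nbananas\ncherries\n' > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

# Change the first line only; the other two lines will appear as context.
printf 'apples, apricots\nbananas\ncherries\n' > fruits.txt

hunk_header=$(git diff | grep '^@@')      # e.g., which lines were compared
removed=$(git diff | grep '^-[^-]')       # removed lines (skips the '--- a/...' line)
added=$(git diff | grep '^+[^+]')         # added lines (skips the '+++ b/...' line)
echo "$hunk_header"
echo "$removed"
echo "$added"
```

Here the hunk covers lines 1-3 of both the old and the new file, because the file has only three lines and one of them was replaced.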
Points to note:
+ indicates a line being added.
- indicates a line being deleted.
Target View contents of specific commits in a repo.
Preparation You can use any repo that has commits e.g., the things repo.
1 Locate the commits to view, using the revision graph.
git log --oneline --decorate
e60deae (HEAD -> master, origin/master) Update fruits list
f761ea6 Add colours.txt, shapes.txt
2bedace Add figs to fruits.txt
d5f91de Add fruits.txt
2 Use the git show
command to view specific commits.
git show # shows the latest commit
commit e60deaeb2964bf2ebc907b7416efc890c9d4914b (HEAD -> master, origin/master)
Author: damithc <...@...>
Date: Sat Jun ...
Update fruits list
diff --git a/fruits.txt b/fruits.txt
index 7d0a594..6d502c3 100644
--- a/fruits.txt
+++ b/fruits.txt
@@ -1,6 +1,6 @@
-apples
+apples, apricots
bananas
+blueberries
cherries
dragon fruits
-elderberries
figs
To view the parent commit of the latest commit, you can use any of these commands:
git show HEAD~1
git show master~1
git show f761ea6 # first few characters of the parent commit's SHA
git show f761ea6..... # run git log to find the full SHA and specify the full SHA
To view the commit that is two commits before the latest commit, you can use git show HEAD~2
etc.
Click on the commit. The remaining panels (indicated in the image below) will be populated with the details of the commit.
done!
PRO-TIP: Use Git Aliases to Work Faster
The Git alias feature allows you to create custom shortcuts for frequently used Git commands. This saves time and reduces typing, especially for long or complex commands. Once an alias is defined, you can use the alias just like any other Git command e.g., use git lodg
as an alias for git log --oneline --decorate --graph
.
To define a global git alias, you can use the git config --global alias.<alias> "command"
command. e.g.,
git config --global alias.lodg "log --oneline --graph --decorate"
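The following sketch shows an alias in action. As an assumption made to keep the experiment contained, it defines the alias in a throwaway repo's own configuration (i.e., without --global), so your global Git settings are left untouched; the repo name alias-demo is made up.

```shell
# Sketch only: define the lodg alias for one repo and compare it with the full command.
set -e
work=$(mktemp -d)
cd "$work"
git init -q alias-demo
cd alias-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

# Repo-level alias (omit --global so that only this repo is affected).
git config alias.lodg "log --oneline --graph --decorate"

alias_output=$(git lodg)
full_output=$(git log --oneline --graph --decorate)
echo "$alias_output"
```

The alias and the full command should print the same thing.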
You can also create shell-level aliases using your shell configuration (e.g., .bashrc
, .zshrc
) to make even shorter aliases. This lets you create shortcuts for any command, including Git commands, and even combine them with other tools. e.g., instead of the Git alias git lodg
, you can define a shorter shell-level alias glodg
.
1. Locate your .bash_profile
file (likely to be in : C:\Users\<YourName>\.bash_profile
-- if it doesn’t exist, create it.)
1. Locate your shell's config file e.g., .bashrc
or .zshrc
(likely to be in your ~
folder)
Oh-My-Zsh for Zsh terminal supports a Git plugin that adds a wide array of Git command aliases to your terminal.
2. Add aliases to that file:
alias gs='git status'
alias glodg='git log --oneline --graph --decorate'
3. Apply changes by running the command source ~/.zshrc
or source ~/.bash_profile
or source ~/.bashrc
, depending on which file you put the aliases in.
When working with many commits, it helps to tag specific commits with custom names so they’re easier to refer to later.
Git lets you tag commits with names, making them easy to reference later. This is useful when you want to mark specific commits -- such as releases or key milestones (e.g., v1.0
or v2.1
). Using tags to refer to commits is much more convenient than using SHA hashes. In the diagram below, v1.0 and interim are tags.
A tag stays fixed to the commit. Unlike branch refs or HEAD
, tags do not move automatically as new commits are made. As you see below, after adding a new commit, tags stay in the previous commits while master←HEAD has moved to the new commit.
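Git supports two kinds of tags: a lightweight tag is just a name that points to a commit, while an annotated tag also stores extra details such as the tagger, the date, and a message.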
Annotated tags are generally preferred for versioning and public releases, while lightweight tags are often used for less formal purposes, such as marking a commit for your own reference.
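One way to observe the difference between the two kinds of tags, and the fact that tags stay put, is sketched below. It relies on git cat-file -t, which reports the type of object a name refers to; the repo name tag-demo and its contents are made up for illustration.

```shell
# Sketch only: lightweight vs annotated tags, and tags staying fixed.
set -e
work=$(mktemp -d)
cd "$work"
git init -q tag-demo
cd tag-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "cake" > food.txt
git add food.txt
git commit -q -m "Add cake"
echo "donuts" >> food.txt
git commit -q -am "Add donuts"

git tag v1.0                                      # lightweight tag on the latest commit
git tag -a v0.9 HEAD~1 -m "First beta release"    # annotated tag on the earlier commit

lightweight_type=$(git cat-file -t v1.0)   # the name refers directly to a commit
annotated_type=$(git cat-file -t v0.9)     # the name refers to a separate tag object

tagged_before=$(git rev-parse v1.0)
echo "bananas" >> food.txt
git commit -q -am "Add bananas"            # master and HEAD move; the tag does not
tagged_after=$(git rev-parse v1.0)
head_now=$(git rev-parse HEAD)
echo "v1.0 is a $lightweight_type, v0.9 is a $annotated_type"
```

The v1.0 tag should still point to the "Add donuts" commit after the new commit is added.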
Target Add a few tags to a repository.
Preparation Fork and clone the samplerepo-preferences. Use the cloned repo on your computer for the following steps.
1 Add a lightweight tag to the current commit as v1.0
:
git tag v1.0
2 Verify the tag was added. To view tags:
git tag
v1.0
To view tags in the context of the revision graph:
git log --oneline --decorate
507bb74 (HEAD -> master, tag: v1.0, origin/master, origin/HEAD) Add donuts
de97f08 Add cake
5e6733a Add bananas
3398df7 Add food.txt
3 Use the tag to refer to the commit e.g., git show v1.0
should show the changes in the tagged commit.
4 Add an annotated tag to an earlier commit. The example below adds a tag v0.9
to the commit HEAD~2
with the message First beta release
. The -a
switch tells Git this is an annotated tag.
git tag -a v0.9 HEAD~2 -m "First beta release"
5 Check the new annotated tag. While both types of tags appear similarly in the revision graph, the show
command on an annotated tag will show the details of the tag and the details of the commit it points to.
git show v0.9
tag v0.9
Tagger: ... <...@...>
Date: Sun Jun ...
First beta release
commit ....999087124af... (tag: v0.9)
Author: ... <...@...>
Date: Sat Jun ...
Add figs to fruits.txt
diff --git a/fruits.txt b/fruits.txt
index a8a0a01..7d0a594 100644
# rest of the diff goes here
Right-click on the commit (in the graphical revision graph) you want to tag and choose Tag…
.
Specify the tag name e.g., v1.0
and click Add Tag
.
Configure tag properties in the next dialog and press Add
. For example, you can choose whether to make it a lightweight tag or an annotated tag (default).
Tags will appear as labels in the revision graph, as seen below. To see the details of an annotated tag, you need to use the menu indicated in the screenshot.
done!
If you need to change what a tag points to, you must delete the old one and create a new tag with the same name. This is because tags are designed to be fixed references to a specific commit, and there is no built-in mechanism to 'move' a tag.
Preparation Continue with the same repo you used for the previous hands-on practical.
Move the v1.0
tag to the commit HEAD~1
, by deleting it first and creating it again at the destination commit.
Delete the previous v1.0 tag by using the -d switch. Add it again to the other commit, as before.
git tag -d v1.0
git tag v1.0 HEAD~1
The same dialog used to add a tag can be used to delete and even move a tag. Note that the 'moving' here translates to deleting and re-adding behind the scenes.
done!
Tags are different from commit messages, in purpose and in form. A commit message is a description of the commit that is part of the commit itself. A tag is a short name for a commit, which you can use to address a commit.
Pushing commits to a remote does not push tags automatically. You need to push tags specifically.
Target Push tags you created earlier to the remote.
Preparation Continue with the same repo you used for the previous hands-on practical.
You can go to https://github.com/{USER}/{REPO}/tags (e.g., https://github.com/johndoe/samplerepo-preferences/tags) on GitHub to verify the tag is present in your remote.
Note how GitHub assumes these tags are meant as releases, and automatically provides zip and tar.gz archives of the repo (as at that tag).
1 Push a specific tag in the local repo to the remote (e.g., v1.0) using the git push <remote> <tag-name> command.
git push origin v1.0
In addition to verifying the tag's presence via GitHub, you can also use the following command to list the tags presently in the remote.
git ls-remote --tags origin
2 Delete a tag in the remote, using the git push --delete <remote> <tag-name>
command.
git push --delete origin v1.0
3 Push all tags to the remote repo, using the git push <remote> --tags
command.
git push origin --tags
To push a specific tag, use the following menu:
To push all tags, you can tick the Push all tags
option when pushing commits:
done!
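The tag-pushing commands above can be tried against a local stand-in for the remote. In this sketch, a bare repo named remote.git (a made-up name) plays the role of the GitHub repo, and git ls-remote --tags is used after each step to see which tags the remote has.

```shell
# Sketch only: a bare local repo stands in for the remote on GitHub.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git

git init -q local-repo
cd local-repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "cake" > food.txt
git add food.txt
git commit -q -m "Add cake"
git remote add origin "$work/remote.git"
git tag v1.0

git push -q origin master                          # pushes the commit, not the tag
tags_after_commit_push=$(git ls-remote --tags origin)

git push -q origin v1.0                            # pushes the tag specifically
tags_after_tag_push=$(git ls-remote --tags origin)

git push -q --delete origin v1.0                   # deletes the tag in the remote only
tags_after_delete=$(git ls-remote --tags origin)
local_tags=$(git tag)
echo "remote tags after tag push: $tags_after_tag_push"
```

Note that deleting the tag in the remote leaves the local v1.0 tag in place.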
Git can tell you the net effect of changes between two points of history.
Git's diff feature can show you what changed between two points in the revision history. Given below are some use cases.
Usage 1: Examining changes in the working directory
Example use case: To verify the next commit will include exactly what you intend it to include.
Preparation For this, you can use the things
repo you created earlier. If you don't have it, you can clone a copy of a similar repo given here.
1 Do some changes to the working directory. Stage some (but not all) changes. For example, you can run the following commands.
echo -e "blue\nred\ngreen" >> colours.txt
git add . # a shortcut to stage all changes
echo "no shapes added yet" >> shapes.txt
2 Examine the staged and unstaged changes.
The git diff command shows unstaged changes in the working directory (tracked files only). The output of the command is a diff view (introduced in this lesson).
git diff
diff --git a/shapes.txt b/shapes.txt
index 5c2644b..949c676 100644
--- a/shapes.txt
+++ b/shapes.txt
@@ -1 +1,2 @@
a file for shapes
+no shapes added yet
The git diff --staged
command shows the staged changes (same as git diff --cached
).
git diff --staged
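A quick way to check which changes each of the two commands covers is to ask for file names only. The sketch below recreates a situation similar to the one above in a throwaway repo (staged-demo, a made-up name), using the --name-only option of git diff.

```shell
# Sketch only: one staged change and one unstaged change, listed by file name.
set -e
work=$(mktemp -d)
cd "$work"
git init -q staged-demo
cd staged-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "a file for colours" > colours.txt
echo "a file for shapes" > shapes.txt
git add .
git commit -q -m "Add colours.txt, shapes.txt"

echo "blue" >> colours.txt
git add colours.txt                        # this change is staged
echo "no shapes added yet" >> shapes.txt   # this change is left unstaged

unstaged_files=$(git diff --name-only)
staged_files=$(git diff --staged --name-only)
echo "unstaged: $unstaged_files, staged: $staged_files"
```

Only shapes.txt should appear in the plain git diff, and only colours.txt in the staged diff.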
Select the two commits: Click on one commit, and Ctrl-Click (or Cmd-Click) on the second commit. The changes between the two selected commits will appear in the other panels, as shown below:
done!
Usage 2: Comparing two commits at different points of the revision graph
Example use case: Suppose you’re trying to improve the performance of a piece of software by experimenting with different code tweaks. You commit after each change (as you should). After several commits, you now want to review the overall effect of all those changes on the code.
Target Compare two commits in a repo.
Preparation You can use any repo with multiple commits e.g., the things
repo.
You can use the git diff <commit1> <commit2> command for this.
You can use the .. notation to specify the commit range too e.g., 0023cdd..fcd6199, HEAD~2..HEAD
git diff v0.9 HEAD
diff --git a/colours.txt b/colours.txt
new file mode 100644
index 0000000..55c8449
--- /dev/null
+++ b/colours.txt
@@ -0,0 +1 @@
+a file for colours
# rest of the diff ...
Swap the commit order in the command and see what happens.
git diff HEAD v0.9
diff --git a/colours.txt b/colours.txt
deleted file mode 100644
index 55c8449..0000000
--- a/colours.txt
+++ /dev/null
@@ -1 +0,0 @@
-a file for colours
# rest of the diff ...
As you can see, the diff
is directional i.e., diff <commit1> <commit2>
shows what changes you need to do to go from the <commit1>
to <commit2>
. If you swap <commit1>
and <commit2>
, the output will change accordingly e.g., lines previously shown as 'added' will now be shown as 'deleted'.
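The directional nature of a diff is easy to see in a compact form using the --numstat option, which prints the number of added lines, the number of deleted lines, and the file name (separated by tabs) for each changed file. The following is a small sketch on a throwaway repo; the name direction-demo is made up.

```shell
# Sketch only: the same two commits compared in both directions.
set -e
work=$(mktemp -d)
cd "$work"
git init -q direction-demo
cd direction-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"
git tag v0.9

echo "a file for colours" > colours.txt
git add colours.txt
git commit -q -m "Add colours.txt"

forward=$(git diff --numstat v0.9 HEAD)    # going from v0.9 to HEAD
backward=$(git diff --numstat HEAD v0.9)   # going from HEAD back to v0.9
echo "forward:  $forward"
echo "backward: $backward"
```

Going forward, colours.txt shows up as one added line; going backward, the same line shows up as deleted.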
Select the two commits: Click on one commit, and Ctrl-Click (or Cmd-Click) on the second commit. The changes between the two selected commits will appear in the other panels, as shown below:
The same method can be used to compare the current state of the working directory (which might have uncommitted changes) to a point in the history.
done!
Usage 3: Examining changes to a specific file
Example use case: Similar to other use cases but when you are interested in a specific file only.
Target Examine the changes done to a file between two different points in the version history (including the working directory).
Preparation Use any repo with multiple commits e.g. the things
repo.
Add the -- path/to/file
to a previous diff command to narrow the output to a specific file. Some examples:
git diff -- fruits.txt # unstaged changes to fruits.txt
git diff --staged -- src/main.java # staged changes to src/main.java
git diff HEAD~2..HEAD -- fruits.txt # changes to fruits.txt between commits
Sourcetree UI shows changes to one file at a time by default; just click on the file to view changes to that file. To view changes to multiple files, Ctrl-Click (or Cmd-Click) on multiple files to select them.
done!
Another useful feature of revision control is to be able to view the working directory as it was at a specific point in history, by checking out a commit created at that point.
Suppose you added a new feature to a software product, and while testing it, you noticed that another feature added two commits ago doesn’t handle a certain edge case correctly. Now you’re wondering: did the new feature break the old one, or was it already broken? Can you go back to the moment you committed the old feature and test it in isolation, and come back to the present after you found the answer? With Git, you can.
To view the working directory at a specific point in history, you can check out a commit created at that point.
When you check out a commit, Git:
Updates the working directory to match the snapshot in that commit.
Moves the HEAD ref to that commit, marking it as the current state you’re viewing.
→ [check out commit C2 ...]
Checking out a specific commit puts you in a "detached HEAD
" state: i.e., the HEAD
no longer points to a branch, but directly to a commit (see the above diagram for an example). This isn't a problem by itself, but any commits you make in this state can be lost, unless certain follow-up actions are taken. It is perfectly fine to be in a detached state if you are only examining the state of the working directory at that commit.
To get out of a "detached HEAD" state, you can simply check out a branch, which "re-attaches" HEAD
to the branch you checked out.
→
[check out master
...]
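The detaching and re-attaching of HEAD can be observed from the command line too. The sketch below uses git rev-parse --abbrev-ref HEAD, which prints the branch name when HEAD is attached to a branch and simply HEAD when it is detached; the repo name checkout-demo is made up for illustration.

```shell
# Sketch only: check out an earlier commit, then return to the branch.
set -e
work=$(mktemp -d)
cd "$work"
git init -q checkout-demo
cd checkout-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"
echo "a file for shapes" > shapes.txt
git add shapes.txt
git commit -q -m "Add shapes.txt"

git checkout -q HEAD~1                               # detaches HEAD
state_detached=$(git rev-parse --abbrev-ref HEAD)
if [ -f shapes.txt ]; then shapes_present=yes; else shapes_present=no; fi

git checkout -q master                               # re-attaches HEAD to master
state_attached=$(git rev-parse --abbrev-ref HEAD)
echo "detached: $state_detached (shapes.txt present: $shapes_present), attached: $state_attached"
```

While the earlier commit is checked out, shapes.txt is absent from the working directory; it reappears once master is checked out again.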
Target Check out a few commits in a local repo, while examining the working directory to verify that it matches the state when you created the corresponding commit.
Preparation Use any repo with commits e.g., the things repo.
1 Examine the revision tree, to get your bearings first.
git log --oneline --decorate
Reminder: You can use aliases to reduce typing Git commands.
e60deae (HEAD -> master, origin/master) Update fruits list
f761ea6 (tag: v1.0) Add colours.txt, shapes.txt
2bedace (tag: v0.9) Add figs to fruits.txt
d5f91de Add fruits.txt
2 Use the git checkout <commit-identifier> command to check out a commit other than the one currently pointed to by HEAD. You can use any of the following methods:
git checkout v1.0: checks out the commit tagged v1.0
git checkout 0023cdd: checks out the commit with the hash 0023cdd
git checkout HEAD~2: checks out the commit 2 commits behind the most recent commit.
git checkout HEAD~2
Note: switching to 'HEAD~2'.
You are in 'detached HEAD' state.
# rest of the warning about the detached head ...
HEAD is now at 2bedace Add figs to fruits.txt
3 Verify HEAD and the working directory have updated as expected.
HEAD should now be pointing at the target commit.
The working directory should match the state at that commit (e.g., shapes.txt should not be in the folder).
git log --oneline --decorate
2bedace (HEAD, tag: v0.9) Add figs to fruits.txt
d5f91de Add fruits.txt
HEAD
is indeed pointing at the target commit.
But note how the output does not show commits you added after the checked-out commit.
The --all
switch tells git log
to show commits from all refs, not just those reachable from the current HEAD
. This includes commits from other branches, tags, and remotes.
git log --oneline --decorate --all
e60deae (origin/master, master) Update fruits list
f761ea6 (tag: v1.0) Add colours.txt, shapes.txt
2bedace (HEAD, tag: v0.9) Add figs to fruits.txt
d5f91de Add fruits.txt
4 Go back to the latest commit by checking out the master
branch again.
git checkout master
In the revision graph, double-click the commit you want to check out, or right-click on that commit and choose Checkout...
.
Click OK
to the warning about ‘detached HEAD’ (similar to below).
The specified commit is now loaded onto the working folder, as indicated by the HEAD
label.
To go back to the latest commit on the master
branch, double-click the master
branch.
If you check out a commit that comes before the commit in which you added a certain file (e.g., temp.txt) to the .gitignore file, and if the .gitignore file is version controlled as well, Git will now show that file under ‘unstaged modifications’ because, at that commit, Git hasn’t been told to ignore it yet.
done!
If there are uncommitted changes in the working directory, Git proceeds with a checkout only if it can preserve those changes.
The Git stash feature temporarily sets aside uncommitted changes you’ve made (in your working directory and staging area), without committing them. This is useful when you’re in the middle of some work, but need to switch to another state (e.g., checkout a previous commit), and your current changes are not yet ready to be committed or discarded. You can later reapply the stashed changes when you’re ready to resume that work.
DETOUR: Stashing Uncommitted Changes Temporarily
For basic usage, you can use the following two commands:
git stash: Stashes staged and unstaged changes.
git stash pop: Reapplies the latest stashed changes and removes them from the stash list.
RESOURCES
A more detailed explanation of stashing: https://www.atlassian.com/git/tutorials/saving-changes/git-stash
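The two stash commands can be tried out as follows. This is a minimal sketch on a throwaway repo (stash-demo, a made-up name), using git status --porcelain to show whether the working directory is clean.

```shell
# Sketch only: set aside an uncommitted change, then bring it back.
set -e
work=$(mktemp -d)
cd "$work"
git init -q stash-demo
cd stash-demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "apples" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"

echo "bananas" >> fruits.txt                 # an uncommitted change
git stash -q                                 # set the change aside
status_after_stash=$(git status --porcelain) # nothing to report: working directory is clean
git stash pop -q                             # reapply the change
status_after_pop=$(git status --porcelain)   # the modification is back
echo "after pop: $status_after_pop"
```

After the pop, fruits.txt should show up as modified again, with the added line intact.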
DETOUR: Dealing with Uncommitted Conflicting Changes at a Checkout
To proceed with a checkout when there are conflicting uncommitted changes in the working directory, there are several options:
Git can also reset the revision history to a specific point so that you can start over from that point.
Suppose you realise your last few commits have gone in the wrong direction, and you want to go back to an earlier commit and continue from there — as if the “bad” commits never happened. Git’s reset feature can help you do that.
Git reset moves the tip of the current branch to a specific commit, optionally adjusting your staged and unstaged changes to match. This effectively rewrites the branch's history by discarding any commits that came after that point.
Resetting is different from the checkout feature: a reset moves the current branch ref (and HEAD along with it), whereas a checkout moves only the HEAD ref.
→ [reset to C2 ...]
Note that the reset moves the master branch!
There are three types of resets: soft, mixed, hard. All three move the branch pointer to a new commit, but they vary based on what happens to the staging area and the working directory.
Preparation First, set the stage as follows (e.g., in the things
repo):
i) Add four commits that are supposedly 'bad' commits.
ii) Do a 'bad' change to one file and stage it.
iii) Do a 'bad' change to another file, but don't stage it.
The following commands can be used to add commits B1
-B4
:
echo "bad colour" >> colours.txt
git commit -am "Incorrectly update colours.txt"
echo "bad shape" >> shapes.txt
git commit -am "Incorrectly update shapes.txt"
echo "bad fruit" >> fruits.txt
git commit -am "Incorrectly update fruits.txt"
echo "bad line" >> incorrect.txt
git add incorrect.txt
git commit -m "Add incorrect.txt"
echo "another bad colour" >> colours.txt
git add colours.txt
echo "another bad shape" >> shapes.txt
Now we have some 'bad' commits and some 'bad' changes in both the staging area and the working directory. Let's use the reset feature to get rid of all of them, but do it in three steps so that you can learn all three types of resets.
1 Do a soft reset to B2 (i.e., discard last two commits). Verify,
the master branch is now pointing at B2, and,
the changes that were in the discarded commits (i.e., B3 and B4) are now in the staging area.
Use the git reset --soft <commit> command to do a soft reset.
git reset --soft HEAD~2
You can run the following commands to verify the current status of the repo is as expected.
git status # check overall status
git log --oneline --decorate # check the branch tip
git diff # check unstaged changes
git diff --staged # check staged changes
In a Git GUI client, right-click on the commit that you want to reset to, and choose the Reset <branch-name> to this commit option.
In the next dialog, choose Soft - keep all local changes.
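2 Do a mixed reset to commit B1. Verify that,
* the master branch is now pointing at B1.
* the staging area is empty, and the changes that were staged (including those from the discarded commits) now appear as unstaged changes in the working directory.
* incorrect.txt appears as an 'untracked' file -- this is because unstaging a change of type 'add file' results in an untracked file.
Use the git reset --mixed <commit> command to do a mixed reset. The --mixed flag is the default, and can be omitted.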
git reset HEAD~1
Verify the repo status, as before.
Similar to the previous reset, but choose the Mixed - keep working copy but reset index option in the reset dialog.
3 Do a hard reset to commit C4. Verify that,
* the master branch is now pointing at C4 i.e., all 'bad' commits are gone.
* all uncommitted changes are gone too (except incorrect.txt -- Git leaves untracked files alone, as untracked files are not meant to be under Git's control).
Use the git reset --hard <commit> command.
git reset --hard HEAD~1
Verify the repo status, as before.
Similar to the previous reset, but choose the Hard - discard all working copy changes option.
done!
Rewriting history can cause your local repo to diverge from its remote counterpart. For example, if you discard earlier commits and create new ones in their place, and you’ve already pushed the original commits to a remote repository, your local branch history will no longer match the corresponding remote branch. Git refers to this as a diverged history.
To protect the integrity of the remote, Git will reject attempts to push a diverged branch using a normal push. If you want to overwrite the remote history with your local version, you must perform a force push.
Preparation Choose a local-remote repo pair under your control e.g., the things repo from Tour 2: Backing up a Repo on the Cloud.
1 Rewrite the last commit: Reset the current branch back by one commit, and add a new commit.
For example, you can use the following commands.
git reset --hard HEAD~1
echo "water" >> drinks.txt
git add .
git commit -m "Add drinks.txt"
2 Observe how the local branch has diverged from its remote-tracking branch.
git log --oneline --graph --all
* fc1d04e (HEAD -> master) Add drinks.txt
| * e60deae (upstream/master, origin/master) Update fruits list
|/
* f761ea6 (tag: v1.0) Add colours.txt, shapes.txt
* 2bedace (tag: v0.9) Add figs to fruits.txt
* d5f91de Add fruits.txt
3 Attempt to push to the remote. Observe that Git rejects the push.
git push origin master
To https://github.com/.../things.git
! [rejected] master -> master (non-fast-forward)
error: failed to push some refs to 'https://github.com/.../things.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. If you want to integrate the remote changes,
hint: ...
4 Do a force-push.
You can use the --force (or -f) flag to force push.
git push -f origin master
A safer alternative to --force is --force-with-lease, which overwrites the remote branch only if it hasn't changed since you last fetched it (i.e., only if the remote doesn't have recent changes that you are unaware of):
git push --force-with-lease origin master
done!
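The rejected push and the force push described above can be tried without touching a real remote. The following is a minimal sketch, assuming a throwaway bare repo on the local disk stands in for the remote; the directory names, file names, and identity settings are made up for illustration.

```shell
set -eu

# Throwaway workspace: a bare repo acts as the 'remote' (hypothetical setup).
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git init -q local
cd local
git config user.name "Demo User"          # local identity, for this demo only
git config user.email "demo@example.com"
git remote add origin ../remote.git

# Create two commits and push them.
echo "apple" > fruits.txt
git add fruits.txt
git commit -q -m "Add fruits.txt"
echo "banana" >> fruits.txt
git commit -q -am "Update fruits list"
git push -q origin HEAD

# Rewrite the last commit: reset back by one commit, then add a new commit.
git reset -q --hard HEAD~1
echo "water" > drinks.txt
git add drinks.txt
git commit -q -m "Add drinks.txt"

# A normal push is now rejected because the histories have diverged.
if git push -q origin HEAD 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"
fi
echo "normal push: $push_result"

# The force push (with lease) overwrites the remote branch.
git push -q --force-with-lease origin HEAD
git log --oneline --graph --all
```

After the last command, the remote-tracking branch and the local branch should point at the same commit again.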
DETOUR: Resetting Uncommitted Changes
At times, you might need to get rid of uncommitted changes so that you have a fresh start to the next commit.
To get rid of uncommitted changes, you can reset the repo to the last commit (i.e., HEAD):
The command git reset (without specifying a commit) defaults to git reset HEAD.
* git reset: moves any staged changes to the working directory (i.e., unstages them).
* git reset --hard: gets rid of any staged and unstaged changes.
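The difference between the two commands can be observed in a throwaway repo. The following is a minimal sketch; the repo location, the file name, and the identity settings are made up for illustration.

```shell
set -eu

# Create a throwaway repo with one commit (names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, for this demo only
git config user.email "demo@example.com"
echo "red" > colours.txt
git add colours.txt
git commit -q -m "Add colours.txt"

# Make a change and stage it.
echo "bad colour" >> colours.txt
git add colours.txt

# 'git reset' unstages the change but keeps it in the working directory.
git reset -q
staged_after_reset=$(git diff --staged --name-only)
unstaged_after_reset=$(git diff --name-only)

# 'git reset --hard' discards the change entirely.
git reset -q --hard
status_after_hard=$(git status --porcelain)

echo "staged after reset:   '$staged_after_reset'"
echo "unstaged after reset: '$unstaged_after_reset'"
echo "status after --hard:  '$status_after_hard'"
```

Note that git reset --hard cannot be undone for changes that were never committed, so it is worth running git status first to check what is about to be discarded.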
Related DETOUR: Updating the Last Commit
Git allows you to amend the most recent commit. This is useful when you realise there’s something you’d like to change — e.g., fix a typo in the commit message, or to exclude some unintended change from the commit.
That aspect is covered in a detour in the lesson T5L3. Reorganising Commits.
DETOUR: Undoing/Deleting Recent Commits
How do you undo or delete the last few commits if you realise they were incorrect, unnecessary, or done too soon?
Undoing or deleting the most recent n commits is easily accomplished with Git's reset feature.
* To delete the most recent n commits and discard those changes entirely, do a hard reset to the commit HEAD~n e.g., git reset --hard HEAD~3
* To undo the most recent n commits, but keep the changes staged, do a soft reset to the commit HEAD~n e.g., git reset --soft HEAD~3
* To undo the most recent n commits, and move the changes to the working directory, do a mixed reset to the commit HEAD~n e.g., git reset --mixed HEAD~3
To do the above for the most recent commit only, use HEAD~1 (or just HEAD~).
DETOUR: Resetting a Remote-Tracking Branch Ref
Suppose you moved back the current branch ref by two commits, as follows:
git reset --hard HEAD~2
If you now wish to move back the remote-tracking branch ref by two commits, so that the local repo 'forgets' that it previously pushed two more commits to the remote, you can do:
git update-ref refs/remotes/origin/master HEAD
The git update-ref refs/remotes/origin/master HEAD command resets the remote-tracking branch ref origin/master to point at the commit that HEAD currently points to.
update-ref is an example of what are known as Git plumbing commands -- lower-level commands used by Git internally. In contrast, day-to-day Git commands (such as commit, log, push etc.) are known as porcelain commands (as in, in bathrooms we see the porcelain parts but not the plumbing parts that operate below the surface to make everything work).
Git can add a new commit to reverse the changes done in a specific past commit, called reverting a commit.
When a past commit introduced a bug or an unwanted change, but you do not want to modify that commit — because rewriting history can cause problems if others have already based work on it — you can instead revert that commit.
Reverting creates a new commit that cancels out the changes of the earlier one i.e., Git computes the opposite of the changes introduced by that commit — essentially a reverse diff — and applies it as a new commit on top of the current branch. This way, the problematic changes are reversed while preserving the full history, including the "bad" commit and the "fix".
[Figure: revision graph before → after reverting commit C2]
Preparation Run the following commands to create a repo with a few commits:
mkdir pioneers
cd pioneers
git init
echo "hacked the matrix" >> neo.txt
git add .
git commit -m "Add Neo"
echo "father of theoretical computing" >> alan-turing.txt
git add .
git commit -m "Add Turing"
echo "created COBOL, compiler pioneer" >> grace-hopper.txt
git add .
git commit -m "Add Hopper"
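1 Revert the commit Add Neo.
You can use the git revert <commit> command to revert a commit. In this case, we want to revert the commit that is two commits behind the HEAD.
git revert HEAD~2
What happens next: Git prepares a new commit that reverses the changes of the target commit and opens your text editor with a proposed commit message. Once you save the message and close the editor, the revert commit is added to the branch.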
In the revision graph, right-click on the commit you want to revert, and choose Reverse commit...
done!
A revert can result in a conflict, if the new changes done to reverse the previous commit conflict with the changes done in other more recent commits. Then, you need to resolve the conflict before the revert operation can proceed. Conflict resolution is covered in a later topic.
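The revert from the hands-on above can also be reproduced non-interactively. The following is a minimal sketch, assuming a throwaway repo in a temporary directory; the --no-edit flag accepts the proposed commit message without opening the editor, and the identity settings are made up for illustration.

```shell
set -eu

# Recreate the pioneers repo in a temporary location (hypothetical path).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, for this demo only
git config user.email "demo@example.com"
echo "hacked the matrix" >> neo.txt
git add .
git commit -q -m "Add Neo"
echo "father of theoretical computing" >> alan-turing.txt
git add .
git commit -q -m "Add Turing"
echo "created COBOL, compiler pioneer" >> grace-hopper.txt
git add .
git commit -q -m "Add Hopper"

# Revert the commit two behind HEAD, accepting the default commit message.
git revert --no-edit HEAD~2 > /dev/null

commit_count=$(git rev-list --count HEAD)
latest_subject=$(git log -1 --format=%s)

# The history keeps all three original commits and gains a revert commit.
git log --oneline
```

The original Add Neo commit is still in the history; only its effect (the file neo.txt) has been removed by the new commit.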
At this point: You should now be able to use a repository’s revision history to understand how the working directory evolved over time -- and use that insight to inform your work.
How useful this history is depends greatly on how well it was constructed -- for example, how frequently and meaningfully commits were made -- we’ll explore that in a later tour.
What's next: Tour 5: Fine-Tuning the Revision History
Target Usage: To maintain a clean and meaningful revision history.
Motivation: The usefulness of the revision history increases if it consists of well-crafted and well-documented commits.
Lesson plan:
T5L1. Controlling What Goes Into a Commit
T5L2. Writing Good Commit Messages
T5L3. Reorganising Commits
To create well-crafted commits, you need to know how to control which precise changes go into a commit.
Crafting a commit involves two aspects:
SIDEBAR: Guidelines on what to include in a commit
A good commit represents a single, logical unit of change — something that can be described clearly in one sentence. For example, fixing a specific bug, adding a specific feature, or refactoring a specific function. If each commit tells a clear story about why the change was made and what it achieves, your repository history becomes a valuable narrative of the project’s development. Here are some (non-exhaustive) guidelines:
Git can let you choose not just which files, but which specific changes within those files, to include in a commit. Most Git tools — including the command line and many GUIs — let you interactively select which "hunks" or even individual lines of a file to stage. This allows you to separate unrelated changes and avoid committing unnecessary edits. If you make multiple changes in the same file, you can selectively stage only the parts that belong to the current logical change.
This level of control is particularly useful when:
Preparation You can use any repo for this.
1 Do several changes to some tracked files. Change multiple files. Also change multiple locations in the same file.
2 Stage some changes in some files while keeping other changes in the same files unstaged.
As you know, you can use git add <filename> to stage changes to an entire file.
To select which hunks to stage, you can use the git add -p command instead (-p stands for 'by patch'):
git add -p
This command will take you to an interactive mode in which you can go through each hunk and decide if you want to stage it.
To stage a hunk in a Git GUI client, you can click the Stage button above the hunk in question.
Most Git operations can be done faster through the CLI than equivalent Git GUI clients, once you are familiar enough with the CLI commands.
However, selective staging is one exception where a good GUI can do better than the CLI, if you need to do many fine-grained staging operations (e.g., frequently staging only parts of hunks).
done!
Detailed and well-written commit messages can increase the value of Git revision history.
Every commit you make in Git also includes a commit message that explains the change. While one-line messages are fine for small or obvious changes, as your revision history grows, good commit messages become an important source of information — for example, to understand the rationale behind a specific change made in the past.
A commit message is meant to explain the intent behind the changes, not just what was changed. The code (or diff) already shows what changed. Well-written commit messages make collaboration, code reviews, debugging, and future maintenance easier by helping you and others quickly understand the project’s history without digging into the code of every commit.
A complete commit message can include a short summary line (the subject) followed by a more detailed body if needed. The subject line should be a concise description of the change, while the body can elaborate on the context, rationale, side effects, or other details if the change is more complex.
A commit message has the following structure (note how the subject and the body are separated by a blank line):
Subject line
<blank line>
Body
# lines starting with '#' are ignored (they will not be included in the commit message)
Here is an example commit message:
Find command: make matching case-insensitive

Find command is case-sensitive.

A case-insensitive find is more user-friendly because users cannot be
expected to remember the exact case of the keywords.

Let's,
* update the search algorithm to use case-insensitive matching
* add a script to migrate stress tests to the new format
Do some changes to a repo you have.
Commit the changes while writing a full commit message (i.e., subject + body).
When you are ready to commit, use the git commit command (without specifying a commit message).
git commit
This will open your default text editor (like Vim, Nano, or VS Code). Write the commit message inside the editor.
Save and close the editor to create the commit.
You can write your full commit message in the textbox you have been using to write commit messages already.
done!
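If opening an editor is inconvenient (e.g., in a script), a subject and a body can also be given on the command line. The following is a minimal sketch, relying on the fact that when several -m options are given, Git joins them as separate paragraphs; the repo location, file name, and identity settings are made up for illustration.

```shell
set -eu

# Throwaway repo with one staged change (names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, for this demo only
git config user.email "demo@example.com"
echo "find: ignore case" > notes.txt
git add notes.txt

# First -m is the subject; the second -m becomes the body,
# separated from the subject by a blank line.
git commit -q \
  -m "Find command: make matching case-insensitive" \
  -m "A case-insensitive find is more user-friendly because users cannot be
expected to remember the exact case of the keywords."

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)

# Show the full message as stored in the commit.
git log -1 --format=%B
```

For longer messages, the editor-based approach is usually more comfortable, as it lets you review and wrap the text before committing.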
Following a style guide makes your commit messages more consistent and fit-for-purpose. Many teams adopt established guidelines. These style guides typically contain common conventions that Git users follow when writing commit messages. For example:
* Use the imperative mood in the subject line (e.g., Fix typo in README rather than Fixed typo or Fixes typo).
PRO-TIP: Configure Git to use your preferred text editor
Git will use the default text editor when it needs you to write a commit message. However, Git can be configured to use a different text editor of your choice.
You can use the following command to set Git's default text editor:
git config --global core.editor "<editor command>"
Some examples for <editor command>:
Editor | Command to use |
---|---|
Vim (default) | vim |
Nano | nano |
VS Code | code --wait |
Sublime Text | subl -n -w |
Atom | atom --wait |
Notepad++ | notepad++.exe (Windows only) |
Notepad | notepad (Windows built-in) |
VS Code example: git config --global core.editor "code --wait". For this to work, your computer should already be configured to launch VS Code using the code command (refer to the 'Launching from command line' section of the VS Code documentation to find out how).
Why use --wait or -w? Graphical editors (like VS Code or Sublime) run as a separate process, so the command that launches them can return before you have written anything. Without --wait, Git may think editing is done before you actually write the message. --wait makes Git pause until the editor window is closed.
When the revision history gets 'messy', Git has a way to 'tidy up' the recent commits.
Git has a powerful tool called interactive rebasing which lets you review and reorganise your recent commits. With it, you can reword commit messages, change their order, delete commits, combine several commits into one (squash), or split a commit into smaller pieces. This feature is useful for tidying up a commit history that has become messy — for example, when some commits are out of order, poorly described, or include changes that would be clearer if split up or combined.
Preparation Run the following commands to create a sample repo that we'll be using for this hands-on practical:
mkdir samplerepo-sitcom
cd samplerepo-sitcom
git init
echo "Aspiring actress" >> Penny.txt
git add .
git commit -m "C1: Add Penny.txt"
echo "Scientist" >> Sheldon.txt
git add .
git commit -m "C3: Add Sheldon.txt"
echo "Comic book store owner" >> Stuart.txt
git add .
git commit -m "C2: Add Stuart.txt"
echo "Engineer" >> Stuart.txt
git commit -am "X: Incorrectly update Stuart.txt"
echo "Engineer" >> Howard.txt
git add .
git commit -m "C4: Adddd Howard.txt"
Target Here are the commits that should be in the created repo, and how each commit needs to be 'tidied up'.
* C4: Adddd Howard.txt -- Fix typo in the commit message Adddd → Add.
* X: Incorrectly update Stuart.txt -- Drop this commit.
* C2: Add Stuart.txt -- Swap this commit with the one below.
* C3: Add Sheldon.txt -- Swap this commit with the one above.
* C1: Add Penny.txt -- No change required.
1 Start the interactive rebasing.
To start the interactive rebase, use the git rebase -i <start-commit> command. -i stands for 'interactive'. In this case, we want to modify the last four commits (hence, HEAD~4).
git rebase -i HEAD~4
pick 97a8c4a C3: Add Sheldon.txt
pick 60bd28d C2: Add Stuart.txt
pick 8b9a36f X: Incorrectly update Stuart.txt
pick 8ab6941 C4: Adddd Howard.txt
# Rebase ee04afe..8ab6941 onto ee04afe (4 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
# commit's log message, unless -C is used, in which case
# keep only this commit's message; -c is same as -C but
# opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# create a merge commit using the original merge commit's
# message (or the oneline, if no original merge commit was
# specified); use -c <commit> to reword the commit message
# u, update-ref <ref> = track a placeholder for the <ref> to be updated
# to this position in the new commits. The <ref> is
# updated at the end of the rebase
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
The command will take you to the text editor, which will present you with a wall of text similar to the above. It has two parts:
* At the top, the list of commits (oldest first), with the action pick indicated by default (pick means 'use this commit in the result') for each.
* Below that, instructions (as comment lines starting with #) on how to edit the list.
2 Edit the commit list to specify the rebase actions, as follows:
pick 60bd28d C2: Add Stuart.txt
pick 97a8c4a C3: Add Sheldon.txt
drop 8b9a36f X: Incorrectly update Stuart.txt
reword 8ab6941 C4: Adddd Howard.txt
3 Once you save edits and exit the text editor, Git will perform the rebase based on the actions you specified, from top to bottom.
At some steps, Git will pause the rebase and ask for your inputs. In this case, it will ask you to specify the new commit message when it is processing the following line.
reword 8ab6941 C4: Adddd Howard.txt
To go to the interactive rebase mode in a Git GUI client, right-click the parent commit of the earliest commit you want to reorganise (in this case, it is C1: Add Penny.txt) and choose Rebase children of <SHA> interactively...
2 To indicate what action you want to perform on each commit, select the commit in the list and click on the button for the action you want to do on it.
3 To execute the rebase, after indicating the action for all commits, click OK.
The final result should be something like the following, 'tidied up' exactly as we wanted:
* 727d877 C4: Add Howard.txt
* 764fc29 C3: Add Sheldon.txt
* 08a965a C2: Add Stuart.txt
* 6436598 C1: Add Penny.txt
done!
Rebasing rewrites history. It is not recommended to rebase commits you have already shared with others.
DETOUR: Updating the Last Commit
Git allows you to amend the most recent commit. This is useful when you realise there’s something you’d like to change — e.g., fix a typo in the commit message, or to exclude some unintended change from the commit.
Updating the commit message
To replace the commit message with a new single-line message (i.e., a subject only), use the git commit --amend -m "<new commit message>" command. Note that this replaces the entire message, including any existing body.
git commit --amend -m "Fix bug that froze the GUI"
To edit the entire commit message (not just the subject), run the git commit --amend command, which will open the text editor for you to edit the commit message. The commit will be updated when you close the text editor.
Click on the Commit button on the top menu. In the region that you use to enter the commit message, use one of the two methods given below to go into the 'Amend last commit' mode.
Updating changes in the commit
While there are multiple ways to do this, one method that will work universally is to do a 'soft reset' of the last commit, update the staging area as you wish, and commit again.
'Updating' a commit does not really update that commit -- it simply creates a new commit with the new data. The original commit remains and is 'left behind' in the repo, and will be garbage-collected after a while if it is not referenced by anything else.
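The soft-reset approach described above, and the fact that the 'updated' commit is really a new commit, can be checked in a throwaway repo. The following is a minimal sketch; the repo location, file names, and identity settings are made up for illustration.

```shell
set -eu

# Throwaway repo (names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, for this demo only
git config user.email "demo@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Add README"

# The last commit accidentally includes scratch.txt.
echo "fix" > gui.txt
echo "debug output" > scratch.txt
git add .
git commit -q -m "Fix bug that froze the GUI"
old_id=$(git rev-parse HEAD)

# Soft reset: the commit is undone, its changes remain staged.
git reset -q --soft HEAD~1
# Unstage the unintended file (it becomes an untracked file).
git reset -q -- scratch.txt
# Commit again with only the intended change.
git commit -q -m "Fix bug that froze the GUI"
new_id=$(git rev-parse HEAD)

files_in_commit=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "old commit: $old_id"
echo "new commit: $new_id"
echo "files in new commit: $files_in_commit"
```

The two commit IDs differ, and the old commit object is still present in the repo (git cat-file -e succeeds for it) until Git eventually garbage-collects it.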
At this point: You should now be able to create more meaningful commits from the start, and also refine them further after they’ve been created.
What's next: Tour 6: Branching Locally
Guidance for the item(s) below:
As you are likely to be using an IDE for the iP, let's learn at least enough about IDEs to get you started using one.
🤔 In case you are puzzled by the sudden change of topic, it's because we take an iterative approach to covering topics, as explained in the panel below:
Professional software engineers often write code using Integrated Development Environments (IDEs). IDEs support most development-related work within the same tool (hence, the term integrated).
An IDE generally consists of:
Examples of popular IDEs:
Some web-based IDEs have appeared in recent times too e.g., Amazon's Cloud9 IDE.
Some experienced developers, in particular those with a UNIX background, prefer lightweight yet powerful text editors with scripting capabilities (e.g., Emacs) over heavier IDEs.
Refer to these se-edu guides:
Guidance for the item(s) below:
As you start adding features to your project iteratively, you'll need a way to detect if the new code breaks the existing code. Next, let's learn a rather simple way to do that using a certain type of testing (we'll be learning more sophisticated methods in later weeks).
This also means we are now switching focus from the implementation aspect to the testing aspect of SE.
Testing: Operating a system or component under specified conditions, observing or recording the results, and making an evaluation of some aspect of the system or component. -- source: IEEE
When testing, you execute a set of test cases. A test case specifies how to perform a test. At a minimum, it specifies the input to the software under test (SUT) and the expected behavior.
Example: A minimal test case for testing a browser:
* Input: load longfile.html located in the test data folder.
* Expected behavior: the scrollbar should be enabled after loading longfile.html.
Test cases can be determined based on the specification, reviewing similar existing systems, or comparing to the past behavior of the SUT.
For each test case you should do the following:
A test case failure is a mismatch between the expected behavior and the actual behavior. A failure indicates a potential defect (or a bug) -- we say 'potential' because the error could be in the test case itself.
Example: In the browser example above, a test case failure is implied if the scrollbar remains disabled after loading longfile.html. The defect/bug causing that failure could be an uninitialized variable.
When you modify a system, the modification may result in some unintended and undesirable effects on the system. Such an effect is called a regression.
Regression testing is the re-testing of the software to detect regressions. The typical way to detect regressions is retesting all related components, even if they had been tested before.
Regression testing is more effective when it is done frequently, after each small change. However, doing so can be prohibitively expensive if testing is done manually. Hence, regression testing is more practical when it is automated.
An automated test case can be run programmatically and the result of the test case (pass or fail) is determined programmatically. Compared to manual testing, automated testing reduces the effort required to run tests repeatedly and increases precision of testing (because manual testing is susceptible to human errors).
A simple way to semi-automate testing of a CLI (Command Line Interface) app is by using input/output re-direction. Here are the high-level steps:
Let's assume you are testing a CLI app called AddressBook. Here are the detailed steps:
Store the test input in the text file input.txt.
Example input.txt
Store the output you expect from the SUT in another text file expected.txt.
Example expected.txt
Run the program as given below, which will redirect the text in input.txt as the input to AddressBook and similarly, will redirect the output of AddressBook to a text file output.txt. Note that this does not require any changes in AddressBook code.
java AddressBook < input.txt > output.txt
The way to run a CLI program differs based on the language.
e.g., In Python, assuming the code is in the AddressBook.py file, use the command
python AddressBook.py < input.txt > output.txt
If you are using Windows, use a normal Command Prompt window (i.e., cmd.exe) to run the app, not a PowerShell window (PowerShell does not support the < input redirection operator).
Next, you compare output.txt with expected.txt. This can be done using a utility such as Windows' FC (i.e., File Compare) command, Unix's diff command, or a GUI tool such as WinMerge.
FC output.txt expected.txt
Note that the above technique is only suitable when testing CLI apps, and only if the exact output can be predetermined. If the output varies from one run to the other (e.g., it contains a time stamp), this technique will not work. In those cases, you need more sophisticated ways of automating tests.
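On a Unix-like system, the steps above can be combined into one small script. The following is a minimal sketch; sut.sh is a hypothetical stand-in for the real SUT (it simply greets each name it reads), and the sample input and expected output are made up for illustration.

```shell
set -eu

# Work in a temporary folder (hypothetical location).
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the SUT: greets each name read from standard input.
cat > sut.sh <<'EOF'
#!/bin/sh
while IFS= read -r name; do
  echo "Hello, $name"
done
EOF

# Store the test input and the expected output.
printf 'Alice\nBob\n' > input.txt
printf 'Hello, Alice\nHello, Bob\n' > expected.txt

# Run the SUT with redirected input and output.
sh sut.sh < input.txt > output.txt

# Compare actual vs expected; diff exits with 0 only if the files match.
if diff output.txt expected.txt > /dev/null; then
  result="PASS"
else
  result="FAIL"
fi
echo "$result"
```

Re-running such a script after every change gives a quick regression check; to test a real app, replace the sh sut.sh part with the command that launches it.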
Follow up notes for the item(s) above:
Congrats! You've made it to the end of this week's topics. It feels like a lot right now but now that we got an early start, this stuff will be second nature to you by the time you are done with the semester. 😃