In order to use Git to work effectively in a team, you need a central location that you all agree will act as the “One True Repository”. GitLab and GitHub are excellent organizations that offer both free and paid Git server services.
Remotes in GitLab and GitHub
Both are great!
GitLab
Open-core Ukrainian/Dutch company, headquartered in California.
Great for team collaboration within private companies.
Arguably the most popular private-repo solution for most major companies and research organizations.
Better project management tooling than GitHub.
Excellent for team-intensive projects that are not for immediate public release.
GitHub
Owned by Microsoft, headquartered in California.
The gold standard for open-source code.
Used by most of the major tech companies for their open-source code projects (but not for their private code projects).
Better tools for collaboration across many dispersed groups and user/developer communities.
Excellent for individual or open-source coding projects that are intended for sharing with the public.
Git: cloning from GitLab
When starting a new repo, it’s usually easier to initialize it first on GitLab (rather than initializing locally with git init) and then copy it to your local machine using git clone.
Now use the Jupyter Lab Launcher to create a new Jupyter Lab Notebook within git_branching_and_merging/. Run:
$ git status
.ipynb_checkpoints/
What do you see? You probably see files that we’d like to ignore. Let’s do that. Run:
$ vim .gitignore
Now add the file extensions you’d like to ignore:
*.ipynb
.ipynb_checkpoints/
*.nc
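Whether an ignore pattern includes the dot matters more than it might seem. The sketch below uses Python's fnmatch module as a rough stand-in for Git's matching of simple filename patterns. It is only an approximation: it does not model directories, negation, or other .gitignore rules. The file names are hypothetical and chosen only to show the difference.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only to illustrate the matching behaviour.
filenames = ["analysis.ipynb", "temperature.nc", "func", "euclidean_distance.py"]


def matched(pattern, names):
    """Return the names that a shell-style glob pattern matches."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Without the dot, the pattern matches ANY name that ends in "nc".
print(matched("*nc", filenames))      # ['temperature.nc', 'func']

# With the dot, only names with the .nc extension match.
print(matched("*.nc", filenames))     # ['temperature.nc']
print(matched("*.ipynb", filenames))  # ['analysis.ipynb']
```

To ask Git itself which rule (if any) ignores a given file, run git check-ignore -v <PATH> inside the repo.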
Git: branching and merging
Why branch? Because (1) you want to commit early and often and (2) you don’t want to break your working code.
Let’s update our Euclidean Distance function to take an arbitrary number of dimensions. Watch what happens when I update this function and commit without branching.
We need to branch in order to preserve our working code! Let’s see what branches we have. Run:
$ git branch
Then create and check out a new branch, on which we will do our modifications:
$ git checkout -b euclidean_distance_k_dimensions
Let’s check to ensure that we’re on our new branch:
$ git branch
* euclidean_distance_k_dimensions
main
Excellent! We’re on our new branch euclidean_distance_k_dimensions. Let’s get to work on updating our distance function to take an arbitrary number of dimensions.
To update our Python function euclidean_distance() to take an arbitrary number of dimensions, we’ll use the following code:
# k-dimensional Euclidean distance module
import math


def euclidean_distance(point1, point2):
    """
    Calculate the Euclidean distance between two points in n-dimensional space.

    Parameters:
    - point1 (list or tuple): Coordinates of the first point.
    - point2 (list or tuple): Coordinates of the second point.

    Returns:
    - float: The Euclidean distance between the two points.

    Notes:
    - This function assumes that both points have the same number of dimensions.
    - The Euclidean distance is calculated as the square root of the sum of the
      squared differences between corresponding coordinates of the two points.
    """
    # Validate input
    if len(point1) != len(point2):
        raise ValueError("Both points must have the same number of dimensions")

    # Calculate the sum of squared differences
    squared_diffs = [(x - y) ** 2 for x, y in zip(point1, point2)]

    # Calculate the square root of the sum of squared differences
    distance = sum(squared_diffs) ** 0.5
    return distance


if __name__ == "__main__":
    # Example usage
    point_a = (1, 2, 3)
    point_b = (4, 5, 6)
    print(euclidean_distance(point_a, point_b))
Replace the relevant lines in euclidean_distance.py with the above code. Test the changes.
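One possible way to test the changes is a handful of quick checks like the sketch below. To keep the example self-contained, it repeats a condensed copy of the k-dimensional function; in your own repo you would more likely import it from euclidean_distance.py. The specific test points are illustrative assumptions, not part of the original exercise.

```python
import math


def euclidean_distance(point1, point2):
    """Condensed copy of the k-dimensional function, repeated here for a self-contained check."""
    if len(point1) != len(point2):
        raise ValueError("Both points must have the same number of dimensions")
    return sum((x - y) ** 2 for x, y in zip(point1, point2)) ** 0.5


# 3-D example from the module: every coordinate differs by 3,
# so the distance should be sqrt(3**2 + 3**2 + 3**2) = sqrt(27).
assert math.isclose(euclidean_distance((1, 2, 3), (4, 5, 6)), math.sqrt(27))

# Points with different numbers of dimensions should be rejected.
try:
    euclidean_distance((1, 2), (1, 2, 3))
except ValueError as error:
    print("Rejected as expected:", error)
else:
    raise AssertionError("expected a ValueError for mismatched dimensions")

print("All checks passed")
```

If any assertion fails, you are still on your branch, so main keeps its working code while you fix the problem.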
Now check on the repo status. If you’re satisfied with the changes, stage and then commit the changes.
Now that we’re satisfied that our code works, we want to merge the changes back into main:
$ git status
$ git checkout main
$ git merge euclidean_distance_k_dimensions
Well done! You have now successfully done your first branch-and-merge, all using your local version of git.
What happens to the old branches? They persist after merging unless they are explicitly pruned. To prune the branch we just created, run:
$ git branch -d euclidean_distance_k_dimensions
If there are unmerged changes on the branch that is about to be deleted, git will refuse to delete it and warn you that the branch is not fully merged. If you really do want to delete it anyway, run git branch -D euclidean_distance_k_dimensions instead.
GitLab for project management: Initializing a new repo
Everything we just did with branching and merging can be done using GitLab, but in a manner that facilitates effective collaboration with a team. We can manage branching and merging while also tracking tasks we need to finish for the project via GitLab’s Issues feature.
Name the project <YOUR USERNAME HERE>_branch_and_merge_using_issues.
The result ought to look like this:
GitLab for project management: cloning your new repo
On your Jupyter Lab instance, open a new terminal and clone your new repo using git clone <PROJECT URL>. You can find your project’s URL on your project page under Code:
GitLab for project management: pushing changes from your local repository back to GitLab
Let’s add the 2-D Euclidean distance module that we’ve been working with for the past several days to the local copies of our repos. In Jupyter Lab, create a new file called euclidean_distance.py and copy in your 2-D Euclidean distance function, along with some test code:
# 2-dimensional Euclidean distance module
import math


def euclidean_distance(point1, point2):
    """
    Calculate the Euclidean distance between two points in 2-dimensional space.

    Parameters:
    - point1 (list or tuple): Coordinates of the first point (x1, y1).
    - point2 (list or tuple): Coordinates of the second point (x2, y2).

    Returns:
    - float: The Euclidean distance between the two points.

    Notes:
    - The Euclidean distance is calculated as the square root of the sum of the
      squared differences between corresponding coordinates of the two points.
    """
    # Calculate the sum of squared differences
    x1, y1 = point1
    x2, y2 = point2
    squared_diffs = (x2 - x1) ** 2 + (y2 - y1) ** 2

    # Calculate the square root of the sum of squared differences
    distance = math.sqrt(squared_diffs)
    return distance


if __name__ == "__main__":
    # Example usage
    point_a = (1, 2)
    point_b = (4, 5)
    print(euclidean_distance(point_a, point_b))
Now stage and commit your changes to your local repository using git add and git commit -m. Once you’ve staged and committed your local changes, you’re ready to push your changes back to GitLab.
What is git push? It’s the command that “pushes” the commits on your local branch to GitLab’s version of that branch. To push, first use git status to ensure that you have no uncommitted local changes on your branch. Then run git push origin <BRANCH NAME HERE>:
$ git status
$ git push origin main
Now navigate to your project page on GitLab. Do you see your commit? It ought to be visible as soon as you push.
GitLab for project management: GitLab Issues for managing project tasks
In GitLab, “to do” tasks are called “Issues”. They can be found under the Plan section of the project menu on the left-hand side of the project’s GitLab page.
Use that menu option to navigate to the Issues page.
Think of “Issues” as the place where you communicate all the tasks that need to happen in order to complete the project. Let’s use issues to track tasks and the associated branching/merging that we’ll need to perform.
Let’s use Issues to add the k-dimensional Euclidean distance capabilities to our module.
Go to your project’s Issues page and add New Issue. Add a descriptive title, such as
Add k-dimensional Euclidean distance
and select yourself as an Assignee.
GitLab for project management: create merge request and branch from Issues
Now we’ll create a branch (and simultaneous merge request) using our new issue. Doing this lets our team (and ourselves) know that we’re actively working on this issue, while at the same time protecting our working branches.
Select the merge request. Make sure your options are all correct (the defaults are usually appropriate) and click Create merge request. Notice that this automatically creates a new branch for you that is named after the related issue.
GitLab for project management: pulling to get the latest copy of our repo from GitLab
Now we need to pull this new branch to our Jupyter Lab’s copy of the repo and begin working on it. To do this, we’ll first run git status to ensure that there are no new local changes and then use git pull to pull new changes:
$ git status
$ git pull
git pull will notify you that a new branch has been created.
NOTE: git pull and git push are nearly as important as git add and git commit. You will be using them nearly as frequently.
What happens if you run git branch?
GitLab for project management: update branch and push changes back to GitLab
Check out the new branch with git checkout <BRANCH NAME HERE> and update it with your k-dimensional Euclidean distance. Then:
Stage the changes with git add
Save the changes with git commit -m
Push the changes back to GitLab with git push origin <BRANCH NAME HERE>
# k-dimensional Euclidean distance module
import math


def euclidean_distance(point1, point2):
    """
    Calculate the Euclidean distance between two points in n-dimensional space.

    Parameters:
    - point1 (list or tuple): Coordinates of the first point.
    - point2 (list or tuple): Coordinates of the second point.

    Returns:
    - float: The Euclidean distance between the two points.

    Notes:
    - This function assumes that both points have the same number of dimensions.
    - The Euclidean distance is calculated as the square root of the sum of the
      squared differences between corresponding coordinates of the two points.
    """
    # Validate input
    if len(point1) != len(point2):
        raise ValueError("Both points must have the same number of dimensions")

    # Calculate the sum of squared differences
    squared_diffs = [(x - y) ** 2 for x, y in zip(point1, point2)]

    # Calculate the square root of the sum of squared differences
    distance = sum(squared_diffs) ** 0.5
    return distance


if __name__ == "__main__":
    # Example usage
    point_a = (1, 2, 3)
    point_b = (4, 5, 6)
    print(euclidean_distance(point_a, point_b))
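Before the branch is merged into main, it may be worth confirming that the k-dimensional function still gives the same answers as the 2-D function it replaces. The sketch below is one way to do that, not part of the original exercise. The name euclidean_distance_2d is hypothetical (it stands in for the old 2-D function, renamed so that both versions can sit side by side), and the random test points and the seed are arbitrary choices.

```python
import math
import random


def euclidean_distance_2d(point1, point2):
    """The original 2-D formula, under a hypothetical name used only for this comparison."""
    x1, y1 = point1
    x2, y2 = point2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


def euclidean_distance(point1, point2):
    """Condensed copy of the k-dimensional function."""
    if len(point1) != len(point2):
        raise ValueError("Both points must have the same number of dimensions")
    return sum((x - y) ** 2 for x, y in zip(point1, point2)) ** 0.5


# Compare the two versions on many random 2-D points.
random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    a = (random.uniform(-100, 100), random.uniform(-100, 100))
    b = (random.uniform(-100, 100), random.uniform(-100, 100))
    assert math.isclose(euclidean_distance(a, b), euclidean_distance_2d(a, b))

print("k-dimensional version agrees with the 2-D version on 1000 random 2-D inputs")
```

A check like this only shows agreement on the sampled inputs, but it is a cheap way to catch an accidental change in behavior before reviewers see the merge request.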
GitLab for project management: merging branches with GitLab Issues
Now that we’ve updated our branch, it’s time to merge it back into main with GitLab.
Under Code in the left-hand menu, navigate to Merge requests.
Once there, you can review and approve the merge request. Go ahead and do so. Then navigate back to your repo’s GitLab code base. What do you see?
Congratulations! You’ve just created your first GitLab issue and completed it using branching and merging!