GitHub & Version Control

We use GitHub for version control and code collaboration. Our repositories are hosted in the majority-dev organization.

Organization

GitHub Organization: majority-dev

You'll be added to the DataTeam group by your manager.

GitHub CLI

The GitHub CLI provides a convenient way to interact with GitHub from the terminal.

Installation

brew install gh

Authentication

gh auth login

Follow the prompts to authenticate via your browser.

Verify

List repositories in the organization:

gh repo list majority-dev --limit 100

Clone a Repository

Using HTTPS:

git clone https://github.com/majority-dev/dt-structure.git

Using SSH:

git clone git@github.com:majority-dev/dt-structure.git


Workflow

1. Create Feature Branch

git checkout -b feature/your-feature-name

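Branch naming convention: start the branch name with the ticket code (e.g. DATA-2222), followed by a short description of the change. Example: DATA-2758: Update Axiom tx codes and accounts. Note that Git does not allow spaces or colons in branch names, so on the branch itself write this in hyphenated form, e.g. DATA-2758-update-axiom-tx-codes-and-accounts.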
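A branch name can be checked against this convention before pushing. The sketch below is a hypothetical helper, not existing team tooling; it assumes ticket codes always have the form DATA-<digits> and that the description uses only letters, digits, dots, underscores, and hyphens:

```shell
# Hypothetical helper: returns 0 if the branch name follows the convention.
# Assumption: ticket codes look like DATA-<digits>.
is_valid_branch() {
    # Ticket code, a hyphen, then a description without spaces or colons.
    printf '%s\n' "$1" | grep -Eq '^DATA-[0-9]+-[A-Za-z0-9._-]+$' || return 1
    # Let Git confirm the name is a legal branch ref.
    git check-ref-format "refs/heads/$1"
}
```

For example, is_valid_branch "DATA-2758-update-axiom-tx-codes" succeeds, while a name containing spaces or a colon fails.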

2. Make Changes

Edit files, add features, fix bugs.

3. Commit Changes

git add .
git commit -m "Clear, descriptive commit message"

4. Push to GitHub

git push origin feature/your-feature-name

5. Create Pull Request

Using the GitHub UI:

  1. Navigate to the repository
  2. Click "Compare & pull request"
  3. Check the relevant checkboxes in the PR template
  4. Request reviewers
  5. Submit the PR
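The pull request can also be opened from the terminal with the GitHub CLI. A minimal sketch, assuming the base branch is main; open_pr is a hypothetical helper name and the reviewer handle is a placeholder:

```shell
# Hypothetical helper: open a PR for the current branch.
# Assumption: PRs target the main branch.
open_pr() {
    title=$1      # PR title, e.g. the ticket code plus a short description
    reviewer=$2   # GitHub handle of the reviewer to request
    gh pr create --base main --title "$title" --reviewer "$reviewer"
}
```

Because no body is passed, gh should prompt for it interactively, so the PR template checkboxes can still be filled in.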

6. Merge

Once approved:

  • Squash and merge. This should delete the remote feature branch after merging.
  • Check the repo's Actions tab: you may need to manually trigger the deployment of your code to the prod environment.
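Both merge steps have GitHub CLI equivalents. The sketch below uses hypothetical helper names; the workflow file name passed to trigger_deploy is an assumption, so check .github/workflows/ in your repo for the real one:

```shell
# Hypothetical helper: squash-merge a PR by number and delete its branch.
merge_pr() {
    gh pr merge "$1" --squash --delete-branch
}

# Hypothetical helper: manually start a workflow on main.
# Only works for workflows that declare a workflow_dispatch trigger.
trigger_deploy() {
    gh workflow run "$1" --ref main
}
```

For example, merge_pr 42 followed by trigger_deploy deploy.yml (where deploy.yml is a placeholder name).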


GitHub Actions

We use GitHub Actions for CI/CD.

Common Workflows

  • Linting - Code quality checks
  • Build - Build Docker images
  • Deploy - Deploy to environments

Viewing Workflows

  1. Go to the repository on GitHub
  2. Click the "Actions" tab
  3. View workflow runs and logs

Troubleshooting Failed Workflows

  1. Click the failed workflow
  2. Review logs for errors
  3. Fix issues locally
  4. Push fixes
  5. The workflow runs again automatically on push
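Failed runs can also be inspected from the terminal, run from inside the repository clone. A minimal sketch with hypothetical helper names; the run ID comes from the output of the first command:

```shell
# List recent failed workflow runs (default: last 10).
failed_runs() {
    gh run list --status failure --limit "${1:-10}"
}

# Print only the log output of the failed steps of one run.
failed_logs() {
    gh run view "$1" --log-failed
}

# Re-run only the failed jobs of one run, without pushing a new commit.
rerun_failed() {
    gh run rerun "$1" --failed
}
```

Re-running is mainly useful for transient failures; a real code issue still needs a fix and a push.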

[TODO] update the below section, move Python sections to Python and link to it here

Post-Migration Setup

After the migration completes, follow these steps:

GitHub Configuration

1. Add Repo to Data Team

Ask the Platform team (or Michel) to add the repo to the Data team repos. This allows all team members to access the Settings pane.

2. Create Environments

Manually create environments under Settings > Environments:

  • dev
  • prod

Recommended: Enable manual approval for the prod environment. When code is merged to main, the deployment pipeline will:

  • Create an image
  • Push it to Azure Container Registry
  • Wait for manual approval before prod deployment

You'll receive an email notification for approval.
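As an alternative to clicking through the UI, the two environments could be created through the GitHub REST API. This is a sketch, assuming you have admin rights on the repo; create_envs is a hypothetical helper name, and the manual-approval rule for prod is not set here and would still need configuring under Settings > Environments:

```shell
# Hypothetical helper: create the dev and prod environments for a repo.
# Argument: repo in OWNER/NAME form, e.g. majority-dev/[YOUR_REPO_NAME].
create_envs() {
    repo=$1
    for name in dev prod; do
        gh api --method PUT "repos/$repo/environments/$name"
    done
}
```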

3. Set Repository Access

Add the Data Team as Admin:

  1. Navigate to Settings > Collaborators & Teams
  2. Click Add...
  3. Select Data Team
  4. Set role to Admin
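The same access can be granted with the GitHub REST API. A sketch with a hypothetical helper name; the team slug is an assumption you need to confirm in the organization's Teams page:

```shell
# Hypothetical helper: give a team the Admin role on a repo.
# Arguments: organization, team slug (assumption: check it in the org), repo name.
grant_team_admin() {
    org=$1
    team_slug=$2
    repo=$3
    gh api --method PUT "orgs/$org/teams/$team_slug/repos/$org/$repo" -f permission=admin
}
```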

4. Configure Branch Protection

Under Settings > Branches, enable at minimum:

  • ☑️ Checkbox 1: Require a pull request before merging
  • ☑️ Checkbox 3: Require status checks to pass before merging
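These two rules can also be applied through the branch protection REST endpoint. The sketch below assumes classic branch protection on main; protect_main is a hypothetical helper name, the single required approval is an assumption, and the empty contexts list means no specific status check is named yet:

```shell
# Hypothetical helper: require PRs and status checks on main.
# Argument: repo in OWNER/NAME form.
protect_main() {
    repo=$1
    # The endpoint expects all four top-level keys; false/null leave a rule off.
    gh api --method PUT "repos/$repo/branches/main/protection" --input - <<'EOF'
{
  "required_pull_request_reviews": { "required_approving_review_count": 1 },
  "required_status_checks": { "strict": false, "contexts": [] },
  "enforce_admins": false,
  "restrictions": null
}
EOF
}
```

Add the names of your lint or build checks to contexts once the workflows exist, otherwise no particular check is enforced.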

5. Subscribe to Slack Notifications

Subscribe the repo to the #data_team_pull_requests channel:

/github subscribe majority-dev/[YOUR_REPO_NAME] pulls

6. Configure Pull Request Settings

Under Settings > General, scroll to Pull Requests:

  • ☑️ Allow squash merging
  • ☐ Uncheck other merge options
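The merge options can be set from the GitHub CLI as well. A minimal sketch, assuming admin rights on the repo; squash_only is a hypothetical helper name:

```shell
# Hypothetical helper: allow squash merging only.
# Argument: repo in OWNER/NAME form.
squash_only() {
    gh repo edit "$1" \
        --enable-squash-merge \
        --enable-merge-commit=false \
        --enable-rebase-merge=false
}
```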


Code Migration (Python Repositories)

GitHub Workflows

Add two workflows under .github/workflows/:

  1. Code Deployment Pipeline - Deploys your code
  2. Code Quality Check Pipeline - Checks formatting and linting

Reference: dt-cfsb-sftp workflows

Container Image Naming

Ensure the container image name in your deployment configuration is correct. This name will be used in:

  • Azure Container Registry
  • Airflow (or other services picking up the image)

Dockerfile Updates

Python Version:

  • Update to the latest Python version if possible
  • Consider package dependencies before upgrading

Package Manager:

We use uv as our package manager:

# Initialize uv environment
uv init .

# Create virtual environment
uv venv

# Activate virtual environment
source .venv/bin/activate

# Add dependencies
uv add <package-name>

# Or from requirements.txt
uv add -r requirements.txt

# Add dev dependencies (e.g., ruff)
uv add ruff --dev

Update Dockerfile:

Use uv in your Dockerfile. Example: dt-cfsb-sftp Dockerfile

Delete requirements.txt:

Once pyproject.toml is created, delete requirements.txt as it's no longer needed.

Cleanup Old Files

Delete Azure DevOps Folder:

rm -rf .azuredevops

This folder contains the old Azure DevOps pipeline configuration.

Add Required Files

CODEOWNERS File

Create .github/CODEOWNERS:

* @majority-dev/data

Reference: dt-cfsb-sftp example
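The file can be created from the repository root in one step. A small sketch; add_codeowners is a hypothetical helper name, and the owner entry is the one given above:

```shell
# Hypothetical helper: write .github/CODEOWNERS in the given repo directory
# (defaults to the current directory).
add_codeowners() {
    repo_dir=${1:-.}
    mkdir -p "$repo_dir/.github"
    # "*" makes the team the owner of every file in the repo.
    printf '%s\n' '* @majority-dev/data' > "$repo_dir/.github/CODEOWNERS"
}
```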

Linting Configuration

Create ruff.toml with your linting rules. We use Ruff for linting and formatting.

Example ruff.toml:

line-length = 120
target-version = "py312"

[lint]
select = ["E", "F", "I"]
ignore = ["E501"]
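With ruff added as a dev dependency through uv, the configuration can be exercised locally before pushing. A minimal sketch; lint is a hypothetical helper name, and the exact commands run by the team's Code Quality Check Pipeline may differ:

```shell
# Hypothetical helper: run the lint rules, then verify formatting without
# modifying files. Argument: path to check (defaults to the current directory).
lint() {
    target=${1:-.}
    uv run ruff check "$target" && uv run ruff format --check "$target"
}
```

Dropping --check from the second command lets Ruff rewrite the files instead of only reporting differences.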

Migrating Pipelines from Azure DevOps to GitHub

The equivalent of Azure Pipelines in GitHub is GitHub Actions.

Using GitHub Actions Importer

# Install GitHub extension
gh extension install github/gh-actions-importer

# Update extension
gh actions-importer update

# Configure personal access tokens for GitHub and Azure DevOps
gh actions-importer configure

# Perform dry-run migration
gh actions-importer dry-run azure-devops pipeline \
  --pipeline-id 937 \
  --output-dir path/to/Documents/repos/ \
  --azure-devops-organization MAJORITY \
  --azure-devops-project "$ADO_PROJECT"

The dry run writes the converted GitHub Actions workflow YAML to the output directory. Review the generated file, then add it to your GitHub repo under .github/workflows/.
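Copying the generated file into place could look like the sketch below. The exact layout of the output directory is an assumption here, so the hypothetical copy_workflows helper simply searches it recursively for YAML files:

```shell
# Hypothetical helper: copy generated workflow files into a repo.
# Arguments: importer output directory, path to the target repo clone.
copy_workflows() {
    out_dir=$1
    repo_dir=$2
    mkdir -p "$repo_dir/.github/workflows"
    # Assumption: every YAML file under the output directory is a workflow.
    find "$out_dir" -type f \( -name '*.yml' -o -name '*.yaml' \) \
        -exec cp {} "$repo_dir/.github/workflows/" \;
}
```

Files with the same name overwrite each other, so use a separate output directory per pipeline.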