
Overview

This live session introduces a practical workflow for version control and reproducible reporting using GitHub, R, and Quarto. We will move from local projects to publishable, shareable outputs with a minimal but robust setup.

Who this is for

R users who are new to Git and GitHub
Analysts who want a clean, repeatable workflow for public health projects
Anyone preparing to publish Quarto documents or GitHub Pages sites

What you will learn

Why version control matters for public health data science
How to connect RStudio to GitHub securely
How to structure a project for collaboration and reproducibility
How to use Quarto for reports and websites
How to publish a project using GitHub Pages

Mini-case example for the session

We will use a small, realistic dataset to practice a reproducible workflow from data import to a published report. The example mimics a monthly district summary often used in routine surveillance.

In this mini-case, you are supporting a district review meeting where the program team wants to understand whether recent increases in suspected malaria cases reflect a true rise in transmission or shifts in testing effort. You will compare districts across a few months, calculate positivity, and prepare a short summary that could be shared with colleagues and decision makers.

This is also a gentle introduction to Quarto. If you have not used Quarto before, think of it as a modern, easy way to combine R code, results, and narrative text in a single document. We will use Quarto to turn the analysis into a clean report that can be shared or published online, without copying and pasting output by hand.

You can think of the Quarto workflow in three steps:

Write your analysis and narrative in a single .qmd file
Render to HTML with one command
Publish the output to GitHub Pages

This post uses the same workflow

This post is written in Quarto, rendered to HTML, and published with GitHub Pages, so it doubles as a real example of the steps we will practice together in the session.

Case question

Which districts show a rise in suspected malaria cases over the last three months, and how do testing numbers affect interpretation?
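To make the case question concrete, here is a minimal base R sketch using made-up numbers. The district names ("District A", "District B") and every count are hypothetical and are not taken from the session dataset. It only illustrates why suspected cases should be read alongside testing numbers and positivity.

```r
# Hypothetical three-month data for two districts (illustrative numbers only)
toy <- data.frame(
  district        = rep(c("District A", "District B"), each = 3),
  month           = rep(as.Date(c("2024-01-01", "2024-02-01", "2024-03-01")), times = 2),
  suspected_cases = c(400, 500, 600, 400, 500, 600),
  tests_done      = c(200, 250, 300, 200, 200, 200),
  positive_tests  = c(50, 62, 75, 50, 70, 90)
)

# Positivity = positive tests / tests done
toy$positivity <- toy$positive_tests / toy$tests_done

# Compare the first and last month within each district
first <- toy[toy$month == min(toy$month), ]
last  <- toy[toy$month == max(toy$month), ]

change <- data.frame(
  district          = first$district,
  cases_change      = last$suspected_cases - first$suspected_cases,
  tests_change      = last$tests_done - first$tests_done,
  positivity_change = round(last$positivity - first$positivity, 2)
)
change
```

In this toy example both districts report the same rise in suspected cases. In District A testing rises too and positivity stays flat, which points toward increased testing effort. In District B testing is constant and positivity rises, which is more consistent with a real increase.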

During the session, we will:

Commit the raw data import
Commit the cleaned dataset with a clear message
Add a Quarto report section with the figure
Publish the report to GitHub Pages

Session outline

Version control concepts in plain language
Git setup and authentication
Creating a new R project with Git
Project structure and .gitignore best practices
First commits and commit messages that tell a story
Pushing to GitHub and working with issues
Building a Quarto document and publishing

Before the session

Please complete the steps below so we can spend the session coding together.

Install and verify Git

Install Git if needed:

https://git-scm.com/downloads

Then, in RStudio:

#| eval: false
usethis::git_sitrep()

If anything is missing, follow the instructions shown in the report.

Authenticate GitHub from R

#| eval: false
usethis::gh_token_help()

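Verify stored credentials (this returns an error if no token has been stored yet; in that case, usethis::create_github_token() and gitcreds::gitcreds_set() help you create and save one):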

#| eval: false
gitcreds::gitcreds_get()

If git_sitrep() reports a missing name or email, set them now (Git uses both to label your commits):

#| eval: false
usethis::use_git_config(
  user.name = "Your Name",
  user.email = "your_email@example.com"
)

Core Git workflow (add, commit, push)

If this is your first time using Git, the basic cycle is:

Make changes to your files
Stage (add) the files you want to include in the next commit
Commit with a clear message
Push your commits to GitHub

RStudio Git tab (recommended)

Save your files
Open the Git tab
Check the files you want to include
Click Commit, write a message, and commit
Click Push to send changes to GitHub

Command line (equivalent)

#| eval: false
git status
git add README.md
git commit -m "Add project overview"
git push

Common commands and what they do

git status shows what changed
git add stages changes for a commit
git commit -m "message" saves a snapshot
git push sends commits to GitHub
git pull gets the latest changes from GitHub
git log --oneline shows your recent history

Commit message tips

Start with a verb (e.g., “Add”, “Fix”, “Update”)
Keep it short but specific
Mention the main change, not every detail

Example:

Add summary plot for malaria case trends

Mini-exercise: recover from a mistake

Use this if time allows. The goal is to see how Git helps you undo a change safely.

Open a script or Quarto file and make a small edit
Check what changed:

#| eval: false
git status

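Discard the change (this permanently drops the uncommitted edits to that file, so only use it on changes you do not need; git restore requires Git 2.23 or later):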

#| eval: false
git restore path/to/file.qmd

Confirm the file is back to its original state:

#| eval: false
git status

If you prefer the RStudio interface, you can right-click a file in the Git tab and choose Revert.

Mini-exercise: undo a staged file

This helps if you accidentally staged the wrong file.

#| eval: false
git add path/to/file.qmd
git status
git restore --staged path/to/file.qmd
git status

Mini-exercise: review a diff before commit

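Use this to see exactly what will be committed. git diff shows changes you have not staged yet; git diff --staged shows the staged changes that the next commit will include.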

#| eval: false
git diff
git diff --staged

During the session

We will build a small example project together. If you want to follow along, open RStudio and have a folder ready where you can create a new project.

Suggested project structure

#| eval: false
fs::dir_create(c("data", "scripts", "outputs", "local"))

Creating a clear structure early saves time later. It keeps raw data, cleaning scripts, and outputs separated so collaborators can find what they need quickly. It also makes it easier to automate your workflow and reduces the chance of accidentally overwriting files or mixing raw and processed data.

project/
  data/         # raw data files
  scripts/      # R scripts
  outputs/      # cleaned data, figures, models
  local/        # private or sensitive files (ignored)
  README.md
  index.qmd     # Quarto document
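The separation between raw and processed files can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the file names (example_raw.csv, example_clean.csv) and the values are hypothetical, and everything is created in a temporary folder so no real project is touched.

```r
# Work in a temporary folder so the sketch does not touch a real project
project <- tempfile("project-demo")
for (folder in c("data", "scripts", "outputs", "local")) {
  dir.create(file.path(project, folder), recursive = TRUE)
}

# Hypothetical raw file (made-up values)
raw_path <- file.path(project, "data", "example_raw.csv")
write.csv(
  data.frame(district = c("District A", "District B"), tests_done = c(200, NA)),
  raw_path,
  row.names = FALSE
)

# Read from data/, clean, and write the result to outputs/
raw <- read.csv(raw_path)
cleaned <- raw[!is.na(raw$tests_done), ]
clean_path <- file.path(project, "outputs", "example_clean.csv")
write.csv(cleaned, clean_path, row.names = FALSE)

# The raw file is left untouched; the cleaned copy lives in outputs/
list.files(project, recursive = TRUE)
```

Because scripts only read from data/ and only write to outputs/, rerunning the cleaning step cannot overwrite the raw file.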

Example .gitignore

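Use a .gitignore file to tell Git which files should never be tracked. Add it early in the project so you do not accidentally commit large datasets, temporary files, or anything sensitive. You can update .gitignore at any time, but it only applies to files Git is not already tracking: a file that has already been committed stays tracked until you remove it from the index (for example with git rm --cached), and it remains in the project history. It is therefore best to put .gitignore in place before you start adding files.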

.Rhistory
.RData
/local/
*.docx
*.tmp

Collaborating with GitHub Issues

Issues are a lightweight way to coordinate work and keep a clear record of decisions. They help teams capture tasks, bugs, and questions in one place so progress is visible to everyone.

Key features to know:

Labels to group work (e.g., bug, data, documentation)
Assignees to show who is responsible for a task
Checklists to break work into steps
References that link issues to commits and pull requests (for example, mentioning # followed by the issue number in a commit message)

Good issue hygiene:

Write a clear title and short description
Add context, links, or screenshots when useful
Keep issues focused on one task or decision

To collaborate with others, add them to your repository:

Open the repo on GitHub
Go to Settings → Collaborators and teams
Invite a collaborator by GitHub username or email

Once they accept, they can push changes, open issues, and submit pull requests (depending on permissions).

Example issue template

**Task:** Add data dictionary for malaria routine data

**Why this matters**
- Clarifies variable definitions for collaborators
- Supports reproducibility and handover

**Steps**
- [ ] Draft variable list and descriptions
- [ ] Add to `README.md`
- [ ] Link the issue in the next commit

**Assigned to:** @username

Quarto and publishing

We will create a simple Quarto document and publish it with GitHub Pages.

---
title: "My First Quarto Document"
format: html
editor: visual
---

Build the example Quarto document

Create a file called index.qmd in your project root and add a short write-up plus the mini-case R code. Below is a minimal working example you can paste and adapt.

Summary

This report checks recent suspected malaria cases and test positivity across districts.

library(readr)
library(dplyr)
library(ggplot2)
library(plotly)

case_data <- read_csv("data/malaria_routine_data.csv")

Rows: 72 Columns: 8

── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): district
dbl  (6): population, under5_population, suspected_cases, tests_done, positi...
date (1): month

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

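case_summary <- case_data %>%
  # Monthly positivity: share of tests that were positive
  mutate(positivity = positive_tests / tests_done) %>%
  group_by(district) %>%
  summarise(
    total_cases = sum(suspected_cases),
    # Unweighted mean of the monthly positivity values
    mean_positivity = mean(positivity)
  ) %>%
  # Districts with the most suspected cases first
  arrange(desc(total_cases))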

case_summary

# A tibble: 6 × 3
  district total_cases mean_positivity
  <chr>          <dbl>           <dbl>
1 Mansa          11023           0.284
2 Chipata         9011           0.243
3 Mongu           8505           0.263
4 Solwezi         7387           0.223
5 Choma           6779           0.213
6 Kasama          6123           0.203
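The summary above averages the monthly positivity values, so every month counts equally regardless of how many tests were done. An alternative is pooled positivity: all positive tests divided by all tests. The base R sketch below uses made-up numbers for a single hypothetical district to show how far the two can diverge when testing volume changes between months. It is an illustration, not a recommendation for either measure.

```r
# Hypothetical monthly data for one district (illustrative numbers only)
toy <- data.frame(
  district       = "District A",
  tests_done     = c(100, 900),
  positive_tests = c(50, 90)
)

# Unweighted mean of the monthly positivity values: (0.50 + 0.10) / 2
monthly_positivity <- toy$positive_tests / toy$tests_done
mean_positivity <- mean(monthly_positivity)

# Pooled positivity: all positives divided by all tests (140 / 1000)
pooled_positivity <- sum(toy$positive_tests) / sum(toy$tests_done)

c(mean = mean_positivity, pooled = pooled_positivity)
```

The pooled value weights each month by its testing volume, so it is less sensitive to months with very few tests. Which one to report depends on the question being asked.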

ggplot(case_data, aes(month, suspected_cases, color = district)) +
  geom_line(linewidth = 1) +
  geom_point() +
  labs(
    title = "Suspected malaria cases by district",
    x = "Month",
    y = "Suspected cases"
  ) +
  theme_minimal()

interactive_plot <- ggplot(case_data, aes(month, suspected_cases, color = district)) +
  geom_line(linewidth = 1) +
  geom_point() +
  labs(
    title = "Suspected malaria cases by district (interactive)",
    x = "Month",
    y = "Suspected cases"
  ) +
  theme_minimal()

plotly::ggplotly(interactive_plot)

Publishing workflow (run the usethis call in the R console and the quarto commands in the Terminal):

#| eval: false
usethis::use_github_pages()

#| eval: false
quarto preview

#| eval: false
quarto publish gh-pages

If you have not installed Plotly before:

#| eval: false
install.packages("plotly")

Beyond a basic report

Quarto and GitHub Pages can publish far more than a static report. Once you are comfortable with the basics, you can explore:

Dashboards for routine monitoring and rapid summaries
Interactive maps for spatial data and intervention targeting
Slide decks for presentations and briefings
Websites for documentation, data dictionaries, and project notes

These options use the same workflow: build your content in Quarto, then publish it with GitHub Pages.

Practice exercises

Create a new R project with Git enabled
Make three commits with descriptive messages
Push to GitHub
Create a Quarto index.qmd and publish to GitHub Pages
Open a GitHub Issue to track a next step

Resources

Happy Git with R: https://happygitwithr.com/
Quarto documentation: https://quarto.org
GitHub Pages with Quarto: https://quarto.org/docs/publishing/github-pages.html
Git reference documentation: https://git-scm.com/docs

Questions or help

If you run into setup issues before the session, share the output of usethis::git_sitrep() in the cohort channel so we can troubleshoot.