DaTax R Course

Master R for Tax Administration Analytics

Welcome

This course teaches you practical R programming for tax administration. You’ll learn to analyze taxpayer data, automate reports, and create professional visualizations—no programming experience required.

Built by the World Bank DaTax team for tax administrators worldwide.


What You’ll Learn

By the end of this course, you will be able to:

🎯 Learning Objectives

Data Management

  • Import and export tax datasets (CSV, Excel, multiple file formats)
  • Clean and transform taxpayer records efficiently
  • Handle millions of rows with confidence

Data Analysis

  • Filter and subset taxpayer populations
  • Calculate tax statistics and compliance metrics
  • Group and summarize data by regions, sectors, or time periods
  • Merge datasets from different tax systems

Data Visualization

  • Create professional charts for reports and presentations
  • Visualize tax revenue trends, compliance rates, and distributions
  • Design publication-ready graphics with ggplot2

Reproducible Workflows

  • Write clear, documented R scripts
  • Automate repetitive analysis tasks
  • Share reproducible analysis with colleagues
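As a rough preview of the kind of analysis these objectives describe, the sketch below filters a small made-up taxpayer table and summarizes tax due by region. It uses only base R so that it runs on a fresh installation. The data frame, its column names, and its values are invented for illustration and are not part of the course datasets.

```r
# Hypothetical taxpayer records (invented for illustration only)
taxpayers <- data.frame(
  taxpayer_id = 1:6,
  region      = c("North", "North", "South", "South", "East", "East"),
  sector      = c("Retail", "Mining", "Retail", "Retail", "Mining", "Retail"),
  tax_due     = c(1200, 5400, 800, 950, 4300, 700),
  filed       = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Filter: keep only taxpayers who filed a return
filers <- taxpayers[taxpayers$filed, ]

# Group and summarize: total tax due by region among filers
tax_by_region <- aggregate(tax_due ~ region, data = filers, FUN = sum)

# A simple compliance metric: share of taxpayers who filed
filing_rate <- mean(taxpayers$filed)

print(tax_by_region)
print(filing_rate)
```

The same steps (filter, group, summarize) are what the data wrangling module covers with dplyr; base R is used here only to keep the example free of extra packages.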

Why Learn R for Tax Administration?

  • Free & Open Source: No licensing costs for your organization
  • Powerful: Handle datasets with millions of taxpayer records
  • Reproducible: Every analysis step is documented and repeatable
  • Efficient: Automate monthly reports and routine analyses
  • Growing: Large community of users in government and research
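To make the "reproducible" and "efficient" points more concrete, here is a minimal sketch of automating a recurring report: the summary logic is written once as a function and then applied to every month. The data, the column names, and the helper `summarize_month()` are hypothetical, and the output file is written to a temporary folder.

```r
# Hypothetical monthly collections (invented for illustration only)
collections <- data.frame(
  month  = c("2024-01", "2024-01", "2024-02", "2024-02", "2024-03"),
  amount = c(100, 250, 300, 50, 400)
)

# Hypothetical helper: summarize the payments of a single month
summarize_month <- function(data, month) {
  rows <- data[data$month == month, ]
  data.frame(
    month      = month,
    n_payments = nrow(rows),
    total      = sum(rows$amount)
  )
}

# Apply the same steps to every month, then stack the results
months  <- unique(collections$month)
reports <- do.call(rbind, lapply(months, summarize_month, data = collections))

# Save the combined report; a temporary file keeps the example harmless
report_file <- file.path(tempdir(), "monthly_report.csv")
write.csv(reports, report_file, row.names = FALSE)

print(reports)
```

Because every step lives in the script, rerunning it next month with new data repeats exactly the same analysis.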

Prerequisites

✓ What You Need

Required

  • A computer with R and RStudio installed (see installation guide below)
  • Basic computer skills (opening files, using folders)
  • An internet connection to download course materials

Not required

  • Programming experience
  • A statistics background
  • Prior knowledge of R or any coding language

This course is designed for complete beginners. If you can work with Excel, you can learn R.


Course Structure

The course consists of 5 sequential modules designed to build your skills progressively:

Module 1: Introduction to R — Learn R basics, RStudio interface, variables, and fundamental data types.

Module 2: Data Import & Export — Master reading CSV and Excel files, handle different file formats, and export results.

Module 3: Data Wrangling with dplyr — Transform datasets with filtering, selecting, creating variables, and summarizing.

Module 4: Reshaping & Joining Data — Reshape data between wide and long formats. Combine multiple datasets.

Module 5: Data Visualization with ggplot2 — Create professional charts and communicate insights effectively.

Each module includes step-by-step explanations, hands-on exercises with real tax data, template scripts, and complete solutions.
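As a small taste of what the import/export and joining modules cover, the sketch below writes a table to CSV, reads it back, and joins it to a second table. It is only an illustration in base R under stated assumptions: both tables and their column names are invented, and the course modules may use different functions for the same tasks.

```r
# Hypothetical data: a taxpayer registry and a payments extract
registry <- data.frame(
  taxpayer_id = c(101, 102, 103),
  region      = c("North", "South", "East")
)
payments <- data.frame(
  taxpayer_id = c(101, 101, 103),
  amount      = c(500, 250, 900)
)

# Export to CSV, then import it again (temporary file, deleted afterwards)
csv_path <- tempfile(fileext = ".csv")
write.csv(payments, csv_path, row.names = FALSE)
payments_in <- read.csv(csv_path)
unlink(csv_path)

# Join: keep every registered taxpayer, even those with no payment
joined <- merge(registry, payments_in, by = "taxpayer_id", all.x = TRUE)

print(joined)
```

Taxpayer 102 has no payment, so its `amount` is `NA` in the joined table; spotting such gaps is a common reason to merge registry and payment data.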


Getting Started

Step 1: Install R and RStudio

You need to install two pieces of software: R (the programming language) and RStudio (the interface that makes R easy to use).

Important: Installation Order

Always install R first, then RStudio. RStudio needs R to be already installed on your computer.

Installing R

Windows

  1. Visit the R website
    Go to https://cloud.r-project.org/

  2. Click “Download R for Windows”
    You’ll see this link near the top of the page

  3. Click “base” or “install R for the first time”
    This takes you to the download page for the basic R installation

  4. Click “Download R-4.x.x for Windows”
    (The version number will be different, e.g., R-4.4.1)
    This downloads an .exe file (usually to your Downloads folder)

  5. Run the installer

    • Double-click the downloaded .exe file
    • If Windows asks “Do you want to allow this app to make changes?”, click Yes
    • Choose English (or your preferred language)
    • Click Next through all screens, accepting the default options
    • Click Finish when installation completes
  6. Verify installation

    • Look for the R icon on your desktop or in your Start menu
    • You don’t need to open R directly; you’ll use RStudio instead

macOS

  1. Visit the R website
    Go to https://cloud.r-project.org/

  2. Click “Download R for macOS”
    You’ll see this link near the top of the page

  3. Choose the correct version for your Mac

    • For newer Macs (M1, M2, M3 chips): Download the file that says “Apple silicon arm64”
    • For older Macs (Intel processors): Download the file that says “Intel 64-bit”
    • If you’re not sure which you have, click the Apple menu → “About This Mac” and look under “Chip” or “Processor”
  4. Download the .pkg file
    Click the appropriate link (e.g., R-4.4.1-arm64.pkg or R-4.4.1-x86_64.pkg)
    The file will download to your Downloads folder

  5. Run the installer

    • Double-click the downloaded .pkg file
    • Click Continue through the introduction screens
    • Click Agree to the license
    • Click Install (you may need to enter your Mac password)
    • Click Close when installation completes
  6. Install XQuartz (may be required)

    • Some R packages need XQuartz
    • Download it from https://www.xquartz.org/
    • Install it the same way (double-click the .pkg file)
    • Restart your computer after installing XQuartz
  7. Verify installation

    • Look for R in your Applications folder
    • You don’t need to open R directly; you’ll use RStudio instead

Linux

  1. Visit the R website
    Go to https://cloud.r-project.org/

  2. Click “Download R for Linux”
    Choose your Linux distribution (Ubuntu, Debian, Fedora, etc.)

  3. Follow your distribution’s instructions

    For Ubuntu/Debian:

    # Update package list
    sudo apt update
    
    # Install R
    sudo apt install r-base r-base-dev

    For Fedora:

    sudo dnf install R
  4. Verify installation
    Open a terminal and type R --version
    You should see the R version information

Installing RStudio

Now that R is installed, you can install RStudio:

Windows

  1. Visit the RStudio website
    Go to https://posit.co/download/rstudio-desktop/

  2. Scroll down to “All Installers”
    Look for the table showing different versions

  3. Click the Windows installer
    Find the row that says “Windows 10/11” and click the download link
    This downloads a .exe file (e.g., RStudio-2024.xx.x-xxx.exe)

  4. Run the installer

    • Double-click the downloaded .exe file
    • If Windows asks permission, click Yes
    • Click Next through all screens, accepting defaults
    • Click Install
    • Click Finish when complete
  5. Launch RStudio

    • Find the RStudio icon on your desktop or Start menu
    • Double-click to open RStudio
    • You should see a window with multiple panels. That’s RStudio!

macOS

  1. Visit the RStudio website
    Go to https://posit.co/download/rstudio-desktop/

  2. Scroll down to “All Installers”
    Look for the table showing different versions

  3. Click the macOS installer
    Find the row that says “macOS 11+” and click the download link
    This downloads a .dmg file

  4. Install RStudio

    • Double-click the downloaded .dmg file
    • A window opens showing the RStudio icon and Applications folder
    • Drag the RStudio icon into the Applications folder
    • Wait for the copy to complete
    • Eject the RStudio disk image (if it appears on your desktop)
  5. Launch RStudio

    • Go to Applications folder
    • Double-click RStudio
    • If macOS says “RStudio cannot be opened because it is from an unidentified developer”:
      • Open System Settings → Privacy & Security (on older macOS versions: System Preferences → Security & Privacy)
      • Click “Open Anyway” next to the RStudio message
      • Click Open in the confirmation dialog
    • You should see RStudio open with multiple panels

Linux

  1. Visit the RStudio website
    Go to https://posit.co/download/rstudio-desktop/

  2. Download the installer for your distribution

    For Ubuntu/Debian:

    • Download the .deb file
    • Install with: sudo dpkg -i rstudio-2024.xx.x-amd64.deb
    • If you get dependency errors, run: sudo apt-get install -f

    For Fedora/RedHat:

    • Download the .rpm file
    • Install with: sudo dnf install rstudio-2024.xx.x-x86_64.rpm
  3. Launch RStudio
    Type rstudio in the terminal or find it in your applications menu

Verify Everything Works

  1. Open RStudio (not R—you’ll always use RStudio)

  2. You should see up to 4 panels:

    • Console (left or bottom-left)
    • Environment/History (top-right)
    • Files/Plots/Help (bottom-right)
    • Script editor (top-left, may not appear until you open a file)
  3. Test R by typing in the Console:

    2 + 2

    Press Enter. You should see [1] 4

  4. If you see the result, congratulations! R and RStudio are working correctly.
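Beyond `2 + 2`, a few base R commands can confirm which R installation RStudio is using. This is an optional sketch to type into the Console; the 4.0.0 threshold is only an illustrative assumption, not a stated course requirement.

```r
# Show the full version string of the R that RStudio is running
print(R.version.string)

# Show the folder where that R installation lives
print(R.home())

# Compare against a minimum version (4.0.0 is only an illustrative threshold)
is_recent <- getRversion() >= "4.0.0"
print(is_recent)
```

If RStudio reports an older version than the one you just installed, more than one copy of R may be present on the computer.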

Installation Successful?

If you saw [1] 4 appear in your console, you’re ready to start learning!

Having problems? Common issues:

  • RStudio won’t open: Make sure you installed R first
  • Console says “R not found”: Reinstall R, then restart RStudio
  • Mac security blocks RStudio: Go to System Settings → Privacy & Security (on older macOS versions: System Preferences → Security & Privacy) and click “Open Anyway”


Step 2: Download Course Materials

Visit the Teaching Materials page to download:

  • Exercise templates (.R files)
  • Solution scripts (.R files)
  • Sample tax datasets
  • Quick reference guides
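Once the materials are downloaded, it can help to keep them in one folder with a predictable layout. The sketch below shows one possible arrangement, not a course requirement. The folder names (`r-course`, `exercises`, `solutions`, `data`) and the file `module1_exercise.R` are hypothetical, and everything is created inside a temporary directory so nothing on your computer is changed permanently.

```r
# Hypothetical course folder, created in a temporary location
course_dir <- file.path(tempdir(), "r-course")
subfolders <- c("exercises", "solutions", "data")

# Create one subfolder per type of material
for (folder in subfolders) {
  dir.create(file.path(course_dir, folder), recursive = TRUE, showWarnings = FALSE)
}

# Stand-in for a downloaded exercise template
writeLines(
  "# Module 1 exercise",
  file.path(course_dir, "exercises", "module1_exercise.R")
)

# List the R scripts found anywhere under the course folder
r_scripts <- list.files(course_dir, pattern = "\\.R$", recursive = TRUE)
print(r_scripts)
```

For real use, replace `tempdir()` with a permanent location such as your Documents folder.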


Step 3: Start Learning

Click “1. Introduction to R” in the left sidebar to begin your first module.

Work through modules sequentially—each builds on concepts from previous lessons.


Ready to Begin?

🚀 Start Your R Journey

You have R and RStudio installed. You know where to find course materials.

Now click “1. Introduction to R” in the sidebar menu to begin Module 1.

Remember: every expert was once a beginner. Take your time, practice consistently, and don’t hesitate to experiment!


Developed by The World Bank DaTax Team | View on GitHub
