R Getting Started — Installation, Setup, and First Steps
Learning Objectives
By the end of this tutorial, you will be able to:
- Install R and RStudio on Windows, macOS, or Linux
- Navigate the RStudio IDE and understand its four panes
- Execute R commands in the console and script editor
- Understand R's basic syntax, help system, and working directory
- Write and run your first R scripts
What Is R?
R is a programming language and environment designed for statistical computing, data analysis, and graphics. Created in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, R has become one of the most widely used languages in data science, bioinformatics, and academic research.
| Feature | Description |
|---|---|
| Open Source | Free to use, with a massive community |
| Statistical Focus | Built-in functions for almost every statistical test |
| Data Visualization | ggplot2, base graphics, and lattice |
| Package Ecosystem | 20,000+ packages on CRAN |
| Reproducibility | R Markdown and Quarto for literate programming |
| Cross-Platform | Runs on Windows, macOS, Linux |
R vs Python for Data Science
| Aspect | R | Python |
|---|---|---|
| Primary Use | Statistics, research | General-purpose, ML engineering |
| Data Frames | Native, first-class | Via pandas |
| Visualization | ggplot2 (grammar of graphics) | matplotlib, seaborn |
| Learning Curve | Steeper for non-statisticians | Gentler for programmers |
| Package Management | CRAN, install.packages() | PyPI, pip |
| Community | Academics, statisticians | Industry, developers |
Installing R
Step 1: Download R
- Go to https://cran.r-project.org/
- Click "Download R for [your OS]"
- Choose the latest stable version (e.g., R 4.4.x)
Step 2: Install R
Windows:
- Run the downloaded .exe file
- Accept default settings
- Choose 64-bit version if prompted
macOS:
- Open the .pkg file
- Follow the installer prompts
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install r-base
Step 3: Verify Installation
Open a terminal or command prompt:
R --version
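The same check can be made from inside an R session. The sketch below uses base R's R.version.string and getRversion(); the 4.0.0 threshold is only an illustrative assumption, not a requirement of this tutorial.

```r
# Version string of the running R session, e.g. "R version 4.4.1 (...)"
version_text <- R.version.string
print(version_text)

# getRversion() returns a version object that can be compared to a string
# (the "4.0.0" threshold is an arbitrary example)
is_recent <- getRversion() >= "4.0.0"
print(is_recent)
```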
Installing RStudio
RStudio is the most popular IDE (Integrated Development Environment) for R.
- Go to https://posit.co/download/rstudio-desktop/
- Download the free version for your OS
- Install and launch RStudio
RStudio Interface
RStudio has four panes:
| Pane | Purpose |
|---|---|
| Source Editor | Write and edit R scripts |
| Console | Execute R commands interactively |
| Environment/History | View variables, data, and command history |
| Files/Plots/Packages/Help | Browse files, view plots, manage packages |
Keyboard Shortcuts:
- Ctrl + Enter: Run current line from editor to console
- Ctrl + Shift + C: Comment/uncomment lines
- Ctrl + Shift + M: Insert pipe operator (|> or %>%)
- Tab: Auto-complete
The R Console
The R console is where you execute commands interactively. Type a command and press Enter.
# Basic arithmetic
2 + 3
# [1] 5
100 / 3
# [1] 33.33333
# Assigning values
x <- 42
x
# [1] 42
# You can also use = for assignment (but <- is conventional)
y = 100
y
# [1] 100
# Printing
print("Hello, R!")
# [1] "Hello, R!"
cat("Hello,", "R!", "\n")
# Hello, R!
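Beyond + and /, the console accepts a few more arithmetic operators, including the remainder operator used later in the exercises. A short sketch using only base R operators:

```r
# Exponentiation
2^10
# [1] 1024

# Remainder (modulo) and integer division
17 %% 5
# [1] 2
17 %/% 5
# [1] 3

# Arithmetic is vectorized: the operation applies to every element
c(1, 2, 3) * 2
# [1] 2 4 6
```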
R Syntax Basics
Assignment Operators
R has several assignment operators. The three shown here are the ones beginners meet most often, and <- is the convention:
# Preferred: arrow operator
x <- 10
# Also valid (but less common in R)
x = 10
# Assignment to the right (rare, used in some packages)
10 -> x
# Multiple assignment
a <- b <- c <- 0
a; b; c
# [1] 0
# [1] 0
# [1] 0
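One practical reason <- is preferred: inside a function call, = names an argument instead of creating a variable. The sketch below illustrates the difference in a fresh session; score_demo and kept_demo are hypothetical names chosen so they do not clash with anything else in this tutorial.

```r
# Inside a call, `=` names an argument (here, a column); no variable is created
df_demo <- data.frame(score_demo = c(1, 2, 3))
exists("score_demo")
# [1] FALSE

# `<-` inside a call also creates a variable as a side effect
total <- sum(kept_demo <- c(1, 2, 3))
exists("kept_demo")
# [1] TRUE
total
# [1] 6
```

Assigning inside a call works, but it is easy to miss when reading code, so most style guides suggest assigning on a separate line first.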
Comments
# This is a comment
# R ignores comments
x <- 5 # Inline comment
# Multi-line
# Line 1
# Line 2
Semicolons
R does not require semicolons, but they work:
x <- 5; y <- 10; x + y
# [1] 15
Working Directory
Your working directory is where R reads and saves files by default.
# Check current working directory
getwd()
# [1] "C:/Users/YourName/Documents"
# Set working directory
setwd("C:/my_project/data")
# Use forward slashes or double backslashes
setwd("C:\\my_project\\data")
# List files in current directory
list.files()
Tip in RStudio: Session → Set Working Directory → Choose Directory
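Hard-coded paths such as C:/my_project/data only work on one machine. A minimal sketch of a more portable pattern, assuming a hypothetical file data/scores.csv: build paths with file.path() and restore the previous working directory when done. setwd() invisibly returns the directory it replaced.

```r
# Switch to a temporary directory, remembering where we were
old_wd <- setwd(tempdir())

# file.path() joins components with "/" on every platform
# ("data" and "scores.csv" are hypothetical names)
csv_path <- file.path("data", "scores.csv")
print(csv_path)
# [1] "data/scores.csv"

# Check that a file exists before trying to read it
file.exists(csv_path)

# Go back to the original working directory
setwd(old_wd)
```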
The R Help System
R has extensive built-in help. This is one of R's greatest strengths.
# Get help on a function
?mean
help(mean)
# Get help on a package
help(package = "ggplot2")
# Search help pages (multi-word search terms must be quoted)
??"linear regression"
help.search("regression")
# Examples from help
example("mean")
# Vignettes (long-form tutorials); the package must be installed first
vignette("dplyr")
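When you only remember part of a function name, two more base R helpers are useful alongside ? and ??. A short sketch: apropos() lists objects whose names match a pattern, and args() shows a function's arguments without opening the full help page.

```r
# Find objects whose names start with "mean"
matches <- apropos("^mean")
print(matches)

# Show the argument list of sd()
args(sd)
# function (x, na.rm = FALSE)
# NULL
```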
Reading Help Pages
Help pages follow a standard structure:
| Section | Description |
|---|---|
| Description | What the function does |
| Usage | Function signature |
| Arguments | Parameter descriptions |
| Value | What it returns |
| Examples | Working code examples |
# Help for the mean function
?mean
# Arguments: x, trim, na.rm
# Usage: mean(x, trim = 0, na.rm = FALSE, ...)
# Try the examples
mean(c(1, 2, 3, 4, 5))
# [1] 3
mean(c(1, 2, NA, 4, 5), na.rm = TRUE)
# [1] 3
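The Arguments section pays off once you try the options it lists. A small sketch of the two optional arguments of mean(), na.rm and trim, using made-up numbers:

```r
# Without na.rm = TRUE, a single NA makes the result NA
mean(c(1, 2, NA, 4, 5))
# [1] NA

# trim = 0.2 drops the lowest and highest 20% of values before averaging,
# which reduces the influence of the outlier 100
values <- c(1, 2, 3, 4, 100)
mean(values)
# [1] 22
mean(values, trim = 0.2)
# [1] 3
```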
Data Types Overview
R has six basic data types:
| Type | Example | Description |
|---|---|---|
| numeric | 3.14 | Decimal numbers |
| integer | 42L | Whole numbers (L suffix) |
| character | "hello" | Text strings |
| logical | TRUE / FALSE | Boolean values |
| complex | 3 + 4i | Complex numbers |
| raw | charToRaw("x") | Raw bytes |
# Numeric (double by default)
x <- 3.14
class(x)
# [1] "numeric"
# Integer (use L suffix)
y <- 42L
class(y)
# [1] "integer"
# Character
name <- "Alice"
class(name)
# [1] "character"
# Logical
is_active <- TRUE
class(is_active)
# [1] "logical"
# Complex
z <- 3 + 4i
class(z)
# [1] "complex"
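class() reports how R treats a value, while typeof() reports how it is stored; conversion between types uses the as.*() family. A brief sketch with base R functions only:

```r
# "numeric" values are stored as doubles
typeof(3.14)
# [1] "double"
typeof(42L)
# [1] "integer"

# Raw bytes, the sixth type in the table
class(charToRaw("x"))
# [1] "raw"

# Convert between types with as.*() functions
as.integer("7")
# [1] 7
as.character(3.14)
# [1] "3.14"
as.logical("TRUE")
# [1] TRUE

# Test a type with is.*() functions
is.numeric(42L)
# [1] TRUE
```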
Data Structures Overview
R has five main data structures:
| Structure | Description | Homogeneous? |
|---|---|---|
| Vector | Ordered collection of same type | Yes |
| Matrix | 2D grid of same type | Yes |
| Array | Multi-dimensional grid of same type | Yes |
| List | Ordered collection of any type | No |
| Data Frame | Table with columns of different types | No |
# Vector
v <- c(1, 2, 3, 4, 5)
# Matrix
m <- matrix(1:9, nrow = 3, ncol = 3)
# List
l <- list(1, "hello", TRUE, 3.14)
# Data Frame
df <- data.frame(
  name = c("Alice", "Bob"),
  age = c(25, 30)
)
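Each structure has its own way of being inspected and indexed, and the table also lists arrays. A minimal sketch that rebuilds the same four objects so it runs on its own, then adds an array and a few indexing examples:

```r
v <- c(1, 2, 3, 4, 5)
m <- matrix(1:9, nrow = 3, ncol = 3)
l <- list(1, "hello", TRUE, 3.14)
df <- data.frame(name = c("Alice", "Bob"), age = c(25, 30))

# Array: a 2 x 3 x 4 block of the numbers 1 to 24
a <- array(1:24, dim = c(2, 3, 4))
dim(a)
# [1] 2 3 4

# Indexing starts at 1
v[2]
# [1] 2
m[2, 3]    # row 2, column 3 (matrices fill column by column)
# [1] 8
l[[2]]     # double brackets extract a list element
# [1] "hello"
df$age     # $ extracts a data frame column
# [1] 25 30

# Mixing types in a vector coerces everything to one type
class(c(1, "a", TRUE))
# [1] "character"
```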
Your First R Script
Create a new file in RStudio: File → New File → R Script
# ============================================
# My First R Script
# ============================================
# 1. Basic calculations
result <- 2 + 3
cat("2 + 3 =", result, "\n")
# 2. Working with vectors
scores <- c(85, 92, 78, 95, 88)
cat("Scores:", scores, "\n")
cat("Mean:", mean(scores), "\n")
cat("Max:", max(scores), "\n")
cat("Min:", min(scores), "\n")
# 3. A simple function
bmi <- function(weight_kg, height_m) {
  weight_kg / (height_m ^ 2)
}
my_bmi <- bmi(70, 1.75)
cat("BMI:", round(my_bmi, 1), "\n")
# 4. Conditional
if (my_bmi < 18.5) {
  cat("Category: Underweight\n")
} else if (my_bmi < 25) {
  cat("Category: Normal weight\n")
} else if (my_bmi < 30) {
  cat("Category: Overweight\n")
} else {
  cat("Category: Obese\n")
}
# 5. Simple plot
x <- 1:100
y <- sin(x / 10)
plot(x, y, type = "l", col = "blue",
     main = "Sine Wave",
     xlab = "X", ylab = "sin(x/10)")
Save as my_first_script.R and run with Ctrl + Shift + Enter.
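A saved script can also be run from the console with source(), or from a terminal with Rscript. The sketch below writes a tiny two-line script to a temporary folder so it is runnable anywhere; the file contents and the avg_score variable are illustrative, not the full script shown in this section.

```r
# Write a small script to a temporary location
script_path <- file.path(tempdir(), "my_first_script.R")
writeLines(
  c("scores <- c(85, 92, 78, 95, 88)",
    "avg_score <- mean(scores)"),
  script_path
)

# Run every line of the file; variables it creates become available afterwards
source(script_path)
avg_score
# [1] 87.6

# From a terminal (outside R), the equivalent is:
#   Rscript my_first_script.R
```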
R Packages
Packages extend R's functionality. The Comprehensive R Archive Network (CRAN) hosts 20,000+ packages.
# Install a package (only once per R version)
install.packages("ggplot2")
# Load a package (every session)
library(ggplot2)
# Or use the namespace prefix
ggplot2::ggplot()
# See installed packages
installed.packages()
# Update all packages
update.packages()
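Because install.packages() downloads the package each time it is called, scripts often check first whether a package is already present. A hedged sketch: install_if_missing() is a hypothetical helper, not part of base R, built on the real requireNamespace() function.

```r
# Hypothetical helper: install a package only when it is not yet available
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can be loaded afterwards
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")
# [1] TRUE
```

Calling it with a package name such as "ggplot2" would trigger a download only on machines where the package is missing.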
Essential Packages for Beginners
| Package | Purpose |
|---|---|
| ggplot2 | Data visualization |
| dplyr | Data manipulation |
| tidyr | Data tidying |
| readr | Fast data import |
| stringr | String manipulation |
| lubridate | Date/time handling |
| tidyverse | Meta-package (loads all above) |
# Install the tidyverse (recommended for beginners)
install.packages("tidyverse")
# Load it
library(tidyverse)
R Console Tips and Tricks
Command History
# Use Up/Down arrows to cycle through previous commands
# Or use history()
history()
# Search history with Ctrl+R (reverse search)
Clearing the Console
# In RStudio: Ctrl+L
# From code (works in the RStudio console):
cat("\014")
Exiting R
# Quit R; you will be asked "Save workspace image?"
q()
# Or in RStudio: Ctrl+Q
# Answering yes saves your variables to an .RData file
# (not recommended for beginners)
# Quit without saving and without the prompt
q(save = "no")
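Instead of relying on the .RData workspace, a common alternative is to save specific objects explicitly. A minimal sketch with base R's saveRDS() and readRDS(), using a temporary file path as a stand-in for a real project path:

```r
scores <- c(85, 92, 78, 95, 88)

# Save one object to a file (tempdir() is a placeholder location)
rds_path <- file.path(tempdir(), "scores.rds")
saveRDS(scores, rds_path)

# In a later session, read it back under any name you like
restored <- readRDS(rds_path)
identical(scores, restored)
# [1] TRUE
```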
Common Beginner Mistakes
1. Case Sensitivity
# R is case-sensitive
x <- 10
X <- 20
cat(x, X) # 10 20 — different variables
2. Using Undefined Variables
# This will error
# y + 1
# Error: object 'y' not found
y <- 5
y + 1
# [1] 6
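To check for a variable before using it, or to keep a script running after an error, base R offers exists() and tryCatch(). A short sketch; undefined_demo is a deliberately undefined, hypothetical name, and the exact wording of the caught message can vary with your language settings.

```r
# exists() takes the variable name as a string
exists("undefined_demo")
# [1] FALSE

# tryCatch() captures the error instead of stopping the script
result <- tryCatch(
  undefined_demo + 1,
  error = function(e) {
    message("Caught: ", conditionMessage(e))
    NA
  }
)
result
# [1] NA
```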
3. Forgetting Quotes for Strings
# Wrong
# name <- Alice # Error: object 'Alice' not found
# Correct
name <- "Alice"
4. Mixing Assignment Styles
# Convention: use <-
x <- 5
# = works but is discouraged
x = 5
# -> works but is rarely used
5 -> x
Practice Exercises
Exercise 1: Calculator
Write R code to calculate:
- The area of a circle with radius 5 (area = pi * r^2)
- The average of 10, 20, 30, 40, 50
- The remainder when 17 is divided by 5
Solution
# Area of circle
radius <- 5
area <- pi * radius^2
cat("Area:", area, "\n")
# Average
numbers <- c(10, 20, 30, 40, 50)
avg <- mean(numbers)
cat("Average:", avg, "\n")
# Remainder
remainder <- 17 %% 5
cat("Remainder:", remainder, "\n")
Exercise 2: BMI Calculator
Create a function that takes weight (kg) and height (cm), converts height to meters, and returns the BMI. Test it with your own values.
Solution
calculate_bmi <- function(weight_kg, height_cm) {
  height_m <- height_cm / 100
  bmi <- weight_kg / (height_m ^ 2)
  return(round(bmi, 1))
}
my_bmi <- calculate_bmi(70, 175)
cat("My BMI:", my_bmi, "\n")
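Like most R functions built from arithmetic, this solution works on whole vectors without a loop. A sketch that repeats the function so it runs on its own, with made-up weights and heights:

```r
calculate_bmi <- function(weight_kg, height_cm) {
  height_m <- height_cm / 100
  round(weight_kg / (height_m ^ 2), 1)
}

# One call computes the BMI for every person at once
weights <- c(60, 70)
heights <- c(160, 175)
calculate_bmi(weights, heights)
# [1] 23.4 22.9
```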
Key Takeaways
- R is a language for statistics: purpose-built for data analysis and visualization
- Install R first, then RStudio: R is the engine, RStudio is the dashboard
- Use <- for assignment: it's the R convention
- R is case-sensitive: x and X are different variables
- The ? operator is your friend: ?mean gives instant help
- Start with tidyverse: it provides a consistent, modern workflow
- Save scripts, not just console output: reproducibility matters
Next: Learn about R Variables and Data Types — the building blocks of every R program.