R Variables and Data Types
Learning Objectives
By the end of this tutorial, you will be able to:
- Create and assign variables using <-, =, and ->
- Identify and use all major R data types including numeric, character, logical, complex, and raw
- Perform type checking with class(), typeof(), is.*(), and as.*() functions
- Convert between types safely using coercion rules
- Understand variable scoping and naming conventions
What Are Variables?
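In R, a variable is a name bound to a value stored in memory. Assignment does not copy the value immediately: R uses copy-on-modify semantics, so two names can share one object until one of them is modified, at which point R makes a copy. In practice, unlike Python lists or Java objects, changing one variable never changes another.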
# Assignment
x <- 42
y <- "Hello"
z <- TRUE
# R uses copy-on-modify semantics
a <- c(1, 2, 3)
b <- a
b[1] <- 99
cat("a:", a, "\n") # a: 1 2 3 (unchanged)
cat("b:", b, "\n") # b: 99 2 3
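The same copy-on-modify behavior applies to function arguments. The sketch below uses a hypothetical helper, set_first(), to show that a function modifying its argument leaves the caller's vector unchanged.

```r
# Hypothetical helper: changes the first element of its argument
set_first <- function(v, value) {
  v[1] <- value  # the modification happens on a copy local to the function
  v
}

original <- c(1, 2, 3)
changed <- set_first(original, 99)

print(original)  # the caller's vector keeps its values
print(changed)   # only the returned vector was modified
```

To keep the change, assign the function's return value to a name, as done with changed here.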
Assignment Operators
R has several assignment operators. The three shown here are the most common, and <- is the standard convention. (The <<- operator is covered under Variable Scope.)
# Preferred: left arrow
x <- 10
# Also valid (less common in R)
x = 10
# Right arrow (rare, used in some pipelines)
10 -> x
# Multiple assignment
a <- b <- c <- 0
Why <- Over =?
# = can be ambiguous in function calls
mean(x = c(1, 2, 3)) # x is a named argument
mean(x <- c(1, 2, 3)) # x is assigned AND passed
# <- is always assignment
x <- 5 # Clearly assigns 5 to x
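The difference between the two calls to mean() can be checked directly. This is a minimal sketch: with_equals() and with_arrow() are hypothetical names, and each function reports whether a variable x exists in its own environment after the call.

```r
# Hypothetical demo functions; each checks for x in its own local environment
with_equals <- function() {
  mean(x = c(1, 2, 3))   # x is only the name of mean()'s argument
  exists("x", envir = environment(), inherits = FALSE)
}

with_arrow <- function() {
  mean(x <- c(1, 2, 3))  # assigns x locally, then passes the value to mean()
  exists("x", envir = environment(), inherits = FALSE)
}

equals_created_x <- with_equals()
arrow_created_x <- with_arrow()

print(equals_created_x)  # FALSE: no variable was created
print(arrow_created_x)   # TRUE: x now exists in the calling function
```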
R Data Types
R has six basic data types:
1. Numeric (Double)
The default numeric type in R is double-precision floating-point (64-bit).
x <- 3.14
class(x) # [1] "numeric"
typeof(x) # [1] "double"
is.numeric(x) # [1] TRUE
# Scientific notation
speed <- 3e8
speed # [1] 3e+08
# Special values
Inf + 1 # [1] Inf (infinity)
-Inf - 1 # [1] -Inf
0/0 # [1] NaN (Not a Number)
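Because doubles are binary floating-point values, some decimal fractions cannot be stored exactly. A short sketch of the usual consequence, and of comparing with a tolerance via all.equal():

```r
a <- 0.1 + 0.2

exact_match <- a == 0.3                   # exact comparison of two doubles
close_match <- isTRUE(all.equal(a, 0.3))  # comparison within a small tolerance

print(exact_match)  # FALSE, because of binary rounding error
print(close_match)  # TRUE
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a description of the difference, not FALSE, when the values differ.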
2. Integer
Whole numbers, created with the L suffix.
x <- 42L
class(x) # [1] "integer"
typeof(x) # [1] "integer"
is.integer(x) # [1] TRUE
# Without L, it's numeric (double)
y <- 42
typeof(y) # [1] "double"
# is.integer() checks the storage type, not whether the value is a whole number
is.integer(42) # [1] FALSE (it's double)
is.integer(42L) # [1] TRUE
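To test whether a value is a whole number regardless of storage type, one option is to compare it with its rounded value. The helper name is_whole() and its tolerance are assumptions made for this sketch, not base R functions.

```r
# Hypothetical helper: TRUE where x is (numerically) a whole number
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  stopifnot(is.numeric(x))
  abs(x - round(x)) < tol
}

whole_double <- is_whole(42)          # TRUE, although typeof(42) is "double"
whole_integer <- is_whole(42L)        # TRUE
mixed <- is_whole(c(1, 2.5, 3))       # TRUE FALSE TRUE
print(mixed)
```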
3. Character (String)
Text data, enclosed in quotes.
name <- "Alice"
class(name) # [1] "character"
typeof(name) # [1] "character"
is.character(name) # [1] TRUE
# Single or double quotes both work
a <- "hello"
b <- 'hello'
# Escape characters
cat("Line 1\nLine 2\n")
# Line 1
# Line 2
cat("Path: C:\\Users\\Documents\n")
# Path: C:\Users\Documents
# Raw strings (R 4.0+)
path <- r"(C:\Users\Documents)"
cat(path)
# C:\Users\Documents
4. Logical (Boolean)
TRUE or FALSE values.
is_active <- TRUE
class(is_active) # [1] "logical"
typeof(is_active) # [1] "logical"
is.logical(is_active) # [1] TRUE
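# T and F are built-in shorthand for TRUE and FALSE, but they are
# ordinary variables that can be reassigned, so prefer TRUE and FALSE
T # [1] TRUE
F # [1] FALSE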
# Comparison operators return logical
5 > 3 # [1] TRUE
5 < 3 # [1] FALSE
5 == 5 # [1] TRUE
5 != 3 # [1] TRUE
# Logical operations
TRUE & FALSE # [1] FALSE (AND)
TRUE | FALSE # [1] TRUE (OR)
!TRUE # [1] FALSE (NOT)
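Logical vectors combine element-wise with & and |, and they count as 1 and 0 in arithmetic. The following sketch uses made-up vectors (passed, on_time) to illustrate this, along with && for single conditions.

```r
# Illustrative data, not from the tutorial
passed  <- c(TRUE, FALSE, TRUE, TRUE)
on_time <- c(TRUE, TRUE, FALSE, TRUE)

both <- passed & on_time   # element-wise AND
n_passed <- sum(passed)    # TRUE counts as 1
share_passed <- mean(passed)

# && works on single values and is the usual choice inside if()
usable <- length(passed) > 0 && !anyNA(passed)

print(both)          # TRUE FALSE FALSE TRUE
print(n_passed)      # 3
print(share_passed)  # 0.75
```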
5. Complex
Numbers with real and imaginary parts.
z <- 3 + 4i
class(z) # [1] "complex"
typeof(z) # [1] "complex"
# Operations
Re(z) # [1] 3 (real part)
Im(z) # [1] 4 (imaginary part)
Mod(z) # [1] 5 (magnitude)
Arg(z) # [1] 0.9272952 (angle in radians)
Conj(z) # [1] 3-4i (conjugate)
6. Raw
Raw bytes, rarely used directly.
x <- charToRaw("Hello")
x
# [1] 48 65 6c 6c 6f
class(x) # [1] "raw"
typeof(x) # [1] "raw"
# Convert back
rawToChar(x)
# [1] "Hello"
Type Checking Functions
R provides several ways to check types: class(), typeof(), and the is.*() family of functions.
class() — User-Friendly Label
class(42) # [1] "numeric"
class(42L) # [1] "integer"
class("hello") # [1] "character"
class(TRUE) # [1] "logical"
class(3 + 4i) # [1] "complex"
class(NULL) # [1] "NULL"
class(NA) # [1] "logical"
class(list()) # [1] "list"
class(data.frame()) # [1] "data.frame"
typeof() — Internal Storage Type
typeof(42) # [1] "double"
typeof(42L) # [1] "integer"
typeof("hello") # [1] "character"
typeof(TRUE) # [1] "logical"
is.*() Family — Boolean Checks
is.numeric(42) # [1] TRUE
is.integer(42) # [1] FALSE
is.character("x") # [1] TRUE
is.logical(TRUE) # [1] TRUE
is.null(NULL) # [1] TRUE
is.na(NA) # [1] TRUE
is.nan(NaN) # [1] TRUE
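For the basic types, class() and typeof() mostly agree. They diverge for objects built on top of a basic type, as this short sketch with a factor, a date, and a data frame shows.

```r
f  <- factor(c("low", "high", "low"))
d  <- as.Date("2024-01-15")
df <- data.frame(id = 1:2)

# class() reports the label; typeof() reports how the object is stored
cat(class(f),  typeof(f),  "\n")  # factor integer
cat(class(d),  typeof(d),  "\n")  # Date double
cat(class(df), typeof(df), "\n")  # data.frame list

# inherits() tests class membership directly
is_df <- inherits(df, "data.frame")
```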
NA vs NULL vs NaN
| Value | Meaning | is.na() | is.null() | is.nan() |
|---|---|---|---|---|
| NA | Missing value | TRUE | FALSE | FALSE |
| NULL | Empty/nothing | logical(0) | TRUE | logical(0) |
| NaN | Not a Number | TRUE | FALSE | TRUE |
# NA — missing data
x <- NA
is.na(x) # [1] TRUE
# NULL — empty object
y <- NULL
is.null(y) # [1] TRUE
# NaN — undefined math
z <- 0/0
is.nan(z) # [1] TRUE
is.na(z) # [1] TRUE (NaN is also NA!)
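A practical way to see the difference between NA and NULL is to place each inside a vector: NA occupies a slot, while NULL disappears. A small sketch:

```r
with_na   <- c(1, NA, 3)
with_null <- c(1, NULL, 3)

print(length(with_na))    # 3: NA is a placeholder for a missing element
print(length(with_null))  # 2: NULL contributes nothing
print(length(NULL))       # 0

# Comparing with NA yields NA, so use is.na() instead of ==
comparison <- NA == 1
print(comparison)         # NA
```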
Type Conversion (Coercion)
R automatically converts types in some operations (implicit coercion). You can also convert explicitly (explicit coercion).
Implicit Coercion Hierarchy
When mixing types in a vector, R promotes to the most flexible type:
logical → integer → numeric → complex → character
# Logical + Numeric → Numeric
c(TRUE, 1, 2)
# [1] 1 1 2
# Numeric + Character → Character
c(1, 2, "three")
# [1] "1" "2" "three"
# Logical + Character → Character
c(TRUE, "hello")
# [1] "TRUE" "hello"
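Each step of the hierarchy can be confirmed with typeof() on a mixed vector. This sketch walks up the chain one pair at a time.

```r
step1 <- typeof(c(TRUE, 2L))   # logical + integer
step2 <- typeof(c(2L, 3.5))    # integer + double
step3 <- typeof(c(3.5, 1i))    # double + complex
step4 <- typeof(c(1i, "a"))    # complex + character

print(c(step1, step2, step3, step4))
# "integer" "double" "complex" "character"
```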
Explicit Conversion
| Function | Converts To | Example |
|---|---|---|
| as.numeric() | Numeric | as.numeric("3.14") → 3.14 |
| as.integer() | Integer | as.integer("42") → 42L |
| as.character() | Character | as.character(42) → "42" |
| as.logical() | Logical | as.logical(1) → TRUE |
| as.complex() | Complex | as.complex(3) → 3+0i |
# String to number
as.numeric("3.14") # [1] 3.14
as.integer("42") # [1] 42
# Number to string
as.character(42) # [1] "42"
as.character(3.14) # [1] "3.14"
# Number to logical
as.logical(0) # [1] FALSE
as.logical(1) # [1] TRUE
as.logical(-1) # [1] TRUE (any non-zero = TRUE)
# Logical to number
as.numeric(TRUE) # [1] 1
as.numeric(FALSE) # [1] 0
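When converting user input, it can help to detect which values failed instead of relying on the warning. The helper below, to_number(), is a hypothetical name used for this sketch. It reports entries that became NA only because of the conversion.

```r
# Hypothetical helper: convert to numeric and report values that failed
to_number <- function(x) {
  result <- suppressWarnings(as.numeric(x))
  failed <- is.na(result) & !is.na(x)  # NA after conversion, but not before
  if (any(failed)) {
    message("Could not convert: ", paste(x[failed], collapse = ", "))
  }
  result
}

values <- to_number(c("3.14", "42", "abc", NA))
print(values)  # 3.14 42.00 NA NA

# as.integer() truncates toward zero; it does not round
truncated <- as.integer(3.9)
print(truncated)  # 3
```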
Common Gotchas
# NA propagation
c(1, 2, NA, 4) # [1] 1 2 NA 4
mean(c(1, 2, NA, 4)) # [1] NA
mean(c(1, 2, NA, 4), na.rm = TRUE) # [1] 2.333333
# Character conversion keeps only 15 significant digits
as.character(pi) # [1] "3.14159265358979"
# Non-numeric strings become NA
as.numeric("hello") # [1] NA (with warning)
# Integer overflow gives NA with a warning, not wraparound
x <- .Machine$integer.max
x + 1L # [1] NA (with warning: NAs produced by integer overflow)
x + 1 # [1] 2147483648 (1 is a double, so the result is a double)
Vectors — R's Fundamental Data Structure
A vector is a sequence of values of the same type. R is vectorized — operations work element-wise.
# Create vectors with c()
numbers <- c(1, 2, 3, 4, 5)
characters <- c("a", "b", "c")
logicals <- c(TRUE, FALSE, TRUE)
# Sequences
1:10 # [1] 1 2 3 4 5 6 7 8 9 10
seq(1, 10, by = 2) # [1] 1 3 5 7 9
seq(1, 10, length.out = 5) # [1] 1.00 3.25 5.50 7.75 10.00
# Repetition
rep(1, 5) # [1] 1 1 1 1 1
rep(c(1, 2), each = 3) # [1] 1 1 1 2 2 2
rep(c(1, 2), times = 3) # [1] 1 2 1 2 1 2
# Vectorized operations
x <- c(1, 2, 3, 4, 5)
x + 10 # [1] 11 12 13 14 15
x * 2 # [1] 2 4 6 8 10
x^2 # [1] 1 4 9 16 25
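Vectorization also covers indexing and operations between vectors of different lengths. The sketch below uses an illustrative scores vector to show positional indexing, logical indexing, and recycling of the shorter vector.

```r
scores <- c(72, 85, 90, 64, 88)  # illustrative data

second   <- scores[2]            # single element by position
high     <- scores[scores > 80]  # logical indexing keeps matching elements
no_first <- scores[-1]           # negative index drops an element

# Recycling: the shorter vector is repeated to match the longer one
recycled <- c(1, 2, 3, 4) + c(10, 20)

print(high)      # 85 90 88
print(recycled)  # 11 22 13 24
```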
Variable Naming Rules
Valid Names
# Must start with a letter, or with a dot not followed by a digit
name <- "valid"
.name <- "valid"
name2 <- "valid"
my_var_name <- "valid"
# Invalid names
# _name <- "invalid" # Starts with underscore
# 2name <- "invalid" # Starts with number
# my-var <- "invalid" # Contains hyphen
# my var <- "invalid" # Contains space
# function <- "invalid" # Reserved word (syntax error)
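Name validity can be checked programmatically. One approach, sketched here, relies on base R's make.names(), which returns its input unchanged only when the input is already a syntactically valid name. The wrapper is_valid_name() is a hypothetical helper.

```r
# Hypothetical helper: TRUE where the string is already a valid R name
is_valid_name <- function(x) {
  make.names(x) == x
}

candidates <- c("name", ".name", "_name", "2name", "my-var", "function")
validity <- is_valid_name(candidates)
print(setNames(validity, candidates))

# Backticks allow non-standard names, though they are best avoided
`my var` <- 5
backtick_result <- `my var` + 1
```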
Naming Conventions
| Style | Example | Usage |
|---|---|---|
| snake_case | my_variable | Recommended for R |
| dot.case | my.variable | Common in base R |
| camelCase | myVariable | Less common in R |
| SCREAMING_SNAKE | MY_CONSTANT | Constants |
# R conventions
user_name <- "Alice" # snake_case (recommended)
total.count <- 100 # dot.case (base R style)
MAX_RETRIES <- 3 # SCREAMING_SNAKE for constants
Variable Scope
Variables have different scopes — where they are accessible.
# Global scope
global_var <- "I'm global"
my_function <- function() {
  # Local scope
  local_var <- "I'm local"
  cat(global_var, "\n") # Can access global
  cat(local_var, "\n") # Can access local
}
my_function()
# I'm global
# I'm local
# cat(local_var) # Error: object 'local_var' not found
# Assignment inside function creates local variable
x <- 10
modify_x <- function() {
  x <- 20 # This is a NEW local x
  cat("Inside:", x, "\n")
}
modify_x() # Inside: 20
cat("Outside:", x, "\n") # Outside: 10
# Use <<- to assign to an existing variable in an enclosing environment
x <- 10
modify_global <- function() {
  x <<- 20 # Finds x in the enclosing (here: global) environment and modifies it
}
modify_global()
cat("After:", x, "\n") # After: 20
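The <<- operator is often more useful inside closures than for changing globals. In this sketch, make_counter() is a hypothetical function whose returned function updates a count variable kept in its enclosing environment.

```r
# Hypothetical closure: each counter keeps its own private count
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # updates count in the enclosing function's environment
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()
counter_a()
a_value <- counter_a()  # third call on counter_a
b_value <- counter_b()  # first call on counter_b

print(a_value)  # 3
print(b_value)  # 1
```

Each call to make_counter() creates a fresh environment, so the two counters do not interfere with each other.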
Special Values
Missing Values
# NA — Not Available (missing)
x <- c(1, 2, NA, 4, 5)
is.na(x) # [1] FALSE FALSE TRUE FALSE FALSE
na.omit(x) # [1] 1 2 4 5
# attr(,"na.action")
# [1] 3
# attr(,"class")
# [1] "omit"
# Different NA types
NA_integer_ # NA with integer type
NA_real_ # NA with numeric type
NA_complex_ # NA with complex type
NA_character_ # NA with character type
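The typed NA constants differ only in storage type, which typeof() reveals. The sketch also shows two common ways to handle missing values: skipping them with na.rm and replacing them by logical indexing. Replacing with 0 is an arbitrary choice for illustration.

```r
na_types <- c(typeof(NA), typeof(NA_integer_),
              typeof(NA_real_), typeof(NA_character_))
print(na_types)  # "logical" "integer" "double" "character"

x <- c(1, 2, NA, 4, 5)

mean_without_na <- mean(x, na.rm = TRUE)  # ignore missing values

x_filled <- x
x_filled[is.na(x_filled)] <- 0            # replace missing values with 0
print(x_filled)  # 1 2 0 4 5
```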
Infinite Values
Inf + 1 # [1] Inf
-Inf - 1 # [1] -Inf
1/0 # [1] Inf
-1/0 # [1] -Inf
is.infinite(Inf) # [1] TRUE
is.finite(Inf) # [1] FALSE
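Note that is.finite() is not simply the opposite of is.infinite(): NA and NaN are neither finite nor infinite. A short sketch on a mixed vector:

```r
vals <- c(1, Inf, -Inf, NA, NaN)

finite_flags   <- is.finite(vals)
infinite_flags <- is.infinite(vals)

print(finite_flags)    # TRUE FALSE FALSE FALSE FALSE
print(infinite_flags)  # FALSE TRUE TRUE FALSE FALSE

# Keep only ordinary numbers
clean <- vals[is.finite(vals)]
print(clean)  # 1
```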
Machine Constants
.Machine$double.xmax # Largest finite double (about 1.8e308)
.Machine$double.xmin # Smallest positive normalized double (about 2.2e-308)
.Machine$integer.max # Largest integer (2147483647)
.Machine$double.eps # Smallest x such that 1 + x != 1 (about 2.2e-16)
pi # [1] 3.141593
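The meaning of .Machine$double.eps is easiest to see by adding it to 1. This sketch assumes standard IEEE 754 double arithmetic, which is what R uses on common platforms.

```r
eps <- .Machine$double.eps

eps_is_visible  <- (1 + eps) != 1      # adding eps changes the value
half_eps_hidden <- (1 + eps / 2) == 1  # adding half of eps is rounded away

print(eps_is_visible)   # TRUE
print(half_eps_hidden)  # TRUE

# The largest integer is 2^31 - 1
max_int_matches <- .Machine$integer.max == 2^31 - 1
```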
Practice Exercises
Exercise 1: Type Explorer
Write a function that takes any R object and prints its class(), typeof(), is.numeric(), is.character(), and is.logical() results.
Solution
explore_type <- function(obj) {
  cat("Value:", obj, "\n")
  cat("Class:", class(obj), "\n")
  cat("Type:", typeof(obj), "\n")
  cat("is.numeric:", is.numeric(obj), "\n")
  cat("is.character:", is.character(obj), "\n")
  cat("is.logical:", is.logical(obj), "\n")
  cat("---\n")
}
explore_type(42)
explore_type("hello")
explore_type(TRUE)
explore_type(42L)
Exercise 2: Coercion Chain
Predict what each line produces, then verify in R:
c(TRUE, FALSE, TRUE, FALSE)
c(TRUE, 1, 2, 3)
c(1, 2, "three", TRUE)
as.numeric("3.14")
as.character(42)
as.logical(0)
as.logical(1)
as.logical(-1)
Solution
c(TRUE, FALSE, TRUE, FALSE) # [1] TRUE FALSE TRUE FALSE
c(TRUE, 1, 2, 3) # [1] 1 1 2 3 (logical → numeric)
c(1, 2, "three", TRUE) # [1] "1" "2" "three" "TRUE"
as.numeric("3.14") # [1] 3.14
as.character(42) # [1] "42"
as.logical(0) # [1] FALSE
as.logical(1) # [1] TRUE
as.logical(-1) # [1] TRUE (any non-zero = TRUE)
Exercise 3: Missing Value Detector
Write a function that counts how many NA values are in a vector.
Solution
count_na <- function(x) {
  sum(is.na(x))
}
# Test
test <- c(1, NA, 3, NA, NA, 6, 7)
count_na(test) # [1] 3
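The same function can be applied to every column of a data frame with sapply(). The data frame below is invented for illustration, and count_na() is repeated so the sketch runs on its own.

```r
count_na <- function(x) {
  sum(is.na(x))
}

# Illustrative data frame with missing values in each column
survey <- data.frame(
  age   = c(25, NA, 31),
  city  = c("Oslo", "Lima", NA),
  score = c(NA, NA, 9.5)
)

na_per_column <- sapply(survey, count_na)
print(na_per_column)
# age: 1, city: 1, score: 2
```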
Key Takeaways
- R uses <- for assignment by convention, rather than =
- Six data types: numeric (double), integer, character, logical, complex, raw
- NA is missing data, NULL is "nothing exists", NaN is invalid math
- Coercion follows a hierarchy: logical → integer → numeric → complex → character
- Use class() for user-friendly labels, typeof() for internal storage type
- Use is.*() and as.*() families for type checking and conversion
- R is vectorized: operations work element-wise on vectors
- Variables have scope — local inside functions, global outside
- Follow naming conventions — snake_case for variables, UPPER_SNAKE for constants
Next: Learn about R Operators — arithmetic, comparison, logical, and assignment operators in R.