R Note 1 - Introduction to R and Financial Data Handling
Course Information
Course Name: Investments | 000351051
Semester: Spring 2025
Introduction to R & RStudio
Welcome to the first session of Investments! This tutorial introduces the R programming language and shows how to handle financial data for investment research.
1. Setting Up R and RStudio
Before we begin, ensure you have installed:
- R
- RStudio
# Install packages (only needs to be run once)
install.packages("tidyquant")
install.packages("tidyverse")
# Also install the other packages loaded below
install.packages("ggthemes")
install.packages("highcharter")
install.packages("DT")
# Load packages
library(tidyquant)
library(tidyverse)
library(ggplot2)
library(ggthemes)
library(highcharter)
library(DT)
2. Basic R Syntax
# Basic operations in R
x <- 10
y <- 5
sum_xy <- x + y
diff_xy <- x - y
prod_xy <- x * y
quot_xy <- x / y
exp_xy <- x^y
print(sum_xy)
[1] 15
print(diff_xy)
[1] 5
print(prod_xy)
[1] 50
print(quot_xy)
[1] 2
print(exp_xy)
[1] 1e+05
3. Understanding Data Types in R
Numeric
num_var <- 42
print(num_var)
[1] 42
Character (String)
char_var <- "Hello, R!"
print(char_var)
[1] "Hello, R!"
Logical (Boolean)
bool_var <- TRUE
print(bool_var)
[1] TRUE
Factor (Categorical Data)
factor_var <- factor(c("low", "medium", "high", "medium"))
print(factor_var)
[1] low    medium high   medium
Levels: high low medium
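Note that the levels above are sorted alphabetically (high, low, medium), which is rarely the order you want for ranked categories. The following is a minimal sketch of setting the order explicitly through the levels and ordered arguments of factor(). The variable name risk_var is made up for illustration.

```r
# Hypothetical example: an ordered factor with an explicit level order
risk_var <- factor(
  c("low", "medium", "high", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

levels(risk_var)      # levels now follow the order given above
risk_var < "high"     # comparisons respect the level order
table(risk_var)       # counts are reported in level order
```

With an ordered factor, sorting, comparisons, and plot axes follow the ranking rather than the alphabet.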
Data Structures
# Vectors
prices <- c(100, 102, 105, 107, 110)
print(prices)
[1] 100 102 105 107 110
# Lists (Collection of different types)
list_var <- list(num_var, char_var, bool_var, prices)
print(list_var)
[[1]]
[1] 42

[[2]]
[1] "Hello, R!"

[[3]]
[1] TRUE

[[4]]
[1] 100 102 105 107 110
# Matrices (2D array)
matrix_var <- matrix(1:9, nrow=3)
print(matrix_var)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
# Data Frames
stock_data <- data.frame(
  Date = as.Date('2024-01-01') + 0:4,
  Price = prices
)
print(stock_data)
        Date Price
1 2024-01-01   100
2 2024-01-02   102
3 2024-01-03   105
4 2024-01-04   107
5 2024-01-05   110
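Once these structures exist, the next step is usually to pull values out of them and compute with them. The sketch below recreates the objects from this section and shows common indexing and calculation patterns. The column name Price_EUR and the 0.9 conversion rate are invented purely for illustration.

```r
# Recreate the objects from this section so the example is self-contained
prices <- c(100, 102, 105, 107, 110)
list_var <- list(42, "Hello, R!", TRUE, prices)
matrix_var <- matrix(1:9, nrow = 3)
stock_data <- data.frame(Date = as.Date("2024-01-01") + 0:4, Price = prices)

# Indexing
second_price <- prices[2]          # vector: single brackets
first_in_list <- list_var[[4]][1]  # list: double brackets extract the item itself
cell <- matrix_var[2, 3]           # matrix: [row, column]
high_rows <- stock_data[stock_data$Price > 104, ]  # data frame: filter rows by a condition

# Calculations are element-wise on vectors and columns
price_change <- diff(prices)                    # day-over-day change
stock_data$Price_EUR <- stock_data$Price * 0.9  # 0.9 is a made-up exchange rate
col_totals <- colSums(matrix_var)               # column sums of a matrix

print(price_change)
print(col_totals)
print(high_rows)
```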
4. The Pipe Operator (%>% and |>)
The pipe operator allows you to pass the output of one function directly into another function, making your code cleaner and easier to read.
Using %>% from magrittr (via dplyr and the tidyverse)
The %>% pipe operator comes from the magrittr package and is re-exported by dplyr, so it is available as soon as the tidyverse is loaded. It lets you chain multiple operations.
# Example dataset
data <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))
data %>%
  mutate(z = x + y) %>%
  filter(z > 6) %>%
  arrange(desc(z))
  x  y  z
1 5 10 15
2 4  8 12
3 3  6  9
Using |> (Base R Pipe)
R introduced the native |> pipe operator in version 4.1.0. It works similarly but has slightly different behavior:
data |>
  mutate(z = x + y) |>
  filter(z > 6) |>
  arrange(desc(z))
  x  y  z
1 5 10 15
2 4  8 12
3 3  6  9
Key Differences:
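- %>% can pass the left-hand side to any argument position using the . placeholder (useful for functions whose first argument is not the data).
- |> is built into base R, so it needs no package and adds essentially no overhead. Its right-hand side must be a function call with parentheses, and since R 4.2.0 the _ placeholder can be used with a named argument.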
# An alternative example:
# normal use of a function
mean(prices)
[1] 104.8
# pipe from dplyr
prices %>% mean
[1] 104.8
# pipe in base R
prices |> mean()
[1] 104.8
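The placeholder difference matters when the data is not the first argument of a function. The sketch below uses lm() on the built-in mtcars dataset, which is not part of this tutorial and was chosen only because it ships with R. The base R version assumes R 4.2.0 or later, where the _ placeholder is available.

```r
library(magrittr)  # provides %>%; it is also attached by dplyr and the tidyverse

# lm() takes the formula first and the data second, so a placeholder is needed.
# magrittr: `.` marks where the left-hand side goes
fit_magrittr <- mtcars %>% lm(mpg ~ wt, data = .)

# base R (assumes R >= 4.2.0): `_` must be supplied to a named argument
fit_base <- mtcars |> lm(mpg ~ wt, data = _)

# Both pipelines fit the same model
all.equal(coef(fit_magrittr), coef(fit_base))
```

For everyday chains where the data is the first argument, the two pipes are interchangeable.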
5. Using For Loops in R
A for loop allows us to iterate over a sequence and perform operations repeatedly.
Basic For Loop Example
# Print numbers from 1 to 5
for (i in 1:5) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Looping Over a Vector
prices <- c(100, 102, 105, 107, 110)
for (price in prices) {
  print(price * 1.1)  # Increase price by 10%
}
[1] 110
[1] 112.2
[1] 115.5
[1] 117.7
[1] 121
Storing Loop Output in a Vector
adjusted_prices <- numeric(length(prices))
for (i in seq_along(prices)) {
  adjusted_prices[i] <- prices[i] * 1.1
}
print(adjusted_prices)
[1] 110.0 112.2 115.5 117.7 121.0
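For simple element-wise arithmetic like this, R does not actually need a loop, because arithmetic operators are vectorized. The sketch below compares the loop with two common alternatives. The names adjusted_loop, adjusted_vec, and adjusted_sapply are illustrative only.

```r
prices <- c(100, 102, 105, 107, 110)

# Loop version (same logic as above)
adjusted_loop <- numeric(length(prices))
for (i in seq_along(prices)) {
  adjusted_loop[i] <- prices[i] * 1.1
}

# Vectorized version: the multiplication is applied to every element at once
adjusted_vec <- prices * 1.1

# sapply version: applies a function to each element and collects the results
adjusted_sapply <- sapply(prices, function(p) p * 1.1)

# All three approaches give the same result
all.equal(adjusted_loop, adjusted_vec)
all.equal(adjusted_loop, adjusted_sapply)
```

Loops remain the natural choice when each step depends on the result of the previous one.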
Nesting Loops Example
for (i in 1:3) {
  for (j in 1:2) {
    print(paste("i:", i, "j:", j))
  }
}
[1] "i: 1 j: 1"
[1] "i: 1 j: 2"
[1] "i: 2 j: 1"
[1] "i: 2 j: 2"
[1] "i: 3 j: 1"
[1] "i: 3 j: 2"
6. Importing Financial Data
Using tidyquant to fetch stock prices:
# Get Apple (AAPL) stock data
apple_stock <- tq_get("AAPL", from = "2023-01-01", to = "2024-01-01")
# Interactive table
datatable(apple_stock) |>
  formatRound(columns = c("open", "high", "low", "close", "adjusted"), digits = 2)
7. Basic Data Manipulation with dplyr
# Calculate daily returns
apple_stock <- apple_stock |>
  arrange(date) |>
  mutate(daily_return = adjusted / lag(adjusted) - 1)
head(apple_stock)
# A tibble: 6 × 9
  symbol date        open  high   low close    volume adjusted daily_return
  <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>        <dbl>
1 AAPL   2023-01-03  130.  131.  124.  125. 112117500     123.     NA
2 AAPL   2023-01-04  127.  129.  125.  126.  89113600     125.      0.0103
3 AAPL   2023-01-05  127.  128.  125.  125.  80962700     123.     -0.0106
4 AAPL   2023-01-06  126.  130.  125.  130.  87754700     128.      0.0368
5 AAPL   2023-01-09  130.  133.  130.  130.  70790800     128.      0.00409
6 AAPL   2023-01-10  130.  131.  128.  131.  63896200     129.      0.00446
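The daily return computed above is the simple return, today's adjusted price divided by yesterday's, minus 1. The first row is NA because there is no previous price. To make the arithmetic easy to check by hand, here is a small sketch on a toy price series that needs no download. The data frame toy and the columns log_return and cum_return are illustrative additions, not part of the original analysis.

```r
library(dplyr)

# Toy data: the same five prices used earlier in this note
toy <- data.frame(
  date = as.Date("2024-01-01") + 0:4,
  adjusted = c(100, 102, 105, 107, 110)
)

toy <- toy |>
  arrange(date) |>
  mutate(
    daily_return = adjusted / lag(adjusted) - 1,   # simple return, NA on the first day
    log_return   = log(adjusted / lag(adjusted)),  # log return, additive over time
    # compound the simple returns, treating the first (NA) day as 0
    cum_return   = cumprod(1 + coalesce(daily_return, 0)) - 1
  )

print(toy)

# The final cumulative return equals last price / first price - 1
total_return <- last(toy$adjusted) / first(toy$adjusted) - 1
all.equal(last(toy$cum_return), total_return)
```

The same mutate() step can be applied to apple_stock once the real data has been fetched.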
8. Visualizing Stock Prices with ggplot2
The ggplot2 package is a powerful tool for creating visualizations in R. It follows the Grammar of Graphics approach, allowing users to build complex charts by adding layers one at a time.
Key Features of ggplot2:
- Layered approach to building plots.
- Highly customizable aesthetic mappings.
- Supports multiple themes for different styling.
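The layered approach listed above can be seen by building a plot one step at a time. The sketch below uses a small made-up data frame (toy) so it runs without downloading anything. The colors, sizes, and title are arbitrary choices for illustration.

```r
library(ggplot2)

# Toy data for illustration only
toy <- data.frame(
  date = as.Date("2024-01-01") + 0:4,
  price = c(100, 102, 105, 107, 110)
)

# Start with the data and aesthetic mappings; nothing is drawn yet
p <- ggplot(toy, aes(x = date, y = price))

# Add geometry layers one by one
p <- p + geom_line(color = "steelblue")  # layer 1: line
p <- p + geom_point(size = 2)            # layer 2: points on top of the line

# Labels and a theme are added the same way, but are not layers
p <- p + labs(title = "Toy Price Series", x = "Date", y = "Price")
p <- p + theme_minimal()

p  # printing the object renders the plot
```

Because each step returns a plot object, you can store an intermediate plot and reuse it with different layers or themes.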
Using Themes with ggthemes
The ggthemes package provides multiple predefined themes to enhance the aesthetics of plots. More styles of ggthemes can be found via ggplot2 Themes and Styles.
Now, let’s plot the stock price trend for AAPL using the Wall Street Journal (WSJ) theme:
# Plot price trend
ggplot(apple_stock, aes(x = date, y = adjusted)) +
  geom_line(color = "blue", linewidth = 1) +
  labs(title = "AAPL Stock Price", x = "Date", y = "Adjusted Close") +
  theme_wsj()
9. Interactive Visualization with highcharter
More pre-built highcharter themes can be found via highcharter Themes and Styles.
# Create interactive stock price and volume plot
highchart(type = "stock") |>
  hc_add_series(apple_stock, type = "line", hcaes(x = date, y = adjusted), name = "AAPL Price") |>
  hc_add_series(apple_stock, type = "column", hcaes(x = date, y = volume), name = "Trading Volume", yAxis = 1) |>
  hc_yAxis_multiples(
    list(title = list(text = "Stock Price")),
    list(title = list(text = "Trading Volume"), opposite = TRUE)
  ) |>
  hc_title(text = "AAPL Stock Price and Trading Volume") |>
  hc_add_theme(hc_theme_ft())
# Candlestick Chart
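# Use the raw close so it stays consistent with the raw open/high/low
# (the adjusted close can fall outside the day's high-low range)
highchart(type = "stock") |>
  hc_add_series(apple_stock, type = "candlestick", hcaes(x = date, open = open, high = high, low = low, close = close), name = "AAPL Price") |>
  hc_title(text = "AAPL Candlestick Chart")
Exercises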
- Fetch stock price data for MSFT using tidyquant.
- Compute daily returns for MSFT.
- Create a price trend plot for MSFT.
- Create a vector, matrix, and data frame, and perform calculations on them.
- Use highcharter to plot MSFT price and trading volume.
Resources
GenAI: ChatGPT, Gemini, etc.
Data Analysis with R, Coursera, Mine Çetinkaya-Rundel, Duke University.
Google Data Analytics, Coursera, Google.
Stack Overflow for coding Q&A!
Happy coding! 🚀