R Note 1 - Introduction to R and Financial Data Handling

Author: Asst. Prof. Calvin J. Chiou

Affiliation: National Chengchi University (NCCU)

Course Information

  • Course Name: Investments | 000351051

  • Semester: Spring 2025

Introduction to R & RStudio

Welcome to the first session of Investments! This tutorial introduces the R programming language and shows how to handle financial data for investment research.

1. Setting Up R and RStudio

Before we begin, ensure you have installed:

  • R
  • RStudio

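# Install packages (run once; every package loaded below must be installed first)
install.packages(c("tidyquant", "tidyverse", "ggthemes", "highcharter", "DT"))
# Load packages
library(tidyquant)
library(tidyverse)    # attaches dplyr, ggplot2, and other core packages
library(ggplot2)      # already attached by tidyverse; harmless to load again
library(ggthemes)     # extra ggplot2 themes such as theme_wsj()
library(highcharter)  # interactive charts
library(DT)           # interactive tables via datatable()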

2. Basic R Syntax

# Basic operations in R
x <- 10
y <- 5

sum_xy <- x + y
diff_xy <- x - y
prod_xy <- x * y
quot_xy <- x / y
exp_xy <- x^y

print(sum_xy)
[1] 15
print(diff_xy)
[1] 5
print(prod_xy)
[1] 50
print(quot_xy)
[1] 2
print(exp_xy)
[1] 1e+05

3. Understanding Data Types in R

Numeric

num_var <- 42
print(num_var)
[1] 42

Character (String)

char_var <- "Hello, R!"
print(char_var)
[1] "Hello, R!"

Logical (Boolean)

bool_var <- TRUE
print(bool_var)
[1] TRUE

Factor (Categorical Data)

factor_var <- factor(c("low", "medium", "high", "medium"))
print(factor_var)
[1] low    medium high   medium
Levels: high low medium
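
By default, factor() sorts its levels alphabetically, which is why the output above lists high before low. When the categories have a natural order, the levels can be set explicitly. The following is a minimal sketch; risk_var is an illustrative name that is not used elsewhere in this note.

```r
# Set the level order explicitly and mark the factor as ordered
risk_var <- factor(
  c("low", "medium", "high", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

levels(risk_var)     # levels now follow the order given above
risk_var < "high"    # ordered factors support comparisons
table(risk_var)      # counts are reported in level order
```

An ordered factor is useful when later steps (sorting, plotting, comparisons) should respect low < medium < high rather than alphabetical order.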

Data Structures

# Vectors
prices <- c(100, 102, 105, 107, 110)
print(prices)
[1] 100 102 105 107 110
# Lists (Collection of different types)
list_var <- list(num_var, char_var, bool_var, prices)
print(list_var)
[[1]]
[1] 42

[[2]]
[1] "Hello, R!"

[[3]]
[1] TRUE

[[4]]
[1] 100 102 105 107 110
# Matrices (2D array)
matrix_var <- matrix(1:9, nrow=3)
print(matrix_var)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
# Data Frames
stock_data <- data.frame(
  Date = as.Date('2024-01-01') + 0:4,
  Price = prices
)
print(stock_data)
        Date Price
1 2024-01-01   100
2 2024-01-02   102
3 2024-01-03   105
4 2024-01-04   107
5 2024-01-05   110
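
Once these structures exist, the usual next step is to pull values out of them and compute on them. The sketch below recreates the toy objects from above so it can run on its own; the names m, high_rows, and Change are illustrative choices, not part of the original examples.

```r
prices <- c(100, 102, 105, 107, 110)
m <- matrix(1:9, nrow = 3)
stock_data <- data.frame(
  Date = as.Date("2024-01-01") + 0:4,
  Price = prices
)

# Vector: index by position
first_price <- prices[1]
last_two <- tail(prices, 2)

# Matrix: index by [row, column]; colSums() adds up each column
m_elem <- m[2, 3]
col_sums <- colSums(m)

# Data frame: use $ to access a column, and a logical condition to filter rows
mean_price <- mean(stock_data$Price)
high_rows <- stock_data[stock_data$Price > 104, ]

# Add a new column: day-over-day price change (first value has no predecessor)
stock_data$Change <- c(NA, diff(stock_data$Price))
print(stock_data)
```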

4. The Pipe Operator (%>% and |>)

The pipe operator allows you to pass the output of one function directly into another function, making your code cleaner and easier to read.
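
To see why piping reads more cleanly, compare a nested call with the same steps written as a pipeline. This is a small sketch using only base R (it assumes R 4.1.0 or later for the native pipe); the variable names nested and piped are illustrative.

```r
prices <- c(100, 102, 105, 107, 110)

# Nested calls are read from the inside out
nested <- round(sqrt(sum(prices)), 1)

# The same steps with the native pipe read left to right
piped <- prices |> sum() |> sqrt() |> round(1)

identical(nested, piped)
```

Both lines perform the same computation; the piped version simply lists the steps in the order they happen.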

Using %>% from magrittr (loaded with dplyr and the tidyverse)

The %>% pipe operator comes from the magrittr package and is re-exported by dplyr, so it is available once the tidyverse is loaded. It lets you chain multiple operations.

# Example dataset
data <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))

data %>% 
  mutate(z = x + y) %>% 
  filter(z > 6) %>% 
  arrange(desc(z))
  x  y  z
1 5 10 15
2 4  8 12
3 3  6  9

Using |> (Base R Pipe)

R introduced the native |> pipe operator in version 4.1.0. For simple chains like the one below it gives the same result as %>%; the differences are listed afterwards:

data |> 
  mutate(z = x + y) |> 
  filter(z > 6) |> 
  arrange(desc(z))
  x  y  z
1 5 10 15
2 4  8 12
3 3  6  9

Key Differences:

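  • %>% lets you place the piped value anywhere in the call with the . placeholder, which helps when the data is not the function's first argument.

  • |> is built into R, so it needs no package and adds no function-call overhead. It requires parentheses on the right-hand side (mean(), not mean), and since R 4.2.0 it offers a _ placeholder that must be passed to a named argument.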

# An alternative example:
# normal use of a function
mean(prices)
[1] 104.8
# pipe from magrittr (available via dplyr/tidyverse)
prices %>% mean
[1] 104.8
# pipe in base R
prices |> mean()
[1] 104.8
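
The placeholder difference matters when the piped object is not the first argument of the next function. lm() is a common case: its first argument is the formula, and the data goes in the data argument. The sketch below assumes the magrittr package is installed (it comes with the tidyverse) and R 4.2.0 or later for the _ placeholder; fit_magrittr and fit_native are illustrative names.

```r
library(magrittr)  # provides %>%; also attached by dplyr/tidyverse

data <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))

# magrittr: "." marks where the piped value goes
fit_magrittr <- data %>% lm(y ~ x, data = .)

# Native pipe: "_" must be passed to a named argument
fit_native <- data |> lm(y ~ x, data = _)

coef(fit_magrittr)
coef(fit_native)
```

Both calls fit the same regression, so the two sets of coefficients should match.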

5. Using For Loops in R

A for loop allows us to iterate over a sequence and perform operations repeatedly.

Basic For Loop Example

# Print numbers from 1 to 5
for (i in 1:5) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

Looping Over a Vector

prices <- c(100, 102, 105, 107, 110)

for (price in prices) {
  print(price * 1.1)  # Increase price by 10%
}
[1] 110
[1] 112.2
[1] 115.5
[1] 117.7
[1] 121

Storing Loop Output in a Vector

adjusted_prices <- numeric(length(prices))

for (i in seq_along(prices)) {
  adjusted_prices[i] <- prices[i] * 1.1
}
print(adjusted_prices)
[1] 110.0 112.2 115.5 117.7 121.0
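
For element-wise arithmetic like this, a loop is often unnecessary because R operates on whole vectors at once. The sketch below compares the loop with a vectorized expression and with sapply(); adjusted_loop, adjusted_vec, and adjusted_sapply are illustrative names.

```r
prices <- c(100, 102, 105, 107, 110)

# Loop version, as in the example above
adjusted_loop <- numeric(length(prices))
for (i in seq_along(prices)) {
  adjusted_loop[i] <- prices[i] * 1.1
}

# Vectorized version: the multiplication applies to every element
adjusted_vec <- prices * 1.1

# sapply() applies a function to each element and returns a vector
adjusted_sapply <- sapply(prices, function(p) p * 1.1)

all.equal(adjusted_loop, adjusted_vec)
all.equal(adjusted_vec, adjusted_sapply)
```

Loops remain the natural choice when each step depends on the previous one; for independent element-wise operations, the vectorized form is shorter and usually faster.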

Nesting Loops Example

for (i in 1:3) {
  for (j in 1:2) {
    print(paste("i:", i, "j:", j))
  }
}
[1] "i: 1 j: 1"
[1] "i: 1 j: 2"
[1] "i: 2 j: 1"
[1] "i: 2 j: 2"
[1] "i: 3 j: 1"
[1] "i: 3 j: 2"

6. Importing Financial Data

Using tidyquant to fetch stock prices:

# Get Apple (AAPL) stock data
apple_stock <- tq_get("AAPL", from = "2023-01-01", to = "2024-01-01")
# Interactive table
datatable(apple_stock) |> 
  formatRound(columns = c("open","high","low","close","adjusted"), digits = 2)
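
Price data may also arrive as a local file rather than from tq_get(). The following is a minimal sketch using base R only. The file is hypothetical: it is created here in a temporary directory purely for illustration, and the column names date and adjusted are assumptions chosen to mirror the tq_get() output.

```r
# Hypothetical file: written to a temporary location for illustration only
csv_path <- tempfile(fileext = ".csv")
toy_prices <- data.frame(
  date = as.Date("2024-01-01") + 0:4,
  adjusted = c(100, 102, 105, 107, 110)
)
write.csv(toy_prices, csv_path, row.names = FALSE)

# read.csv() reads dates as text, so convert the column explicitly
local_prices <- read.csv(csv_path)
local_prices$date <- as.Date(local_prices$date)
str(local_prices)
```

Converting the date column to the Date class keeps later steps such as arrange(date) and time-series plots working as they do with tq_get() data.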

7. Basic Data Manipulation with dplyr

# Calculate daily returns
apple_stock <- apple_stock |> 
  arrange(date) |> 
  mutate(daily_return = adjusted / lag(adjusted) - 1)
head(apple_stock)
# A tibble: 6 × 9
  symbol date        open  high   low close    volume adjusted daily_return
  <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>        <dbl>
1 AAPL   2023-01-03  130.  131.  124.  125. 112117500     124.     NA      
2 AAPL   2023-01-04  127.  129.  125.  126.  89113600     125.      0.0103 
3 AAPL   2023-01-05  127.  128.  125.  125.  80962700     124.     -0.0106 
4 AAPL   2023-01-06  126.  130.  125.  130.  87754700     128.      0.0368 
5 AAPL   2023-01-09  130.  133.  130.  130.  70790800     129.      0.00409
6 AAPL   2023-01-10  130.  131.  128.  131.  63896200     129.      0.00446
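
The mutate() step above applies the simple-return formula: each day's return is that day's adjusted price divided by the previous day's, minus 1. The first value is NA because there is no earlier price. The same arithmetic can be checked by hand on the toy prices vector in base R; daily_return and total_return are illustrative names.

```r
prices <- c(100, 102, 105, 107, 110)
n <- length(prices)

# Simple return: today's price over yesterday's, minus 1
daily_return <- c(NA, prices[-1] / prices[-n] - 1)
round(daily_return, 4)

# Compounding the daily returns recovers the total return over the period
total_return <- prod(1 + daily_return, na.rm = TRUE) - 1
all.equal(total_return, prices[n] / prices[1] - 1)
```

Compounding (multiplying 1 + return across days) is the right way to aggregate simple returns; summing them only approximates the total.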

8. Visualizing Stock Prices with ggplot2

The ggplot2 package is a powerful tool for creating visualizations in R. It follows the Grammar of Graphics approach, allowing users to build complex charts by adding one layer at a time.

Key Features of ggplot2:

  • Layered approach to building plots.
  • Highly customizable aesthetic mappings.
  • Supports multiple themes for different styling.
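
The layered approach can be sketched on a toy data frame, adding one component per line. This assumes ggplot2 is installed; toy_data and p are illustrative names, and the color and theme choices are arbitrary.

```r
library(ggplot2)

toy_data <- data.frame(
  Date = as.Date("2024-01-01") + 0:4,
  Price = c(100, 102, 105, 107, 110)
)

p <- ggplot(toy_data, aes(x = Date, y = Price))   # data + aesthetic mapping
p <- p + geom_line(color = "steelblue")           # layer 1: line
p <- p + geom_point(size = 2)                     # layer 2: points
p <- p + labs(title = "Toy Price Series", x = "Date", y = "Price")
p <- p + theme_minimal()                          # swap in a built-in theme

print(p)
```

Because each + adds a component to the same plot object, layers and themes can be changed independently without rewriting the rest of the plot.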

Using Themes with ggthemes

The ggthemes package provides multiple predefined themes to enhance the aesthetics of plots. More ggthemes styles are listed in ggplot2 Themes and Styles.

Now, let’s plot the stock price trend for AAPL using the Wall Street Journal (WSJ) theme:

# Plot price trend
ggplot(apple_stock, aes(x = date, y = adjusted)) +
  geom_line(color = "blue", linewidth = 1) +
  labs(title = "AAPL Stock Price", x = "Date", y = "Adjusted Close") +
  theme_wsj()

9. Interactive Visualization with highcharter

More pre-built highcharter themes can be found via highcharter Themes and Styles.

# Create interactive stock price and volume plot
highchart(type = "stock") |> 
  hc_add_series(apple_stock, type = "line", hcaes(x = date, y = adjusted), name = "AAPL Price") |> 
  hc_add_series(apple_stock, type = "column", hcaes(x = date, y = volume), name = "Trading Volume", yAxis = 1) |> 
  hc_yAxis_multiples(
    list(title = list(text = "Stock Price")),
    list(title = list(text = "Trading Volume"), opposite = TRUE)
  ) |> 
  hc_title(text = "AAPL Stock Price and Trading Volume") |>
  hc_add_theme(hc_theme_ft())
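# Candlestick chart
# Use the raw close here: mixing the adjusted close with unadjusted open/high/low distorts the candles
highchart(type = "stock") |> 
  hc_add_series(apple_stock, type = "candlestick", hcaes(x = date, open = open, high = high, low = low, close = close), name = "AAPL Price") |> 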
  hc_title(text = "AAPL Candlestick Chart")

Exercises

  1. Fetch stock price data for MSFT using tidyquant.
  2. Compute daily returns for MSFT.
  3. Create a price trend plot for MSFT.
  4. Create a vector, matrix, and data frame, and perform calculations on them.
  5. Use highcharter to plot MSFT price and trading volume.

Resources

  1. Cookbook for R.

  2. R Workflow.

  3. GenAI: ChatGPT, Gemini, etc.

  4. Data Analysis with R, Coursera, Mine Çetinkaya-Rundel, Duke University.

  5. Google Data Analytics, Coursera, Google.

  6. Stack Overflow for coding Q&A!


Happy coding! 🚀
