Welcome to the second installment of my three-part series dedicated to portfolio standard deviation, also known as volatility. In this series, you will learn to build a Shiny application to visualize total portfolio volatility over time, as well as how each asset has contributed to that volatility. You will accomplish this over the course of three posts, but if you just can’t wait, the final app is available here. If you missed the first installment on calculating and visualizing portfolio standard deviation, you can find it here.

By way of quick reminder, our portfolio consists of the following:


+ SPY (S&P500 fund), weighted 25%
+ EFA (a non-US equities fund), weighted 25%
+ IJS (a small-cap value fund), weighted 20%
+ EEM (an emerging-markets fund), weighted 20%
+ AGG (a bond fund), weighted 10%

We first need to import prices and calculate returns for this portfolio. You can find the bare code needed to do so in the previous post. For the curious, there is an entire post on converting individual prices to monthly returns here and another post on converting to portfolio returns here.
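For readers who want a self-contained starting point, here is a minimal sketch of that import-and-convert flow, assuming Yahoo Finance as the price source and a 2012-12-31 through 2017-12-31 date range (see the linked posts for the full discussion and exact choices):

```r
library(tidyverse)
library(tidyquant)
library(PerformanceAnalytics)

symbols <- c("SPY", "EFA", "IJS", "EEM", "AGG")
w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

# Import daily prices and keep the adjusted closes.
prices <- 
  getSymbols(symbols, src = "yahoo", 
             from = "2012-12-31", to = "2017-12-31",
             auto.assign = TRUE, warnings = FALSE) %>% 
  map(~ Ad(get(.x))) %>% 
  reduce(merge) %>% 
  `colnames<-`(symbols)

# Convert to monthly log returns, dropping the initial NA row.
asset_returns_xts <- 
  to.monthly(prices, indexAt = "lastof", OHLC = FALSE) %>% 
  Return.calculate(method = "log") %>% 
  na.omit()
```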

Component Contribution to Standard Deviation

Let’s start investigating how each of our five assets contributes to portfolio standard deviation. Why might we want to do that?

From a substantive perspective, you might want to ensure that risk is not too concentrated in one asset. Not only might this lead to a less diversified portfolio than you intended, but it also might indicate that your initial assumptions about a particular asset were wrong — or at least, have become less right over time.

From an R toolkit perspective, we want to build our own custom function for the job, nest it within another custom function to make it rolling, apply the combination with purrr, and then use both functions in a Shiny app.

Let’s load up our packages and get started:


library(PerformanceAnalytics)
library(tidyverse)
library(tidyquant)
library(timetk)
library(tibbletime)
library(highcharter)
library(scales)

Component Contribution, Step-by-Step

An asset's percentage contribution to portfolio standard deviation is defined as:

(marginal contribution of the asset * weight of the asset) / portfolio standard deviation

Let's start by building a covariance matrix and calculating the standard deviation of our portfolio. You should note that this is exactly how I started the previous post — in that case, it was more of an exploration of matrix algebra. Today, it will become essential.


covariance_matrix <- 
  cov(asset_returns_xts)

sd_portfolio <- 
  sqrt(t(w) %*% covariance_matrix %*% w)
 

We will find the marginal contribution of each asset by taking the cross product of the weights vector and the covariance matrix, divided by the portfolio standard deviation.


marginal_contribution <- 
  w %*% covariance_matrix / sd_portfolio[1, 1]

Now, multiply the marginal contribution of each asset by our original weights vector, w — which we defined way back in the Returns section of the first post — to get each asset's total contribution.


component_contribution <- 
  marginal_contribution * w 

Let's check that the sum of the asset contributions is equal to total portfolio standard deviation.


components_summed <- rowSums(component_contribution)

components_summed

[1] 0.02661184

sd_portfolio[1,1]

[1] 0.02661184

The summed components are equal to the total, as we hoped.

To get to percentage contribution of each asset, divide each asset’s contribution by total portfolio standard deviation.


component_percentages <- 
  component_contribution / sd_portfolio[1, 1]

round(component_percentages, 3)

       SPY   EFA   IJS  EEM   AGG
[1,] 0.233 0.276 0.227 0.26 0.003

And there you go, a step-by-step code flow! Now let’s get functional.

Component Contribution with a Custom Function

You will notice in the above code, we supplied only two pieces of data: asset_returns_xts and w. Therefore, we can write a function that takes two arguments — an individual asset returns object and a weights vector — and calculates the contribution to standard deviation of each of the assets. Let's title our function component_contr_matrix_fun() and port our previous step-by-step code:


component_contr_matrix_fun <- function(returns, w){
# create covariance matrix
covariance_matrix <- 
  cov(returns)
# calculate portfolio standard deviation
sd_portfolio <- 
  sqrt(t(w) %*% covariance_matrix %*% w)
# calculate marginal contribution of each asset
marginal_contribution <- 
  w %*% covariance_matrix / sd_portfolio[1, 1]
# multiply marginal by weights vector
component_contribution <- 
  marginal_contribution * w 
# divide by total standard deviation to get percentages
component_percentages <- 
  component_contribution / sd_portfolio[1, 1] 

component_percentages %>% 
  as_tibble() %>% 
  gather(asset, contribution)
}

Check out the final two lines of the function: as_tibble() converts the result to a tibble and gather(asset, contribution) converts to a tidy format. We don’t have to include those; feel free to delete them and see how it changes the results.

Let’s test the function and confirm it returns the same results as our step-by-step code flow if we pass in the same data as before: asset_returns_xts and w.


test_the_function_xts <- 
  component_contr_matrix_fun(asset_returns_xts, w)

test_the_function_xts
asset  contribution
<chr>         <dbl>
SPY     0.233393101
EFA     0.276260324
IJS     0.227019764
EEM     0.260150284
AGG     0.003176527

The substantive results are the same as the results of our earlier step-by-step calculation.

Let’s test whether we can pass in a tibble of returns. We will first convert asset_returns_xts to a tibble with tk_tbl() and save the result as asset_returns_tibble. Then, we'll pass that object to our function.


asset_returns_tibble <- 
  asset_returns_xts %>% 
  tk_tbl(preserve_index = TRUE, rename_index = "date") 

percentages_tibble <- 
  asset_returns_tibble %>% 
  select(-date) %>% 
  component_contr_matrix_fun(., w)

percentages_tibble
asset  contribution
<chr>         <dbl>
SPY     0.233393101
EFA     0.276260324
IJS     0.227019764
EEM     0.260150284
AGG     0.003176527

We have consistent results from our step-by-step code and from our custom function when passing in xts and tibble data.

The custom function accomplishes what we want. Of course, I'm certainly not the first person to imagine the usefulness of component contribution to standard deviation. There is a built-in function to accomplish this: StdDev(asset_returns_xts, weights = w, portfolio_method = "component") from the PerformanceAnalytics package.
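For comparison, here is a quick sketch of that built-in route. With portfolio_method = "component", StdDev() returns a list whose pct_contrib_StdDev element holds the percentage contributions, which should match our custom function's results up to rounding:

```r
# The built-in component decomposition from PerformanceAnalytics.
builtin_component <- 
  StdDev(asset_returns_xts, 
         weights = w, 
         portfolio_method = "component")

# Percentage contribution of each asset to portfolio standard deviation.
builtin_component$pct_contrib_StdDev
```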

However, writing our own function allows us flexibility. We are able to pass in a tibble and specify the return format as a tidy tibble, alter the matrix algebra and calculate an entirely different statistic, or alter the output to include the weights as a column. Your creativity is the only limitation when it comes to custom functions, especially once you have a template for writing and creating them.

Next, you will see how returning a tibble facilitates visualization.

Visualizing Component Contribution

The output of our custom function, percentages_tibble, is already tidy. We can pipe to ggplot() and use geom_bar() to chart risk contribution on the y-axis.


percentages_tibble %>% 
  ggplot(aes(x = asset, y = contribution)) +
  geom_col(fill = 'cornflowerblue', 
           colour = 'pink', 
           width = .6) + 
  scale_y_continuous(labels = percent, 
                     breaks = pretty_breaks(n = 20)) + 
  ggtitle("Percent Contribution to Standard Deviation") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("") +
  ylab("Percent Contribution to Risk")
contribution-graph-1 

This chart shows individual contributions, but it would be useful to see a chart that compares asset weight to risk contribution.

In order to create such a chart, gather your tibble to long format with gather(type, percent, -asset), then call ggplot(aes(x = asset, y = percent, fill = type)). Position the columns so they are not right on top of each other with geom_col(position='dodge').


percentages_tibble %>%
  mutate(weights = w) %>% 
  gather(type, percent, -asset) %>% 
  group_by(type) %>% 
  ggplot(aes(x = asset, 
             y = percent, 
             fill = type)) +
  geom_col(position='dodge') + 
  scale_y_continuous(labels = percent) + 
  ggtitle("Percent Contribution to Volatility") +
  theme(plot.title = element_text(hjust = 0.5))

contribution-graph-2

This chart makes it clear that AGG, the bond fund, has done a good job as a volatility dampener. It has a 10% allocation, but contributes almost zero to volatility.

You have now written a custom function and used the results in a nice visualization. It's time to think about how you could add the ggplot() flow to your custom function so that the result is a data visualization.
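As a starting hint for that exercise, here is one possible sketch — a wrapper (the function name is my own invention, not from the original posts) that nests component_contr_matrix_fun() and returns the chart instead of the tibble:

```r
# A hypothetical wrapper: calculate contributions, then return the bar chart.
component_contr_plot_fun <- function(returns, w) {
  component_contr_matrix_fun(returns, w) %>% 
    ggplot(aes(x = asset, y = contribution)) +
    geom_col(fill = "cornflowerblue", width = .6) + 
    scale_y_continuous(labels = percent) +
    labs(x = "", 
         y = "Percent Contribution to Risk",
         title = "Percent Contribution to Standard Deviation") +
    theme(plot.title = element_text(hjust = 0.5))
}

# Usage: component_contr_plot_fun(asset_returns_xts, w)
```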

Rolling Component Contribution

Let's embark on our most involved task yet, which is to calculate the rolling volatility contribution of each asset. The previous section helped us uncover the total contribution of each asset over the life of the portfolio, but it did not help us understand risk components over time.

To gain that understanding, you need to create a function that calculates the rolling contribution to standard deviation after being supplied with four arguments:

  1. Asset returns
  2. Portfolio weights
  3. Starting date
  4. Rolling window

Before you start coding, here is the logic that you need to implement:

  1. Assign a start date and rolling window.
  2. Assign an end date based on the start date and window. If we set window = 24, we will be calculating contributions for the 24-month period between the start date and the end date.
  3. Use filter() to subset the original data. I labelled the subsetted data frame as returns_to_use.
  4. Assign a weights object called w based on the weights argument.
  5. Pass returns_to_use and w to component_contr_matrix_fun(), the function we created earlier. We are nesting one custom function inside another.
  6. After running component_contr_matrix_fun(), you will have an object called component_percentages. What is this? It is the risk contribution for each asset during the first 24-month window.
  7. Add a date to component_percentages with mutate(date = ymd(end_date)). We have the component contributions as of the end date.
  8. We now have the risk contributions for the 24-month period that started on the first date, or January 2013, and ended on the end date, January 2015, because we defaulted to start = 1. If we wanted to get the risk contribution for a 24-month period that started on the second date, or February 2013, we would set start = 2, and so forth.
  9. If we want risk contributions for all the 24-month periods, we need to apply this function starting at January, then February, then March, then April, all the way to the start date that is 24 months before the end of our data.

Now, let's turn those enumerated steps into a function.


interval_sd_by_hand <- 
  function(returns_df, 
           start = 1, 
           window = 24, 
           weights){
  
  # First create the start date. 
  start_date <- 
    returns_df$date[start]
  
  # Next create an end date that depends 
  # on the start date and window.
  end_date <- 
    returns_df$date[c(start + window)]
  
  # Filter on start and end date.
  returns_to_use <- 
    returns_df %>% 
    filter(date >= start_date & date <= end_date) %>% 
    select(-date)
  
  # Portfolio weights
  w <- weights

  # Call our original custom function.
  # We are nesting one function inside another.
  component_percentages <- 
    component_contr_matrix_fun(returns_to_use, w)
  
  # Add back the end date as the date column.
  results_with_date <- 
    component_percentages %>% 
    mutate(date = ymd(end_date)) %>%
    select(date, everything()) %>% 
    spread(asset, contribution) %>% 
    # Convert proportions to percentages
    mutate_if(is.numeric, function(x) x * 100)
  
  results_with_date
}

We can test our work and apply that function to a returns tibble.


test_interval_function_1 <- 
  interval_sd_by_hand(asset_returns_tibble, 
                 start = 1, 
                 window = 24, 
                 weights = w)


test_interval_function_1
date            AGG      EEM      EFA      IJS      SPY
<date>        <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
2015-01-31 1.157802 25.57255 29.22951 22.98519 21.05495

The function returns the contribution to risk for each asset from January 2013 through January 2015. The reason the date column is 2015-01-31 is that, as of that date, our results are the contributions over the previous 24 months of each asset.

What if we started at start = 2?


test_interval_function_2 <- 
  interval_sd_by_hand(asset_returns_tibble, 
                 start = 2, 
                 window = 24, 
                 weights = w) %>% 
      mutate_if(is.numeric, funs(round(., 3)))

test_interval_function_2
date         AGG    EEM    EFA    IJS    SPY
<date>     <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
2015-02-28 1.003 25.643 27.969 23.932 21.453

We have moved up by one month to February 2015.

You could keep doing this manually — moving up one date with each function call — and then paste together all of the results. Instead, we will use the purrr package to apply interval_sd_by_hand() iteratively and demonstrate the power of purrr alongside custom functions.

purrr contains a family of map functions, which apply another function iteratively, similar to a loop. Let's call map_df() to apply our rolling function iteratively, looping over the date index of our returns object. The _df suffix tells map() to store the results in the rows of a tibble.

map_df() needs five things: the sequence of starting date indices to loop over, the function to apply — interval_sd_by_hand() — and that function's remaining arguments: the returns, the weights, and the rolling window.


window <- 24

portfolio_vol_components_tidy_by_hand <- 
  # First argument: 
  # tell map_df to start at date index 1
  # This is the start argument to interval_sd_by_hand() 
  # and it is what map() will loop over until we tell 
  # it to stop at the date that is 24 months before the
  # last date.
  map_df(1:(nrow(asset_returns_tibble) - window),
         # Second argument: 
         # tell it to apply our rolling function
         interval_sd_by_hand, 
         # Third argument:
         # tell it to operate on our returns 
         returns_df = asset_returns_tibble,
         # Fourth argument: 
         # supply the weights
         weights = w, 
         # Fifth argument: 
         # supply the rolling window
         window = window)

tail(portfolio_vol_components_tidy_by_hand)
date               AGG      EEM      EFA      IJS      SPY
<date>           <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
2017-07-31 0.13271359 26.80095 26.41556 22.44355 24.20723
2017-08-31 0.18180307 26.58476 26.95346 21.79423 24.48574
2017-09-30 -0.03963607 25.58903 26.39014 23.87013 24.19033
2017-10-31 0.03207932 25.31378 25.67523 24.80619 24.17271
2017-11-30 0.08530671 26.72473 25.35333 25.72146 22.11517
2017-12-31 0.01379818 26.06585 25.20422 26.37297  22.34316

The result is the rolling component contribution to standard deviation of our five assets. That last date is December 31, 2017. The results for that date are the contributions to standard deviation for each asset over the preceding 24 months.

We did quite a bit of work to get that data: we ran a step-by-step calculation with matrix algebra and wrapped it into a function called component_contr_matrix_fun(), then created another function called interval_sd_by_hand() to make it rolling. Finally, we applied it with map_df(). Now you know why R coders spend so much time at white boards!

Let’s complete our task by visualizing these results.

Visualizing Rolling Component Contribution

We begin our aesthetic journey with ggplot(). Since your results are in wide format, you will first convert them to long format with gather(asset, contribution, -date) and then chart by asset.


portfolio_vol_components_tidy_by_hand %>% 
    gather(asset, contribution, -date) %>% 
    group_by(asset) %>%
    ggplot(aes(x = date)) +
    geom_line(aes(y = contribution, 
                  color = asset)) +
    scale_x_date(breaks = 
                   pretty_breaks(n = 8)) +
    scale_y_continuous(labels = 
                   function(x) paste0(x, "%"))

contribution-graph-3

This plot clearly conveys how the five assets have contributed over time. In particular, when one asset’s contribution surpasses another asset’s contribution, we can see where the lines cross.

That said, percentages that total 100 are very often visualized with a stacked chart in the financial world. You can build such a chart with geom_area(aes(colour = asset, fill= asset), position = 'stack'), a function that tells ggplot() to construct an area chart where each geom is stacked.


portfolio_vol_components_tidy_by_hand %>% 
    gather(asset, contribution, -date) %>% 
    group_by(asset) %>%
ggplot(aes(x = date, 
           y = contribution)) + 
geom_area(aes(colour = asset, 
              fill= asset), 
          position = 'stack') +
  scale_x_date(breaks = 
                   pretty_breaks(n = 8)) +
  scale_y_continuous(labels = 
                    function(x) paste0(x, "%"))

contribution-graph-4

ggplot() has stacked these in alphabetical order, and it is hard to see AGG’s contribution at all. Stacked charts are common in finance when time series sum to 100, but I find that they convey less information than line charts.

Visualizing with Highcharter

First, let's convert the tibble to an xts with tk_xts(date_var = date).


portfolio_vol_components_tidy_xts <- 
  portfolio_vol_components_tidy_by_hand %>% 
  tk_xts(date_var = date, 
         silent = TRUE)

Now, add each time series to the highchart() flow.

Note that we have tweaked the minimum and maximum y-axis values by adding hc_yAxis(..., max = max(portfolio_vol_components_tidy_xts) + 5, min = min(portfolio_vol_components_tidy_xts) - 5). That line of code sets the maximum y-axis value to the highest value in our data plus five, and the minimum to the lowest value in our data minus five. This is purely aesthetic.


highchart(type = "stock") %>% 
  hc_title(text = "Volatility Contribution") %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 1], 
                name = symbols[1]) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 2], 
                name = symbols[2]) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 3], 
                name = symbols[3]) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 4], 
                name = symbols[4]) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 5], 
                name = symbols[5]) %>%
  hc_yAxis(labels = list(format = "{value}%"), 
      max = max(portfolio_vol_components_tidy_xts) + 5,
      min = min(portfolio_vol_components_tidy_xts) - 5,
      opposite = FALSE) %>%
  hc_navigator(enabled = FALSE) %>% 
  hc_scrollbar(enabled = FALSE) %>% 
  hc_add_theme(hc_theme_flat()) %>%
  hc_exporting(enabled = TRUE) %>% 
  hc_legend(enabled = TRUE)

contribution-graph-5

This looks good! You can see clearly that the IJS has been contributing more to volatility recently.

To build a stacked chart using highcharter, you must first change to an area chart using hc_chart(type = "area") and then set hc_plotOptions(area = list(stacking = "percent"...).


highchart() %>% 
  hc_chart(type = "area") %>% 
  hc_title(text = "Volatility Contribution") %>%
  hc_plotOptions(area = list(
     stacking = "percent",
     lineColor = "#ffffff",
     lineWidth = 1,
     marker = list(
       lineWidth = 1,
       lineColor = "#ffffff"
       ))
     ) %>% 
  hc_add_series(portfolio_vol_components_tidy_xts[, 1], 
                name = names(portfolio_vol_components_tidy_xts[,1])) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 2], 
                name = names(portfolio_vol_components_tidy_xts[,2])) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 3], 
                name = names(portfolio_vol_components_tidy_xts[,3])) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 4], 
                name = names(portfolio_vol_components_tidy_xts[,4])) %>%
  hc_add_series(portfolio_vol_components_tidy_xts[, 5], 
                name = names(portfolio_vol_components_tidy_xts[,5])) %>%
  hc_yAxis(labels = list(format = "{value}%"),
      opposite = FALSE) %>%
  hc_xAxis(type = "datetime") %>%
  hc_tooltip(pointFormat = "{series.name}: {point.percentage:.2f}%", 
             shared = TRUE) %>% 
  hc_navigator(enabled = FALSE) %>% 
  hc_scrollbar(enabled = FALSE) %>% 
  hc_add_theme(hc_theme_flat()) %>% 
  hc_exporting(enabled = TRUE) %>% 
  hc_legend(enabled = TRUE)

contribution-graph-6

And there you go! That’s all for today. In the next installment, you will wrap all of this into a Shiny application. Thanks for reading.

Jonathan Regenstein

Jonathan is the Director of Financial Services at RStudio and the author of Reproducible Finance with R: Code Flows and Shiny Apps for Portfolio Management (CRC Press). He writes the Reproducible Finance blog series for RStudio and his code/apps can be seen at www.reproduciblefinance.com. Prior to joining RStudio, he worked at JP Morgan. He studied international relations at Harvard University and did graduate work in political economy at Emory University.