When comparing economic performance via time series, it is often necessary to rebase the series to a common base period.

Objective: Compare MA vs. CT Nominal GDP.

  setwd("~/workshops/create_indices")
  
  #++++++++++++++++++++++++++++++++++
  suppressMessages(suppressWarnings(library(quantmod)))  
  suppressMessages(suppressWarnings(library(dplyr)))
  suppressMessages(suppressWarnings(library(ggrepel)))
  suppressMessages(suppressWarnings(library(ggthemes)))
  suppressMessages(suppressWarnings(library(ggplot2)))

  remove(list = ls()) #Clear everything      
  options(digits = 3, scipen = 999)

In this exercise we will examine CT’s economic performance relative to MA, using nominal GDP.

First download the data from FRED; you can use quantmod or pdfetch. You need to know the FRED code for each series you want (here CTNQGSP and MANQGSP). The quarterly series available on FRED start in 2005:Q1 and end in 2019:Q1.

Connecticut GDP

        getSymbols("CTNQGSP", return.class = "xts", index.class = "Date", src = "FRED")
## [1] "CTNQGSP"
        gdp_q_CT =CTNQGSP
        #head(gdp_q_CT,3); tail(gdp_q_CT,3) # 2005Q1 to 2019Q1
        autoplot(gdp_q_CT)

Note that autoplot() returns a ggplot object, so you can refine its default look by adding ggplot2 layers such as labs() and a theme. The example below uses theme_classic(), which ships with ggplot2 and gives a clean look with axis lines and no gridlines.

        autoplot(gdp_q_CT) +
          labs(x = "Date",
               y = "CT GDP",
               caption = "(Source: CT GDP: FRED)",
               title = "CT Nominal GDP",
               subtitle = "2005:Q1 to 2019:Q1") +
          theme_classic()

Download MA data from FRED.

               getSymbols("MANQGSP", return.class = "xts", index.class = "Date", src = "FRED")
## [1] "MANQGSP"
               gdp_q_MA =MANQGSP
               #head(gdp_q_MA,3); tail(gdp_q_MA,3) # 2005Q1 to 2019Q1
               #autoplot(gdp_q_MA)
               
               autoplot(gdp_q_MA) +
                 labs(x = "Date",
                      y = "MA GDP",
                      caption = "(Source: MA GDP: FRED)",
                      title = "MA Nominal GDP",
                      subtitle = "2005:Q1 to 2019:Q1") +
                 theme_classic()

With two separate graphs it is hard to compare relative performance. Placing both series on one graph makes the comparison easier.

We create a data frame of the relevant series. First we convert each series to ts format and create a Date sequence to facilitate graphing.

The code below builds the data frame with cbind() and as.data.frame(). Because ggplot draws more easily from data in long form, the tidyr function gather() then reshapes the data frame from wide to long.

    # Create a data frame
    Date = seq(as.Date("2005-01-01"), as.Date("2019-01-01"), by = "1 quarter")
    gdp_q_CT = as.ts(gdp_q_CT) # Convert to ts
    gdp_q_MA = as.ts(gdp_q_MA) # Convert to ts
    # cbind() coerces Date to numeric (days since 1970-01-01)
    gdp_q_df = as.data.frame(cbind(Date, gdp_q_CT, gdp_q_MA))

    # Convert from wide to long
    gdp_df = gdp_q_df %>% tidyr::gather(key = key, value = value, -Date)
    #head(gdp_df,3); tail(gdp_df,3)

    ggplot(gdp_df, aes(x = as.Date(Date), y = value, col = key)) +
      geom_line() +
      labs(x = "Date",
           y = "Nominal GDP",
           caption = "(Source: CT & MA Nominal GDP: FRED)",
           title = "Nominal GDP, CT & MA",
           subtitle = "2005:Q1 to 2019:Q1") +
      theme_classic() +
      theme(legend.position = "none") +
      # Label each line at 2016-10-01, which is 17075 days since 1970-01-01
      geom_text(data = subset(gdp_df, Date == "17075"),
                aes(label = key, x = as.Date(Date), y = value),
                hjust = 2, vjust = 2, size = 4)
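The hard-coded "17075" in the label filter comes from the way cbind() handles dates. The following base R sketch, using a made-up three-quarter data frame (toy_df and its values are illustrative, not part of the exercise), shows the coercion and one more readable way to select the same row.

```r
# Toy dates: three quarters around the label date used above.
Date <- seq(as.Date("2016-07-01"), as.Date("2017-01-01"), by = "1 quarter")

# cbind() builds a numeric matrix, so the Date class is dropped and each
# date is stored as the number of days since 1970-01-01.
toy_df <- as.data.frame(cbind(Date, value = c(10, 20, 30)))
class(toy_df$Date)   # "numeric"
toy_df$Date          # 16983 17075 17167

# Passing the origin explicitly recovers the calendar date.
as.Date(17075, origin = "1970-01-01")   # "2016-10-01"

# A more readable way to pick the label row than a hard-coded number.
label_row <- subset(toy_df, Date == as.numeric(as.Date("2016-10-01")))
label_row$value      # 20
```

Writing the filter in terms of a calendar date makes it easier to move the labels to another quarter later.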

The difference in magnitudes between the two series masks how each state has grown over time. The comparison is easier when both series are expressed on a like-for-like basis.

This requires rebasing both series so that 2005:Q1 = 100.

    # Rebase each series so that the first observation (2005:Q1) equals 100
    gdp_indices = gdp_q_df %>%
      mutate(CT = gdp_q_CT*100/gdp_q_CT[1],
             MA = gdp_q_MA*100/gdp_q_MA[1]) %>%
      select(Date, CT, MA)

    # Convert from wide to long
    gdp_ind = gdp_indices %>% tidyr::gather(key = key, value = value, -Date)
    #gdp_ind

    ggplot(gdp_ind, aes(x = as.Date(Date), y = value, col = key)) +
      geom_line() +
      geom_hline(yintercept = 100, linetype = "dashed",
                 color = "blue", size = 0.25) +
      labs(x = "Date",
           y = "Index, 2005:Q1 = 100",
           caption = "(Source: CT & MA Nominal GDP: FRED)",
           title = "Nominal GDP, CT & MA",
           subtitle = "2005:Q1 to 2019:Q1") +
      theme_classic() +
      theme(legend.position = "none") +
      geom_text(data = subset(gdp_ind, Date == "17075"),
                aes(label = key, x = as.Date(Date), y = value),
                hjust = 4.5, vjust = 1, size = 6, fontface = 2)
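The rebasing arithmetic can be checked on a small made-up example. The helper rebase() below is hypothetical (it is not part of the script above) and the numbers are invented; it is only a sketch of the same calculation, index = 100 × level / level in the base period.

```r
# Hypothetical helper: rebase a numeric series so that the observation
# at position `base` equals 100.
rebase <- function(x, base = 1) {
  100 * x / x[base]
}

# Made-up levels for two regions of very different size.
small <- c(200, 210, 231)
large <- c(400, 440, 520)

rebase(small)   # 100.0 105.0 115.5
rebase(large)   # 100 110 130

# Choosing a different base period only rescales the index.
rebase(large, base = 2)
```

An index value of 130 reads directly as 30 percent cumulative growth since the base period, which is what makes the rebased lines comparable.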

####################################################################################################  

And there you have it. Once rebased, the series can provide a clearer visual of the relative performance of each state.

  #IN CLASS ASSIGNMENT: Add US Nominal GDP to the comparison graph.
  #

Download US GDP from FRED.

      getSymbols("GDP", return.class = "xts", index.class = "Date", src = "FRED")
## [1] "GDP"
      gdp_q_US =GDP
      #head(gdp_q_US,3); tail(gdp_q_US,3) # 2005Q1 to 2019Q1
      #autoplot(gdp_q_US)

The US series must be shortened to match the length of the CT and MA series.

      gdp_q_US = gdp_q_US["2005-01-01/2019-01-01"]  #this is how you shorten an XTS series
      
          gdp_q_US = as.ts(gdp_q_US) #Convert to ts
          gdp_q_df = as.data.frame(cbind(Date, gdp_q_CT, gdp_q_MA, gdp_q_US))
      
      # Convert from wide to long
      gdp_df = gdp_q_df %>% tidyr::gather(key = key, value = value, -Date)
      #head(gdp_df,3); tail(gdp_df,3)

      # This graph shows levels, not indices
      ggplot(gdp_df, aes(x = as.Date(Date), y = value, col = key)) +
        geom_line() +
        labs(x = "Date",
             y = "Nominal GDP",
             caption = "(Source: FRED)",
             title = "CT, MA & US Nominal GDP",
             subtitle = "2005:Q1 to 2019:Q1") +
        theme_classic() +
        theme(legend.position = "none") +
        geom_text(data = subset(gdp_df, Date == "17075"),
                  aes(label = key, x = as.Date(Date), y = value),
                  hjust = 2, vjust = 2)
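The block above trims the xts object with a date-range string before converting it to ts. If a series is already in ts form, base R's window() can do the same trimming. A minimal sketch on made-up data (toy_ts is illustrative only):

```r
# Made-up quarterly series running from 2004:Q1 to 2019:Q4 (64 quarters).
toy_ts <- ts(seq_len(64), start = c(2004, 1), frequency = 4)

# Keep 2005:Q1 through 2019:Q1 only.
trimmed <- window(toy_ts, start = c(2005, 1), end = c(2019, 1))

start(trimmed)    # 2005 1
end(trimmed)      # 2019 1
length(trimmed)   # 57
```

The trimmed length of 57 matches the length of the quarterly Date sequence from 2005-01-01 to 2019-01-01 used in the data frame.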

Again, the comparison is clearer once all three series are rebased to the same base period.

Note how it is possible to change the entire look and feel of the graph by changing the theme. It is also possible to alter individual elements of the graph.

Here I use theme_tufte() from ggthemes and then change the panel background to beige. Also note the use of geom_text_repel() from the package ggrepel; it pushes the labels away from each other and from the lines to avoid visual clutter.
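Rebasing also removes differences in units of measurement, which matters when series come from different tables. The sketch below uses invented numbers and simply assumes one series is reported in millions and the other in billions; the actual units of each FRED series should be checked on its FRED page.

```r
# Made-up example: the same growth path reported in two different units.
in_millions <- c(250000, 260000, 275000)
in_billions <- in_millions / 1000

# Rebase each series to 100 in its first period.
idx_millions <- 100 * in_millions / in_millions[1]
idx_billions <- 100 * in_billions / in_billions[1]

# The levels differ by a factor of 1000, but the indices coincide.
all.equal(idx_millions, idx_billions)   # TRUE
idx_millions                            # 100 104 110
```

Because the index is a ratio to the base period, any constant scaling factor cancels out.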

      gdp_indices = gdp_q_df %>%
        mutate(CT = gdp_q_CT*100/gdp_q_CT[1],
               MA = gdp_q_MA*100/gdp_q_MA[1],
               US = gdp_q_US*100/gdp_q_US[1]) %>%
        select(Date, CT, MA, US)

      gdp_ind = gdp_indices %>% tidyr::gather(key = key, value = value, -Date)
      #gdp_ind

      ggplot(gdp_ind, aes(x = as.Date(Date), y = value, col = key)) +
        geom_line() +
        geom_hline(yintercept = 100, linetype = "dashed",
                   color = "blue", size = 0.25) +
        labs(x = "Date",
             y = "Index, 2005:Q1 = 100",
             caption = "(Source: FRED)",
             title = "Nominal GDP, CT, MA & US",
             subtitle = "2005Q1 to 2019:Q1") +
        theme_tufte() +
        theme(legend.position = "none") +
        geom_text_repel(data = subset(gdp_ind, Date == "17075"),
                        aes(label = key, x = as.Date(Date), y = value),
                        hjust = -1, vjust = -1, size = 6, fontface = 2) +
        theme(
          panel.background = element_rect(fill = "beige",
                                          colour = "lightblue",
                                          size = 0.5, linetype = "solid"),
          panel.grid.major = element_line(size = 0.5, linetype = 'solid',
                                          colour = "white"),
          panel.grid.minor = element_line(size = 0.25, linetype = 'solid',
                                          colour = "white")
        )

      #===============================================================================================================#

QED AMDG