NYC Hotel Analysis

by lksfr

NYC Hotel Analysis/nyc_hospitality.ipynb

# turning off warnings 
options(warn=-1)

Loading Required Libraries

library(tidyverse)
library(readxl)
library(gganimate)
library(zoo)
library(ggimage)

Reading in Data

First, I read in the data I found on NYC & Company's website:

occ_adr <- read_excel("hotel_reports.xlsx", sheet="occ_adr")
demand <- read_excel("hotel_reports.xlsx", sheet="demand")
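Before reading by sheet name, it can help to confirm which sheets a workbook actually contains. The sketch below is only an illustration: it uses the example workbook that ships with readxl as a stand-in, since hotel_reports.xlsx itself is not included here.

```r
library(readxl)

# Example workbook bundled with readxl (a stand-in for hotel_reports.xlsx)
path <- readxl_example("datasets.xlsx")

# List the sheet names before reading anything
sheets <- excel_sheets(path)
sheets

# Read one sheet by name, in the same way as the calls above
first_sheet <- read_excel(path, sheet = sheets[1])
str(first_sheet)
```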

Data Cleaning

The data was pretty clean to begin with but I still had to make some adjustments:

# checking data types 
str(occ_adr)
str(demand)

# converting Occ to numeric, then rounding every numeric column to 3 decimals
occ_adr <- occ_adr %>%
  mutate(Occ = as.numeric(Occ)) %>%
  mutate_if(is.numeric, round, 3)
# transforming the month column into a factor ordered by calendar month
# (month.name is base R's built-in vector "January", ..., "December")
occ_adr$Month <- factor(occ_adr$Month, levels = month.name)
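To show why the explicit factor levels matter, here is a small sketch with made-up month labels. It also shows one way a numeric month index could be derived; the plots below use a "Month2" column that I assume is such an index already present in the spreadsheet, so the last step is purely illustrative.

```r
# Toy month labels in arbitrary order (made-up values, not the hotel data)
months <- c("March", "January", "December", "February")

# Without explicit levels, the factor levels are sorted alphabetically
alphabetical <- factor(months)
levels(alphabetical)

# With levels = month.name, the levels follow the calendar
calendar <- factor(months, levels = month.name)
sort(calendar)

# A numeric month index (1-12) falls out of the factor codes
month_index <- as.integer(calendar)
month_index
```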

Dynamic Visualizations

I used gganimate for all dynamic visualizations in the article. It is built on top of ggplot2 and therefore really easy to get used to if you have used ggplot2 before. I'll go through each of the plots from the article and share my code along with comments.

Hotel Room Nights Sold Plot

Since I wanted to start out with the hotel room nights sold, I decided to go with a line plot. I used the transition_reveal() function to animate it. This function lets the data gradually appear along a specified dimension; in this case, months 1-12 on the x-axis. The plot starts out in regular ggplot2 fashion: I used the data from the "demand" tibble, with the "Month2" variable on the x-axis and the "Demand" variable on the y-axis. I grouped the data by year and colored the lines according to the "Year" variable. I then added the line geom and scale_color_viridis_d(), which applies the discrete viridis color palette. After that, I added labels for the x- and y-axes along with a centered title before using transition_reveal() to animate this static ggplot2 plot.

p <- ggplot(
     demand,
     aes(Month2, Demand, group = Year, color = factor(Year))
     ) +
     geom_line() +
     scale_color_viridis_d() +
     labs(x = "Month", y = "Millions of Stays", color = "Year") +
     ggtitle("Hotel Room Nights Sold in NYC 2015-2019") +
     scale_x_continuous(breaks = 1:12,
                        labels=c("Jan", "Feb", "Mar", "Apr",
                                 "May", "June", "July", "Aug",
                                 "Sep", "Oct", "Nov", "Dec")
     ) +
     # theme_light() is a complete theme and resets earlier theme() tweaks,
     # so it has to come before the theme() call
     theme_light() +
     theme(legend.position = "top", plot.title = element_text(hjust = 0.5),
           axis.text=element_text(size=11)) +
     transition_reveal(Month2)

In order to render this plot, I used the following:

animate(p, height=500, width=500)
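To make the idea of "revealing along a dimension" more concrete, here is a rough base R sketch with made-up numbers. It only mimics the concept (each frame shows the rows up to the current position on the x-axis); gganimate also interpolates between data points, so this is not how the package is implemented. The helper visible_at() is a hypothetical name used for illustration.

```r
# Made-up monthly values for a single year (not the NYC & Company data)
toy <- data.frame(Month2 = 1:6, Demand = c(2.4, 2.5, 2.9, 3.0, 3.2, 3.1))

# Hypothetical helper: rows visible once the animation has reached `frame_at`
visible_at <- function(data, frame_at) {
  data[data$Month2 <= frame_at, ]
}

# The line grows as the frame position advances along Month2
for (frame_at in c(1, 3.5, 6)) {
  shown <- visible_at(toy, frame_at)
  cat("frame at", frame_at, "-> months shown:", shown$Month2, "\n")
}
```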

Average Daily Rate Plot

The ADR plot is structured slightly differently from the hotel room nights sold plot. First, I decided to create a lollipop plot, which normally combines the point and segment geoms; here the points are drawn as emoji with geom_emoji() from ggimage. I also display the month the plot is currently showing by using frame_time in the plot title. Lastly, I used the transition_time() function to animate the plot. This function lets you transition through distinct points in time (months in this case). ease_aes('linear') controls how values change between these points in time; 'linear' moves them at a constant rate.

p2 <- ggplot(occ_adr,
      aes(x=Year, y=ADR, color=Year))  +
      geom_emoji(aes(image='1f4b2')) +
      geom_segment(aes(
        y = 190,
        x = Year,
        yend = ADR,
        xend=Year)
      ) +
      coord_flip() +
      theme(legend.position = "none", axis.text=element_text(size=12)) +
      labs(title='Month: {floor(frame_time)}', y='Average Daily Rate', x='Year') +
      transition_time(Month2) +
      ease_aes('linear')

Again, to animate the plot:

animate(p2, height=500, width=500)
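The floor() in the title is there because frame_time is not always a whole month. As a rough illustration, assuming the frames are spread evenly over months 1-12 and that animate() renders its default of 100 frames, the values look something like this:

```r
# Assumed frame count for illustration (animate() defaults to 100 frames)
n_frames <- 100

# Evenly spaced frame positions over the time range, so most frames
# fall between two whole months
frame_time <- seq(1, 12, length.out = n_frames)
head(frame_time, 4)

# floor() maps every in-between frame back to the month it started from
month_shown <- floor(frame_time)
table(month_shown)

# The same index could pick a month name instead of a number
month.name[month_shown[c(1, 50, 100)]]
```

Swapping the title for something like 'Month: {month.name[floor(frame_time)]}' would be one way to show names rather than numbers, though I have not tested that in the plot above.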

Occupancy Plot

The occupancy plot is very similar to the hotel room nights sold plot. It reuses the structure of that first line plot and again relies on transition_reveal() to let the lines gradually appear; the main addition is a percent-formatted y-axis.

p3 <- ggplot(
      occ_adr,
      aes(Month2, Occ, group = Year, color = factor(Year))
      ) +
      geom_line() +
      scale_color_viridis_d() +
      labs(x = "Month", y = "Avg. % Occupancy", color = "Year") +
      ggtitle("Average Occupancy NYC Hotels 2015-2019") +
      # format the y-axis as percentages, rounded to the nearest 5%
      scale_y_continuous(labels = scales::percent_format(accuracy = 5L)) +
      scale_x_continuous(breaks = 1:12,
                         labels=c("Jan", "Feb", "Mar", "Apr", "May",
                                  "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec")
      ) +
      # theme_light() is a complete theme and resets earlier theme() tweaks,
      # so it has to come before the theme() call
      theme_light() +
      theme(legend.position = "top", plot.title = element_text(hjust = 0.5),
            axis.text = element_text(size = 12)) +
      transition_reveal(Month2)
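The accuracy argument of percent_format() in the y scale sets the rounding step of the axis labels. Here is a small sketch with made-up values, assuming occupancy is stored as a fraction (for example 0.85 rather than 85), which is what percent_format() expects by default:

```r
library(scales)

# Made-up occupancy shares stored as fractions (not the real data)
occ <- c(0.853, 0.912)

# percent_format() returns a labelling function; accuracy sets the rounding step
label_nearest_5 <- percent_format(accuracy = 5L)
label_nearest_1 <- percent_format(accuracy = 1)

# Labels rounded to the nearest 5% versus the nearest 1%
label_nearest_5(occ)
label_nearest_1(occ)
```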

Animating the plot:

animate(p3, height=500, width=500)
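animate() also accepts arguments that control the length and smoothness of the result, and anim_save() writes a rendered animation to disk. The sketch below uses made-up data and a placeholder file name, and assumes the gifski package is installed for GIF output; the frame settings are arbitrary choices, not the ones used for the article.

```r
library(ggplot2)
library(gganimate)

# Made-up data standing in for the hotel tibbles
toy <- data.frame(Month2 = rep(1:12, 2),
                  Year = rep(c(2018, 2019), each = 12),
                  Value = c(seq(80, 91, 1), seq(82, 93, 1)))

p_toy <- ggplot(toy, aes(Month2, Value, group = Year, color = factor(Year))) +
  geom_line() +
  transition_reveal(Month2)

# nframes and fps control length and smoothness; end_pause holds the last frame
anim <- animate(p_toy, nframes = 60, fps = 10, end_pause = 10,
                height = 500, width = 500, renderer = gifski_renderer())

# Write the rendered animation to disk (the file name is a placeholder)
anim_save("toy_animation.gif", animation = anim)
```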