NYC Hotel Analysis/nyc_hospitality.ipynb
# turning off warnings
options(warn=-1)
library(tidyverse)
library(readxl)
library(gganimate)
library(zoo)
library(ggimage)
First, I read in the data I found on NYC & Company's website:
occ_adr <- read_excel("hotel_reports.xlsx", sheet="occ_adr")
demand <- read_excel("hotel_reports.xlsx", sheet="demand")
The data was pretty clean to begin with, but I still had to make some adjustments:
# checking data types
str(occ_adr)
str(demand)
# converting Occ to numeric, then rounding every numeric column to three decimals
occ_adr <- occ_adr %>%
  mutate(Occ = as.numeric(Occ)) %>%
  mutate(across(where(is.numeric), ~ round(.x, 3)))
# transforming the month column into a factor ordered January to December
# (month.name is base R's built-in vector of English month names)
occ_adr$Month <- factor(occ_adr$Month, levels = month.name)
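The plots below map a numeric `Month2` column to the x-axis. The code above never creates it, so it presumably comes straight from the spreadsheet. If it were missing, a minimal sketch for deriving it from the `Month` factor could look like this (the toy data frame and its values are made up for illustration):

```r
# toy stand-in for occ_adr; the values are invented for illustration only
toy <- data.frame(
  Year  = c(2015, 2015, 2015),
  Month = c("March", "January", "December"),
  stringsAsFactors = FALSE
)

# order the months chronologically rather than alphabetically
toy$Month <- factor(toy$Month, levels = month.name)

# the integer code of each factor level is its month number (1-12)
toy$Month2 <- as.integer(toy$Month)

toy
```

This only works because the factor levels are set in calendar order first; on a plain character column the month numbers would not be recoverable this way.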
I used gganimate for all dynamic visualizations in the article. It is built on top of ggplot2 and therefore really easy to get used to if you have used ggplot2 before. I'll go through each of the plots from the article and share my code along with comments.
As I wanted to start out with the hotel room nights sold, I decided to go with a line plot. I used the transition_reveal() function to animate the line plot. This function lets the data gradually appear along a specified dimension. In this case, I wanted to use months 1-12 on the x-axis. The code starts out in regular ggplot2 fashion: I used the data from the "demand" tibble and put the "Month2" variable on the x-axis and the "Demand" variable on the y-axis. I grouped the data by year and colored my lines according to the "Year" variable. I then added the line geom and "scale_color_viridis_d()", which applies the discrete viridis color palette. After that, I added labels for the x- and y-axis along with a centered title before using the transition_reveal() function to animate this static ggplot2 plot.
p <- ggplot(
  demand,
  aes(Month2, Demand, group = Year, color = factor(Year))
) +
  geom_line() +
  scale_color_viridis_d() +
  scale_x_continuous(
    breaks = 1:12,
    labels = c("Jan", "Feb", "Mar", "Apr",
               "May", "June", "July", "Aug",
               "Sep", "Oct", "Nov", "Dec")
  ) +
  labs(
    title = "Hotel Room Nights Sold in NYC 2015-2019",
    x = "Month", y = "Millions of Stays", color = "Year"
  ) +
  # theme_light() is a complete theme, so it goes before the theme() tweaks;
  # added afterwards it would reset the legend position and title alignment
  theme_light() +
  theme(
    legend.position = "top",
    plot.title = element_text(hjust = 0.5),
    axis.text = element_text(size = 11)
  ) +
  transition_reveal(Month2)
In order to render this plot, I used the following:
animate(p, height=500, width=500)
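animate() also accepts arguments that control the length and smoothness of the animation, and anim_save() writes the result to disk. The following is a sketch on made-up data, not the settings used for the article: the frame count, frame rate and output file are arbitrary choices, and it assumes the gifski package is installed for GIF output.

```r
library(ggplot2)
library(gganimate)

# made-up data: two groups of twelve monthly values
toy <- data.frame(
  month = rep(1:12, times = 2),
  value = c(1:12, (1:12) * 1.5),
  year  = factor(rep(c("A", "B"), each = 12))
)

toy_plot <- ggplot(toy, aes(month, value, group = year, color = year)) +
  geom_line() +
  transition_reveal(month)

# nframes sets how many frames are drawn, fps how fast they play back
toy_anim <- animate(
  toy_plot,
  nframes = 24, fps = 8,
  height = 300, width = 300,
  renderer = gifski_renderer()
)

# write the rendered animation to a temporary GIF file
out_file <- tempfile(fileext = ".gif")
anim_save(out_file, animation = toy_anim)
```

With 24 frames at 8 frames per second the animation runs for three seconds; more frames give a smoother reveal at the cost of a longer render.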
The ADR plot is structured slightly differently from the hotel room nights sold plot. First, I decided to create a lollipop plot built from a segment geom and, in place of the usual point geom, ggimage's geom_emoji(). I also display the month the plot is currently showing by using frame_time in the title of the plot. Lastly, I used the transition_time() function to animate the plot. This function lets you transition through distinct points in time (months in this case). ease_aes('linear') describes how values change between these points in time: at a constant rate.
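p2 <- ggplot(occ_adr, aes(x = Year, y = ADR, color = Year)) +
  # emoji marker (code point 1f4b2, a dollar sign) at each year's ADR
  geom_emoji(aes(image = '1f4b2')) +
  # lollipop stem from the baseline of 190 up to the ADR value
  geom_segment(aes(
    y = 190,
    x = Year,
    yend = ADR,
    xend = Year)
  ) +
  coord_flip() +
  theme(legend.position = "none", axis.text = element_text(size = 12)) +
  # frame_time is fractional between months, hence floor()
  labs(title = 'Month: {floor(frame_time)}', y = 'Average Daily Rate', x = 'Year') +
  transition_time(Month2) +
  ease_aes('linear')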
Again, to animate the plot:
animate(p2, height=500, width=500)
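The floor() in the title is there because transition_time() interpolates between the months, so frame_time takes fractional values in most frames. The sketch below mimics that with evenly spaced values between 1 and 12 (an approximation of what gganimate does internally, with 100 frames assumed) and shows how the labels come out. It also shows how month abbreviations could be looked up instead of numbers.

```r
# approximate the time value of each frame: 100 frames spread over months 1-12
frame_times <- seq(1, 12, length.out = 100)

# without floor() the title would show fractional values such as 1.111111
head(frame_times, 3)

# floor() maps every frame to the month it currently belongs to
month_number <- floor(frame_times)

# base R's month.abb turns the month number into a readable label
month_label <- month.abb[month_number]

head(month_label, 3)
tail(month_label, 3)
```

Assuming the title is evaluated like any other R expression, a label such as 'Month: {month.abb[floor(frame_time)]}' should give the same result inside the plot.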
The occupancy plot is very similar to the hotel room nights sold plot. I reused the structure of that first plot and again relied on the transition_reveal() function to let the lines gradually appear. The only notable addition is a percent format for the y-axis labels.
p3 <- ggplot(
  occ_adr,
  aes(Month2, Occ, group = Year, color = factor(Year))
) +
  geom_line() +
  scale_color_viridis_d() +
  # show the occupancy proportions as percentages
  scale_y_continuous(labels = scales::percent_format(accuracy = 5)) +
  scale_x_continuous(
    breaks = 1:12,
    labels = c("Jan", "Feb", "Mar", "Apr", "May",
               "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec")
  ) +
  labs(
    title = "Average Occupancy NYC Hotels 2015-2019",
    x = "Month", y = "Avg. % Occupancy", color = "Year"
  ) +
  # theme_light() is a complete theme, so it goes before the theme() tweaks;
  # added afterwards it would reset the legend position and title alignment
  theme_light() +
  theme(
    legend.position = "top",
    plot.title = element_text(hjust = 0.5),
    axis.text = element_text(size = 12)
  ) +
  transition_reveal(Month2)
Animating the plot:
animate(p3, height=500, width=500)
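The accuracy argument of scales::percent_format() in the occupancy plot only affects how the axis labels are rounded, not the plotted values. A small sketch with two made-up occupancy proportions shows the difference between rounding to the nearest 5 and to the nearest 1 percentage point:

```r
library(scales)

# two made-up occupancy values, stored as proportions
occ <- c(0.873, 0.912)

# accuracy = 5 rounds the label to the nearest 5 percentage points
coarse <- percent_format(accuracy = 5)(occ)

# accuracy = 1 rounds the label to whole percentage points
fine <- percent_format(accuracy = 1)(occ)

coarse  # "85%" "90%"
fine    # "87%" "91%"
```

A coarse accuracy reads cleanly when the axis breaks fall on multiples of five; for breaks in between, a finer accuracy avoids labels that misstate the break position.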