library(haven)
library(ggplot2)
library(RCurl)
library(dplyr)
library(patchwork)
library(hrbrthemes)
library(astsa)
library(Rwave)
library(lattice)
library(zoo)
library(forecast)
library (Matrix)
library (matrixcalc)
library(readr)
This study is part of a larger research project selected to the Nova Scotia COVID-19 Research Coalition and funded by the Research Nova Scotia.
As the number of COVID-19 cases increases globally, governments and authorities have continued to use mobility restrictions that were, and still are, the only effective tool to control for the viral transmission. Yet, the relationship between public orders and behavioral parameters of social distancing observed in the community is a complex process whose effects on the spread are not well understood. The evidence shows that adherence to public orders about the social distancing is not stable and fluctuates with degree of spatial differences in information and the level of risk aversion. This study aims to uncover these mobility dynamics and their effect on the COVID-19 spread in three major cities in North America. Without accounting for this dynamic structure, a naive calculation of correlations with any level of lagged mobility variations shows a strong negative relationship: as the mobility goes down, cases go up.
We’ve developed and trained a particular nonparametric model, which is called Dynamic Functional Connectivity in neuroscience, that better captures the changing and time-sensitive effects of mobility restrictions on the viral spread to answer the following two related questions:
Our study compares three major cities: Montreal, Toronto, and New York City. We use test positivity rates (PR) that reflect the spread as well as the Facebook mobility data called “all_day_bing_tiles_visited_relative_change”, which measures positive or negative changes in movement relative to baseline in those three cities.
There are three time-varying metrics that we develop to measure the effects of social mobility on the spread:
Our preliminary study provides a unique methodological framework that can be easily used to understand the differences in the efficacy of mobility related local public health policies in fighting the SARS-CoV-2 transmission. Our findings reveal that the relationship between mobility and the spread is highly unstable even over short time intervals and, after controlling for it, the results show that,
The correlations capture the degree of relatedness between PR and mobility but cannot reveal the responsiveness of PR or so-called “elasticity” that refers to the degree to which the relatedness between PR and mobility is utilized. Our results show that:
In summary, using a counter-factual simulation, we found that Montreal has a significantly lower public sensitivity to COVID-19. Moreover, the level of restrictions and their timing are insufficient in terms of their magnitude and speed in Montreal relative to other cities.
This preliminary work reveals the codes and data sources with their results at the same time. Please refer to the main paper for more details. We thank Roberto Rocha from CBC who has inspired us for this topic and for providing us with the data for Montreal.
The data on Montreal is from INSPQ and obtained by using an unofficial API that powers the INSPQ’s own dashboard. Otherwise the data is not publicly available.
df <- dfmontr_v1[dfmontr_v1$date > "2020-02-29", ]
df <- df[-nrow(df),] #2021-02-03 is zero
# PR
df$psi_quo_pos_t <- as.numeric(df$psi_quo_pos_t)
pr <- df$psi_quo_pos_t #PR
prdf <- data.frame(date = df$date, days = 1:length(pr), pr = pr)
# Mobility data
mobdf <- read_dta("movement-range-data-2021-02-06/montrealmob.dta")[]
mob1 <- mobdf[1:nrow(df),1:2]
workingdf <- data.frame(Date = df$date, day = 1:nrow(df), dpr = prdf$pr, mob1$all_day_bing_tiles_visited_relat)
names(workingdf)[4] <- "mob1"
pdata <- data.frame(mob = workingdf$mob1, pr = workingdf$dpr,
Date = as.Date("2020-03-01") + 1:(nrow(workingdf)))
coeff <- 20
# A few constants
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=pr), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=mob*coeff), size=0.5, color=prcolor) +
scale_y_continuous(
# Features of the first axis
name = "Positivity Rates",
# Add a second axis and specify its features
sec.axis = sec_axis(~./coeff, name="Mobility Changes")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
ggtitle("Mobility restrictions vs PR - Montreal")
It’s clear from these plots that the early numbers are not usable due to very low tests numbers, hence high PR. We will ignore the \(1^{st}\) wave and look at the epi curve after June 08, 2020 in each city. Can we use PR? The main issue is that the PR numbers would be misleading when the testing is not random. Jeffrey E. Harris discusses this issue in his blog: As an indicator of the adequacy of testing, the TPR appears to have little or no informative value. When it comes to gauging the severity of the epidemic, the TPR might serve as a proxy for the case incidence rate. In the comment section, Raphael Thomadsen states the following: (…) that on a day-to-day level the positivity rate is a really good indicator about how COVID is changing day-by-day. It seems then, that it depends on what your goal is, but for short-run analysis it looks like the positivity rate may be a fantastic predictor. We discuss this issue later in details, specifically whether simulated infection rates based on SIR models or TPR data should be used in our analysis.
Here is a naive way of looking at the relationship by cross correlations with and without first-differencing:
x <- workingdf$mob1
y <- workingdf$dpr
lag2.plot(x, y, 15)
lag2.plot(diff(x), diff(y), 15)
Cross correlations use the whole data with varying lags up to 21. As it’s clear from the plots, the correlation between mobility and PR is negative for all lags. In other words, as mobility goes down, the spread rises.
In order to calculate the dynamic nature of this relationship, we develop a trainable nonparameteric approach based on the Dynamic Functional Connectivity (DFC) method. However, despite the fact that DFC has been long used, there are still considerable technical issues associated with the approach.
A great effort has recently been dedicated to investigate how the size of sliding windows affects DFC estimations. For example, a very recent review on DFC and how it is used in neuroscience is provided by Hutchison et al. (2013). They show that analogous to a moving average function, a sliding window analysis computes a succession of pairwise correlation matrices using the time series from a given parcellation of brain regions. However, despite the growing success of this methodology in neuroimaging, the sliding window technique has multiple parameters such as window function, length, and step size that must be set. But the appropriate settings remain unknown due to lack of “ground truth”, for example, in resting state fMRI data (Mokhtari et al. 2019).
Unlike the applications of DFC in neuroimaging, we have an unquestionable epidemiological “ground truth” indicating that the spread of an infectious disease must be positively related to the intensity of contacts in a population, which is the core idea in the so-called SIR models. However, the problem of “window-size” still remains as a main challenge as identified in neuroscience: increasing a window length results in decreasing the sensitivity for identifying fast changes with very long windows eventually measuring static connectivity. On the other hand, shorter windows can increase sensitivity for detecting short transition states but at the expense of increasing the spurious fluctuations in the dynamic connectivity Leonardi and Van De Ville, 2015.
Thus, it is essential to determine a window length that allows reducing spurious fluctuations and at the same time capturing faster dynamic correlations. One of the most suggested method to address spurious fluctuations in DFC is to estimate a method when the window length is not shorter than the largest wavelength present in both series. This length is about 7 days in the positivity rates and mobility series, which is also verified in the literature as “weekend effect”.
We use this method with its recommended applications in neuroscience and create a grid search that finds the optimal lag to maximize the positive correlations in 7-day sliding windows, which will be justified below.
Most approaches to finding periodic behavior assume that the underlying series are stationary. For nonstationary series, wavelets have been developed to summarize the variation in frequency composition through time. In our case, both series, PR and mobility, are nonstationary. Wavelets allow us to study localized periodic behavior. In particular, we look for regions of high-power in the frequency-time plot.
x <- as.data.frame(workingdf$dpr)
x <- x[complete.cases(x[,1]),]
x <- x[-c(1:100)]
#from https://ms.mcmaster.ca/~bolker/eeid/2010/Ecology/
mk.cwt = function(thisz, noctave, nvoice, moments=30, smooth=T, smoothsd=1, filtradius=3, do.center=F) {
ret = cwtTh(thisz, noctave=noctave,
nvoice=nvoice, moments=moments, plot=F)
if (do.center) {
ret = ret/sum(ret)
}
return(list(cwt=ret, noctave=noctave, nvoice=nvoice, moments=moments))
}
plot.cwt = function(tmp, mod=T, ylab='period', xlab='', main='', cex=1.2, center=T) {
if (mod) {
tmp$cwt = Mod(tmp$cwt)
}
if (center) {
tmp$cwt = tmp$cwt/sum(tmp$cwt)
}
breaks.at = pretty( range(tmp$cwt), n=100)
my.pal = rev(rainbow(length(breaks.at) , start=0, end=4/6))
myplot = levelplot(tmp$cwt, pretty=F,
scales=list(
y=list(tick.number=tmp$noctave, labels=format(2^(1+seq(from=0,by=tmp$nvoice,
to=tmp$noctave*tmp$nvoice)/tmp$nvoice))),
cex=cex
),
ylab=list(label=ylab, cex=cex),
xlab=list(label=xlab, cex=cex),
colorkey=list(labels=list(cex=cex)),
aspect='fill',
main=main,
col.regions=my.pal,
cuts=length(my.pal)-1)
return(myplot)
}
tmp<-mk.cwt(x,noctave = floor(log2(length(x)))-1,nvoice=10)
plot.cwt(tmp,xlab="time (units of sampling interval)",
main = "Wavelet of PR after Day 7")
The intensity of the colormap represents the variance of the time series that is associated with particular frequencies (y-axis) through time (x-axis). Our wavelet analysis is able to detect frequencies that are localized in time, and therefore if the dominant period of a time series changes over time, wavelets can be used to detect this transition. The map shows that around days 7 and 8, the \(2^{nd}\) wave shows a dominant variation. We ignore the higher variations around day 100 which captures the increasing variations during the \(1^{th}\) and \(2^{nd}\) waves of the epidemic. This is also captured by the spectral analysis applied on the first-differenced PR and mobility series after Day 100, which indicates the same frequency, 7 days, in both series.
#http://web.stanford.edu/class/earthsys214/notes/series.html
dspect <- spectrum(diff(x[-c(1:100)]), log="no", spans=c(2,2), plot=FALSE)
specx <- 1/dspect$freq
specy <- 2*dspect$spec
plot(specx, specy, xlab="Period (days)", ylab="Spectral Density", type="l",
xlim = c(0, 40), main = "Spectral Density of PR",
col = "blue", lwd = 2)
text(15, 10 , paste0("Maximum denisty at Day ", specx[which.max(specy)]))
dspect <- spectrum(diff(y[-c(1:100)]), log="no", spans=c(2,2), plot=FALSE)
specx <- 1/dspect$freq
specy <- 2*dspect$spec
plot(specx, specy, xlab="Period (days)", ylab="Spectral Density", type="l",
xlim = c(0, 40), main = "Spectral Density of Mobility",
col = "red", lwd = 2)
text(15, 10 , paste0("Maximum denisty at Day ", specx[which.max(specy)]))
The following plot is based on a static connectivity with a 2-day delay, which maximizes the positive correlation, the “grand truth”. It verifies the fact that a common lag should not be applied to the whole epidemiological curve, in which the relationship between two series changes over time. Instead, the delay between mobility and PR has to be tuned in each sliding window separately to see which day(s) in the past would have the maximum positive impact.
#Correlations for 7-day sliding window 1:21 lags
lag = 1:21
w = 7:48
grid <- as.matrix(expand.grid(lag, w))
rol <- c()
for(i in 1:nrow(grid)) {
montl <- workingdf[2:nrow(workingdf), ] %>%
mutate(dprL = lead(dpr, grid[i,1]))
montll <- montl[complete.cases(montl), 4:5]
tmp <- rollapply(montll, width=grid[i,2], function(x) cor(x[,1],x[,2]),
by.column=FALSE)
sub <- tmp[100:length(tmp)]
rol[i] <- mean(sub)
}
#lag that gives the maximum correlations
opt <- grid[which(rol == max(rol)), ]
montl <- workingdf[2:nrow(workingdf), ] %>%
mutate(dprL = lead(dpr, opt[1]))
montll <- montl[complete.cases(montl), 4:5]
roll <- rollapply(montll, width=opt[2], function(x) cor(x[,1],x[,2]),
by.column=FALSE)
pdata <- data.frame(roll = roll,
Date = as.Date("2020-03-02") + 1:length(roll),
pr = workingdf$dpr[1:length(roll)])
coeff <- 0.05
# A few constants
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr*coeff), size=0.5, color=prcolor) +
scale_y_continuous(
name = "Correlations", limits = c(-1, 1.3),
sec.axis = sec_axis(~./coeff, name="Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
ggtitle("Rolling Correlations - Montreal",
subtitle = "2-day Delay and 7-day Sliding Window")
lag = 1:21
w = 7
mco <- c()
lagg <- c()
mdf <- workingdf[2:nrow(workingdf), ]
n <- nrow(mdf)-w
unrot <- vector(mode = "list", length = n)
for(j in 1:n){
co <-c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), 4:5]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
co[i] <- cor(mdf3[,1], mdf3[,2])
unrot[[j]][[i]] <- cbind(ndiffs(mdf3[,1]), ndiffs(mdf3[,2]))
}
mco[j] <- max(co, na.rm = TRUE)
lagg[j] <- lag[which.max(co)]
}
roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]
pdata <- data.frame(roll = roll[100:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-100),
pr = workingdf$dpr[100:length(roll)])
saveRDS(pdata, file = "pdata_mtr.RDS")
coeff <-11
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Correaltion", limits = c(0,1.3),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.85,
label= paste0("Average Correlation = ",
round(mean(pdata$roll),2))) +
ggtitle("Time-varying relationship between Mobility & PR",
subtitle = "Montreal (after June 08)")
As explained before, we ignore the first 100 days and start on June 8 due to misleading PR values. The graph above shows the nonparametric estimation of rolling correlations (red) between mobility and PR. The average is very high around 77%. We will compare this correlation with Toronto and NYC later to see how effective the similar policies in two different big cities.
When time series data are used, the cross-correlation can be impacted by a possible dependence within-series. Although our use of very short window spans and that we only calculate empirical correlations, in many cases the within-series dependence should be removed first. Otherwise, the spurious correlation would be a problem where series may appear correlated, but the correlation itself would be meaningless. We test the mobility and PR series in every rolling window at the optimal lag corresponding to that window to see the presence of nonstationarity. Although very short intervals make the reliability of available tests questionable, our results show that few windows show nonstationarity.
However, to address this possible problem, we take daily growth rates of both series. The results presented here show that the correlations we calculated earlier are not spurious.
lag = 1:21
w = 7
mco <- c()
lagg <- c()
mdf <- workingdf[9:nrow(workingdf), ]
mdf$dpr <- diff(mdf$dpr)/mdf$dpr
mdf$mob1 <- diff(mdf$mob1)/mdf$mob1
n <- nrow(mdf)-w
for(j in 1:n){
co <-c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), 4:5]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
co[i] <- cor(mdf3[,1], mdf3[,2])
}
mco[j] <- max(co, na.rm = TRUE)
lagg[j] <- lag[which.max(co)]
}
rollg <- mco[-length(mco)]
lagag <- lagg[-length(lagg)]
pdata <- data.frame(roll = rollg[100:length(rollg)],
Date = as.Date("2020-06-08") + 0:(length(rollg)-100),
pr = workingdf$dpr[100:length(rollg)])
coeff <-11
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Correaltion", limits = c(0,1.3),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.85,
label= paste0("Average Correlation = ",
round(mean(pdata$roll),2))) +
ggtitle("Time-varying relationship between Mobility & PR",
subtitle = "Montreal (with first-differenced after July 08)")
Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.
pdata <- data.frame(laga = laga[100:length(laga)],
Date = as.Date("2020-06-08") + 0:(length(roll)-100),
pr = workingdf$dpr[100:length(laga)])
coeff <- 1.6
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=laga), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr*coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Delays - Day",
sec.axis = sec_axis(~./coeff, name="Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$laga), slope = 0, linetype=3,
color = "red") +
annotate("text", x= as.Date("2020-07-20"), y=11, col = "darkred",
label= paste0("Mean Delay = ",
round(mean(pdata$laga),2),
" days")) +
ggtitle("Delays in the effect of mobility restrictions",
subtitle = "Montreal (after June 08)")
We can think that more immediate effects of mobility on PR happen when the effectiveness of contact tracing is high. When there is an increase in community spread without well identified sources, the effect of mobility on PR stretches back to the upper bound of the COVID-19 incubation period, which is estimated to be 14 days, or beyond. To understand how the time for observing the maximum effect of mobility restrictions varies and could be less or more than 14 days, we can look at a benchmark case, where there is no contact tracing and no mobility restrictions.
In this case, we assume that the incubation period for each person is 14 days and that they could be tested and get the result without any delay at the onset of symptoms. There is one infected person (John) and assume that John will be in contact with only 4 people at the same time during the incubation period (14 days). When there are no mobility restrictions, the total 4 new positive tests will be recorded at the onset of their symptoms after 14 days of incubation period. Now let’s reduce the mobility by 50%. The number of new positive test will fall from 4 to 2, but it will still take 14 days to see the reduction in positive test numbers from 4 to 2. When we employ a very effective contact tracing in addition to mobility restrictions, as soon as John shows up in the test center, they will call John’s two contacts, say, in two days. hence the effect of reduction in mobility will be observed in positive cases in 2 days as opposed to 14 days.
We can obviously think of an opposite case. When there is no effective contact tracing in place, the effect of restrictions would take more than 14 days if we introduce some logistical imperfections, such as delays in testing the symptomatic people, long gaps between testing and processing and so on.
The correlation captures the degree of relatedness between PR and mobility but cannot reveal the responsiveness of PR or to degree of which PR changes in response to changes in mobility. This is also called elasticity and can be defined as follows:
\[ \text { x-elasticity of } y: \epsilon=\frac{\partial y / y}{\partial x / x}, \] which is the ratio of the percentage change in one variable to the percentage change in another variable, when the latter variable has a causal influence on the former. It is a useful tool for measuring the sensitivity of one variable to changes in another, causative variable. It also has the advantage of being a unitless ratio, independent of the type of quantities being varied.
A slope coefficient in a single variable regression, \(y_{i}=\alpha+\beta x_{i}+\varepsilon_{i}\), can be expressed as
\[ \begin{aligned} \beta &=\frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}} \\ &=\frac{s_{x, y}}{s_{x}^{2}} \\ &=r\frac{s_{y}}{s_{x}} \end{aligned} \] where \(s_{x,y}\) is a simple covariance between \(x\) and \(y\), \(s^2_{x}\) is the variance of \(x\), \(r\) is the coefficient of correlation between the \(y\) and \(x\), and \(s_{y}\) and \(s_{x}\) are their standard deviations, respectively. Hence, the “mean” elasticity, \(\epsilon\) may be written with \(y= PR\) (Positivity Rate) and \(x=R\) (Restrictions), as follows
\[ \epsilon=\frac{\partial PR / PR}{\partial R / R} = r\frac{s_{pr}}{s_{r}}\frac{\bar{R}}{\bar{PR}} \]
It follows, therefore, that when \(r\) is in the neighborhood of 1, the spread will be more sensitive or less (i.e. \(\epsilon \lessgtr1\)) according as the spread of COVID-19 is more or less variable than the mobility (\(\frac{s_{PR}}{s_{R}}\)) and the magnitude of restrictions relative to how widespread PR is (\(\frac{\bar{R}}{\bar{PR}}\)). This could be better seen in a simulations:
set.seed(1234567)
x1<-c(1,3,5,7,9)
x2 <- c(2,4,6,8,10)
y1 <- 3*x1 + rnorm(5,0,1)
y2 <- 10 + 6*x2 + rnorm(5,0,10)
plot(x2,y2, col="red", pch=16, xlim=c(0,10), ylim=c(0,80), xlab="x", ylab="y",
main = "Higher correlation, lower slope and elasticity")
points(x1,y1, col="blue", pch=16)
abline(lm(y1~x1), col="blue")
abline(lm(y2~x2), col="red")
text(2,40,paste("Correlation =",round(cor(x2,y2), digits = 3)))
text(6,5,paste("Correlation =",round(cor(x1,y1), digits = 3)))
text(2,45,paste("Slope =",round(coef(summary(lm(y2~x2)))["x2", "Estimate"], digits = 3)))
text(6,10,paste("Slope =",round(coef(summary(lm(y1~x1)))["x1", "Estimate"], digits = 3)))
Here are the the responsiveness of positivity rates to mobility changes where the intensity of their relationship (correlations) is at the maximum.
lag = 1:21
w = 7
mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_mtr <- c()
x <- c()
mdf <- workingdf[2:nrow(workingdf), ]
n <- nrow(mdf)-w
for(j in 1:n){
cor <-c()
sdd<- c()
xyy <- c()
xx <- c()
el <- c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), 4:5]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
cor[i] <- cor(mdf3[,1], mdf3[,2])
sdd[i] <- sd(mdf3[,2])/sd(mdf3[,1])
xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,2])
xx[i] <- abs(mean(mdf3[,1]))
el[i] <- cor[i]*sd[i]*xy[i]
}
mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
x[j] <- xx[which(cor == max(cor, na.rm = TRUE))]
mel[j] <- mcor[j]*sd[j]*xy[j]
beta_mtr[j] <- mcor[j]*sd[j]
}
saveRDS(beta_mtr, file = "beta_mtr.RDS")
roll <- mel[-length(mel)]
pdata <- data.frame(roll = roll[100:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-100),
pr = workingdf$dpr[100:length(roll)])
coeff <-17.5
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size= 0.7, color=prcolor) +
scale_y_continuous(
name = "Elasticity", limits = c(0,0.8),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.5,
label= paste0("Average Elasticity = ",
round(mean(pdata$roll),2))) +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.4,
label= paste0("After October = ",
round(mean(pdata$roll[100:length(pdata$roll)]),2))) +
ggtitle("Responsiveness of PR to Mobility",
subtitle = "Montreal (after June 08)")
This mean elasticity, 0.23, shows that PR is not sensitive to mobility changes on average when we consider the the whole period after Day 100. At the beginning of the \(2^{nd}\) wave this changes and the mean elasticity becomes 0.34, which is still low. For example, when the mobility goes down 10%, PR falls, on average, 3.4% during the period of \(2^{nd}\) wave. We will see in following sections that, although Toronto and New York City have lower correlation coefficients around 0.70, their elasticities are higher about 0.55 and 0.72, respectively. In other words, a 10-percent decline in mobility reduces PR 7.2%. That means that there are other things affecting the positivity rates that might not be captured by the mobility. This could be the implementations of restrictions that are different in each city or other measures such as differences in enforcement of mandatory masks, school closures, and so on.
Furthermore, it is also possible that Montreal may be at a point where mobility restrictions may not be as effective as in NYC and Toronto. In other words, the Montreal’s lower elasticity may indicate that the level of mobility restrictions could be inadequate relative to the level and the speed of the spread. This could be seen if take the ratio of mean of mobility to PR and plot it against the elasticity:
fck <- data.frame(roll = roll, xy = xy[-length(xy)])
fcor <- fck[order(fck[,2]), ]
loefc <- loess(roll~xy, degree=2, span = 0.4, data = fcor)
fitrc <- predict(loefc, fcor$xy)
plot(fcor$xy, fcor$roll, type = "h", col = "lightgrey", ylab = "Elasticity",
xlab = "Mean of Mobility/PR", main = "Restrictions are more effective when the ratio of mobility to PR is high",
cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(fcor$xy, fitrc, col = "red", lwd = 2)
The mean of mobility-PR ratio increases when mobility restrictions rise up more than PR (the mobility is taken as an absolute value in elasticity calculations). The plot shows that the elasticity (the responsiveness of PR) goes up as the ratio rises until about 0.03. When the ratio exceeds this point, however, the effectiveness of mobility restrictions dies slowly and stays almost constant. Beyond this point, the increase in the ratio would be mostly due to a decline in PR rather than increase in restrictions. The smoothed elasticity line can shift up, when the mobility-PR ratio is drastically higher, which explains the differences across cities. In Section 5, we investigate this issue: had Montreal had the same ratio as what NYC has, it would have been much higher elasticity than what NYC has.
fck <- data.frame(roll = roll, x = x[-length(x)])
fcor <- fck[order(fck[,2]), ]
loefc <- loess(roll~x, degree=2, span = 0.2, data = fcor)
fitrc <- predict(loefc, fcor$x)
plot(fcor$x, fcor$roll, type = "h", col = "lightgrey", ylab = "Elasticity",
xlab = "Mean of Mobility", main = "Responsiveness of PR declines with more mobility restrictions ",
cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(fcor$x, fitrc, col = "blue", lwd = 2)
This plot shows that the elasticity is almost constant when the mean of mobility is between 0.1 and 0.45, which indicates that the effectiveness of restrictions does lose its power in reducing the PR. In other words, the level decline in mobility does not create the same level increase in its effectiveness. Among many other reasons, this could happen when the decline in mobility is not proportional to the rise in PR to have the strong effect on the spread.
The number of tests, percent of positive results by age and location and time taken for test processing are obtained from https://covid-19.ontario.ca/data/testing-volumes-and-results#byPHU. The total test numbers reflect 7-day averages of how many tests were processed by the lab in each day. The available information also starts (as of 02-04-2021) from May 1. 2021. In order to extract the daily tests, positivity rate (PR), and positive results (dtest
, dpr
, and dpos
, respectively) from 7-day moving averages, we applied a simple linear algebra:
Here is the plot with daily numbers:
# Adding mobility data
TorontoMob <- read_dta("~/Dropbox/RNS2/DataON/Rinda/TorontoMob.dta")
tordf <- data.frame(pr = dff$dpr, mob = TorontoMob$all_day_bing_tiles_visited_relat[62:327])
ppdata <- data.frame(pr = dff$dpr[1:length(dff$dpr)],
Date = as.Date("2020-05-01") + 0:(length(dff$dpr)-1),
mob = TorontoMob$all_day_bing_tiles_visited_relat[62:327])
loepr <- loess(ppdata$pr~c(1:length(ppdata$pr)), degree=2, span = 0.02)
fitpr <- predict(loepr, 1:length(ppdata$pr))
coeff <-20
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(ppdata, aes(x=Date)) +
geom_bar( aes(y=pr), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=mob*coeff), size=0.5, color=prcolor) +
#geom_line( aes(y=fitpr*.85), size=0.5, color="darkgreen") +
scale_y_continuous(
name = "Positivity Rates - %",
sec.axis = sec_axis(~./coeff, name="Mobility Changes")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
ggtitle("Mobility restrictions and PR",
subtitle = "Toronto (after May 1)")
While it was not obvious when the data smoothed with a 7-day moving average, the daily test data (not presented here) show clear drops in Victoria Day (May 18), Thanksgiving (Oct 12), Christmas (Dec 25), and New Year eve (Jan 1, 2021). This pattern can also be observed in PR and mobility series above. We know that lower test numbers could translate into lower case numbers. This sort of bias in the case numbers are well reported in the literature. That’s why using PR is important. On the other hand, the positive test results calculated by (tests x PR) are different than officially reported case numbers for Toronto. This could be resulted by false positives.
lag = 1:21
w = 7
mco <- c()
lagg <- c()
mdf <- tordf
n <- nrow(mdf)-w
rol <- c()
for(j in 1:n){
co <-c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), ]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
co[i] <- cor(mdf3[,2], mdf3[,3])
}
mco[j] <- max(co, na.rm = TRUE)
lagg[j] <- lag[which.max(co)]
}
roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]
pdata <- data.frame(roll = roll[39:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-39),
"pr" = fitpr[39:length(roll)])
coeff <-10.5
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Correaltion", limits = c(0,1.3),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.8,
label= paste0("Average Correlation = ",
round(mean(pdata$roll),2))) +
ggtitle("Time-varying relationship between Mobility & PR",
subtitle = "Toronto (after June 08)")
Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.
pdata <- data.frame(laga = laga[39:length(laga)],
Date = as.Date("2020-06-08") + 0:(length(roll)-39),
"pr" = fitpr[39:length(roll)])
coeff <- 1.6
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=laga), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr*coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Delays - Day",
sec.axis = sec_axis(~./coeff, name="Positivity Rate")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$laga), slope = 0, linetype=3,
color = "red") +
annotate("text", x= as.Date("2020-07-20"), y=12, col = "darkred",
label= paste0("Mean Delay = ",
round(mean(pdata$laga),2),
" days")) +
ggtitle("Delays in the effect of mobility restrictions",
subtitle = "Toronto (after June 08)")
lag = 1:21
w = 7
mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_tr <- c()
mdf <- tordf
n <- nrow(mdf)-w
for(j in 1:n){
cor <-c()
sdd<- c()
xyy <- c()
el <- c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), ]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
cor[i] <- cor(mdf3[,2], mdf3[,3])
sdd[i] <- sd(mdf3[,3])/sd(mdf3[,2])
xyy[i] <- abs(mean(mdf3[,2]))/mean(mdf3[,3])
el[i] <- cor[i]*sd[i]*xy[i]
}
mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
mel[j] <- mcor[j]*sd[j]*xy[j]
beta_tr[j] <- mcor[j]*sd[j]
}
roll <- mel[-length(mel)]
saveRDS(beta_tr, file = "beta_tr.RDS")
pdata <- data.frame(roll = roll[39:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-39),
"pr" = fitpr[39:length(roll)])
coeff <-9
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Elasticity", limits = c(0,1.5),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.8,
label= paste0("Average Elasticity = ",
round(mean(pdata$roll),2))) +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.6,
label= paste0("After October = ",
round(mean(pdata$roll[116:length(pdata$roll)]),2))) +
ggtitle("Responsiveness of PR to Mobility",
subtitle = "Toronto (after June 08)")
The results are interesting. There is a long period that the responsiveness is less than the average. However, a simple visual inspection shows that there are multiple sections where the high elasticity overlaps with low positivity rates.
When we compare both cities, this shows that, although the intensity of the relationship is almost the same in both cities, the responsiveness of PR to mobility is much higher in Toronto than in Montreal due to a higher ratio of PR and mobility variations and/or a higher ratio of their averages in Toronto relative to in Montreal. This means that mobility restrictions have much more impact on the COVID-19 spread in Toronto than Montreal.
pdata <- data.frame(pr = nyc$pr[3:length(nyc$pr)],
Date = as.Date("2020-03-03") + 0:(length(nyc$pr)-3),
mob = nyc$mob[3:length(nyc$mob)])
coeff <- 60
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=pr), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=mob*coeff), size=0.5, color=prcolor) +
scale_y_continuous(
name = "Positivity Rates - %",
sec.axis = sec_axis(~./coeff, name="Mobility Changes")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
ggtitle("Mobility restrictions and PR",
subtitle = "New York City")
ppdata <- data.frame(pr = nyc$pr[215:length(nyc$pr)],
Date = as.Date("2020-10-01") + 0:(length(nyc$pr)-215),
mob = nyc$mob[215:length(nyc$mob)])
coeff <- 10
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(ppdata, aes(x=Date)) +
geom_bar( aes(y=pr), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=mob*coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Positivity Rates - %",
sec.axis = sec_axis(~./coeff, name="Mobility Changes")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
ggtitle("Mobility restrictions and PR",
subtitle = "New York City (After October 1)")
# Rolling correlations
lag = 1:21
w = 7
mco <- c()
lagg <- c()
mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w
saveRDS(mdf, file = "NYCdf.RDS")
for(j in 1:n){
co <-c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), ]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
co[i] <- cor(mdf3[,1], mdf3[,3])
}
mco[j] <- max(co, na.rm = TRUE)
lagg[j] <- lag[which.max(co)]
}
roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]
pdata <- data.frame(roll = roll[98:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-98),
pr = nyc$pr[100:(length(roll)+2)])
saveRDS(pdata, file = "pdata_ny.RDS")
coeff <-4.2
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Correlation", limits = c(0,1.2),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.8,
label= paste0("Average Correlation = ",
round(mean(pdata$roll),2))) +
ggtitle("Time-varying relationship between Mobility & PR",
subtitle = "New York City (after June 08)")
Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.
pdata <- data.frame(laga = laga[98:length(laga)],
Date = as.Date("2020-06-08") + 0:(length(roll)-98),
pr = nyc$pr[100:(length(roll)+2)])
coeff <- 6
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=laga), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr*coeff), size=.7, color=prcolor) +
scale_y_continuous(
name = "Delays - Day",
sec.axis = sec_axis(~./coeff, name="Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$laga), slope = 0, linetype=3,
color = "red") +
annotate("text", x= as.Date("2020-07-20"), y=12, col = "darkred",
label= paste0("Mean Delay = ",
round(mean(pdata$laga),2),
" days")) +
ggtitle("Delays in the effect of mobility restrictions",
subtitle = "New York City (after June 08)")
lag = 1:21
w = 7
mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_nyc <- c()
mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w
for(j in 1:n){
cor <-c()
sdd <- c()
xyy <- c()
el <- c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), ]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
cor[i] <- cor(mdf3[,1], mdf3[,3])
sdd[i] <- sd(mdf3[,3])/sd(mdf3[,1])
xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,3])
el[i] <- cor[i]*sdd[i]*xyy[i]
}
mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
mel[j] <- mcor[j]*sd[j]*xy[j]
beta_nyc[j] <- mcor[j]*sd[j]
}
roll <- mel[-length(mel)]
pdata <- data.frame(roll = roll[98:length(roll)],
Date = as.Date("2020-06-08") + 0:(length(roll)-98),
pr = nyc$pr[100:(length(roll)+2)])
coeff <-3.3
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
color="black", alpha=.4) +
geom_line( aes(y=pr/coeff), size=0.7, color=prcolor) +
scale_y_continuous(
name = "Elasticity", limits = c(0,1.5),
sec.axis = sec_axis(~.*coeff, name= "Positivity Rate - %")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=1,
label= paste0("Average Elasticity = ",
round(mean(pdata$roll),2))) +
annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.85,
label= paste0("After October = ",
round(mean(pdata$roll[116:length(pdata$roll)]),2))) +
ggtitle("Responsiveness of PR to Mobility",
subtitle = "New York (after June 08)")
The results show that the average responsiveness of PR to mobility restrictions is much higher in New York than both Montreal and Toronto.
Here, we will use elasticities for normative-referenced evaluations and compare the effectiveness of mobility restrictions in Montreal amd NYC. We create a hypothetical case where we calculate elasticities for Montreal using the data for NYC. This shows how effective the mobility restrictions would have been if Montreal had had the same PR and mobility measures as NYC during the \(2^{nd}\) wave between September 16, 2020 and January 24, 2021. Let’s put the sliding window correlations for both cities on the same plot
pdata_mtr <- readRDS("pdata_mtr.RDS")
pdata_ny <- readRDS("pdata_ny.RDS")
pdata <- data.frame(roll_ny = pdata_ny$roll[101:length(pdata_mtr$roll)],
roll_mtr = pdata_mtr$roll[101:length(pdata_mtr$roll)],
Date = as.Date("2020-09-16") + 0:(length(pdata_mtr$roll)-101))
coeff <-1
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_line( aes(y=roll_ny), size=0.5, color="red") +
geom_line( aes(y=roll_mtr/coeff), size=0.5, color=prcolor) +
scale_y_continuous(
name = "New York", limits = c(0.3,1),
sec.axis = sec_axis(~.*coeff, name= "Montreal")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = "red", size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll_ny), slope = 0, linetype=3,
color = "red") +
geom_abline(intercept = mean(pdata$roll_mtr), slope = 0, linetype=3,
color = prcolor) +
annotate("text", col = "red", x= as.Date("2020-10-05"), y=0.5,
label= paste0("Average Correlation = ",
round(mean(pdata$roll_ny),2))) +
annotate("text", col = prcolor, x= as.Date("2020-10-05"), y=0.45,
label= paste0("Average Correlation = ",
round(mean(pdata$roll_mtr),2))) +
ggtitle("Correlations in Montreal & New York",
subtitle = "Between September 26 & January 24")
Here is the formal explanation our counter-factual simulations:
\[ \epsilon_M=\left[\frac{\partial PR /PR}{\partial R / R}\right]^M = 0.77\left[\frac{s_{pr}}{s_{r}}\right]^M \left[\frac{\bar{R}}{\bar{PR}}\right]^M=0.34 \]
\[ \epsilon_{NYC}=\left[\frac{\partial PR /PR}{\partial R / R}\right]^{NYC} = 0.71\left[\frac{s_{pr}}{s_{r}}\right]^{NYC} \left[\frac{\bar{R}}{\bar{PR}}\right]^{NYC}=0.65 \]
\[ \epsilon^{CounterFactual}_M=\left[\frac{\partial PR /PR}{\partial R / R}\right]^M = 0.77\left[\frac{s_{pr}}{s_{r}}\right]^M \left[\frac{\bar{R}}{\bar{PR}}\right]^{NYC}=1.35 \]
The result is shown below:
mtrbeta <- readRDS("beta_mtr.RDS")
lag = 1:21
w = 7
mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_nyc <- c()
mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w
for(j in 1:n){
cor <-c()
sdd <- c()
xyy <- c()
el <- c()
for(i in 1:length(lag)) {
mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
mdf2 <- mdf1[complete.cases(mdf1), ]
k <- j-1
mdf3 <- mdf2[j:(w+k), ]
cor[i] <- cor(mdf3[,1], mdf3[,3])
sdd[i] <- sd(mdf3[,3])/sd(mdf3[,1])
xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,3])
el[i] <- cor[i]*sdd[i]*xyy[i]
}
mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
mel[j] <- mcor[j]*sd[j]*xy[j]
beta_nyc[j] <- mcor[j]*sd[j]
}
roll <- mel[200:331]
roll_mtr <- mtrbeta[200:327]*xy[200:331]
pdata <- data.frame(roll = roll,
Date = as.Date("2020-09-26") + 0:(length(roll)-1),
rollmtr = roll_mtr)
coeff <-1
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)
ggplot(pdata, aes(x=Date)) +
geom_line( aes(y=roll), size=0.6, color="red") +
geom_line( aes(y=rollmtr/coeff), size=0.6, color=prcolor) +
scale_y_continuous(
name = "New York", limits = c(0,4),
sec.axis = sec_axis(~.*coeff, name= "Montreal")
) +
theme_ipsum() +
theme(
axis.title.y = element_text(color = "red", size=13, face = "bold"),
axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
plot.title = element_text(size = 15, face = "bold")
) +
geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
color = "red") +
geom_abline(intercept = mean(pdata$rollmtr), slope = 0, linetype=3,
color = prcolor) +
annotate("text", col = "red", x= as.Date("2020-11-25"), y=3,
label= paste0("Average Elasticity = ",
round(mean(pdata$roll),2))) +
annotate("text", col = prcolor, x= as.Date("2020-11-25"), y=3.5,
label= paste0("Average Elasticity = ",
round(mean(pdata$rollmtr),2))) +
ggtitle("Elasticity of Montreal's PR in New York",
subtitle = "Between September 26 & January 24")
In order to have this much jump in the elasticity, two things have to be true:
The second one indicates that, given that the mobility metrics rather measure the people’s behavioral response to the spread (Herby, 2021), this difference implies a significantly lower public sensitivity to COVID-19 in Montreal than other cities. The first one is much more a policy indicator and consistent with the second one: the average reduction in mobility relative to the spread might not have been enough in Montreal in terms of its magnitude and speed compared to NYC.
For the first time in history, we have now multiple mobility metrics at finer spatial and temporal scales. The question is very simple: do we have a significant and meaningful relationship between the transmission of viral infection and the geotemporal fluctuations in mobility? Although it looks like a simple question that can be answered with available data, there are fundamental challenges that make the question a very complex problem:
In this short work, we intend to show basic difficulties and possible solutions in capturing the underlying relationship between mobility restrictions and the COVID-19 spread. Our results show that:
A work by ML-Portal