-
Notifications
You must be signed in to change notification settings - Fork 13
/
README.Rmd
307 lines (205 loc) · 9.33 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
---
output: github_document
---
```{r setup, echo = FALSE, message=FALSE, results='hide'}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 8,
fig.height = 6,
fig.path = "figs/",
echo = TRUE
)
```
<img src="https://raw.githubusercontent.com/reconhub/incidence/master/artwork/banner.png">
<br>
[![Travis-CI Build Status](https://travis-ci.org/reconhub/incidence.svg?branch=master)](https://travis-ci.org/reconhub/incidence)
[![Build status](https://ci.appveyor.com/api/projects/status/7h2mgej230dv5r7w/branch/master?svg=true)](https://ci.appveyor.com/project/thibautjombart/incidence/branch/master)
[![Coverage Status](https://codecov.io/github/reconhub/incidence/coverage.svg?branch=master)](https://codecov.io/github/reconhub/incidence?branch=master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/incidence)](https://cran.r-project.org/package=incidence)
[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/incidence)](https://cran.r-project.org/package=incidence)
[![Downloads from Rstudio mirror](https://cranlogs.r-pkg.org/badges/grand-total/incidence)](http://www.r-pkg.org/pkg/incidence)
[![DOI](https://zenodo.org/badge/49890366.svg)](https://zenodo.org/badge/latestdoi/49890366)
# Installing the package
To install the current stable, CRAN version of the package, type:
```{r install, eval=FALSE}
install.packages("incidence")
```
To benefit from the latest features and bug fixes, install the development,
*github* version of the package using:
```{r install2, eval=FALSE}
devtools::install_github("reconhub/incidence")
```
Note that this requires the package *devtools* installed.
# What does it do?
The main features of the package include:
- **`incidence`**: compute incidence from dates in various formats; any fixed
time interval can be used; the returned object is an instance of the (S3)
class *incidence*.
- **`plot`**: this method (see `?plot.incidence` for details) plots *incidence*
objects, and can also add predictions of the model(s) contained in an
*incidence_fit* object (or a list of such objects).
- **`fit`**: fit one or two exponential models (i.e. linear regression on
log-incidence) to an *incidence* object; two models are calibrated only if a
date is provided to split the time series in two (argument `split`); this is
typically useful to model the two phases of exponential growth, and decrease
of an outbreak; each model returned is an instance of the (S3) class
*incidence_fit*, each of which contains various useful information
(e.g. growth rate *r*, doubling/halving time, predictions and confidence
intervals); results can be plotted using `plot`, or added to an existing
`uncudence` plot using the piping-friendly function `add_incidence_fit`.
- **`fit_optim_split`**: finds the optimal date to split the time series in two,
typically around the peak of the epidemic.
- **`[`**: lower-level subsetan of *incidence* objects, permiting to specify
which dates and groups to retain; uses a syntax similar to matrices,
i.e. `x[i, j]`, where `x` is the *incidence* object, `i` a subset of dates,
and `j` a subset of groups.
- **`subset`**: subset an *incidence* object by specifying a time window.
- **`pool`**: pool incidence from different groups into one global incidence
time series.
- **`cumulate`**: computes cumulative incidence over time from and `incidence`
object.
- **`as.data.frame`**: converts an *incidence* object into a `data.frame`
containing dates and incidence values.
- **`bootstrap`**: generates a bootstrapped *incidence* object by re-sampling,
with replacement, the original dates of events.
- **`find_peak`**: locates the peak time of the epicurve.
- **`estimate_peak`**: uses bootstrap to estimate the peak time (and related
confidence interval) of a partially observed outbreak.
# Resources
## Vignettes
An overview of *incidence* is provided below in the worked example below.
More detailed tutorials are distributed as vignettes with the package:
```{r, vignettes2, eval = FALSE}
vignette("overview", package="incidence")
vignette("customize_plot", package="incidence")
vignette("incidence_class", package="incidence")
vignette("incidence_fit_class", package="incidence")
vignette("conversions", package="incidence")
```
## Websites
The following websites are available:
- The official *incidence* website, providing an overview of the package's functionalities, up-to-date tutorials and documentation: <br>
<https://www.repidemicsconsortium.org/incidence>
- The *incidence* project on *github*, useful for developers, contributors, and users wanting to post issues, bug reports and feature requests: <br>
<https://github.com/reconhub/incidence>
- The *incidence* page on CRAN: <br>
<https://CRAN.R-project.org/package=incidence>
## Getting help online
Bug reports and feature requests should be posted on *github* using the [*issue* system](https://github.com/reconhub/incidence/issues). All other questions should be posted on the **RECON forum**: <br>
<https://www.repidemicsconsortium.org/forum/>
# A quick overview
The following worked example provides a brief overview of the package's
functionalities. See the [*vignettes section*](#vignettes) for more detailed tutorials.
## Loading the data
This example uses the simulated Ebola Virus Disease (EVD) outbreak from the
package [*outbreaks*](https://cran.r-project.org/package=outbreaks). We will compute
incidence for various time steps, calibrate two exponential models around the
peak of the epidemic, and analyse the results.
First, we load the data:
```{r, data}
library(outbreaks)
library(ggplot2)
library(incidence)
dat <- ebola_sim$linelist$date_of_onset
class(dat)
head(dat)
```
## Computing and plotting incidence
We compute the weekly incidence:
```{r, incid1}
i.7 <- incidence(dat, interval = 7)
i.7
plot(i.7)
```
`incidence` can also compute incidence by specified groups using the `groups`
argument. For instance, we can compute the weekly incidence by gender:
```{r, gender}
i.7.sex <- incidence(dat, interval = "week", groups = ebola_sim$linelist$gender)
i.7.sex
plot(i.7.sex, stack = TRUE, border = "grey")
```
## Handling `incidence` objects
`incidence` objects can be manipulated easily. The `[` operator implements
subetting of dates (first argument) and groups (second argument). For
instance, to keep only the first 20 weeks of the epidemic:
```{r, start}
i.7[1:20]
plot(i.7[1:20])
```
Some temporal subsetting can be even simpler using `subset`, which permits to
retain data within a specified time window:
```{r, tail}
i.tail <- subset(i.7, from = as.Date("2015-01-01"))
i.tail
plot(i.tail, border = "white")
```
Subsetting groups can also matter. For instance, let's try and visualise the
incidence based on onset of symptoms by outcome:
```{r, i7outcome}
i.7.outcome <- incidence(dat, "week", groups = ebola_sim$linelist$outcome)
i.7.outcome
plot(i.7.outcome, stack = TRUE, border = "grey")
```
To visualise the cumulative incidence:
```{r, i7outcome_cum}
i.7.outcome.c <- cumulate(i.7.outcome)
i.7.outcome.c
plot(i.7.outcome.c)
```
Groups can also be collapsed into a single time series using `pool`:
```{r, pool}
i.pooled <- pool(i.7.outcome)
i.pooled
identical(i.7$counts, i.pooled$counts)
```
## Modelling incidence
Incidence data, excluding zeros, can be modelled using log-linear regression of the form:
log(*y*) = *r* x *t* + *b*
where *y* is the incidence, *r* is the growth rate, *t* is the number of days since a specific point in time (typically the start of the outbreak), and *b* is the intercept.
Such model can be fitted to any incidence object using `fit`.
Of course, a single log-linear model is not sufficient for modelling our time series, as there is clearly an growing and a decreasing phase.
As a start, we can calibrate a model on the first 20 weeks of the epidemic:
```{r, fit1}
plot(i.7[1:20])
early.fit <- fit(i.7[1:20])
early.fit
```
The resulting objects can be plotted, in which case the prediction and its confidence interval is displayed:
```{r fit-plot-1}
plot(early.fit)
```
However, a better way to display these predictions is adding them to the
incidence plot using the argument `fit`:
```{r fit-plot-2}
plot(i.7[1:20], fit = early.fit)
```
Alternatively, these can be piped using:
```{r pipe-fit}
library(magrittr)
plot(i.7[1:20]) %>% add_incidence_fit(early.fit)
```
In this case, we would ideally like to fit two models, before and after the peak
of the epidemic. This is possible using the following approach, in which the
best possible splitting date (i.e. the one maximizing the average fit of both
models), is determined automatically:
```{r, optim}
best.fit <- fit_optim_split(i.7)
best.fit
plot(i.7, fit = best.fit$fit)
```
# Credits
- Contributors (by alphabetic order):
- [Sangeeta Bhatia](https://github.com/sangeetabhatia03)
- [Jun Cai](https://github.com/caijun)
- [Rich Fitzjohn](https://github.com/richfitz)
- [Thibaut Jombart](https://github.com/thibautjombart)
- [Zhian Kamvar](https://github.com/zkamvar)
- [Juliet RC Pulliam](https://github.com/jrcpulliam)
- [Jakob Schumacher](https://github.com/jakobschumacher)
See details of contributions on:
<br>
https://github.com/reconhub/incidence/graphs/contributors
Contributions are welcome via **pull requests**.
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
**Maintainer:** Zhian N. Kamvar ([email protected])