## Introduction to Infectious Disease Dynamics ## Clinic on the Meaningful Modeling of Epidemiological Data ## International Clinics on Infectious Disease Dynamics and Data (ICI3D) Program ## African Institute for Mathematical Sciences, Muizenberg, RSA ## ## Juliet R.C. Pulliam, 2012-2018 ## ## ## The goal of this tutorial is to acquaint you with ways of ## analysing simple infectious disease models in R. By the end of ## this tutorial, you should be able to: ## ## * create a function describing a system of ODE's ## * use the package deSolve to numerically analyze a system of ODEs ## * plot the output of different types of functions ## ## NOTE: The comments will guide you through the tutorial but you ## should make sure you understand what the code is doing. Some ## function arguments are assigned to ?. This will give you an ## error. You should try out values for these arguments as ## suggested in the comments or find them yourselves in a help file. ###################################################################### ## Section 1: Creating a function to describe the behavior of an ODE ###################################################################### ## Before you start, it is a good idea to clear the workspace of things ## you have defined previously. Do this now: rm(list=ls()) # Clear all variables and functions ## We will now define a function to describe a simple SIR model with ## constant population size and no population turnover (that is, no ## births and deaths). This model was discussed in "Introduction to dynamic ## modeling of infectious diseases," and slides are available here: ## https://ndownloader.figshare.com/files/8541817 ## ## The version of the model we'll work with here has 3 state variables (though ## only two of these need to be specified to define the state of the system, ## since the population size is constant): ## ## S - the number of susceptibles in the population ## I - the number of infected/infectious individuals in the population ## R - the number of recovered/immune individuals in the population ## ## and 3 parameters: ## ## N - the total population size; N = S + I + R ## beta - the transmission coefficient; beta = the contact rate * infectivity ## gamma - 1 / the infectious period ## ## If you do not remember the model, review the lecture notes before you proceed ## with this tutorial. If you're not entirely comfortable with the distinction ## between state variables and parameters, you may also want to review the ## DAIDD Glossary: ## https://github.com/ICI3D/MMEDparticipants/raw/master/Resources/DAIDD2016_Glossary.pdf ## We will now define a function that describes the change in the state ## variables with time. The function takes the input t (the current time), y (a ## vector giving the current value of the state variables - that is, their ## value at time t), and params (a vector that defines the parameter values); ## the function outputs the values of dS/dt and dI/dt at time t: ## ##-- Model 1: SIR model --## ## ## sir() -- Function for numerical analysis of model 1 sir <- function(t,y,parms){ # The with() function gives access to the named values of parms within the # local environment created by the function with(c(as.list(y),parms),{ dSdt <- -beta*S*I/N dIdt <- beta*S*I/N - gamma*I # Note: Population size is constant, so don't need to specify dRdt return(list(c(dSdt,dIdt))) }) } ###################################################################### ## Section 2: Understanding the function that defines our ODE model ###################################################################### ## We have defined the function, but we now need to make sure we ## understand how it works. To do this, let's think about how we would ## approximate the model using discrete timesteps, similar to the way we ## used spreadsheets to analyze epidemic models on Monday. ## ## Since the function takes t, y, and parms as inputs, we first need to define ## these. Let us start at time zero: time <- 0 ## We will use parameter values that reflect a measles outbreak that occurred ## in New York City (many years ago!), so let our initial population size (N0) be N0 <- 7780000 ## The second argument of the sir() function defined above is y, which I have ## said is a vector giving the current value of the state variables. Let's ## assume that the initial population has 20.5 infecteds and that 6.5% of the ## population is susceptible. (Remember that this model assumes a large ## population size and represents population averages, so it is ok that these ## values are not integers!) We will define a vector pop.SI that gives the ## initial values of our 2 state variables: pop.SI <- c(S = 0.065*N0, # Initially 6.5% of the population is susceptible I = ???) # ENTER THE NUMBER INITIALLY INFECTED (20.5) ## Notice that I've named the two values in the initial population vector to ## help us keep track of which value is which. This also allows us to take ## advantage of the with() function called within the sir() function because I ## have used the same names that are used to refer to the state variables ## within the function. ## ## The final input our function needs is a vector of parameter values, which we ## can create in the same way: values <- c(beta = 3.6, # Transmission coefficient gamma = 1/5, # 1 / infectious period = 1/5 days N = N0) # population size (constant) ## Note that we can calculate the value of the basic reproduction number, ## R0 = beta / gamma, as follows: values["beta"]/values["gamma"] # value of R0 # HINT: Try putting this command inside the # function as.numeric() to keep R from # confusingly retaining the name of the first # value. ## Now that we have defined the inputs, we are ready to try out our function! sir(t=time,y=pop.SI,parms=values) ## Look at the output. What does this mean? ## ## Recall that I said the function outputs the time derivatives of S and I at ## the time t. This means that in order to get the (approximate) values of S ## and I at some time delta.t in the future, I need to calculate the equivalent ## of ## ## S(t+delta.t) = S(t) - (beta*S(t)*I(t)/N) * delta.t ## ## I(t+delta.t) = I(t) + (beta*S(t)*I(t)/N - gamma*I(t)) * delta.t ## ## This can be done as follows: delta.t <- 0.1 # Set a small value for delta.t (0.1 day) pop.next <- pop.SI + unlist(sir(t=time,y=pop.SI,parms=values)) * delta.t ## We could then iterate this process, updating our values of pop.SI and time ## with each iteration to get a discrete-time approximation of the values of the ## state variables through time. Remember, however, that the differential ## equations describe the rates of change in the limit as delta.t goes to zero. ## It turns out that using the above discrete time approximation of this ## process, as we have done on the spreadsheets (Track B) and live coding ## example, leads to rapid accumulation of error in our estimate of the state ## variables through time, even if we set our value of delta.t to be very ## small. Luckily for us, there are a number of algorithms that have been ## developed to come up with better (both more accurate and more ## computationally efficient) approximations to the values of the state ## variables through time. ###################################################################### ## Section 3: Numerical integration of our ODE model ###################################################################### ## In R, we can access these algorithms through the deSolve library. We should ## now load this library so we can access its functions: library(deSolve) # Load libary to be used for numerical integration ## The function within deSolve that we will be using is called lsoda(). Let's ## look at the help file for this function: ?lsoda ## This help file gives some detail on the history of the algorithm as well as ## describing the function's usage. For now, take a look at the "Usage" section ## to look at the different types of arguments the function can take. There are ## a lot of them! For now, we will focus on the first 4 inputs: y, times, func, ## and parms. Read through the "Arguments" section for the descriptions of these ## four inputs. Because we have already defined our function and seen how it ## works, these arguments should seem familiar. ## ## We have already defined the vectors that we will use for two of the arguments ## (y, the vector of the initial values of the state variables, and parms, the ## vector of parameter values). We have also defined a third argumet, func, ## which provides the model definition; it is our function sir(). All that ## remains is to define the vector of times, which tells lsoda() the timepoints ## for which we would like to know the values of S and I. (NOTE: This is ## different from specifying the time points at which to evaluate our function, ## which is determined within the lsoda() function and optimized to efficiently ## provide an accurate approximation.) ## ## Let's ask for output every 0.1 days for 365 days, starting at time 0: time.out <- seq(0,365,???) # INCLUDE THE APPROPRIATE VALUE TO GET A SEQUENCE # FROM 0 to 365 BY STEPS OF 0.1 ## Now let's see what happens if we plug our inputs into lsoda()... lsoda( y = pop.SI, # Initial conditions for population times = time.out, # Timepoints for evaluation func = sir, # Function to evaluate parms = values # Vector of parameters ) ## Well, that seems to have done something, but it's hard to understand the ## output. Let's make it easier to look at by saving the output as a data.frame ## named ts.sir ("ts" being short for timeseries). ts.sir <- data.frame(lsoda( y = pop.SI, # Initial conditions for population times = time.out, # Timepoints for evaluation func = sir, # Function to evaluate parms = values # Vector of parameters )) ## Now let's look at the output: head(ts.sir) ## Our data frame has 3 columns, conveniently labeled "time", "S", and "I", ## showing the values of time and the two state variables through time. We can ## now look at the output in other ways. For example, we saw yesterday that we ## expect the infection to go extinct because there is no source of new ## susceptibles in our equations (eg, through births). Let's see what has ## happened after a year: subset(ts.sir,time==365) ## There are more infecteds after a year than there were at time 0, so it looks ## like it takes more than a year for the infection to burn out with these ## initial conditions. If we want to see how the epidemic has progressed ## through time, we can plot to output from lsoda(). Remember that the ## parameters we have used are the same as in the live coding example, so ## this is a model of a measles epidemic in New York. We'll label the plot ## accordingly. (You'll learn a lot more about plotting - and modifying the ## appearance of plots - in Tutorial 4.) plot(ts.sir$time, # Time on the x axis ts.sir$I, # Number infected (I) on the y axis xlab = "Time in days", # Label the x axis ylab = "Number infected", # Label the y axis main = "Measles in New York", # Plot title xlim = c(0,400), # type = "l", # Use a line plot bty = "n") # Remove the box around the plot ## While there are many more infecteds at time 365 than there were at time 0, ## there are a lot fewer than there were at time 200, and it looks like the ## epidemic is nearly done. ###################################################################### ## BENCHMARK QUESTIONS ###################################################################### ## Question 1: ## ## Run the SIR model above out to 5 years to convince yourself that the epidemic ## is ending and will not come back. ## ## Question 2: ## ## Create a new function called sir.bd() that describes the SIR model with ## births and deaths (but constant population size), and compare the output ## to the model defined above. You may want to refer to yesterday's lecture ## notes for the model definition. Assume that the life expectancy in the ## population is 60 years. ## ## Question 3: ## ## Use lsoda() to evaluate the values of S and I through time for your new ## function defining the model with births and deaths over 5 years, with the ## same initial conditions as before. How do the dynamics compare? ## ## Question 4: ## ## Now change the value of the life expectancy for the population in the model. ## Compare the long-term dynamics of the model with different values for the ## life expectancy.