---
title: "Exercise 2: Importing Weather Station Data using the Synoptic API"
format:
  html:
    theme: cosmo
    df-print: paged
    code-link: true
    link-external-icon: true
    link-external-newwindow: true
    number-sections: true
editor: source
---

::: {.callout-note}
## Summary

This notebook demonstrates how to query the Synoptic API using [`httr2`](https://httr2.r-lib.org/). The desired output is a table containing:

- __daily minimum and maximum__ air temperature
- one weather station ( __CIMIS Station 077 (Oakville)__ )
- the current growing season (__Jan 1st through yesterday__)

The table should have the following columns:

- `loc_id`: location id (we'll use the Synoptic station ID for CIMIS station 077, "CI077")
- `period`: 'rp' (recent past)
- `date`: date
- `tasmin`: minimum daily temperature
- `tasmax`: maximum daily temperature

![](./images/oakville_recent_past.png)

:::

\

# Read about Synoptic's data and API

The first step in using any API is to read about the __organization__, the __data__, and the __API documentation__.

Highlights of [Synoptic](https://synopticdata.com/):

- Synoptic aggregates and redistributes data from weather station networks all over the world
- every station has a unique ID
- data are provided hourly
- a public token is required to make calls to the API

\

# Gather all the information needed to query the API

1. Sign up for an account and create a __public token__.

1. Find the Station ID of your station of interest:

    Start here:

    Check data availability:

1. Determine which endpoint you need:

1. Read the docs for the endpoint

    Make a list of the search parameters you need

\

::: {.callout-tip}
## Pro Tip

A good way to construct a test query is to use the Synoptic Weather Query API Builder:

[https://demos.synopticdata.com/query-builder/](https://demos.synopticdata.com/query-builder/)

:::

\

# Create the API request object

Our workhorse for calling APIs is [httr2](https://httr2.r-lib.org/).

```{r chunk01}
library(httr2)
```

\

Define the base URL:

```{r chunk02}
synoptic_ts_baseurl <- "https://api.synopticdata.com/v2/stations/timeseries"
```

\

Create a variable with your Synoptic public token:

```{r chunk03}
# my_token <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
my_token <- here::here("exercises/my_synoptic_token.txt") |> readLines(n=1)
```

\

Define the Station ID (for this exercise we are using `CI077`, the Oakville CIMIS station):

```{r chunk04}
station_id_chr <- "CI077"
```

\

Define the start time (midnight on January 1st):

```{r chunk05}
library(lubridate) |> suppressPackageStartupMessages()

start_local_dt <- make_datetime(year = 2024, month = 1, day = 1, 
                                hour = 0, min = 0, sec = 0, 
                                tz = "America/Los_Angeles")
start_local_dt
```

\

Convert the start time i) to UTC, then ii) to a character string:

```{r chunk06}
start_utc_chr <- start_local_dt |> 
  with_tz("UTC") |> 
  format("%Y%m%d%H%M")

start_utc_chr
```

\

For the end time, we will use 11pm yesterday:

```{r chunk07}
yesterday_11pm_pdt_dt <- lubridate::as_datetime(Sys.Date() - 1, tz = "America/Los_Angeles") + hours(23)
yesterday_11pm_pdt_dt
```

\

Convert the end time i) to UTC, then ii) to a character string:

```{r chunk08}
end_utc_chr <- yesterday_11pm_pdt_dt |> 
  with_tz("UTC") |> 
  format("%Y%m%d%H%M")

end_utc_chr
```

\

Construct an object for the weather variables needed (see [https://demos.synopticdata.com/variables/](https://demos.synopticdata.com/variables/)):

```{r chunk09}
weather_vars <- "air_temp"
```

\

__We now have everything we need to create a request object!__

# Create the request object

```{r chunk10}
stn_tas_req <- request(synoptic_ts_baseurl) |> 
  req_headers("Accept" = "application/json") |> 
  req_url_query(token = my_token,
                start = start_utc_chr,
                end = end_utc_chr,
                stid = station_id_chr,
                vars = weather_vars,
                units = "english",
                obtimezone = "local",
                .multi = "comma")

stn_tas_req
```

\

# Call the API

Preview what will be sent, without actually sending the request:

```{r chunk11}
stn_tas_req |> req_dry_run()
```

\

Send the request:

```{r chunk12}
# Load a cached copy
stn_tas_resp <- readRDS(here::here("exercises/cached_api_responses/ex02_stn_tas_resp.Rds"))

# If you really want to send the request, uncomment the following:
# stn_tas_resp <- stn_tas_req |> req_perform()
# saveRDS(stn_tas_resp, file = here::here("exercises/cached_api_responses/ex02_stn_tas_resp.Rds"))

## Look at the response
stn_tas_resp
```

\

Check the status:

```{r chunk13}
stn_tas_resp |> resp_status()
stn_tas_resp |> resp_status_desc()
```

\

# CHALLENGE #1

Create an API request object that asks for the temperature values in Celsius. [Solution](https://bit.ly/3w7M4gJ)

```{r chunk14}
## Your answer here

```

\

## Process the response

### Convert the body to a list

The first step in processing the response body is to extract it as a list:

```{r chunk15}
stn_tas_lst <- stn_tas_resp |> resp_body_json()
```

\

View the structure of the list:

::: {.callout-tip}
## Pro Tip

A good way to explore the structure of the body is to open it in a View window:

```{r chunk16}
# stn_tas_lst |> View()
```

:::

```{r chunk17}
str(stn_tas_lst, max.level = 3)
```

\

### Extract vectors of data for the data frame

Get the number of stations in the response:

```{r chunk18}
stn_tas_lst$SUMMARY$NUMBER_OF_OBJECTS
```

\

Extract the ID of the ith station:

```{r chunk19}
i <- 1
stn_tas_stationdata <- stn_tas_lst$STATION[[i]]
(stid_chr <- stn_tas_stationdata$STID)
```

\

Extract the date-times:

```{r chunk20}
obs_dt <- stn_tas_stationdata$OBSERVATIONS$date_time |> 
  unlist() |> 
  ymd_hms(tz = "America/Los_Angeles")

## Inspect the vector:
class(obs_dt)
length(obs_dt)
head(obs_dt)
range(obs_dt)
```

\

Extract the hourly temperatures:

```{r chunk21}
obs_tas <- stn_tas_stationdata$OBSERVATIONS$air_temp_set_1 |> 
  unlist()

head(obs_tas)
length(obs_tas)
```

\

### Create a tibble with the required structure

Bring them all together in a tibble.
For this, we'll want to use `dplyr`:

```{r chunk22}
library(dplyr) |> suppressPackageStartupMessages()

# Set preferences for functions with common names
library(conflicted)
conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)
conflict_prefer("arrange", "dplyr", quiet = TRUE)
```

\

```{r chunk23}
stn_hrlytas_tbl <- tibble(stid = stid_chr, dt = obs_dt, tas = obs_tas)
head(stn_hrlytas_tbl)
# View(stn_hrlytas_tbl)
```

\

Convert from hourly to daily data:

```{r chunk24}
stn_dlytas_tbl <- stn_hrlytas_tbl |> 
  mutate(date = date(dt)) |> 
  group_by(stid, date) |> 
  summarise(count_obs = n(),
            tasmin = min(tas), 
            tasmax = max(tas),
            .groups = "drop")
```

\

Inspect the results:

```{r chunk25}
stn_dlytas_tbl
```

\

Finish up to get the final format:

`loc_id` | `period` | `date` | `tasmin` | `tasmax`

```{r chunk26}
stn_rctpast_dlytas_tbl <- stn_dlytas_tbl |> 
  mutate(period = "rp") |> 
  select(loc_id = stid, period, date, tasmin, tasmax)

head(stn_rctpast_dlytas_tbl)
# View(stn_rctpast_dlytas_tbl)
```

\

### Save results

Save the final table to disk so we can open it in other exercises:

```{r chunk27}
saveRDS(stn_rctpast_dlytas_tbl, 
        file = here::here("exercises/data/stn_rctpast_dlytas_tbl.Rds"))
```

\

# HOMEWORK

Bundle up this code in a function that returns a tibble of daily minimum and maximum temperature for any station in Synoptic. The function should cache the results in temp space for the current R session, which it should check first before calling the API.

```{r chunk28, eval = FALSE}
syn_dailytas <- function(stid, start_dt, end_dt, token, cache = TRUE) {
  ## Insert your answer here
}
```
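
\

One caveat to keep in mind when generalizing the extraction step to any station: `resp_body_json()` turns a JSON `null` into `NULL`, and `unlist()` silently drops `NULL` elements. If a station reports a missing observation as `null` (an assumption worth checking against the response for your station), the temperature vector would end up shorter than the date-time vector and `tibble()` would fail. The sketch below uses made-up values and a hypothetical helper, `null_to_na()`, which is not part of `httr2` or Synoptic; it is one possible way to keep the vectors aligned, not the only one.

```r
# Hypothetical helper (not part of httr2 or any Synoptic client):
# turn a list of numbers and NULLs into a numeric vector, keeping NULLs as NA
null_to_na <- function(x) {
  vapply(x,
         function(v) if (is.null(v)) NA_real_ else as.numeric(v),
         numeric(1))
}

# Made-up observations standing in for OBSERVATIONS$air_temp_set_1
obs_example <- list(51.2, NULL, 49.8)

# unlist() drops the NULL, so the result is shorter than the input
length(unlist(obs_example))

# The helper keeps one value per observation
tas_example <- null_to_na(obs_example)
length(tas_example)

# NA-aware summaries, as would be needed inside summarise()
min(tas_example, na.rm = TRUE)
max(tas_example, na.rm = TRUE)
```

With `NA` values in the hourly data, the daily summaries would also need `na.rm = TRUE`, and `count_obs` could count only non-missing values (for example with `sum(!is.na(tas))`) so that days with sparse data are easy to spot.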