---
title: "Exercise 2: Importing Weather Station Data using the Synoptic API"
format:
html:
theme: cosmo
df-print: paged
code-link: true
link-external-icon: true
link-external-newwindow: true
number-sections: true
editor: source
---
::: {.callout-note}
## Summary
This Notebook demonstrates how to query the Synoptic API using [`httr2`](https://httr2.r-lib.org/).
The desired output is a table containing:
- __daily minimum and maximum__ air temperature
- one weather station ( __CIMIS Station 077 (Oakville)__ )
- the current growing season (__Jan 1st thru yesterday__)
The table should have the following columns:
- `loc_id`: location id (we'll use the Synoptic station ID for CIMIS station 077, "CI077")
- `period`: 'rp' (recent past)
- `date`: date
- `tasmin`: minimum daily temperature
- `tasmax`: maximum daily temperature
![](./images/oakville_recent_past.png)
:::
\
# Read about Synotic's data and API
The first step in using any API is to read about the __organization__, the __data__, and tht __API documentation__.
Highlights of [Synoptic](https://synopticdata.com/):
- Synoptic aggregates and redistributes data from weather station networks all over the world
- every station has a unique ID
- data are provided hourly
- a public token is required to make calls to the API
\
# Gather all the information needed to query the API
1. Sign-up for account and create a __public token__.
1. Find the Station ID of your station of interest:
Start here:
Check data availability:
1. Determine which end point you need:
1. Read the docs for the end point
Make a list of the search parameters you need
\
::: {.callout-tip}
## Pro Tip
A good way to construct a test search is using the Synoptic Weather Query API Builder:
[https://demos.synopticdata.com/query-builder/](https://demos.synopticdata.com/query-builder/)
:::
\
# Create the API request object
Our work horse for calling APIs is [httr2](https://httr2.r-lib.org/).
```{r chunk01}
library(httr2)
```
\
Define the base URL:
```{r chunk02}
synoptic_ts_baseurl <- "https://api.synopticdata.com/v2/stations/timeseries"
```
\
Create a variable with your Synoptic public token:
```{r chunk03}
# my_token <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
my_token <- here::here("exercises/my_synoptic_token.txt") |> readLines(n=1)
```
\
Define the Station ID (for this exercise we are using `CI077` (Oakville CIMIS Station):
```{r chunk04}
station_id_chr <- "CI077"
```
\
Define the start time (midnight on January 1st):
```{r chunk05}
library(lubridate) |> suppressPackageStartupMessages()
start_local_dt <- make_datetime(year = 2024, month = 1, day = 1,
hour = 0, min = 0, sec = 0,
tz = "America/Los_Angeles")
start_local_dt
```
\
Convert the start time i) to UTC, then ii) to a character:
```{r chunk06}
start_utc_chr <- start_local_dt |> with_tz("UTC") |> format("%Y%m%d%H%M")
start_utc_chr
```
\
For the end time, we will use 11pm yesterday:
```{r chunk07}
yesterday_11pm_pdt_dt <- lubridate::as_datetime(Sys.Date() - 1, tz = "America/Los_Angeles") +
hours(23)
yesterday_11pm_pdt_dt
```
\
Convert the end time i) to UTC, then ii) to a character:
```{r chunk08}
end_utc_chr <- yesterday_11pm_pdt_dt |> with_tz("UTC") |> format("%Y%m%d%H%M")
end_utc_chr
```
\
Construct an object for the weather variables needed (see [https://demos.synopticdata.com/variables/](https://demos.synopticdata.com/variables/)):
```{r chunk09}
weather_vars <- "air_temp"
```
\
__We now have everything we need to create a request object!__
# Create the request object
```{r chunk10}
stn_tas_req <- request(synoptic_ts_baseurl) |>
req_headers("Accept" = "application/json") |>
req_url_query(token = my_token,
start = start_utc_chr,
end = end_utc_chr,
stid = station_id_chr,
vars = weather_vars,
units = "english",
obtimezone = "local",
.multi = "comma")
stn_tas_req
```
\
# Call the API
See what will be sent when we send the request:
```{r chunk11}
stn_tas_req |> req_dry_run()
```
\
Send the request:
```{r chunk12}
# Load a cached copy
stn_tas_resp <- readRDS(here::here("exercises/cached_api_responses/ex02_stn_tas_resp.Rds"))
# If you really want to send the request, uncomment the following:
# stn_tas_resp <- stn_tas_req |> req_perform()
# saveRDS(stn_tas_resp, file = here::here("exercises/cached_api_responses/ex02_stn_tas_resp.Rds"))
## Look at the response
stn_tas_resp
```
\
Check the status:
```{r chunk13}
stn_tas_resp |> resp_status()
stn_tas_resp |> resp_status_desc()
```
\
# CHALLENGE #1
Create an API request object that asks for the temperature values in Celsius. [Solution](https://bit.ly/3w7M4gJ)
```{r chunk14}
## Your answer here
```
\
## Process the response
### Convert the body to a list
Step 1 to process the response body is to extract it as a list:
```{r chunk15}
stn_tas_lst <- stn_tas_resp |> resp_body_json()
```
\
View the structure of the list:
::: {.callout-tip}
## Pro Tip
A good way to explore the structure of the body is to open it in a View window:
```{r chunk16}
# stn_tas_lst |> View()
```
:::
```{r chunk17}
str(stn_tas_lst, max.level = 3)
```
\
### Extract vectors of data for the data frame
Get the number of stations requested:
```{r chunk18}
stn_tas_lst$SUMMARY$NUMBER_OF_OBJECTS
```
\
Extract the name of the ith station :
```{r chunk19}
i <- 1
stn_tas_stationdata <- stn_tas_lst$STATION[[i]]
(stid_chr <- stn_tas_stationdata$STID)
```
\
Extract the date-times:
```{r chunk20}
obs_dt <- stn_tas_stationdata$OBSERVATIONS$date_time |>
unlist() |>
ymd_hms(tz = "America/Los_Angeles")
## Inspect the vector:
class(obs_dt)
length(obs_dt)
head(obs_dt)
range(obs_dt)
```
\
Extract the hourly temperatures:
```{r chunk21}
obs_tas <- stn_tas_stationdata$OBSERVATIONS$air_temp_set_1 |> unlist()
head(obs_tas)
length(obs_tas)
```
\
### Create a tibble with the required structure
Bring them all together in a tibble. For this, we'll want to use `dplyr`:
```{r chunk22}
library(dplyr) |> suppressPackageStartupMessages()
# Set preferences for functions with common names
library(conflicted)
conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)
conflict_prefer("arrange", "dplyr", quiet = TRUE)
```
\
```{r chunk23}
stn_hrlytas_tbl <- tibble(stid = stid_chr,
dt = obs_dt,
tas = obs_tas)
head(stn_hrlytas_tbl)
# View(stn_hrly_tbl)
```
\
Convert from hourly to daily data:
```{r chunk24}
stn_dlytas_tbl <- stn_hrlytas_tbl |>
mutate(date = date(dt)) |>
group_by(stid, date) |>
summarise(count_obs = n(), tasmin = min(tas), tasmax = max(tas), .groups = "drop")
```
\
Inspect the results:
```{r chunk25}
stn_dlytas_tbl
```
\
Finish-up to get the final format:
`loc_id` | `period` | `date` | `tasmin` | `tasmax`
```{r chunk26}
stn_rctpast_dlytas_tbl <- stn_dlytas_tbl |>
mutate(period = "rp") |>
select(loc_id = stid, period, date, tasmin, tasmax)
head(stn_rctpast_dlytas_tbl)
# View(stn_rctpast_dlytas_tbl)
```
\
### Save results
Save the final table to disk so we can open it in other exercises:
```{r chunk27}
saveRDS(stn_rctpast_dlytas_tbl,
file = here::here("exercises/data/stn_rctpast_dlytas_tbl.Rds"))
```
\
# HOMEWORK
Bundle up this code in a function that returns a tibble of daily minimum and maximum temperature for any station in Synoptic. The function should cache the results in temp space for the current R session, which it should check first before calling the API.
```{r chunk28, eval = FALSE}
syn_dailytas <- function(stid, start_dt, end_dt, token, cache = TRUE) {
## Insert your answer here
}
```