**CRIB SHEET RULES OF THE ROAD:**

This crib sheet is provided to support access, utilization, and plotting of UCalgary optical datasets. It is intended as a base set of code that a user may edit and manipulate to serve their own needs. Crib sheets contain UCalgary verified and validated procedures for plotting and manipulating UCalgary ASI data for common use cases. Use of this crib sheet does not require acknowledgment; it is freely distributed for scientific use. Please also remember to perform due diligence on all data use. We recommend comparison with verified data products on [data.phys.ucalgary.ca](https://data.phys.ucalgary.ca) to ensure that any user output does not contradict operational summary plots. Data use must be acknowledged according to the information available for each data set; please see [data.phys.ucalgary.ca](https://data.phys.ucalgary.ca). If you encounter any issues with the data or the crib sheet, please contact the UCalgary team for support (Emma Spanswick, elspansw@ucalgary.ca). Copyright © University of Calgary.

---
# **Download and read ASI raw data using PyAuroraX**
---


Data can be downloaded from the UCalgary Space Remote Sensing Open Data Platform using one of the following methods:
  - PyAuroraX (for all-sky imager data only) <-- we'll explore this method in this crib sheet
  - FTP (ftp://data.phys.ucalgary.ca)
  - Rsync (rsync://data.phys.ucalgary.ca)
  - HTTP via browser (https://data.phys.ucalgary.ca).
  - Directly using the API (https://api.phys.ucalgary.ca)

Please note that the API is currently under development and we will do our best to keep this crib sheet up-to-date with the latest changes. If you have any questions, please reach out to the UCalgary Team (Emma Spanswick, elspansw@ucalgary.ca).

### **Crib Sheet Summary**

Below, we'll go through how to download and read data using PyAuroraX, the recommended library for working with All-Sky Imager (ASI) data that we provide.

</br>

---

</br>

## **Install dependencies**

Here we'll install [PyAuroraX](https://github.com/aurorax-space/pyaurorax), and import it.

Some helpful links:
  - [PyAuroraX documentation](https://docs.aurorax.space/code/overview)
  - [PyAuroraX API Reference](https://docs.aurorax.space/code/pyaurorax_api_reference/pyaurorax)
  - [Jupyter notebook examples](https://github.com/aurorax-space/pyaurorax/tree/main/examples/notebooks)

In [None]:
!pip install pyaurorax

Looking in indexes: https://test.pypi.org/pypi/, https://pypi.org/simple
Collecting pyaurorax==1.0.0-rc1
  Downloading https://test-files.pythonhosted.org/packages/1b/6a/7032956187fa0b6c7bdbc4c2de8b0e1d70a130947ac2acc13083cc380065/pyaurorax-1.0.0rc1-py3-none-any.whl (193 kB)
Collecting aacgmv2<3.0.0,>=2.6.2 (from pyaurorax==1.0.0-rc1)
  Downloading aacgmv2-2.6.3.tar.gz (1.6 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting cartopy<0.24.0,>=0.23.0 (from pyaurorax==1.0.0-rc1)
  Downloading Cartopy-0.23.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)

In [None]:
import pprint
import datetime
import pyaurorax

aurorax = pyaurorax.PyAuroraX()

## **Explore datasets**

All data available are organized by unique 'dataset' identifier strings, for example, 'THEMIS_ASI_RAW'. There are a few functions available for exploring and finding more information about the datasets. Let's take a look at these.

In [None]:
# list all datasets
datasets = aurorax.data.list_datasets()

# print them out in a table
aurorax.data.list_datasets_in_table()

Name                                     Provider   Level   DOI Details                                               Short Description                                                         
REGO_CALIBRATION_FLATFIELD_IDLSAV        UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers Flatfield calibration data (IDL save format)         
REGO_CALIBRATION_RAYLEIGHS_IDLSAV        UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers Rayleighs calibration data (IDL save format)         
REGO_DAILY_KEOGRAM_JPG                   UCalgary   L2      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers daily keogram summary product (JPG format)           
REGO_DAILY_KEOGRAM_PGM                   UCalgary   L2      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers daily keogram summary product (PGM format)           
REGO_DAILY_KEOGRAM_PNG             

In [None]:
# get the TREx RGB raw dataset
dataset = aurorax.data.list_datasets("TREX_RGB_RAW_NOMINAL")[0]

# view Dataset object in a few different ways
print("Object representation:\n----------------")
print(dataset)

print()
print()

# and now as a dictionary
print("Object as a dictionary:\n----------------")
pprint.pprint(dataset.__dict__)

Object representation:
----------------
Dataset(name=TREX_RGB_RAW_NOMINAL, short_description='TREx RGB All Sky Imagers 3-sec raw data', provider='UCalgary', level='L0', doi_details='https://commons.datacite.org/doi.org/10.11575/4p8e-1k65', ...)


Object as a dictionary:
----------------
{'citation': 'Spanswick, E., & Donovan, E. Transition Region Explorer - RGB '
             'Dataset [Data set]. University of Calgary. '
             'https://doi.org/10.11575/4P8E-1K65',
 'data_tree_url': 'https://data.phys.ucalgary.ca/sort_by_project/TREx/RGB/stream0',
 'doi': 'https://doi.org/10.11575/4P8E-1K65',
 'doi_details': 'https://commons.datacite.org/doi.org/10.11575/4p8e-1k65',
 'file_listing_supported': True,
 'file_reading_supported': True,
 'level': 'L0',
 'long_description': 'Transition Region Explorer (TREx) full-color RGB All Sky '
                     'Imager array. More information can be found at '
                     'https://data.phys.ucalgary.ca',
 'name': 'TREX_RGB_RAW_NOMINAL'

In [None]:
# there is also a pretty_print function to show all information in a different way
dataset.pretty_print()

Dataset:
  citation                   : Spanswick, E., & Donovan, E. Transition Region Explorer - RGB Dataset [Data set]. University of Calgary. https://doi.org/10.11575/4P8E-1K65
  data_tree_url              : https://data.phys.ucalgary.ca/sort_by_project/TREx/RGB/stream0
  doi                        : https://doi.org/10.11575/4P8E-1K65
  doi_details                : https://commons.datacite.org/doi.org/10.11575/4p8e-1k65
  file_listing_supported     : True
  file_reading_supported     : True
  level                      : L0
  long_description           : Transition Region Explorer (TREx) full-color RGB All Sky Imager array. More information can be found at https://data.phys.ucalgary.ca
  name                       : TREX_RGB_RAW_NOMINAL
  provider                   : UCalgary
  short_description          : TREx RGB All Sky Imagers 3-sec raw data


## **Explore observatories**

Each ASI array has a set of observatories where instruments are deployed. We can easily view observatory information using the `list_observatories()` and `list_observatories_in_table()` functions. We show some examples below.

In [None]:
# list all observatories
#
# we must supply an 'instrument array' parameter to this function. View the
# documentation or integrated type hinting in VSCode to see possible instrument
# array choices.
observatories = aurorax.data.list_observatories("themis_asi")    # some choices are: themis_asi, rego, trex_nir, trex_rgb, trex_blue

# print them out in a table
aurorax.data.list_observatories_in_table("themis_asi")

UID    Full Name                   Geodetic Latitude   Geodetic Longitude   Provider
atha   Athabasca, AB, Canada       54.6                -113.64              UCalgary
chbg   Chibougamau, QC, Canada     49.81               -74.42               UCalgary
ekat   Ekati, NWT, Canada          64.73               -110.67              UCalgary
fsim   Fort Simpson, NWT, Canada   61.76               -121.27              UCalgary
fsmi   Fort Smith, NWT, Canada     60.03               -111.93              UCalgary
fykn   Fort Yukon, AK, USA         66.56               -145.21              UCalgary
gako   Gakona, AK, USA             62.41               -145.16              UCalgary
gbay   Goose Bay, NL, Canada       53.32               -60.46               UCalgary
gill   Gillam, MB, Canada          56.38               -94.64               UCalgary
inuv   Inuvik, NWT, Canada         68.41               -133.77              UCalgary
kapu   Kapuskasing, ON, Canada     49.39               -82.32    

In [None]:
# view the first observatory
print(observatories[0])

print()

# show all values in the Observatory class
pprint.pprint(observatories[0].__dict__)

Observatory(uid=atha, full_name='Athabasca, AB, Canada', geodetic_latitude=54.6, geodetic_longitude=-113.64, provider='UCalgary')

{'full_name': 'Athabasca, AB, Canada',
 'geodetic_latitude': 54.6,
 'geodetic_longitude': -113.64,
 'provider': 'UCalgary',
 'uid': 'atha'}


In [None]:
# there is also a pretty_print function to show all information in a different way
observatories[0].pretty_print()

Observatory:
  full_name             : Athabasca, AB, Canada
  geodetic_latitude     : 54.6
  geodetic_longitude    : -113.64
  provider              : UCalgary
  uid                   : atha


## **Download the data**

We are going to download an hour of THEMIS data from the camera in Athabasca, AB.

The `download()` function provides the ability to download a timeframe of data for a specific dataset. As a dataset can have data from many different sites, the site is an optional flag. There are also a few more options to choose from, including disabling or customizing the progress bar, forcing the data to be redownloaded even if it exists locally already, or adjusting the number of parallel download streams. Explore the API reference documentation for the function to learn more.

By default, PyAuroraX saves data to the path stored in the `aurorax.download_output_root_path` attribute, which defaults to a location in your home directory. If you want to change this, you can set the attribute like so:

```python
aurorax.download_output_root_path = "some other path"
```

<small>Note: If you are running this notebook from within VSCode and you want the progress bar to be **dark theme**, insert the HTML markup [found here](https://github.com/microsoft/vscode-jupyter/issues/7161#issuecomment-1616627670) at the top of the notebook.</small>

In [None]:
# We can set the path of where we want to save data to. By default,
# PyAuroraX saves data to your home directory. Since we can be running
# this crib sheet on Google Colab, we're going to do a special case
# and set the path if running in Colab. We'll leave the default if
# running in a different environment.
import sys
if ("google.colab" in sys.modules):
    aurorax.download_output_root_path = "/content/ucalgary_data"
    aurorax.read_tar_temp_path = "/content/ucalgary_data/tar_temp_working"
print(aurorax)

PyAuroraX(download_output_root_path='/content/ucalgary_data', read_tar_temp_path='/content/ucalgary_data/tar_temp_working', api_base_url='https://api.aurorax.space', api_headers={'content-type': 'application/json', 'user-agent': 'python-pyaurorax/1.0.0-rc1'}, api_timeout=10, api_key='None', srs_obj=PyUCalgarySRS(...))


In [None]:
# download an hour of THEMIS ASI data
dataset_name = "THEMIS_ASI_RAW"
start_dt = datetime.datetime(2021, 11, 4, 9, 0)
end_dt = datetime.datetime(2021, 11, 4, 9, 59)
site_uid = "atha"
r = aurorax.data.ucalgary.download(dataset_name, start_dt, end_dt, site_uid=site_uid)

Downloading THEMIS_ASI_RAW files:   0%|          | 0.00/128M [00:00<?, ?B/s]

If you want to download 5 minutes of data from all sites, you can do that too by adjusting the start and end times, and excluding the site_uid parameter.

In [None]:
# download 5 minutes of data from all sites
dataset_name = "THEMIS_ASI_RAW"
start_dt = datetime.datetime(2021, 11, 4, 9, 0)
end_dt = datetime.datetime(2021, 11, 4, 9, 4)
_ = aurorax.data.ucalgary.download(dataset_name, start_dt, end_dt)

Downloading THEMIS_ASI_RAW files:   0%|          | 0.00/136M [00:00<?, ?B/s]

When data is downloaded, a `FileDownloadResult` object is returned that has details about the data just downloaded. Though this information is not normally used by many of the common workflows, it is still available if needed.

In [None]:
# view information about the downloaded data
print(r)
print()

pprint.pprint(r.__dict__)

FileDownloadResult(filenames=[PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0900_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0901_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0902_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0903_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0904_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0905_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0906_atha_themis02_full.pgm.gz'), PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW/2021/11/04/atha_themis02/ut09/20211104_0907_atha_themis02_full.pgm.

## **Read the data**

The data reading routines simply take in a list of filenames on your computer. This list is returned by a `download()` call, but it can also be built with `glob` or similar by searching your computer for the files.
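As a minimal sketch of the `glob` route, the example below builds a file list with `pathlib`. So that it runs anywhere without downloading data, it first creates a small throwaway directory tree; the directory layout and filename pattern are assumptions copied from the download output shown earlier in this crib sheet, and the three empty files are stand-ins rather than real data.

```python
import tempfile
from pathlib import Path

# Build a throwaway directory tree that mimics the download layout seen
# earlier (an assumption for illustration; the files are empty stand-ins).
root = Path(tempfile.mkdtemp()) / "ucalgary_data"
hour_dir = root / "THEMIS_ASI_RAW" / "2021" / "11" / "04" / "atha_themis02" / "ut09"
hour_dir.mkdir(parents=True)
for minute in range(3):
    (hour_dir / f"20211104_09{minute:02d}_atha_themis02_full.pgm.gz").touch()

# Search for the files. Sorting keeps them in chronological order because
# each filename starts with a YYYYMMDD_HHMM timestamp.
file_list = sorted(hour_dir.glob("*_atha_themis02_full.pgm.gz"))

for f in file_list:
    print(f.name)
```

With real downloads, a list built this way can be used in place of `r.filenames` when calling the read routines.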

There are two methods available for reading data:

1) using the generic method (most common)
2) using a specific dataset read function

The generic method is the easiest and most common way. If more control is wanted, you can use the specific read functions directly. The generic method simply uses the dataset name to determine which specific read function to use; there are no other differences.


In [None]:
# let's show the generic method first, since it is the easiest way
#
# a Data object is returned
data = aurorax.data.ucalgary.read(r.dataset, r.filenames, n_parallel=2)

print(data)
print()

# Data objects have a print function to nicely view the information in it. You'll
# find that many classes in PyAuroraX have this pretty_print() function to help view
# the information. It can be handy at times.
data.pretty_print()

Data(data=array(dims=(256, 256, 1200), dtype=uint16), timestamp=[1200 datetimes], metadata=[1200 dictionaries], problematic_files=[], calibrated_data=None, dataset=Dataset(name=THEMIS_ASI_RAW, short_description='THEMIS All Sky Imagers 3-se...))

Data:
  data                  : array(dims=(256, 256, 1200), dtype=uint16)
  timestamp             : [1200 datetimes]
  metadata              : [1200 dictionaries]
  problematic_files     : []
  calibrated_data       : None
  dataset               : Dataset(name=THEMIS_ASI_RAW, short_description='THEMIS All Sky Imagers 3-se...)
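
The 1200 frames reported above can be sanity-checked with a little arithmetic. The sketch below assumes one file per minute (as the filenames suggest) and a 3-second imaging cadence; both are assumptions for illustration, and a real hour of data may contain fewer frames if the camera missed any.

```python
import datetime

start_dt = datetime.datetime(2021, 11, 4, 9, 0)
end_dt = datetime.datetime(2021, 11, 4, 9, 59)

# Assumptions: one file per minute and one frame every 3 seconds.
seconds_per_frame = 3
frames_per_file = 60 // seconds_per_frame

# The end time is inclusive at minute resolution, hence the +1.
n_files = int((end_dt - start_dt).total_seconds() // 60) + 1
expected_frames = n_files * frames_per_file

print(n_files, "files x", frames_per_file, "frames =", expected_frames)
```

Comparing this expected count against the last dimension of the data array is a quick way to spot missing frames or files.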


Now let's show the second method, whereby we use the specific read routine. We know that the data we're reading is THEMIS raw data, so we'll use the `read_themis()` function.

In [None]:
# since we know we're reading in THEMIS raw data, we can also use the specific read routine. In most
# circumstances, this method isn't necessary to use.
data = aurorax.data.ucalgary.readers.read_themis(r.filenames, n_parallel=2, dataset=r.dataset)

print(data)
print()

data.pretty_print()

Data(data=array(dims=(256, 256, 1200), dtype=uint16), timestamp=[1200 datetimes], metadata=[1200 dictionaries], problematic_files=[], calibrated_data=None, dataset=Dataset(name=THEMIS_ASI_RAW, short_description='THEMIS All Sky Imagers 3-se...))

Data:
  data                  : array(dims=(256, 256, 1200), dtype=uint16)
  timestamp             : [1200 datetimes]
  metadata              : [1200 dictionaries]
  problematic_files     : []
  calibrated_data       : None
  dataset               : Dataset(name=THEMIS_ASI_RAW, short_description='THEMIS All Sky Imagers 3-se...)


## **Managing downloaded data**

There are also a few functions available to help manage the downloaded data. We can see the local disk usage, and can clean it out if it's getting too big.

In [None]:
# to view the amount of data that is currently downloaded, do the following
aurorax.show_data_usage()

Dataset name                        Size     
THEMIS_ASI_RAW                      253.6 MB 
THEMIS_ASI_SKYMAP_IDLSAV            7.1 MB   
REGO_RAW                            5.5 MB   
REGO_CALIBRATION_FLATFIELD_IDLSAV   1.4 MB   
REGO_CALIBRATION_RAYLEIGHS_IDLSAV   443 Bytes

Total size: 267.7 MB


In [None]:
# alternatively, you can get the data usage information returned as a dictionary
data_usage_dict = aurorax.show_data_usage(return_dict=True)

pprint.pprint(data_usage_dict)

{'REGO_CALIBRATION_FLATFIELD_IDLSAV': {'path_obj': PosixPath('/content/ucalgary_data/REGO_CALIBRATION_FLATFIELD_IDLSAV'),
                                       'size_bytes': 1425416,
                                       'size_str': '1.4 MB'},
 'REGO_CALIBRATION_RAYLEIGHS_IDLSAV': {'path_obj': PosixPath('/content/ucalgary_data/REGO_CALIBRATION_RAYLEIGHS_IDLSAV'),
                                       'size_bytes': 443,
                                       'size_str': '443 Bytes'},
 'REGO_RAW': {'path_obj': PosixPath('/content/ucalgary_data/REGO_RAW'),
              'size_bytes': 5528318,
              'size_str': '5.5 MB'},
 'THEMIS_ASI_RAW': {'path_obj': PosixPath('/content/ucalgary_data/THEMIS_ASI_RAW'),
                    'size_bytes': 253634069,
                    'size_str': '253.6 MB'},
 'THEMIS_ASI_SKYMAP_IDLSAV': {'path_obj': PosixPath('/content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV'),
                              'size_bytes': 7078784,
                              's

To clean up all data we've downloaded, you can do:

```python
aurorax.purge_download_output_root_path()
```

Alternatively, you can delete data manually. The downloads are just regular files on your computer, so they are easy to manage.
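
As a rough sketch of manual management using only the Python standard library, the example below totals the size of each dataset directory and then removes one of them. The helper `dataset_sizes()` is hypothetical (it is not part of PyAuroraX), and the example runs against a throwaway temporary directory with fake files rather than your real download path.

```python
import shutil
import tempfile
from pathlib import Path


def dataset_sizes(root: Path) -> dict:
    """Return {dataset directory name: total size in bytes} under root."""
    sizes = {}
    for dataset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        sizes[dataset_dir.name] = sum(
            f.stat().st_size for f in dataset_dir.rglob("*") if f.is_file()
        )
    return sizes


# Throwaway stand-in for the download root, holding two fake datasets.
root = Path(tempfile.mkdtemp())
for name, n_bytes in [("THEMIS_ASI_RAW", 2048), ("REGO_RAW", 512)]:
    (root / name).mkdir()
    (root / name / "example.bin").write_bytes(b"\0" * n_bytes)

before = dataset_sizes(root)
print(before)

# Remove a single dataset's files, leaving the others in place.
shutil.rmtree(root / "REGO_RAW")
after = dataset_sizes(root)
print(after)
```

Removing one dataset directory at a time is a gentler alternative to purging everything, since `shutil.rmtree()` deletes files permanently.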

## **Skymaps**

Skymap files are used for projecting ASI image data on a map. More information about why this is important can be found in the other crib sheets.

In [None]:
# search datasets for skymaps
aurorax.data.list_datasets_in_table(name="SKYMAP")

Name                       Provider   Level   DOI Details                                               Short Description                                          
REGO_SKYMAP_IDLSAV         UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers skymap data (IDL save format)         
THEMIS_ASI_SKYMAP_IDLSAV   UCalgary   L3      None                                                      THEMIS All Sky Imagers skymap data (IDL save format)       
TREX_BLUE_SKYMAP_IDLSAV    UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/80pf-0p02   TREx Blueline All Sky Imagers skymap data (IDL save format)
TREX_NIR_SKYMAP_IDLSAV     UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/98w7-jp47   TREx NIR All Sky Imagers skymap data (IDL save format)     
TREX_RGB_SKYMAP_IDLSAV     UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/4p8e-1k65   TREx RGB All Sky Imagers skymap data (IDL save format)     


In [None]:
# we'll set our dataset for use later
dataset = aurorax.data.list_datasets(name="THEMIS_ASI_SKYMAP_IDLSAV")[0]

When selecting a skymap to use for projecting an image on a map, we have two methods available to us:

1. choosing manually
2. using the `download_best_skymap()` function to choose automatically

Skymaps are generated for each site, and for a given time period. It is important to choose a skymap that is valid for the date of the data you're working with, otherwise the image may not appear accurately when projected on a map.

### **Choosing a skymap manually**

In [None]:
# First, let's choose the skymap we want manually. Let's assume we are working on data
# from the Gillam THEMIS ASI on 2021-11-04.
#
# We'll download the skymaps for a few years around that time, and then we'll choose which one we want after
r = aurorax.data.ucalgary.download(
    "THEMIS_ASI_SKYMAP_IDLSAV",
    datetime.datetime(2021, 1, 1, 0, 0),
    datetime.datetime(2023, 1, 1, 0, 0),
    site_uid="gill",
)
r.filenames

Downloading THEMIS_ASI_SKYMAP_IDLSAV files:   0%|          | 0.00/7.08M [00:00<?, ?B/s]

[PosixPath('/content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV/gill/gill_20210308/themis_skymap_gill_20210308-+_v02.sav'),
 PosixPath('/content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV/gill/gill_20220301/themis_skymap_gill_20220301-+_v02.sav')]

Looks like we have a couple of skymaps to choose from around 2021-11-04. We'll choose the first one, because its date (2021-03-08) falls before 2021-11-04, whereas the second one's date (2022-03-01) falls after it.

The date indicates the first date it is valid for. There are some cases where a later or earlier skymap can be used. That is a situation where you can play around and try different skymaps, looking for which one works best for you. Most skymaps have small differences, but some have large ones that you'll notice very quickly when working with the projected data on a map.
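
The selection rule described above (take the most recent skymap whose date is on or before the date of interest) can be sketched in a few lines. This is an illustration only: the helpers `skymap_start_date()` and `pick_skymap()` are hypothetical, the filename pattern is inferred from the two filenames listed above, and this is not a description of how `download_best_skymap()` works internally.

```python
import datetime
import re
from pathlib import Path


def skymap_start_date(path: Path) -> datetime.date:
    """Parse the YYYYMMDD 'valid from' date out of a skymap filename."""
    match = re.search(r"_(\d{8})-", path.name)
    if match is None:
        raise ValueError(f"no date found in {path.name}")
    return datetime.datetime.strptime(match.group(1), "%Y%m%d").date()


def pick_skymap(paths, target: datetime.date) -> Path:
    """Pick the latest skymap whose start date is on or before target."""
    candidates = [p for p in paths if skymap_start_date(p) <= target]
    if not candidates:
        raise ValueError("no skymap starts on or before the target date")
    return max(candidates, key=skymap_start_date)


filenames = [
    Path("themis_skymap_gill_20210308-+_v02.sav"),
    Path("themis_skymap_gill_20220301-+_v02.sav"),
]
chosen = pick_skymap(filenames, datetime.date(2021, 11, 4))
print(chosen.name)
```

As noted above, there are cases where an earlier or later skymap works better, so treat a rule like this as a starting point rather than a final answer.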


In [None]:
# Now that we know which one we'll use, we can read it in.
#
# You can also read in all of them and choose later using the resulting Data object.
skymap_data = aurorax.data.ucalgary.read(dataset, r.filenames[0])

print(skymap_data)
print()
skymap_data.pretty_print()

print()
skymap_data.data[0].pretty_print()

Data(data=[1 Skymap object], timestamp=[], metadata=[], problematic_files=[], calibrated_data=None, dataset=Dataset(name=THEMIS_ASI_SKYMAP_IDLSAV, short_description='THEMIS All Sky Im...))

Data:
  data                  : [1 Skymap object]
  timestamp             : []
  metadata              : []
  problematic_files     : []
  calibrated_data       : None
  dataset               : Dataset(name=THEMIS_ASI_SKYMAP_IDLSAV, short_description='THEMIS All Sky Im...)

Skymap:
  filename               : /content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV/gill/gill_20210308/themis_skymap_gill_20210308-+_v02.sav
  full_azimuth           : array(dims=(256, 256), dtype=>f4)
  full_elevation         : array(dims=(256, 256), dtype=>f4)
  full_map_altitude      : array(dims=(3,), dtype=>f4)
  full_map_latitude      : array(dims=(3, 257, 257), dtype=>f4)
  full_map_longitude     : array(dims=(3, 257, 257), dtype=>f4)
  generation_info        : SkymapGenerationInfo(...)
  get_precalculated_altitudes: <bound

### **Automatically choosing a skymap**

You can also let the library choose the skymap for you using the `download_best_skymap()` function.

In [None]:
# set params
dataset_name = "THEMIS_ASI_SKYMAP_IDLSAV"
site_uid = "gill"
dt = datetime.datetime(2021, 11, 4)

# get the recommendation
r = aurorax.data.ucalgary.download_best_skymap(dataset_name, site_uid, dt)
r.filenames

[PosixPath('/content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV/gill/gill_20210308/themis_skymap_gill_20210308-+_v02.sav')]

In [None]:
# now that we have the skymap file, we'll read it
skymap_data = aurorax.data.ucalgary.read(dataset, r.filenames)

print(skymap_data)
print()
skymap_data.pretty_print()

print()
skymap_data.data[0].pretty_print()

Data(data=[1 Skymap object], timestamp=[], metadata=[], problematic_files=[], calibrated_data=None, dataset=Dataset(name=THEMIS_ASI_SKYMAP_IDLSAV, short_description='THEMIS All Sky Im...))

Data:
  data                  : [1 Skymap object]
  timestamp             : []
  metadata              : []
  problematic_files     : []
  calibrated_data       : None
  dataset               : Dataset(name=THEMIS_ASI_SKYMAP_IDLSAV, short_description='THEMIS All Sky Im...)

Skymap:
  filename               : /content/ucalgary_data/THEMIS_ASI_SKYMAP_IDLSAV/gill/gill_20210308/themis_skymap_gill_20210308-+_v02.sav
  full_azimuth           : array(dims=(256, 256), dtype=>f4)
  full_elevation         : array(dims=(256, 256), dtype=>f4)
  full_map_altitude      : array(dims=(3,), dtype=>f4)
  full_map_latitude      : array(dims=(3, 257, 257), dtype=>f4)
  full_map_longitude     : array(dims=(3, 257, 257), dtype=>f4)
  generation_info        : SkymapGenerationInfo(...)
  get_precalculated_altitudes: <bound

## **Calibration data**

Calibration data is used for converting data to Rayleighs, or applying corrections such as a flatfield. More information about why this is important can be found in the other crib sheets.

In [None]:
# search datasets for calibrations
aurorax.data.list_datasets_in_table(name="CALIBRATION")

Name                                     Provider   Level   DOI Details                                               Short Description                                                         
REGO_CALIBRATION_FLATFIELD_IDLSAV        UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers Flatfield calibration data (IDL save format)         
REGO_CALIBRATION_RAYLEIGHS_IDLSAV        UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/z7x6-5c42   REGO All Sky Imagers Rayleighs calibration data (IDL save format)         
TREX_BLUE_CALIBRATION_FLATFIELD_IDLSAV   UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/80pf-0p02   TREx Blueline All Sky Imagers Flatfield calibration data (IDL save format)
TREX_BLUE_CALIBRATION_RAYLEIGHS_IDLSAV   UCalgary   L3      https://commons.datacite.org/doi.org/10.11575/80pf-0p02   TREx Blueline All Sky Imagers Rayleighs calibration data (IDL save format)
TREX_NIR_CALIBRATION_FLATFIELD_IDLS

When selecting calibration files to use for converting ASI image counts to Rayleighs, we have two methods available to us:

1. choosing manually
2. using the `download_best_flatfield_calibration()` or `download_best_rayleighs_calibration()` function to choose automatically

Cameras are calibrated before they are deployed to the field, and after any in-house repairs are performed. Flatfield and Rayleighs calibration files exist for each specific camera detector. A detector can live at multiple sites throughout the years of operating the instrument array, which is why calibration files are not associated with a specific site. Instead, we use the device UID value of the data to know which calibration files to use for the data we're processing.

### **Choosing a calibration file manually**

In [None]:
# download a minute of REGO data
dataset_name = "REGO_RAW"
start_dt = datetime.datetime(2021, 11, 4, 6, 0)
end_dt = datetime.datetime(2021, 11, 4, 6, 0)
site_uid = "resu"
r = aurorax.data.ucalgary.download(dataset_name, start_dt, end_dt, site_uid=site_uid, progress_bar_disable=True)

# determine the device uid
#
# you can either inspect the URLs and determine it by the filename, or you can read the
# data and inspect the device UID field of the metadata
print(r.filenames[0])
print()

data = aurorax.data.ucalgary.read(r.dataset, r.filenames)
print(data.metadata[0]["Imager unique ID"])

/content/ucalgary_data/REGO_RAW/2021/11/04/resu_rego-655/ut06/20211104_0600_resu_rego-655_6300.pgm.gz

rego-655
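
If you prefer the filename route, the device UID can be pulled out with basic string handling. The sketch below assumes the filename pattern `<date>_<time>_<site>_<instrument>-<uid>_<suffix>`, inferred from the single filename printed above; the helper `device_uid_from_filename()` is hypothetical and not part of PyAuroraX.

```python
from pathlib import Path


def device_uid_from_filename(path: Path) -> str:
    """Return the numeric device UID from a REGO raw filename.

    Assumes the pattern <date>_<time>_<site>_<instrument>-<uid>_<suffix>.
    """
    fields = path.name.split("_")
    instrument_and_uid = fields[3]           # e.g. "rego-655"
    return instrument_and_uid.split("-")[1]  # e.g. "655"


example = Path("20211104_0600_resu_rego-655_6300.pgm.gz")
device_uid = device_uid_from_filename(example)
print(device_uid)
```

Reading the metadata, as done above, is the more robust option if filenames ever deviate from this pattern.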


In [None]:
# now that we know the device UID we are interested in, we can get a list
# of all the flatfield and rayleighs calibration files, and then choose which
# one we want to download
start_dt = datetime.datetime(2014, 1, 1, 0, 0)
end_dt = datetime.datetime.now()
device_uid = "655"
r_rayleighs = aurorax.data.ucalgary.get_urls("REGO_CALIBRATION_RAYLEIGHS_IDLSAV", start_dt, end_dt, device_uid=device_uid)
pprint.pprint(r_rayleighs.urls)

r_flatfield = aurorax.data.ucalgary.get_urls("REGO_CALIBRATION_FLATFIELD_IDLSAV", start_dt, end_dt, device_uid=device_uid)
pprint.pprint(r_flatfield.urls)

['https://data.phys.ucalgary.ca/sort_by_project/GO-Canada/REGO/calibration/REGO_Rayleighs_15655_20141002-+_v01.sav']
['https://data.phys.ucalgary.ca/sort_by_project/GO-Canada/REGO/calibration/REGO_flatfield_15655_20141002-+_v01.sav']


In [None]:
# this is simple as there is only one to choose from
#
# now let's download the data
d_rayleighs = aurorax.data.ucalgary.download_using_urls(r_rayleighs, progress_bar_disable=True)
d_flatfield = aurorax.data.ucalgary.download_using_urls(r_flatfield, progress_bar_disable=True)

print(d_rayleighs.filenames)

print(d_flatfield.filenames)

[PosixPath('/content/ucalgary_data/REGO_CALIBRATION_RAYLEIGHS_IDLSAV/REGO_Rayleighs_15655_20141002-+_v01.sav')]
[PosixPath('/content/ucalgary_data/REGO_CALIBRATION_FLATFIELD_IDLSAV/REGO_flatfield_15655_20141002-+_v01.sav')]


In [None]:
# now that we have the calibration files, we'll read them
cal_rayleighs_data = aurorax.data.ucalgary.read(d_rayleighs.dataset, d_rayleighs.filenames)
cal_flatfield_data = aurorax.data.ucalgary.read(d_flatfield.dataset, d_flatfield.filenames)

print(cal_rayleighs_data)
print(cal_flatfield_data)
print()

cal_rayleighs_data.data[0].pretty_print()
print()
cal_flatfield_data.data[0].pretty_print()


Data(data=[1 Calibration object], timestamp=[], metadata=[], problematic_files=[], calibrated_data=None, dataset=Dataset(name=REGO_CALIBRATION_RAYLEIGHS_IDLSAV, short_description='REGO All...))
Data(data=[1 Calibration object], timestamp=[], metadata=[], problematic_files=[], calibrated_data=None, dataset=Dataset(name=REGO_CALIBRATION_FLATFIELD_IDLSAV, short_description='REGO All...))

Calibration:
  dataset                       : Dataset(...)
  detector_uid                  : 15655
  filename                      : /content/ucalgary_data/REGO_CALIBRATION_RAYLEIGHS_IDLSAV/REGO_Rayleighs_15655_20141002-+_v01.sav
  flat_field_multiplier         : None
  generation_info               : CalibrationGenerationInfo(...)
  rayleighs_perdn_persecond     : 10.399999618530273
  version                       : v01

Calibration:
  dataset                       : Dataset(...)
  detector_uid                  : 15655
  filename                      : /content/ucalgary_data/REGO_CALIBRATION_FLATFIELD_

### **Automatically choosing a calibration file**

You can also let the library choose the calibration for you using the `download_best_flatfield_calibration()` and `download_best_rayleighs_calibration()` functions.

In [None]:
# set params
device_uid = "654"
dt = datetime.datetime(2021, 11, 4)

# get the recommendations
r_rayleighs = aurorax.data.ucalgary.download_best_rayleighs_calibration("REGO_CALIBRATION_RAYLEIGHS_IDLSAV", device_uid, dt)
r_flatfield = aurorax.data.ucalgary.download_best_flatfield_calibration("REGO_CALIBRATION_FLATFIELD_IDLSAV", device_uid, dt)

# show results
print(r_rayleighs.filenames)
print(r_flatfield.filenames)

[PosixPath('/content/ucalgary_data/REGO_CALIBRATION_RAYLEIGHS_IDLSAV/REGO_Rayleighs_15654_20210806-+_v02.sav')]
[PosixPath('/content/ucalgary_data/REGO_CALIBRATION_FLATFIELD_IDLSAV/REGO_flatfield_15654_20210806-+_v02.sav')]


In [None]:
# now that we have the calibration files, we'll read them
cal_rayleighs_data = aurorax.data.ucalgary.read(r_rayleighs.dataset, r_rayleighs.filenames)
cal_flatfield_data = aurorax.data.ucalgary.read(r_flatfield.dataset, r_flatfield.filenames)

cal_rayleighs_data.data[0].pretty_print()
print()
cal_flatfield_data.data[0].pretty_print()

Calibration:
  dataset                       : Dataset(...)
  detector_uid                  : 15654
  filename                      : /content/ucalgary_data/REGO_CALIBRATION_RAYLEIGHS_IDLSAV/REGO_Rayleighs_15654_20210806-+_v02.sav
  flat_field_multiplier         : None
  generation_info               : CalibrationGenerationInfo(...)
  rayleighs_perdn_persecond     : 10.137431837782506
  version                       : v02

Calibration:
  dataset                       : Dataset(...)
  detector_uid                  : 15654
  filename                      : /content/ucalgary_data/REGO_CALIBRATION_FLATFIELD_IDLSAV/REGO_flatfield_15654_20210806-+_v02.sav
  flat_field_multiplier         : array(dims=(512, 512), dtype=>f8)
  generation_info               : CalibrationGenerationInfo(...)
  rayleighs_perdn_persecond     : None
  version                       : v02
