Using the rIACI Package for the Iberian Actuarial Climate Index Calculations

Introduction

In our paper (Zhou et al. 2023), we follow the work of North American actuaries, [Actuaries Climate Index (ACI)], and have created an index to show how climate changes in the Iberian Peninsula, the Iberian Actuarial Climate Index. The rIACI package is designed for climatologists and researchers working with climate data, particularly those interested in calculating climate indices such as the Iberian Actuarial Climate Index (IACI). This package provides tools to:

  • Download ERA5-Land data from the ECMWF Climate Data Store.

  • Process and merge NetCDF files. Export data to CSV format for individual grid points.

  • Calculate various standardized climate indices.

  • Aggregate indices to compute the IACI.

This vignette will guide you through the steps to use the rIACI package effectively.

Prerequisites

Before using the rIACI package, ensure you have the following:

  • An ECMWF user ID and API key to access the Climate Data Store.

  • Python installed on your system, along with the necessary Python packages (xarray, pandas, numpy, etc.).

  • The ecmwfr package installed in R for downloading data.

  • The reticulate package in R for interfacing with Python scripts.

Installation

You can install the rIACI package from GitHub using the devtools package:

# Install devtools if you haven't already
install.packages("devtools")

# Install rIACI from GitHub
devtools::install_github("https://github.com/Nan-Z-byte/rIACI")

Workflow Overview

The general workflow using rIACI involves the following steps:

  1. Download ERA5-Land data using the download_data() function.

  2. Process the downloaded data with process_data(), export_data_to_csv(), and csv_to_netcdf().

  3. Create a climate input object using climate_input().

  4. Calculate various climate indices such as TX90p, TX10p, TN90p, TN10p, Rx5day, CDD, W90p, and Sea.

  5. Integrate sea level data using sea_input().

  6. Generate the IACI output with iaci_output() or output_all().

Downloading ERA5-Land Data

The download_data() function allows you to download ERA5-Land data from the ECMWF Climate Data Store for specified variables, years, months, and geographical areas..

download_data(start_year, end_year,
              start_month = 1,
              end_month = 12,
              variables = c("10m_u_component_of_wind",
                            "10m_v_component_of_wind",
                            "2m_temperature",
                            "total_precipitation"),
              dataset = "reanalysis-era5-land",
              area = c(North, West, South, East),
              output_dir = "cds_data",
              user_id, user_key,
              max_retries = 3,
              retry_delay = 5,
              timeout = 7200)

Parameters

  • start_year (Integer): Starting year for data download.

  • end_year (Integer): Ending year for data download.

  • start_month (Integer, default 1): Starting month.

  • end_month (Integer, default 12): Ending month.

  • variables (Character vector): Variables to download. Default includes common variables:

    • "10m_u_component_of_wind"

    • "10m_v_component_of_wind"

    • "2m_temperature"

    • "total_precipitation"

  • dataset (Character, default "reanalysis-era5-land"): Dataset short name.

  • area (Numeric vector): Geographical area as c(North, West, South, East). Default is global.

  • output_dir (Character, default "cds_data"): Directory to save downloaded data.

  • user_id (Character): Your ECMWF user ID.

  • user_key (Character): Your ECMWF API key.

  • max_retries (Integer, default 3): Maximum retry attempts for download failures.

  • retry_delay (Numeric, default 5): Delay between retries in seconds.

  • timeout (Numeric, default 7200): Timeout duration for each request in seconds.

Example

# Set your ECMWF user ID and key
user_id <- "your_user_id"
user_key <- "your_api_key"

# Define the geographical area (North, West, South, East)
# Example: Iberian Peninsula roughly bounded by 44N, -10W, 35N, 5E
area_iberia <- c(44, -10, 35, 5)

# Download data form the year 1960 to 2023
download_data(
  start_year = 1960,
  end_year = 2023,
  area = area_iberia,
  user_id = user_id,
  user_key = user_key
)

Processing Climate Data

After downloading the data, you may need to process it before analysis. The package provides functions to handle NetCDF files and convert them to CSV format for easier manipulation.

Process Data Function

process_data()

Processes NetCDF files in the input directory and saves merged and processed data to the output directory.

Usage

process_data(input_dir, output_dir)

Parameters

  • input_dir (Character): Directory containing input NetCDF files.

  • output_dir (Character): Directory to save output files.

Example

input_directory <- "cds_data"
output_directory <- "processed_data"

process_data(input_dir = input_directory, output_dir = output_directory)

Export Data to CSV Function

export_data_to_csv()

Exports data from a NetCDF file to CSV files, one for each latitude and longitude point.

Usage

export_data_to_csv(nc_file, output_dir)

Parameters

  • nc_file (Character): Path to the NetCDF file.

  • output_dir (Character): Output directory to save CSV files.

Example

netcdf_file <- "processed_data/2020_01.nc"
csv_output_directory <- "csv_output"

export_data_to_csv(nc_file = netcdf_file, output_dir = csv_output_directory)

CSV to NetCDF Function

csv_to_netcdf()

Merges CSV files in a specified directory into a single NetCDF file.

Usage

csv_to_netcdf(csv_dir, output_file)

Parameters

  • csv_dir (Character): Directory containing CSV files. Filenames should follow the 'lat_lon.csv' format.

  • output_file (Character): Path to the output NetCDF file.

Example

csv_directory <- "csv_output"
output_netcdf_file <- "final_data/merged_data.nc"

csv_to_netcdf(csv_dir = csv_directory, output_file = output_netcdf_file)

Creating a Climate Input Object

Before calculating climate indices, create a climate input object that organizes your climate data.

climate_input()

Creates a climate input object containing processed climate data and relevant statistics.

Usage

climate_input(tmax = NULL, tmin = NULL, prec = NULL, wind = NULL,
              dates = NULL,base.range = c(1961, 1990), n = 5,
              quantiles = NULL,
              temp.qtiles = c(0.10, 0.90), wind.qtile = 0.90,
              max.missing.days = c(annual = 15, monthly = 3),
              min.base.data.fraction_present = 0.1)

Parameters

  • tmax (Numeric vector): Maximum temperature data.

  • tmin (Numeric vector): Minimum temperature data.

  • prec (Numeric vector): Precipitation data.

  • wind (Numeric vector): Wind speed data.

  • dates (Date vector): Dates corresponding to the data.

  • base.range (Numeric vector, default c(1961, 1990)): Base range years for calculations.

  • n (Integer, default 5): Window size for running averages.

  • quantiles (List, optional): Pre-calculated quantiles.

  • temp.qtiles (Numeric vector, default c(0.10, 0.90)): Temperature quantiles to calculate.

  • wind.qtile (Numeric, default 0.90): Wind quantile to calculate.

  • max.missing.days (Named numeric vector, default c(annual = 15, monthly = 3)): Maximum allowed missing days.

  • min.base.data.fraction_present (Numeric, default 0.1): Minimum fraction of data required in base range.

Example

# Assume you have a CSV file with climate data
climate_data <- read.csv("processed_data/climate_data.csv")

# Create climate input object
ci <- climate_input(
  tmax = climate_data$TMAX,
  tmin = climate_data$TMIN,
  prec = climate_data$PRCP,
  wind = climate_data$WIND,
  dates = as.Date(climate_data$DATE, format = "%Y-%m-%d")
)

Calculating Climate Indices

The rIACI package provides functions to calculate various climate indices, both standardized and non-standardized.

Available Indices

  • TX90p: Percentage of days when maximum temperature is above the 90th percentile.

  • TX10p: Percentage of days when maximum temperature is below the 10th percentile.

  • TN90p: Percentage of days when minimum temperature is above the 90th percentile.

  • TN10p: Percentage of days when minimum temperature is below the 10th percentile.

  • Rx5day: Maximum consecutive 5-day precipitation amount.

  • CDD: Maximum length of consecutive dry days.

  • W90p: Percentage of days when wind speed is above the 90th percentile.

  • Sea: Sea level data integration.

  • T90p, T10p: Combined temperature indices.

  • IACI: Iberian Actuarial Climate Index

Example: Calculating TX90p

# Calculate monthly TX90p index
tx90p_values <- tx90p(ci, freq = "monthly")

# View the results
head(tx90p_values)

Example: Calculating standardized T90p

# Calculate standardized T90p index on a monthly basis
t90p_std_values <- t90p_std(ci, freq = "monthly")

# View the results
head(tx90p_std_values)

Example: Calculating monthly TX90p

# Calculate seasonal TN10p index
tx90p_seasonal <- monthly_to_seasonal(tx90p_values)

# View the results
head(tx90p_seasonal)

Integrating Sea Level Data

To incorporate sea level data into the IACI, use the sea_input() function.

sea_input()

Creates a data frame for sea level data input.

Usage

sea_input(Date = levels(ci$date_factors$monthly), Value = NA)

Parameters

  • Date (Character vector): Dates in “YYYY-MM” format.

  • Value (Numeric vector, default NA): Sea level values.

Example

# Create sea level data
sea_dates <- c("2020-01", "2020-02", "2020-03")
sea_values <- c(1.2, 1.3, 1.4)

sea_data <- sea_input(Date = sea_dates, Value = sea_values)

Generating IACI Output

The final step is to compute the IACI by integrating all standardized indices.

iaci_output()

Integrates various standardized indices to compute the IACI.

Usage

iaci_output(ci, si, freq = c("monthly", "seasonal"))

Parameters

  • ci (List): Climate input object created by climate_input().

  • si (Data frame): Sea level input data created by sea_input().

  • freq (Character, default c("monthly", "seasonal")): Frequency of calculation.

Example

# Generate IACI
iaci <- iaci_output(ci, sea_data, freq = "monthly")

# View the IACI
head(iaci)

output_all()

Processes all CSV files in the input directory and outputs the IACI results to the output directory.

Usage

output_all(si, input_dir, output_dir, freq = c("monthly",
                                               "seasonal"), 
           base.range = c(1961, 1990), time.span = c(1961, 2022))

Parameters

  • si (Data frame): Sea level input data.

  • input_dir (Character): Directory containing input CSV files.

  • output_dir (Character): Directory to save output files.

  • freq (Character, default c("monthly", "seasonal")): Frequency of calculation.

  • base.range (Numeric vector, default c(1961, 1990)): Base range years.

  • time.span (Numeric vector, default c(1961, 2022)): Time span for output data.

Example

# Define input and output directories
input_dir <- "csv_output"
output_dir <- "iaci_results"

# Run the output_all function with monthly frequency
output_all(
  si = sea_std_values,
  input_dir = input_dir,
  output_dir = output_dir,
  freq = "monthly",
  base.range = c(1961, 1990),
  time.span = c(1961, 2022)
)

Complete Workflow Example

Below is a comprehensive example demonstrating the complete workflow from downloading data to generating the IACI.

# Load the package
library(rIACI)

# Step 1: Download ERA5-Land data
user_id <- "your_user_id"
user_key <- "your_api_key"
area_iberia <- c(44, -10, 35, 5) # Approximate bounds of Iberian Peninsula

download_data(
  start_year = 2020,
  end_year = 2020,
  variables = c("2m_temperature", "total_precipitation", 
                "10m_u_component_of_wind", "10m_v_component_of_wind"),
  area = area_iberia,
  user_id = user_id,
  user_key = user_key
)

# Step 2: Process downloaded data
input_directory <- "cds_data"
output_directory <- "processed_data"

process_data(input_dir = input_directory, output_dir = output_directory)

# Step 3: Export processed NetCDF to CSV
netcdf_file <- "processed_data/2020_01.nc"
csv_output_directory <- "csv_output"

export_data_to_csv(nc_file = netcdf_file, output_dir = csv_output_directory)

# Step 4: Create climate input object
climate_data <- read.csv("processed_data/climate_data.csv")
ci <- climate_input(
  tmax = climate_data$TMAX,
  tmin = climate_data$TMIN,
  prec = climate_data$PRCP,
  wind = climate_data$WIND,
  dates = as.Date(climate_data$DATE, format = "%Y-%m-%d")
)

# Step 5: Integrate sea level data
sea_dates <- c("2020-01", "2020-02", "2020-03")
sea_values <- c(1.2, 1.3, 1.4)
sea_data <- sea_input(Date = sea_dates, Value = sea_values)
sea_std_values <- sea_std(sea_data, freq = "monthly")

# Step 6: Generate IACI
iaci <- iaci_output(ci, sea_std_values, freq = "monthly")
print(head(iaci))

# Step 7: Output all results
output_all(
  si = sea_std_values,
  input_dir = csv_output_directory,
  output_dir = "iaci_results",
  freq = "monthly",
  base.range = c(1961, 1990),
  time.span = c(1961, 2022)
)

# Step 8: Merge CSVs into NetCDF (optional)
merged_netcdf <- "iaci.nc"
csv_to_netcdf(csv_dir = iaci_results_directory, output_file = merged_netcdf)

Additional Resources

For further assistance, please refer to the package documentation or contact the package maintainer.

Conclusion

The rIACI package offers a comprehensive suite of tools for climate data analysis, enabling users to compute the Iberian Actuarial Climate Index effectively. By following this guide, you can seamlessly download, process, and analyze climate data to gain valuable insights into climate variability and extremes in the Iberian Peninsula.

Acknowledgements

This package benefited fundamentally from the collective expertise and encouragement of Jose Luis Vilar-Zanon, Jose Garrido, and Antonio Jose Heras Martinez.

Bibliography

Zhou, Nan, José Luis Vilar-Zanón, José Garrido, and Antonio-José Heras Martínez. 2023. “On the Definition of an Actuarial Climate Index for the Iberian Peninsula.” Anales Del Instituto de Actuarios Españoles, no. 29: 37–59. https://doi.org/10.26360/2023_3.