| Title: | Pan-European Phenological Data Analysis |
|---|---|
| Description: | Provides a framework for quality-aware analysis of ground-based phenological data from the PEP725 Pan-European Phenology Database (Templ et al. (2018) <doi:10.1007/s00484-018-1512-8>; Templ et al. (2026) <doi:10.1111/nph.70869>) and similar observation networks. Implements station-level data quality grading, outlier detection, phenological normals (climate baselines), anomaly detection, elevation and latitude gradient estimation with robust regression, spatial synchrony quantification, partial least squares (PLS) regression for identifying temperature-sensitive periods, and sequential Mann-Kendall trend analysis. Supports data import from PEP725 files, conversion of user-supplied data, and downloadable synthetic datasets for teaching without barriers of registration. All analysis outputs provide 'print', 'summary', and 'plot' methods. Interactive spatial visualization is available via 'leaflet'. |
| Authors: | Matthias Templ [aut, cre] (ORCID: <https://orcid.org/0000-0002-8638-5276>), Barbara Templ [aut] (ORCID: <https://orcid.org/0000-0002-9391-5888>) |
| Maintainer: | Matthias Templ <[email protected]> |
| License: | GPL-3 |
| Version: | 1.1.0 |
| Built: | 2026-05-23 09:11:58 UTC |
| Source: | https://github.com/matthias-da/pep725 |
Ensures that subsetting operations return a pep object
when the result still has the required structure.
## S3 method for class 'pep' x[...]## S3 method for class 'pep' x[...]
x |
A |
... |
Arguments passed to the data.table subset method. |
A pep object if structure is preserved, otherwise a data.table.
Matthias Templ
This function adds a new column country by performing a fast spatial
join between point coordinates (lat, lon) and world country
polygons from rnaturalearth. Geometry is handled via sf.
add_country(dt, keep_geometry = FALSE)add_country(dt, keep_geometry = FALSE)
dt |
A |
keep_geometry |
Logical. If |
The input data must contain numeric columns named lat and lon
in WGS84 (EPSG:4326). Rows with NA coordinates receive
country = NA.
After assigning countries, Germany is automatically split into
"Germany-North" and "Germany-South" using a latitude threshold:
lat >= 50 → "Germany-North"
lat < 50 → "Germany-South"
If the input already contains a country column it is removed and
replaced by the spatially derived one.
A data.table with a new column country.
Matthias Templ
small <- data.table( lat = c(48.2, 51.0, 47.5), lon = c(16.3, 10.1, 8.5) ) pep_with_country <- add_country(small) pep_with_countrysmall <- data.table( lat = c(48.2, 51.0, 47.5), lon = c(16.3, 10.1, 8.5) ) pep_with_country <- add_country(small) pep_with_country
Adds a daylength column to a phenological dataset based on the observation day and station latitude.
add_daylength(pep)add_daylength(pep)
pep |
A |
The input data with an added daylength column (hours).
Matthias Templ
calc_daylength for the underlying calculation
pep <- pep_download() pep <- add_daylength(pep) head(pep[, .(day, lat, daylength)])pep <- pep_download() pep <- add_daylength(pep) head(pep[, .(day, lat, daylength)])
Coerce to PEP725 Data Object
as.pep(x, ...)as.pep(x, ...)
x |
A data.frame or data.table to coerce. |
... |
Additional arguments (currently unused). |
An object of class pep.
Matthias Templ
Returns human-readable descriptions for BBCH phenological codes.
bbch_description(codes, na.rm = TRUE, sort = TRUE)bbch_description(codes, na.rm = TRUE, sort = TRUE)
codes |
Integer vector of BBCH codes. |
na.rm |
Logical. If |
sort |
Logical. If |
A data.frame with columns phase_id and description,
ordered by phase code (if sort = TRUE).
Matthias Templ
bbch_description(c(60, 65, 100)) bbch_description(c(100, 60, 65, NA), na.rm = TRUE, sort = TRUE)bbch_description(c(60, 65, 100)) bbch_description(c(100, 60, 65, NA), na.rm = TRUE, sort = TRUE)
Computes the astronomical daylength (hours of daylight) for a given day of year and latitude. Photoperiod is an important environmental driver of plant phenology.
calc_daylength(doy, lat, method = c("brock", "cbm"))calc_daylength(doy, lat, method = c("brock", "cbm"))
doy |
Integer or numeric vector. Day of year (1-365/366). |
lat |
Numeric. Latitude in decimal degrees. Positive for Northern Hemisphere, negative for Southern Hemisphere. |
method |
Character. Daylength model:
|
Both methods assume a "flat horizon" (sunrise/sunset when the geometric centre of the sun crosses the horizon) and do not account for atmospheric refraction, elevation, or local topography.
At polar latitudes (), continuous daylight or
darkness may occur around solstices.
A list with components:
Daylength in hours
Solar declination angle in degrees
If doy is a vector, returns a data.frame with these columns.
Many phenological events are triggered by photoperiod thresholds:
Spring bud burst often requires both temperature accumulation and minimum daylength
Autumn leaf senescence may be triggered by shortening days
Flowering in many species is photoperiod-dependent
Matthias Templ
Spencer, J. W. (1971). Fourier series representation of the position of
the sun. Search 2(5):172. (Declination formula used by the
"brock" method.)
Brock, T. D. (1981). Calculating solar radiation for ecological studies. Ecological Modelling 14(1-2):1-19. doi:10.1016/0304-3800(81)90011-9
Forsythe, W. C., Rykiel, E. J., Stahl, R. S., Wu, H., and
Schoolfield, R. M. (1995). A model comparison for daylength as a
function of latitude and day of year. Ecological Modelling
80(1):87-95. doi:10.1016/0304-3800(94)00034-F
(CBM daylength model used by the "cbm" method.)
# Daylength at spring equinox (DOY 80) at 50°N calc_daylength(80, 50) # Summer solstice at different latitudes calc_daylength(172, c(30, 45, 60)) # Daylength through the year at 45°N yearly <- calc_daylength(1:365, 45) plot(yearly$doy, yearly$daylength, type = "l", xlab = "Day of Year", ylab = "Daylength (hours)")# Daylength at spring equinox (DOY 80) at 50°N calc_daylength(80, 50) # Summer solstice at different latitudes calc_daylength(172, c(30, 45, 60)) # Daylength through the year at 45°N yearly <- calc_daylength(1:365, 45) plot(yearly$doy, yearly$daylength, type = "l", xlab = "Day of Year", ylab = "Daylength (hours)")
Calculates the maximum possible daylength (at summer solstice) for a given latitude.
calc_max_daylength(lat)calc_max_daylength(lat)
lat |
Numeric. Latitude in decimal degrees. |
Numeric. Maximum daylength in hours.
Matthias Templ
# Maximum daylength at different latitudes calc_max_daylength(c(0, 30, 45, 60, 66.5))# Maximum daylength at different latitudes calc_max_daylength(c(0, 30, 45, 60, 66.5))
Computes the accumulated thermal units (GDD) from a start date to the observed phenological event date. Useful for determining the thermal requirements of different phenophases.
calc_thermal_sum( pep, temp_data, t_base = 5, t_start = 1, by = NULL, method = "average" )calc_thermal_sum( pep, temp_data, t_base = 5, t_start = 1, by = NULL, method = "average" )
pep |
A |
temp_data |
A data.frame or data.table with daily temperature data.
Must contain columns: |
t_base |
Numeric. Base temperature (°C). Default 5. |
t_start |
Integer. Start day of year for accumulation. Default 1 (Jan 1). Use 60 for March 1, etc. |
by |
Character vector. Columns to match temperature data to phenology
(e.g., |
method |
Character. GDD calculation method (see |
This function joins phenological observations with temperature data and calculates the thermal sum (accumulated GDD) at each observation date. This is useful for:
Determining thermal requirements of phenophases
Comparing thermal sums across years, locations, or species
Validating phenological models
The input data with an additional thermal_sum column containing
the accumulated GDD from t_start to the observed DOY.
Matthias Templ
calc_thermal_units for the underlying GDD calculation
# Load phenology data data(pep_seed) # Create example temperature data (normally from weather stations/reanalysis) temp <- data.frame( year = rep(2000:2015, each = 365), doy = rep(1:365, 16), tmin = rnorm(365 * 16, mean = 5, sd = 8), tmax = rnorm(365 * 16, mean = 15, sd = 8) ) # Add thermal sum to phenology data pep_thermal <- calc_thermal_sum(pep_seed, temp, t_base = 5)# Load phenology data data(pep_seed) # Create example temperature data (normally from weather stations/reanalysis) temp <- data.frame( year = rep(2000:2015, each = 365), doy = rep(1:365, 16), tmin = rnorm(365 * 16, mean = 5, sd = 8), tmax = rnorm(365 * 16, mean = 15, sd = 8) ) # Add thermal sum to phenology data pep_thermal <- calc_thermal_sum(pep_seed, temp, t_base = 5)
Computes accumulated thermal units (growing degree days, GDD) from daily temperature data. Thermal time is a fundamental concept in phenology, representing the heat accumulation that drives plant development.
calc_thermal_units( tmin, tmax = NULL, t_base = 5, t_cap = NULL, method = c("average", "modified", "single_sine"), cumulative = TRUE, na.rm = TRUE )calc_thermal_units( tmin, tmax = NULL, t_base = 5, t_cap = NULL, method = c("average", "modified", "single_sine"), cumulative = TRUE, na.rm = TRUE )
tmin |
Numeric vector of daily minimum temperatures (°C). |
tmax |
Numeric vector of daily maximum temperatures (°C). If |
t_base |
Numeric. Base temperature (°C) below which no development occurs. Default is 5°C, typical for temperate crops. |
t_cap |
Numeric. Optional upper temperature cap (°C). Temperatures above
this are set to |
method |
Character. Calculation method:
|
cumulative |
Logical. If |
na.rm |
Logical. Remove NA values in accumulation? Default |
Growing Degree Days (GDD) quantify heat accumulation above a base temperature. Plants require a certain thermal sum to reach phenological stages (e.g., flowering typically requires 500-1500 GDD depending on species).
The base temperature represents the minimum temperature for growth:
Cool-season crops (wheat, barley): 0-5°C
Warm-season crops (maize, soybean): 10°C
Fruit trees (apple, cherry): 4-7°C
Grapevine: 10°C
Numeric vector of thermal units (degree-days). Same length as input.
If cumulative = TRUE, returns running sum.
Simple average: GDD = max(0, (Tmax + Tmin)/2 - Tbase). Most common method, used by many agricultural agencies.
Like average, but Tmin and Tmax are first bounded to the range from Tbase to Tcap before averaging. Prevents negative contributions.
Approximates the daily temperature curve as a sine
wave, computing the mean excess over Tbase. More accurate when
temperatures cross the base threshold during the day. Implements the
Baskerville-Emin (1969) / Snyder (1985) single-sine formula: with
, ,
and ,
the daily GDD in the mixed case
() is
.
Matthias Templ
McMaster, G.S., Wilhelm, W.W. (1997). Growing degree-days: one equation, two interpretations. Agricultural and Forest Meteorology 87:291-300.
Baskerville, G.L., Emin, P. (1969). Rapid estimation of heat accumulation from maximum and minimum temperatures. Ecology 50:514-517.
Snyder, R.L. (1985). Hand calculating degree days. Agricultural and Forest Meteorology 35:353-358.
calc_thermal_sum for computing thermal sum at
phenological observations
# Daily temperatures for a week tmin <- c(5, 7, 8, 10, 12, 11, 9) tmax <- c(15, 18, 20, 22, 25, 23, 19) # Cumulative GDD with base 10°C calc_thermal_units(tmin, tmax, t_base = 10) # Daily GDD values calc_thermal_units(tmin, tmax, t_base = 10, cumulative = FALSE) # With upper cap at 30°C calc_thermal_units(tmin, tmax, t_base = 10, t_cap = 30) # Using mean temperature only tmean <- (tmin + tmax) / 2 calc_thermal_units(tmean, t_base = 10)# Daily temperatures for a week tmin <- c(5, 7, 8, 10, 12, 11, 9) tmax <- c(15, 18, 20, 22, 25, 23, 19) # Cumulative GDD with base 10°C calc_thermal_units(tmin, tmax, t_base = 10) # Daily GDD values calc_thermal_units(tmin, tmax, t_base = 10, cumulative = FALSE) # With upper cap at 30°C calc_thermal_units(tmin, tmax, t_base = 10, t_cap = 30) # Using mean temperature only tmean <- (tmin + tmax) / 2 calc_thermal_units(tmean, t_base = 10)
This function computes the mean phenological day-of-year (DOY) for two phenophases (e.g. BBCH 65 and 87) for each country, using either a simple warming-shift model or a GDD-based model.
get_phenology_doys( pep, phase1, phase2, genus_name, phenology_method = c("robust_shift", "simple_shift", "gdd_cmip6"), ctrl_years = 2011:2021, scen_years = 2085:2095, calib_years = 1991:2020, scen_warming = 3, shift_per_deg = -4, cmip6_tas = NULL, gdd_base = 5, gdd_threshold1 = 150, gdd_threshold2 = 350, regions = NULL, delta_Tgl = 2.09, giss_data = NULL )get_phenology_doys( pep, phase1, phase2, genus_name, phenology_method = c("robust_shift", "simple_shift", "gdd_cmip6"), ctrl_years = 2011:2021, scen_years = 2085:2095, calib_years = 1991:2020, scen_warming = 3, shift_per_deg = -4, cmip6_tas = NULL, gdd_base = 5, gdd_threshold1 = 150, gdd_threshold2 = 350, regions = NULL, delta_Tgl = 2.09, giss_data = NULL )
pep |
A data frame containing PEP725 phenology observations, including
columns |
phase1 |
Integer. First phenophase (e.g. 65). |
phase2 |
Integer. Second phenophase (e.g. 87). |
genus_name |
Character. Genus to filter (e.g. "Malus"). |
phenology_method |
Either |
ctrl_years |
Integer vector. Years for CTRL period (default 2011:2021). |
scen_years |
Integer vector. Years for scenario period (default 2085:2095). |
calib_years |
Integer vector. Years for calibration period (default 1991:2020). |
scen_warming |
Numeric. Warming in degC (simple shift model). |
shift_per_deg |
Numeric. DOY shift per degC (simple shift). |
cmip6_tas |
Optional list of daily temperature vectors–only required
if using |
gdd_base |
Numeric. Base temperature for GDD model. |
gdd_threshold1 |
Numeric. GDD threshold for phase1. |
gdd_threshold2 |
Numeric. GDD threshold for phase2. |
regions |
Character vector. Countries/regions to include. If NULL, all are used. |
delta_Tgl |
Numeric. Global temperature change (degC) for scenario. |
giss_data |
Data frame with GISS temperature data (columns: |
A data frame containing:
Country |
Country name |
doy_start_ctrl |
Start DOY for CTRL period |
doy_end_ctrl |
End DOY for CTRL period |
doy_start_scen |
Start DOY for SCEN period |
doy_end_scen |
End DOY for SCEN period |
Matthias Templ
pep <- pep_download() # Use Alpine subset for faster computation regions <- c("Austria", "Switzerland") pep_alpine <- pep[country %in% regions] # Simple shift method d1 <- get_phenology_doys( pep = pep_alpine, phase1 = 65, phase2 = 87, genus_name = "Malus", regions = regions, phenology_method = 'simple_shift' ) d1pep <- pep_download() # Use Alpine subset for faster computation regions <- c("Austria", "Switzerland") pep_alpine <- pep[country %in% regions] # Simple shift method d1 <- get_phenology_doys( pep = pep_alpine, phase1 = 65, phase2 = 87, genus_name = "Malus", regions = regions, phenology_method = 'simple_shift' ) d1
Test if Object is a PEP725 Data Object
is.pep(x)is.pep(x)
x |
An R object. |
Logical; TRUE if x is of class pep.
Matthias Templ
pep <- pep_download() is.pep(pep) is.pep(data.frame(x = 1))pep <- pep_download() is.pep(pep) is.pep(data.frame(x = 1))
mann_kendall_z)This function was misnamed in pep725 <= 1.0.2: it returned the
Mann-Kendall Z-statistic, not Kendall's .
Please call mann_kendall_z instead for identical (and
better: tie-corrected, continuity-corrected) behaviour.
kendall_tau(x)kendall_tau(x)
x |
Numeric vector (time series assumed to be equidistant).
|
Note that this function is now a thin wrapper and will continue to return the Z-statistic until a future major release.
Numeric. Identical output shape to mann_kendall_z().
Matthias Templ
suppressWarnings(kendall_tau(c(120, 118, 115, 112, 110, 108, 105)))suppressWarnings(kendall_tau(c(120, 118, 115, 112, 110, 108, 105)))
Computes the Mann-Kendall Z-statistic for a time series, a non-parametric test of monotonic trend. The Z-statistic is approximately standard normal under the no-trend null hypothesis.
mann_kendall_z(x)mann_kendall_z(x)
x |
Numeric vector (time series assumed to be equidistant).
|
The implementation uses the standard tie correction for Var(S) (so
that tied observations do not inflate the test) and the continuity
correction Z = (S - sign(S))/sqrt(Var(S)), matching
Kendall::MannKendall().
Numeric. The Mann-Kendall Z-statistic. Positive values indicate an
increasing trend, negative values decreasing. Compare against standard
normal quantiles (e.g., |Z| > 1.96 for p < 0.05).
Returns NA if fewer than three non-missing values.
kendall_tau()
Earlier versions of pep725 exported a function called kendall_tau()
that actually returned this Mann-Kendall Z-statistic (not the true
Kendall's = ). kendall_tau() is now
a deprecated alias of mann_kendall_z(); please update callers.
Matthias Templ
Mann, H.B. (1945). Nonparametric tests against trend. Econometrica 13, 245-259.
Kendall, M.G. (1975). Rank Correlation Methods. 4th ed. London: Charles Griffin.
kendall_tau (deprecated alias);
pheno_trend_turning for a full sequential Mann-Kendall
analysis.
# Decreasing trend (earlier phenology) mann_kendall_z(c(120, 118, 115, 112, 110, 108, 105)) # No clear trend mann_kendall_z(c(120, 115, 122, 118, 121, 116, 119))# Decreasing trend (earlier phenology) mann_kendall_z(c(120, 118, 115, 112, 110, 108, 105)) # No clear trend mann_kendall_z(c(120, 115, 122, 118, 121, 116, 119))
Constructor function that creates a validated pep object from a
data.frame or data.table. The object inherits from data.table and
retains all its functionality.
new_pep(x, validate = TRUE)new_pep(x, validate = TRUE)
x |
A data.frame or data.table containing PEP725 phenological data. |
validate |
Logical. If |
Required columns:
s_id - Station ID
lon, lat - Coordinates
genus, species - Plant taxonomy
phase_id - BBCH phenological phase code
year, day - Observation timing
An object of class c("pep", "data.table", "data.frame").
Matthias Templ
# From imported data dt <- pep_download() pep_data <- new_pep(dt)# From imported data dt <- pep_download() pep_data <- new_pep(dt)
Removes cached synthetic PEP data from the local cache directory.
pep_cache_clear(quiet = FALSE)pep_cache_clear(quiet = FALSE)
quiet |
Logical. If |
Invisible TRUE if cache was cleared, FALSE otherwise.
Matthias Templ
# Clear the cache pep_cache_clear() # Next download will fetch fresh data pep <- pep_download()# Clear the cache pep_cache_clear() # Next download will fetch fresh data pep <- pep_download()
Returns information about the cached PEP data.
pep_cache_info()pep_cache_info()
A list with cache information, or NULL if no cache exists.
Matthias Templ
# Check cache status pep_cache_info()# Check cache status pep_cache_info()
Checks whether station-year combinations in phenological data form a connected set, which is required for valid combined time series estimation.
pep_check_connectivity(pep, by = NULL)pep_check_connectivity(pep, by = NULL)
pep |
A |
by |
Optional grouping variables to check connectivity within groups. |
Connectivity means that there exist overlapping years between stations, allowing year effects to be separated from station effects. If data is disconnected, the combined series cannot be uniquely estimated across the disconnected parts.
A list with connectivity information:
Logical. TRUE if all data forms one connected set.
Number of disconnected sets found.
Number of observations in each set.
Stations and years in the largest connected set.
Matthias Templ
pheno_combine which uses this internally
pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] conn <- pep_check_connectivity(pep_alpine) if (!conn$is_connected) { warning("Data has ", conn$n_sets, " disconnected sets") }pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] conn <- pep_check_connectivity(pep_alpine) if (!conn$is_connected) { warning("Data has ", conn$n_sets, " disconnected sets") }
Validates that expected phenological phases (BBCH codes) are present in the data. Issues informative warnings for missing phases without stopping execution, making it useful for pipeline validation.
pep_check_phases( pep, expected = NULL, species = NULL, year_min = NULL, year_max = NULL, warn = TRUE, label = "Data" )pep_check_phases( pep, expected = NULL, species = NULL, year_min = NULL, year_max = NULL, warn = TRUE, label = "Data" )
pep |
A |
expected |
Integer vector of expected BBCH phase codes. If |
species |
Optional character string to filter by species/genus before checking. If specified, only checks phases for this species. |
year_min |
Optional integer. Only check phases for years >= this value. Useful for checking data availability within an analysis window. |
year_max |
Optional integer. Only check phases for years <= this value. |
warn |
Logical. If |
label |
Character. Optional label to include in warning messages,
useful when checking multiple datasets. Default is |
This function is designed to be used at the start of analysis pipelines to validate that required phenological phases are available. It issues warnings but does not stop execution, allowing users to be informed of potential issues while still proceeding with available data.
A list with class phase_check containing:
The expected phase codes
Phase codes found in the data
Phase codes not found in the data
Phase codes in data but not in expected list
Logical. TRUE if all expected phases are present
Named integer vector of observation counts per present phase
Data.table with phase-level statistics
Leaf development
Beginning of flowering / Heading
Full flowering / Anthesis
End of flowering
Hard dough (cereals)
Harvest
Matthias Templ
bbch_description for phase code descriptions,
pep_completeness for detailed completeness assessment,
pep_quality for quality grading
pep <- pep_download() # Use Swiss subset for faster checking pep_ch <- pep[country == "Switzerland"] # Basic check for common phases check <- pep_check_phases(pep_ch) # Check specific phases for grapevine (longest historical records) check <- pep_check_phases(pep_ch, expected = c(60, 65, 81), species = "Vitis vinifera") # Check within analysis window check <- pep_check_phases(pep_ch, expected = c(60, 65), year_min = 1991, year_max = 2020) # Use in pipeline with custom label vine_data <- pep_ch[species == "Vitis vinifera"] if (nrow(vine_data) > 0) { pep_check_phases(vine_data, expected = c(60, 65), label = "Grapevine analysis") } # Silent check (no warnings) result <- pep_check_phases(pep_ch, warn = FALSE) if (!result$complete) { # Handle missing phases programmatically }pep <- pep_download() # Use Swiss subset for faster checking pep_ch <- pep[country == "Switzerland"] # Basic check for common phases check <- pep_check_phases(pep_ch) # Check specific phases for grapevine (longest historical records) check <- pep_check_phases(pep_ch, expected = c(60, 65, 81), species = "Vitis vinifera") # Check within analysis window check <- pep_check_phases(pep_ch, expected = c(60, 65), year_min = 1991, year_max = 2020) # Use in pipeline with custom label vine_data <- pep_ch[species == "Vitis vinifera"] if (nrow(vine_data) > 0) { pep_check_phases(vine_data, expected = c(60, 65), label = "Grapevine analysis") } # Silent check (no warnings) result <- pep_check_phases(pep_ch, warn = FALSE) if (!result$complete) { # Handle missing phases programmatically }
A convenience function to check phase availability across multiple species at once.
pep_check_phases_multi( pep, species_list, expected = c(60, 65, 100), year_min = NULL, year_max = NULL, warn = TRUE )pep_check_phases_multi( pep, species_list, expected = c(60, 65, 100), year_min = NULL, year_max = NULL, warn = TRUE )
pep |
A |
species_list |
Character vector of species/genus names to check |
expected |
Integer vector of expected BBCH phase codes |
year_min |
Optional minimum year filter |
year_max |
Optional maximum year filter |
warn |
Logical. Issue warnings for missing phases? Default TRUE. |
A data.table summarizing phase availability per species
Matthias Templ
pep_check_phases for single-species checking
# Check multiple species (use subset for speed) pep <- pep_download() pep_subset <- pep[country %in% c("Switzerland", "Austria")] result <- pep_check_phases_multi(pep_subset, species_list = c("Malus domestica", "Vitis vinifera"), expected = c(60, 65)) print(result)# Check multiple species (use subset for speed) pep <- pep_download() pep_subset <- pep[country %in% c("Switzerland", "Austria")] result <- pep_check_phases_multi(pep_subset, species_list = c("Malus domestica", "Vitis vinifera"), expected = c(60, 65)) print(result)
Provides comprehensive assessment of data completeness across species, phases, years, and optionally countries. Useful for identifying data gaps and planning analyses.
pep_completeness( pep, by = c("genus", "species", "phase_id"), year_range = NULL, min_obs = 1, include_years = FALSE )pep_completeness( pep, by = c("genus", "species", "phase_id"), year_range = NULL, min_obs = 1, include_years = FALSE )
pep |
A |
by |
Character vector specifying grouping dimensions. Options include
|
year_range |
Optional integer vector of length 2 specifying the start
and end years for assessment. If |
min_obs |
Integer. Minimum number of observations required to include a group in the output. Default is 1. |
include_years |
Logical. If |
This function assesses data coverage across multiple dimensions, helping users understand where data is available and where gaps exist. It is particularly useful for:
Identifying which species × phase combinations have sufficient data
Finding temporal gaps in time series
Comparing coverage across countries or regions
Planning analyses that require minimum observation counts
A data.table of class pep_completeness with columns:
Grouping variables as specified in by
Total number of observations
Number of unique stations
Number of years with data
First year of observations
Last year of observations
Total span in years
Percentage of years with data within span
Median day of year
Interquartile range of day of year
Completeness is calculated as the percentage of years with at least one observation within the year span (year_max - year_min + 1). A completeness of 100% means every year in the span has data.
Matthias Templ
pep_check_phases for validating expected phases exist,
pep_quality for quality grading,
pep_coverage for overall data coverage
pep <- pep_download() # Use Swiss subset for faster computation pep_ch <- pep[country == "Switzerland"] # Basic completeness by species and phase comp <- pep_completeness(pep_ch) print(comp) # Filter to well-observed combinations comp_good <- comp[n_obs >= 100 & completeness_pct >= 80] # Completeness by country (use Alpine countries) pep_alpine <- pep[country %in% c("Switzerland", "Austria")] comp_country <- pep_completeness(pep_alpine, by = c("country", "genus", "phase_id")) # Year-level detail for a specific period comp_yearly <- pep_completeness(pep_ch, year_range = c(1991, 2020), include_years = TRUE) # Visualize completeness plot(comp)pep <- pep_download() # Use Swiss subset for faster computation pep_ch <- pep[country == "Switzerland"] # Basic completeness by species and phase comp <- pep_completeness(pep_ch) print(comp) # Filter to well-observed combinations comp_good <- comp[n_obs >= 100 & completeness_pct >= 80] # Completeness by country (use Alpine countries) pep_alpine <- pep[country %in% c("Switzerland", "Austria")] comp_country <- pep_completeness(pep_alpine, by = c("country", "genus", "phase_id")) # Year-level detail for a specific period comp_yearly <- pep_completeness(pep_ch, year_range = c(1991, 2020), include_years = TRUE) # Visualize completeness plot(comp)
Provides comprehensive coverage statistics for phenological datasets, including temporal, geographical, and species dimensions.
pep_coverage( x, kind = c("all", "temporal", "geographical", "species"), top = 10, plot = FALSE, by = NULL )pep_coverage( x, kind = c("all", "temporal", "geographical", "species"), top = 10, plot = FALSE, by = NULL )
x |
A |
kind |
Character. Type of coverage to assess:
|
top |
Integer. Number of top entries to show in tables (default: 10). |
plot |
Logical. If |
by |
Character. Optional grouping variable for more detailed analysis
(e.g., |
A list of class pep_coverage containing coverage statistics.
The structure depends on kind:
Year range, number of years, gaps, observations per year
Countries, stations, coordinate bounds, altitude range
Genera, species, subspecies counts and details
All of the above combined
Matthias Templ
pep <- pep_download() pep_ch <- pep[country == "Switzerland"] # Full coverage report pep_coverage(pep_ch) # Temporal coverage only pep_coverage(pep_ch, kind = "temporal") # Species coverage, top 5 pep_coverage(pep_ch, kind = "species", top = 5)pep <- pep_download() pep_ch <- pep[country == "Switzerland"] # Full coverage report pep_coverage(pep_ch) # Temporal coverage only pep_coverage(pep_ch, kind = "temporal") # Species coverage, top 5 pep_coverage(pep_ch, kind = "species", top = 5)
Downloads a pre-generated synthetic version of PEP725 phenological data from an external repository. The data is cached locally after the first download to avoid repeated downloads.
pep_download(url = NULL, cache = TRUE, force = FALSE, quiet = FALSE)pep_download(url = NULL, cache = TRUE, force = FALSE, quiet = FALSE)
url |
Character string specifying the URL to download from. Default uses the official pep725 package data repository. |
cache |
Logical. If |
force |
Logical. If |
quiet |
Logical. If |
The synthetic data preserves the statistical properties and structure of real PEP725 data (stations, species, phases, temporal patterns) while ensuring data privacy. It is suitable for:
Learning and testing package functions
Reproducing examples from vignettes
Developing analysis workflows before accessing real data
The data is cached in the platform-appropriate user cache directory
determined by R_user_dir("pep725", which = "cache").
Typical locations:
Windows: %LOCALAPPDATA%/R/cache/R/pep725/
macOS: ~/Library/Caches/org.R-project.R/R/pep725/
Linux: ~/.cache/R/pep725/
A pep object containing synthetic phenological data with
the same structure as original PEP725 data.
If you have access to real PEP725 data, you can import it using
pep_import instead.
Matthias Templ
pep_simulate for creating your own synthetic data,
pep_import for importing real PEP725 data
# Download synthetic data pep <- pep_download() # Explore the data print(pep) summary(pep) # Force re-download pep <- pep_download(force = TRUE)# Download synthetic data pep <- pep_download() # Explore the data print(pep) summary(pep) # Force re-download pep <- pep_download(force = TRUE)
Identifies outlier observations in phenological data using the 30-day rule or statistical methods. Based on the approach described by Schaber & Badeck (2002) for quality control of phenological observations.
pep_flag_outliers( pep, by = c("s_id", "genus", "species", "phase_id"), method = c("30day", "mad", "iqr", "zscore", "gam_residual", "mahalanobis"), threshold = NULL, center = c("median", "mean"), min_obs = 5, formula = day ~ s(year) + s(alt) + s(lat) + s(s_id, bs = "re"), min_n_per_group = 50, flag_only = TRUE )pep_flag_outliers( pep, by = c("s_id", "genus", "species", "phase_id"), method = c("30day", "mad", "iqr", "zscore", "gam_residual", "mahalanobis"), threshold = NULL, center = c("median", "mean"), min_obs = 5, formula = day ~ s(year) + s(alt) + s(lat) + s(s_id, bs = "re"), min_n_per_group = 50, flag_only = TRUE )
pep |
A |
by |
Character vector of columns defining groups for outlier detection.
Default |
method |
Character. Outlier detection method:
|
threshold |
Numeric. Threshold for outlier detection.
|
center |
Character. Central tendency measure: |
min_obs |
Integer. Minimum observations per group required for outlier detection. Groups with fewer observations are skipped. Default 5. |
formula |
Optional |
min_n_per_group |
Integer. Minimum group size required before fitting
the GAM in |
flag_only |
Logical. If |
The 30-day rule (Schaber & Badeck 2002) is widely used for phenological data quality control. It flags observations that deviate more than 30 days from the expected value for that station-species-phase combination. This threshold is based on the observation that legitimate phenological variation rarely exceeds one month.
If flag_only = TRUE: The input data with additional columns:
Logical. TRUE for flagged outliers.
Numeric. Days deviation from expected value.
Numeric. Expected DOY for the group.
If flag_only = FALSE: Data with outliers removed.
Standard for phenological QC. Use when the 30-day biological threshold is meaningful.
Robust to existing outliers. Good for initial screening of potentially contaminated data.
Standard statistical approach. Useful when comparing with other quality metrics.
Parametric approach. Use when data is approximately normal.
Matthias Templ
Schaber J, Badeck F-W (2002). Evaluation of methods for the combination of phenological time series and outlier detection. Tree Physiology 22:973-982.
Rousseeuw P J (1984). Least median of squares regression.
Journal of the American Statistical Association 79(388):871-880.
(Minimum Covariance Determinant estimator used by
method = "mahalanobis".)
pep_quality for comprehensive quality assessment,
pheno_combine which uses residuals for outlier detection
data(pep_seed) # Flag outliers using 30-day rule pep_flagged <- pep_flag_outliers(pep_seed) table(pep_flagged$is_outlier) # View flagged observations pep_flagged[is_outlier == TRUE] # Remove outliers instead of flagging pep_clean <- pep_flag_outliers(pep_seed, flag_only = FALSE) # Use MAD method with stricter threshold pep_flagged <- pep_flag_outliers(pep_seed, method = "mad", threshold = 2.5) # Group by country instead of station pep_flagged <- pep_flag_outliers(pep_seed, by = c("country", "genus", "phase_id"))data(pep_seed) # Flag outliers using 30-day rule pep_flagged <- pep_flag_outliers(pep_seed) table(pep_flagged$is_outlier) # View flagged observations pep_flagged[is_outlier == TRUE] # Remove outliers instead of flagging pep_clean <- pep_flag_outliers(pep_seed, flag_only = FALSE) # Use MAD method with stricter threshold pep_flagged <- pep_flag_outliers(pep_seed, method = "mad", threshold = 2.5) # Group by country instead of station pep_flagged <- pep_flag_outliers(pep_seed, by = c("country", "genus", "phase_id"))
This function imports all CSV files from the specified folder, reads them efficiently using data.table, and combines them into a single data table. It performs several preprocessing steps, including:
Converting selected columns to factors (provider_id, s_id, gss_id, genus, species, subspecies).
Replacing missing altitude values coded as -9999 with NA.
Creating a combined altitude variable alt2 using alt and alt_dem.
Converting the date column to Date class.
Recoding cultivation season (cult_season) and quality control flags (qc_ori_flag) into labeled factors.
Removing unused or problematic columns (affected_flag, qc_flag).
pep_import(path = "data/Data_PEP725_all", flags = FALSE, add_country = TRUE)pep_import(path = "data/Data_PEP725_all", flags = FALSE, add_country = TRUE)
path |
path to the folder containing PEP725 CSV files (default is |
flags |
Logical indicating whether the pep data contains quality control flags (default is |
add_country |
Logical indicating whether to add country information based on station coordinates (default is |
This function is intended for importing raw PEP725 station data into a standardized format suitable for further phenological analysis, such as sensitivity and trend estimation.
A pep object (extends data.table) containing the combined and preprocessed PEP725 data.
Matthias Templ (FHNW)
# Small example files shipped with the package path <- system.file("extdata", package = "pep725") pep_data <- pep_import(path = path, add_country = FALSE) print(pep_data)# Small example files shipped with the package path <- system.file("extdata", package = "pep725") pep_data <- pep_import(path = path, add_country = FALSE) print(pep_data)
Renders a Leaflet map of the stations in a pep_outliers object,
with each station plotted as a circle marker whose size encodes the
number of flagged observations and whose colour encodes the maximum
absolute residual (or Mahalanobis distance) for the station. Popups
list the worst-offending station-years and their residual values —
useful for interactive inspection during QC.
pep_outliers_leaflet( x, outlier_only = TRUE, top_n = 10, radius_range = c(4, 14), palette = "plasma", ... )pep_outliers_leaflet( x, outlier_only = TRUE, top_n = 10, radius_range = c(4, 14), palette = "plasma", ... )
x |
A |
outlier_only |
Logical. If |
top_n |
Integer. Maximum number of station-year records included in a station's popup. Default 10. |
radius_range |
Numeric length-2. Minimum and maximum circle
marker radius (in pixels) mapped to log1p(n_outliers). Default
|
palette |
Character. Name of a |
... |
Additional arguments reserved for future expansion. |
Unlike pheno_leaflet (which is a selection gadget built
on Shiny + miniUI), this is a pure visualisation: it returns a
leaflet htmlwidget so you can embed it in an R Markdown
report, include it in a vignette, or call print() on it for
interactive exploration.
Colour encodes the station's maximum absolute residual
(for univariate methods: max(|deviation|) in days; for
method = "mahalanobis": the maximum robust Mahalanobis
distance). Size encodes log1p(n_outliers) — a station
flagged once has a small marker, a station flagged 30 times has a
much larger one.
For method = "gam_residual" and "mahalanobis" this is
the natural companion to the Schaber-Badeck-style text tables and
the static diagnostic figure, because spatial structure in outliers
(a local observer network, a regional climate anomaly) is often
only visible on a map.
A leaflet htmlwidget. Call print() to display
in the RStudio Viewer or embed it in an Rmd file.
Matthias Templ
pep_flag_outliers — detection,
pep_plot_outliers — static diagnostic plots,
pheno_leaflet — station-selection Shiny gadget.
## Not run: pep <- pep_download() pep_ch <- pep[country == "Switzerland" & species == "Malus domestica"] flagged <- pep_flag_outliers( pep_ch, by = c("genus", "species", "phase_id"), method = "gam_residual" ) pep_outliers_leaflet(flagged) ## End(Not run)## Not run: pep <- pep_download() pep_ch <- pep[country == "Switzerland" & species == "Malus domestica"] flagged <- pep_flag_outliers( pep_ch, by = c("genus", "species", "phase_id"), method = "gam_residual" ) pep_outliers_leaflet(flagged) ## End(Not run)
Creates diagnostic plots to help distinguish between data errors and biologically meaningful extreme events (e.g., second flowering). Multiple plot types reveal different aspects of outlier patterns.
pep_plot_outliers( x, type = c("overview", "seasonal", "map", "detail", "station", "doy_context", "diagnostic", "profile"), phase_id = NULL, outlier_only = NULL, late_threshold = 250, n_top = 20, ... )pep_plot_outliers( x, type = c("overview", "seasonal", "map", "detail", "station", "doy_context", "diagnostic", "profile"), phase_id = NULL, outlier_only = NULL, late_threshold = 250, n_top = 20, ... )
x |
A |
type |
Character. Type of plot to produce:
|
phase_id |
Optional integer vector to filter specific phases for plotting. |
outlier_only |
Logical. If TRUE (default for some types), show only outlier observations. If FALSE, show all data with outliers highlighted. |
late_threshold |
Integer. DOY threshold for classifying "late" outliers as potential second events (default 250, ~September 7). |
n_top |
Integer. For detail view, number of most extreme outliers to show. Default 20. |
... |
Additional arguments passed to plotting functions. |
This function is designed to support quality control workflows where the goal is to distinguish:
Data errors: Typos, wrong dates, mis-coded phases
Observer errors: Misidentified species or phases
Biologically meaningful extremes: Second flowering, delayed development, climate-driven anomalies
For detecting potential second flowering events:
Use type = "seasonal" to see if late-season outliers cluster
in autumn months
Use type = "doy_context" to visualize outliers relative to
the main flowering distribution
Focus on flowering phases (60, 65) with late DOY values (>250)
A ggplot object (or list of ggplots for "overview").
Matthias Templ
pep_flag_outliers for detecting outliers,
pep_quality for comprehensive quality assessment
pep <- pep_download() # Use Swiss subset for faster outlier detection pep_ch <- pep[country == "Switzerland"] pep_flagged <- pep_flag_outliers(pep_ch) # Overview of all outliers pep_plot_outliers(pep_flagged, type = "overview") # Seasonal distribution - look for autumn outliers in flowering phases pep_plot_outliers(pep_flagged, type = "seasonal", phase_id = c(60, 65)) # Geographic pattern pep_plot_outliers(pep_flagged, type = "map") # Detailed view of most extreme outliers pep_plot_outliers(pep_flagged, type = "detail", n_top = 30) # DOY context for flowering phase - spot potential second flowering pep_plot_outliers(pep_flagged, type = "doy_context", phase_id = 65)pep <- pep_download() # Use Swiss subset for faster outlier detection pep_ch <- pep[country == "Switzerland"] pep_flagged <- pep_flag_outliers(pep_ch) # Overview of all outliers pep_plot_outliers(pep_flagged, type = "overview") # Seasonal distribution - look for autumn outliers in flowering phases pep_plot_outliers(pep_flagged, type = "seasonal", phase_id = c(60, 65)) # Geographic pattern pep_plot_outliers(pep_flagged, type = "map") # Detailed view of most extreme outliers pep_plot_outliers(pep_flagged, type = "detail", n_top = 30) # DOY context for flowering phase - spot potential second flowering pep_plot_outliers(pep_flagged, type = "doy_context", phase_id = 65)
Provides a comprehensive assessment of data completeness, temporal coverage, outlier prevalence, and assigns quality grades to phenological time series.
pep_quality( pep, by = c("s_id", "genus", "phase_id"), year_range = NULL, species = NULL, phase_id = NULL, outlier_method = c("tukey", "zscore"), na.rm = TRUE )pep_quality( pep, by = c("s_id", "genus", "phase_id"), year_range = NULL, species = NULL, phase_id = NULL, outlier_method = c("tukey", "zscore"), na.rm = TRUE )
pep |
A |
by |
Character vector of column names to group by for quality assessment.
Default is |
year_range |
Optional integer vector of length 2 specifying the start and
end years for assessment. If |
species |
Optional character string to filter by species/genus. |
phase_id |
Optional integer vector to filter by BBCH phase codes. |
outlier_method |
Character string specifying outlier detection method.
Either |
na.rm |
Logical. Should missing values be removed? Default |
This function evaluates phenological time series quality based on multiple criteria important for trend analysis and climatological calculations.
A data.table with the following columns:
Grouping variables as specified in by
Total number of observations
First year of observations
Last year of observations
Total span in years (year_max - year_min + 1)
Number of distinct years with data
Percentage of years with data within the span
Length of the longest gap (consecutive missing years)
Number of gaps in the time series
Number of statistical outliers detected
Percentage of observations that are outliers
Mean day-of-year
Standard deviation of day-of-year
Range of day-of-year values
Overall quality grade (A, B, C, or D)
Completeness >= 80%, no gaps > 2 years, outliers < 2%
Completeness >= 60%, no gaps > 5 years, outliers < 5%
Completeness >= 40%, gaps <= 10 years, outliers < 10%
Below grade C thresholds
The Tukey method (default) flags observations outside the interquartile
fences: [Q1 - 1.5*IQR, Q3 + 1.5*IQR]. This is robust to non-normal
distributions common in phenological data.
Matthias Templ
Schaber, J. and Badeck, F.-W. (2002). Evaluation of methods for the combination of phenological time series and outlier detection. Tree Physiology, 22, 973–982. doi:10.1093/treephys/22.14.973
pheno_normals for calculating climatological normals,
pheno_anomaly for anomaly detection
pep <- pep_download() # Subset to one country for speed pep_ch <- pep[country == "Switzerland"] # Assess quality for stations in Switzerland quality <- pep_quality(pep_ch) print(quality) # Filter to high-quality stations only high_quality <- quality[quality_grade %in% c("A", "B")] high_quality # Assess specific species and phase apple_quality <- pep_quality(pep_ch, species = "Malus", phase_id = 60) apple_quality # Country-level assessment (use subset of countries) pep_subset <- pep[country %in% c("Switzerland", "Austria")] country_quality <- pep_quality(pep_subset, by = c("country", "genus", "phase_id")) country_quality # Assess quality within a specific time window modern_quality <- pep_quality(pep_ch, year_range = c(1991, 2020)) modern_qualitypep <- pep_download() # Subset to one country for speed pep_ch <- pep[country == "Switzerland"] # Assess quality for stations in Switzerland quality <- pep_quality(pep_ch) print(quality) # Filter to high-quality stations only high_quality <- quality[quality_grade %in% c("A", "B")] high_quality # Assess specific species and phase apple_quality <- pep_quality(pep_ch, species = "Malus", phase_id = 60) apple_quality # Country-level assessment (use subset of countries) pep_subset <- pep[country %in% c("Switzerland", "Austria")] country_quality <- pep_quality(pep_subset, by = c("country", "genus", "phase_id")) country_quality # Assess quality within a specific time window modern_quality <- pep_quality(pep_ch, year_range = c(1991, 2020)) modern_quality
Identifies potential second flowering events and other irregular repeated phenological observations. These may represent biologically meaningful responses to climate disruption rather than data errors.
pep_second_events( pep, phases = c(60, 65), method = c("late_season", "multiple_per_year", "both"), late_threshold = 250, gap_threshold = 60, reference_period = 1991:2020, z_threshold = 3, min_obs = 10 )pep_second_events( pep, phases = c(60, 65), method = c("late_season", "multiple_per_year", "both"), late_threshold = 250, gap_threshold = 60, reference_period = 1991:2020, z_threshold = 3, min_obs = 10 )
pep |
A |
phases |
Integer vector of phase IDs to analyze. Default is
|
method |
Character. Detection method:
|
late_threshold |
Integer. DOY threshold for "late season" events. Default is 250 (~September 7). Events after this date for spring phases are flagged as potential second events. |
gap_threshold |
Integer. For "multiple_per_year" method, minimum days between observations to consider them separate events. Default is 60. |
reference_period |
Integer vector of years to calculate expected
phenological windows. Default is |
z_threshold |
Numeric. Z-score threshold for identifying late events relative to the reference period. Default is 3. |
min_obs |
Integer. Minimum observations in reference period to calculate expected windows. Default is 10. |
Second flowering and other repeated phenological events are increasingly reported as climate patterns become more irregular. These events may indicate:
Disrupted coordination between environmental cues and development
Response to unusual autumn warmth after summer dormancy
Failure of normal seasonal inhibition mechanisms
The function uses two complementary approaches:
Late season detection: Identifies observations occurring well after the normal phenological window (e.g., flowering in autumn)
Multiple events: Finds stations reporting the same phase multiple times in one year with sufficient gap between observations
A list of class second_events containing:
Data.table of detected second/repeated events with columns: s_id, species, phase_id, year, day, event_type, expected timing info
Summary statistics by species, phase, and event type
Annual counts of detected events
Monthly distribution of events
Total observations per year (for relative scaling in plots)
Detection method used
Parameters used for detection
Not all detected events are necessarily "true" second flowering. Consider:
High confidence: Same station reports phase twice with 60+ day gap; late observation is clearly outside normal window
Medium confidence: Single late observation at a station; could be second event or data entry error
Requires verification: Check original data sources, geographic context, species biology
Matthias Templ
Related to emerging research on phenological irregularity and climate disruption of seasonal timing.
pep_flag_outliers for general outlier detection,
pep_plot_outliers for outlier visualization
pep <- pep_download() # Use grapevine flowering in Alpine region (long records, second events possible) pep_subset <- pep[species == "Vitis vinifera" & phase_id %in% c(60, 65) & country %in% c("Germany-South", "Switzerland", "Austria")] if (nrow(pep_subset) > 0) { # Method 1: Detect late-season flowering events (default) second <- pep_second_events(pep_subset, phases = c(60, 65)) print(second) summary(second) # Plot from 1980 onwards for clearer trends plot(second, from_year = 1980) plot(second, type = "overview", scale = "relative", from_year = 1980) # Method 2: Detect repeated observations at same station repeated <- pep_second_events(pep_subset, method = "multiple_per_year") print(repeated) plot(repeated, type = "timeline", from_year = 1980) # Method 3: Combine both detection methods all_events <- pep_second_events(pep_subset, method = "both") summary(all_events) plot(all_events) plot(all_events, scale = "relative") }pep <- pep_download() # Use grapevine flowering in Alpine region (long records, second events possible) pep_subset <- pep[species == "Vitis vinifera" & phase_id %in% c(60, 65) & country %in% c("Germany-South", "Switzerland", "Austria")] if (nrow(pep_subset) > 0) { # Method 1: Detect late-season flowering events (default) second <- pep_second_events(pep_subset, phases = c(60, 65)) print(second) summary(second) # Plot from 1980 onwards for clearer trends plot(second, from_year = 1980) plot(second, type = "overview", scale = "relative", from_year = 1980) # Method 2: Detect repeated observations at same station repeated <- pep_second_events(pep_subset, method = "multiple_per_year") print(repeated) plot(repeated, type = "timeline", from_year = 1980) # Method 3: Combine both detection methods all_events <- pep_second_events(pep_subset, method = "both") summary(all_events) plot(all_events) plot(all_events, scale = "relative") }
A small subset of phenological data suitable for use with pep_simulate
to generate synthetic phenological datasets. This dataset contains real structure
but limited scope, making it suitable for package examples and testing.
A data.table with 1,319 rows and 10 variables:
Station identifier (factor)
Longitude in decimal degrees
Latitude in decimal degrees
Altitude in meters
Plant genus (factor)
Full species name (factor)
BBCH phenological phase code (integer)
Observation year
Day of year (DOY) of the phenological event
Country name
This dataset includes observations for:
Species: Triticum aestivum (wheat), Vitis vinifera (grapevine), Malus domestica (apple)
Phases: BBCH 10 (emergence), 60 (flowering), 65 (full flowering)
Countries: Austria, Germany-South
Years: 1990-2015
Stations: 30 stations
For a full synthetic dataset, use pep_download to download
pre-generated synthetic data from the package repository.
Derived from PEP725 Pan European Phenology Database https://pep725.eu/
pep_simulate for generating synthetic data,
pep_download for downloading full synthetic dataset
# Load the seed dataset data(pep_seed) # Examine structure str(pep_seed) # Use with pep_simulate to create synthetic data pep_synthetic <- pep_simulate(pep_seed)# Load the seed dataset data(pep_seed) # Examine structure str(pep_seed) # Use with pep_simulate to create synthetic data pep_synthetic <- pep_simulate(pep_seed)
Generates synthetic phenological observations by fitting a GAM smooth
(day ~ s(year)) per station-species-phase group and replacing the
observed day values with predictions plus Gaussian noise. The
station-year-species-phase structure of the input data is kept intact.
pep_simulate(pep, min_obs = 20, seed = 42, progress = TRUE)pep_simulate(pep, min_obs = 20, seed = 42, progress = TRUE)
pep |
A |
min_obs |
Integer. Minimum number of observations required per
station-species-phase group to attempt GAM fitting.
Groups with fewer observations keep their original |
seed |
Integer. Random seed for reproducibility. Default is 42. |
progress |
Logical. Show a progress bar? Default |
For each qualifying group (station x species x phase with at least
min_obs observations), a GAM is fitted and synthetic day-of-year
values are generated as round(predicted + rnorm(n, sd = residual_sd)).
Groups where the GAM fit fails receive a fallback: the species-phase median day plus Gaussian jitter (sd = 5 days).
Groups below the min_obs threshold are left unchanged. A message
reports how many rows were not simulated.
A pep object with the same rows and columns
as the input. The day column is overwritten with synthetic values
for groups that meet the min_obs threshold; other rows retain their
original values.
Matthias Templ
pep <- pep_download() # Single species + country for fast example vine_ch <- pep[species == "Vitis vinifera" & country == "Switzerland"] pep_synth <- pep_simulate(vine_ch)pep <- pep_download() # Single species + country for fast example vine_ch <- pep[species == "Vitis vinifera" & country == "Switzerland"] pep_synth <- pep_simulate(vine_ch)
S3 class for PEP725 phenological data that inherits from data.table.
Provides validation, informative print/summary methods, and convenient defaults.
This function provides an interactive demonstration of the main features of the pep725 package for PEP725 phenological data analysis. It showcases data exploration, filtering, summarization, and visualization capabilities.
pep725_demo(which = "all", pause = interactive())pep725_demo(which = "all", pause = interactive())
which |
Character vector specifying which demos to run. Options are:
|
pause |
Logical. If |
The demonstration downloads synthetic PEP725 data using pep_download()
and walks through typical workflows for phenological data analysis:
Class demo: Shows the enhanced print and summary methods for the pep class, demonstrating how to quickly understand the data.
Filter demo: Demonstrates selecting specific species, phases, and time periods using data.table syntax.
Plot demo: Creates visualizations including station maps, time series, and DOY distributions.
Analysis demo: Shows phenological normals calculation, anomaly detection, and data quality assessment.
Invisibly returns a list containing example outputs from each demo.
Matthias Templ
pep_download for downloading synthetic data,
pheno_normals for calculating normals,
pheno_anomaly for anomaly detection,
pep_quality for quality assessment
if (interactive()) { # Run all demos interactively pep725_demo() # Run specific demos pep725_demo(which = "class") pep725_demo(which = c("filter", "plot")) # Run without pausing (for scripts) pep725_demo(pause = FALSE) }if (interactive()) { # Run all demos interactively pep725_demo() # Run specific demos pep725_demo(which = "class") pep725_demo(which = c("filter", "plot")) # Run without pausing (for scripts) pep725_demo(pause = FALSE) }
Computes deviations from long-term phenological baselines to identify extreme years and detect climate signals. Returns both raw anomalies (in days) and standardized anomalies for cross-species comparison.
pheno_anomaly( pep, baseline_period = 1961:1990, target_years = NULL, by = c("country", "genus", "phase_id"), species = NULL, phase_id = NULL, robust = TRUE, min_years = 15, extreme_threshold = 2, normals = NULL, na.rm = TRUE )pheno_anomaly( pep, baseline_period = 1961:1990, target_years = NULL, by = c("country", "genus", "phase_id"), species = NULL, phase_id = NULL, robust = TRUE, min_years = 15, extreme_threshold = 2, normals = NULL, na.rm = TRUE )
pep |
A |
baseline_period |
Integer vector specifying the years to use for
baseline calculation. Default is |
target_years |
Integer vector specifying years to calculate anomalies
for. If |
by |
Character vector of column names to group by. Must match columns
in the data. Default is |
species |
Optional character string to filter by species/genus.
If |
phase_id |
Optional integer vector to filter by BBCH phase codes.
If |
robust |
Logical. If |
min_years |
Minimum number of years required in the baseline period to calculate valid anomalies. Default is 15. |
extreme_threshold |
Numeric. Z-score threshold for flagging extreme events. Default is 2 (approximately 95th percentile for normal distribution). |
normals |
Optional pre-computed |
na.rm |
Logical. Should missing values be removed? Default |
Phenological anomalies quantify how much a given year deviates from long-term expectations. This is crucial for:
Detecting climate change signals in phenology
Identifying extreme phenological years
Comparing anomalies across different species/regions
The function supports two approaches:
Robust (default): Uses median for central tendency and MAD (Median Absolute Deviation) for spread. More resistant to outliers.
Classical: Uses mean and standard deviation. More sensitive to extreme values but provides interpretable z-scores under normality.
A data.table with the following columns:
Grouping variables as specified in by
Year of observation
Observed mean day-of-year for the group/year
Long-term baseline (median if robust, mean otherwise)
Baseline spread (MAD if robust, SD otherwise)
Deviation in days (negative = early, positive = late)
Standardized anomaly (anomaly / spread)
Percentile rank relative to baseline distribution
Logical flag for extreme events (|z_score| > threshold)
Character: "early", "late", or "normal"
anomaly_days: Direct interpretation - "X days earlier/later than normal"
z_score: Standardized - values > 2 or < -2 are typically considered extreme
percentile: Normal approximation from z-score - e.g., 5th percentile means "earlier than 95\
Matthias Templ
Menzel, A. et al. (2006). European phenological response to climate change matches the warming pattern. Global Change Biology, 12(10), 1969–1976. doi:10.1111/j.1365-2486.2006.01193.x
Templ, M., Templ, B., and Filzmoser, P. (2018). Modelling and prediction of phenological data of the beginning of flowering: Austrian indicators of climate change. International Journal of Biometeorology, 62, 1319–1334. doi:10.1007/s00484-018-1534-2
pheno_normals for calculating baseline statistics,
pep_download for obtaining the main dataset
pep <- pep_download() # Grapevine in Switzerland (longest historical records back to 1775) vine_ch <- pep[country == "Switzerland" & species == "Vitis vinifera"] if (nrow(vine_ch) > 0) { # Calculate anomalies relative to 1961-1990 baseline anomalies <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, phase_id = 65) print(anomalies) # Find extreme early years extreme_early <- anomalies[is_extreme == TRUE & direction == "early"] extreme_early # Anomalies for recent years only recent <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, target_years = 2000:2020, phase_id = 65) recent }pep <- pep_download() # Grapevine in Switzerland (longest historical records back to 1775) vine_ch <- pep[country == "Switzerland" & species == "Vitis vinifera"] if (nrow(vine_ch) > 0) { # Calculate anomalies relative to 1961-1990 baseline anomalies <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, phase_id = 65) print(anomalies) # Find extreme early years extreme_early <- anomalies[is_extreme == TRUE & direction == "early"] extreme_early # Anomalies for recent years only recent <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, target_years = 2000:2020, phase_id = 65) recent }
Estimates a combined (regional/national) phenological time series from multi-station observations by separating year effects from station effects. This is essential for creating representative time series from networks with varying station coverage over time.
pheno_combine( pep, by = NULL, method = c("robust", "mixed", "ols"), min_years = 5, min_stations = 3, check_connectivity = TRUE )pheno_combine( pep, by = NULL, method = c("robust", "mixed", "ols"), min_years = 5, min_stations = 3, check_connectivity = TRUE )
pep |
A |
by |
Character vector of grouping variables (e.g., |
method |
Character. Estimation method:
|
min_years |
Integer. Minimum years of overlap required. Default 5. |
min_stations |
Integer. Minimum stations required. Default 3. |
check_connectivity |
Logical. If |
The model fitted is:
where is the year effect (combined series), is
the station effect, and is the residual.
Station effects represent systematic differences between stations (e.g., due to altitude, microclimate, or observer differences). The combined series removes these effects to reveal the underlying temporal trend.
An object of class pheno_combined containing:
Data.table with the combined time series: year, year_effect (the combined DOY), se (standard error, if available)
Data.table of station effects: s_id, effect, se
Data.table with original data plus fitted values and residuals for outlier detection
Estimation method used
Connectivity check results (if performed)
Grouping variables used
For grouped analysis, results per group
A prerequisite for valid estimation is that station-year combinations form a "connected" set - meaning there must be overlapping years between stations to separate year and station effects. The function checks this automatically and warns if data is disconnected.
With method = "robust", large residuals (e.g., > 30 days) indicate
potential outliers or data errors. The residuals table can be used to
identify and investigate these.
Matthias Templ
Schaber J, Badeck F-W (2002). Evaluation of methods for the combination of phenological time series and outlier detection. Tree Physiology 22:973-982.
pheno_normals for climatological baselines,
pheno_trend_turning for trend turning point detection,
pep_check_connectivity for connectivity assessment
pep <- pep_download() # Grapevine flowering in Switzerland (longest historical records) vine_ch <- pep[species == "Vitis vinifera" & phase_id == 65 & country == "Switzerland"] if (nrow(vine_ch) > 0) { # Create combined series using OLS combined <- pheno_combine(vine_ch, method = "ols") print(combined) # Identify potential outliers (residuals > 30 days) outliers <- combined$residuals[abs(residual) > 30] outliers }pep <- pep_download() # Grapevine flowering in Switzerland (longest historical records) vine_ch <- pep[species == "Vitis vinifera" & phase_id == 65 & country == "Switzerland"] if (nrow(vine_ch) > 0) { # Create combined series using OLS combined <- pheno_combine(vine_ch, method = "ols") print(combined) # Identify potential outliers (residuals > 30 days) outliers <- combined$residuals[abs(residual) > 30] outliers }
Quantifies how phenological timing shifts with altitude or latitude, calculating the phenological lapse rate using robust regression methods.
pheno_gradient( pep, variable = c("alt", "lat"), species = NULL, phase_id = NULL, year_range = NULL, by = NULL, method = c("robust", "ols", "quantile"), aggregate = TRUE, na.rm = TRUE )pheno_gradient( pep, variable = c("alt", "lat"), species = NULL, phase_id = NULL, year_range = NULL, by = NULL, method = c("robust", "ols", "quantile"), aggregate = TRUE, na.rm = TRUE )
pep |
A |
variable |
Character string specifying the gradient variable.
Either |
species |
Optional character string to filter by species/genus. |
phase_id |
Optional integer to filter by a single BBCH phase code. |
year_range |
Optional integer vector of length 2 specifying years to include. |
by |
Optional character vector of grouping variables (e.g., "country"). If provided, separate gradients are calculated for each group. |
method |
Character string specifying regression method:
|
aggregate |
Logical. If |
na.rm |
Logical. Should missing values be removed? Default |
Phenological gradients describe how the timing of biological events changes with environmental factors. This function calculates:
Altitude gradient: Typically 2-4 days delay per 100m elevation gain in temperate regions
Latitude gradient: Typically 2-5 days delay per degree north in Europe
The robust regression method (default) is recommended because phenological data often contains outliers from observation errors or microclimate effects.
A list with class pheno_gradient containing:
A data.table with slope, intercept, R-squared,
RMSE, n_stations, and interpretation
The fitted model object(s)
The data used for fitting
The gradient variable used
The regression method used
The slope represents:
For altitude: days delay per 100 meters elevation increase
For latitude: days delay per degree latitude increase
Positive slopes indicate later phenology at higher elevations/latitudes.
Matthias Templ
Defila, C. and Clot, B. (2001). Phytophenological trends in Switzerland. International Journal of Biometeorology, 45, 203–207. doi:10.1007/s004840100101
Ziello, C. et al. (2009). Influence of altitude on phenology of selected plant species in the Alpine region (1971–2000). Climate Research, 39, 227–234. doi:10.3354/cr00822
pheno_normals for calculating climatological normals,
lmrob for robust regression details
pep <- pep_download() # Subset to Alpine countries (good elevation variation) pep_subset <- pep[country %in% c("Switzerland", "Austria")] # Altitude gradient for apple flowering grad_alt <- pheno_gradient(pep_subset, variable = "alt", species = "Malus", phase_id = 60) print(grad_alt) # Latitude gradient grad_lat <- pheno_gradient(pep_subset, variable = "lat", species = "Malus", phase_id = 60) print(grad_lat) # Compare regression methods grad_ols <- pheno_gradient(pep_subset, species = "Malus", phase_id = 60, method = "ols") print(grad_ols) grad_robust <- pheno_gradient(pep_subset, species = "Malus", phase_id = 60, method = "robust") print(grad_robust) # Gradient by country (separate regression per country) grad_by_country <- pheno_gradient(pep_subset, variable = "alt", species = "Malus", phase_id = 60, by = "country") print(grad_by_country)pep <- pep_download() # Subset to Alpine countries (good elevation variation) pep_subset <- pep[country %in% c("Switzerland", "Austria")] # Altitude gradient for apple flowering grad_alt <- pheno_gradient(pep_subset, variable = "alt", species = "Malus", phase_id = 60) print(grad_alt) # Latitude gradient grad_lat <- pheno_gradient(pep_subset, variable = "lat", species = "Malus", phase_id = 60) print(grad_lat) # Compare regression methods grad_ols <- pheno_gradient(pep_subset, species = "Malus", phase_id = 60, method = "ols") print(grad_ols) grad_robust <- pheno_gradient(pep_subset, species = "Malus", phase_id = 60, method = "robust") print(grad_robust) # Gradient by country (separate regression per country) grad_by_country <- pheno_gradient(pep_subset, variable = "alt", species = "Malus", phase_id = 60, by = "country") print(grad_by_country)
Launches a Shiny gadget that allows selection of PEP phenological stations by drawing a bounding box or clicking individual points on the map.
pheno_leaflet(pep, label_col = NULL, quiet = FALSE)pheno_leaflet(pep, label_col = NULL, quiet = FALSE)
pep |
A data frame or data.table with columns |
label_col |
Optional column name to show as label in popups (e.g. |
quiet |
Logical. If |
Note: Loading time depends on the size of the dataset. For the full PEP725 dataset with millions of observations, expect 10-20 seconds for the map to fully render. Consider filtering the data before calling this function to speed up loading:
# Filter to a specific species first wheat <- pep[genus == "Triticum"] selected <- pheno_leaflet(wheat)
A data.frame with selected stations (columns: lon, lat, and metadata).
Matthias Templ
if (interactive()) { pep <- pep_download() # For faster loading, filter the data first (recommended) # Grapevine has the longest historical records pep_subset <- pep[species == "Vitis vinifera"] if (nrow(pep_subset) > 0) { # Launch interactive selection gadget selected <- pheno_leaflet(pep_subset, label_col = "species") # Inspect selected subset print(selected) } }if (interactive()) { pep <- pep_download() # For faster loading, filter the data first (recommended) # Grapevine has the longest historical records pep_subset <- pep[species == "Vitis vinifera"] if (nrow(pep_subset) > 0) { # Launch interactive selection gadget selected <- pheno_leaflet(pep_subset, label_col = "species") # Inspect selected subset print(selected) } }
Generates a map of PEP725 phenological station locations with optional coloring by various statistics including mean phenological timing, trends, and species variation.
pheno_map( pep, location = c(lon = 6.233, lat = 46.4), zoom = 4, background = c("google", "none"), color_by = c("none", "n_obs", "n_species", "mean_doy", "trend", "species_cv"), phase_id = NULL, period = NULL, min_years = 10, min_species = 3, point_size = 0.8, output_file = NULL, key = NULL )pheno_map( pep, location = c(lon = 6.233, lat = 46.4), zoom = 4, background = c("google", "none"), color_by = c("none", "n_obs", "n_species", "mean_doy", "trend", "species_cv"), phase_id = NULL, period = NULL, min_years = 10, min_species = 3, point_size = 0.8, output_file = NULL, key = NULL )
pep |
A
|
location |
Named vector with center longitude and latitude for the map.
Defaults to near Changins: |
zoom |
Zoom level for the map. For |
background |
Character. Map background type:
|
color_by |
Character. What to color stations by:
|
phase_id |
Integer. BBCH phase code to filter by. Recommended for
|
period |
Integer vector of years to include. Default is all years. For trend calculation, at least 10 years are recommended. |
min_years |
Integer. Minimum years required for trend calculation. Stations with fewer years show as NA. Default is 10. |
min_species |
Integer. Minimum species required for |
point_size |
Size of station points (default: 0.8). |
output_file |
Optional file path to export the plot (e.g. |
key |
Google Maps API key. Only needed for |
The function supports two background types:
Google Maps background (background = "google"):
Provides detailed satellite or terrain imagery but requires a Google Maps API key.
Register at https://console.cloud.google.com/ and enable the Maps Static API.
No background (background = "none"):
Uses country outlines from Natural Earth data. No API key required, making it
suitable for:
CRAN package vignettes (no external API calls)
Reproducible research (no authentication needed)
Publication-quality maps with clean appearance
The color options provide insights into:
mean_doy: Spatial patterns in phenological timing. Earlier timing (lower DOY) typically in southern/lowland areas.
trend: Where phenology is advancing (negative = earlier over time, shown in blue) or delaying (positive, shown in red). Uses Kendall's tau normalized statistic for robustness.
species_cv: Stations where different species show similar timing (low CV) vs. divergent timing (high CV). High variation may indicate species-specific responses to local conditions.
A ggplot map object.
The trend is calculated as Kendall's normalized tau statistic, which ranges roughly from -3 to +3 for typical data:
Negative values (blue): Phenology is getting earlier over time
Positive values (red): Phenology is getting later over time
Values near 0: No clear trend
|tau| > 1.96: Statistically significant at 95% level
Matthias Templ
pheno_normals for detailed normal calculations,
mann_kendall_z for trend calculation,
pheno_synchrony for synchrony analysis
pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] pheno_map(pep_alpine, background = "none", color_by = "n_obs") ## Not run: # With Google Maps background (requires API key) ggmap::register_google(key = "your_api_key_here") pheno_map(pep, background = "google", color_by = "n_species", zoom = 5) ## End(Not run)pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] pheno_map(pep_alpine, background = "none", color_by = "n_obs") ## Not run: # With Google Maps background (requires API key) ggmap::register_google(key = "your_api_key_here") pheno_map(pep, background = "google", color_by = "n_species", zoom = 5) ## End(Not run)
Computes reference "normal" phenology for a specified period, analogous to WMO climate normals. Returns central tendency, spread, and percentile statistics for phenological day-of-year (DOY) values.
pheno_normals( pep, period = 1991:2020, by = c("country", "genus", "phase_id"), species = NULL, phase_id = NULL, min_years = 20, probs = c(0.05, 0.1, 0.25, 0.75, 0.9, 0.95), na.rm = TRUE )pheno_normals( pep, period = 1991:2020, by = c("country", "genus", "phase_id"), species = NULL, phase_id = NULL, min_years = 20, probs = c(0.05, 0.1, 0.25, 0.75, 0.9, 0.95), na.rm = TRUE )
pep |
A |
period |
Integer vector specifying the years to include in the normal
calculation. Default is |
by |
Character vector of column names to group by. Common choices:
Default is |
species |
Optional character string to filter by species column.
Can be genus name (e.g., "Triticum") or full species (e.g., "Triticum aestivum").
If |
phase_id |
Optional integer vector to filter by BBCH phase codes.
If |
min_years |
Minimum number of years required to calculate valid normals.
Default is 20 (WMO standard). Groups with fewer years return |
probs |
Numeric vector of probabilities for percentile calculation.
Default is |
na.rm |
Logical. Should missing values be removed? Default |
Phenological normals provide a baseline for comparing individual years or detecting trends. This function calculates both classical statistics (mean, SD) and robust alternatives (median, MAD, IQR) that are less sensitive to outliers.
The function can be used in two ways:
Full dataset: Pass the complete pep object and use
species and phase_id parameters to filter internally.
Pre-filtered subset: Filter the data first using data.table syntax, then pass the subset to the function.
A data.table with the following columns:
Grouping variables as specified in by
Number of years with data in the period
Total number of observations
Arithmetic mean DOY
Median DOY (more robust to outliers)
Standard deviation
Interquartile range
Median absolute deviation (robust spread)
Percentiles with the default
probs. Custom probs produce column names derived from
the actual levels — integer percents pad to two digits
(e.g., q05, q10), fractional percents use an
underscore in place of the decimal point (e.g., q02_5 for
the 2.5\
The mapping is also stored in attr(result, "q_names").
Character string describing the reference period
1961-1990: Historical reference (pre-acceleration of warming)
1991-2020: Current WMO standard normal period
Matthias Templ
WMO (2017). WMO Guidelines on the Calculation of Climate Normals. WMO-No. 1203, World Meteorological Organization, Geneva.
Templ, M., Templ, B., and Filzmoser, P. (2018). Modelling and prediction of phenological data of the beginning of flowering: Austrian indicators of climate change. International Journal of Biometeorology, 62, 1319–1334. doi:10.1007/s00484-018-1534-2
pheno_anomaly for calculating anomalies relative to normals,
pep_download for obtaining the main dataset
# Download synthetic data first pep <- pep_download() # Use Swiss subset for faster computation pep_ch <- pep[country == "Switzerland"] # Calculate normals for all species and phases normals_all <- pheno_normals(pep_ch) # Normals for apples, flowering phase (60) apple_normals <- pheno_normals(pep_ch, species = "Malus", phase_id = 60) # Using a pre-filtered subset apples <- pep_ch[genus == "Malus" & phase_id %in% c(60, 65)] apple_normals <- pheno_normals(apples, by = c("species", "phase_id")) # Station-level normals for detailed analysis station_normals <- pheno_normals(pep_ch, species = "Malus", phase_id = 60, by = c("s_id")) # Historical reference period historical <- pheno_normals(pep_ch, period = 1961:1990, species = "Malus") # Compare two periods period1 <- pheno_normals(pep_ch, period = 1961:1990, species = "Malus") period2 <- pheno_normals(pep_ch, period = 1991:2020, species = "Malus") shift <- period2$mean_doy - period1$mean_doy# Download synthetic data first pep <- pep_download() # Use Swiss subset for faster computation pep_ch <- pep[country == "Switzerland"] # Calculate normals for all species and phases normals_all <- pheno_normals(pep_ch) # Normals for apples, flowering phase (60) apple_normals <- pheno_normals(pep_ch, species = "Malus", phase_id = 60) # Using a pre-filtered subset apples <- pep_ch[genus == "Malus" & phase_id %in% c(60, 65)] apple_normals <- pheno_normals(apples, by = c("species", "phase_id")) # Station-level normals for detailed analysis station_normals <- pheno_normals(pep_ch, species = "Malus", phase_id = 60, by = c("s_id")) # Historical reference period historical <- pheno_normals(pep_ch, period = 1961:1990, species = "Malus") # Compare two periods period1 <- pheno_normals(pep_ch, period = 1961:1990, species = "Malus") period2 <- pheno_normals(pep_ch, period = 1991:2020, species = "Malus") shift <- period2$mean_doy - period1$mean_doy
Generates time series plots from processed phenology data. Shows DOY trends with faceting by phenophase and spatial scope.
pheno_plot(data_list, alpha_lines = 0.6, linewidth = 0.7)pheno_plot(data_list, alpha_lines = 0.6, linewidth = 0.7)
data_list |
A named list of prepared data objects, typically output from
|
alpha_lines |
Alpha transparency for time series lines (default is |
linewidth |
Line width for time series lines (default is |
This function visualizes DOY trends with faceting by phenophase and spatial scope.
A ggplot object for visual inspection and further customization.
Matthias Templ
# Construct a minimal data_list with ts_tidy ts_tidy <- data.frame( year = rep(1990:1995, 2), DOY = c(120, 118, 115, 119, 113, 110, 125, 122, 120, 124, 118, 115), source = rep(c("PEP725 (aggregated)", "PEP725 (near Changins)"), each = 6), phase = "Heading", panel = "DOY" ) pheno_plot(list(ts_tidy = ts_tidy))# Construct a minimal data_list with ts_tidy ts_tidy <- data.frame( year = rep(1990:1995, 2), DOY = c(120, 118, 115, 119, 113, 110, 125, 122, 120, 124, 118, 115), source = rep(c("PEP725 (aggregated)", "PEP725 (near Changins)"), each = 6), phase = "Heading", panel = "DOY" ) pheno_plot(list(ts_tidy = ts_tidy))
Visualizes phenological time series for heading and harvest phases, comparing aggregated and spatially filtered PEP725 data.
pheno_plot_hh( data_list, phase_select = NULL, alpha_lines = 0.6, linewidth = 0.7 )pheno_plot_hh( data_list, phase_select = NULL, alpha_lines = 0.6, linewidth = 0.7 )
data_list |
A named list returned by |
phase_select |
Character. Optional filter for a specific phenological phase (e.g., "Heading", "Harvest"). If NULL, all phases are shown. |
alpha_lines |
Numeric. Transparency level for lines (default is 0.6). |
linewidth |
Numeric. Line width for time series plots (default is 0.7). |
This function shows DOY trends by phase and data source across years, with month labels on the y-axis for easier interpretation.
A ggplot object.
Matthias Templ
pep <- pep_download() out <- pheno_regional_hh(pep, species_name = "Triticum aestivum") pheno_plot_hh(out)pep <- pep_download() out <- pheno_regional_hh(pep, species_name = "Triticum aestivum") pheno_plot_hh(out)
Visualizes phenological day-of-year (DOY) trends over time since 1961. Designed for aggregated or site-level phenology data from PEP725.
pheno_plot_timeseries( data, color_by = "species", facet_by = NULL, alpha_lines = 0.7, linewidth = 0.8, smooth = TRUE, se = FALSE, year_min = 1961, title = "Phenological time series of flowering (BBCH 65)" )pheno_plot_timeseries( data, color_by = "species", facet_by = NULL, alpha_lines = 0.7, linewidth = 0.8, smooth = TRUE, se = FALSE, year_min = 1961, title = "Phenological time series of flowering (BBCH 65)" )
data |
A data.frame or data.table containing at least |
color_by |
Character. Column name used for color grouping (default = "species"). |
facet_by |
Character or NULL. Optional column name for faceting (e.g. "functional_group"). |
alpha_lines |
Numeric. Transparency level for time-series lines (default = 0.7). |
linewidth |
Numeric. Line width (default = 0.8). |
smooth |
Logical. If TRUE, adds a linear trend smoother (default = TRUE). |
se |
Logical. Whether to display confidence interval for smoother (default = FALSE). |
year_min |
Numeric. Minimum year to plot (default = 1961). |
title |
Character. Optional plot title. |
A ggplot object.
Matthias Templ
pep <- pep_download() # Use Alpine subset for faster computation pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Example: flowering DOY by genus pheno_plot_timeseries(data = pep_alpine[phase_id == 65], color_by = "genus", facet_by = NULL, smooth = TRUE) # Example: single species, colored by site (grapevine has longest records) vine_data <- pep_alpine[species == "Vitis vinifera" & phase_id == 65] if (nrow(vine_data) > 0) { pheno_plot_timeseries( data = vine_data, color_by = "s_id", smooth = TRUE, title = "Grapevine flowering (BBCH 65) across sites" ) } # Example: facets by genus pheno_plot_timeseries( data = pep_alpine[phase_id == 65], color_by = "species", facet_by = "genus", smooth = TRUE )pep <- pep_download() # Use Alpine subset for faster computation pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Example: flowering DOY by genus pheno_plot_timeseries(data = pep_alpine[phase_id == 65], color_by = "genus", facet_by = NULL, smooth = TRUE) # Example: single species, colored by site (grapevine has longest records) vine_data <- pep_alpine[species == "Vitis vinifera" & phase_id == 65] if (nrow(vine_data) > 0) { pheno_plot_timeseries( data = vine_data, color_by = "s_id", smooth = TRUE, title = "Grapevine flowering (BBCH 65) across sites" ) } # Example: facets by genus pheno_plot_timeseries( data = pep_alpine[phase_id == 65], color_by = "species", facet_by = "genus", smooth = TRUE )
Performs PLS regression to identify temperature-sensitive periods affecting phenological timing. This is useful for understanding which parts of the year have the strongest influence on phenological events.
pheno_pls( pep, temp_data, by = NULL, method = c("robust", "standard"), split_month = 6, runn_mean = 11, ncomp = NULL, expl_var = 30, max_iter = 20, tol = 1e-04 )pheno_pls( pep, temp_data, by = NULL, method = c("robust", "standard"), split_month = 6, runn_mean = 11, ncomp = NULL, expl_var = 30, max_iter = 20, tol = 1e-04 )
pep |
A |
temp_data |
A data.frame with daily temperature data. Must contain
columns |
by |
Character vector of grouping columns to match phenology and
temperature data (e.g., |
method |
Character. PLS method to use:
|
split_month |
Integer (1-12). The last month to include in the phenological year. Default 6 (June) means July-June phenological years. This is important for spring phenology where relevant temperatures span the previous autumn/winter. |
runn_mean |
Integer (odd). Window size for running mean smoothing of temperatures. Default 11 days. Larger values produce smoother results. |
ncomp |
Integer or NULL. Number of PLS components. If NULL (default),
automatically determined to explain at least |
expl_var |
Numeric. Minimum percentage of variance to explain when automatically selecting components. Default 30. |
max_iter |
Integer. Maximum iterations for robust method. Default 20. |
tol |
Numeric. Convergence tolerance for robust method. Default 1e-4. |
PLS regression is particularly useful for phenology-temperature analysis because it handles the high collinearity between daily temperatures (adjacent days are highly correlated) while identifying which periods have the strongest influence on phenological timing.
The robust method uses iteratively reweighted PLS with bisquare weights, which downweights observations with large residuals. This is important for phenological data which may contain outliers from observation errors or unusual weather events.
An object of class pheno_pls containing:
Data.frame with Variable Importance in Projection scores for each day of the phenological year.
PLS regression coefficients for each day.
The fitted PLS model object.
Observation weights (all 1 for standard method).
Model R-squared.
Number of components used.
Method used ("robust" or "standard").
Number of observations.
Number of years analyzed.
Values > 0.8 indicate important variables. Days with high VIP scores have strong influence on phenology timing.
Negative coefficients indicate that warmer temperatures during that period lead to earlier phenology (lower DOY).
Spring phenology (e.g., flowering) is influenced by temperatures from the
previous autumn through spring. The split_month parameter defines
where to split the calendar year. With split_month = 6, a flowering
event in April 2020 is analyzed against temperatures from July 2019
through June 2020.
Matthias Templ
Luedeling E, Gassner A (2012). Partial Least Squares Regression for analyzing walnut phenology in California. Agricultural and Forest Meteorology 158-159:104-113.
Wold S, Sjostrom M, Eriksson L (2001). PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems 58:109-130.
pheno_trend_turning for trend analysis,
pheno_anomaly for anomaly detection
data(pep_seed) # Create example temperature data (fewer years for speed) years <- 2005:2012 temp <- data.frame( year = rep(years, each = 365), doy = rep(1:365, length(years)), Tmin = rnorm(365 * length(years), mean = 5, sd = 8), Tmax = rnorm(365 * length(years), mean = 15, sd = 8) ) # Run standard PLS (fast: no CV, fixed components) result <- pheno_pls(pep_seed, temp, method = "standard", ncomp = 2) print(result)data(pep_seed) # Create example temperature data (fewer years for speed) years <- 2005:2012 temp <- data.frame( year = rep(years, each = 365), doy = rep(1:365, length(years)), Tmin = rnorm(365 * length(years), mean = 5, sd = 8), Tmax = rnorm(365 * length(years), mean = 15, sd = 8) ) # Run standard PLS (fast: no CV, fixed components) result <- pheno_pls(pep_seed, temp, method = "standard", ncomp = 2) print(result)
Extracts and aggregates phenological observations for a selected species and phenological phase from the PEP725 dataset (globally and spatially filtered). Optionally links them with global temperature anomalies (GISS data) for climate impact studies.
pheno_regional( pep, giss = NULL, lon_min = 4.2, lon_max = 8.1, lat_min = 44.7, lat_max = 48.1, species_name = "Triticum aestivum", functional_group = NULL, year_min = 1961, pep_for_giss = c("near", "aggregated"), phase = 60 )pheno_regional( pep, giss = NULL, lon_min = 4.2, lon_max = 8.1, lat_min = 44.7, lat_max = 48.1, species_name = "Triticum aestivum", functional_group = NULL, year_min = 1961, pep_for_giss = c("near", "aggregated"), phase = 60 )
pep |
A data.table containing the full PEP725 dataset. Must include at
least columns: |
giss |
Optional. A data.table with GISS global temperature anomalies,
containing columns |
lon_min |
Minimum longitude for spatial filtering of PEP data (default is |
lon_max |
Maximum longitude for spatial filtering of PEP data (default is |
lat_min |
Minimum latitude for spatial filtering of PEP data (default is |
lat_max |
Maximum latitude for spatial filtering of PEP data (default is |
species_name |
Character name of the species to be extracted (default is |
functional_group |
Character. Optional functional group name to filter by (e.g., "C4_summer_cereals"). If provided, overwrites species_name filtering. |
year_min |
Minimum year to include in all outputs (default is |
pep_for_giss |
Selects which PEP data subset to use for merging with GISS
data. Either |
phase |
Integer or vector of phase_id(s) to extract (default is |
The function aggregates PEP observations by species, year, and phase,
calculates mean day-of-year (DOY), applies spatial filtering if needed,
and attaches GISS climate anomalies. It also maps phase_ids to
descriptive names and warns if expected phenological phase are missing
in the selected data.
If GISS data is provided, temperature anomalies are merged with the selected PEP time series to support phenology-climate analysis.
A named list with the following elements:
pep_aggGlobal (non-spatially filtered) PEP725 time series for the selected phase(s)
pep_cgiSpatially filtered PEP725 time series based on lat/lon bounding box
gissProcessed GISS global temperature anomaly data (if provided)
pep_gissMerged data frame of PEP phenology and GISS anomalies (if provided)
speciesCharacter species name used for filtering
functional_groupFunctional group if specified, otherwise NA
phaseInteger phase ID(s) used for filtering
Matthias Templ
pep <- pep_download() out <- pheno_regional(pep, species_name = "Triticum aestivum", phase = 60) str(out)pep <- pep_download() out <- pheno_regional(pep, species_name = "Triticum aestivum", phase = 60) str(out)
Generates harmonized phenological time series for heading (BBCH 60) and harvest (BBCH 100) phases of a specified species and spatial region. Optionally integrates GISS climate data for climate-sensitivity studies.
pheno_regional_hh( pep, giss = NULL, lon_min = 4.2, lon_max = 8.1, lat_min = 44.7, lat_max = 48.1, species_name = "Triticum aestivum", year_min = 1961, pep_for_giss = c("aggregated", "near") )pheno_regional_hh( pep, giss = NULL, lon_min = 4.2, lon_max = 8.1, lat_min = 44.7, lat_max = 48.1, species_name = "Triticum aestivum", year_min = 1961, pep_for_giss = c("aggregated", "near") )
pep |
A data.table containing PEP725 phenological data. |
giss |
Optional. A data.table with GISS global temperature anomalies
( |
lon_min |
Minimum longitude for bounding box filter (default is 4.2) |
lon_max |
Maximum longitude for bounding box filter (default is 8.1) |
lat_min |
Minimum latitude for bounding box filter (default is 44.7) |
lat_max |
Maximum latitude for bounding box filter (default is 48.1) |
species_name |
Name of the species to filter (default is
|
year_min |
Minimum year for all time series outputs (default is 1961) |
pep_for_giss |
Which PEP725 dataset to use for merging with GISS climate
anomalies. Either |
This function filters the PEP725 dataset by species and optionally by spatial box. It aggregates phenological phases (Heading = 60, Harvest = 100) and computes mean day-of-year (DOY) per year. If GISS data is provided, temperature anomalies are merged with the selected PEP time series to support phenology-climate analysis. Warnings are issued if phases are missing for selected periods.
A named list with the following elements:
ts_tidyLong-format time series with consistent phase and source labeling
pep_aggAggregated PEP725 time series (global species-level)
pep_cgiSpatially filtered PEP725 time series (based on lat/lon box)
gissGISS global temperature anomalies (if provided)
pep_gissPEP time series merged with GISS anomalies (if provided)
speciesSpecies name used for filtering
Matthias Templ
pep <- pep_download() out <- pheno_regional_hh(pep, species_name = "Triticum aestivum") pheno_plot_hh(out)pep <- pep_download() out <- pheno_regional_hh(pep, species_name = "Triticum aestivum") pheno_plot_hh(out)
Measures how synchronized phenological events are across stations within regions and whether synchrony is changing over time. Higher synchrony indicates more uniform timing of phenological events.
pheno_synchrony( pep, species = NULL, phase_id = NULL, by = c("country", "year"), min_stations = 5, compute_trend = TRUE, na.rm = TRUE )pheno_synchrony( pep, species = NULL, phase_id = NULL, by = c("country", "year"), min_stations = 5, compute_trend = TRUE, na.rm = TRUE )
pep |
A |
species |
Optional character string to filter by species/genus. |
phase_id |
Optional integer to filter by a single BBCH phase code. |
by |
Character vector of column names to group by for synchrony calculation.
Default is |
min_stations |
Minimum number of stations required to calculate synchrony. Default is 5. Groups with fewer stations return NA. |
compute_trend |
Logical. If |
na.rm |
Logical. Should missing values be removed? Default |
Synchrony measures how similar phenological timing is across different stations within the same region and year. This is important for:
Understanding spatial coherence of phenological signals
Detecting changes in spatial variability over time
Assessing network representativeness
A pheno_synchrony object (list) containing:
A data.table with synchrony metrics per group:
n_stations, mean_doy, sd_doy, cv_pct, and quality indicators
If compute_trend = TRUE, a data.table with trend
analysis results per region (slope, p-value, direction)
Summary statistics across all groups
Standard deviation across stations - lower values indicate higher synchrony
Coefficient of variation (SD/mean * 100) - relative measure that allows comparison across phases with different mean timing
Range of DOY values across stations
Negative trend in SD: Increasing synchrony over time
Positive trend in SD: Decreasing synchrony (more variable)
Matthias Templ
Rosenzweig, C. et al. (2008). Attributing physical and biological impacts to anthropogenic climate change. Nature, 453, 353–357. doi:10.1038/nature06937
pheno_normals for climatological statistics,
pheno_anomaly for anomaly detection
pep <- pep_download() # Subset to two countries for speed pep_subset <- pep[country %in% c("Switzerland", "Austria")] # Calculate synchrony for apple flowering by country and year sync <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60) print(sync) # Get trend results (robust regression per country) sync$trend # Synchrony without trend analysis (faster) sync_simple <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60, compute_trend = FALSE) sync_simple # Custom grouping variables sync_detailed <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60, by = c("country", "year")) sync_detailedpep <- pep_download() # Subset to two countries for speed pep_subset <- pep[country %in% c("Switzerland", "Austria")] # Calculate synchrony for apple flowering by country and year sync <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60) print(sync) # Get trend results (robust regression per country) sync$trend # Synchrony without trend analysis (faster) sync_simple <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60, compute_trend = FALSE) sync_simple # Custom grouping variables sync_detailed <- pheno_synchrony(pep_subset, species = "Malus", phase_id = 60, by = c("country", "year")) sync_detailed
Applies a sequential Mann-Kendall test to detect approximate trend turning points in phenological time series. This identifies years where trends may have changed direction (e.g., when spring advancement accelerated or reversed).
pheno_trend_turning(pep, by = NULL, min_years = 10, aggregate = TRUE)pheno_trend_turning(pep, by = NULL, min_years = 10, aggregate = TRUE)
pep |
A |
by |
Character vector of column names to group by before analysis.
Default is |
min_years |
Integer. Minimum number of years required for analysis. Default is 10. Shorter series may produce unreliable results. |
aggregate |
Logical. If |
The sequential Mann-Kendall test calculates two series:
Progressive: Kendall's tau computed from the start forward
Retrograde: Kendall's tau computed from the end backward
Points where these series cross indicate potential trend turning points. When either series exceeds confidence thresholds (|tau| > 1.96 for 95%, |tau| > 2.58 for 99%) before and after a crossing, the turning point is considered statistically significant.
An object of class pheno_turning containing:
Data.table with columns: year, day (or median_day), tau_prog (progressive tau), tau_retr (retrograde tau), is_turning (logical indicating turning points)
Years identified as potential turning points
Number of years in the series
Grouping variables used (if any)
If by is specified, list of results per group
Positive tau indicates an increasing trend (later DOY = delayed phenology)
Negative tau indicates a decreasing trend (earlier DOY = advanced phenology)
Crossing points suggest the trend direction may have changed
Matthias Templ
Sneyers R (1990). On statistical analysis of series of observations. Technical Note No 143. World Meteorological Organization.
pheno_gradient for spatial trend analysis,
pheno_normals for baseline calculations
# Simple vector input (fast) doy_series <- c(120, 118, 122, 115, 110, 108, 112, 105, 102, 100) turning <- pheno_trend_turning(doy_series) print(turning) # Using pep_seed data (no grouping for speed) data(pep_seed) vine <- pep_seed[pep_seed$species == "Vitis vinifera" & pep_seed$phase_id == 65, ] if (nrow(vine) > 0) { turning <- pheno_trend_turning(vine) }# Simple vector input (fast) doy_series <- c(120, 118, 122, 115, 110, 108, 112, 105, 102, 100) turning <- pheno_trend_turning(doy_series) print(turning) # Using pep_seed data (no grouping for speed) data(pep_seed) vine <- pep_seed[pep_seed$species == "Vitis vinifera" & pep_seed$phase_id == 65, ] if (nrow(vine) > 0) { turning <- pheno_trend_turning(vine) }
This function visualizes long-term phenological trends for one genus or species,
using robust MM-regression (robustbase::lmrob) applied to mean annual DOYs.
plot_phenology_trends( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), common_stations = TRUE, combine_regions = FALSE, combine_layout = c("vertical", "horizontal"), years = 1961:2024, calib_years = 1991:2020, pred_years = NULL, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), giss_data = NULL, layout = c("country_phase", "phase_country"), title = NULL )plot_phenology_trends( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), common_stations = TRUE, combine_regions = FALSE, combine_layout = c("vertical", "horizontal"), years = 1961:2024, calib_years = 1991:2020, pred_years = NULL, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), giss_data = NULL, layout = c("country_phase", "phase_country"), title = NULL )
pep |
A PEP725-style data frame with columns |
genus_name |
Character string specifying a genus to filter (optional). |
species_name |
Character string specifying a species to filter (optional). If both genus and species are provided, the species filter is applied last. |
subspecies_name |
Character string specifying a subspecies to filter (optional). Only one of genus_name, species_name, or subspecies_name can be specified. |
phases |
Integer vector of phenological phase IDs to include
(default: |
common_stations |
Logical. If |
combine_regions |
Logical. If |
combine_layout |
Character. Layout for combined regions plot: either |
years |
Numeric vector of years to include in the analysis.
Default is |
calib_years |
Numeric vector specifying the calibration window for robust
regression (default: |
pred_years |
Numeric vector of years for projecting future phenology.
Default is |
subregions |
Character vector of country names to include (default: DACH region). |
giss_data |
Deprecated. Previously used for climate data integration; now ignored. |
layout |
Either |
title |
Optional plot title. If |
The function produces a faceted panel plot showing:
annual mean DOY and interquartile range,
robust trend lines for past and future periods,
CTRL (1991-2020) and PGW (2066-2095) climate windows,
optional layouts (country x phase or phase x country).
The function performs the following steps:
Filter PEP data by species, phases, regions, years.
Aggregate DOYs by year-country-phase.
Fit robust regression over the calibration window.
Predict past and future trends.
Draw ribbon (IQR), annual means, trend lines, climate windows.
Facet by country or by phase.
The robust regression uses robustbase::lmrob, which is resistant to
outliers and non-normality, and therefore ideal for phenological time series.
A ggplot object.
Matthias Templ
pep <- pep_download() plot_phenology_trends( pep, species_name = "Vitis vinifera", phases = c(65, 81), layout = "country_phase" )pep <- pep_download() plot_phenology_trends( pep, species_name = "Vitis vinifera", phases = c(65, 81), layout = "country_phase" )
Provides default visualizations for PEP725 phenological data.
## S3 method for class 'pep' plot(x, type = c("map", "timeseries", "histogram"), ...)## S3 method for class 'pep' plot(x, type = c("map", "timeseries", "histogram"), ...)
x |
A |
type |
Character. Type of plot: |
... |
Additional arguments passed to the underlying plot function. |
A ggplot object (invisibly).
Matthias Templ
pep <- pep_download() plot(pep) plot(pep, type = "timeseries") plot(pep, type = "histogram")pep <- pep_download() plot(pep) plot(pep, type = "timeseries") plot(pep, type = "histogram")
Creates visualizations of data completeness, including heatmaps and bar charts.
## S3 method for class 'pep_completeness' plot(x, type = c("heatmap", "bar", "timeline"), top = 20, ...)## S3 method for class 'pep_completeness' plot(x, type = c("heatmap", "bar", "timeline"), top = 20, ...)
x |
A |
type |
Character. Type of plot: |
top |
Integer. Number of top groups to display. Default is 20. |
... |
Additional arguments (unused) |
A ggplot object (invisibly)
Matthias Templ
pep <- pep_download() pep_ch <- pep[country == "Switzerland"] comp <- pep_completeness(pep_ch, by = c("genus", "phase_id")) plot(comp, type = "heatmap") plot(comp, type = "bar", top = 15)pep <- pep_download() pep_ch <- pep[country == "Switzerland"] comp <- pep_completeness(pep_ch, by = c("genus", "phase_id")) plot(comp, type = "heatmap") plot(comp, type = "bar", top = 15)
Plot Method for PEP Coverage
## S3 method for class 'pep_coverage' plot(x, ...)## S3 method for class 'pep_coverage' plot(x, ...)
x |
A |
... |
Additional arguments (unused). |
A ggplot object (or list of ggplot objects).
Matthias Templ
Plot Method for Outlier Detection Results
## S3 method for class 'pep_outliers' plot(x, type = c("histogram", "scatter", "timeline"), ...)## S3 method for class 'pep_outliers' plot(x, type = c("histogram", "scatter", "timeline"), ...)
x |
A |
type |
Character. Plot type: "histogram" for deviation distribution, "scatter" for DOY vs expected, "timeline" for outliers over time. |
... |
Additional arguments |
A ggplot object (invisibly)
Creates visualizations of phenological data quality metrics, including grade distribution and a map of station quality.
## S3 method for class 'pep_quality' plot( x, which = c("overview", "grades", "map"), pep = NULL, show_grades = c("A", "B", "C", "D"), alpha = 0.6, title = NULL, ... )## S3 method for class 'pep_quality' plot( x, which = c("overview", "grades", "map"), pep = NULL, show_grades = c("A", "B", "C", "D"), alpha = 0.6, title = NULL, ... )
x |
A |
which |
Character. Type of plot to produce:
|
pep |
Optional |
show_grades |
Character vector of grades to display on the map.
Default is |
alpha |
Numeric. Transparency of points on the map (0-1). Default is 0.6. |
title |
Optional character string for the plot title. |
... |
Additional arguments (unused). |
For the map visualization, the function:
Aggregates quality to one grade per station (worst grade if multiple phases)
Uses country borders from the rnaturalearth package
Uses colorblind-friendly colors (blue=A, cyan=B, orange=C, vermillion=D)
A ggplot object (invisibly).
Matthias Templ
pep <- pep_download() # Assess quality for Swiss stations pep_ch <- pep[country == "Switzerland"] quality <- pep_quality(pep_ch, by = c("s_id", "phase_id")) # Grade distribution only (no pep data needed) plot(quality, which = "grades") # Map of station quality (requires pep for coordinates) plot(quality, which = "map", pep = pep_ch) # Map showing only poor-quality stations plot(quality, which = "map", pep = pep_ch, show_grades = "D") # Map showing problematic stations (C and D grades) plot(quality, which = "map", pep = pep_ch, show_grades = c("C", "D")) # Overview: grades + map plot(quality, pep = pep_ch)pep <- pep_download() # Assess quality for Swiss stations pep_ch <- pep[country == "Switzerland"] quality <- pep_quality(pep_ch, by = c("s_id", "phase_id")) # Grade distribution only (no pep data needed) plot(quality, which = "grades") # Map of station quality (requires pep for coordinates) plot(quality, which = "map", pep = pep_ch) # Map showing only poor-quality stations plot(quality, which = "map", pep = pep_ch, show_grades = "D") # Map showing problematic stations (C and D grades) plot(quality, which = "map", pep = pep_ch, show_grades = c("C", "D")) # Overview: grades + map plot(quality, pep = pep_ch)
Visualise phenological anomalies as a colour-coded timeline or as a histogram of anomaly magnitudes.
## S3 method for class 'pheno_anomaly' plot(x, which = c("timeline", "histogram"), ...)## S3 method for class 'pheno_anomaly' plot(x, which = c("timeline", "histogram"), ...)
x |
A |
which |
Character string selecting the plot type: |
... |
Additional arguments (unused). |
A ggplot object (returned invisibly).
Matthias Templ
pep <- pep_download() vine_ch <- pep[country == "Switzerland" & species == "Vitis vinifera"] if (nrow(vine_ch) > 0) { a <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, phase_id = 65, by = "phase_id") plot(a) plot(a, which = "histogram") }pep <- pep_download() vine_ch <- pep[country == "Switzerland" & species == "Vitis vinifera"] if (nrow(vine_ch) > 0) { a <- pheno_anomaly(vine_ch, baseline_period = 1961:1990, phase_id = 65, by = "phase_id") plot(a) plot(a, which = "histogram") }
Plot Method for Combined Time Series
## S3 method for class 'pheno_combined' plot(x, type = c("series", "stations", "residuals"), group = NULL, ...)## S3 method for class 'pheno_combined' plot(x, type = c("series", "stations", "residuals"), group = NULL, ...)
x |
A |
type |
Character. "series" for combined time series, "stations" for station effects, "residuals" for residual distribution. |
group |
For grouped analysis, which group to plot. |
... |
Additional arguments |
A ggplot object (invisibly)
Plot Method for Phenological Gradient Analysis
## S3 method for class 'pheno_gradient' plot(x, ...)## S3 method for class 'pheno_gradient' plot(x, ...)
x |
A |
... |
Additional arguments passed to ggplot |
A ggplot object
Matthias Templ
Visualise phenological normals as a dot plot with interquartile-range bars or as a bar chart.
## S3 method for class 'pheno_normals' plot(x, which = c("dotplot", "bar"), ...)## S3 method for class 'pheno_normals' plot(x, which = c("dotplot", "bar"), ...)
x |
A |
which |
Character string selecting the plot type: |
... |
Additional arguments (unused). |
A ggplot object (returned invisibly).
Matthias Templ
pep <- pep_download() pep_ch <- pep[country == "Switzerland"] # Normals by phase for apple n <- pheno_normals(pep_ch, species = "Malus", by = "phase_id") plot(n) plot(n, which = "bar")pep <- pep_download() pep_ch <- pep[country == "Switzerland"] # Normals by phase for apple n <- pheno_normals(pep_ch, species = "Malus", by = "phase_id") plot(n) plot(n, which = "bar")
Plot Method for PLS Phenology Results
## S3 method for class 'pheno_pls' plot(x, type = c("vip", "coef", "both"), vip_threshold = 0.8, ...)## S3 method for class 'pheno_pls' plot(x, type = c("vip", "coef", "both"), vip_threshold = 0.8, ...)
x |
A |
type |
Character. Plot type: "vip" for VIP scores (default), "coef" for coefficients, "both" for combined plot. |
vip_threshold |
Numeric. Threshold line for VIP plot. Default 0.8. |
... |
Additional arguments passed to plotting functions |
A ggplot object (invisibly)
Creates a time series plot of synchrony (SD) over years.
## S3 method for class 'pheno_synchrony' plot(x, region = NULL, ...)## S3 method for class 'pheno_synchrony' plot(x, region = NULL, ...)
x |
A |
region |
Optional region to plot (for multi-region analyses) |
... |
Additional arguments passed to ggplot |
A ggplot object
Matthias Templ
Plot Method for Trend Turning Analysis
## S3 method for class 'pheno_turning' plot(x, group = NULL, show_thresholds = TRUE, ...)## S3 method for class 'pheno_turning' plot(x, group = NULL, show_thresholds = TRUE, ...)
x |
A |
group |
Character. For grouped analysis, which group to plot.
If |
show_thresholds |
Logical. Show significance threshold lines?
Default |
... |
Additional arguments passed to ggplot |
A ggplot object (invisibly)
Plot Method for Second Events Detection
## S3 method for class 'second_events' plot( x, type = c("overview", "timeline", "seasonal", "map"), scale = c("absolute", "relative"), from_year = NULL, ... )## S3 method for class 'second_events' plot( x, type = c("overview", "timeline", "seasonal", "map"), scale = c("absolute", "relative"), from_year = NULL, ... )
x |
A |
type |
Character. Plot type: "overview", "timeline", "seasonal", "map" |
scale |
Character. Scale for y-axis in timeline/overview plots:
|
from_year |
Integer. If specified, only show data from this year onwards
in timeline and overview plots. Useful for focusing on recent trends.
Default is |
... |
Additional arguments |
A ggplot object (invisibly)
Displays a concise summary of the PEP725 phenological dataset.
## S3 method for class 'pep' print(x, n = 5, ...)## S3 method for class 'pep' print(x, n = 5, ...)
x |
A |
n |
Number of rows to display (default 5). |
... |
Additional arguments passed to print. |
Invisibly returns x.
Matthias Templ
pep <- pep_download() print(pep)pep <- pep_download() print(pep)
Print Method for Completeness Assessment
## S3 method for class 'pep_completeness' print(x, n = 15, ...)## S3 method for class 'pep_completeness' print(x, n = 15, ...)
x |
A |
n |
Number of rows to display |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for PEP Coverage
## S3 method for class 'pep_coverage' print(x, ...)## S3 method for class 'pep_coverage' print(x, ...)
x |
A |
... |
Additional arguments (unused). |
Invisibly returns x.
Matthias Templ
Print Method for Outlier Detection Results
## S3 method for class 'pep_outliers' print(x, ...)## S3 method for class 'pep_outliers' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Print Method for Data Quality Assessment
## S3 method for class 'pep_quality' print(x, n = 10, ...)## S3 method for class 'pep_quality' print(x, n = 10, ...)
x |
A |
n |
Number of rows to display |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Phase Check Results
## S3 method for class 'phase_check' print(x, ...)## S3 method for class 'phase_check' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Multi-Species Phase Check
## S3 method for class 'phase_check_multi' print(x, ...)## S3 method for class 'phase_check_multi' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Phenological Anomalies
## S3 method for class 'pheno_anomaly' print(x, n = 10, ...)## S3 method for class 'pheno_anomaly' print(x, n = 10, ...)
x |
A |
n |
Number of rows to display |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Combined Time Series
## S3 method for class 'pheno_combined' print(x, ...)## S3 method for class 'pheno_combined' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Print Method for Phenological Gradient Analysis
## S3 method for class 'pheno_gradient' print(x, ...)## S3 method for class 'pheno_gradient' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Phenological Normals
## S3 method for class 'pheno_normals' print(x, n = 10, ...)## S3 method for class 'pheno_normals' print(x, n = 10, ...)
x |
A |
n |
Number of rows to display |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for PLS Phenology Results
## S3 method for class 'pheno_pls' print(x, ...)## S3 method for class 'pheno_pls' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Print Method for Phenological Synchrony Analysis
## S3 method for class 'pheno_synchrony' print(x, n = 10, ...)## S3 method for class 'pheno_synchrony' print(x, n = 10, ...)
x |
A |
n |
Number of rows to display |
... |
Additional arguments (unused) |
Invisibly returns x
Matthias Templ
Print Method for Trend Turning Analysis
## S3 method for class 'pheno_turning' print(x, ...)## S3 method for class 'pheno_turning' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Print Method for Second Events Detection
## S3 method for class 'second_events' print(x, ...)## S3 method for class 'second_events' print(x, ...)
x |
A |
... |
Additional arguments (unused) |
Invisibly returns x
Extracts a subset of phenological observations from a processed PEP time series (e.g., output from species-level aggregation), filters by species, year, and phenophase ID, and attaches meaningful phase labels based on standard BBCH codes.
select_phase(dt, label, sp, yrmin, phases = c(60, 100))select_phase(dt, label, sp, yrmin, phases = c(60, 100))
dt |
A |
label |
A character string used as identifier for source and site columns in the output. |
sp |
Species name (as character string) to filter, e.g., |
yrmin |
Minimum year (inclusive) for filtering. |
phases |
Integer vector of phase IDs to include (default: |
This function uses an internal lookup table to map numeric phase_id codes to descriptive BBCH phase names (e.g., 60 = "Heading", 100 = "Harvest").
Unmapped phase IDs will trigger a warning. If expected phases are missing from the result, a separate warning is issued.
The function is particularly useful for visualizing specific growth stages or comparing phenological trends across datasets and locations.
A data.table with columns:
yearYear of observation
phaseMapped BBCH phenological stage name
DOYMean day-of-year for this phase, species, and year
sourceSet to label
siteSet to label
Matthias Templ
pheno_regional, pep_download, pheno_plot
pep <- pep_download() agg <- pheno_regional(pep, species_name = "Triticum aestivum", phase = 60)$pep_agg df <- select_phase(agg, label = "PEP725", sp = "Triticum aestivum", yrmin = 1961) head(df)pep <- pep_download() agg <- pheno_regional(pep, species_name = "Triticum aestivum", phase = 60)$pep_agg df <- select_phase(agg, label = "PEP725", sp = "Triticum aestivum", yrmin = 1961) head(df)
This function produces a complete report for phenological subspecies availability, combining: (1) a tidy summary table, (2) a publication-ready wide table, and (3) a ggplot2 heatmap showing data presence/absence.
subspecies_report( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), years = 1961:2024, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), include_empty = TRUE, metric = "all", bbch_names = NULL, return_results = TRUE )subspecies_report( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), years = 1961:2024, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), include_empty = TRUE, metric = "all", bbch_names = NULL, return_results = TRUE )
pep |
A PEP725-style data frame containing at least:
|
genus_name |
Character (optional). Genus to filter. |
species_name |
Character (optional). Species to filter. |
subspecies_name |
Character vector (optional). One or more subspecies. |
phases |
Integer vector. BBCH phases to evaluate. |
years |
Numeric vector. Years to include. |
subregions |
Character vector of region/country names. |
include_empty |
Logical. If TRUE, include subspecies x country rows even when no observations are present. |
metric |
Character vector. Metrics to compute. Use |
bbch_names |
Optional named list mapping BBCH phase numbers to labels,
e.g. |
return_results |
Logical. If TRUE, returns a list invisibly. |
Users may specify a genus, species, or one or more subspecies. The report evaluates BBCH phases, regions, years, completeness, and several optional availability metrics.
Invisibly returns a list with:
Tidy long-format summary
Publication-ready wide-format summary
A ggplot2 object
Matthias Templ
pheno_trend_turning for trend visualization
pep <- pep_download() # Use Alpine subset for faster computation pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Grapevine subspecies (wine varieties) - longest historical records if (nrow(pep_alpine[species == "Vitis vinifera"]) > 0) { subspecies_report( pep_alpine, subspecies_name = c("Vitis vinifera Riesling", "Vitis vinifera MUELLER THURGAU WEISS"), subregions = c("Switzerland", "Austria"), metric = "all", bbch_names = list("65"="Flowering", "81"="Veraison") ) subspecies_report( pep_alpine, species_name = "Vitis vinifera", subregions = c("Switzerland", "Austria"), metric = c("n_obs","median_doy","completeness") ) subspecies_report( pep_alpine, genus_name = "Vitis", subregions = c("Switzerland", "Austria"), metric = "all" ) }pep <- pep_download() # Use Alpine subset for faster computation pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Grapevine subspecies (wine varieties) - longest historical records if (nrow(pep_alpine[species == "Vitis vinifera"]) > 0) { subspecies_report( pep_alpine, subspecies_name = c("Vitis vinifera Riesling", "Vitis vinifera MUELLER THURGAU WEISS"), subregions = c("Switzerland", "Austria"), metric = "all", bbch_names = list("65"="Flowering", "81"="Veraison") ) subspecies_report( pep_alpine, species_name = "Vitis vinifera", subregions = c("Switzerland", "Austria"), metric = c("n_obs","median_doy","completeness") ) subspecies_report( pep_alpine, genus_name = "Vitis", subregions = c("Switzerland", "Austria"), metric = "all" ) }
Computes data availability metrics for specified genus, species, or subspecies across BBCH phases and regions. Metrics include numbers of observations, station counts, first/last observation years, median DOY, and completeness ratios.
summarize_subspecies_availability( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), years = 1961:2024, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), include_empty = FALSE, metric = c("n_obs", "n_stations", "first_year", "last_year", "median_doy", "completeness") )summarize_subspecies_availability( pep, genus_name = NULL, species_name = NULL, subspecies_name = NULL, phases = c(65, 87), years = 1961:2024, subregions = c("Austria", "Germany-North", "Germany-South", "Switzerland"), include_empty = FALSE, metric = c("n_obs", "n_stations", "first_year", "last_year", "median_doy", "completeness") )
pep |
A PEP725-style data frame containing at least:
|
genus_name |
Character (optional). Genus to filter. |
species_name |
Character (optional). Species to filter. |
subspecies_name |
Character vector (optional). One or more subspecies. |
phases |
Integer vector. BBCH phases to evaluate. |
years |
Numeric vector. Years to include. |
subregions |
Character vector of region/country names. |
include_empty |
Logical. If TRUE, include subspecies x country rows even when no observations are present. |
metric |
Character vector specifying which metrics to compute.
Allowed values:
|
A tibble in long format with one row per subspecies x country.
Matthias Templ
pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Grapevine species (longest historical records) if (nrow(pep_alpine[species == "Vitis vinifera"]) > 0) { summarize_subspecies_availability( pep_alpine, species_name = "Vitis vinifera", subregions = c("Switzerland", "Austria"), metric = "all" ) summarize_subspecies_availability( pep_alpine, subspecies_name = c("Vitis vinifera Riesling", "Vitis vinifera MUELLER THURGAU WEISS"), subregions = c("Switzerland", "Austria"), metric = c("n_obs","median_doy") ) }pep <- pep_download() pep_alpine <- pep[country %in% c("Switzerland", "Austria")] # Grapevine species (longest historical records) if (nrow(pep_alpine[species == "Vitis vinifera"]) > 0) { summarize_subspecies_availability( pep_alpine, species_name = "Vitis vinifera", subregions = c("Switzerland", "Austria"), metric = "all" ) summarize_subspecies_availability( pep_alpine, subspecies_name = c("Vitis vinifera Riesling", "Vitis vinifera MUELLER THURGAU WEISS"), subregions = c("Switzerland", "Austria"), metric = c("n_obs","median_doy") ) }
Provides a detailed phenological summary of the dataset, including breakdowns by species, phase, country, and temporal coverage.
## S3 method for class 'pep' summary(object, by = c("species", "phase", "country", "year"), top = 10, ...)## S3 method for class 'pep' summary(object, by = c("species", "phase", "country", "year"), top = 10, ...)
object |
A |
by |
Character. Summarize by |
top |
Integer. Number of top entries to show (default 10). |
... |
Additional arguments (currently unused). |
A pep_summary object (list) containing summary tables,
printed to console.
Matthias Templ
pep <- pep_download() summary(pep) summary(pep, by = "phase") summary(pep, by = "country")pep <- pep_download() summary(pep) summary(pep, by = "phase") summary(pep, by = "country")
Summary Method for Completeness Assessment
## S3 method for class 'pep_completeness' summary(object, ...)## S3 method for class 'pep_completeness' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary list
Matthias Templ
Summary Method for Outlier Detection Results
## S3 method for class 'pep_outliers' summary(object, ...)## S3 method for class 'pep_outliers' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary data.table
Summary Method for Data Quality Assessment
## S3 method for class 'pep_quality' summary(object, ...)## S3 method for class 'pep_quality' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary list
Matthias Templ
Summary Method for Phenological Anomalies
## S3 method for class 'pheno_anomaly' summary(object, ...)## S3 method for class 'pheno_anomaly' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary list
Matthias Templ
Summary Method for Phenological Normals
## S3 method for class 'pheno_normals' summary(object, ...)## S3 method for class 'pheno_normals' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary list
Matthias Templ
Summary Method for PLS Phenology Results
## S3 method for class 'pheno_pls' summary(object, vip_threshold = 0.8, ...)## S3 method for class 'pheno_pls' summary(object, vip_threshold = 0.8, ...)
object |
A |
vip_threshold |
Numeric. VIP threshold for identifying important periods. Default 0.8. |
... |
Additional arguments (unused) |
Invisibly returns a summary data.frame
Summary Method for Phenological Synchrony Analysis
## S3 method for class 'pheno_synchrony' summary(object, ...)## S3 method for class 'pheno_synchrony' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
Invisibly returns a summary list
Matthias Templ
Summary Method for Second Events Detection
## S3 method for class 'second_events' summary(object, ...)## S3 method for class 'second_events' summary(object, ...)
object |
A |
... |
Additional arguments (unused) |
The summary data.table (invisibly)