Data Quality Assessment
Introduction | The Data Quality Challenge | What You'll Learn | Prerequisites | Setup | Part 1: Outlier Detection | Why Detect Outliers? | Statistical Methods for Outlier Detection | Method 1: 30-Day Rule (Simple Threshold) | Understanding the Output | Method 2: MAD (Median Absolute Deviation) - Recommended | Method 3: IQR (Interquartile Range) - Also Robust | Method 4: Z-Score - Sensitive but Less Robust | Method 5: GAM Residual - Model-based, Covariate-aware | Method 6: Mahalanobis - Multivariate, Robust (MCD) | Choosing a Method | Summary Statistics | Comparing Outlier Rates Across Species | Visualizing Outliers | 1. Overview Plot | 2. Seasonal Distribution | 3. Detailed Context | 4. Geographic Distribution | 5. Model Diagnostic (for gam_residual / mahalanobis) | 6. Phase-profile Plot (primary Mahalanobis figure) | Part 2: Data Completeness | Why Check Completeness? | Visualizing Completeness Issues | Assessing Completeness | Understanding Completeness Metrics | Filtering by Completeness | Completeness Thresholds for Different Analyses | Visualizing Completeness | Part 3: Phase Presence Validation | Why Check Phase Presence? | Checking Phase Presence | Checking Multiple Species | Common Phases to Check | Part 4: Integrated Quality Workflow | Putting It All Together | Documentation Template | Best Practices Summary | For Outlier Detection | For Abnormal Event Detection | For Completeness Assessment | For Phase Presence Checking | Summary | Key Take-Home Messages | Next Steps | Session Info
pep725 1.1.0
Matthias Templ (University of Applied Sciences Northwestern Switzerland, FHNW), Barbara Templ (Swiss Federal Institute for Forest, Snow and Landscape Research, WSL)
data-quality.Rmd