Tutorials & Examples

Author

Ceres Barros

Published

July 26, 2023

Reproducible workflows in R

An example using a (too) simple species distribution model

I have been asked many times why I picked species distribution models (SDMs) to teach reproducible workflows in R (for ecologists). Simple, climate-based SDMs may not be great for actually forecasting species distributions, because they leave out several other processes (e.g. dispersal limitation and biotic interactions), but they can be great for hypothesis testing (see Lee-Yaw et al. 2021 for a quick overview of SDM strengths and weaknesses) and as educational tools. They are easy to understand and relatively fast to run on today's computers, several R packages make fitting and validating them easier, the data they need are often easily obtainable at large scales, and they share many features and requirements with other ecological workflows:

Need different sources of data that should be as FAIR (Wilkinson et al. 2016) as possible;

Often require some degree of data preparation (e.g. GIS operations) to put the data in the right format;

Rely on statistical steps like model fitting, model validation and running model prediction on the same or new data (be careful with climate-based SDMs and extrapolation, though…!);

Produce visual outputs (e.g. maps).
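To make these four features concrete, here is a minimal sketch in base R. It is not the linked example: the data are simulated, the single `temperature` predictor, the coefficients of the "true" response and all object names are assumptions made for illustration, and a simple logistic regression stands in for whatever SDM algorithm a real workflow would use.

```r
# Minimal sketch of the four workflow features, using simulated data and base R only.
set.seed(42)  # fixed seed so the sketch gives the same result on every run

# 1. Data: simulated presence/absence records and one climate variable
#    (a real workflow would obtain these from documented, FAIR sources)
n <- 500
temperature <- runif(n, min = 0, max = 30)
prob_presence <- plogis(-4 + 0.3 * temperature)  # hypothetical "true" response
presence <- rbinom(n, size = 1, prob = prob_presence)
sdm_data <- data.frame(presence = presence, temperature = temperature)

# 2. Data preparation: split the records into training and testing subsets
train_rows <- sample(seq_len(n), size = 0.7 * n)
train_data <- sdm_data[train_rows, ]
test_data <- sdm_data[-train_rows, ]

# 3. Model fitting, validation and prediction
sdm_fit <- glm(presence ~ temperature, family = binomial, data = train_data)
test_pred <- predict(sdm_fit, newdata = test_data, type = "response")
# crude validation: share of test records classified correctly at a 0.5 threshold
accuracy <- mean((test_pred > 0.5) == (test_data$presence == 1))

# 4. Visual output: fitted response curve within the sampled temperature range
new_temps <- data.frame(temperature = seq(0, 30, length.out = 100))
new_temps$suitability <- predict(sdm_fit, newdata = new_temps, type = "response")
plot(suitability ~ temperature, data = new_temps, type = "l",
     xlab = "Temperature", ylab = "Predicted probability of presence")
```

The prediction grid deliberately stays within the 0 to 30 range used for fitting, in line with the caution about extrapolation above. In a real SDM the final plot would typically be a map, and validation would rely on more than a single accuracy value.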

My hope is that this example can serve as a starting point for you to adapt your current workflows (regardless of what they are used for) and ground them in the R^3T principles.

Click here to open the example in a new page.

References

Lee-Yaw, Julie A., Jenny L. McCune, Samuel Pironon, and Seema N. Sheth. 2021. “Species Distribution Models Rarely Predict the Biology of Real Populations.” Ecography, December, ecog.05877. https://doi.org/10.1111/ecog.05877.
Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 160018. https://doi.org/10.1038/sdata.2016.18.