Reproducible projects in R emphasize streamlined project setup and efficient workflows. There are a number of very helpful tools in the R universe, such as renv
, usethis
, or here
, that range from setting up a stable R environment to generating custom project structures to getting the necessary paths in an easy way. In addition, there are a number of project structure packages and templates for creating easy-to-use and transparent project structures. Namely, tinProjects
, prodigenr
or workflowr
are R packages designed to facilitate reproducible research through automated project structuring and standardization. They all promote organized project directories, emphasize reproducibility by integrating with tools like Git
and renv
, and reduce manual setup efforts to ensure consistent and error-free project initialization. These packages help build a solid foundation for research and ensure that best practices include using separate scripts for data processing, analysis, and reporting, and combining code with narrative in R Markdown documents from the start. This organized setup improves reproducibility by making it easier to maintain, share, and replicate research. For a more comprehensive overview, have a look at the CRAN Task View Reproducible Research.
In the context of link2GI
, which relies heavily on third-party command-line APIs and requires complex and stable folder and file structures, a flexible, lightweight R project setup greatly improves the integration of OS command-line tools into spatial workflows by:
GDAL
or the sophisticated Orfeo Toolbox
(OTB) and the growing universe of r(-)spatial packages for advanced geospatial processing.initProj
provides a complete and flexible working environment for GI projects. The focus is on a simple, efficient and reproducible project management and data handling. The basic framework is formed by a defined folder structure, initial scripts and configuration templates as well as optional Git repositories and an renv
environment. A corresponding RStudio project file is also created. It supports the automatic installation (if needed) and loading of the required libraries including various standard setup skeletons to simplify project initialisation.
The function creates a skeleton of the skeleton scripts main-control.R
, pre-processing.R
, 10-processing.R
and post-processing.R
, and creates corresponding parameter configurations files stored as yaml
files in scr/configs/
. The script src/functions/000_settings.R holds all specific project settings. Easy access to all project paths is provided via the list variable dirs
.
For this reason, the link2GI
package includes a lean and lightweight but focused approach that integrates git
, renv
, and a highly flexible folder and package setup process that is simpler than existing approaches, increasing efficiency, accuracy, and performance in geospatial workflows.
When using RStudio, a new project can be created by simply selecting the Create Project Structure (link2GI) template from the ***File -> New Project -> New Directory -> New Project Wizard *** dialogue.
The basic setup of a default project, which initializes Git and renv, is done with the following call.
root_folder = tempdir() # Mandatory, variable must be in the R environment.
dirs = initProj(root_folder = root_folder, standard_setup = "baseSpatial")
It is easy to customize the folder structure. By default you will create
link2GI::setup_default()$baseSpatial$dataFolder
[1] "level0" "level1" "level2" "run" "rawdata"
link2GI::setup_default()$baseSpatial$code_subfolder
[1] "src" "src/functions" "src/configs"
Use the folders
argument to create a specific structure or subfolder structure of your project.
root_folder = tempdir() # Mandatory, variable must be in the R environment.
dirs = initProj(root_folder = root_folder,
standard_setup = "baseSpatial",
folders = c("data/rawdata/provider1/", "docs/quarto/")
)
A more complex call that integrates the git
and renv
setup, adds some additional folders and libraries as well as a location tag will be:
dirs = initProj(root_folder = tempdir(),
folders = c("data/newdata/"),
init_git = TRUE,
init_renv = TRUE,
code_subfolder = c("src", "src/functions","src/deprec"),
standard_setup = "baseSpatial",
loc_name = "oldplace",
appendlibs = c("raster"),
openproject=TRUE)