Genesys PGR is the global database on plant genetic resources maintained ex situ in national, regional and international genebanks around the world.
genesysr uses the Genesys API to query Genesys data. The API is accessible at https://api.genesys-pgr.org.
Accessing data with genesysr is similar to downloading data in CSV or Excel format and loading it into R.
Accession passport data is retrieved with the get_accessions function. The database is queried by providing a filter (see Filters below):
## Setup: use Genesys Sandbox environment
# genesysr::setup_sandbox() # Use this to connect to our test environment https://sandbox.genesys-pgr.org
# genesysr::setup_production() # This is initialized by default when loading genesysr
# Open a browser: login to Genesys and authorize access
genesysr::user_login()
# Retrieve first 1000 accessions for genus *Musa*
musa <- get_accessions(filters = list(taxonomy = list(genus = c('Musa'))), at.least = 1000)
# Or retrieve all accession data for genus *Musa*
musa <- get_accessions(filters = list(taxonomy = list(genus = c('Musa'))))
# Retrieve all accession data for the Musa International Transit Center, Bioversity International
itc <- get_accessions(list(institute = list(code = c('BEL084'))))
# Retrieve all accession data for the Musa International Transit Center, Bioversity International (BEL084) and the International Center for Tropical Agriculture (COL003)
some <- get_accessions(list(institute = list(code = c('BEL084','COL003'))))
genesysr provides utility functions to create filter objects using Multi-Crop Passport Descriptors (MCPD) definitions:
# Retrieve data by country of origin (MCPD)
get_accessions(mcpd_filter(ORIGCTY = c("DEU", "SVN")))
The data is provided by Genesys as CSV. Where a column can hold multiple values, the values are spread across multiple numbered columns. For example, accession STORAGE may be provided as:
| … | storage1 | storage2 | storage3 |
|---|---|---|---|
| … | 10 | 20 | 30 |
| … | 30 | 40 | NA |
| … | 30 | NA | NA |
| … | 10 | 20 | 30 |
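If a single value per accession is easier to work with, the numbered columns can be collapsed again after loading. The following is a minimal sketch in base R, assuming a mock data frame that mirrors the table above; the semicolon separator and the new `storage` column are illustrative choices, not something genesysr does for you.

```r
# Mock data mirroring the table above; real responses have many more columns.
accessions <- data.frame(
  storage1 = c(10, 30, 30, 10),
  storage2 = c(20, 40, NA, 20),
  storage3 = c(30, NA, NA, 30)
)

# Find the numbered columns that belong to the "storage" descriptor.
storage_cols <- grep("^storage[0-9]+$", names(accessions), value = TRUE)

# Collapse each row into one semicolon-separated string, dropping NA values.
accessions$storage <- apply(
  accessions[, storage_cols, drop = FALSE], 1,
  function(row) paste(row[!is.na(row)], collapse = ";")
)

print(accessions$storage)
```

The same pattern should apply to any other multi-value descriptor, by changing the regular expression that selects the columns.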
The filter object is a named list() where names match a Genesys filter and the value specifies the criteria to match.
The records returned by Genesys match all filters provided (AND operation), while individual filters allow for specifying multiple criteria (OR operation):
# (GENUS == Musa) AND (SPECIES == aa) AND ((ORIGCTY == NGA) OR (ORIGCTY == CIV))
filter <- list(taxonomy = list(genus = c('Musa'), species = c('aa')), countryOfOrigin = list(iso3 = c('NGA', 'CIV')))
# OR
filter <- list()
filter$taxonomy$genus <- c('Musa')
filter$taxonomy$species <- c('aa')
filter$countryOfOrigin$iso3 <- c('NGA', 'CIV')
# View the filter object as JSON
jsonlite::toJSON(filter)
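To make the AND/OR semantics concrete, the sketch below emulates them locally on a small mock data frame. It does not call Genesys; the `records` data, its column names, and the flat `criteria` list are assumptions made for illustration only, and the server-side matching may differ in detail.

```r
# Mock passport records; column names here are illustrative only.
records <- data.frame(
  genus   = c("Musa", "Musa", "Musa", "Ensete"),
  origcty = c("NGA", "CIV", "IND", "NGA"),
  stringsAsFactors = FALSE
)

# A flat stand-in for a filter: one entry per criterion with its allowed values.
criteria <- list(genus = c("Musa"), origcty = c("NGA", "CIV"))

# OR within a criterion: %in% accepts any of the listed values.
per_criterion <- lapply(names(criteria), function(name) {
  records[[name]] %in% criteria[[name]]
})

# AND across criteria: a record must pass every criterion.
keep <- Reduce(`&`, per_criterion)

print(records[keep, ])
```

Only the two Musa records from NGA and CIV remain: the Musa record from IND fails the country criterion, and the Ensete record fails the genus criterion.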
There are a number of filtering options to retrieve data from Genesys. The best way to explore how filtering works is to use the website at https://www.genesys-pgr.org/a/overview, inspect the HTTP requests your browser sends to the API server, and replicate them here.
taxonomy$genus filters by a list of genera.
taxonomy$species filters by a list of species.
countryOfOrigin$iso3 filters by ISO3 code of the country of origin of PGR material.
# Material originating from Germany (DEU) or France (FRA)
filters <- list(countryOfOrigin = list(iso3 = c('DEU', 'FRA')))
geo.latitude and geo.longitude filter by latitude/longitude (in decimal format) of the collecting site.
institute$code filters by a list of FAO WIEWS institute codes of the holding institutes.
institute$country$iso3 filters by a list of ISO3 country codes of the country of the holding institute.
The Genesys API returns a lot of variables for accession passport data. To reduce the amount of data to be processed and kept in memory, select the columns of interest with the fields vector:
# Fetch only accession id, storage and taxonomic data for *Musa*
musa <- genesysr::get_accessions(list(taxonomy = list(genus = c('Musa'))), fields = c("taxonomy", "storage", "id"))
To list the variable names returned by the Genesys APIs, test the response and select columns of interest:
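As a rough sketch of that inspection step, the example below uses a mock data frame in place of a real get_accessions() result, so it runs without a Genesys login. The column names and values are assumptions for illustration and are not the actual Genesys schema; with real data, the same calls would be applied to the returned data frame.

```r
# Mock result standing in for the data frame returned by get_accessions();
# the column names and values below are made up for illustration.
accessions <- data.frame(
  id = c(1, 2),
  accessionNumber = c("ACC-1", "ACC-2"),
  taxonomy.genus = c("Musa", "Musa"),
  taxonomy.species = c("acuminata", "balbisiana"),
  stringsAsFactors = FALSE
)

# List every variable name in the response.
print(names(accessions))

# Keep the id plus all columns whose name starts with "taxonomy".
wanted <- c("id", grep("^taxonomy", names(accessions), value = TRUE))
subset_df <- accessions[, wanted, drop = FALSE]

print(names(subset_df))
```

Once the interesting names are known from a small test request, they can be passed to the fields vector so that later requests return less data.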
Let’s take a look at the whole process of fetching accession passport data from Genesys.