The universal data structure we’re going to use is:
- `abundance` (required)
- `covariates` (optional)
- `metadata` (required)

If both `abundance` and `covariates` are present in the list, then the two data.frames must have the same number of rows.
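As a rough illustration of this layout, the following base R sketch assembles such a list by hand. The taxa, covariate values, and citation string are hypothetical placeholders, and the sketch does not set the `matssdata` class or claim to satisfy every check the package performs.

```r
# Hypothetical counts for two made-up taxa over four sampling times
abundance <- data.frame(
  taxon_a = c(3, 5, 2, 8),
  taxon_b = c(0, 1, 4, 2)
)

# Optional covariates, one row per row of `abundance`
covariates <- data.frame(
  year = 2001:2004,
  temperature = c(11.2, 12.5, 10.9, 13.1)
)

# Required metadata; the values here are placeholders for illustration
metadata <- list(
  is_community = TRUE,
  citation = "Placeholder citation for illustration only",
  timename = "year",
  period = 1
)

# Combine the three elements into a single list
dataset <- list(
  abundance = abundance,
  covariates = covariates,
  metadata = metadata
)

# The two data.frames must have the same number of rows
stopifnot(nrow(dataset$abundance) == nrow(dataset$covariates))
```

Keeping the three pieces in one list means a single object can be passed around, with the row-count check guarding the alignment between samples and covariates.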
In the `abundance` data.frame:
Here, the common usage is for each column to be a species or taxon, and each row to be an observed sample. In other words, each column is a time series, with the rows sorted such that time advances down (higher row indices correspond to later times).
In the `covariates` data.frame:

The number of rows should match that of `abundance`, and rows of `covariates` should line up with `abundance` (either sampled simultaneously or concurrently). Common covariates are date and time, temperature, treatments, etc.
In the `metadata` list:

- There is an `is_community` entry, which indicates whether the time series in `abundance` can be treated as components of a community with interactions and/or shared drivers in some way.
- There is a `citation` entry that is a vector of text values for the reference to the dataset. There can be multiple values (e.g. in the case of a specific dataset pulled from a larger database).
- If there is a `location` entry, it must contain at least a `latitude` and `longitude` value (in decimal form). `location` itself can be a data.frame or a named vector.
- If there is a `timename` entry, it refers to a column in the `covariates` data.frame that gives a time index for the data.
  - Ideally, applying `tidyr::full_seq` to this column, along with a `period` entry (using 1 if missing), will produce the appropriate equi-timed spacing.
- If there is a `period` entry, it must be compatible with `tidyr::full_seq` and the `timename` variable described above.
- If there is a `species_table` entry, it must have an `id` column that includes all the column names in `abundance`. This is intended to provide more information about the different variables in `abundance`.

Here is an example of a correctly formatted dataset with covariates and metadata:
```r
library(MATSS)
data(dragons)
str(dragons)
#> List of 3
#>  $ abundance :Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of 3 variables:
#>   ..$ Red Spotted Dragon    : num [1:6] 2 6 0 5 4 4
#>   ..$ Green Striped Dragon  : num [1:6] 6 0 4 1 9 7
#>   ..$ Blue Eyes White Dragon: num [1:6] 0 0 0 1 0 0
#>  $ covariates:'data.frame': 6 obs. of 3 variables:
#>   ..$ date         : Date[1:6], format: "2014-06-28" "2015-06-28" ...
#>   ..$ precipitation: int [1:6] 7 6 14 18 9 5
#>   ..$ effort       : num [1:6] 3 3 2 4 1 9
#>  $ metadata :List of 7
#>   ..$ timename     : chr "date"
#>   ..$ effort       : chr "effort"
#>   ..$ period       : num 365
#>   ..$ authors      :List of 2
#>   .. ..$ :Class 'person' hidden list of 1
#>   .. .. ..$ :List of 5
#>   .. .. .. ..$ given  : chr "Ellen"
#>   .. .. .. ..$ family : chr "Bledsoe"
#>   .. .. .. ..$ role   : chr "aut"
#>   .. .. .. ..$ email  : NULL
#>   .. .. .. ..$ comment: Named chr "0000-0002-3629-7235"
#>   .. .. .. .. ..- attr(*, "names")= chr "ORCID"
#>   .. ..$ :Class 'person' hidden list of 1
#>   .. .. ..$ :List of 5
#>   .. .. .. ..$ given  : chr "Hao"
#>   .. .. .. ..$ family : chr "Ye"
#>   .. .. .. ..$ role   : chr "aut"
#>   .. .. .. ..$ email  : chr "hao.ye@weecology.org"
#>   .. .. .. ..$ comment: Named chr "0000-0002-8630-1458"
#>   .. .. .. .. ..- attr(*, "names")= chr "ORCID"
#>   .. ..- attr(*, "class")= chr "person"
#>   ..$ species_table:'data.frame': 4 obs. of 2 variables:
#>   .. ..$ id  : Factor w/ 4 levels "Blue Eyes White Dragon",..: 4 3 1 2
#>   .. ..$ game: Factor w/ 2 levels "pokemon","yugioh": NA NA 2 1
#>   ..$ citation     : chr "Hao Ye, Ellen K. Bledsoe, Renata Diaz, S. K. Morgan Ernest, Juniper L. Simonis, Ethan P. White, & Glenda M. Yen"| __truncated__
#>   ..$ is_community : logi TRUE
#>  - attr(*, "class")= chr "matssdata"
```
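The `timename` and `period` entries in the metadata are meant to work together with `tidyr::full_seq`. The sketch below shows that interaction on its own, using hypothetical sampling dates (not the values in `dragons`) with one yearly sample missing. The variable names are illustrative assumptions.

```r
library(tidyr)

# Hypothetical yearly sampling dates with the third year missing
sample_dates <- as.Date("2014-06-28") + c(0, 365, 1095)

# Fill in the regular sequence implied by a 365-day period
full_dates <- full_seq(sample_dates, period = 365)

# Dates in the regular sequence that have no observation
missing_dates <- full_dates[!full_dates %in% sample_dates]
```

Here `full_dates` has four entries and `missing_dates` has one. `full_seq` checks that the supplied values actually fit the given period, which is why `period` has to be compatible with the time column.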
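We can view the abundance and covariates tables side by side:

| Red Spotted Dragon | Green Striped Dragon | Blue Eyes White Dragon | date | precipitation | effort |
|---|---|---|---|---|---|
| 2 | 6 | 0 | 2014-06-28 | 7 | 3 |
| 6 | 0 | 0 | 2015-06-28 | 6 | 3 |
| 0 | 4 | 0 | ... | 14 | 2 |
| 5 | 1 | 1 | ... | 18 | 4 |
| 4 | 9 | 0 | ... | 9 | 1 |
| 4 | 7 | 0 | ... | 5 | 9 |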
We also provide a function for checking whether the data is formatted correctly:
```r
check_data_format(dragons)
#> [1] TRUE
```
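To make the rules above concrete, here is a hand-rolled base R sketch of the kind of checks they imply. `sketch_check_format` is a hypothetical helper written for illustration; it is not the MATSS implementation of `check_data_format`, and it covers only the row-count, `timename`, `location`, and `species_table` rules described in this section.

```r
# Hypothetical helper, NOT the MATSS implementation of check_data_format()
sketch_check_format <- function(x) {
  # abundance and metadata are required
  if (!is.list(x) || !is.data.frame(x$abundance) || !is.list(x$metadata)) {
    return(FALSE)
  }
  md <- x$metadata

  # covariates are optional, but must match abundance row for row if present
  if (!is.null(x$covariates)) {
    if (!is.data.frame(x$covariates) ||
        nrow(x$covariates) != nrow(x$abundance)) {
      return(FALSE)
    }
  }

  # timename, if given, must name a column of covariates
  if (!is.null(md$timename) && !(md$timename %in% names(x$covariates))) {
    return(FALSE)
  }

  # location, if given, needs latitude and longitude
  if (!is.null(md$location) &&
      !all(c("latitude", "longitude") %in% names(md$location))) {
    return(FALSE)
  }

  # species_table, if given, must list every abundance column in `id`
  if (!is.null(md$species_table) &&
      !all(names(x$abundance) %in% md$species_table$id)) {
    return(FALSE)
  }

  TRUE
}

# Made-up dataset that follows the rules
good <- list(
  abundance = data.frame(a = 1:3, b = 4:6),
  covariates = data.frame(t = 1:3),
  metadata = list(is_community = TRUE, citation = "placeholder", timename = "t")
)

# Same dataset, but with one covariate row too few
bad <- good
bad$covariates <- data.frame(t = 1:2)

good_ok <- sketch_check_format(good)
bad_ok <- sketch_check_format(bad)
```

In this sketch `good_ok` is `TRUE` and `bad_ok` is `FALSE`, because the second dataset breaks the rule that `covariates` and `abundance` have the same number of rows.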