Sample modifiers in pepr: append

Michal Stolarczyk

2023-11-21

Learn the append sample modifier in pepr

This vignette will show you how and why to use the append functionality of the pepr package.

Problem/Goal

The example below shows how to use constant attributes to define the values in the read_type column of the sample_table.csv file. This is especially useful when many samples share an identical value for some attribute (here: the value SINGLE in the read_type attribute). Consider the following sample table for reference:

sample_name  organism  time  read_type
pig_0h       pig       0     SINGLE
pig_1h       pig       1     SINGLE
frog_0h      frog      0     SINGLE
frog_1h      frog      1     SINGLE

Solution

As the name suggests, constant attributes (here: read_type) take the same value for every sample, so they can be declared once instead of being repeated in each row. This is done explicitly in the project_config.yaml file (presented below): under sample_modifiers.append, each key is the name of the column to add and its value is the constant assigned to all samples. More than one constant attribute can be defined.

   pep_version: 2.0.0
   sample_table: sample_table.csv
   sample_modifiers:
      append:
          read_type: SINGLE
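
Since more than one constant attribute can be defined, a config with two appended attributes might look like the sketch below. The second attribute, protocol, and its value are hypothetical; they are not part of the example project and are shown only to illustrate the layout.

```yaml
pep_version: 2.0.0
sample_table: sample_table.csv
sample_modifiers:
  append:
    read_type: SINGLE
    # hypothetical second constant attribute, for illustration only
    protocol: RNA-seq
```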

Let’s modify the original sample_table.csv file so that it relies on the sample_modifiers.append section of the config. Simply leave out the attributes that are set as constant and let pepr do the work for you.

sample_name  organism  time
pig_0h       pig       0
pig_1h       pig       1
frog_0h      frog      0
frog_1h      frog      1

Code

Read in the project metadata by specifying the path to the project_config.yaml:

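library(pepr)
# Example PEPs bundled with pepr live in "extdata/example_peps-<branch>"
branch = "master"
projectConfig = system.file(
  "extdata",
  paste0("example_peps-", branch),
  "example_append",
  "project_config.yaml",
  package = "pepr"
)
# Build the Project object from the config file
p = Project(projectConfig)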
#> Loading config file: /tmp/RtmpoymTo9/Rinstb3055bff7/pepr/extdata/example_peps-master/example_append/project_config.yaml

And inspect it:

sampleTable(p)
#>    sample_name organism time read_type
#> 1:      pig_0h      pig    0    SINGLE
#> 2:      pig_1h      pig    1    SINGLE
#> 3:     frog_0h     frog    0    SINGLE
#> 4:     frog_1h     frog    1    SINGLE

As you can see, the resulting samples are annotated exactly as if they had been read from the original sample table, in which the values in the last column (read_type) were filled in by hand.
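
To make the effect of the modifier concrete, here is a minimal base R sketch of the same idea: every constant is added as a new column with one value repeated for all samples. This is not pepr's actual implementation. The helper append_constants is a hypothetical name introduced only for illustration, and the data frame is typed in by hand rather than read from sample_table.csv.

```r
# Sample table without the constant column (same rows as the reduced table above)
samples <- data.frame(
  sample_name = c("pig_0h", "pig_1h", "frog_0h", "frog_1h"),
  organism    = c("pig", "pig", "frog", "frog"),
  time        = c(0L, 1L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Constants to append, mirroring the sample_modifiers.append section
constants <- list(read_type = "SINGLE")

# Hypothetical helper (not part of pepr): add each constant as a new column
append_constants <- function(tbl, constants) {
  for (attr_name in names(constants)) {
    # Repeat the constant once per sample (row)
    tbl[[attr_name]] <- rep(constants[[attr_name]], nrow(tbl))
  }
  tbl
}

annotated <- append_constants(samples, constants)
print(annotated)
```

Because the constants are held in a named list, further entries would be added as further columns in the same way.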

In addition, the p object contains all the information from the project config file (project_config.yaml). Run the following line to explore it:

config(p)
#> Config object. Class: Config
#>  pep_version: 2.0.0
#>  sample_table: 
#> /tmp/RtmpoymTo9/Rinstb3055bff7/pepr/extdata/example_peps-master/example_append/sample_table.csv
#>  sample_modifiers:
#>     append:
#>         read_type: SINGLE
#>  name: example_append