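This tutorial shows how to perform a genome-wide association study (GWAS) with SLOPE. The analysis consists of three steps: reading and screening the data, clumping, and running SLOPE.
You need to provide paths to three files: a .fam file with the phenotype, a .map file with SNP mapping information, and a .raw file with the SNP data: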
library(geneSLOPE)
famFile <- system.file("extdata", "plinkPhenotypeExample.fam", package = "geneSLOPE")
mapFile <- system.file("extdata", "plinkMapExample.map", package = "geneSLOPE")
snpsFile <- system.file("extdata", "plinkDataExample.raw", package = "geneSLOPE")
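The screening call further down expects a `phenotype` object. A minimal sketch of loading it, assuming geneSLOPE's `read_phenotype()` reader with its default settings is appropriate for this .fam file (the `requireNamespace()` guard is only there so the snippet is safe to run where the package is not installed):

```r
# Sketch: load the phenotype from the example .fam file shipped with geneSLOPE.
# Assumption: read_phenotype() with default arguments suits this file.
if (requireNamespace("geneSLOPE", quietly = TRUE)) {
  famFile <- system.file("extdata", "plinkPhenotypeExample.fam",
                         package = "geneSLOPE")
  phenotype <- geneSLOPE::read_phenotype(filename = famFile)
}
```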
Once you have the phenotype, you can move on to reading the SNP data. Depending on the size of the data, reading the SNPs may take a long time. Because the data are typically very large, SNPs are filtered by their marginal test p-value: all SNPs whose p-values are larger than the threshold \(pValMax\) are discarded. For details on how to choose \(pValMax\), see "How changing parameters affects my analysis?"
screening.result <- screen_snps(snpsFile, mapFile, phenotype, pValMax = 0.05,
                                chunkSize = 1e2, verbose = FALSE)
Setting verbose = FALSE suppresses the progress bar; the default value is TRUE.
You can inspect the result of reading and screening the dataset:
## Object of class screeningResult
## $X: data matrix
## 90 observations
## 52 snps
## 1000 SNPs were screened
## 52 snps had p-value smaller than 0.05 in marginal test
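To make the screening step concrete, here is a base-R sketch on simulated data. It assumes the marginal test is a simple linear regression of the phenotype on each SNP; it is an illustration of the idea, not geneSLOPE's own code, and all names in it are made up for the example.

```r
# Base-R sketch of marginal screening on simulated data (not geneSLOPE's code).
set.seed(1)
n <- 90
p <- 200
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n)  # genotypes coded 0/1/2
y <- 2 * X[, 1] + rnorm(n)                                  # SNP 1 is truly associated

# p-value of the slope in a simple linear regression of y on one SNP
marginal_p <- function(snp, y) {
  if (var(snp) == 0) return(1)  # a monomorphic SNP carries no information
  summary(lm(y ~ snp))$coefficients[2, 4]
}

p_values <- apply(X, 2, marginal_p, y = y)
pValMax <- 0.05
keep <- which(p_values < pValMax)      # SNPs that pass the screen
X_screened <- X[, keep, drop = FALSE]  # reduced matrix passed to later steps
cat(length(keep), "of", p, "SNPs kept\n")
```

A smaller `pValMax` leaves fewer columns in `X_screened`, which speeds up the later steps at the risk of discarding weak signals.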
Once the data have been read successfully, you can move on to the second step of the analysis, clumping.
The last step of the analysis is running SLOPE.
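The second step groups correlated SNPs into clumps, each with one representative. The base-R sketch below illustrates one greedy way of doing this, under the assumption that the most significant remaining SNP becomes the representative and absorbs every remaining SNP whose absolute Pearson correlation with it is at least `rho`. The function `greedy_clump` and the toy data are hypothetical; this is not the package's implementation.

```r
# Base-R sketch of greedy clumping (an illustration, not geneSLOPE's implementation).
greedy_clump <- function(X, p_values, rho) {
  remaining <- order(p_values)  # SNP indices, most significant first
  clumps <- list()
  while (length(remaining) > 0) {
    rep_idx <- remaining[1]
    # absolute correlation of the representative with every remaining SNP
    r <- abs(cor(X[, rep_idx], X[, remaining, drop = FALSE]))
    # which() drops NA correlations; union() guarantees the representative
    # is always a member, so the loop makes progress
    members <- union(rep_idx, remaining[which(as.vector(r) >= rho)])
    clumps[[length(clumps) + 1]] <- list(representative = rep_idx,
                                         members = members)
    remaining <- setdiff(remaining, members)
  }
  clumps
}

set.seed(2)
n <- 90
base <- rnorm(n)
# columns 1 and 2 are nearly identical; columns 3 and 4 are independent noise
X <- cbind(base, base + rnorm(n, sd = 0.1), rnorm(n), rnorm(n))
p_values <- c(0.001, 0.01, 0.02, 0.04)  # made-up p-values for illustration
clumps <- greedy_clump(X, p_values, rho = 0.5)
str(clumps[[1]])
```

Raising `rho` makes clumps smaller and more numerous, because fewer SNPs are correlated strongly enough to be merged with a representative.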
## Warning in select_snps(clumping.result, fdr = 0.1): All lambdas are equal. SLOPE does not guarantee
## False Discovery Rate control
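The warning refers to the SLOPE penalty sequence (the lambdas). As a rough illustration of what such a sequence looks like, here is a base-R sketch of the Benjamini-Hochberg-type sequence and its Gaussian-design correction truncated at its minimum, as described in the SLOPE literature. It is an assumption that geneSLOPE builds its sequence in a comparable way; the function names are hypothetical, and the sketch does not attempt to reproduce the warning above.

```r
# Sketch of SLOPE penalty sequences (hypothetical helper names).

# BH-type sequence: lambda_i = qnorm(1 - i * fdr / (2 * p)), i = 1..p
lambda_bh <- function(p, fdr) qnorm(1 - (1:p) * fdr / (2 * p))

# Gaussian-design correction, then held constant from its minimum onwards
lambda_gaussian_truncated <- function(n, p, fdr) {
  bh <- lambda_bh(p, fdr)
  lam <- numeric(p)
  lam[1] <- bh[1]
  sum_sq <- 0
  if (p >= 2) {
    for (i in 2:p) {
      sum_sq <- sum_sq + lam[i - 1]^2
      lam[i] <- bh[i] * sqrt(1 + sum_sq / max(1, n - i))
    }
  }
  k <- which.min(lam)   # position where the corrected sequence bottoms out
  lam[k:p] <- lam[k]    # keep the sequence from increasing after that point
  lam
}

lam <- lambda_gaussian_truncated(n = 90, p = 41, fdr = 0.1)
all_equal <- length(unique(lam)) == 1  # TRUE would mean a constant penalty
print(round(head(lam), 3))
```

When every entry of the sequence is the same, the sorted penalty no longer distinguishes between ranks, which is the situation the warning describes.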
As before, you can plot and summarize the results:
## Object of class selectionResult
## 2 snps selected out of 41 clump representatives
## Effect size for selected snps (absolute values)
## Min: 3.640963
## Mean: 3.768299
## Max: 3.895635
## R square of the final model: 0.9756304
## Kink value: 1
As with the result of clumping, you can interactively identify the number of the clump that contains a specific SNP selected by SLOPE. First plot the whole genome, then run the clump identification function and click on the SNP of interest.
Once the clump is identified, you can zoom into it.
It is easy to get information about the selected SNPs. To get the indices of the columns in the original SNP matrix that they refer to, use:
## rs2719295_T rs17546815_T
## 222 573
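The names in this output carry a trailing allele code (`_T`), which is how PLINK's additive .raw export labels genotype columns. A small sketch, using only the values shown above, of recovering the plain rs identifiers so they can be matched against a .map file; the variable names are illustrative:

```r
# Named index vector copied from the output above
selected <- c(rs2719295_T = 222L, rs17546815_T = 573L)

# Drop everything from the last underscore onwards to get the rs identifier
rs_ids <- sub("_[^_]*$", "", names(selected))

lookup <- data.frame(column = unname(selected), rs = rs_ids)
print(lookup)
```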
If a .map file was given, you can get more information about the SNPs:
##     chromosome         rs genetic_distance_(morgans)
## 222          8  rs2719295                    34.2919
## 573         11 rs17546815                   113.7420
##     base_pair_position_(bp_units)
## 222                      34291873
## 573                     113741770
For information about the SNPs that belong to a specific clump, use:
## Summary of 1 selected clump
##     chromosome         rs genetic_distance_(morgans)
## 222          8  rs2719295                    34.2919
## 377          2 rs11124642                    38.6580
## 598          2  rs4672803                   217.2340
## 906          6   rs325120                   147.8600
##     base_pair_position_(bp_units)
## 222                      34291873
## 377                      38657967
## 598                     217233745
## 906                     147860161
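Three numerical parameters influence the result: \(pValMax\) (the screening threshold), \(rho\) (the correlation threshold used in clumping), and fdr (the target false discovery rate for SLOPE).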
Input: \(rho \in (0, 1)\);