phylotypr is a package for classification-based analysis of DNA sequences. It primarily implements the naive Bayesian classifier from the Ribosomal Database Project (RDP). Although you can classify any type of sequence (assuming you have a suitable reference database), the algorithm is mainly used to classify 16S rRNA gene sequences.
You can install the development version of phylotypr from GitHub with:
# install.packages("devtools")
devtools::install_github("mothur/phylotypr")
You can also install the official release version from CRAN:
install.packages("phylotypr")
See the Getting Started article for an example of how to build the database and classify individual and multiple sequences.
The {phylotypr} package ships with version 9 of the RDP's training data. This set is small and old (2010) compared with the RDP's latest versions. You are encouraged to install newer versions of the RDP, greengenes, and SILVA databases from the {phylotyprrefdata} package on GitHub. Note that the package takes about 20 minutes to install. If it sits at “moving datasets to lazyload DB” for a long time, this is normal :)
devtools::install_github("mothur/phylotyprrefdata")
library(phylotyprrefdata)
The following will list the references that are available in {phylotyprrefdata}:
data(package = "phylotyprrefdata")
You can learn more about the underlying algorithm in the paper that originally described it, published in Applied and Environmental Microbiology. If you want to learn more about how this package was created, check out the mothur YouTube channel, where a playlist shows every step.
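For readers who want a feel for the algorithm before reading the paper, here is a minimal base R sketch of an RDP-style naive Bayesian classifier built on k-mers. It is not phylotypr's implementation: the function names (`get_kmers`, `classify_toy`), the genus labels, and the sequences are all made up for illustration, the word size is 3 rather than the 8 typically used for 16S data, and the bootstrap step that produces confidence scores is left out.

```r
# Toy sketch of an RDP-style naive Bayesian classifier (not phylotypr's code).
# All names and sequences below are hypothetical.

# Return the unique overlapping k-mers ("words") found in a sequence.
get_kmers <- function(sequence, k) {
  starts <- seq_len(nchar(sequence) - k + 1)
  unique(substring(sequence, starts, starts + k - 1))
}

# Hypothetical training set: two genera with two sequences each.
training <- data.frame(
  genus = c("GenusA", "GenusA", "GenusB", "GenusB"),
  sequence = c("ACGTACGTAC", "ACGTACGTTC", "TTGGCCAATT", "TTGGCCAAGT"),
  stringsAsFactors = FALSE
)

classify_toy <- function(unknown, training, k = 3) {
  train_kmers <- lapply(training$sequence, get_kmers, k = k)
  query_kmers <- get_kmers(unknown, k)
  n_total <- nrow(training)

  # Word-specific prior: (n(w) + 0.5) / (N + 1), where n(w) is the number of
  # training sequences that contain word w and N is the number of sequences.
  n_w <- vapply(query_kmers, function(w) {
    sum(vapply(train_kmers, function(km) w %in% km, logical(1)))
  }, numeric(1))
  prior <- (n_w + 0.5) / (n_total + 1)

  genera <- unique(training$genus)
  scores <- vapply(genera, function(g) {
    in_genus <- training$genus == g
    m_total <- sum(in_genus)
    m_w <- vapply(query_kmers, function(w) {
      sum(vapply(train_kmers[in_genus], function(km) w %in% km, logical(1)))
    }, numeric(1))
    # Genus-specific conditional probability: (m(w) + prior) / (M + 1),
    # summed on the log scale across all words in the query.
    sum(log((m_w + prior) / (m_total + 1)))
  }, numeric(1))

  # Report the genus with the highest log-probability score.
  names(scores)[which.max(scores)]
}

classify_toy("ACGTACGTAA", training)
```

The query shares most of its 3-mers with the two GenusA sequences and none with GenusB, so the call returns "GenusA". For real data, use the package's own functions as described in the Getting Started article.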