Details
Presenter(s)
Display Name
Micah Thornton
Affiliation
University of Texas Southwestern
Country
United States
Abstract
This paper describes three approaches for filtering genomic power spectra (PS) for use in correlation analysis, data reduction, genomic fingerprinting, and sorting: Minimal Variance Filtering, Automatic Filter Learning, and Maximal Variance Principal Components Filtering. We provide a case study on 1,397 SARS-CoV-2 genomes and show that filtered sets of coefficients produce distances correlated with those from the unfiltered sets. We also show that specific information, such as the region of sequence submission, may be captured by filtered power spectral coefficients, by attempting to classify the region of submission from sets of filtered power spectra with random forest classifiers.
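The abstract does not define the filters, so the following is only a rough sketch of the general idea of variance-based filtering of power spectral coefficients, not the paper's method. It assumes a common construction in which each genome is split into four nucleotide indicator sequences whose squared DFT magnitudes are summed, and it assumes that a filter simply keeps the k coefficients with the lowest (or highest) variance across genomes. The function names, the `n_fft` length, and the toy sequences are all hypothetical.

```python
import numpy as np

def power_spectrum(seq, n_fft=64):
    """Sum the squared DFT magnitudes of the A, C, G, T indicator sequences.

    Assumption: this indicator-sequence construction is one common way to
    build a genomic power spectrum; the paper may use a different one.
    """
    seq = seq.upper()
    total = np.zeros(n_fft // 2 + 1)
    for base in "ACGT":
        indicator = np.array([1.0 if ch == base else 0.0 for ch in seq])
        # rfft zero-pads or truncates the input to n_fft samples,
        # so genomes of different lengths yield spectra of equal length.
        total += np.abs(np.fft.rfft(indicator, n=n_fft)) ** 2
    return total

def variance_filter(spectra, k, keep="lowest"):
    """Return sorted indices of the k coefficients with the lowest
    (or highest) variance across genomes (rows of `spectra`).

    Assumption: a hypothetical reading of variance-based filtering.
    """
    variances = spectra.var(axis=0)
    order = np.argsort(variances)
    chosen = order[:k] if keep == "lowest" else order[-k:]
    return np.sort(chosen)

def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy sequences, for illustration only (not real genomes).
toy = ["ACGTACGTACGTACGT", "ACGTACGAACGTACGT", "TTGTACGTACGTACCA"]
spectra = np.vstack([power_spectrum(s, n_fft=16) for s in toy])

kept = variance_filter(spectra, k=4)
d_full = pairwise_distances(spectra)
d_filtered = pairwise_distances(spectra[:, kept])
```

Comparing the off-diagonal entries of `d_full` and `d_filtered` (for example with `np.corrcoef`) is one way to check whether filtered coefficients preserve the distance structure of the unfiltered set, which is the kind of comparison the case study describes.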