Detection of exceptional genomic words: a comparison between species

EasyChair Preprint 63 • 12 pages • Date: April 15, 2018

Abstract

In this study we explore the potential of inter-word distances to detect exceptional genomic words (oligonucleotides) in several species, using whole-genome analysis. We compare the empirical results obtained from the complete genomes with the corresponding results obtained from a random background. We develop a procedure, based on statistical properties of the global distance distributions in DNA sequences, to discriminate words with exceptional inter-word distance distributions and to identify distances with exceptional frequency of occurrence. We identify the statistically exceptional words in whole genomes, i.e., words with unexpected inter-word distance distributions, and we suggest species signatures based on exceptional word profiles.

Keyphrases: DNA sequence, inter-oligonucleotide distances, exceptional genomic word, goodness-of-fit, stochastic model
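To make the notion of an inter-word distance concrete, here is a minimal Python sketch. It assumes that the distance is the difference between the start positions of consecutive occurrences of the same word, with overlapping occurrences counted; the abstract does not spell out these details, so this is an illustration and not the authors' procedure. The function names `inter_word_distances` and `distance_distribution` and the toy sequence are hypothetical.

```python
from collections import Counter


def inter_word_distances(sequence, word):
    """Return distances between start positions of consecutive occurrences of `word`.

    Assumption: overlapping occurrences are counted, so the search
    advances by one position after each hit.
    """
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    # Difference between each occurrence and the next one.
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(sequence, word):
    """Return the empirical relative frequency of each inter-word distance."""
    distances = inter_word_distances(sequence, word)
    total = len(distances)
    if total == 0:
        return {}
    counts = Counter(distances)
    return {d: c / total for d, c in sorted(counts.items())}


# Toy sequence (illustrative only, not real genomic data).
toy = "ACGTACGACGTTACG"
print(inter_word_distances(toy, "ACG"))
print(distance_distribution(toy, "ACG"))
```

An empirical distribution of this kind is what would then be compared against the distribution expected under a random background, for example with a goodness-of-fit measure; the choice of background model and test is left open here because the abstract does not specify it.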