Informatics and Machine Learning. From Martingales to Metaheuristics
The Norwalk virus genome is actually smaller than the typical viral genome, which ranges between 10 000 and 100 000 bases in length. Prokaryotic genomes typically range between 1 and 10 million bases in length, while the human genome is approximately three billion bases in length (3.23 Gb per haploid genome, 6.46 Gb total diploid). For a “strong” statistical analysis in the current discussion, the key, as with any statistical analysis, is sample size, which in this analysis is dictated by genome size. So, to have “good statistics”, meaning enough samples that the frequencies of outcomes provide a good estimate of the underlying probabilities for those outcomes, we will apply the methods developed thus far to a bacterial genome in ssss1 (the classic model organism, E. coli). In this instance the genome is approximately four and a half million bases in length, so much better counts should result than with the 7654-base Norwalk virus genome.
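To make the sample-size argument concrete, the following minimal Python sketch compares how many observations each possible subsequence of length k would receive, on average, in the two genomes. It rests on assumptions not stated in the text: that the statistics of interest are counts of overlapping length-k windows, that a uniform baseline over the four bases is adequate for a rough estimate, and that the E. coli length can be rounded to 4 500 000. The function names are hypothetical and chosen for illustration only.

```python
import math

# Genome lengths: the Norwalk figure is from the text; the E. coli figure
# is the text's "approximately four and a half million", rounded (assumption).
NORWALK_LENGTH = 7654
ECOLI_LENGTH = 4_500_000


def mean_count_per_kmer(genome_length: int, k: int) -> float:
    """Average observations per possible k-mer, assuming overlapping
    windows and a uniform baseline over the 4**k possible k-mers."""
    windows = genome_length - k + 1
    return windows / 4 ** k


def relative_std_error(n: int, p: float) -> float:
    """Relative standard error of a frequency estimate from n
    independent trials with true probability p (binomial model)."""
    return math.sqrt((1.0 - p) / (n * p))


if __name__ == "__main__":
    for k in range(1, 9):
        p = 1.0 / 4 ** k  # uniform-baseline probability of any one k-mer
        norwalk_n = NORWALK_LENGTH - k + 1
        ecoli_n = ECOLI_LENGTH - k + 1
        print(
            f"k={k}: "
            f"Norwalk mean count {mean_count_per_kmer(NORWALK_LENGTH, k):.2f} "
            f"(rel. error {relative_std_error(norwalk_n, p):.3f}), "
            f"E. coli mean count {mean_count_per_kmer(ECOLI_LENGTH, k):.2f} "
            f"(rel. error {relative_std_error(ecoli_n, p):.3f})"
        )
```

Under these assumptions, the Norwalk genome averages fewer than two observations per possible 6-mer, whereas a genome of roughly 4.5 million bases averages more than a thousand. Overlapping windows are not independent trials, so the binomial error figure is only a rough guide.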