In evaluating whether there is a statistical linkage between two events we are essentially asking whether those events are independent, i.e. does P(X, Y) = P(X)P(Y)? In this situation we are again in the position of comparing two probability distributions, P(X, Y) and P(X)P(Y), so if relative entropy is the best tool for such comparisons, why not evaluate D(P(X, Y) ‖ P(X)P(Y))? This is precisely what should be done, and in doing so we arrive at the definition of what is known as “mutual information” (finally, a name for an information measure that is perfectly self‐explanatory!).
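Written out explicitly from the relative‐entropy definition just given, mutual information takes the standard form below; it is zero exactly when X and Y are independent, and positive otherwise, which is what makes it a detector of statistical linkage:

\[
\mathrm{MI}(X;Y) \;=\; D\big(P(X,Y)\,\|\,P(X)\,P(Y)\big) \;=\; \sum_{x,y} P(x,y)\,\log\frac{P(x,y)}{P(x)\,P(y)} \;\ge\; 0 .
\]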
The use of mutual information is very powerful in bioinformatics, and in informatics in general, as it allows statistical linkages to be discovered that are not otherwise apparent. In what follows we will start by evaluating the mutual information between genomic nucleotides at various degrees of separation. If we see nonzero mutual information in the genome for bases separated by certain specified gap distances, we will have uncovered that there is “structure” of some sort.
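As a concrete illustration, here is a minimal Python sketch of this gap‐separated mutual information calculation on a nucleotide sequence. The function name, the choice of log base 2 (bits), and the toy sequence are assumptions made for this example, not the book's own implementation:

```python
import math
from collections import Counter

def gap_mutual_information(seq, gap):
    """Estimate MI (in bits) between bases at positions i and i+gap,
    using empirical frequencies from a single sequence."""
    pairs = [(seq[i], seq[i + gap]) for i in range(len(seq) - gap)]
    n = len(pairs)
    joint = Counter(pairs)               # counts of (base_i, base_{i+gap})
    px = Counter(x for x, _ in pairs)    # marginal counts of first base
    py = Counter(y for _, y in pairs)    # marginal counts of second base
    mi = 0.0
    for (x, y), c in joint.items():
        # P(x,y) / (P(x) P(y)) = (c/n) / ((px/n)(py/n)) = c*n / (px*py)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Toy usage: scan a range of gaps; nonzero MI at a gap suggests structure.
seq = "ATGCGATACGATTACAGGCATCGATGCA" * 100
for g in range(1, 6):
    print(g, gap_mutual_information(seq, g))
```

Scanning over a range of gap values and looking for peaks in the resulting MI is then a direct probe for statistical structure at those separations.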