Informatics and Machine Learning. From Martingales to Metaheuristics
The Shannon entropy results on dinucleotides still show no clear signs of nonrandomness. Let us similarly try the trinucleotide level. There are 64 (4 × 4 × 4) trinucleotides that we must now get counts on:
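-------------------- prog1.py addendum 7 ---------------------
# Uses `result` (the sequence) and `seqlen` (its length) from earlier in prog1.py.
stats = {}
order = 3
# Slide a window of length `order` along the sequence; index marks its last base.
for index in range(order-1, seqlen):
    xmer = ""
    for xmeri in range(0, order):
        xmer += result[index-(order-1)+xmeri]
    # Tally this trinucleotide.
    if xmer in stats:
        stats[xmer] += 1
    else:
        stats[xmer] = 1
# Print the counts in alphabetical order of trinucleotide.
for i in sorted(stats):
    print("%dx'%s'" % (stats[i], i))
---------------- end prog1.py addendum 7 ---------------------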
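The sliding-window count above can also be written with string slicing and `collections.Counter`. The following is a minimal, self-contained sketch rather than part of prog1.py; the function name `kmer_counts` and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter

def kmer_counts(seq, order):
    """Count every overlapping substring of length `order` in `seq`."""
    # One window starts at each position 0 .. len(seq) - order.
    return Counter(seq[i:i + order] for i in range(len(seq) - order + 1))

# Toy sequence (hypothetical, for illustration only).
toy = "ACGTACGTAC"
counts = kmer_counts(toy, 3)
for xmer in sorted(counts):
    print("%dx'%s'" % (counts[xmer], xmer))
```

A sequence shorter than `order` yields an empty counter, since no full window fits.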
We still do not see clear signs of nonrandomness at the trinucleotide level, so let us try the 6‐nucleotide level. There are 4096 (4^6) possible 6‐nucleotides that we must now get counts on:
----------------- prog1.py addendum 8 ------------------------
# Relies on np (numpy), count_to_freq() and shannon() from earlier in prog1.py.
def shannon_order(seq, order):
    stats = {}
    seqlen = len(seq)
    # Count every overlapping substring of length `order` in seq.
    for index in range(order-1, seqlen):
        xmer = ""
        for xmeri in range(0, order):
            # Read from the seq argument, not the global `result`.
            xmer += seq[index-(order-1)+xmeri]
        if xmer in stats:
            stats[xmer] += 1
        else:
            stats[xmer] = 1
    # Number of distinct substrings actually observed.
    nonzerocounts = len(stats)
    print("nonzerocounts=")
    print(nonzerocounts)
    # Collect the counts as floats, in sorted key order.
    counts = np.empty((0))
    for i in sorted(stats):
        counts = np.append(counts, stats[i]+0.0)
    probs = count_to_freq(counts)
    value = shannon(probs)
    print("The shannon entropy at order", order, "is:", value, ".")

shannon_order(result, 6)
------------------- end prog1.py addendum 8 ------------------
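One way to judge whether an entropy value signals nonrandomness is to compare it with the maximum possible value, reached when all 4^order substrings are equally likely. The sketch below is self-contained and does not use the prog1.py helpers. It assumes base-2 logarithms, so the maximum is 2 × order bits (the `shannon` function in prog1.py may use a different base), and the names `kmer_entropy_bits` and `max_entropy_bits` are hypothetical.

```python
import math
from collections import Counter

def kmer_entropy_bits(seq, order):
    """Shannon entropy (in bits) of the observed k-mer frequencies."""
    counts = Counter(seq[i:i + order] for i in range(len(seq) - order + 1))
    total = sum(counts.values())
    # Unobserved k-mers have probability 0 and contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def max_entropy_bits(order, alphabet_size=4):
    """Entropy of a uniform distribution over all alphabet_size**order k-mers."""
    return order * math.log2(alphabet_size)

# Hypothetical, highly repetitive toy sequence (illustration only).
toy = "ACGT" * 50
for order in (1, 2, 3):
    print(order, kmer_entropy_bits(toy, order), max_entropy_bits(order))
```

For the repetitive toy sequence the single-base entropy sits at its maximum, while the higher orders fall well below theirs, which is the kind of gap that would indicate structure. On short sequences at high order many substrings are never observed, so a low value can also reflect undersampling; the `nonzerocounts` printout in addendum 8 helps check for that.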