
3.1.1 The Khinchin Derivation

In his now famous 1948 paper [106], Claude Shannon introduced a quantitative measure of entropy in connection with communication theory. The Shannon entropy measure was later put on a more formal footing by A. I. Khinchin, who proved that, under certain assumptions, it is unique [107]. (Dozens of similar axiomatic proofs have since been given.) A statement of the theorem is as follows:

Theorem: Let $H(p_1, p_2, \ldots, p_n)$ be a function defined for any integer $n$ and for all values $p_1, p_2, \ldots, p_n$ such that $p_k \geq 0$ $(k = 1, 2, \ldots, n)$ and $\sum_k p_k = 1$. If $H$ is continuous in its arguments and satisfies the two conditions below, then

$$H(p_1, p_2, \ldots, p_n) = -\lambda \sum_k p_k \log(p_k),$$

where $\lambda$ is a positive constant (taken here as $\lambda = 1$).

1 For given $n$ and for $\sum_k p_k = 1$, the function takes its largest value for $p_k = 1/n$ $(k = 1, 2, \ldots, n)$. This is equivalent to Laplace's principle of insufficient reason: if you know nothing that distinguishes the outcomes, assume the uniform distribution (which also agrees with Occam's Razor in assuming minimum structure). This maximization is spot-checked in the first sketch following these conditions.

2 $H(ab) = H(a) + H_a(b)$, where $H_a(b) = -\sum_a p(a) \sum_b p(b|a) \log(p(b|a))$ is the conditional entropy of $b$ given $a$. This is consistent with $H(ab) = H(a) + H(b)$ when $a$ and $b$ are independent, with the conditional entropy supplying the correction when they are not; this chain rule is verified numerically in the second sketch below.
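The entropy formula and condition 1 are easy to exercise numerically. The following is a minimal Python sketch, assuming $\lambda = 1$ and base-2 logarithms (so entropy is in bits); the helper name shannon_entropy and the Dirichlet sampling are illustrative choices, not anything from the text. No randomly drawn distribution on $n$ outcomes exceeds the entropy of the uniform distribution:

```python
import numpy as np

def shannon_entropy(p, base=2):
    """H(p) = -sum_k p_k log(p_k), with 0*log(0) taken as 0 (lambda = 1)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                      # zero-probability terms contribute 0
    return -np.sum(nz * np.log(nz)) / np.log(base)

# Condition 1: for fixed n, H is largest for the uniform distribution.
n = 4
uniform = np.full(n, 1.0 / n)
print(shannon_entropy(uniform))        # log2(4) = 2.0 bits

rng = np.random.default_rng(seed=0)
for _ in range(1000):
    p = rng.dirichlet(np.ones(n))      # a random distribution on n outcomes
    assert shannon_entropy(p) <= shannon_entropy(uniform) + 1e-12
```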
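Condition 2 can be checked the same way. The sketch below is again only illustrative: the 2x2 joint table is an arbitrary example of dependent $a$ and $b$, and the helper names are mine. It computes $H_a(b)$ from a joint distribution $p(a, b)$ and confirms the chain rule $H(ab) = H(a) + H_a(b)$ to floating-point precision:

```python
import numpy as np

def H(p, base=2):
    """Shannon entropy of a probability vector, 0*log(0) := 0."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz)) / np.log(base)

def conditional_entropy(joint, base=2):
    """H_a(b) = -sum_{a,b} p(a,b) log p(b|a), from a joint table p(a,b)."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1)                    # marginal p(a)
    h = 0.0
    for i, row in enumerate(joint):
        for p_ab in row[row > 0]:
            h -= p_ab * np.log(p_ab / p_a[i])  # p(b|a) = p(a,b) / p(a)
    return h / np.log(base)

# An arbitrary 2x2 joint distribution p(a, b) with a and b dependent.
joint = np.array([[0.30, 0.10],
                  [0.20, 0.40]])
H_ab = H(joint)                 # H(ab): entropy over the joint outcomes
H_a = H(joint.sum(axis=1))      # H(a): entropy of the marginal
print(np.isclose(H_ab, H_a + conditional_entropy(joint)))  # -> True
```

The identity holds exactly because $\log p(b|a) = \log p(a,b) - \log p(a)$, so summing $-p(a,b)\log p(b|a)$ over all outcomes yields $H(ab) - H(a)$ term by term.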
