
2.4 Identifying Emergent/Convergent Statistics and Anomalous Statistics

Expectation, E(X), of a random variable (r.v.) X:

E(X) = \sum_i x_i \, p(x_i)
X is the total of rolling two six-sided dice: X = 2 can occur in only one way, rolling "snake eyes," while X = 7 can be rolled in six ways, etc., giving E(X) = 7. Now consider the expectation for rolling a single die: E(X) = 3.5. Notice that the value of the expectation need not be one of the possible outcomes (it is really hard to roll a 3.5).
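A minimal Python sketch of these two computations, enumerating the equally likely outcomes directly (exact fractions are used only to keep the arithmetic exact):

```python
from itertools import product
from fractions import Fraction

# Single six-sided die: each face 1..6 has probability 1/6.
faces = range(1, 7)
e_single = sum(Fraction(1, 6) * x for x in faces)
print(e_single)   # 7/2, i.e. 3.5 -- not itself a rollable outcome

# Total of two dice: 36 equally likely ordered pairs (a, b).
e_pair = sum(Fraction(1, 36) * (a + b) for a, b in product(faces, repeat=2))
print(e_pair)     # 7
```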

The expectation, E(g(X)), of a function g of r.v. X:

E(g(X)) = \sum_i g(x_i) \, p(x_i)
Consider the special case where g(x_i) = −log(p(x_i)):

E(g(X)) = -\sum_i p(x_i) \log p(x_i)

which is the Shannon Entropy of the discrete distribution p(x_i). For Mutual Information, similarly, use g(X, Y) = log( p(x_i, y_j) / [ p(x_i) p(y_j) ] ):

E(g(X, Y)) = \sum_{i,j} p(x_i, y_j) \log \left[ \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)} \right]

provided p(x_i), p(y_j), and p(x_i, y_j) are all ∈ ℜ+. This is the Relative Entropy between the joint distribution and the product of its marginals, i.e. the same distribution if the r.v.'s were independent: D( p(x_i, y_j) ‖ p(x_i) p(y_j) ).
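The following minimal Python sketch computes both expectations for a small joint distribution; the four probabilities in p_xy are assumed values chosen purely for illustration:

```python
import math

# Assumed joint distribution p(x, y); the four values are illustrative only.
p_xy = {('a', 0): 0.3, ('a', 1): 0.1,
        ('b', 0): 0.2, ('b', 1): 0.4}

# Marginals p(x) and p(y), obtained by summing out the other variable.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# Shannon Entropy H(X) = E[-log p(X)]
H_x = -sum(p * math.log(p) for p in p_x.values())

# Mutual Information I(X;Y) = E[ log( p(x,y) / (p(x) p(y)) ) ]
I_xy = sum(p * math.log(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())

print(H_x, I_xy)   # I_xy >= 0, with equality only if X and Y are independent
```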

Jensen's Inequality (for a convex function φ and points x_1, …, x_n):

\varphi\left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) \le \frac{1}{n} \sum_{i=1}^{n} \varphi(x_i)
Since φ(x) = −log(x) is a convex function:

-\log\left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) \le -\frac{1}{n} \sum_{i=1}^{n} \log(x_i)
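A minimal numerical check of this −log instance of Jensen's inequality, with the positive values x_i chosen arbitrarily for illustration:

```python
import math

x = [0.5, 1.0, 2.0, 4.0]                    # arbitrary positive values x_i
n = len(x)

lhs = -math.log(sum(x) / n)                 # -log of the average
rhs = sum(-math.log(xi) for xi in x) / n    # average of the -log values

print(lhs, rhs, lhs <= rhs)                 # True: -log(mean) <= mean(-log)
```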
Variance:

Var(X) = E\left[ (X - E(X))^2 \right] = E(X^2) - [E(X)]^2
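For the single-die example above, a minimal Python sketch of the variance computation (E(X^2) = 91/6, so Var(X) = 91/6 − 49/4 = 35/12 ≈ 2.92):

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)

e_x  = sum(p * x for x in faces)        # E(X)   = 7/2
e_x2 = sum(p * x * x for x in faces)    # E(X^2) = 91/6
var  = e_x2 - e_x ** 2                  # Var(X) = 35/12

print(e_x, e_x2, var)
```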
Chebyshev's Inequality (for any k > 0, where μ = E(X) and σ^2 = Var(X)):

P\left( |X - \mu| \ge k\sigma \right) \le \frac{1}{k^2}
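A minimal sketch checking the bound exactly for a single die roll; the choice k = 1.4 is arbitrary:

```python
import math

faces = range(1, 7)
p = 1 / 6
mu = sum(p * x for x in faces)                            # 3.5
sigma = math.sqrt(sum(p * (x - mu) ** 2 for x in faces))  # ~1.708

k = 1.4                                                   # arbitrary choice of k
tail = sum(p for x in faces if abs(x - mu) >= k * sigma)  # exact P(|X - mu| >= k*sigma)
bound = 1 / k ** 2                                        # Chebyshev bound

print(tail, bound, tail <= bound)   # ~0.333 <= ~0.510 -> True
```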