Читать книгу Informatics and Machine Learning. From Martingales to Metaheuristics онлайн
50 страница из 101
Some notes on syntax: “len” is a Python function that returns the length of (number of items in) an array (from the numpy module). Notice how the probs array initialization has one entry as 1.0 and the others just 1. Again, this is an instance where the data structure must have components of the same type and if presented with mixed type will promote to a default type that represents the least loss of information (typically), in this instance, the “1.0” forces the array to be an array of floating point (decimal) numbers, with floating point arithmetic (for the division in the frequency evaluation used as the estimate of the probability in the “for loop”).
At this point we can estimate a new probability distribution based on the rolls observed, for which we are interested in evaluating the Shannon entropy. To avoid repeatedly copying and pasting the above code for evaluating the Shannon entropy, let us create a subroutine, called “shannon” that will do this standard computation. This is a core software engineering process, whereby tasks that are done repeatedly become recognized as such, and become rewritten as subroutines, and then need no longer be rewritten. Subroutines also avoid clashes in variable usage, compartmentalizing their variables (whose scope is only in their subroutine), and more clearly delineate what information is “fed in” and what information is returned (e.g. the application programming interface, or API).