Read the book Informatics and Machine Learning. From Martingales to Metaheuristics online

ssss1 (Left) The general stochastic sequential analysis flow topology. (Center) The general signal processing flow in channel current analysis is typically Input ➔ tFSA ➔ Meta‐HMMBD ➔ SVM ➔ Output. (Right) Notable differences occur in channel current cheminformatics during state discovery: when EVA‐projection (emission variance amplification projection), or a similar method, is used to quantize the states, the flow becomes Input ➔ tFSA ➔ HMMBD/EVA (state discovery) ➔ meta‐HMMBD‐side ➔ SVM ➔ Output. In gene‐finding, by contrast, the flow is simply Input ➔ meta‐HMMBD‐side ➔ Output. In gene‐finding, however, the HMM internal “sensors” are sometimes replaced, locally, with profile‐HMMs [1, 3] (equivalent to position‐dependent Markov Models, or pMMs; see ssss1) or SVM‐based profiling [1, 3], so the topologies can differ not only in the connections between the boxes shown but also in the ability of boxes to be embedded in other boxes as part of an internal refinement.

Source: Based on Winters‐Hilt [1, 3].
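
The three flow orderings named in the caption can be written down as plain data. The following is only a minimal sketch for illustration: the stage names are copied from the caption (with ASCII hyphens), while the dictionary keys, `make_stage`, and `run` are hypothetical names, and each stage is a placeholder that records its name rather than performing any tFSA, HMMBD, or SVM processing.

```python
from typing import Callable, Dict, List

Stage = Callable[[dict], dict]

# Stage orderings as listed in the caption; the keys are made-up labels.
TOPOLOGIES: Dict[str, List[str]] = {
    "channel_current": ["tFSA", "Meta-HMMBD", "SVM"],
    "channel_current_state_discovery": [
        "tFSA", "HMMBD/EVA", "meta-HMMBD-side", "SVM",
    ],
    "gene_finding": ["meta-HMMBD-side"],
}


def make_stage(name: str) -> Stage:
    """Return a placeholder stage that only appends its name to the trace."""
    def stage(record: dict) -> dict:
        return {**record, "trace": record["trace"] + [name]}
    return stage


def run(topology: str, payload) -> dict:
    """Pass a payload through the placeholder stages of one topology."""
    record = {"payload": payload, "trace": ["Input"]}
    for name in TOPOLOGIES[topology]:
        record = make_stage(name)(record)
    return {**record, "trace": record["trace"] + ["Output"]}


if __name__ == "__main__":
    for label in TOPOLOGIES:
        print(label, "->", " > ".join(run(label, None)["trace"]))
```

Keeping the ordering as data makes the differences between the topologies easy to compare. It does not capture the embedding of one box inside another that the caption mentions.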

The sequence of algorithmic methods used in the SSA Protocol, for the information‐processing flow topology shown in ssss1, comprises a weak signal handling protocol as follows: (i) The weakness of the (fast) Finite State Automaton (FSA) methods will be shown to be their difficulty with nonlocal structure identification, for which HMM methods (and tuning metaheuristics) are the solution. (ii) For the HMM, in turn, the main weakness is in local sensing “classification,” due to conditional independence assumptions. Once posed as a classification problem, however, this can be solved by incorporating generalized SVM methods [1, 3]. If only a classification task is faced (data already preprocessed), the SVM will also be the method of choice in what follows. (iii) The weakness of the SVM, whether used for classification or clustering, but especially for the latter, is the need to optimize over algorithmic, model (kernel), chunking, and other process parameters during learning. This is solved via metaheuristics for optimization, such as simulated annealing and genetic algorithms. (iv) The main weakness of the metaheuristic tuning effort is partly resolved via use of the “front‐end” methods, like the FSA, and partly resolved by a knowledge discovery process using the SVM clustering methods. The SSA Protocol weak signal acquisition and analysis method thereby establishes a robust signal processing platform.
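
To make the metaheuristic tuning step concrete, here is a minimal sketch of simulated annealing over two real-valued parameters. Everything specific in it is an assumption for illustration and not taken from the text: the two parameters are imagined as log10 of an SVM penalty and log10 of a kernel width, `toy_validation_score` is a made-up stand-in for a real validation score, and the cooling schedule and step size are arbitrary choices.

```python
import math
import random


def simulated_annealing(score, start, steps=2000, t0=1.0,
                        cooling=0.995, step_size=0.5, seed=0):
    """Maximize `score` over a list of real-valued parameters."""
    rng = random.Random(seed)
    current = list(start)
    current_score = score(current)
    best, best_score = list(current), current_score
    temperature = t0
    for _ in range(steps):
        # Propose a Gaussian perturbation of every parameter.
        candidate = [x + rng.gauss(0.0, step_size) for x in current]
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling  # geometric cooling schedule
    return best, best_score


def toy_validation_score(params):
    """Hypothetical stand-in for a validation score (higher is better).

    Its maximum, 0, is at log10(C) = 1 and log10(gamma) = -2 by construction.
    """
    log_c, log_gamma = params
    return -((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2)


if __name__ == "__main__":
    best_params, best_score = simulated_annealing(
        toy_validation_score, start=[0.0, 0.0])
    print("best parameters:", best_params)
    print("best score:", best_score)
```

In practice the score function would train and validate a classifier for each candidate, which is far more expensive than this toy objective. That cost is one reason to narrow the search first, in the way the text suggests the front‐end methods and SVM clustering do.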
