Active Testing & MIRACLEs

Active testing refers to a process in which an image is sequentially probed for information in order to efficiently determine the presence of one or more objects drawn from an object library. This amounts to a strategy captured by a {\it decision tree}, wherein the nodes correspond to performing simple tests on the image grey levels and the branches correspond to the possible outcomes of the tests. The leaves represent decision points: sufficient information has been gathered to make an informed decision.
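As a concrete illustration of this structure, the following minimal sketch shows one way the probing loop might be organized; the class and function names are ours, introduced only for illustration, and this is not code from the systems described here.

\begin{verbatim}
# Illustrative sketch of the active-testing loop, not the authors' code.
# Internal nodes perform simple tests on image grey levels; each branch
# corresponds to one test outcome; leaves carry decisions.

from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Node:
    test: Optional[Callable[[object], int]] = None  # probes the image, returns an outcome
    children: Dict[int, "Node"] = field(default_factory=dict)
    decision: Optional[str] = None                  # non-None only at leaves

def classify(root: Node, image: object) -> str:
    """Probe the image sequentially until a decision point is reached."""
    node = root
    while node.decision is None:
        outcome = node.test(image)     # perform a simple test on grey levels
        node = node.children[outcome]  # follow the branch for this outcome
    return node.decision
\end{verbatim}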

Decision trees provide a highly efficient computational framework for recognizing multiple objects in complex scenes. Past work has focused on good or optimal methods for constructing decision trees, in the context of mathematical and statistical models that define the objects to be recognized and the distribution of the observed image data.

Our early work in this area concentrated on specific algorithms for specific applications. These applications have provided rich motivation, focusing foundational research on issues germane to real military and industrial computer vision problems. Decision trees are the basis for an optical character recognition algorithm in wide use in the electronics, semiconductor and other industries that employ state-of-the-art manufacturing technology. Decision trees are also the foundation for a Ladar/FLIR ATR algorithm developed with support from the Night Vision Lab and ARL (\cite{ARTM4}). Recently, decision trees have been developed for the recognition of shape classes and applied to road tracking in satellite imagery and to handwritten numeral recognition, achieving state-of-the-art performance. Currently, we are working on the use of decision trees for early detection of potential objects in wide-area surveillance and retargeting.

Decision trees are conceptually related to decision-making problems familiar from everyday life. They are a variant of the divide-and-conquer strategy carried out informally and heuristically in parlor and board games such as ``Twenty Questions'' or ``Battleship'' and in popular TV game shows such as ``Wheel of Fortune.'' The basic idea is very simple: ask the right questions in the right order. Of course, our research strives to place what is informal and heuristic on a firm scientific foundation. In a careful formulation, the guiding principle for decision trees is uncertainty reduction, and the construction is rooted in principles of statistical inference and information theory.
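The uncertainty-reduction principle can be made concrete with a small sketch: among a set of candidate binary questions, a greedy rule asks the one that minimizes the expected entropy of the posterior. The object types, prior and candidate questions below are illustrative assumptions, not drawn from the applications above.

\begin{verbatim}
# Hedged sketch of greedy uncertainty reduction: pick the binary test
# whose answer reduces posterior entropy the most, in expectation.

import math

def entropy(probs):
    return -sum(q * math.log2(q) for q in probs if q > 0)

def expected_posterior_entropy(prior, yes_set):
    """Expected entropy after asking whether the truth lies in yes_set."""
    p_yes = sum(p for h, p in prior.items() if h in yes_set)
    p_no = 1.0 - p_yes
    h_yes = entropy([p / p_yes for h, p in prior.items() if h in yes_set]) if p_yes > 0 else 0.0
    h_no = entropy([p / p_no for h, p in prior.items() if h not in yes_set]) if p_no > 0 else 0.0
    return p_yes * h_yes + p_no * h_no

def best_question(prior, questions):
    """Greedy rule: minimize expected remaining uncertainty."""
    return min(questions, key=lambda q: expected_posterior_entropy(prior, q))

# Example: four hypothetical object types, uniform prior, two questions.
prior = {"tank": 0.25, "truck": 0.25, "jeep": 0.25, "clutter": 0.25}
questions = [frozenset({"tank"}), frozenset({"tank", "truck"})]
print(best_question(prior, questions))  # the balanced split wins
\end{verbatim}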

Decision trees provide a framework for attempting to understand the optimality of procedures for testing multiple, highly composite hypotheses. At virtually every location in an image, we seek to determine the possible interpretations of the image data in a vicinity of the location: Is an object present or not? If an object is present, what are its possible types? These questions are addressed as formal hypothesis testing problems. Regarded as such, and recognizing the complexity of these decisions in comparison with problems in classical statistics where sharp performance bounds are known, it is reasonably clear that there is slim hope of a tight universal bound on the ``power'' of all possible recognition methods, and no hope of deriving a ``uniformly most powerful'' recognition algorithm. Nonetheless, the mathematical framework does allow one to understand recognition problems in terms of well-established results in information theory and sequential decision theory, and measures of performance can be analyzed.
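To make the formulation concrete, the per-location decision can be written as a composite hypothesis testing problem; the notation below is ours, introduced only for illustration:
\[
H_0:\; Y \sim P_0 \quad (\mbox{no object present}), \qquad
H_c:\; Y \sim P_{c,\theta}, \quad \theta \in \Theta_c, \quad c = 1, \dots, K,
\]
where $Y$ denotes the image data in the vicinity of the location, $c$ indexes the $K$ object types in the library, and $\theta$ ranges over nuisance parameters such as pose, scale and illumination. It is the composite nature of each $H_c$ that separates these problems from the classical settings in which sharp performance bounds are available.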

One such measure of performance relates the computational efficiency of a decision protocol (tree) to the best possible computational complexity in the idealized case that error-free tests can be performed at the nodes of the tree. The problem of designing or optimizing a binary decision tree is equivalent to prescribing which of the available tests should be performed at each node. If the criterion of optimality is to minimize the expected time to reach a decision about which object, if any, is present, and if tests are available for every possible binary partitioning of the object subclasses still viable at a given node, then it is well known that an optimal tree can be identified with a Huffman code, and the best possible value of the objective function is the Shannon entropy of the initial distribution on the object types. In this idealized scenario, one can also prove that a ``greedy'' decision rule performs nearly as well as the optimal protocol.
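The following sketch illustrates this idealized result numerically. The prior on object types is an illustrative (dyadic) assumption, chosen so that the Huffman tree's expected number of tests matches the entropy exactly; in general the expected number of tests lies within one bit of the entropy.

\begin{verbatim}
# Sketch of the idealized case: with error-free tests available for every
# binary partition, the optimal tree is a Huffman code, and its expected
# number of tests is within one bit of the Shannon entropy of the prior.

import heapq, math
from itertools import count

def huffman_expected_tests(prior):
    """Expected depth (number of tests) of the Huffman tree for `prior`."""
    tiebreak = count()
    heap = [(p, next(tiebreak), 0.0) for p in prior.values()]
    # third field accumulates sum(p_i * depth_i) over the merged subtree
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # merging pushes every leaf in both subtrees one level deeper
        heapq.heappush(heap, (p1 + p2, next(tiebreak), c1 + c2 + p1 + p2))
    return heap[0][2]

# Illustrative dyadic prior over hypothetical object types.
prior = {"tank": 0.5, "truck": 0.25, "jeep": 0.125, "clutter": 0.125}
H = -sum(p * math.log2(p) for p in prior.values())
print(f"entropy = {H:.3f} bits, "
      f"expected tests = {huffman_expected_tests(prior):.3f}")
# Here both quantities equal 1.75, since the prior is dyadic.
\end{verbatim}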

In more complex scenarios, the decision tree framework still affords a clear understanding of the connections between recognition on the one hand and fundamental principles of information theory and sequential decision theory on the other.