The most promising approach to creating a mathematical theory of vision is based on physiological models that describe how humans detect and recognize objects. However, these processes are so complex to study that results significant for lighting engineering cannot be expected in the near future. The most effective way to address this problem is to make the fullest possible use of existing knowledge about the physiology of vision and the structure of the human eye, and to draw on results from other fields of science to fill in the missing information. Physiology describes in detail the structure of the optical system of the human eye, including the retina. However, the study of how visual information is processed at the retinal level, and especially in the higher parts of the brain, is still at an early stage.
Given the lack of information on this issue, it can be assumed that, in the course of evolution and natural selection, the human brain has reached the highest level of development in solving the visual tasks of object detection and recognition. This assumption rests on the following argument: during evolution, carnivores with less perfect vision would have starved in competition with better-equipped predators, while herbivores with similarly imperfect vision would have fallen prey to those predators. Both would have failed to produce offspring, and their genes would not have taken part in further natural selection.
This hypothesis makes it possible to use one of the results of statistical decision theory. In particular, the optimal radiation receiver algorithm proposed by Shestov in 1967 can be applied to the visual system. Among the many algorithms designed to extract a signal from noise, only one cannot be surpassed by any other in detecting objects against a noisy background. This algorithm is called the “optimal radiation receiver”; for the analysis of two-dimensional luminance fields, it is called the “optimal image receiver” or simply the “optimal receiver”. The algorithm can be described mathematically in many ways, but the most illustrative is the likelihood ratio function. We do not claim that the human brain calculates exactly this function. We assume only that, through its complex neural connections, the human brain implements an algorithm close to that of the optimal receiver. The likelihood ratio function is merely a convenient mathematical tool for describing this algorithm.
The block diagram of the proposed visual system model is presented below.
The crystalline lens and the other optical elements of the eye are represented in the model by an optical system (OS). The retina is represented as a mosaic of N statistically independent radiation receivers (RR). Here, an independent radiation receiver means either an individual receiver or a group of receivers connected via intermediate neurons to a single optic nerve fiber. The set of random signals mu_i in the optic nerve fibers enters the analysis unit (AU), whose memory holds a priori information about the background and the object. The AU calculates the one-dimensional likelihood ratio function Lambda. It is the ratio of two probabilities of obtaining, in a detection experiment, the random realization Y (a set of random signals mu_i): the probability given that an object is present in the observer's field of view (P[Y/S]), and the probability given that it is absent, i.e., that only the background is present (P[Y/0]).
$$ \Lambda = \frac{P[Y/S]}{P[Y/0]} \quad (2.1) $$
where p and q are the a priori probabilities of the presence and absence of the object, and p + q = 1.
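Because the model treats the N receivers as statistically independent, the joint probabilities in (2.1) factor into products over the individual signals mu_i. The per-receiver probabilities P[mu_i/S] and P[mu_i/0] are notation introduced here for illustration only. Under that independence assumption, the likelihood ratio can be sketched as:
$$ \Lambda = \prod_{i=1}^{N} \frac{P[\mu_i/S]}{P[\mu_i/0]}, \qquad \ln \Lambda = \sum_{i=1}^{N} \ln \frac{P[\mu_i/S]}{P[\mu_i/0]} $$
The logarithmic form is often more convenient in practice, since each receiver then contributes one additive term.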
In probabilistic terms, the likelihood ratio shows which of two alternatives the image seen by a person (the distribution of mu_i) resembles more closely: the background with an object (Lambda > 1) or the background without an object (Lambda < 1).
According to the optimal receiver algorithm, the decision about the presence of an object in the field of view should be made when Lambda exceeds a certain threshold Lambda_p, i.e., in accordance with the decision rule Lambda > Lambda_p.
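The decision rule can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that each receiver signal mu_i is a Poisson-distributed count with a known mean under each hypothesis; the text does not specify the signal statistics. The function and parameter names (`log_likelihood_ratio`, `object_present`, `mean_background`, `mean_object`) are hypothetical.

```python
import math
from typing import Sequence


def log_likelihood_ratio(counts: Sequence[int],
                         mean_background: Sequence[float],
                         mean_object: Sequence[float]) -> float:
    """Return ln(Lambda) for N statistically independent receivers.

    Assumption (not from the text): each signal is a Poisson count whose
    mean is mean_object[i] when the object is present and
    mean_background[i] when only the background is present.
    """
    total = 0.0
    for k, b, s in zip(counts, mean_background, mean_object):
        # ln[Poisson(k; s) / Poisson(k; b)]; the k! factors cancel.
        total += k * math.log(s / b) - (s - b)
    return total


def object_present(counts: Sequence[int],
                   mean_background: Sequence[float],
                   mean_object: Sequence[float],
                   threshold: float) -> bool:
    """Apply the rule Lambda > Lambda_p, compared in the log domain."""
    llr = log_likelihood_ratio(counts, mean_background, mean_object)
    return llr > math.log(threshold)
```

For example, with background means `[10.0, 10.0, 10.0]`, object means `[10.0, 20.0, 10.0]`, and a threshold of 1, the counts `[10, 20, 10]` lead to the decision "object present", while `[10, 10, 10]` do not. Working with ln(Lambda) rather than Lambda avoids numerical underflow when N is large.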
Different decision criteria differ only in the numerical value of Lambda_p. If we assume that the criterion by which a person decides whether an object is present remains unchanged across tasks, the value of Lambda_p will be constant for various types of objects, backgrounds, brightness levels, and other parameters.
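Two standard criteria from statistical decision theory show how the choice of criterion fixes the threshold. The text does not say which criterion the visual system follows, so both are given here only as examples, with Lambda defined as the ratio P[Y/S] / P[Y/0] and alpha an assumed admissible false-alarm probability:
$$ \Lambda_p = \frac{q}{p} \quad \text{(ideal observer: minimum total error probability)} $$
$$ P[\Lambda > \Lambda_p / 0] = \alpha \quad \text{(Neyman-Pearson: } \Lambda_p \text{ set by the false-alarm probability } \alpha \text{)} $$
In both cases the decision rule Lambda > Lambda_p keeps the same form, and only the number Lambda_p changes.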