Fisher Kernel

One of the most important abilities in the human learning process is perhaps pattern recognition. That is to say, when a picture is presented to a person, he or she can usually give this picture a place in his or her mind: this is a picture of buildings, this is a picture of a landscape, this is a picture of my friends, this is a picture of an animal, and so on. This is the very first step in the mind’s handling of information.

But this process is not at all clear to us, and a digression is useful here. The reception of information about a picture is microscopic: the photons, the photoreceptors, the retinas, and the neurons all operate at the microscopic level. The result of pattern recognition, however, is macroscopic. In other words, knowing what is what, and which belongs to which, happens at the mental level, a macroscopic level, and we have to find a bridge between these two levels. The most common models in neuroscience are perhaps still based on networks, which view the neurons as vertices, the synapses as connections, and so on. If these network models are to simulate something at the macroscopic level, it helps to compare the network to something in physics. A certain quantity of liquid water can, in some sense, be viewed as a network too: each water molecule is a vertex, while the intermolecular forces (hydrogen bonds and van der Waals forces) are the connections (or edges, in network terminology). To resemble the network of neurons, this quantity of water must be macroscopically at rest, because most of the time the neurons do not move around; it is the synapses that change the states of the mind. So, loosely speaking, the network of neurons is almost always in or near a critical state (in other words, it is always in or near a phase transition).

Near a phase transition, in physics, a material can be macroscopically at rest and yet its macroscopic physical properties change dramatically in response to a small perturbation from the outside world. This sounds just like the behavior of the network of neurons: every time there is a stimulus from outside, the network of neurons produces some macroscopic result.

So, in some sense, all the difficulties in modeling the network of neurons reduce to understanding the mechanisms of phase transitions in groups of neurons.

Now let us return to our original pattern recognition problem. This problem is a common difficulty for every field that wants to mimic the human mind, for example machine learning, where it is called the classification problem. There are two types of classification problem: supervised classification and unsupervised classification (more often called the clustering problem). The first type classifies things based on a set of examples whose classes are already known, while the second type classifies things using only their innate properties.

Even though it is common practice to treat the classification result of an object as something certain, a moment’s thought shows that the word ‘certain’ already involves probability (in this sense we can say, with one hundred percent probability, that our human world is in fact probabilistic, even though this statement contradicts itself). So it seems more natural to consider statistical models rather than deterministic ones.

One such model for the classification problem is the Fisher kernel model. Like many other statistical models, it uses a particular similarity function (the Fisher kernel) to compare objects. To test whether an object belongs to a certain class, we measure the similarity between this object and each object in the class and then take the mean; after a suitable conversion, this mean can be read as the probability that the object belongs to the class.
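To make this concrete, here is a minimal sketch of the idea, not a definitive implementation. The Fisher kernel between two objects x and y is usually defined as K(x, y) = U_x^T I^{-1} U_y, where U_x is the gradient of the log-likelihood of x with respect to the model parameters (the Fisher score) and I is the Fisher information matrix. Everything else below is an assumption made only for illustration: the generative model is a one-dimensional Gaussian fitted to all the data pooled together, the conversion from mean similarities to probabilities is a softmax, and the function names and toy numbers are hypothetical.

```python
import math
from statistics import fmean, pstdev


def fit_gaussian(samples):
    """Maximum-likelihood mean and standard deviation of a 1-D sample."""
    return fmean(samples), pstdev(samples)


def fisher_score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma) with respect to (mu, sigma)."""
    z = (x - mu) / sigma
    return (z / sigma, (z * z - 1.0) / sigma)


def fisher_kernel(x, y, mu, sigma):
    """K(x, y) = U_x^T I^{-1} U_y for the 1-D Gaussian model.

    The Fisher information for (mu, sigma) is diag(1/sigma^2, 2/sigma^2),
    so its inverse is diag(sigma^2, sigma^2 / 2).
    """
    ux = fisher_score(x, mu, sigma)
    uy = fisher_score(y, mu, sigma)
    inv_info = (sigma ** 2, sigma ** 2 / 2.0)
    return sum(a * w * b for a, w, b in zip(ux, inv_info, uy))


def mean_similarity(x, members, mu, sigma):
    """Mean Fisher-kernel similarity between x and the members of a class."""
    return fmean([fisher_kernel(x, m, mu, sigma) for m in members])


def softmax(scores):
    """Assumed conversion from similarity scores to probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: two classes of 1-D measurements.
class_a = [1.0, 1.2, 0.8, 1.1]
class_b = [4.0, 4.3, 3.9, 4.1]
mu, sigma = fit_gaussian(class_a + class_b)  # one model for the pooled data

query = 1.05
sim_a = mean_similarity(query, class_a, mu, sigma)
sim_b = mean_similarity(query, class_b, mu, sigma)
prob_a, prob_b = softmax([sim_a, sim_b])
print(f"mean similarity: a={sim_a:.3f}, b={sim_b:.3f}")
print(f"probability:     a={prob_a:.3f}, b={prob_b:.3f}")
```

With these toy numbers the query lies among the members of class_a, so its mean similarity to class_a is the larger of the two. The softmax is only one possible conversion; any monotone map from scores to values that sum to one would play the same role.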


The retina is the light-sensitive layer of tissue that lines the inner back surface of an animal’s eye.

Near the inner surface of the retina reside the retinal ganglion cells. These are a special type of neuron that receives visual information from the photoreceptors through two other types of neuron: the bipolar cells and the amacrine cells. The defining characteristic of a retinal ganglion cell is that it has a very long axon which extends directly to the brain. In a human being, each retina has about one million retinal ganglion cells but about one hundred million photoreceptors, so on average each ganglion cell serves about a hundred photoreceptors. But this ratio is not uniform across the retina. In the center of the retina, each retinal ganglion cell corresponds to only five or six photoreceptors, whereas in the periphery each retinal ganglion cell receives visual information from several thousand photoreceptors. This is reasonable, in some sense, because the visual information is not uniformly distributed either: the central area usually carries the most detailed part of the scene, so each ganglion cell there is devoted to only a few photoreceptors.

The role of the retinal ganglion cells is to process the visual information. What does this mean? It means that these neurons start to distinguish different kinds of motion. More precisely, it is near the inner surface of the retina that horizontal motion is separated from vertical motion, as well as from other motions, such as moving forward or backward. This is very important for all animals. For example, when there is a predator in front of an animal and the predator moves a bit, the animal notices at once and can prepare to escape. This process is automatic: the animal sees a moving creature, the visual information is sent to the brain, and the brain decides on a corresponding action after a series of action potentials and chemical reactions. Viewed this way, the simple actions of animals are automatic responses, triggered by an event in the outside world.

But this is not exactly the case. In fact, reality is more complex than that. For example, a cat can move even without any external stimulus. So, in some sense, the brain is a self-driven machine: there are fluctuations in the brain, which act as an internal stimulus. But which systems can have such fluctuations? This is an interesting question. We know that even microbes exist on a scale far larger than that of atoms, which make up all the macromolecules such as DNA, proteins, and carbohydrates. But these constituents alone are obviously not enough to create a living being. Put another way, just putting together a great many DNA molecules, proteins, and all the other things that exist in the body of an animal will not create that animal. That is to say, fluctuation in the thermodynamic sense alone is not enough. So we must have forgotten something. Or perhaps it is just because I haven’t read enough.