# modules of finite type over a principal ideal domain

Modules of finite type over a principal ring (which, in this post, is considered the same as a principal ideal domain, just to shorten the long name) behave very well, much like vector spaces over a field. For example, we know from linear algebra that any linearly independent set of vectors in a vector space can be completed to a basis of that space. In the theory of modules over a principal ring we can't expect that much, but we still have a similar result: given a free module of finite type and a submodule of it, we can choose a basis of the free module adapted simultaneously to the submodule. We will prove this result in this post.

Recall that a principal ring $R$ is an integral domain in which every ideal is principal. As a consequence, $R$ is a unique factorization domain. Moreover, since every ideal is generated by a single element, $R$ is a Noetherian ring (every ideal is finitely generated). And so every module $M$ of finite type over $R$ is also Noetherian: $M$ is a quotient of a free $R$-module of finite rank, the latter is a direct sum of finitely many copies of the Noetherian module $R$, hence is itself Noetherian, and so its quotient $M$ is Noetherian. But in general $M$ is not Artinian, essentially because $R$ is an integral domain: for any non-zero non-unit $a$ there is a strictly descending sequence of ideals of infinite length, $(a)\supset(a^2)\supset(a^3)\supset\dots$.

But if we impose a condition on $M$, then we can get what we want: namely, if $M$ is torsion. Note that $M$ is of finite type, so we can write $M=\sum_{i=1}^n Rm_i$ (the sum is a finite sum), and thus we can construct a surjection from $R^n$ onto $M$ by sending $(a_1,a_2,\dots,a_n)$ to $\sum_i a_im_i$. Now denote $Ann_R(m)=\{a\in R\mid am=0\}$. This is easily seen to be an ideal of $R$, and thus principal. For the generators of $M$, write $Ann_R(m_i)=Rr_i$ with $r_i\neq 0$. So we have a surjection $R/Rr_1\times R/Rr_2\times\dots\times R/Rr_n\rightarrow M$. If we can show each term in the product is Artinian, then the whole product is also Artinian, and so is its quotient $M$. Note that in a descending sequence of submodules $R/Rr_i\supset Ra_1/Rr_i\supset Ra_2/Rr_i\supset\dots$ we must have $a_1|a_2|\dots|a_k|a_{k+1}|\dots|r_i$. But since $r_i$ has only finitely many divisors up to associates, the above sequence is stationary; hence $R/Rr_i$ is in fact Artinian, and so is $M$. Here we use the crucial fact that $M$ is torsion, which guarantees that these $r_i$ are non-zero.

So in summary we get the following result:

If $M$ is a torsion module of finite type over a principal ring, then $M$ is both Noetherian and Artinian.
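To see this concretely for $R=\mathbb{Z}$, here is a minimal sketch (the modulus $60$ and the helper names are just illustrative choices): submodules of $\mathbb{Z}/r\mathbb{Z}$ correspond to divisors of $r$, and a strictly descending chain of submodules is a divisibility chain of divisors, which is necessarily finite.

```python
# Submodules of Z/rZ correspond to divisors d of r (namely dZ/rZ), so a
# strictly descending chain of submodules is a chain d1 | d2 | ... | dk
# of distinct divisors of r -- always of bounded length.
def divisors(r):
    return [d for d in range(1, r + 1) if r % d == 0]

def longest_divisor_chain(r):
    # length of the longest chain of distinct divisors of r under divisibility,
    # i.e. the longest strictly descending chain of submodules of Z/rZ
    divs = divisors(r)
    best = {}
    for d in divs:  # divisors come in increasing order
        best[d] = 1 + max([best[e] for e in divs if e < d and d % e == 0],
                          default=0)
    return max(best.values())

r = 60
print(len(divisors(r)))          # 12 divisors, so every chain is finite
print(longest_divisor_chain(r))  # 5, e.g. 1 | 2 | 4 | 20 | 60
```

The chain bound is just the number of prime factors of $r$ counted with multiplicity, plus one, which is another way to see why torsion forces the Artinian property here.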

So in fact we have seen that there are two important parts of a module: the torsion part and the torsion-free part. Indeed, if we denote by $T(M)$ the set of elements of $M$ which are torsion, then we see that this is a submodule of $M$. Taking the quotient, we get an exact sequence $0\rightarrow T(M)\rightarrow M\rightarrow M/T(M)\rightarrow 0$. Then there are three things to do: study the two outer terms of the sequence, and see whether we can recover $M$ from them.

For the first step, we will study the structure of torsion-free modules. Before doing the real work, we can pause over the present question: does such a module over a principal ring resemble a vector space over a field? This is an interesting problem, and we will see that it is indeed the case, in a sense that will be clarified.

In fact, we will prove that we can define an invariant of a free module over a commutative ring; this invariant will be called the rank of the free module. Note that we emphasize that the module is free, because otherwise it is hard even to find a candidate number for an invariant. Let's first state the result:

Suppose that $R$ is a commutative ring. Then for any injective $R$-homomorphism $f:R^n\rightarrow R^m$ ($n,m\in\mathbb{N}$), we have $n\leq m$.

This result has a very nice proof using the method of the Cayley-Hamilton theorem; here I will just give the main idea (the proof itself is very short and simple). Suppose on the contrary that $n>m$. We can view $R^m$ as a submodule of $R^n$ (the inclusion sending the $m$ coordinates of $R^m$ to the first $m$ coordinates of $R^n$), so $f$ is in fact a homomorphism from $R^n$ to itself, a transformation of $R^n$, which is still injective. By the method used in the proof of the Cayley-Hamilton theorem, $f$ satisfies a monic polynomial equation $f^k+a_{k-1}f^{k-1}+\dots+a_1f+a_0=0$; choosing such an equation of minimal degree, and using the fact that $f$ is injective, we get $a_0\neq 0$ (if $a_0=0$ we could factor out one $f$ and obtain an equation of smaller degree). Now apply both sides of this equation to the vector $v=(0,0,\dots,0,1)$. Since $n>m$, the image of $f$ lies in the first $m$ coordinates, so the last coordinate of $f^i(v)$ is $0$ for every $i\geq1$; comparing the last coordinates of the two sides, we get $a_0=0$, a contradiction. Hence $n\leq m$.
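For $R=\mathbb{Z}$ we can watch the failure of injectivity numerically with sympy (the $2\times 3$ matrix below, i.e. a map $R^3\rightarrow R^2$, is an arbitrary illustration):

```python
from sympy import Matrix

# A homomorphism f: Z^3 -> Z^2 is given by a 2x3 integer matrix.  Its rank
# is at most 2 < 3, so its kernel contains a non-zero vector and f cannot
# be injective -- in line with the statement n <= m for injections R^n -> R^m.
f = Matrix([[1, 2, 3],
            [4, 5, 6]])
kernel = f.nullspace()   # a basis of ker(f)
print(len(kernel))       # 1: a non-zero kernel vector exists
print(f * kernel[0])     # the zero vector of Z^2
```

Here the kernel vector is $(1,-2,1)$, an explicit non-trivial relation among the columns.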

This is a very pleasant result: we have found an invariant of free modules over any commutative ring.

But what about the submodules of a free module? Are they similar to a free module? The following example shows that it is in general not the case.

Take the polynomial ring in one variable over the integers, $R=\mathbb{Z}[X]$, and consider one of its ideals (which is, of course, a module over $R$): $M=(3,X^2)$, the ideal generated by $3$ and $X^2$. Suppose that there is a free module $R^n$ of rank greater than $1$ and an injection $f:R^n\rightarrow M$. Consider the first two coordinate elements $e_1,e_2$, and suppose $f(e_1)=a_1$, $f(e_2)=a_2$. Then $f(a_2e_1-a_1e_2)=a_2a_1-a_1a_2=0$ since $f$ is an $R$-homomorphism. And obviously $a_2e_1-a_1e_2\neq 0$, contradicting the injectivity of $f$. So for an injection $f$ we must have $n\leq 1$. But $M$ is not isomorphic to $R$ either: a single generator of $M$ would have to divide both $3$ and $X^2$, hence be $\pm1$, yet $1\notin M$ (every element of $M$ has constant term divisible by $3$). Therefore $M$ doesn't have a free module structure.
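The key relation $f(a_2e_1-a_1e_2)=0$ can be checked symbolically with sympy, using the two generators of the ideal above (a minimal sketch; it only verifies the commutativity step of the argument):

```python
from sympy import symbols, expand

X = symbols('X')
a1, a2 = 3, X**2          # the two generators of M = (3, X^2) in Z[X]

# The image of a2*e1 - a1*e2 under f is a2*a1 - a1*a2, which vanishes
# because Z[X] is commutative -- so no injection R^n -> M exists for n >= 2.
relation = expand(a2 * a1 - a1 * a2)
print(relation)           # 0
```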

But if we give the ring $R$ more structure, things change a lot. For example, when $R$ is a principal ideal domain (or equivalently, as put above, a principal ring), then any submodule of a free module of finite type is again a free module. That is:

If $R$ is a principal ring, then any submodule of $R^n$ is a free module (of rank at most $n$).

The proof proceeds by induction on the rank $n$. For the base case $n=1$, note that a submodule of $R$ is the same as an ideal of $R$, thus a principal ideal, written as, say, $Rr$. Clearly $Rr$ is isomorphic either to $R$ (if $r\neq0$) or to $0$; in both cases $Rr$ is a free module. For the induction step, suppose that $M$ is a submodule of $R^{n+1}$, and consider the projection $p:R^{n+1}\rightarrow R$ sending an element of $R^{n+1}$ to its last component, so that $R^{n+1}=\ker(p)\bigoplus R$. We also get a decomposition of the submodule $M$ itself, namely $M\cong(\ker(p)\bigcap M)\bigoplus p(M)$. This decomposition is not as trivial as it first seems, so we construct a map. Note that $p(M)$ is an ideal of $R$, so we can write it as $Rr_0$, and we choose $m_0\in M$ with $p(m_0)=r_0$ (when $r_0=0$ the statement is immediate, since then $M\subset\ker(p)$). The point in constructing this isomorphism is to start from the direct-sum side, not the other direction: define $f:(\ker(p)\bigcap M)\bigoplus Rr_0\rightarrow M$, $(m,rr_0)\mapsto m+rm_0$. That this map is an $R$-isomorphism is easily verified. So now $\ker(p)\bigcap M$ is a submodule of the free module $\ker(p)\cong R^n$ of rank $n$, to which the induction hypothesis applies, and this proves the result.
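For $R=\mathbb{Z}$ this can be made algorithmic: a submodule of $\mathbb{Z}^n$ given by finitely many generators always has an echelon basis, computable by integer row reduction. Here is a hand-rolled sketch (the three generators below are an arbitrary example):

```python
def lattice_basis(gens):
    """Echelon Z-basis of the submodule of Z^n spanned by the given rows.

    Only unimodular row operations are used (adding an integer multiple of
    one row to another), so the span is preserved.
    """
    n = len(gens[0])
    rows = [list(g) for g in gens if any(g)]
    basis = []
    for col in range(n):
        while True:
            nz = [r for r in rows if r[col] != 0]
            if len(nz) <= 1:
                break
            nz.sort(key=lambda r: abs(r[col]))
            a, b = nz[0], nz[1]
            q = b[col] // a[col]
            for j in range(n):
                b[j] -= q * a[j]          # Euclidean step on this column
        nz = [r for r in rows if r[col] != 0]
        if nz:
            basis.append(nz[0])
            rows = [r for r in rows if r is not nz[0] and any(r)]
    return basis

# Three generators of a submodule of Z^2 collapse to a free basis of rank 2.
print(lattice_basis([(2, 4), (6, 8), (10, 12)]))   # [[2, 4], [0, -4]]
```

The resulting rows are triangular, hence independent, so they really form a free basis of the submodule.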

So now we have a very nice characterization of the submodules of a free module. We want to know whether all torsion-free modules have a free module structure. Perhaps the first step is to show that any such module is a submodule of some free module. In fact we have an elegant result:

If $R$ is a principal ring, then any torsion-free module of finite type over $R$ is a submodule of a free module of finite rank.

This fact is not hard to prove. Indeed, denote the finite set of generators of $M$ by $G=\{m_1,m_2,\dots,m_r\}$. We cannot expect a direct injection from $M=\sum_i Rm_i$ into $R^r$: the obstacle is that these generators need not be 'linearly independent'. So we have to find a subset of $G$ whose elements are $R$-independent, where $R$-independence can be characterized as the existence of an injection from some $R^k$ to the submodule generated by these elements. And of course we should consider a largest set of $R$-independent generators, which leads to the following: define the set of subsets of $G$ with $R$-independent elements, that is, $S=\{g\subset G\mid\exists f:R^{\#g}\rightarrow \sum_{m\in g}Rm,\ f\text{ injective}\}$. Using the inclusion relation as an order, there exists a maximal element $g$ of $S$, with a morphism $f:R^{\#g}\rightarrow\sum_{m\in g}Rm:=M'$ (note that this is in fact an isomorphism, since the elements of $g$ are $R$-independent). By the maximality of $g$, for any generator $m_i\in G-g$ there exists a non-zero element $a_i\in R$ such that $a_im_i\in M'$. So we define $a=\prod_{m_i\in G-g}a_i$, which is not zero since $R$ is integral, and set $F:M\rightarrow M'$, $m\mapsto am$. This homomorphism is injective since $M$ is torsion-free, and so we get an embedding $M\rightarrow R^{\#g}$.

Combining the above two results, we know that any torsion-free module of finite type over a principal ring is in fact a free module of finite rank.

So the structure of torsion-free modules of finite type over a principal ring is rather simple: they are just $R^n$ for some integer $n$.

So our next step is to study the structure of torsion modules over a principal ring. Recall that we have shown that modules of this kind are both Noetherian and Artinian, so in some sense we can invoke the Krull-Schmidt theorem on the decomposition of modules of finite length into indecomposable modules. By the way, looking back at the text above, we see that the indecomposable modules over a principal ring include $R$ itself. Now take the integer ring $R=\mathbb{Z}$ as an example: what are the indecomposable modules over $R$? It is easy to verify that the modules $R/(Rp^n)$ ($p$ prime, $n\in\mathbb{N}$) are indecomposable (we can check this by comparing the orders of elements). In a general principal ring $R$ there are also prime elements, namely the generators of the non-zero prime ideals. So we expect that the $R/(Rp^n)$ are again indecomposable modules over $R$ (where $p$ is a prime element of $R$). This is what the following proposition says:

Suppose that $R$ is a principal ring and $p$ is a prime element of $R$ (or $p=0$). Then $R/(Rp^n)$ (where $n$ is a positive integer) is an indecomposable module over $R$.

The problem is then reduced to showing that a module is indecomposable. Note that if the module $M$ is decomposable, say $M=M'\bigoplus M''$ with both parts non-zero, then consider the endomorphisms of $M$. Each endomorphism $f$ of $M$ is equivalent to four parts: a map $f_{11}$ from $M'$ to $M'$, a map $f_{12}$ from $M'$ to $M''$, a map $f_{21}$ from $M''$ to $M'$, and a map $f_{22}$ from $M''$ to $M''$. Now consider two maps: the first, $F\in End(M)$, with $F_{11}=id_{M'}$, $F_{12}=F_{21}=F_{22}=0$; the second, $G\in End(M)$, with $G_{22}=id_{M''}$, $G_{11}=G_{12}=G_{21}=0$. Neither $F$ nor $G$ is invertible, so the ideals $I_F,I_G$ of $End(M)$ they generate are contained in maximal ideals. But these two maximal ideals can't be identical, since otherwise that ideal would contain $F+G=id_{M}$, which is impossible. So if $M$ is decomposable, the ring $End(M)$ cannot be a local ring; contrapositively, if $End(M)$ is local then $M$ is indecomposable.

Now look at the particular module $M=R/(Rp^n)$. There is a chain of identifications $End_R(M)\cong End_{R/(Rp^n)}(M)\cong R/(Rp^n)=M$, which are easy to verify. So to show that $M$ is indecomposable it suffices to show that $M$ is a local ring (it indeed has a ring structure inherited from that of $R$). Note that an ideal of $M$ is always of the form $Rr/(Rp^n)$ for some element $r\in R$ with $r|p^n$. So we conclude easily that there is only one maximal ideal in $M$, namely $Rp/(Rp^n)$. The above argument is valid for $p\neq0$. It remains to show that $R$ itself is indecomposable. If not, we could write $R=Ra\bigoplus Rb$ with $a,b\neq0$. But this is impossible, since $Ra\bigcap Rb\supset Rab$, which is not zero because $R$ is integral, so the sum cannot be direct.

So after a long journey, we get to the conclusion that $R/(Rp^n)$ are indecomposable modules over $R$.

But recall that at the beginning of this post we saw that every torsion module of finite type over a principal ring $R$ is (a quotient of, and in fact isomorphic to) a finite product of modules of the form $R/(Ra)$. A simple application of the Chinese remainder theorem then shows that each $R/(Ra)$ is a finite product of modules of the form $R/(Rp^n)$ where $p$ is a prime element of $R$.
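For $R=\mathbb{Z}$ the Chinese remainder step can be checked directly (a minimal sketch; the modulus $60=4\cdot3\cdot5$ is an arbitrary example):

```python
# Z/60 is isomorphic to Z/4 x Z/3 x Z/5: the reduction map below is a
# bijection because the moduli 4, 3, 5 are pairwise coprime prime powers.
def crt_map(n):
    return (n % 4, n % 3, n % 5)

images = {crt_map(n) for n in range(60)}
print(len(images))   # 60: the map is injective, hence bijective
```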

In fact one corollary of the above argument is that, any indecomposable torsion module of finite type over a principal ring $R$ is of the form $R/(Rp^n)$($p$ as above).

So all in all, we know the structures of the torsion part and the torsion-free part of a module over a principal ring. Now return to the exact sequence for a module $M$ of finite type: $0\rightarrow T(M)\rightarrow M\rightarrow M/T(M)\rightarrow0$. Note that $M/T(M)$ is torsion-free, hence a free module over $R$ of finite rank. Therefore this sequence splits, that is, $M\cong T(M)\bigoplus M/T(M)$. Explicitly, any module $M$ of finite type over a principal ring $R$ is of the form $M=R^n\bigoplus R/(Rp_1^{n_1})\bigoplus R/(Rp_2^{n_2})\bigoplus\dots\bigoplus R/(Rp_k^{n_k})$ where $p_1,p_2,\dots,p_k$ are prime elements of $R$, not necessarily distinct. After rearranging these prime-power terms, we can write it in another form, $M=R^n\bigoplus R/(Ra_1)\bigoplus R/(Ra_2)\bigoplus\dots\bigoplus R/(Ra_l)$, where $a_l|a_{l-1}|\dots|a_2|a_1$.
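The regrouping of prime-power terms into a divisibility chain can be sketched for $R=\mathbb{Z}$ (the sample orders $12$ and $18$ are illustrative, and the factorization uses naive trial division): factor each cyclic order, then take for the largest invariant factor the biggest power of each prime, for the next the second biggest, and so on.

```python
from collections import defaultdict

def invariant_factors(orders):
    """Invariant factors of Z/o_1 + ... + Z/o_k, ascending (each divides the next)."""
    powers = defaultdict(list)
    for o in orders:
        d, rest = 2, o                 # factor o by trial division
        while rest > 1:
            if rest % d == 0:
                e = 0
                while rest % d == 0:
                    rest //= d
                    e += 1
                powers[d].append(e)
            d += 1
    for p in powers:
        powers[p].sort(reverse=True)   # biggest exponent of each prime first
    width = max(len(v) for v in powers.values())
    factors = []
    for i in range(width):
        a = 1
        for p, exps in powers.items():
            if i < len(exps):
                a *= p ** exps[i]
        factors.append(a)
    return sorted(factors)

# Z/12 + Z/18 = Z/4 + Z/3 + Z/2 + Z/9 has invariant factors 6 | 36
print(invariant_factors([12, 18]))   # [6, 36]
```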

We see that every module of finite type over a principal ring decomposes completely into indecomposable summands. But note that the concepts of indecomposable module and simple module are still not the same: $R/(Rp^2)$ is indecomposable but not simple, since it contains the proper submodule $Rp/(Rp^2)$.

Now the last thing to do is to prove the result mentioned at the beginning of this post, that is:

Suppose that $R$ is a principal ring, $M=R^n$ a free module over $R$, and $N$ a submodule of $M$. Then we can find a basis $(e_1,e_2,\dots,e_n)$ of $M$ and a sequence of elements $(d_1,d_2,\dots,d_n)$ of $R$ with $d_n|d_{n-1}|\dots|d_2|d_1$, such that $M=\sum_i Re_i$ and $N=\sum_i Rd_ie_i$.

The proof again proceeds by induction on the rank $n$. The base case is very easy. For the induction step, suppose that $M=R^{n+1}$; we have to find a way to reduce to the rank-$n$ case. Suppose for a moment that we already know the result, that is, $M=\sum_i Re_i$, $N=\sum_i Rd_ie_i$. Then any morphism $f:M\rightarrow R$ satisfies $f(x)=f(\sum_i r_id_ie_i)=d_nf(\sum_i r_id_i'e_i)$ for every $x=\sum_i r_id_ie_i\in N$ (where $d_i=d_i'd_n$). So the element $d_n$ is, in some sense, independent of the choice of coordinates: it can be characterized by morphisms. This is a very important observation. Continue this line of thought: which map $f:M\rightarrow R$ can take the value $d_n$? Can we characterize this map? We can characterize it using $d_n$: $f$ is the morphism we are after if and only if $f(N)=Rd_n$. This is like an extreme point, in some sense the largest map. To give a precise definition, consider $S=\{g(N)\mid g:M\rightarrow R\text{ is a morphism}\}$. These $g(N)$ are ideals of $R$, and since $R$ is Noetherian we can choose a maximal one, say $f(N)=Rd_n\in S$. Suppose further that $f(r_n)=d_n$ for some $r_n\in N$. We want to show that for any other morphism $g:M\rightarrow R$, the value of $g$ at the point $r_n$ is 'greater' than $d_n$, that is, $d_n|g(r_n)$. For simplicity, denote $g(r_n)=d$, and consider the gcd of $d_n$ and $d$: $d'=\gcd(d_n,d)=ad_n+bd$ (the last identity is due to Bezout's lemma). The new morphism $g'=af+bg$ has an image of $N$ at least as large as that of $f$: indeed, $g'(N)\supset Rd'\supset Rd_n$. By the maximality of $f(N)$, these three ideals must be equal, thus $d_n|d'$, which implies $d_n|d$. Note also that $d_n$ cannot be zero as long as $N\neq0$ (some coordinate projection is already non-zero on $N$). Next we want to divide $r_n$ by $d_n$. This can be done by considering the projections $p_i:M\rightarrow R$ onto the $i$-th component: the values of these maps at $r_n$ are all divisible by $d_n$. That is, if we write $r_n=\sum_i a_ie_i$ in the canonical coordinate form, then $d_n|p_i(r_n)=a_i$, so $r_n$ is divisible by $d_n$. Thus we can write $r_n=d_nr_n'$, and therefore $d_nf(r_n')=d_n$, which implies, since $R$ is integral, that $f(r_n')=1$. So we can define a map $F:M\rightarrow \ker(f)\bigoplus Rr_n'$, $m\mapsto(m-f(m)r_n',f(m)r_n')$. $F$ is well defined and is a morphism, so after a simple verification we get decompositions $M=\ker(f)\bigoplus Rr_n'=M'\bigoplus Rr_n'$ and $N=(\ker(f)\bigcap N)\bigoplus Rr_n=N'\bigoplus Rr_n$. Now $M'$ is a free module of rank $n$ and $N'$ is one of its submodules, so by the induction hypothesis there exist a basis $(e_0,\dots,e_{n-1})$ of $M'$ and elements $(d_0,\dots,d_{n-1})\in R^n$ with $d_{n-1}|\dots|d_0$ such that $M'=\sum_i Re_i$, $N'=\sum_i Rd_ie_i$. To show that $d_n|d_{n-1}$, we define a 'smart' morphism $G:M=M'\bigoplus Rr_n'\rightarrow R$, $(\sum a_ie_i,rr_n')\mapsto a_{n-1}+r$. Restricting to $N$, we get $G':N=N'\bigoplus Rr_n\rightarrow R$, $(\sum a_id_ie_i,rr_n)\mapsto a_{n-1}d_{n-1}+rd_n$. One value that $G'$ can take is obviously $d_{n-1}$. What is more, $G'(N)\supset Rd_n$ since $G'(r_n)=d_n$, so by the maximality of $Rd_n$ we get $G'(N)=Rd_n$; thus $d_{n-1}\in G'(N)=Rd_n$, which implies $d_n|d_{n-1}$. This completes the proof, and with it this post.
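In practice, for $R=\mathbb{Z}$, the adapted basis and the elements $d_i$ come out of the Smith normal form; sympy exposes this computation (a sketch: the matrix below, whose rows generate a submodule of $\mathbb{Z}^2$, is an arbitrary example, and some sympy versions leave signs unnormalized on the diagonal).

```python
from sympy import Matrix
from sympy.matrices.normalforms import smith_normal_form

# The rows generate N = <(2,4), (6,8)> inside Z^2.  The Smith normal form
# is diagonal, and its diagonal entries (up to sign, here 2 and 4) are the
# elements d_i of the theorem: in a suitable basis (e_1, e_2) of Z^2 we
# have N = 2*R*e_1 + 4*R*e_2, with 2 | 4.
A = Matrix([[2, 4],
            [6, 8]])
print(smith_normal_form(A))
```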

# quadratic forms in two variables over the integers

The first part of this series of posts concentrates on quadratic forms in two variables over the integers.

A quadratic form $q$ over the integer ring $\mathbb{Z}$ is a map $q:\mathbb{Z}^2\rightarrow\mathbb{Z}$ given by a homogeneous polynomial of degree two: $q(x,y)=ax^2+bxy+cy^2$ ($a,b,c,x,y\in\mathbb{Z}$) is the general expression for a quadratic form in two variables.

We say that two quadratic forms $q_1,q_2$ are equivalent if there is an invertible matrix $P\in GL_2(\mathbb{Z})$ such that $q_1(x,y)=q_2((x,y)P)$ (here we take the convention that $q(x,y)=q((x,y))$: the latter $(x,y)$ is an element of $\mathbb{Z}^2$, while in the first case we just take the corresponding coordinates). Furthermore, we say that they are properly equivalent if this matrix has determinant $1$. We can verify that these two relations are indeed equivalence relations. So the classification problem arises: classify these quadratic forms under one of these equivalence relations.

We have to find some invariants in these equivalent relations. One of them is the discriminant of a quadratic form. Suppose that a quadratic form is $q(x,y)=ax^2+bxy+cy^2$, then we define the discriminant of this quadratic form to be $D=b^2-4ac$. It can be verified easily that the discriminant of a quadratic form doesn’t change under an invertible transformation. So we have a necessary condition for two quadratic forms to be equivalent: their discriminants are the same.
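A quick numeric check of this invariance (the form $(2,3,5)$ and the unimodular matrices below are arbitrary choices; the substitution convention in the helper is $(x,y)\mapsto(px+qy,\,rx+sy)$):

```python
def transform(form, P):
    """Coefficients of the form obtained from q = (a, b, c) by the
    substitution (x, y) -> (p*x + q*y, r*x + s*y), P = [[p, q], [r, s]]."""
    a, b, c = form
    p, q, r, s = P[0][0], P[0][1], P[1][0], P[1][1]
    return (a*p*p + b*p*r + c*r*r,
            2*a*p*q + b*(p*s + q*r) + 2*c*r*s,
            a*q*q + b*q*s + c*s*s)

def disc(form):
    a, b, c = form
    return b*b - 4*a*c

q0 = (2, 3, 5)
for P in ([[1, 1], [0, 1]], [[0, -1], [1, 0]], [[2, 1], [1, 1]]):
    # each P has determinant 1, and the discriminant is unchanged
    print(disc(transform(q0, P)) == disc(q0))   # True each time
```

In general the discriminant picks up a factor $\det(P)^2$, which is $1$ for any invertible integer matrix, and that is exactly why it is an invariant.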

Another property which is invariant under invertible transformations is the set of integers that can be represented by a quadratic form. We say that an integer $u$ can be represented by a quadratic form $q$ if there are two integers $x,y$ such that $q(x,y)=u$. An invertible transformation acting on $q$ is the same as a base change on $\mathbb{Z}^2$. So we see that it is indeed the case that the set of integers represented by a quadratic form is invariant under invertible transformations.

To solve the classification problem, we propose finding some 'minimal' form, in a sense to be clarified, in each equivalence class. We say that a quadratic form $q=(a,b,c)$ (from now on, for convenience, we write $(a,b,c)$ for the quadratic form $q(x,y)=ax^2+bxy+cy^2$) is Lagrange-reduced if the coefficients satisfy $-|a|<b\leq|a|\leq|c|$. One important result is that every quadratic form is equivalent to some Lagrange-reduced form. In fact, there even exists a nice algorithm for this process. Before that, we must say a few words about some elementary invertible transformations.

It is readily seen (we write $\approx$ for the equivalence relation and $\approx'$ for the proper equivalence relation) that for a quadratic form $(a,b,c)$ we have $(a,b,c)\approx(a,-b,c)$, and $(a,b,c)\approx'(c,-b,a)$, $(a,b,c)\approx'(a,b+2a,c+b+a)$, $(a,b,c)\approx'(a,b-2a,c-b+a)$. These transformations are called elementary transformations. They are going to be the building blocks of several results later on.

Now take a quadratic form, which (without loss of generality, or by using the above elementary transformations) we can assume to be $(a,b,c)$ with $a\neq 0$. Using the relations $(a,b,c)\approx'(a,b-2a,c-b+a)$ and $(a,b,c)\approx'(a,b+2a,c+b+a)$ repeatedly, we can always assume that $b$ lies in the interval $(-|a|,|a|]$. Now if $0<|c|<|a|$, we apply the transformation $(a,b,c)\approx'(c,-b,a)$ and repeat; since $|a|$ strictly decreases at each swap, the process terminates, either with a Lagrange-reduced form or with a form of the shape $(a,b,0)$. In the latter case the discriminant is $D=b^2$. So if we assume that the discriminant of the quadratic form is not a square, then this quadratic form must be properly equivalent to a Lagrange-reduced form.
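For negative discriminant and $a>0$ the reduction loop just described can be written out (a sketch; the starting form $(6,5,2)$, of discriminant $-23$, is an arbitrary example):

```python
def lagrange_reduce(a, b, c):
    """Reduce a positive definite integer form (a, b, c) to a Lagrange-reduced one."""
    D = b*b - 4*a*c
    assert D < 0 and a > 0
    while True:
        if not (-a < b <= a):
            # translate b into (-a, a] using (a,b,c) ~' (a, b±2a, ...)
            b = a - ((a - b) % (2 * a))
            c = (b*b - D) // (4 * a)   # recompute c from the (fixed) discriminant
        if a <= c:
            return (a, b, c)
        a, b, c = c, -b, a             # the swap (a,b,c) ~' (c,-b,a)

print(lagrange_reduce(6, 5, 2))   # (2, -1, 3), still of discriminant -23
```

Termination is exactly the argument in the text: each swap strictly decreases $a$, and the translation step never changes it.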

Another way to show the result is to introduce the concept of primitive representation, which is also an important concept in its own right. We say that an integer $u$ is primitively represented by a form $q$ if there exists a pair of coprime integers $x,y$ such that $u=q(x,y)$. Suppose that a non-zero integer $n$ is primitively represented by $q$ through a pair of integers $u,u'$ (that is to say, $n=q(u,u')=au^2+buu'+cu'^2$). Then by Bezout's lemma we can find another pair of integers $v,v'$ such that $uv'-u'v=1$. We define a new quadratic form $q'(x,y)=q(xu+yv,xu'+yv')$ $=a(xu+yv)^2+b(xu+yv)(xu'+yv')+c(xu'+yv')^2$ $=(au^2+buu'+cu'^2)x^2+\dots=nx^2+\dots$. So all in all, $(a,b,c)\approx'(n,b',c')$. Using again the elementary transformations above, we can assume that $-|n|<b'\leq|n|$. In other words, if $n$ can be primitively represented by a form, then this form is properly equivalent to a form $(n,b',c')$ with $-|n|<b'\leq|n|$. Now consider the set of absolute values of the non-zero integers represented by the quadratic form $q$, and suppose $a$ realizes the smallest one; then we can readily see that $a$ or $-a$ (write it as $\epsilon a$) is primitively represented by $q$. Then $q\approx'q'=(\epsilon a,b,c)$ with $-|a|<b\leq|a|$. Since obviously $c$ can be represented by $q'$, and if we assume that $D$ is not a square then $c$ can't be zero (easily verified), the minimality of $|a|$ gives $|c|\geq|a|$, which completes the proof.

Now we spare some time for the case when $D$ is a square. Suppose the form is $q=(a,b,c)$ with $D=b^2-4ac=k^2$ ($k\geq0$). If $k\geq1$ and $a\neq0$, then the equation $ax^2+bx+c=0$ has rational solutions; suppose one of them is $\frac{u}{w}=\frac{-b+k}{2a}$ with $u,w$ coprime. Then we can find another pair of integers $t,v$ such that $ut-wv=1$, and consider the new form $q'(x,y)=q(xu+yv,xw+yt)$. These two are properly equivalent, which is not hard to see. After some calculation, the magical result is that $q'$ looks like $q'=(0,-k,c')$ for some $c'$. So after some elementary transformations we get $q\approx'(0,k,c'')$ with $0\leq c''<k$. If $k>0$ and $a=0$, then if $c\neq0$ we can exchange the places of $a$ and $c$ and proceed as above; if $c$ is also zero, then we are done. If $k=0$, we have $b^2=4ac$, so $ac$ is a square and (taking $b=2\sqrt{ac}$, the case $b=-2\sqrt{ac}$ being similar) $q(x,y)=ax^2+2\sqrt{ac}\,xy+cy^2=(\sqrt{a}x+\sqrt{c}y)^2$. Since $ac$ is a square, we can write $a=d^2f$, $c=e^2f$ where $f$ is a product of distinct primes. Furthermore, let $g=\gcd(d,e)$ and write $d=d'g$, $e=e'g$. Then $q(x,y)=fg^2(d'x+e'y)^2$. Since $d',e'$ are coprime, by Bezout's lemma we can find another pair of integers $u,v$ such that $d'u-e'v=1$. Defining the new form $q'(x,y)=q(ux+e'y,-vx-d'y)$, we get $q'(x,y)=fg^2x^2$, so $q\approx(fg^2,0,0)$. What is more, it follows that this kind of reduced form is unique.

In the above, we have shown that each quadratic form is properly equivalent to some Lagrange-reduced form. We may then wonder whether we can replace 'some' by 'one unique'. The answer is NO: two Lagrange-reduced forms can be properly equivalent. For example, the forms $(1,1,-1)$ and $(-1,1,1)$ are properly equivalent: indeed, $(1,1,-1)\approx'(-1,-1,1)\approx'(-1,1,1)$.

But when the discriminant $D$ is negative, things become easier. Note that when $D<0$, the quadratic form is definite (either positive or negative). So, to simplify things, we always assume that the form is positive definite; in other words, in $(a,b,c)$ we have $a>0$ (and then $c>0$ as well). What is more, in the case $a=c$ we have $(a,b,a)\approx'(a,-b,a)$, so two Lagrange-reduced forms can still be properly equivalent, which is a bit annoying. To get some uniqueness we have to rule this kind of thing out, so we demand that $b\geq0$ in the case $a=c$. In sum, we can define another type of reduced form, the Gauss-reduced forms: a quadratic form $(a,b,c)$ of negative discriminant is said to be Gauss-reduced if $-a<b\leq a\leq c$, and moreover $b\geq0$ whenever $a=c$.

Gauss proved the following result: a positive definite quadratic form of negative discriminant is properly equivalent to one and only one Gauss-reduced form.

The proof is not very hard, but we will not present it here.

Another important concept in the theory of quadratic forms is that of primitive forms. As the name suggests, a form $q=(a,b,c)$ is said to be primitive if $\gcd(a,b,c)=1$. Clearly primitivity is also a property that is invariant under invertible transformations. We denote by $P(D)$ the set of proper equivalence classes of primitive quadratic forms of discriminant $D$, and by $h(D)$ the cardinality of $P(D)$. If $D\equiv 0\ (mod\ 4)$, then $(1,0,-D/4)$ is a primitive form of discriminant $D$; if $D\equiv 1\ (mod\ 4)$, then $(1,1,\frac{1-D}{4})$ is a primitive form of discriminant $D$. So $P(D)$ is never empty. But the problem is how $h(D)$ varies with $D$.
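Since each class of negative discriminant contains exactly one Gauss-reduced form, $h(D)$ can be computed by direct enumeration (a sketch; $D=-23$, with $h(-23)=3$, is an illustrative value, and the bound $3a^2\leq|D|$ follows from the reduction inequalities):

```python
from math import gcd

def class_number(D):
    """Count Gauss-reduced primitive forms (a, b, c) of discriminant D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    h = 0
    a = 1
    while 3 * a * a <= -D:              # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b*b - D) % (4 * a) == 0:
                c = (b*b - D) // (4 * a)
                if c >= a and gcd(gcd(a, abs(b)), c) == 1:
                    if a == c and b < 0:
                        continue        # demand b >= 0 when a = c
                    h += 1
        a += 1
    return h

print(class_number(-23))    # 3: the classes of (1,1,6), (2,1,3), (2,-1,3)
print(class_number(-163))   # 1
```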

Gauss conjectured that the only negative values of $D$ with $h(D)=1$ are $-3,-4,-7,-8,-11,-12,-16,-19,-27,-28,-43,-67,-163$. This conjecture was solved by Heegner, Baker and Stark. These numbers are closely related to the Heegner numbers $1,2,3,7,11,19,43,67,163$. A classical result in this direction (the Rabinowitsch criterion, tied to Heegner's theorem) is that for $D$ among $7,11,19,43,67,163$, the polynomial $x^2+x+\frac{1+D}{4}$ takes prime values for all $0\leq x\leq\frac{1+D}{4}-2$ precisely because $h(-D)=1$. Note that the discriminant of this polynomial is just $-D$.
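The classic instance is $D=163$, where $\frac{1+D}{4}=41$ gives Euler's polynomial $x^2+x+41$ (a quick check; the primality test is naive trial division):

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Euler's polynomial x^2 + x + 41: prime for all 0 <= x <= 41 - 2 = 39 ...
values = [x*x + x + 41 for x in range(40)]
print(all(is_prime(v) for v in values))   # True

# ... but not beyond: at x = 40 it gives 1681 = 41^2.
print(is_prime(40*40 + 40 + 41))          # False
```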

# Retina

The retina is the light-sensitive inner surface of the eye of an animal.

In the inner surface of the retina reside the retinal ganglion cells. These are a special type of neuron which receives visual information from the photoreceptors through intermediate neurons such as the bipolar cells and the amacrine cells. The defining characteristic of the retinal ganglion cell is its very long axon, which extends directly to the brain. So there is a chain of connections between the retinal ganglion cells and the photoreceptors. In a human being, each retina has about one million retinal ganglion cells and about one hundred million photoreceptors, but the ratio is not uniform. In the center of the retina, each retinal ganglion cell corresponds to five or six photoreceptors, whereas in the peripheral area each retinal ganglion cell receives visual information from several thousand photoreceptors. This, in some sense, is reasonable, because visual information is not uniformly distributed either: the central area usually carries the most detail of a scene, so each ganglion cell there handles only a few photoreceptors.

The role of the retinal ganglion cells is to process the visual information. What does this mean? It means that these neurons already start to distinguish motion in the visual input. More precisely, it is in the inner surface of the retina that horizontal motion is separated from vertical motion, as well as from other motions, like moving forward or moving backward. This is very important for all animals. For example, when there is a predator in front of an animal, if the predator moves a bit, the animal recognizes this at once and can prepare for an escape. This process is automatic: a moving creature is seen, the visual information is sent to the brain, and the brain decides to take a corresponding action after a series of action potentials and chemical reactions. Viewed this way, the simple actions of animals look spontaneous, at least once triggered by a stimulus from the outside world.

But this is not exactly the case; the reality is more complex than that. For example, a cat can move even without any external stimulus. So in some sense the brain is an autonomous machine: there are fluctuations in the brain, which act as internal stimuli. But which kinds of system can have such fluctuations? This is an interesting question. We know that even microbes live at a scale far above that of atoms, the latter being the constituents of all the large molecules, like DNA, proteins, carbohydrates, etc. But this is obviously not enough to create a living being. Put another way, just putting together many DNA molecules, proteins, and all sorts of things that exist in the body of an animal will not lead to the creation of such an animal. That is to say, fluctuation in the sense of thermodynamics alone is not enough. So we must have forgotten something. Or perhaps it is just that I haven't read enough.

# the geometry of numbers

The geometry of numbers is a branch of number theory created by Minkowski.

The idea of this theory is to consider the intersection of some set with a lattice in the Euclidean space $\mathbb{R}^n$, in order to deduce the existence (or non-existence) of numbers with particular properties.

Start with the basics. Suppose $L$ is a lattice in $\mathbb{R}^n$ (i.e. a discrete subgroup of the additive group of $\mathbb{R}^n$, without accumulation point, which, viewed as a set of vectors in $\mathbb{R}^n$, generates the whole space). Since $L$ is discrete and spans $\mathbb{R}^n$, it is a finitely generated abelian group, and using the fact that $\mathbb{Z}$ is a principal ring, we see that $L$ has a finite set as $\mathbb{Z}$-basis (what is more, the cardinality of this set is independent of the choice of basis; it is an invariant of the abelian group under isomorphism). That is, we can write $L=\mathbb{Z}e_1\oplus\mathbb{Z}e_2\oplus\dots\oplus\mathbb{Z}e_n$. What is more, for any other basis $\{f_1,f_2,\dots,f_n\}$ of $L$, the matrix expressing the $f_i$ in terms of the $e_j$ lies in the group $GL_n(\mathbb{Z})$. After choosing the canonical basis of $\mathbb{R}^n$, we can calculate the determinant of the matrix $(e_1,\dots,e_n)$. The absolute value of this determinant is precisely the volume of a fundamental domain of the lattice $L$, or in other words the volume of $\mathbb{R}^n/L$ equipped with the quotient measure (we denote it by $covol(L)$, the co-volume of $L$). This value is, in some sense, a generator of the image of the map $f:L^n\rightarrow\mathbb{R}$, $(a_1,\dots,a_n)\mapsto\det(a_1,\dots,a_n)$: the image is exactly $covol(L)\cdot\mathbb{Z}$.

The fundamental result in the geometry of numbers concerns the intersection of a convex set with a lattice. We say a convex subset $C\subset\mathbb{R}^n$ is symmetric if for all $x\in C$ we have $-x\in C$. So if $x,y\in C$, then $\frac{x-y}{2}\in C$ (it is the midpoint of $x$ and $-y$). This is a really simple fact, but it is also a really important observation. Now we are ready to state the theorem of Minkowski:

Suppose that $C$ is a convex symmetric subset of $\mathbb{R}^n$, and $L$ is a lattice of the latter. Suppose moreover that $covol(L)<\frac{vol(C)}{2^n}$. Then there is a non-zero element $x\in L$ which also lies in $C$. That is to say, $\#(C\bigcap L)>1$.

The point of the proof is to express the above inequality as a comparison between two volumes. Noting that if we dilate the lattice $L$ by $2$, we have $covol(2L)=2^n covol(L)$, the inequality turns into $covol(2L)<vol(C)$. We claim that there are two distinct elements $x\neq y\in C$ with $x-y\in 2L$; then half their difference $\frac{x-y}{2}$ lies in $C$ by the observation above. If this were not the case, then after choosing a measurable set $X$ of representatives of $\mathbb{R}^n/2L$ in $\mathbb{R}^n$, we can express $C=\bigcup_{x\in 2L}(C\bigcap(X+x))$, and so $C'=\bigcup_{x\in 2L}((C\bigcap(X+x))-x)$ has the same volume as $C$ (the translated pieces are disjoint precisely because no two points of $C$ differ by an element of $2L$). But $C'$ is now a measurable subset of $X$ (because $C$ is measurable), so we have $vol(C)\leq covol(2L)$, which contradicts the assumption. So we get a non-zero element $\frac{x-y}{2}\in C\bigcap L$, which proves the theorem.
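The theorem is easy to instantiate by brute force in $\mathbb{R}^2$. A sketch under illustrative assumptions (the lattice, the radius, and the helper name `minkowski_point` are mine, not from the text): the lattice below has co-volume $5$, and a disk of area $>4\cdot 5=20$, e.g. radius $2.6$ with $\pi\cdot 2.6^2\approx 21.2$, must contain a non-zero lattice point.

```python
import itertools

def minkowski_point(basis, radius):
    """Search for a nonzero point of the lattice spanned by `basis`
    inside the open disk of the given radius (a symmetric convex set)."""
    for i, j in itertools.product(range(-10, 11), repeat=2):
        if (i, j) == (0, 0):
            continue
        x = i * basis[0][0] + j * basis[1][0]
        y = i * basis[0][1] + j * basis[1][1]
        if x * x + y * y < radius * radius:
            return (x, y)
    return None

basis = [(2, 1), (1, 3)]        # covolume |2*3 - 1*1| = 5
p = minkowski_point(basis, 2.6)
print(p)                        # a nonzero lattice point, guaranteed by Minkowski
```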

Now we come to the applications of this fundamental result. As a first example, we consider a theorem of Fermat: a prime number $p\equiv 1\pmod 4$ if and only if $p$ can be written as a sum of two squares of integers, $p=a^2+b^2(a,b\in\mathbb{Z})$. Note that $p\equiv 1\pmod 4$ is the same as saying that $-1$ is a quadratic residue mod $p$. So we can find some integer $u$ such that $u^2\equiv -1\pmod p$. We will use this $u$ to construct a lattice $L$ in $\mathbb{R}^2$: we define $L=\{(a,b)\in\mathbb{Z}^2 : a\equiv ub\pmod p\}$. Clearly this is a lattice (a routine way of showing this is $p\mathbb{Z}^2\subset L\subset\mathbb{Z}^2$). What is more, the index of $L$ in $\mathbb{Z}^2$ is $p$: just consider the group homomorphism $\phi:\mathbb{Z}^2\rightarrow\mathbb{Z}/p\mathbb{Z}, (x,y)\mapsto x-uy$, which is surjective, with kernel $ker(\phi)=L$, so we have $\#\mathbb{Z}^2/L=p$. And thus $covol(L)=p\cdot covol(\mathbb{Z}^2)=p$. Now consider the disk $D=\{(x,y)\in\mathbb{R}^2 : x^2+y^2<2p\}$. This disk is clearly convex and symmetric, with volume $vol(D)=2p\pi$. The assumptions of the theorem are satisfied, just by noting that $p<\frac{2p\pi}{2^2}$, so there is $0\neq x=(a,b)\in L\bigcap D$. Since $a\equiv ub\pmod p$, we have $a^2\equiv u^2b^2\equiv -b^2\pmod p$; in other words, $a^2+b^2=np$ for some integer $n\geq 0$. But recalling the definition of $D$, we have $0<a^2+b^2<2p$ (the left inequality because $(a,b)\neq(0,0)$), so the only possibility is $a^2+b^2=p$.
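The construction above is fully effective and can be turned into a short program. A sketch (the helper name `two_squares` and the search strategy are mine; $u$ is produced by Euler's criterion, as in the text):

```python
def two_squares(p):
    """Write a prime p ≡ 1 (mod 4) as a^2 + b^2, following the lattice
    construction: find u with u^2 ≡ -1 (mod p), then search the lattice
    L = {(a, b) : a ≡ u*b (mod p)} inside the disk x^2 + y^2 < 2p."""
    assert p % 4 == 1
    # For a quadratic non-residue g, u = g^((p-1)/4) satisfies u^2 ≡ -1 (mod p).
    g = next(g for g in range(2, p) if pow(g, (p - 1) // 2, p) == p - 1)
    u = pow(g, (p - 1) // 4, p)
    # Minkowski guarantees a lattice point with a^2 + b^2 = p.
    for b in range(1, p):
        a = (u * b) % p
        a = min(a, p - a)          # representative of smallest absolute value
        if a * a + b * b == p:
            return a, b

print(two_squares(13))   # (3, 2): 9 + 4 = 13
print(two_squares(101))  # (10, 1): 100 + 1 = 101
```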

So by carefully choosing a lattice, we proved this theorem of Fermat. This is a small success of the geometry of numbers. In fact, the above reasoning can be generalized to quadratic forms like $a^2+db^2$ where $d$ is a positive integer.

The following is one such result:

Suppose that $d>0$ is an integer such that $-d$ is a quadratic residue mod $p$. Then at least one of the numbers $p,2p,...,hp$ (where $h$ is the largest integer such that $h\leq\frac{4\sqrt d}{\pi}$) can be expressed in the form $a^2+db^2$.
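This statement can be verified by brute force for small values. A sketch, assuming the bound $h=\lfloor 4\sqrt d/\pi\rfloor$ as stated (the helper name `represent` is mine):

```python
import math

def represent(d, p):
    """Brute-force check of the generalization: if -d is a quadratic
    residue mod p, some kp with 1 <= k <= h = floor(4*sqrt(d)/pi)
    equals a^2 + d*b^2.  Returns (a, b, k), or None if nothing found."""
    h = int(4 * math.sqrt(d) / math.pi)
    for k in range(1, h + 1):
        b = 0
        while d * b * b <= k * p:
            r = k * p - d * b * b
            s = math.isqrt(r)
            if s * s == r:
                return s, b, k
            b += 1
    return None

# -5 ≡ 1 (mod 3) is a quadratic residue mod 3; here h = 2, and indeed
# 2 * 3 = 1^2 + 5 * 1^2.
print(represent(5, 3))   # (1, 1, 2)
```

Note that $d=1$ recovers the two-squares theorem, with $h=\lfloor 4/\pi\rfloor=1$: only $p$ itself appears.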

# the law of quadratic reciprocity

Finding the roots of a polynomial over some ring is always an interesting subject in mathematics. To find the number of roots of a polynomial over the complex number field is trivial. To find such a number over the real number field is a bit more difficult, but there are various algorithms for this problem; for example, the Sturm sequence is implemented in most computer algebra systems. Another case is that of finite fields, or fields of characteristic $>0$; the problem in this case is more interesting and also more fruitful.

For example, consider the equation $x^2=a$ in the field $\mathbb{F}_p$. Not every value $a$ in this field makes this equation solvable. We call $a$ a quadratic residue when the corresponding equation has solutions in $\mathbb{F}_p$. Most often we consider the case $a\neq 0$, and thus we can define the Legendre symbol $(\frac{a}{p})$ (of course this symbol is defined for all integers). Using Fermat's little theorem, we can easily show Euler's criterion:

$(\frac{a}{p})\equiv a^{(p-1)/2}\pmod p$

Using the homomorphism

$\phi:\mathbb{F}_p^*\rightarrow\{1,-1\}, x\mapsto x^{(p-1)/2}$

we can conclude that the product of two quadratic residues or two non-quadratic residues is again a quadratic residue, while the product of a quadratic residue and a non-quadratic residue is a non-quadratic residue.
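Euler's criterion gives an immediate way to compute the Legendre symbol, and the multiplicativity just described can be checked directly. A minimal sketch (the function name `legendre` is mine):

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion:
    (a/p) ≡ a^((p-1)/2) (mod p)."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t   # t is 0, 1, or p-1

p = 11
residues = sorted({x * x % p for x in range(1, p)})
print(residues)                        # squares mod 11: [1, 3, 4, 5, 9]
print(legendre(3, p), legendre(2, p))  # 1 -1

# Multiplicativity: the product of two non-residues is a residue.
assert legendre(2 * 7, p) == legendre(2, p) * legendre(7, p) == 1
```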

So one question arises: when is an integer $a$ a quadratic residue modulo $p$? The law of quadratic reciprocity resolves this problem (completely, in some sense). The following is the content of the theorem:

Suppose that $p,q$ are two distinct odd primes; then $(\frac{p}{q})(\frac{q}{p})=(-1)^{\frac{p-1}{2}\frac{q-1}{2}}$. And for the case of $2$, we have $(\frac{2}{p})=(-1)^{\frac{p^2-1}{8}}$.
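Both formulas are easy to verify numerically over a range of small primes. A sketch using Euler's criterion again (the loop and names are mine):

```python
def legendre(a, p):
    """Euler's criterion; p an odd prime not dividing a."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else 1

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23]
for i, p in enumerate(odd_primes):
    for q in odd_primes[i + 1:]:
        lhs = legendre(p, q) * legendre(q, p)
        rhs = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
        assert lhs == rhs
    # Supplement for 2: (2/p) = (-1)^((p^2 - 1)/8).
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)
print("quadratic reciprocity verified on", len(odd_primes), "primes")
```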

Another problem concerning finite fields is to find a generator of the multiplicative group of such a field. A conjecture of Artin says that for an integer $a$ which is neither a perfect square nor equal to $-1$, there are infinitely many primes $p$ such that $a$ is a generator of the group $\mathbb{F}_p^*$.
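Whether a given $a$ generates $\mathbb{F}_p^*$ is easy to test: $a$ is a generator iff $a^{(p-1)/q}\neq 1$ for every prime $q$ dividing $p-1$. A sketch listing the first primes for which $a=2$ works (the helper names are mine):

```python
def prime_factors(n):
    """Set of prime factors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(a, p):
    """a generates F_p^* iff a^((p-1)/q) != 1 mod p for each prime q | p-1."""
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

primes = [p for p in range(3, 200)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
good = [p for p in primes if is_primitive_root(2, p)]
print(good[:8])   # [3, 5, 11, 13, 19, 29, 37, 53]
```

Artin's conjecture predicts that a list like `good` never stops growing; this is still open for every single value of $a$.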