Finite Groups-Part 1

In finite group theory, there is an interesting fact due to Brauer. First of all, some notations. We denote S_n the symmetry group acting on n numbers \{1,2,...,n\}. For any field k, there is a natural action of S_n on the vector space k^n, that is, the action on the canonical base \{e_1,...,e_n\} is just as follows: the element g\in S_n takes e_i to the base e_{g(i)}. Extended to the whole vector space, we denote the matrix obtained as P_g. It is easy to verify that P:S_n\rightarrow GL_n(k),g\mapsto P_g is a representation of S_n. It is clear that if g,g'\in S_n are conjugate in S_n, then P_g, P_{g'} are also conjugate. Amazingly, the converse is true, that is

g,g'\in S_n are conjugate if and only if P_g,P_{g'} are conjugate as matrices.

One common way to show if two matrices are conjugate or not is to use the Jordan standard forms, or the similarity invariants. But here the problem is that two conjugate permutation matrices don’t necessarily have another permutation matrix as their conjugate. So the solution should study these permutation matrices to get some information about the permutation itself. On the other hand, how can we decide if two permutations are conjugate? There is one simple criterion, that is write them in the form of disjoint cycles and see if every possible length of cycles appear the same number of times. That is, if the cycle representation of g has length data as (l_1,l_2,...,l_k), the g' there is (l'_1,...,l_h'), then we have to examine if these two set of numbers are the same after possibly a permutation. But obviously, we can not get directly from P_g these informations. But look at P_g, P_{g'}. If they are conjugate, what are the possible invariants? Now consider a simple case where g=(12). So what can we say about P_g? Since g has order 2, so is P_g. So P_g^2=Id. So this inspires us to consider the eigenvalues and eigenvectors of P_g. It has one eigenvector(up to a multiplication of a non-zero scalar) with eigenvalue 1. So we have dim(ker(P_g-Id))=1. Clearly for any other matrix conjugate to P_g, this quantity must also be 1. Can we generalize this to general situations? The answer is yes. Suppose that g=(12...k), then we still have that dim(ker(P_g-Id))=1. But if g=g_1g_2 where g_1,g_2 are two cycles disjoint, then a simple speculation give that dim(ker(P_g-Id))=2. So we can generalize this to other situation: if g has k disjoint cycles, then dim(ker(P_g-Id))=k. So, we at last get some information on g. Through dim(ker(P_g-Id)), we know at least that the cycle representation of g has how many disjoint cycles. Then, since P_g,P_{g'} are conjugate, so are alll their powers. Perhaps their powers can also give us some more information.

But before that, let’s see what happens when g takes powers. Still start from the simplest case, g=(12...k). Then what are g^2,g^3,...? Another simple speculation will show that for g^h, it will have d=gcd(k,h) cycles disjoint, with each cycle of length k/d. We have said that two cycles are conjugate if and only if their length are the same. But recalling what we have done above(we get some information on the number of cycles of an element), we tend to think if we can use another way to characterize an integer k? There are many, but one suits here, that is all its factors. That is to say, if we know two sets of factors of two integers are the same, then we are sure that these two integers are identical. So for g=(12...k). The sequence \{dim(ker(P_g^l-Id))\}_l=\{gcd(k,l)\} gives all the possible positive factors of k, thus determines k, and also determines g up to conjugacy. Now what about the general situation? That is to say, suppose that g=g_1g_2...g_k where g_i is a cycle of length l_i unknown for the present and these cycles are disjoint and even the number of cycles is unknown, too. Now we calculate dim(ker(P_g-Id)), which gives us the number of cycles in g, that is d_1=k=\sum_i gcd(1,l_i). Then we can calculate dim(ker(P_g^2-Id)), this gives us d_2=\sum_i gcd(2,l_i). Continue in this way, we obtain a sequence (d_h=\sum_i gcd(h,l_i)). So can we recover those l_i with the help of these d_i? We can write this in the form of a matrix. We define M=(a_{ij})_{1\leq i\leq n, 1\leq j\leq n} where each entry is a_{ij}=gcd(i,j). And there is a column vector v=(v_1,...,v_n)^{\dagger} such that Mv=(d_1,d_2,...,d_n)^{\dagger}. The problem is then to show that this matrix M is invertible. If it is indeed the case, then we can recover v, so this means that (d_i) determines (l_i)(note v_i counts the number of l_j equal to i, which is enough to determine g up to conjugacy). In fact, this is a theorem due to Smith(1876), that is

The matrix M=(a_{ij})=(gcd(i,j))_{1\leq i,j\leq n} is invertible.

There is a proof which uses an ingenious idea of decomposing M  into the product of two upper triangular matrices. In fact, if we note \phi(n) the Euler function which counts the number of integers 1leq i\leq n which is prime to n. There is a nice formula concerning this function, that is


One way to see this is to consider these rationals S=\{1/n,2/n,...,n/n\}. Write each of these rationals in the reduced form, for each d|n, all the rationals in S with denominator d must have a nominator prime to d, and if d' is a positive integer smaller than d but is prime to d, then surely d'/d will appear in S. So the number of rationals in S with denominator d|n is then \phi(d), which gives that \sum_{d|n}\phi(d).

So for each entry a_{ij}=gcd(i,j)=\sum_{d|gcd(i,j)}\phi(d)=\sum_{d|i,d|j}\phi(d), the last identity is a very important observation, which enables us to rewrite as gcd(i,j)=\sum_{1\leq d\leq n}\phi(d)f(d,i)f(d,j) where f:\mathbb{N}-0 \times \mathbb{N}-0 \rightarrow \{0,1\}, (i,j)\mapsto f(i,j) and if i|j we define f(i,j)=1, otherwise f(i,j)=0. And we define g(i,j)=\phi(i)f(i,j). So we get gcd(i,j)=\sum_{1<\leq d\leq n}g(d,i) f(d,j). So in fact, M is a product of two matrices, that is M=ST with S=(s_{ij})=(g(j,i)), T=(t_{ij})=(f(i,j)). Note that these two matrices are in fact triangular. In deed, for S, if i<j, then s_{ij}=g(j,i)=\phi(j)f(j,i). But since j>i, we can not have j|i, so we must have that s_{ij}=0. Similarly for T, if i>j, we have that t_{ij}=0. So we have that det(M)=det(ST)=det(S)det(T)=\prod_ig(i,i)\prod_jf(j,j)=\prod_i\phi(i)f(i,i)^2=\prod_i\phi(i), thus M is invertible.

So in this way, we proved the theorem of Smith and also the result due to Brauer.

This is really an interesting result. This result shows that for those permutation matrices, the conjugacy will automatically mean the conjugacy in the group theoretic sense.

Concerning the theorem of Smith, there are various generalizations. In the proof, we have used the relation that n divides m, defining a partial order on \mathbb{N}-0. What we have really utilized is the Mobius function. Specifically, if S  is a partial order set with the order \leq, then for any function g:S\rightarrow\mathbb{R}, we define f:S\rightarrow \mathbb{R}, z\mapsto f(z)=\sum_{w\leq z}g(w). how can we define a Mobius function on S? Or more fundamentally, can we define a Mobius function on S? Recall one property of the Mobius function on the positive integers: \sum_{d|n}\mu(d)=\delta(1,n) where \delta(n,m)=1 if n=m, and is 0 otherwise. We can generalize this formula to the case \sum_{n|d|m}\mu(m/d)=\delta(n,m). Using the function defined in the proof, f(i,j)=1 if i|j, and is 0 otherwise, we can rewrite this formula as \sum_{d}f(n,d)f(d,m)\mu(m/d)=\delta(n,m). Note that the function \delta is in fact an unit in the set of real-valued functions on S\times S with the addition being the point-wise addition, and the multiplication being the convolution(more precisely, C(S)=\{a:S\times S\rightarrow \mathbb{R}| a(x,y)=0 \text{if} x\not\leq y\}, (a*b)(x,y)=\sum_{z}a(x,z)b(z,y). Here we can take S=\mathbb{N}-0). To be compatible with the above definition of functions on S\times S, we define the two-variant Mobius function(using thesame notation), \mu: (\mathbb{N}-0)\times(\mathbb{N}-0)\rightarrow \mathbb{R}, (d,m)\mapsto \mu(d,m) with \mu(d,m)=\mu(m/d) if m/d is an integer and is 0 otherwise. So the above identity can be written as \delta(n,m)=\sum_{d}f(n,d)\mu(d,m). So, in this way, the Mobius function being a unit is more obvious. So, we want to generalize this definition to other partial order sets. The \delta function is easy to define, while the function f(note that it is a unit in the point-wise multiplication), it is not hard to generalize, either. For a partial order set(poset, for short), we define f:S\times S\rightarrow, (x,y)\mapsto f(x,y) with f(x,y)=1 if x\leq y, and is 0 otherwise. So, we can define the Mobius function on S to be the inverse of the (point-wise multiplicative unit) f, that is to say \mu* f=\delta.

Having thus defined the Mobius function, next perhaps we should consider the Mobius inversion formula, that is,

(Mobius Inversion Formula) if F,G:S\rightarrow\mathbb{R} two functions on S such that F(x)=\sum_{y\leq x}G(y), then we have that G(x)=\sum_{y\leq x}F(y)\mu(y,x).

In fact, we don’t need the condition in the summation y\leq x since \mu is a function on S\times S, its definition requires that \mu(x,y)=0 if x\not\leq y. The proof of this formula is very simple, just using the convolution once we have noticed that F(x)=\sum_y G(x)f(x,y). And thus we omit it.

For references on the Mobius functions on a poset, we can read this


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s