symbolic dynamical systems-an example

Here in this post I will not give a precise definition of what is a symbolic dynamical system. I will just give an example of it, so that we can get just feeling of a symbolic dynamical system.

One of the most studied example in dynamical system is perhaps the unit circle, S^1. There are various transformations that we can define on it, like the rotation transformation, the reflexion transformation, and so on. When we view the unit circle as the quotient space of the unit interval, that is S^1=[0,1]/(0\equiv 1), perhaps we can invent more, for example, the multiplication transformation, that is T_N: S^1\rightarrow S^1, z\mapsto z^N(or, equivalently T_N: \mathbb{R}/\mathbb{Z}\rightarrow \mathbb{R}/\mathbb{Z}, x\mapsto Nx), just multiplying the argument of the complex number z by N, where N is a positive integer, since for non integers this map would not make obvious sense.

Which kind of numbers in the unit interval(latter in this post, we will mix the unit interval and the unit circle when there is no confusion) behaves well under T_N? First of all, which number has finite orbit under T_N? After a simple speculation, we see that the orbit \mathfrak{O}(x) of a point x\in [0,1] is finite if and only if x\in \mathbb{Q}. What are the possible cardinality of these finite orbits? This is not hard, either. But all this has something to do with N. Perhaps a natural way of thinking is to express a real number on basis of N(here we assume that N>1, otherwise T_N is the identity map, things get much easier), that is to say, write x=\sum_{n\in\mathbb{Z}}\frac{a_n}{N^n}. And we can prove easily that x is a rational number if and only if the sequence (a_n) is periodic. So if x is rational, then suppose that (a_n) has a period \tau, then we see that T_N^{\tau}(x)=x. Which means that \#\mathfrak{O}(x)\leq \tau. An interesting result.

Perhaps we can concentrate our attention on this N-cimal expansion. In fact, we see that all expansions of numbers in the unit interval correspond to a sequence of integers (a_n)_{n\in\mathbb{N}} such that each term a_n\in\{0,1,...,N-1\}. Conversely, given a sequence, (a_n)_{n\in\mathbb{N}}(a_n\in\{0,1,...,N-1\}), there is always a number x=\sum_n\frac{a_n}{N^n} which lies in the unit interval. Note that, for a real number in the unit interval, there are sometimes two possibilities to write its N-cimal expansion(for example, x=\frac{1}{N}=\sum_{n>1}\frac{N-1}{N^n}). So we had better define a map from \{0,1,...,N-1\}^{\mathbb{N}} to the unit interval, not in the inverse direction. That is, we define, \pi: S_N=\{0,1,...,N-1\}^{\mathbb{N}}\rightarrow \mathbb{R}/\mathbb{Z}, (a_n)\mapsto \sum_n\frac{a_n}{N^n}.

What is missing is a topology on S_N. And that is what we are going to do right now. For any element a=(a_n)\in S_N, if it is not zero, then there is a least n(a) such that a_{n(a)}\neq 0. And thus we define the order of a to be ord(a)=n(a). And we set ord((0))=+\infty. It is easy to see that ord(x-y)\geq ord(x)+ord(y)(x,y\in S_N). Note that these numbers are positive numbers, so we have an inequality e^{-ord(x-y)}\leq e^{-ord(x)}+e^{-ord(y)}, which shows that d(x,y)=e^{-ord(x-y)} is in fact a distance on S_N. If we denote p_i the projection that takes an element in S_N to its i-th component, then we see that p_i(S_N)=\{0,1,...,N-1\}, which, as a finite subset of S_N, is compact. Note also that S_N=\prod_ip_i(S_N), so S_N is the product of compact spaces, so S_N is compact.(There is some similarity between S_p and the p-adic integers \mathbb{Z}_p, the expansion, the order of an element, the metric(we can show that both two metrics are in fact ultrametrics), the compacity, the total disconnectivity, etc.,but their addition and multiplication rules are not the same, which is a pity. In other words, they share some topological properties, but don’t share algebraic properties).

Having equipped S_N with an ultrametric, we wonder if \pi: S_N\rightarrow S^1 is continuous or not. It is not hard to do. In deed, note that \pi(a)\leq N^{-ord(a)}\sum_{n>0}\frac{N-1}{N^n}=d(a,0)^{\text{ln}N}, which show that \pi is indeed continuous.

Now we can study some properties of these two transformations, T_N. Let’s first decide if the T_N can be transported to S_N. Suppose x=\sum_n\frac{x_n}{N^n}. So T_N(x)=\sum_n\frac{x_{n+1}}{N^n}. So in fact, we can define a transformation on S_N, that is \sigma: S_N\rightarrow S_N, (a_n)_{n\geq0}\mapsto (a_{n+1})_{n\geq0}. It is easy to verify that there is T_N\circ \pi=\pi \circ\sigma. Recall that \pi is surjective, and we can this kind of map(like \pi, which is continuous, surjective, commutes two dynamical systems) a semi-conjugation.

Then like those irrational rotations on the unit circle, we want to see if there are some real number in the unit interval has a dense orbit. Without the idea of N-cimal expansion, this problem can be very hard. However, with this idea, things become much easier. Note that S_N is in fact a separable space(one countable dense subset of S_N is B=\{x\in S_N|\exists m\in\mathbb{N},\text{s.t.} \forall n>m, x_n=0\}). We can put the effective parts of all the elements in B in one element of S_N, and this element has a dense orbit. What is more, we can see that are infinite way to generate this kind of elements.

So, in this example, we abstract the space S^1 to get a purely symbolic dynamical system. After giving it some proper metric, this abstract symbolic dynamical system gives us a satisfactory answer to a hard problem.

Now we want to say something on some particular \Sigma_N, that is \Sigma_2. Before that, let’s study the expansions of the real numbers in the interval [0,1). We, as we have done in the above, identify S=[0,1)\mathbb{R}/\mathbb{Z}, equipped with the Lebesgue measure. Then we see that the multiplication operator T_N:S\rightarrow S, x\mapsto Nx is ergodic with respect to the Lebesgue measure. That is to say, the only measurable set A\subset S such that T_N^{-1}(A)=A is a set of measure 0 or 1, or equivalently, the only measurable functions invariant under T are the constant functions. Note that the level set of an invariant function is a set invariant under T, too. This gives the equivalence between these two. What we want to do now is to show that all words of length k in \Sigma_N have an equal probability of appearance in the expansion of almost all elements of S in the N-base expansion. How to express the probability of appearance of a word in the expansion of a number? Suppose there is a word x=(x_1x_2...x_k)(0\leq x_1,...,x_k\leq N-1), and a real number a=\sum_{i>0}a_i/N^i. That x appears in the first k digitals of a is equivalent to saying that a\in I_x=[b,b+1/N^k), where b=b(x)=\sum_{1\leq i\leq k}x_i/N^i. And the second k digitals of a is the word x means that T_N(a)\in [b,b+1/N^k). So, we have found a method in counting the number of appearance of the word x in the expansion of the number a. That is, in the first n+k digitals of a, the number of appearance of x is just \sum_{1\leq i\leq n}\chi_{I_x}(T^i_N(a)). Divided by n, then the quotient means the frequency of appearance of x in the first n+k digitals of a. So, does this frequency have a limit? If so, what is the limit? Here we need the mean ergodic theorem of von Neumann. Note that the characteristic function \chi_{I_x} belongs to L^2(S). So, the mean ergodic theorem of von Neumann says that, under the condition that T is ergodic, the sequence of functions \frac{1}{n}\sum_{0\leq k\leq n-1}f(T^k_N(x)) converges in L^2(S) to a constant for any f\in L^2(S), so does f=\chi_{I_x} here(Or we can also use the mean ergodic theorem of Birkhoff). And this constant is just the orthogonal projection of f on the space of functions measurable with respect to the \sigma-algebra generated by the measurable sets invariant under T. Thus here, it is the trivial algebra, \{\emptyset, S\}, and a function measurable with respect to this algebra is just constant function. So, the orthogonal of f onto this space is just the integral of f. So, we have that \lim_n\frac{1}{n}\sum_{0\leq k\leq n-1}f(T^k_N(a)) converges in L^2(S) to \int f. Note also that each term in this sequence is bounded from above by 1. Without loss of generality, we can assume that \int f=0. So, we conclude that for almost all points a\in S, this sequence converges to 0(we can argue by contradiction. Suppose that there is a measurable set A of measure m(A)>0 such that for almost all a\in A, we have that \limsup |\frac{1}{n}\sum_kf(T^k_N(a))|>\epsilon>0. So we must have that \limsup \int |\frac{1}{n}\sum_kf(T^k_N)|^2>\epsilon^2m(A)>0(theorem of dominant convergence), which is a contradiction). So, in this way, we show that for almost all points a\in S, we have that in its expansion, the word x appears with a frequency \int \chi_{I_x}=\frac{1}{N^k}. This means that for each word x, there is a set of measure 1 such that each point in it has an expansion in which x appears infinitely many times. One consequence of this result is that, for all the words(which are countable), there is a set of measure 1 such that each point in it has an expansion in which every word appears infinitely many times. This is the same as that the set S'=\{a\in S|\omega(a)=S\}.

A little digression here. Note that equipped with the quotient topology, S is a compact space, thus is complete. So, the Baire category theorem applies here. Baire category theorem says that in a complete metric space S, countable intersection of dense open sets is still dense. Note that ”density’ is really a topological term, which means that, it only measures the cardinality of the set, but says nothing about the ‘volume’ of the set. We call the countable intersection of dense open sets a generic set. So, a generic set is dense, but not the converse. For the present problem, we can show that S' is a generic set. This is not hard to see, and we will omit it.

Another digression is useful. It is the concept of mixed dynamical systems. We say a system (X,T,m)(preserving the measure) is mixed, if for any measurable sets A,B\subset X, we have that \lim_nm(A\bigcap T^{-n}(B))=m(A)m(B). What does this condition mean? It means that, after a long time(n>>1), a non trivial measurable set A will be spread to almost everywhere with equal probability. This condition is stronger than ergodicity in that ergodicity doesn’t guarantee the equal probability. Indeed, suppose that a system is mixed, then for any invariant set A, X-A is invariant, too. So, we have that \lim_nm(A\bigcap T^{-n}(X-A))=m(A)m(X-A). But we have that T^{-1}(X-A)=X-A, so m(A\bigcap T^{-n}(X-A))=m(A\bigcap X-A)=0, which shows that either m(A)=0 or m(A)=1, thus showing the ergodicity of the system. According to the comments above, it is not hard to find examples of dynamical systems which are ergodic but not mixed. The examples must not spread its subset (at least one subset). One example is the rotation on the unit circle.

Now return to the problem of symbolic dynamical systems. Using its factor system (S,T_N,m), we have shown that (\Sigma_N,\sigma_n,d) is ergodic. But note that we haven’t defined any Borel measure on \Sigma_N. For any p_0,p_2,...,p_{N-1}\in [0,1] such that \sum_ip_i=1, and any cylinder for a word x=(x_1,x_2,...,x_k), [x]=\{a\in \Sigma_N, a_1=x_x,...,a_k=x_k\}, we set m([x])=p_{x_1}p_{x_2}...p_{x_k}. What is more, the set \mathbb{X}=\{[x]\subset X|x \text{ is a word}\}\bigcup{\emptyset}(we also consider the empty word \emptyset, then [\emptyset]=X), forms a basis for the topology on \Sigma_N induced by the metric defined above. What is more, \mathbb{X} generates the \sigma-algebra on \Sigma_N which is compatible with its topology. Moreover, for any two elements A,B\in \mathbb{X}, we have that A\bigcap B\in \mathbb{X}(either the empty set, either one of them), A-B is the finite union of elements in \mathbb{X}. And if A_n\in \mathbb{X}(n\in \mathbb{N}) are disjoint one to another, and there is A\in\mathbb{X} such that A=\bigcup_nA_n, then the word x for A must be a part of each of the word x_n for A_n. So, without loss of generality, we can suppose that these words have no common sub-word. So, we have that 1=m([\emptyset])=\sum_{0\leq k\leq N-1}m(\bigcup_{A_n, (x_n)_1=k}A_n). We can continue in this way for the second digital, the third digital, etc.. And finally we have that m(\bigcup_n A_n)=\sum_nm(A_n). So, according to the Caratheodory extension theorem, we have that there is a unique \sigma-finite measure(m') on X compatible with the topology on it such that m'|_{\mathbb{X}}=m. And for simplicity, we still write m for m'. So, for each N-tuple, p=(p_0,p_1,...,p_{N-1}), there is a unique measure on \Sigma_N, and we denote this measure by m_p.

Now we want to show that m_p is \sigma-invariant. Indeed, for any cylinder [x], we have that T^{-1}([x])=[0x]\bigcup[1x]\bigcup...\bigcup[(N-1)x], and \sum_km([kx])=\sum_kp_km([x])=m([x]). This completes the proof. Next we want to prove that (\Sigma_N,\sigma, m_p) is ergodic. First, we can prove that it is mixed. For that, consider any two cylinder A=[x]=[(x_1x_2...x_k)],B=[y]=[y_1...y_l], then for large n>>1, T^{-n}([y])=\{a\in\Sigma_N|a_n=y_1,...,a_{n+l}=y_l\}, so we must have that A\bigcap T^{-n}(B)=\{a\in\Sigma_N|a_1=x_1,...,a_k=x_k,a_n=y_1,...,a_{n+l}=y_l\}, thus m_p(A\bigcap T^{-n}(B))=p_{x_1}...p_{x_k}p_{y_1}...p_{y_l}=m_p(A)m_p(B). So this system is mixed, thus is ergodic.

Now we can repeat what we have done for (S,T_N,m): for any word x=(x_1...x_k), and any element a\in \Sigma_N, the frequency that x appears in the first n+k digitals of a is \frac{1}{n}\sum_{0\leq j\leq n-1}\chi_{B(x,1/N^k)}(\sigma^j(a)). Note that B(x,1/N^k)=[x], so we have that this sequence converges for almost all points a to the integral \int \chi_{[x]}=m([x])=p_{x_1}...p_{x_k}(since the system is ergodic). So, in this system, different words appear with a possibly different probability. The reason for this, is of course that the measure is not uniform. But note that, even though these measures are not uniform, these systems are still mixed.

So we prove that as long as none of these p_i is zero, then the set of elements a\in \Sigma_N such that \omega(a)=\Sigma_N has a measure 1. Is this set the full \Sigma_N? Not really. For example, the element a=(0000...) whose digitals are all zero, so, \omega(a)=\emptyset. In fact, we can show something much stronger, that is: there is a generic set whose elements are all this sort. First, we want to construct an element a such that the limit of the sequence \frac{1}{n}\sum_k\chi_{[x]}(\sigma^k(a)) doesn’t exist for all words( that is to say, for some words, this limit doesn’t exist). For example, we choose x=(1). How to construct an element such that the frequency of ones doesn’t have a limit? One way is, for an element a=(a_1a_2...), we put 2^{2k-1} many ones on a_{2^{2k}+1},a_{2^{2k}+1},...,a_{2^{2k+1}}, and we put no ones on a_{2^{2k+1}+1},...,a_{2^{2k+2}}. So, on one hand, from a_1 to a_{2^{2n+1}} there are \sum_{1\leq k\leq n}2^{2k-1} many ones, so the frequency of ones is f_n\geq 2^{2n-1}/2^{2n+1}=1/4. On the other hand, from a_1 to a_{2^{2n+2}}, there are \sum_{1\leq k\leq n}2^{2k-1}<2^{2n+1}/3, so the frequency of ones is g_n\leq 1/6. So, we have found two sequences of frequencies f_n,g_n, such that \liminf_nf_n\geq1/4 while \limsup_ng_n\leq1/6. So, the limit doesn’t exist. Now we want to show that there is a generic set whose elements are all this kind. We define W_n=\{x\in\Sigma_N| \exists l,k>n, \text{from } a_{2^{2k}+1+l} \text{ to } a_{2^{2k+1}+2l}, \text{there are } 2^{2k-1}+l/2 \text{ many ones, from } a_{2^{2k+1}+2l+1} \text{ to } a_{2^{2k+2}+4l} \text{ there are no ones}\}. It is easy to verify that each W_n is open and dense. Now look at their intersection W=\bigcap_n W_n, this is a generic set. We can show that each element in W has the same property as the element constructed above. For each word, we can construct a series of W_n as above, and since there are only countable words, so we conclude that the set of elements in whose expansions the frequency of each word doesn’t converge is a generic set. Of course, this generic set must have a measure 0, with respect to the above measures m_p.

Next we want to restrict our attention to \Sigma_2, which consists of elements of a sequence of 0,1. A note for all these \Sigma_N. The distance function takes only discrete values in the real number field. What is more, the whole space \Sigma_N is compact. So, if C\subset\Sigma_N is a closed subset, then it is compact. For each point c\in C, we can take a ball B(c,r_c) around it such that B(c,r_c)\subset C. And using the compactness of C, we conclude that C is at the same time open. So the characteristic function of it, \chi_C, is continuous. Now suppose that, there is an element x\in \Sigma_2 such that there is a positive number \delta\in (0,1), and for any n\in\mathbb{N}, there is b_n,a_n\in\mathbb{N}, b_n-a_n>n, and \frac{1}{b_n-a_n}\sum_{a_n+1\leq k\leq b_n}x_k>\delta. This condition means that for any large number n, there is an interval of length greater than n such that the number 1 appears in this interval with a frequency greater than \delta. We want to use this to construct a measure m on \Sigma_2. Note K=\overline{\mathfrak(O)_{\sigma}(x)} the closure of the orbit of x under \sigma. We want that m(K)=1. The bricks to construct measures is the Dirac measures. Here we can try, m_n=\frac{1}{b_n-a_n}\sum_{a_n\leq k\leq b_n}\delta_{\sigma^{k}(x)}. Note that, these are all measures, thus there is a subsequence m_{n_k} which converges weakly to a measure m. Moreover, for any n, there is m_n(K)=1, then \int \chi_Kdm_n=1. Since \chi_K is continuous, so we must have 1=\int \chi_K dm_{n_k}\rightarrow \int \chi_Kdm, thus m(K)=1. Now to express the idea that \frac{1}{b_n-a_n}\sum_{a_n+1\leq k\leq b_n}x_k>\delta, we set K'=[1]\bigcap K. Then this inequality becomes \frac{1}{b_n-a_n}\sum_{a_n+1\leq k\leq b_n}\chi_{[1]}(\sigma^k(x))>\delta. In other words, \frac{1}{b_n-a_n}\sum_{a_n+1\leq k\leq b_n}\delta_{\sigma^k(x)}(K')>\delta. So, \int \chi_{K'}\geq\delta. Note that [1] is closed and open, K is closed, so K' is closed, thus \chi_{K'} is continuous. This implies that \delta\leq m_{n_k}(K')\rightarrow m(K'), that is m(K')\geq \delta>0.

Now we want to use a great theorem of Furstenberg, the multiple recurrence theorem:

For any probabilistic dynamical system (X,T,m) which preserves the measure, suppose that B\subset X is a measurable set with m(B)>0.Then for any integer k, there are infinitely many n such that m(B\bigcap T^{-n}(B)\bigcap...\bigcap T^{-kn}(B))>0

This is a refinement of the recurrence theorem of Poincare. Poincare’s theorem says that almost all points in B shall return to B for infinitely many times. Furstenberg’s theorem states that one part of the returns has some rule inside, that is the arithmetic progression rule. More precisely, there is an arithmetic progression of any length, but the common difference is not known, a fortiori, although it guarantees that the common length can be arbitrarily large. This theorem doesn’t say that there exists an arithmetic progression of any common difference. But, in some sense, is also the default of this result.

Now apply this result to our system (\Sigma_2,\sigma,m) and to the set K'. Thus for any k, there is infinitely many n such that m(K'\bigcap \sigma^{-n}(K')\bigcap...\bigcap \sigma^{-kn}(K'))>0. What does this mean? If y lies in such a set, then y,\sigma^n(y),...,\sigma^{kn}(y)\in K'=[1]\bigcap K. Since K is the closure of the orbit of x, then there is a j such that d(\sigma^j(x),y)<1/2^{kn+n}, this means that the first kn digitals of \sigma^j(x) and y coincide. Since we have that y_1=y_{n+1}=y_{2n+1}=...=y_{kn+1}=1, this implies that x_{j+1}=x_{j+n+1}=...=x_{j+kn+1}=1. And there are infinitely many arithmetic progression of length k+1.

For summary, we have proven a theorem of Szemeredi,

If f:\mathbb{N}\rightarrow \{0,1\} such that there is a positive real number \delte\in (0,1), and for any n there is b_n,a_n with b_n-a_n>n and \frac{1}{b_n-a_n}\sum_{a_n+1\leq k\leq b_n}f(k)>\delta. Then for any k, there are infinitely many arithmetic progression as a subsequence in the sequence (f(n)) of the form 1=f(a)=f(a+n)=f(a+2n)=...=f(a+kn), and the common difference can be arbitrarily large.

This is an interesting result. It says that, as long as the ones is not too sparse, there exists always infinitely many arithmetic progressions of any length.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s