[Math] International Mathematical Olympiad (IMO) 2015 Problem 6

Working on Problem 6 of IMO 2015 was like playing a game. The problem needs little advanced machinery, but the visual thinking involved, and the search for the structure behind something that looks simple yet turns out to be subtle, made the experience enjoyable.

The problem is:

The sequence a₁, a₂, … of integers satisfies the following conditions:

(1) 1 ≤ a_j ≤ 2015 for all j ≥ 1
(2) k + a_k ≠ l + a_l for all 1 ≤ k < l

Prove that there exist two positive integers b and N such that

|∑ⁿ_{j = m + 1}(a_j − b)| ≤ 1007²

for all integers m and n satisfying n > m ≥ N.

Imagine infinitely many columns numbered 1, 2, … from the left, each divided into d = 2015 cells numbered 1, 2, …, d from bottom to top. By (1) and (2), starting from column 1 we select one cell in each column in turn and mark it with 'o' (cell a_i in column i), then cross out with 'x' the cells on its diagonal running down and to the right, since none of them may be selected later. For example, with d = 5 and the first 4 columns finished (a₁ = 4, a₂ = 2, a₃ = 5, a₄ = 2):

__o___

o__x__

_x__x_

_oxo_x

__xxx_
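
The picture can be reproduced mechanically. Below is a minimal Python sketch; the function name `render` and the parameter `width` are illustrative choices, not part of the problem, and the heights 4, 2, 5, 2 are simply read off the 'o' marks above.

```python
def render(a, d, width):
    """Draw the grid for heights a[0], a[1], ... (a[i] is the height chosen in column i + 1).

    'o' is a selected cell, 'x' a cell crossed out by an earlier selection
    on the same diagonal, '_' a cell that is still free.
    """
    grid = [["_"] * width for _ in range(d)]   # grid[h - 1][col - 1]
    for col, h in enumerate(a, start=1):
        if grid[h - 1][col - 1] != "_":
            raise ValueError(f"cell {h} in column {col} is not available")
        grid[h - 1][col - 1] = "o"
        # Walk down and to the right along diagonal col + h.
        c, r = col + 1, h - 1
        while r >= 1 and c <= width:
            grid[r - 1][c - 1] = "x"
            c, r = c + 1, r - 1
    # Top row (cell d) first, as in the picture.
    return "\n".join("".join(row) for row in reversed(grid))

picture = render([4, 2, 5, 2], d=5, width=6)
print(picture)
```

Trying other heights is a quick way to get a feel for the game; the function raises an error if a chosen cell has already been crossed out.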

Intuitively, the statement to be proved says that the sequence, i.e. the heights of the "o" marks, always stabilizes at some point N: after column N, the total height over any window of consecutive columns differs from b times the window length by at most a fixed constant, so the average height stays close to b.

Proof

I first played with the grid for some time, looking for useful patterns among the selected cells, and found nothing. After a while, though, I was led to something unexpected. Consider the sequence c_i, the number of cells in column i not yet crossed out once column i − 1 is finished. Clearly c_i ≥ 1, because cell d is never crossed out. Moving from column i to column i + 1, every crossed-out or selected cell above cell 1 shifts one step down along its diagonal, while whatever occupies cell 1 drops out of the grid. So c_i never increases, and it decreases (by exactly one) if and only if cell 1 of column i is available but not selected!
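
The claim about c_i is easy to test numerically. The following is a small sketch, assuming a purely random player; the names `simulate` and `skipped_one` and the choice d = 7 are made up for the illustration.

```python
import random

def simulate(d, steps, seed=0):
    """Pick a random available cell in each column.

    Returns the heights a, the counts c (c[i] = free cells in column i + 1
    just before choosing), and skipped_one (True exactly when cell 1 was
    free in that column but a different cell was chosen).
    """
    rng = random.Random(seed)
    used = set()                       # diagonals i + a_i already taken
    a, c, skipped_one = [], [], []
    for col in range(1, steps + 1):
        free = [h for h in range(1, d + 1) if col + h not in used]
        h = rng.choice(free)           # free is never empty: cell d is always there
        used.add(col + h)
        a.append(h)
        c.append(len(free))
        skipped_one.append(free[0] == 1 and h != 1)
    return a, c, skipped_one

a, c, skipped_one = simulate(d=7, steps=200)
for i in range(len(c) - 1):
    # c drops by exactly one when cell 1 was free but skipped, else it is unchanged.
    assert c[i + 1] == c[i] - (1 if skipped_one[i] else 0)
```

A random player skips cell 1 often, so in such runs c_i tends to fall quickly; the check only confirms the rule for how it changes.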

Because c_i never increases and is bounded below by 1, it is constant from some point on: there are N and k such that

c_i = d − k for all i ≥ N + 1, where 1 ≤ k ≤ d − 1.

Here k is the number of crossed-out cells in each column at the moment we choose from it. The case k = 0 is trivial: no cell is ever crossed out, so a_i = 1 from the beginning of the sequence, and b = 1 works.

From now on we only consider columns after N. Since c_i stays constant, whenever cell 1 is available in a column it must be selected. Now think about diagonals. Diagonal D is the set of cells whose column number plus height equals D, so the cell selected in column i lies on diagonal i + a_i. By (2), at most one cell is marked on each diagonal. We say a diagonal is marked if one of its cells is. Cell 1 of column i is the last cell of diagonal i + 1, so column i is the last chance for diagonal i + 1 to get marked. It follows that no diagonal y ≥ N + 2 stays unmarked forever: if diagonal y were still unmarked when we reach column y − 1 ≥ N + 1, cell 1 of that column would be available, and we would be forced to select it.

Consider any run of consecutive columns m + 1 to n with m ≥ N. When we reach column m + 1, exactly k of its cells are crossed out. Say they lie on diagonals y₁, y₂, …, y_k; each y_i is between m + 2 and m + d, because diagonal m + 1 + d cannot be marked by any of the first m columns. Once column n is finished, diagonal n + 1 is marked, and so are diagonals n, n − 1, and so on. Therefore every diagonal in [m + 2, n + 1] is marked at this point, plus exactly k more in [n + 2, n + d] (the crossed-out cells of column n + 1), which we denote by z₁, z₂, …, z_k. Columns m + 1 to n can only mark diagonals ≥ m + 2, so what they mark is everything in [m + 2, n + 1] together with z₁, …, z_k, except for y₁, …, y_k, which were marked earlier.
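
The bookkeeping above can be checked on a simulated sequence. This sketch covers only one family of sequences (random picks up to column N, then cell 1 whenever it is free, which keeps c_i constant by construction); `mark_diagonals` and the sample values of d, N, n are assumptions made for the illustration.

```python
import random

def mark_diagonals(d, N, n, seed=1):
    """Random picks up to column N, then cell 1 whenever it is free, up to column n."""
    rng = random.Random(seed)
    used, k = set(), None              # used = set of marked diagonals
    for col in range(1, n + 1):
        free = [h for h in range(1, d + 1) if col + h not in used]
        if col == N + 1:
            k = d - len(free)          # crossed-out cells in column N + 1
        h = 1 if (col > N and free[0] == 1) else rng.choice(free)
        used.add(col + h)
    return used, k

d, N, n = 7, 5, 60
used, k = mark_diagonals(d, N, n)
unmarked = [D for D in range(N + 2, n + 2) if D not in used]
beyond = sorted(D for D in used if D > n + 1)    # these play the role of z_1, ..., z_k

assert unmarked == []                            # every diagonal in [N + 2, n + 1] is marked
assert len(beyond) == k and all(D <= n + d for D in beyond)
```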

Finally, since a_i = D_i − i, where D_i is the index of the diagonal containing the cell selected in column i, we can substitute:

∑ⁿ_{j = m + 1}(a_j − b)
= ∑ⁿ_{j = m + 1}(D_j − j − b)
= ∑ⁿ⁺¹_{j = m + 2} j − ∑ⁿ_{j = m + 1}(j + b) + z₁ − y₁ + z₂ − y₂ + … + z_k − y_k
= (n − m)(1 − b) + z₁ − y₁ + z₂ − y₂ + … + z_k − y_k
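
The last equality is only an index shift in the first sum. Spelled out in LaTeX, with the same symbols as above:

```latex
\begin{aligned}
\sum_{j=m+2}^{n+1} j \;-\; \sum_{j=m+1}^{n} (j+b)
  &= \sum_{j=m+1}^{n} (j+1) \;-\; \sum_{j=m+1}^{n} (j+b) \\
  &= \sum_{j=m+1}^{n} (1-b) \\
  &= (n-m)(1-b).
\end{aligned}
```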

Note that

(m + 2) + … + (m + 1 + k) ≤ ∑ᵏ_{i = 1} y_i ≤ (m + d − k + 1) + … + (m + d)

(n + 2) + … + (n + 1 + k) ≤ ∑ᵏ_{i = 1} z_i ≤ (n + d − k + 1) + … + (n + d)

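Subtracting the largest possible sum of the y_i from the smallest possible sum of the z_i, and vice versa, the quantity of interest lies in the interval [(n − m)(1 − b) + (n − m − d + k + 1)k, (n − m)(1 − b) + (n − m + d − k − 1)k].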

Setting b = k + 1, these bounds become (k − (d − 1))k and (d − 1 − k)k. Both have absolute value k(d − 1 − k), which is at most (d − 1)² ⁄ 4 for odd d, with equality at k = (d − 1) ⁄ 2. For d = 2015 this is 1007².

Q.E.D.
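
As a sanity check, the bound can be tested on simulated sequences. This is a sketch under stated assumptions, not a substitute for the proof: it covers only sequences built from a short random prefix followed by forced moves (cell 1 whenever it is free), so that N equals the prefix length by construction. The names `forced_sequence`, `worst_window` and `warmup`, and the values d = 7 and 150 columns, are chosen just for the illustration.

```python
import random

def forced_sequence(d, warmup, steps, seed):
    """Random picks for `warmup` columns, then cell 1 whenever it is free.

    Returns the heights a and k, the number of crossed-out cells in column warmup + 1.
    """
    rng = random.Random(seed)
    used, a, k = set(), [], None
    for col in range(1, steps + 1):
        free = [h for h in range(1, d + 1) if col + h not in used]
        if col == warmup + 1:
            k = d - len(free)
        h = 1 if (col > warmup and free[0] == 1) else rng.choice(free)
        used.add(col + h)
        a.append(h)
    return a, k

def worst_window(a, b, start):
    """Largest |sum of (a_j - b) for j = m + 1 .. n| over all start <= m < n <= len(a)."""
    prefix = [0]
    for h in a:
        prefix.append(prefix[-1] + h - b)
    steps = len(a)
    return max(abs(prefix[n] - prefix[m])
               for m in range(start, steps) for n in range(m + 1, steps + 1))

d, steps = 7, 150
bound = (d - 1) ** 2 // 4              # (d - 1)^2 / 4 for odd d
results = []
for seed in range(40):
    warmup = seed % 8                  # plays the role of N
    a, k = forced_sequence(d, warmup, steps, seed)
    results.append((k, worst_window(a, k + 1, warmup)))

assert all(worst <= bound for _, worst in results)
```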

The result b = k + 1 also aligns well with intuition. For large k, many lower cells tend to be crossed out and we inevitably have to select a high one. With a small k, we could not pick too many high cells because cell 1 has to be marked whenever it's available.

post by Shen-Fu Tsai
