# [Math] International Mathematical Olympiad (IMO) 2015 Problem 6

Working on Problem 6 of IMO 2015 was like playing a game. It is interesting in the sense that not too much advanced skill is required, but the involved visual thinking and the pursuit of the nature of seemingly simple yet complicated things made the experience enjoyable.

The problem is:

The sequence a1, a2, … of integers satisfies the following conditions:

1. 1 ≤ aj ≤ 2015 for j ≥ 1
2. k + ak ≠ l + al for all 1 ≤ k < l

Prove that there exist two positive integers b and N such that

|nj = m + 1(aj − b)| ≤ 10072

for all integers m and n satisfying n > m ≥ N

Imagine an infinite number of columns numbered 1, 2, … from the left, each divided into d = 2015 cells numbered 1, 2, 3, …d from bottom to top. By (1) and (2), starting from column 1 we are to sequentially select and mark one cell in each column with 'o', and then cross out its diagonal cells with 'x' in the columns to the right. For an example with d = 5 where the first 4 columns are finished:

__o___

o__x__

_x__x_

_oxo_x

__xxx_

Intuitively, the statement to be proved essentially says that the sequence, i.e. the heights of "o" marks, will always stabilizes at some point N, meaning that after column N the average height within any window of consecutive columns will not differ too much from a constant b.

## Proof

I first played with the columns consisting of cells above for some time trying to find useful patterns of selected cells but found nothing. However after a while I was led to something unexpected. Consider sequence ci, the number of cells not crossed out at column i when column i - 1 is finished. Clearly ci ≥ 1 because cell d is never crossed out. Note that sequence ci never increases, and it decreases if and only if cell 1 is not selected when it's available!

Because ci can only decrease and yet lower bounded, it stays at some constant d − k after some point N, i.e.

ci = d − k for all i ≥ N + 1

1 ≤ k ≤ d − 1

Here k is the number of crossed out cells in each column when it is being considered. k = 0 is the trivial case where all ai = 1 from the beginning of the sequence.

From now on we only consider columns after N. Since sequence ci stays constant, if cell 1 is available when we want to select a cell from a column, it must be selected. Now think about diagonals. A diagonal D is defined as the set of cells such that i + ai = D. From (2), there can only be at most one cell marked per diagonal. We say a diagonal is marked if one of its cells is. Now, when we are forced to select cell 1 in column i, this is the last chance that any cell in diagonal i + 1 gets marked. This means that no diagonals starting from diagonal N + 2 will be left unmarked permanently, because if there's any permanently unmarked diagonal y ≥ N + 2, it should've been marked as we see column y − 1 ≥ N + 1.

Consider any consecutive columns from m + 1 to n where m + 1 ≥ N + 1. At column m + 1, there are already k cells crossed out. Let's say they're in diagonals y1, y2, …, yk where all yi are between m + 2 and m + d (m + 1 + d can never be marked when column m is finished). As we finish column n, diagonal n + 1 is already marked, and so is diagonal n, n − 1… etc. Therefore, all diagonals in [m + 2, n + 1] are marked at this point, plus k more in [n + 2, n + d], which we denote by z1, z2, …, zk.

Finally, since ai = Di − i where Di is the diagonal index of the selected cell in column i, we can do substitution:

nj = m + 1(aj − b)
= nj = m + 1(Dj − j − b)
= n + 1j = m + 2j − nj = m + 1(j + b) + z1 − y1 + z2 − y2 + … + zk − yk
= (n − m)(1 − b) + z1 − y1 + z2 − y2 + … + zk − yk

Note that

(m + 2) + … + (m + 1 + k) ≤ ki = 1yi ≤ (m + d − k + 1) + … + (m + d)

(n + 2) + … + (n + 1 + k) ≤ ki = 1zi ≤ (n + d − k + 1) + … + (n + d)

So the quantity of interest lies in [(n − m)(1 − b) + (n − m − d + k + 1)k, (n − m)(1 − b) + (n − m + d − k − 1)k]

Let b = k + 1, these bounds become (k − (d − 1))k and (d − 1 − k)k, whose absolute values are bounded by (d − 1)2 ⁄ 4 for an odd d.

Q.E.D.

The result b = k + 1 also aligns well with intuition. For large k, many lower cells tend to be crossed out and we inevitably have to select a high one. With a small k, we could not pick too many high cells because cell 1 has to be marked whenever it's available.

References: