[Algorithm] Insane DFS

The hardest problem in the latest weekly HackerRank challenge looks like a problem of graph theory, but only at the superficial level. It is not hard to tell that the "suitable array" defined by DFS traversal throughout a rooted tree is equivalent to all and only arrays a[] satisfying these constraints:

a[0] = 0, a[1] = 1 if N > 1
For n > 0, a[n] ≤ a[n - 1] + 1

Let's start with DP. Let D[x][y] be the number of arrays a[0 ... x] with a[x] ≥ y. Then, the number of arrays a[0 ... x] such that a[x] == y is exactly D[x - 1][y - 1] because a[x - 1] must be no less than y - 1. Thus we have relation

D[x][y] = D[x - 1][y - 1] + D[x][y + 1] (*)

with some boundary conditions, including D[x][0] = D[x][1]. For simplicity we can append a[N] = 1 in order to find the final answer. Suppose a[x₀] and a[x₁] are the only fixed numbers in [x₀, x₁], the core task is to compute f(x₀, y₀, x₁ - 1, y₁ - 1) = D[x₁ - 1][a[x₁] - 1] from D[x₀][0] = D[x₀][1] = ... = D[x₀][a[x₀]] = 1, and the final answer will be the product of all such f(x₀, y₀, x₁, y₁) modulo M.

With initial observations, I almost felt f(x₀, y₀, x₁, y₁) has binomial form C(a, b), although it doesn't because of boundary condition D[x][0] = D[x][1]. After some calculations, I noticed the nice relation

f(x₀, y₀, x₁, y₁) = C(2W - H, W) - C(2W - H, W - y₁ - 1)

where W = x₁ − x₀, H = y₁ − y₀. The proof is actually mathematically very interesting.

Imagine you're on an infinite xy-plane and stand at cell (0, y₀). Relation (*) says you can only go in two directions (1, 1) or (0, -1). Denote them by step A and B. Then D[x₁][y₁] is the number of shortest path you may take to get to cell (x₁ − x₀, y₀), and the constraint D[x][0] = D[x][1] implies you are not allowed to go below x-axis. Now, C(2W - H, W) is the number of shortest paths you could take without this contraint. This is because to go from (0, y₀) to (x₁ − x₀, y₁), the number of A step you need is exactly x₁ − x₀ = W since only A step makes x-direction move. The number of required B step to count for y-direction move is hence y₀ − y₁ + W = W - H. We could also transform the xy-plane to x'y'-plane with x' = x and y' = y - x. There A step become horizontal step (1, 0) and B step is vertical downward (0, -1), i.e. everything is square and not tilted.

We now have to show C(2W - H, W - y₁ - 1) is number of shortest paths that goes under x-axis in xy-plane. Going under x-axis is the same as stepping into zone x' + y' < 0 since y = x + y' = x' + y'. What is the number of such shortest paths? For illustration we draw an example of y₀ = 2, W - H = 6, x₁ − x₀ = 13 in x'y'-plane below, where the goal is to go from upper left corner to lower right corner by stepping in at least one '*' cell, i.e. all '*' cells have x' + y' < 0.

(0, 2)
      oooooooooooooo
      oooooooooooooo
(0, 0)oooooooooooooo
      *ooooooooooooo
      **oooooooooooo
      ***ooooooooooo
      ****oooooooooo (13, -4)

Here's the magic: we can 'fold' the plane along x' + y' = 0 to get this:

(0, 2)
      oooooooooooooo
      oooooooooooooo
(0, 0)oooooooooooooo
      *ooo----------
      **oo----------
      ***o----------
      ****---------- (13, -4)
      ++++
      ++++
      ++++
      ++++
      ++++
      ++++
      ++++
      ++++
      ++++
      ++++ (3, -14)

Fold all the '-' cells to the '+' cells shown above. There's then a 1-1 mapping between the original shortest paths from (0, 2) to (13, -4) touching at least one '*' cell and the shortest paths from (0, 2) to (3, -14), and so the number of shortest paths is C(19, 3). To generalize, with some counting we can thus prove that it's C(2W - H, W - H + 1 - (y₀ + 1) - 1) = C(2W - H, W - y₁ - 1)

Overall, this solution takes O(N) computation of binomial coefficient C(a, b). Assuming each C(a, b) requires O(b)log(M) where M is 10⁹ + 7 and b = O(W), each boils down to O(W)log(M). The total time complexity is then O(N)log(M) and space complexity O(1).

post by Shen-Fu Tsai