Maybe I am missing something very easy and obvious.

But, I don’t understand why estimate cost of source vertex is subtracted from the overall estimate cost, if heuristic function $ h$ is monotonic: $ $ d′(x,y)=d(x,y)+h(y)−h(x)$ $

What I currently know:

A* algorithm can be used as an extension for Dijkstra’s algorithm. At each iteration of its main loop, it chooses the vertex with the minimum of estimation cost plus cost of the path to this vertex:

For vertex $ u$ and its successor $ v$ , overall cost is calculated with $ $ f(u, v) = d(u, v) + h(v)$ $ using some heuristic function $ h$ . Where:

- $ d(u,v)$ cost of the path from $ u$ to $ v$
- $ h(v)$ estimate cost from $ v$ to the target vertex $ t$

If for any adjacent vertices $ u$ and $ v$ , it is true that $ $ h(u) <= d(u, v) + h(v)$ $ then $ h$ is a monotonic. In other words, graph holds triangle inequality property.

It is stated in Wiki page of A* algorithm:

If the heuristic h satisfies the additional condition $ h(x) ≤ d(x, y) + h(y)$ for every edge $ (x, y)$ of the graph (where d denotes the length of that edge), then h is called monotone, or consistent. In such a case, A* can be implemented more efficiently—roughly speaking, no node needs to be processed more than once (see closed set below)—and A* is equivalent to running Dijkstra’s algorithm with the reduced cost $ d'(x, y) = d(x, y) + h(y) − h(x)$ .

My questions are:

and A* is equivalent to running Dijkstra’s algorithm with the reduced cost $ d'(x, y) = d(x, y) + h(y) − h(x)$ .

Any proof for this equivalence ?

It is clear for me that $ 0 <= d(x, y) + h(y) – h(x)$ , and it is feasible. But:

- Why this formula is chosen as a new distance function ?
- Is there any formal proof that it works ?
- Why it is not enough to run Dijkstra with $ d'(x, y) = d(x, y) + h(y)$ ?
- What is the math behind it ?