Understanding Gillman’s proof of the Chernoff bound for expander graphs

My question is about Claim 1 in the proof here: Gillman (1993). At the end of the proof, the author says:

The matrix product $ U^\top\sqrt{D^{-1}}(P+(\mathrm{e}^x-1)B(0)-\mu I)\sqrt{D}U$ , which is equal to $ (D’-\mu I)(I+(D’-\mu I)^{-1}(\mathrm{e}^x-1)D’U^\top D_A U)$ , is singular. Therefore,

\begin{align*} 1&\leq \lVert (D’-\mu I)^{-1}(\mathrm{e}^x-1)D’U^\top D_A U \rVert_2 \ &\leq \frac{1}{\mu-\lambda_2}(\mathrm{e}^x -1). \end{align*}

(The first inequality uses the continuity of the function $ \lambda_2(y)$ .)

I understand why the two expressions at the beginning are equal and I understand the second inequality, but I do not understand the first inequality and also why the matrix product is singular.

Let me provide the definitions so you can avoid reading the whole paper. There is a weighted undirected graph $ G=(V, E, w)$ where $ w_{ij}=0$ if $ \{i,j\}\notin E$ . Denote $ w_i:=\sum_j w_{ij}$ . Let $ P$ denote the transition matrix, so $ P_{ij}:=\frac{w_{ij}}{w_i}$ . Denote by $ \lambda_2$ the second largest eigenvalue of $ P$ and by $ \epsilon:=1-\lambda_2$ the spectral gap. Next, let $ M$ be the weighted adjacency matrix $ M_{ij}:=w_{ij}$ . Let $ A$ be a set of vertices and $ \chi_A$ be an indicator function. Some more definitions are:

\begin{align*} &E_r:=\operatorname{diag}(\mathrm{e}^{r\chi_A}) & &P(r):=PE_r \ &D:=\operatorname{diag}(\frac{1}{w_i}) & &S:=\sqrt{D}M\sqrt{D} \ &S_r:=\sqrt{DE_r}M\sqrt{DE_r} & & B(r):=\frac{1}{\mathrm{e}-1}(P(r+1)-P(r)) \end{align*}

Moreover, since $ S$ is unitarily diagonalizable, there exist a unitary matrix $ U$ and a diagonal matrix $ D’$ such that $ D’=U^\top SU$ . Furthermore, there exists a diagonal matrix $ D_A$ such that $ B(0)=PD_A$ .

Define $ \lambda(r)$ to be the largest eigenvalue of $ P(r)$ and $ \lambda_2(r)$ to be its second largest eigenvalue. As before, $ \epsilon_r := \lambda(r)-\lambda_2(r)$ is the spectral gap.

In Claim 1 the author lets $ 0\leq x\leq r$ be some number. He also defines $ \mu<\lambda(x)$ to be any other eigenvalue of $ P(x)$ . At the end of the proof, we are only interested in $ \mu>\lambda_2$ .

Some other facts are:

\begin{align*} &\lVert D’ \rVert_2 = \lVert D_A \rVert_2 = 1 & &D’=U^\top\sqrt{D^{-1}}P\sqrt{D}U \ &P(0)=P & &\lambda(0)=1 & &\lambda_2(0)=\lambda_2 \ &P=\sqrt{D}S\sqrt{D^{-1}} & &P(r)=\sqrt{DE_r^{-1}}S_r\sqrt{E_rD^{-1}} \end{align*}

I hope I didn’t miss anything relevant. Thank you for your help.