Stochastic OR

7.4 On \(\lambda = \gamma + \lambda P\)

In this final section, we concentrate on finding a general condition to ensure that \(\lambda = \gamma + \lambda P\) has a solution.

To start, we define iteratively \(\lambda^0 = 0\), \(\lambda^n = \gamma + \lambda^{n-1}P\), for \(n\geq 1\). Then, by substitution, we find

\begin{align*} \lambda^n = \gamma + \lambda^{n-1} P = \gamma + (\gamma + \lambda^{n-2}P) P = \cdots = \gamma \sum_{m=0}^{n-1} P^m, \end{align*}

where \(P^0\) is the identity matrix. Since \(P\) is a non-negative matrix, \(P^n\) is also non-negative for any \(n\); hence \(\sum_{m=0}^n P^m\) is a monotone increasing (formally, non-decreasing) sequence of matrices. Thus, for \(\lambda=\gamma + \lambda P\) to have a (unique) solution, this sum must remain finite as \(n\to \infty\).
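As a numerical sanity check, the successive substitution scheme \(\lambda^n = \gamma + \lambda^{n-1}P\) can be run directly; the 3-station routing matrix and arrival-rate vector below are made up for illustration, not taken from the text.

```python
import numpy as np

# Hypothetical 3-station routing matrix; all row sums are below 1,
# so jobs eventually leave the network.
P = np.array([[0.0, 0.5, 0.2],
              [0.1, 0.0, 0.3],
              [0.2, 0.2, 0.0]])
gamma = np.array([1.0, 2.0, 0.5])  # external arrival rates (made up)

# Successive substitution: lambda^n = gamma + lambda^{n-1} P.
lam = np.zeros_like(gamma)
for _ in range(200):
    lam = gamma + lam @ P

# Compare with the direct solution of lambda (I - P) = gamma.
lam_direct = gamma @ np.linalg.inv(np.eye(3) - P)
print(np.allclose(lam, lam_direct))  # True
```

The iteration converges here precisely because \(\sum_m P^m\) stays finite; for a matrix with a row sum of \(1\) and no exit path the loop would not settle on a fixed point.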

Now observe the similarity with the geometric series \(\sum_{i=0}^n x^i\), which converges to \(1/(1-x)\) as \(n\to \infty\) when \(|x| < 1\). Similarly, if there is an \(\epsilon>0\) such that \(0\leq P^m_{ij} \leq (1-\epsilon)^m\) for all \(i, j\), then \(0\leq \sum_{m=0}^n P_{ij}^m \leq \sum_{m=0}^n (1-\epsilon)^m \leq 1/\epsilon\).

It remains to find a condition to ensure that such an \(\epsilon\) exists. For this we discuss two methods.

From the previous section, we know that \(P_{i0} = 1-\sum_{j=1}^M P_{ij}\) is the probability that a job leaves the network from station \(i\). Suppose \(P_{i0}>0\), and consider a station \(k\) with \(P_{ki} > 0\). Then the probability that a job starts at \(k\) and leaves the network after a visit to \(i\) is \(P_{ki}P_{i0}>0\). Consequently, \(P^2_{k0} = \sum_{j=0}^M P_{kj}P_{j0} \geq P_{ki}P_{i0} > 0\).

More generally, we say that an \(M\times M\) matrix \(P\) is transient when it is possible to leave the network from any station in at most \(M\) steps. In other words, for any station \(j\) there is a sequence of intermediate stations \(j_1, j_2, \ldots , j_{M-1}\) such that \(P^{M}_{j0} \geq P_{j j_1}P_{j_1 j_2}\cdots P_{j_{M-1}0} > 0\). Using this, it is not hard to prove that \(P^n \leq (1-\epsilon)^n\) when \(P\) is transient; see Ex 7.4.2.
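Transience can be tested numerically by computing, for each station, the probability of leaving the network within \(M\) steps. The routing matrix below is a hypothetical example in which only one station has a direct exit, yet every station can still reach the outside.

```python
import numpy as np

# Hypothetical routing matrix among M = 3 stations; the "outside"
# (state 0) is implicit: P_{i0} = 1 - sum_j P_{ij}. Only station 2
# exits directly, but every station routes to it eventually.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.4, 0.0]])
M = P.shape[0]

exit1 = 1.0 - P.sum(axis=1)      # one-step exit probabilities P_{i0}
# Probability of leaving within M steps: sum over m of (P^m exit1)_i.
S = sum(np.linalg.matrix_power(P, m) for m in range(M))
exitM = S @ exit1

print(np.all(exitM > 0))  # True: the matrix is transient
```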

For the second method, let us make the simplifying assumption that \(P\) is a diagonalizable matrix with \(M\) different eigenvalues. (The argument below applies just as well to matrices reduced to Jordan normal form, but that adds only notational clutter.) In this case, there exists an invertible matrix \(V\) with the (left) eigenvectors of \(P\) as its rows and a diagonal matrix \(\Lambda\) with the eigenvalues of \(P\) on its diagonal such that \( V P = \Lambda V\). Hence, pre-multiplying with \(V^{-1}\), \( P = V^{-1}\Lambda V\). But then \(P^2 = V^{-1}\Lambda V \cdot V^{-1}\Lambda V= V^{-1}\Lambda^2 V\), and in general \(P^n = V^{-1}\Lambda^n V\). Clearly, if each eigenvalue has modulus less than \(1\), then \(\Lambda^n \to 0\) exponentially fast, hence \(P^n\to 0\) exponentially fast.
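This diagonalization argument is easy to check numerically for a made-up 3-station routing matrix. One caveat: `numpy.linalg.eig` returns *right* eigenvectors as the columns of \(V\), so the factorization reads \(P = V\Lambda V^{-1}\) rather than the left-eigenvector form used above; the conclusion \(P^n \to 0\) is the same.

```python
import numpy as np

# Hypothetical routing matrix with all row sums below 1 (transient),
# assumed here to have M distinct eigenvalues so it diagonalizes.
P = np.array([[0.0, 0.5, 0.2],
              [0.1, 0.0, 0.3],
              [0.2, 0.2, 0.0]])

w, V = np.linalg.eig(P)        # eigenvalues w, right eigenvectors in columns of V
print(np.max(np.abs(w)) < 1)   # True: all moduli below 1

# P^n = V diag(w^n) V^{-1}, so P^n -> 0 at rate (max |w|)^n.
n = 100
Pn = V @ np.diag(w**n) @ np.linalg.inv(V)
print(np.allclose(Pn, np.linalg.matrix_power(P, n)))  # True
print(np.allclose(Pn, 0, atol=1e-10))                 # True: exponential decay
```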

So, let us prove that all eigenvalues of a finite, transient routing matrix \(P\) have modulus less than \(1\). For this we use Gershgorin's disk theorem. Define the Gershgorin disk of the \(i\)th row of the matrix \(P\) as the disk in the complex plane:

\begin{equation*} B_i=\left\{z\in \mathbb{C}; |z-p_{ii}|\leq \sum_{j\neq i} |p_{i j}| \right\}. \end{equation*}

In words, this is the set of complex numbers that lie within a distance \(\sum_{j\neq i} |p_{ij}|\) of the point \(p_{ii}\). Next, assume for notational simplicity that \(\sum_{j} p_{i j}<1\) for each row \(i\) of \(P\) (otherwise apply the argument to \(P^M\)). Then this implies for all \(i\) that

\begin{equation*} 1> \sum_{j=1}^M p_{i j} = p_{ii} + \sum_{j\neq i} p_{i j}. \end{equation*}

Since all elements of \(P\) are non-negative, so that \(|p_{i j}| = p_{i j }\), it follows that

\begin{equation*} -1 < p_{ii} - \sum_{j\neq i} p_{i j} \leq p_{ii} + \sum_{j\neq i} p_{i j} < 1. \end{equation*}

With this, and using that \(p_{ii}\) is a real number (so that it lies on the real axis), it follows that the disk \(B_i\) lies strictly inside the unit disk \(\{z \in \mathbb{C}; |z|\leq 1\}\). As this applies to any row \(i\), the union \(\cup_{i} B_{i}\) also lies strictly inside the unit disk. Now we invoke Gershgorin's theorem, which states that all eigenvalues of the matrix \(P\) must lie in \(\cup_i B_i\), to conclude that all eigenvalues of \(P\) lie strictly inside the unit disk, hence all eigenvalues have modulus smaller than \(1\).
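The Gershgorin argument can be verified numerically for a hypothetical substochastic matrix: every disk sits inside the unit disk, every eigenvalue lands in some disk, and hence every eigenvalue has modulus below \(1\).

```python
import numpy as np

# Made-up substochastic matrix: non-negative entries, all row sums < 1.
P = np.array([[0.1, 0.5, 0.2],
              [0.1, 0.2, 0.3],
              [0.2, 0.2, 0.1]])

centers = np.diag(P)                 # disk centers p_ii
radii = P.sum(axis=1) - centers      # disk radii sum_{j != i} p_ij
print(np.all(centers + radii < 1))   # True: every disk inside the unit disk

eigs = np.linalg.eigvals(P)
# Gershgorin: each eigenvalue lies in at least one disk.
in_some_disk = [np.any(np.abs(z - centers) <= radii + 1e-12) for z in eigs]
print(all(in_some_disk))             # True
print(np.all(np.abs(eigs) < 1))      # True: all moduli below 1
```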

Ex 7.4.1

What is the routing matrix \(P\) for a tandem network with \(M\) stations? Show that \(P^M = 0\).

Hint

After visiting \(M\) stations, any job must have left the tandem network.

Solution

\(P_{i, i+1} = 1\) for \(i < M\), and \(P_{i j} = 0\) otherwise; a job at station \(M\) leaves the network. As \(P\) is a strictly upper-triangular matrix of dimension \(M\times M\), it is nilpotent, so \(P^M =0\).
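A small numerical check of this solution, for a tandem of \(M=4\) stations (the choice \(M=4\) is arbitrary):

```python
import numpy as np

# Routing matrix of a tandem of M = 4 stations: station i routes to
# station i+1 with probability 1; a job at the last station leaves.
M = 4
P = np.diag(np.ones(M - 1), k=1)   # P[i, i+1] = 1, all other entries 0

print(np.linalg.matrix_power(P, M - 1)[0, M - 1])  # 1.0: M-1 steps reach the last station
print(np.all(np.linalg.matrix_power(P, M) == 0))   # True: P is nilpotent, P^M = 0
```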

Ex 7.4.2

Show that when \(P\) is transient, \(P^n \leq (1-\epsilon)^n\) for some \(\epsilon > 0\).

Hint

Define \(Q= P^M\). As \(P\) is transient, \(\sum_{j=1}^M Q_{ij} < 1\) for all \(i\). Then show that \(Q^n \to 0\).

Solution

\(P\) transient \(\implies Q_{i0} = (P^M)_{i0} > 0 \implies \sum_{j=1}^M Q_{ij} < 1\) for all \(i\). Since \(M\) is a finite number, there exists an \(\epsilon>0\) such that \(\max_i \sum_{j=1}^M Q_{ij} < 1-\epsilon\). Writing \(\bf 1\) for the vector \((1,1,\ldots, 1)\), this means that \(Q{\bf 1} < (1-\epsilon) {\bf 1}\) element-wise. But then \( Q^{2} {\bf 1} = Q(Q{\bf 1}) \leq (1-\epsilon) Q{\bf 1} < (1-\epsilon)^2 {\bf 1}\). As \(Q\geq 0\) element-wise, we see that \(0\leq Q^n {\bf 1} < (1-\epsilon)^n {\bf 1}\). And then, for \(n\) a multiple of \(M\), \(P^n = Q^{n/M} < (1-\epsilon)^{n/M}\).
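The bound \(Q^n {\bf 1} < (1-\epsilon)^n {\bf 1}\) can be checked numerically; the transient routing matrix below is a made-up example in which only one station has a direct exit.

```python
import numpy as np

# Hypothetical transient routing matrix: stations 0 and 1 never exit
# directly, station 2 exits with probability 0.1.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.4, 0.0]])
M = P.shape[0]

Q = np.linalg.matrix_power(P, M)
row_sums = Q.sum(axis=1)
print(np.all(row_sums < 1))      # True: after M steps every station may have exited

eps = 1 - row_sums.max()         # eps > 0 because M is finite
ones = np.ones(M)
for n in range(1, 6):
    Qn1 = np.linalg.matrix_power(Q, n) @ ones
    assert np.all(Qn1 <= (1 - eps)**n * ones + 1e-12)
print("bound holds")
```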

Ex 7.4.3

Why does the assumption of Ex 7.4.2 not apply to an infinite transient matrix \(P\)?

Solution

The above argument is not necessarily valid for an infinite matrix \(P\), since \(\inf_i\{(P^{M})_{i0}\}\) need not be strictly positive: with infinitely many stations, the exit probabilities need not be bounded away from zero, so no single \(\epsilon>0\) works for all rows.