Stochastic OR

4.1 Rate, Stability and Load

In this section, we define the load and stability of a queueing system, which is, arguably, the most important performance measure of such a system. We first develop the concepts for the \(G/G/1\) queue, then we consider the \(G^{B}/G^{C}/c\) queue.

We start with defining the arrival rate and departure rate as the long-run average number of jobs that arrive and depart along a sample path:

\begin{align*} \lambda &= \lim_{t\to\infty} \frac{A(t)}t, & \delta &= \lim_{t\to\infty} \frac{D(t)}t, \tag{4.1.1} \end{align*}

where \(A(t)\) and \(D(t)\) are discussed in Section 3.1. We assume that both limits exist1 The first limit does not necessarily exist if \(A(t)\) is some pathological function. , are finite, and \(0<\delta \leq \lambda<\infty\).

For the arrival rate, suppose we are given a sequence of iid inter-arrival times \(X_k \sim X\). By the strong law of large numbers, \(A_n/n = n^{-1}\sum_{k=1}^{n}X_{k} \stackrel{a.s.}\to \E X\) as \(n\to \infty\) when \(\E X < \infty\). The next theorem relates \(\E X\) to the limit of \(A(t)/t\) as \(t\to \infty\).

Th. 4.1.1

Assume that \(\lambda \in (0, \infty)\). Then, the existence of one of the next two limits implies the existence of the other, in which case,

\begin{equation*} \frac{A_{k}}{k} \to \lambda^{-1}>0 \iff \frac{A(t)}{t} \to \lambda >0. \end{equation*}
Proof

\(\implies:\) \(A_{A(t)}\) is the arrival time of the last job before time \(t\) and \(A_{A(t)+1}\) is the arrival time of the first job after time \(t\). Therefore \(A_{A(t)} \leq t < A_{A(t)+1}\) and, for \(A(t)>0\),

\begin{equation*} \frac{A_{A(t)}} {A(t)} \leq \frac{t}{A(t)} <\frac{A_{A(t)+1}}{A(t)} = \frac{A_{A(t)+1}}{A(t)+1}\frac{A(t)+1}{A(t)}. \end{equation*}

For the LHS, observe that the assumption \(A_{k}/k \to \lambda^{-1} > 0\) as \(k\to \infty\) implies that \(A_{k} \to \infty\). This in turn implies that \(A(t) := \max\{k : A_{k} \leq t\} \to \infty\) as \(t\to \infty\). Next, if \(k\) and \(t\) are such that \(A(t)= k\) then2 TF 3.1.4. \(A_{A(t)} = A_{k}\), hence, \(A_{A(t)}/A(t) = A_k/k\). Consequently, \(\lim_{t\to \infty} A_{A(t)+1}/(A(t)+1) = \lim_{t\to \infty} A_{A(t)}/A(t) = \lim_{k\to \infty} A_{k}/ k = \lambda^{-1}\). Finally, since \(A(t) \to \infty\) implies that \((A(t)+1)/A(t) \to 1\), we conclude that the LHS and RHS both converge to \(1/\lambda\).

\(\Longleftarrow\): If \(A(t)/t \to \lambda\), then at the epochs \(A_{k}\) we also have \(A(A_k)/A_k \to \lambda\). (Observe that the epochs \(\{A_{k}\}\) are just a subset of \(\R^{+}\) while \(t\) takes any value in \(\R^{+}\).)3 Recall: \(A(t)/t \to \lambda\) as \(t\to\infty\) \(\iff\) \(\forall \epsilon > 0: \exists s: \forall t > s : |A(t)/t - \lambda| < \epsilon\). Noting that \(A(A_{k}) = k\), it follows that \(k/A_{k} \to \lambda\).

Thus, from this theorem it follows that

\begin{equation*} \E X = \lim_{n\to\infty} \frac{1}n\sum_{k=1}^n X_k = \lim_{n\to\infty} \frac{A_{n}}{n} = \lim_{t\to\infty} \frac t{A(t)} = \frac 1 \lambda. \end{equation*}

For the service rate of a single server, let \(S_k\) denote the required service time of the \(k\)th job served, so that \(U_n = \sum_{k=1}^n S_k\) is the total service time of the first \(n\) jobs. Similarly to the definition of \(A(t)\), we let \( U(t) = \max\{k: U_k \leq t\}\) and define the service, or processing, rate of a single server as

\begin{equation*} \mu := \lim_{t\to\infty} \frac{U(t)}t, \quad 0 < \mu < \infty, \end{equation*}

assuming that this limit exists. This provides us with the relation

\begin{equation*} \E S = \lim_{n\to\infty} \frac 1 n \sum_{k=1}^n S_k = \lim_{n\to\infty} \frac{U_n}{n} = \lim_{t\to\infty} \frac t{U(t)} = \frac 1 \mu. \end{equation*}

Hence, the service rate \(\mu\) of a single server equals \(1/\E S\).

Given the arrival and service rate, we define for the single-server queue the

\begin{align*} \textrm{Server load } = \rho := \lambda \E S=\frac{\lambda}{\mu} = \frac{\E S}{\E X}, \end{align*}

and the

\begin{align*} \textrm{Server utilization } &= \delta \E S = \frac{\delta}{\mu}. \end{align*}

We need to distinguish between these two concepts either when \(\lambda> \mu\) or when arriving customers are blocked. In the first situation, the queue in front of the server increases without bound so that \(\lambda > \mu = \delta\).4 Of course, the departure rate can never exceed the service rate. In the second situation, the blocked jobs do not enter the system and therefore are not served, so \(\lambda > \delta\). Bear in mind that the load can exceed 1 (when \(\lambda > \mu\)), while the utilization is always \( \leq 1\).

In general we say that a system is rate-stable if

\begin{equation*} \lambda = \delta, \end{equation*}

in words, the system is rate-stable whenever in the long run jobs leave the system just as fast as they enter the system. In this case, load and utilization are equal; we use these terms interchangeably for rate-stable systems. Rate-stability requires that \(\lambda \leq \mu\).

It is easy to generalize the definition of the load to the \(G^{B}/G^{C}/c\) queue. When each batch contains \(\E B\) jobs each of which requires \(\E S\) time to serve, the expected work per job is \(\E B \E S\). Similarly, when a server serves, on average, a batch of \(\E C\) jobs per service time, its service rate is \(\E C/\E S\). Finally, when the station contains \(c\) identical servers, the station's capacity is \(c\) times as large as that of a single server. Hence, in this more general case,

\begin{equation*} \rho := \frac{\lambda \E S}c \frac{\E B}{\E C}. \tag{4.1.2} \end{equation*}

The definition of utilization changes accordingly.

We can use the memoryless property of the inter-arrival times in the \(M/G/1\) queue to show that the expected idle and busy times5 The busy time of a server is the time between an arrival to an empty system until the server becomes idle again. of the \(M/G/1\) satisfy

\begin{align*} \E I &= \frac1\lambda & \E \U &= \frac{\E S}{1-\rho}. \tag{4.1.3} \end{align*}

For this, observe that since inter-arrival times are memoryless in the \(M/G/1\) queue, the expected time to the next arrival is \(1/\lambda\). For the busy time, observe that during the service time of the customer starting a busy time6 This customer necessarily arrives at an empty system. , an expected number \(\lambda \E S\) of new jobs arrives. As each of these jobs generates a busy period of its own, each of which is independent and has the same expected duration \(\E \U\), the expected busy time started by the first job must be equal to the expected service time of this job plus the expected busy periods generated by jobs arriving during this service time; hence \(\E \U = \E S + \lambda \E S \E \U\). But from this, \(\E \U = \E S / (1-\rho)\), because \(\rho = \lambda \E S\).

TF 4.1.1

Claim: If for a queueing system \(\lim_{t\to\infty}A(t) = \infty\), then the system is not rate stable.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. In any reasonable queueing system this limit is \(\infty\).

TF 4.1.2

Claim: \(\lim_{t\to\infty}\frac{A(t)}{t}=\lim_{n\to\infty}\frac{A_n}{n}\).

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False.

TF 4.1.3

Claim: \(\lambda = \delta \implies \rho = 1\), so the queue is not stable.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. \(\delta\) differs from \(\mu\).

TF 4.1.4

Claim: A queueing system with \(\lambda \E S > 1\) and a single server can still be rate-stable if jobs are sometimes rejected.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

True. By blocking or rejecting jobs, the effective arrival rate can be reduced below the service rate, making the system rate-stable.

TF 4.1.5

Claim: The load \(\rho = \lambda \E S / c\) of a queueing system with \(c\) servers is always at most \(1\).

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. The load can exceed \(1\); it is the utilization that is at most \(1\). When \(\rho > 1\) the system is not rate-stable.

Opt 4.1.6

Can you make an arrival process such that \(A(t)/t\) does not have a limit?

Hint

As a start, the function \(\sin(t)\) does not have a limit as \(t\to\infty\). However, the time-average \(\sin(t)/t \to 0\). Now construct a function whose time-average does not converge; it must grow fast or oscillate increasingly wildly.

For the mathematically interested, we seek a function whose Ces\`aro limit does not exist.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

If \(A(t) = 3 t^2\), then clearly \(A(t)/t = 3t\). This does not converge to a limit.

Another example, let the arrival rate \(\lambda(t)\) be given as follows:

\begin{equation*} \lambda(t) = \begin{cases} 1 & \text{if } 2^{2k} \leq t < 2^{2k+1} \\ 0 & \text{if } 2^{2k+1} \leq t < 2^{2(k+1)}, \end{cases} \end{equation*}

for \(k=0,1,2,\ldots\). Let \(A(t) = \lambda(t) t\). Then \(A(t)/t\) fails to converge. Of course, these examples are quite pathological, and are not representative of `real life cases'. (Though this is somewhat vague: What is actually a real-life case?)