Stochastic OR

3.6 Discrete-Event Simulations: Examples

The discrete-event simulator for the \(G/G/c\) queue developed in the previous section can be easily adapted to other queueing systems. We first analyze the effects of several server speeds for the \(G/G/c\) queue. Then we implement a queue in which jobs can be served with priority.1 A hospital serves as an example. Next, we focus on a policy that controls the server based on the number of jobs in the system. Our final example is more extensive: we analyze whether servers dedicated to business customers in an airport check-in process should be shared with economy class customers. This would correspond to a part of a business consultancy project.

Consider a station with three servers with rates \(1\), \(0.1\), and \(0.01\), respectively, and such that whenever multiple servers are free, the fastest available one is chosen. If queue sizes are very long, then it could help to use the extra capacity offered by the two slower servers. However, when the load is small and a job ends up at the slowest server, its sojourn time is large2 Because its service time is large. . Should we actually use this extra capacity? To obtain insight into the value of the extra capacity, we compare this system with an \(M/M/1\) queue with service rate \(1\)3 In other words, a queue with no extra capacity.

We take \(\E S = 1/\mu = 1/4\) at first.

"""Simulate a multi-server queue with different speeds and compare
to an M/M/C queue."""

from collections import Counter

import numpy as np

import mmc
from job import Job
from simulator import Simulation


def run_multiple_speeds(labda=3, mu=4, server_rates=[1, 0.1, 0.01], num=10000):
    rng = np.random.default_rng(3)
    X = rng.exponential(scale=1 / labda, size=num)
    X[0] = 0
    A = X.cumsum()
    loads = rng.exponential(scale=1 / mu, size=num)
    jobs = [Job(id=i, arrival_time=A[i], load=loads[i]) for i in range(1, num)]

    sim = Simulation()
    sim.initialize_jobs(jobs)
    sim.initialize_servers(server_rates=server_rates)
    sim.run()

    mm1 = mmc.MMC(labda, mu, c=1)
    print(f"W: sim: {sim.stats.mean_waiting_time():0.2f}, exact: {mm1.EW:0.2f}")
    print(f"J: sim: {sim.stats.mean_sojourn_time():0.2f}, exact: {mm1.EJ:0.2f}")
    print(f"Q: sim: {sim.stats.mean_queue_length():0.2f}, exact: {mm1.EQ:0.2f}")

    count = Counter([j.free_servers for j in sim.stats])
    print(count)

Here is the output. We see that the average queue length is smaller in the multi-server station, but the sojourn time is longer. The counter shows that about \(91\) of the \(10000\) jobs are served by the slowest server, thereby providing the explanation.

Python
W: sim: 0.42, exact: 0.75
J: sim: 1.24, exact: 1.00
Q: sim: 1.27, exact: 2.25
Counter({0: 8772, 1: 1136, 2: 91})

Since the slow servers increase sojourn times, we may question whether to use them at all. Let us examine the output if the job load becomes quite a bit larger. We increase \(\E S\) from \(1/4\) to \(1/3.1\), so \(\rho\) increases from \(3/4\) to \(3/3.1\).

The new results make it clear that the extra servers do help; the expected waiting and sojourn go down as compared to the \(M/M/1\) queue, i.e., the queue without extra capacity. Interestingly, the fraction of jobs served by the fastest server is \emph{larger} than when \(\E S =1/4\).

Python
W: sim: 2.04, exact: 9.68
J: sim: 2.97, exact: 10.00
Q: sim: 6.17, exact: 29.03
Counter({0: 9615, 1: 357, 2: 27})

In conclusion, if the load is low, it is best to use only the fastest server. But when the load is high, queueing times form the larger part of the sojourn time, and it then becomes better to use the slower servers as well.

Different scheduling and priority rules are easy to implement by subclassing Job and overriding its __lt__ method. For instance, here is the code to implement LIFO4 LIFO = Last In First Out. , and SPTF5 SPTF = Shortest Processing Time First; a rule giving preference to smaller jobs. .

Python
from dataclasses import dataclass

from job import Job


@dataclass(eq=False)
class LiFOJob(Job):
    def __lt__(self, other):
        return self.arrival_time > other.arrival_time


@dataclass(eq=False)
class SPTFJob(Job):
    def __lt__(self, other):
        return self.load < other.load

In priority queues, higher priority is better, and only when jobs have the same priority, they should be sorted according to FIFO.6 A variation would be to provide jobs with due dates: the job with the smaller due date should be served first.

from dataclasses import dataclass

from job import Job


@dataclass(eq=False)
class PriorityJob(Job):
    priority: int = 0

    def __lt__(self, other):
        if self.priority == other.priority:
            return self.arrival_time < other.arrival_time
        else:
            return self.priority > other.priority

Here is how to use it in a simulation. In the simulation we give ten percent of the jobs a higher priority, and then we compare the results of the simulation to a regular \(M/M/1\) queue with the same service capacity but without job priorities.

"""The effect of priority service in multi-server queues."""

from collections import defaultdict

import numpy as np

import mmc
from priority_job import PriorityJob
from simulator import Simulation

num = 10000
labda, mu = 3, 3.2

rng = np.random.default_rng(3)
X = rng.exponential(scale=1 / labda, size=num)
X[0] = 0
A = X.cumsum()
loads = rng.exponential(scale=1 / mu, size=num)
priorities = rng.binomial(1, 0.1, size=num)
jobs = []
for i in range(1, num):
    job = PriorityJob(id=i, arrival_time=A[i], load=loads[i])
    job.priority = priorities[i]
    jobs.append(job)


sim = Simulation()
sim.initialize_jobs(jobs)
sim.initialize_servers(server_rates=[1])
sim.run()

mm1 = mmc.MMC(labda, mu, c=1)
print(f"W: sim: {sim.stats.mean_waiting_time():0.2f}, exact: {mm1.EW:0.2f}")
print(f"J: sim: {sim.stats.mean_sojourn_time():0.2f}, exact: {mm1.EJ:0.2f}")
print(f"Q: sim: {sim.stats.mean_queue_length():0.2f}, exact: {mm1.EQ:0.2f}")

wait = defaultdict(float)
nums = defaultdict(float)
for j in sim.stats:
    nums[j.priority] += 1
    wait[j.priority] += j.waiting_time
for k in sorted(nums.keys()):
    print(f"Priority: {k}, EW: {wait[k] / nums[k]:.2f}")
Python
W: sim: 4.82, exact: 4.69
J: sim: 5.14, exact: 5.00
Q: sim: 14.46, exact: 14.06
Priority: 0, EW: 5.34
Priority: 1, EW: 0.32

From the output we see that for the population as a whole, there is no difference in average waiting time7 Which is in accordance with our intuition. : there is hardly any difference in waiting time between the simulation and the exact model (without priorities). However, for the jobs with the higher priority class \(\E \W = 0.32\), while for the lower priority \(\E \W = 5.34\). That is, the lower priority jobs do wait somewhat (but not much) longer than in the case without priorities. We explain this by the fact that the fraction of jobs with higher priority is just 10%.

Implementing the \(N\) policy is now very simple8 See Ex 2.2.6. : we only have to subclass the Simulation class and override the handle_arrival method, and nothing else! To see why, realize that in Simulation.handle_departure when there are jobs in queue and a job departs, a new job is assigned to the server; this does not depend on the state of the server. Fig. shows a sample path of the queue length process.

"""Queueing behavior under an N policy."""

import numpy as np

from job import Job
from simulator import Simulation


class N_policy(Simulation):
    def __init__(self, thres_N):
        self.thres_N = thres_N
        super().__init__()

    def handle_arrival(self, job: Job):
        job.q_length_at_arrival = self.queue.length()
        self.queue.push(job)
        if (
            self.queue.length() >= self.thres_N
            and self.servers.is_server_available()
        ):
            self.serve_job(self.queue.pop(), self.servers)
n_policies.png

The check-in process at an airport is a somewhat more involved queueing system, because at a check-in desk business and economy class customers typically have their own set of servers. Here, we are going to use simulation to compare different ways to organize the server allocation. One policy is like this: some of the servers are strictly allocated to business class customers, and the rest of the servers are strictly used for the economy class customers. Another policy is that if a business server is free, this server accepts just one economy customer for service9 Assuming there is such a customer in queue. . When, after this service, there is still no business customer, the business server serves a second economy customer. Once a business customer arrives, the business server finishes the economy customer10 If one is in service. , and then serves the business customer.11 Note that this system is not identical to a priority queue. When the business servers are shared, business customers are served with priority; however, the economy servers are not shared. Which of these two policies is best?

Assume that a flight departing at 11 am consists of \(300\) economy class and 100 business class people. For ease, assume that all \(N\) customers arrive between 8 and 10, and that the desks are open from 8 to 10h30 to serve the demand.

Clearly, this is a transient queueing process that ends when all jobs are served. As this system is mathematically intractable, we only have simulation at our disposal.

We first load the relevant modules.

"""Queueing at an airport check-in desk with business and economy customers."""

from dataclasses import dataclass

import numpy as np

import events
import job
import queues
import server
import servers
import simulator
import stats

Since jobs and servers now are of the business or economy class, we subclass Job and add the relevant attribute.

ECONOMY = 0
BUSINESS = 1


@dataclass
class Job(job.Job):
    type: int = ECONOMY


@dataclass
class Server(server.Server):
    type: int = ECONOMY

In the simulator we configure separate queues and servers for the economy and the business customers. The boolean argument share indicates whether the business servers have to be shared or not. When handling arrivals and departures, we need to take the type of job and server into account. For the rest, we can subclass from the Simulation class.

class Simulation(simulator.Simulation):
    def __init__(self, num_e_servers, num_b_servers, share=False):
        self.share = share
        self.events = events.Events()
        self.stats = stats.Statistics()
        self.now = 0
        # ECONOMY
        self.e_queue = queues.Queue()
        self.e_servers = servers.Servers()
        for i in range(num_e_servers):
            self.e_servers.push(Server(id=i, rate=1, type=ECONOMY))
        # BUSINESS
        self.b_queue = queues.Queue()
        self.b_servers = servers.Servers()
        for i in range(num_b_servers):
            self.b_servers.push(Server(id=i, rate=1, type=BUSINESS))

    def handle_arrival(self, job):
        if job.type == ECONOMY:
            if self.e_servers.is_server_available():
                self.serve_job(job, self.e_servers)
            elif self.share and self.b_servers.is_server_available():
                self.serve_job(job, self.b_servers)
            else:
                self.e_queue.push(job)
        elif job.type == BUSINESS:
            if self.b_servers.is_server_available():
                self.serve_job(job, self.b_servers)
            else:
                self.b_queue.push(job)
        else:
            raise ValueError("Unknown job type")

    def handle_departure(self, job):
        self.stats.push(job)
        if job.server.type == ECONOMY:
            self.e_servers.push(job.server)
            if not self.e_queue.is_empty():
                self.serve_job(self.e_queue.pop(), self.e_servers)
        elif job.server.type == BUSINESS:
            self.b_servers.push(job.server)
            if not self.b_queue.is_empty():
                self.serve_job(self.b_queue.pop(), self.b_servers)
            elif self.share and not self.e_queue.is_empty():
                self.serve_job(self.e_queue.pop(), self.b_servers)
        else:
            raise ValueError("Unknown job type")

The next step is to configure the jobs and define the inter-arrival times and the time the desks have to be open to serve all customers.

rng = np.random.default_rng(3)
check_in_window = 120  # minutes
desks_open = check_in_window + 30


# Make ECONOMY jobs
num = 300
A = np.sort(rng.uniform(low=0, high=check_in_window, size=num))
loads = rng.exponential(scale=2, size=num)
e_jobs = [
    Job(id=i, arrival_time=A[i], load=loads[i], type=ECONOMY)
    for i in range(1, num)
]


# BUSINESS jobs
num = 100
A = np.sort(rng.uniform(low=0, high=check_in_window, size=num))
loads = rng.exponential(scale=2, size=num)
# loads = rng.uniform(low=1, high=3, size=num)
b_jobs = [
    Job(id=i, arrival_time=A[i], load=loads[i], type=BUSINESS)
    for i in range(1, num)
]

It is evident that we need to determine the required number of servers \(c\) such that the performance is acceptable. The following code prints the total capacity and the demand so that we can check whether the available server capacity suffices. It turns out that the capacities suffice to get the work done.

num_e_servers, num_b_servers = 4, 2

e_work = sum(j.load for j in e_jobs)
e_service_available = num_e_servers * desks_open
print(f"Check e capacity: {e_work=:.0f}, {e_service_available=:.0f}")
b_work = sum(j.load for j in b_jobs)
b_service_available = num_b_servers * desks_open
print(f"Check b capacity: {b_work=:.0f}, {b_service_available=:.0f}")
tot_work = e_work + b_work
tot_service_available = (num_e_servers + num_b_servers) * desks_open
print(f"Check b capacity: {tot_work=:.0f}, {tot_service_available=:.0f}")
Python
Check e capacity: e_work=553, e_service_available=600
Check b capacity: b_work=193, b_service_available=300
Check b capacity: tot_work=747, tot_service_available=900

We can run the simulation.

sim = Simulation(num_e_servers, num_b_servers, share=False)
sim.initialize_jobs(e_jobs)
sim.initialize_jobs(b_jobs)
sim.run()

And we can print the KPIs. It is reasonable12 Within this context, that is. that besides the mean waiting time the largest waiting time is important. Apparently, all customers are served in time, but the waiting times of the economy class customers are quite long.

def mean(iterable):
    "Compute mean of values in an iterable."
    total = count = 0
    for value in iterable:
        total += value
        count += 1
    if count == 0:
        raise ValueError("iterable is empty.")
    return total / count


print(f"Last departure e job: {e_jobs[-1].departure_time:.2f}")
print(f"Last departure b job: {b_jobs[-1].departure_time:.2f}")
print(f"Largest waiting time e job: {max(j.waiting_time for j in e_jobs):.2f}")
print(f"Largest waiting time b job: {max(j.waiting_time for j in b_jobs):.2f}")
print(f"Overall mean waiting time: {sim.stats.mean_waiting_time():.2f}")
print(f"mean waiting e jobs: {mean(job.waiting_time for job in e_jobs):.2f}")
print(f"mean waiting b jobs: {mean(job.waiting_time for job in b_jobs):.2f}")
Python
Last departure e job: 146.96
Last departure b job: 135.60
Largest waiting time e job: 25.17
Largest waiting time b job: 7.47
Overall mean waiting time: 8.17
mean waiting e jobs: 10.51
mean waiting b jobs: 1.09

One way to reduce the waiting times for the economy class is by opening an extra economy desk, that is, use \(5+2\) desks instead of \(4+2\).13 This comes at the expense of hiring an extra server: 7 instead of 6. The results show that this decision has a large impact on the waiting time.

Python
Last departure e job: 126.90
Last departure b job: 135.60
Largest waiting time e job: 8.32
Largest waiting time b job: 7.47
Overall mean waiting time: 2.36
mean waiting e jobs: 2.78
mean waiting b jobs: 1.09

However, rather than paying for this extra desk, we can share the business class servers and keep using \(4+2=6\) desks. Clearly, the business customers are hardly affected, but the mean waiting time of the economy class is reduced by a factor two, as compared to the situation without server sharing and 6 servers.

Python
Last departure e job: 134.48
Last departure b job: 137.10
Largest waiting time e job: 13.61
Largest waiting time b job: 7.58
Overall mean waiting time: 4.59
mean waiting e jobs: 5.64
mean waiting b jobs: 1.43

To conclude, adding extra service capacity has the largest effect on waiting time, but server sharing can also be an effective strategy.

TF 3.6.1

Claim: this class implements a sorting rule in which jobs with the same priority are served in LIFO order.

Python
class PriorityJob(Job):
    priority: int = 0

    def __lt__(self, other):
        if self.priority == other.priority:
            return self.arrival_time < other.arrival_time
        else:
            return self.priority > other.priority
Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. Compare with the example code.

TF 3.6.2

Claim: This method of the simulation class switches on the server when the system contains thres_N jobs or more.

Python
def handle_arrival(self, job: Job):
    job.q_length_at_arrival = self.queue.length()
    if (
        self.queue.length() >= self.thres_N
        and self.servers.is_server_available()
    ):
        self.serve_job(self.queue.pop(), self.servers)
    else:
        self.queue.push(job)
Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. It waits until the system contains thres_N plus one jobs.

TF 3.6.3

Consider the check-in process at an airport when the business servers are shared with the economy customers. Claim: This method of the simulation class handles the arrivals correctly.

Python
def handle_arrival(self, job):
    if job.type == ECONOMY:
        if self.e_servers.is_server_available():
            self.serve_job(job, self.e_servers)
        elif self.share:
            self.serve_job(job, self.b_servers)
        else:
            self.e_queue.push(job)
Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. Compare with the provided code in the main text. There is no check on the availability of the business servers.

TF 3.6.4

Claim: In a discrete-event simulation of a multi-server queue, the server with the shortest remaining service time should always be chosen for new arrivals.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

False. A common policy is to assign the arriving job to the server that becomes free earliest, i.e., the one with the smallest workload. This minimizes the waiting time, not the remaining service time.

TF 3.6.5

Claim: In a priority queue, jobs with higher priority always have a shorter expected sojourn time than jobs with lower priority.

Solution
Did you actually try? Maybe see the 'hints' above!:
Solution, for real

True, provided the system is stable. Higher priority jobs are served before lower priority jobs, so their expected waiting time and hence sojourn time is shorter.