Queuing Theory

M/M/1 and M/M/c Queue Systems

MIN Sothearith

Institute of Technology of Cambodia

What is Queuing?

Queues Are Everywhere!

Daily Life Examples:

🏪 Supermarket checkout

🏥 Doctor’s office

📞 Customer service

🚗 Traffic signals

🍔 Fast food restaurants

🖨️ Print queue

💻 Server requests

✈️ Airport security

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e3f2fd','primaryTextColor':'#000','primaryBorderColor':'#1976d2','lineColor':'#1976d2','secondaryColor':'#fff3e0','tertiaryColor':'#c8e6c9','background':'#ffffff','mainBkg':'#ffffff','secondBkg':'#f5f5f5'}}}%%
graph LR
    A[Customers<br/>Arrive] --> B[Queue<br/>▓▓▓]
    B --> C[Server<br/>⚙️]
    C --> D[Customers<br/>Depart]
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style B fill:#fff3e0,stroke:#f57f17,stroke-width:2px
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px

Key Insight

Queuing Theory provides mathematical tools to analyze and optimize waiting line systems.

The Basic Queue System

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff','secondBkg':'#f5f5f5'}}}%%
graph LR
    A[Arrival Process<br/>λ = 30/hour] --> B{Queue<br/>Waiting Area}
    B --> C[Service Facility<br/>μ = 40/hour]
    C --> D[Departures]
    
    style A fill:#bbdefb,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style D fill:#f8bbd0,stroke:#c2185b,stroke-width:3px,color:#000

Input

Arrival rate (λ)
Random/scheduled
Independent arrivals

Queue

Waiting capacity
Discipline (FCFS, Priority)
Queue length

Service

Service rate (μ)
Number of servers
Service time distribution

Why Study Queues?

Customer Perspective

Key Questions:

How long will I wait?
How many ahead of me?
Total time until done?

Manager Perspective

Key Questions:

How many servers needed?
What’s the utilization?
Cost vs. service trade-off?

Optimizes cashier scheduling
Designs efficient drive-throughs
Reduces average wait time
Balances labor cost vs. service

ER staffing optimization
Reduces patient waiting times
Balances cost vs. care quality
Handles variable demand

Determines optimal agent count
Minimizes hold times
Maximizes utilization
Service level agreements

Mathematical Foundations

Poisson Process - Arrivals

Probability Distribution

\[P(N(t) = n) = \frac{(\lambda t)^n e^{-\lambda t}}{n!}\]

Parameters:

\(\lambda\) = average arrival rate
\(t\) = time interval
\(n\) = number of arrivals

Properties:

Memoryless
Independent increments
Mean = Variance = \(\lambda t\)

Example

Given: λ = 5/hour

Question: P(3 arrivals in 1 hr)?

Answer: 14.0%

Interpretation:

Random arrivals follow exponential inter-arrival times

Exponential Distribution - Service

PDF and CDF

Probability Density: \[f(x) = \mu e^{-\mu x}, \quad x \geq 0\]

Cumulative Distribution: \[F(x) = 1 - e^{-\mu x}\]

Key Properties:

Mean: \(E[X] = \frac{1}{\mu}\)
Variance: \(\text{Var}(X) = \frac{1}{\mu^2}\)
Memoryless property

Example

Given: μ = 8/hour

Results:

Mean service = 7.5 min
P(service ≤ 5 min) = 48.7%

Interpretation:

Service times are random but average to 1/μ

Birth–Death Process

The number of customers in the system changes one at a time:

arrivals increase the count by +1 at rate \(\lambda\)
service completions decrease the count by –1 at rate \(\mu\)

Balance Equations

At steady state, flow into each state equals flow out.

Between states 0 and 1:

\[ \lambda P_0 = \mu P_1 \]

Between states (n-1) and (n):

\[ \lambda P_{n-1} = \mu P_n \]

Recursive Form

From the balance equations:

\[ P_n = \frac{\lambda}{\mu}P_{\,n-1} \]

Let:

\[ \rho = \frac{\lambda}{\mu} \]

Then:

\[ P_n = \rho^n P_0 \]

Normalization

All probabilities add to 1:

\[ \sum_{n=0}^{\infty} P_n = 1 \]

Substitute the recursive form:

\[ P_0\sum_{n=0}^{\infty}\rho^n = 1 \]

Using the geometric series result (valid if \(\rho < 1\) ):

\[ P_0 = 1 - \rho \]

Final Result

\[ P_n = (1-\rho)\rho^n \]

This gives the long-run probability of having exactly (n) customers in the system.

State Probabilities

For ρ = 0.6

P₀ = 0.40 (Empty 40%)
P₁ = 0.24 (1 customer)
P₂ = 0.14 (2 customers)
P₃ = 0.09 (3 customers)
P₄+ = 0.13 (4+ customers)

Low utilization → mostly empty

For ρ = 0.9

P₀ = 0.10 (Empty 10%)
P₁ = 0.09 (1 customer)
P₂ = 0.08 (2 customers)
P₃₋₅ = 0.22 (3-5 customers)
P₆+ = 0.51 (6+ customers)

High utilization → often crowded

Key Insight: As ρ increases, probability of large queues increases dramatically!

Kendall’s Notation

A/B/C/D/E/F System

Full Notation

A = Arrival distribution

B = Service distribution

C = Number of servers

D = System capacity

E = Population size

F = Service discipline

Common Symbols

M = Markovian (exponential)
D = Deterministic (constant)
G = General distribution
GI = General Independent

Examples

M/M/1

Single server
Random arrival/service

M/M/c

Multiple servers
Random arrival/service

M/D/1

Single server
Constant service

M/M/1/K

Limited capacity K

M/M/1 System Breakdown

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff'}}}%%
graph LR
    A[M<br/>Markovian<br/>Arrivals] --> B[M<br/>Markovian<br/>Service]
    B --> C[1<br/>Single<br/>Server]
    
    A1[Poisson Process<br/>Rate λ<br/>Random timing] -.-> A
    B1[Exponential Times<br/>Rate μ<br/>Memoryless] -.-> B
    C1[One facility<br/>FCFS discipline<br/>Infinite capacity] -.-> C
    
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style C fill:#e8f5e9,stroke:#388e3c,stroke-width:3px,color:#000
    style A1 fill:#bbdefb,stroke:#1565c0,stroke-width:1px,color:#000
    style B1 fill:#ce93d8,stroke:#6a1b9a,stroke-width:1px,color:#000
    style C1 fill:#a5d6a7,stroke:#2e7d32,stroke-width:1px,color:#000

Critical Assumption

Stability Condition: \(\rho = \frac{\lambda}{\mu} < 1\)

Without this, the queue grows infinitely!

M/M/1 Performance Measures

The Four Key Metrics

System Metrics (Total Experience):

L = Average number in system = \(\frac{\lambda}{\mu-\lambda}\)
W = Average time in system = \(\frac{1}{\mu-\lambda}\)

Queue Metrics (Waiting Only):

Lq = Average number in queue = \(\frac{\lambda^2}{\mu(\mu-\lambda)}\)
Wq = Average wait time = \(\frac{\lambda}{\mu(\mu-\lambda)}\)

Little’s Law:

L = λW
Lq = λWq

System vs Queue

System (L, W)

Includes everything:

Waiting time
Service time
Total experience

Formula: \[W = W_q + \frac{1}{\mu}\]

Queue Only (Lq, Wq)

Excludes service:

Only waiting
Not being served
“Pain” customers feel

Formula: \[L = L_q + \rho\]

Formula Relationships

Starting Point: \[\rho = \frac{\lambda}{\mu}\]

Derive Everything:

\(L = \frac{\rho}{1-\rho}\)
\(L_q = \frac{\rho^2}{1-\rho}\)
\(W = \frac{L}{\lambda}\)
\(W_q = \frac{L_q}{\lambda}\)

Verification:

\(L = L_q + \rho\)
\(W = W_q + \frac{1}{\mu}\)

The Danger of High Utilization

ρ	L (Avg in System)	W (Multiple of 1/μ)
0.5	1.0	2×
0.7	2.3	3.3×
0.8	4.0	5×
0.9	9.0	10×
0.95	19.0	20×
0.99	99.0	100×

Key Insight: Near saturation, small increases in ρ cause exponential growth in waiting!

Sweet spot: ρ = 0.70-0.80

Coffee Shop Example

Problem Setup

Scenario

Given:

Students arrive: λ = 30/hour
Barista serves: μ = 40/hour
One barista (M/M/1)
FCFS service

Questions

Server utilization ρ?
Average students in shop (L)?
Average in queue (Lq)?
Average time in shop (W)?
Average wait time (Wq)?
Probability 5+ students?
Standard deviation?

Solution: Step 1 - Utilization

Calculation

\[\rho = \frac{\lambda}{\mu} = \frac{30}{40} = 0.75\]

Stability Check: \[\rho = 0.75 < 1 \quad ✓\]

Interpretation

Busy 75% of time
Idle 25% of time
Healthy utilization ✓
System is stable ✓

In 8-hour shift: 6 hours working, 2 hours idle

Solution: Steps 2-3 - System Size

L (Total in System)

\[L = \frac{\rho}{1-\rho} = \frac{0.75}{0.25} = 3\]

Average: 3 students total

Breakdown:

1 Being Served
2 Waiting in Queue

Lq (Only Waiting)

\[L_q = \frac{\rho^2}{1-\rho} = \frac{0.5625}{0.25} = 2.25\]

Or: \(L_q = L - \rho = 3 - 0.75 = 2.25\)

Average: 2.25 waiting

Interpretation:

Not being served
Actual queue length

Solution: Steps 4-5 - Time Measures

W (Total Time)

\[W = \frac{1}{\mu-\lambda} = \frac{1}{10} = 0.1 \text{ hr}\]

W = 6 minutes

Breakdown:

Wait: 4.5 min
Service: 1.5 min
Total: 6 min

Verify: \(W = L/\lambda = 3/30 = 0.1\) ✓

Wq (Wait Time)

\[W_q = \frac{\lambda}{\mu(\mu-\lambda)} = \frac{30}{400} = 0.075 \text{ hr}\]

Wq = 4.5 minutes

Interpretation:

Service time: \(1/\mu = 1.5\) min

Total: \(4.5 + 1.5 = 6\) ✓

Solution: Steps 6-7 - Probabilities

Probability Calculations

P(5 or more students): \[P(N \geq 5) = \rho^5 = 0.75^5 = 0.237\]

23.7% chance

P(Empty system): \[P_0 = 1 - \rho = 0.25\]

Empty 25% of time

Distribution:

Empty: 25%
1-4 customers: 51%
5+ customers: 24%

Variability Analysis

Variance: \[\text{Var}(N) = \frac{\rho}{(1-\rho)^2} = \frac{0.75}{0.0625} = 12\]

Standard deviation: \[\sigma = \sqrt{12} = 3.46\]

Coefficient of Variation: \[CV = \frac{\sigma}{L} = \frac{3.46}{3} = 1.15\]

High variability is typical of queueing systems!

Complete Results Dashboard

Metric	Value	Interpretation
ρ	75%	Server utilization
L	3.0	Avg students in shop
Lq	2.25	Avg waiting in line
W	6 min	Total time in shop
Wq	4.5 min	Time waiting
σ	3.46	High variability

Insights:

✓ System is stable
✓ Reasonable wait times
✓ Good utilization
⚠️ High variability means occasional long lines
💡 Consider second barista if demand increases

Sensitivity Analysis

Base Case

Parameters: λ=30, μ=40

Results:

ρ = 0.75
L = 3
W = 6 min

Scenario 1: Busier

Parameters: λ=36, μ=40

Results:

ρ = 0.90 (danger!)
L = 9 (3× worse!)
W = 15 min (2.5× worse!)

Scenario 2: Slower Service

Parameters: λ=30, μ=35

Results:

ρ = 0.86
L = 6 (2× worse)
W = 12 min (2× worse)

Scenario 3: Optimal

Parameters: λ=30, μ=43

Results:

ρ = 0.70 (ideal!)
L = 2.3 (better)
W = 4.7 min (better)

Key insight: Small parameter changes → Large performance impacts!

M/M/c Queue System

Multi-Server Configuration

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff'}}}%%
graph LR
    A[Arrivals<br/>λ] --> B[Single Queue<br/>FCFS]
    
    B --> C1[Server 1<br/>μ]
    B --> C2[Server 2<br/>μ]
    B --> C3[Server 3<br/>μ]
    B --> C4[Server c<br/>μ]
    
    C1 --> D[Departures]
    C2 --> D
    C3 --> D
    C4 --> D
    
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style B fill:#fff9c4,stroke:#f57c00,stroke-width:2px,color:#000
    style C1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C2 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C3 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C4 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style D fill:#f8bbd0,stroke:#c2185b,stroke-width:2px,color:#000

Key Differences from M/M/1

Single queue feeds multiple servers
Total capacity = c × μ
First available server takes next customer
More complex formulas (Erlang-C)

M/M/c Parameters

Input Parameters

λ = Arrival Rate (customers/time)

μ = Service Rate (per server)

c = Number of Servers

Derived Measures

Traffic Intensity: \[a = \frac{\lambda}{\mu}\]

Per-Server Utilization: \[\rho = \frac{\lambda}{c\mu} = \frac{a}{c}\]

Stability Condition

\[a < c \Leftrightarrow \rho < 1\]

Interpretation:

Need at least ⌈a⌉ servers
Actually need more for good service
Rule of thumb: c ≥ a + 1 or 2

Example:

If a = 4, need c ≥ 5
If c = 4, then ρ = 1.0 (unstable!)

Erlang-C Formula

The Formula

\[C(c, a) = \frac{\frac{a^c}{c!} \cdot \frac{c}{c-a}}{\sum_{n=0}^{c-1} \frac{a^n}{n!} + \frac{a^c}{c!} \cdot \frac{c}{c-a}}\]

Meaning:

Probability all servers busy = Probability customer must wait

Erlang-C Values

c	ρ	C(c,a)	Meaning
2	0.8	60%	60% wait
3	0.8	40%	40% wait
4	0.8	29%	29% wait
5	0.8	22%	22% wait

Adding servers dramatically reduces wait probability!

M/M/c Performance Measures

Starting with Erlang-C: \[C(c,a) = \text{Probability of waiting}\]

Queue Metrics:

\[L_q = C(c,a) \cdot \frac{a}{c-a}\]

\[W_q = \frac{C(c,a)}{c\mu - \lambda}\]

System Metrics:

\[L = L_q + a\]

\[W = W_q + \frac{1}{\mu}\]

Little’s Law still applies!

\(L_q = \lambda W_q\)
\(L = \lambda W\)

Call Center Example

Problem Setup

Given:

λ = 40 calls/hour
μ = 10 calls/hour per agent
c = 5 agents

Question: Performance metrics?

Call Center: Step 1 - Parameters

Traffic Intensity

\[a = \frac{\lambda}{\mu} = \frac{40}{10} = 4\]

Interpretation:

Need minimum 4 servers

But with c = 4: ρ = 1.0 (unstable!)

Need c ≥ 5 for stability

Utilization

\[\rho = \frac{a}{c} = \frac{4}{5} = 0.80\]

Stability Check: ✓

Interpretation:

Each agent busy 80% of time
Idle 20% of time
System is stable

Call Center: Step 2 - Erlang-C

Calculation Process

\[P_0 = \left[\sum_{n=0}^{4} \frac{4^n}{n!} + \frac{4^5}{5!} \cdot \frac{5}{1}\right]^{-1}\]

\[P_0 = \frac{1}{77.01} = 0.013\]

\[C(5,4) = \frac{42.67 \times 0.013}{1} = 0.555\]

Interpretation

55.5% of callers wait on hold

44.5% get immediate service

Out of 100 calls:

55 calls must wait
45 calls served immediately

Call Center: Performance Results

Queue Metrics

\[L_q = \frac{0.555 \times 4}{5-4} = 2.22 \text{ calls}\]

\[W_q = \frac{2.22}{40} = 0.0555 \text{ hr} = 3.33 \text{ min}\]

System Metrics

\[L = 2.22 + 4 = 6.22 \text{ calls}\]

\[W = 3.33 + 6 = 9.33 \text{ min}\]

Summary

Metric	Value	Status
ρ	80%	🟡 High
C(5,4)	55.5%	🟡 Moderate
Lq	2.22	🟡 Acceptable
Wq	3.33 min	🟢 Good
L	6.22	🟢 Reasonable
W	9.33 min	🟢 Good

M/M/c vs M/M/1 Comparison

M/M/1: Fast Server

Configuration:

1 server at μ=50/hr
ρ = 0.80
W = 6 min
Wq = 4.8 min

Pros:

Shorter total time

Cons:

Requires expensive high-speed server
Single point of failure
High variability

M/M/5: Multiple Servers

Configuration:

5 servers at μ=10/hr each
ρ = 0.80
W = 9.33 min
Wq = 3.33 min ✓

Pros:

Less variability (pooling effect)
More resilient to failures
Cost-effective flexibility

Cons:

Slightly longer total time

Winner: M/M/c for real-world applications!

Why M/M/c is Better

Advantages:

Less Variability
- Statistical smoothing from pooling
More Resilient
- Backup capacity if server fails
Better Experience
- More consistent wait times
More Realistic
- Standard servers are cheaper
- Easier to scale up/down

Key Principle: Multiple slow servers beat one fast server!

Optimal Server Calculation

Objective: Minimize Total Cost

\[TC(c) = c \cdot C_s + L(c) \cdot C_w\]

Where:

\(C_s\) = cost per server
\(C_w\) = waiting cost per customer
\(L(c)\) = average in system with c servers

Algorithm:

Start with \(c = \lceil a \rceil + 1\)
Calculate \(TC(c)\)
Calculate \(TC(c+1)\)
If \(TC(c+1) < TC(c)\), increase c and repeat
Otherwise, optimal c found

Practice: Bank Teller Problem

Scenario

Given:

λ = 30 customers/hr
Service time = 4 min/customer
Therefore μ = 15/hr

Question: Analyze with c=3 tellers

Practice Solution

Step 1: Parameters

\(a = \frac{30}{15} = 2\)
\(\rho = \frac{2}{3} = 0.667\) ✓
Stable system

Step 2: Erlang-C

\(P_0 = 0.111\)
\(C(3,2) = 0.444\)
44.4% must wait

Step 3: Performance

Metric	Value
Lq	0.888
Wq	1.78 min
L	2.888
W	5.78 min

Interpretation: Good service level with reasonable waiting times

Key Takeaways

Summary: M/M/1

Foundation:

Poisson arrivals, Exponential service
Single server
Requires ρ < 1

Key Formulas:

\(L = \frac{\rho}{1-\rho}\)
\(L_q = \frac{\rho^2}{1-\rho}\)
Little’s Law: L = λW

Applications:

ATMs, Single checkout
One-person service
Initial analysis

Summary: M/M/c

Multi-Server System:

c parallel servers
Single queue, FCFS
Requires a < c

Erlang-C:

Probability of waiting
Complex but critical formula

Advantages:

Better performance
Less variability
More resilient

Applications:

Call centers, Banks
Hospitals, Service desks

Essential Formulas

M/M/1

\[\rho = \frac{\lambda}{\mu} \quad L = \frac{\rho}{1-\rho} \quad L_q = \frac{\rho^2}{1-\rho}\]

\[W = \frac{1}{\mu-\lambda} \quad W_q = \frac{\lambda}{\mu(\mu-\lambda)}\]

M/M/c

\[a = \frac{\lambda}{\mu} \quad \rho = \frac{a}{c}\]

\[L_q = C(c,a) \cdot \frac{a}{c-a} \quad W_q = \frac{L_q}{\lambda}\]

\[L = L_q + a \quad W = W_q + \frac{1}{\mu}\]

Little’s Law: Always verify with L = λW and Lq = λWq

Problem-Solving Strategy

Step 1: Identify λ, μ, c

Step 2: Determine queue type (M/M/1 or M/M/c)

Step 3: Check stability

M/M/1: ρ < 1?
M/M/c: a < c?

Step 4: Calculate performance measures

M/M/1: Direct formulas
M/M/c: Start with Erlang-C

Step 5: Verify with Little’s Law

Step 6: Interpret results

Thank You!

Min Sothearith!