Queuing Theory

M/M/1 and M/M/c Queue Systems

MIN Sothearith

Institute of Technology of Cambodia

What is Queuing?

Queues Are Everywhere!

Daily Life Examples:

🏪 Supermarket checkout

🏥 Doctor’s office

📞 Customer service

🚗 Traffic signals

🍔 Fast food restaurants

🖨️ Print queue

💻 Server requests

✈️ Airport security

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e3f2fd','primaryTextColor':'#000','primaryBorderColor':'#1976d2','lineColor':'#1976d2','secondaryColor':'#fff3e0','tertiaryColor':'#c8e6c9','background':'#ffffff','mainBkg':'#ffffff','secondBkg':'#f5f5f5'}}}%%
graph LR
    A[Customers<br/>Arrive] --> B[Queue<br/>▓▓▓]
    B --> C[Server<br/>⚙️]
    C --> D[Customers<br/>Depart]
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style B fill:#fff3e0,stroke:#f57f17,stroke-width:2px
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px

Key Insight

Queuing Theory provides mathematical tools to analyze and optimize waiting line systems.

The Basic Queue System

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff','secondBkg':'#f5f5f5'}}}%%
graph LR
    A[Arrival Process<br/>λ = 30/hour] --> B{Queue<br/>Waiting Area}
    B --> C[Service Facility<br/>μ = 40/hour]
    C --> D[Departures]
    
    style A fill:#bbdefb,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style D fill:#f8bbd0,stroke:#c2185b,stroke-width:3px,color:#000

Input

  • Arrival rate (λ)

  • Random/scheduled

  • Independent arrivals

Queue

  • Waiting capacity

  • Discipline (FCFS, Priority)

  • Queue length

Service

  • Service rate (μ)

  • Number of servers

  • Service time distribution

Why Study Queues?

Customer Perspective

Key Questions:

  • How long will I wait?

  • How many ahead of me?

  • Total time until done?

Manager Perspective

Key Questions:

  • How many servers needed?

  • What’s the utilization?

  • Cost vs. service trade-off?

Real-World Impact

  • Optimizes cashier scheduling

  • Designs efficient drive-throughs

  • Reduces average wait time

  • Balances labor cost vs. service

  • ER staffing optimization

  • Reduces patient waiting times

  • Balances cost vs. care quality

  • Handles variable demand

  • Determines optimal agent count

  • Minimizes hold times

  • Maximizes utilization

  • Service level agreements

Mathematical Foundations

Poisson Process - Arrivals

Probability Distribution

\[P(N(t) = n) = \frac{(\lambda t)^n e^{-\lambda t}}{n!}\]

Parameters:

  • \(\lambda\) = average arrival rate

  • \(t\) = time interval

  • \(n\) = number of arrivals

Properties:

  • Memoryless

  • Independent increments

  • Mean = Variance = \(\lambda t\)

Example

Given: λ = 5/hour

Question: P(3 arrivals in 1 hr)?

Answer: 14.0%

Interpretation:

Random arrivals follow exponential inter-arrival times

Exponential Distribution - Service

PDF and CDF

Probability Density: \[f(x) = \mu e^{-\mu x}, \quad x \geq 0\]

Cumulative Distribution: \[F(x) = 1 - e^{-\mu x}\]

Key Properties:

  • Mean: \(E[X] = \frac{1}{\mu}\)

  • Variance: \(\text{Var}(X) = \frac{1}{\mu^2}\)

  • Memoryless property

Example

Given: μ = 8/hour

Results:

  • Mean service = 7.5 min

  • P(service ≤ 5 min) = 48.7%

Interpretation:

Service times are random but average to 1/μ

Birth–Death Process

The number of customers in the system changes one at a time:

  • arrivals increase the count by +1 at rate \(\lambda\)
  • service completions decrease the count by –1 at rate \(\mu\)

Balance Equations

At steady state, flow into each state equals flow out.

Between states 0 and 1:

\[ \lambda P_0 = \mu P_1 \]

Between states (n-1) and (n):

\[ \lambda P_{n-1} = \mu P_n \]

Recursive Form

From the balance equations:

\[ P_n = \frac{\lambda}{\mu}P_{\,n-1} \]

Let:

\[ \rho = \frac{\lambda}{\mu} \]

Then:

\[ P_n = \rho^n P_0 \]

Normalization

All probabilities add to 1:

\[ \sum_{n=0}^{\infty} P_n = 1 \]

Substitute the recursive form:

\[ P_0\sum_{n=0}^{\infty}\rho^n = 1 \]

Using the geometric series result (valid if \(\rho < 1\) ):

\[ P_0 = 1 - \rho \]

Final Result

\[ P_n = (1-\rho)\rho^n \]

This gives the long-run probability of having exactly (n) customers in the system.

State Probabilities

For ρ = 0.6

  • P₀ = 0.40 (Empty 40%)

  • P₁ = 0.24 (1 customer)

  • P₂ = 0.14 (2 customers)

  • P₃ = 0.09 (3 customers)

  • P₄+ = 0.13 (4+ customers)

Low utilization → mostly empty

For ρ = 0.9

  • P₀ = 0.10 (Empty 10%)

  • P₁ = 0.09 (1 customer)

  • P₂ = 0.08 (2 customers)

  • P₃₋₅ = 0.22 (3-5 customers)

  • P₆+ = 0.51 (6+ customers)

High utilization → often crowded

Key Insight: As ρ increases, probability of large queues increases dramatically!

Kendall’s Notation

A/B/C/D/E/F System

Full Notation

A = Arrival distribution

B = Service distribution

C = Number of servers

D = System capacity

E = Population size

F = Service discipline

Common Symbols

  • M = Markovian (exponential)

  • D = Deterministic (constant)

  • G = General distribution

  • GI = General Independent

Examples

M/M/1

  • Single server

  • Random arrival/service

M/M/c

  • Multiple servers

  • Random arrival/service

M/D/1

  • Single server

  • Constant service

M/M/1/K

  • Limited capacity K

M/M/1 System Breakdown

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff'}}}%%
graph LR
    A[M<br/>Markovian<br/>Arrivals] --> B[M<br/>Markovian<br/>Service]
    B --> C[1<br/>Single<br/>Server]
    
    A1[Poisson Process<br/>Rate λ<br/>Random timing] -.-> A
    B1[Exponential Times<br/>Rate μ<br/>Memoryless] -.-> B
    C1[One facility<br/>FCFS discipline<br/>Infinite capacity] -.-> C
    
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style C fill:#e8f5e9,stroke:#388e3c,stroke-width:3px,color:#000
    style A1 fill:#bbdefb,stroke:#1565c0,stroke-width:1px,color:#000
    style B1 fill:#ce93d8,stroke:#6a1b9a,stroke-width:1px,color:#000
    style C1 fill:#a5d6a7,stroke:#2e7d32,stroke-width:1px,color:#000

Critical Assumption

Stability Condition: \(\rho = \frac{\lambda}{\mu} < 1\)

Without this, the queue grows infinitely!

M/M/1 Performance Measures

The Four Key Metrics

System Metrics (Total Experience):

  • L = Average number in system = \(\frac{\lambda}{\mu-\lambda}\)

  • W = Average time in system = \(\frac{1}{\mu-\lambda}\)

Queue Metrics (Waiting Only):

  • Lq = Average number in queue = \(\frac{\lambda^2}{\mu(\mu-\lambda)}\)

  • Wq = Average wait time = \(\frac{\lambda}{\mu(\mu-\lambda)}\)

Little’s Law:

  • L = λW

  • Lq = λWq

System vs Queue

System (L, W)

Includes everything:

  • Waiting time

  • Service time

  • Total experience

Formula: \[W = W_q + \frac{1}{\mu}\]

Queue Only (Lq, Wq)

Excludes service:

  • Only waiting

  • Not being served

  • “Pain” customers feel

Formula: \[L = L_q + \rho\]

Formula Relationships

Starting Point: \[\rho = \frac{\lambda}{\mu}\]

Derive Everything:

  1. \(L = \frac{\rho}{1-\rho}\)

  2. \(L_q = \frac{\rho^2}{1-\rho}\)

  3. \(W = \frac{L}{\lambda}\)

  4. \(W_q = \frac{L_q}{\lambda}\)

Verification:

  • \(L = L_q + \rho\)

  • \(W = W_q + \frac{1}{\mu}\)

The Danger of High Utilization

ρ L (Avg in System) W (Multiple of 1/μ)
0.5 1.0
0.7 2.3 3.3×
0.8 4.0
0.9 9.0 10×
0.95 19.0 20×
0.99 99.0 100×

Key Insight: Near saturation, small increases in ρ cause exponential growth in waiting!

Sweet spot: ρ = 0.70-0.80

Coffee Shop Example

Problem Setup

Scenario

Given:

  • Students arrive: λ = 30/hour

  • Barista serves: μ = 40/hour

  • One barista (M/M/1)

  • FCFS service

Questions

  1. Server utilization ρ?

  2. Average students in shop (L)?

  3. Average in queue (Lq)?

  4. Average time in shop (W)?

  5. Average wait time (Wq)?

  6. Probability 5+ students?

  7. Standard deviation?

Solution: Step 1 - Utilization

Calculation

\[\rho = \frac{\lambda}{\mu} = \frac{30}{40} = 0.75\]

Stability Check: \[\rho = 0.75 < 1 \quad ✓\]

Interpretation

  • Busy 75% of time

  • Idle 25% of time

  • Healthy utilization ✓

  • System is stable ✓

In 8-hour shift: 6 hours working, 2 hours idle

Solution: Steps 2-3 - System Size

L (Total in System)

\[L = \frac{\rho}{1-\rho} = \frac{0.75}{0.25} = 3\]

Average: 3 students total

Breakdown:

  • 1 Being Served

  • 2 Waiting in Queue

Lq (Only Waiting)

\[L_q = \frac{\rho^2}{1-\rho} = \frac{0.5625}{0.25} = 2.25\]

Or: \(L_q = L - \rho = 3 - 0.75 = 2.25\)

Average: 2.25 waiting

Interpretation:

  • Not being served

  • Actual queue length

Solution: Steps 4-5 - Time Measures

W (Total Time)

\[W = \frac{1}{\mu-\lambda} = \frac{1}{10} = 0.1 \text{ hr}\]

W = 6 minutes

Breakdown:

  • Wait: 4.5 min

  • Service: 1.5 min

  • Total: 6 min

Verify: \(W = L/\lambda = 3/30 = 0.1\)

Wq (Wait Time)

\[W_q = \frac{\lambda}{\mu(\mu-\lambda)} = \frac{30}{400} = 0.075 \text{ hr}\]

Wq = 4.5 minutes

Interpretation:

Service time: \(1/\mu = 1.5\) min

Total: \(4.5 + 1.5 = 6\)

Solution: Steps 6-7 - Probabilities

Probability Calculations

P(5 or more students): \[P(N \geq 5) = \rho^5 = 0.75^5 = 0.237\]

23.7% chance

P(Empty system): \[P_0 = 1 - \rho = 0.25\]

Empty 25% of time

Distribution:

  • Empty: 25%

  • 1-4 customers: 51%

  • 5+ customers: 24%

Variability Analysis

Variance: \[\text{Var}(N) = \frac{\rho}{(1-\rho)^2} = \frac{0.75}{0.0625} = 12\]

Standard deviation: \[\sigma = \sqrt{12} = 3.46\]

Coefficient of Variation: \[CV = \frac{\sigma}{L} = \frac{3.46}{3} = 1.15\]

High variability is typical of queueing systems!

Complete Results Dashboard

Metric Value Interpretation
ρ 75% Server utilization
L 3.0 Avg students in shop
Lq 2.25 Avg waiting in line
W 6 min Total time in shop
Wq 4.5 min Time waiting
σ 3.46 High variability

Insights:

  • ✓ System is stable

  • ✓ Reasonable wait times

  • ✓ Good utilization

  • ⚠️ High variability means occasional long lines

  • 💡 Consider second barista if demand increases

Sensitivity Analysis

Base Case

Parameters: λ=30, μ=40

Results:

  • ρ = 0.75

  • L = 3

  • W = 6 min

Scenario 1: Busier

Parameters: λ=36, μ=40

Results:

  • ρ = 0.90 (danger!)

  • L = 9 (3× worse!)

  • W = 15 min (2.5× worse!)

Scenario 2: Slower Service

Parameters: λ=30, μ=35

Results:

  • ρ = 0.86

  • L = 6 (2× worse)

  • W = 12 min (2× worse)

Scenario 3: Optimal

Parameters: λ=30, μ=43

Results:

  • ρ = 0.70 (ideal!)

  • L = 2.3 (better)

  • W = 4.7 min (better)

Key insight: Small parameter changes → Large performance impacts!

M/M/c Queue System

Multi-Server Configuration

%%{init: {'theme':'base', 'themeVariables': { 'background':'#ffffff','mainBkg':'#ffffff'}}}%%
graph LR
    A[Arrivals<br/>λ] --> B[Single Queue<br/>FCFS]
    
    B --> C1[Server 1<br/>μ]
    B --> C2[Server 2<br/>μ]
    B --> C3[Server 3<br/>μ]
    B --> C4[Server c<br/>μ]
    
    C1 --> D[Departures]
    C2 --> D
    C3 --> D
    C4 --> D
    
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style B fill:#fff9c4,stroke:#f57c00,stroke-width:2px,color:#000
    style C1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C2 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C3 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C4 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style D fill:#f8bbd0,stroke:#c2185b,stroke-width:2px,color:#000

Key Differences from M/M/1

  • Single queue feeds multiple servers

  • Total capacity = c × μ

  • First available server takes next customer

  • More complex formulas (Erlang-C)

M/M/c Parameters

Input Parameters

λ = Arrival Rate (customers/time)

μ = Service Rate (per server)

c = Number of Servers

Derived Measures

Traffic Intensity: \[a = \frac{\lambda}{\mu}\]

Per-Server Utilization: \[\rho = \frac{\lambda}{c\mu} = \frac{a}{c}\]

Stability Condition

\[a < c \Leftrightarrow \rho < 1\]

Interpretation:

  • Need at least ⌈a⌉ servers

  • Actually need more for good service

  • Rule of thumb: c ≥ a + 1 or 2

Example:

  • If a = 4, need c ≥ 5

  • If c = 4, then ρ = 1.0 (unstable!)

Erlang-C Formula

The Formula

\[C(c, a) = \frac{\frac{a^c}{c!} \cdot \frac{c}{c-a}}{\sum_{n=0}^{c-1} \frac{a^n}{n!} + \frac{a^c}{c!} \cdot \frac{c}{c-a}}\]

Meaning:

Probability all servers busy = Probability customer must wait

Erlang-C Values

c ρ C(c,a) Meaning
2 0.8 60% 60% wait
3 0.8 40% 40% wait
4 0.8 29% 29% wait
5 0.8 22% 22% wait

Adding servers dramatically reduces wait probability!

M/M/c Performance Measures

Starting with Erlang-C: \[C(c,a) = \text{Probability of waiting}\]

Queue Metrics:

\[L_q = C(c,a) \cdot \frac{a}{c-a}\]

\[W_q = \frac{C(c,a)}{c\mu - \lambda}\]

System Metrics:

\[L = L_q + a\]

\[W = W_q + \frac{1}{\mu}\]

Little’s Law still applies!

  • \(L_q = \lambda W_q\)

  • \(L = \lambda W\)

Call Center Example

Problem Setup

Given:

  • λ = 40 calls/hour

  • μ = 10 calls/hour per agent

  • c = 5 agents

Question: Performance metrics?

Call Center: Step 1 - Parameters

Traffic Intensity

\[a = \frac{\lambda}{\mu} = \frac{40}{10} = 4\]

Interpretation:

Need minimum 4 servers

But with c = 4: ρ = 1.0 (unstable!)

Need c ≥ 5 for stability

Utilization

\[\rho = \frac{a}{c} = \frac{4}{5} = 0.80\]

Stability Check:

Interpretation:

  • Each agent busy 80% of time

  • Idle 20% of time

  • System is stable

Call Center: Step 2 - Erlang-C

Calculation Process

\[P_0 = \left[\sum_{n=0}^{4} \frac{4^n}{n!} + \frac{4^5}{5!} \cdot \frac{5}{1}\right]^{-1}\]

\[P_0 = \frac{1}{77.01} = 0.013\]

\[C(5,4) = \frac{42.67 \times 0.013}{1} = 0.555\]

Interpretation

55.5% of callers wait on hold

44.5% get immediate service

Out of 100 calls:

  • 55 calls must wait

  • 45 calls served immediately

Call Center: Performance Results

Queue Metrics

\[L_q = \frac{0.555 \times 4}{5-4} = 2.22 \text{ calls}\]

\[W_q = \frac{2.22}{40} = 0.0555 \text{ hr} = 3.33 \text{ min}\]

System Metrics

\[L = 2.22 + 4 = 6.22 \text{ calls}\]

\[W = 3.33 + 6 = 9.33 \text{ min}\]

Summary

Metric Value Status
ρ 80% 🟡 High
C(5,4) 55.5% 🟡 Moderate
Lq 2.22 🟡 Acceptable
Wq 3.33 min 🟢 Good
L 6.22 🟢 Reasonable
W 9.33 min 🟢 Good

M/M/c vs M/M/1 Comparison

M/M/1: Fast Server

Configuration:

  • 1 server at μ=50/hr

  • ρ = 0.80

  • W = 6 min

  • Wq = 4.8 min

Pros:

  • Shorter total time

Cons:

  • Requires expensive high-speed server

  • Single point of failure

  • High variability

M/M/5: Multiple Servers

Configuration:

  • 5 servers at μ=10/hr each

  • ρ = 0.80

  • W = 9.33 min

  • Wq = 3.33 min

Pros:

  • Less variability (pooling effect)

  • More resilient to failures

  • Cost-effective flexibility

Cons:

  • Slightly longer total time

Winner: M/M/c for real-world applications!

Why M/M/c is Better

Advantages:

  1. Less Variability

    • Statistical smoothing from pooling
  2. More Resilient

    • Backup capacity if server fails
  3. Better Experience

    • More consistent wait times
  4. More Realistic

    • Standard servers are cheaper

    • Easier to scale up/down

Key Principle: Multiple slow servers beat one fast server!

Optimal Server Calculation

Objective: Minimize Total Cost

\[TC(c) = c \cdot C_s + L(c) \cdot C_w\]

Where:

  • \(C_s\) = cost per server

  • \(C_w\) = waiting cost per customer

  • \(L(c)\) = average in system with c servers

Algorithm:

  1. Start with \(c = \lceil a \rceil + 1\)

  2. Calculate \(TC(c)\)

  3. Calculate \(TC(c+1)\)

  4. If \(TC(c+1) < TC(c)\), increase c and repeat

  5. Otherwise, optimal c found

Practice: Bank Teller Problem

Scenario

Given:

  • λ = 30 customers/hr

  • Service time = 4 min/customer

  • Therefore μ = 15/hr

Question: Analyze with c=3 tellers

Practice Solution

Step 1: Parameters

  • \(a = \frac{30}{15} = 2\)

  • \(\rho = \frac{2}{3} = 0.667\)

  • Stable system

Step 2: Erlang-C

  • \(P_0 = 0.111\)

  • \(C(3,2) = 0.444\)

  • 44.4% must wait

Step 3: Performance

Metric Value
Lq 0.888
Wq 1.78 min
L 2.888
W 5.78 min

Interpretation: Good service level with reasonable waiting times

Key Takeaways

Summary: M/M/1

Foundation:

  • Poisson arrivals, Exponential service

  • Single server

  • Requires ρ < 1

Key Formulas:

  • \(L = \frac{\rho}{1-\rho}\)

  • \(L_q = \frac{\rho^2}{1-\rho}\)

  • Little’s Law: L = λW

Applications:

  • ATMs, Single checkout

  • One-person service

  • Initial analysis

Summary: M/M/c

Multi-Server System:

  • c parallel servers

  • Single queue, FCFS

  • Requires a < c

Erlang-C:

  • Probability of waiting

  • Complex but critical formula

Advantages:

  • Better performance

  • Less variability

  • More resilient

Applications:

  • Call centers, Banks

  • Hospitals, Service desks

Essential Formulas

M/M/1

\[\rho = \frac{\lambda}{\mu} \quad L = \frac{\rho}{1-\rho} \quad L_q = \frac{\rho^2}{1-\rho}\]

\[W = \frac{1}{\mu-\lambda} \quad W_q = \frac{\lambda}{\mu(\mu-\lambda)}\]

M/M/c

\[a = \frac{\lambda}{\mu} \quad \rho = \frac{a}{c}\]

\[L_q = C(c,a) \cdot \frac{a}{c-a} \quad W_q = \frac{L_q}{\lambda}\]

\[L = L_q + a \quad W = W_q + \frac{1}{\mu}\]

Little’s Law: Always verify with L = λW and Lq = λWq

Problem-Solving Strategy

Step 1: Identify λ, μ, c

Step 2: Determine queue type (M/M/1 or M/M/c)

Step 3: Check stability

  • M/M/1: ρ < 1?

  • M/M/c: a < c?

Step 4: Calculate performance measures

  • M/M/1: Direct formulas

  • M/M/c: Start with Erlang-C

Step 5: Verify with Little’s Law

Step 6: Interpret results

Thank You!

Min Sothearith!