Introduction
In control, machine learning, economics, and engineering, we constantly face optimization problems with constraints. We want to minimize a cost, but we can't just go anywhere — there are limits on what's allowed.
For example:
- MPC: Minimize control effort, but actuators have limits
- Portfolio optimization: Maximize returns, but you can't invest more than 100%
- Robot motion: Minimize energy, but joints have angle limits
The Active Set Method is one of the most elegant algorithms for solving these constrained optimization problems. But to understand it deeply, we need to first understand Lagrange multipliers — a beautiful mathematical tool that transforms constrained problems into unconstrained ones.
Let's build this understanding from the ground up.
Starting Point: Unconstrained Optimization
Before adding constraints, let's recall unconstrained optimization. Given a function $f(x)$ where $x \in \mathbb{R}^n$, we want to find:

$$x^* = \arg\min_{x \in \mathbb{R}^n} f(x)$$
The Gradient Condition
At a minimum, the function must be "flat" — moving in any direction doesn't immediately decrease the cost. Mathematically, the gradient must be zero:

$$\nabla f(x^*) = 0$$
The gradient $\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \cdots & \frac{\partial f}{\partial x_n} \end{bmatrix}^T$ points in the direction of steepest increase. At a minimum, there's no direction of increase (or decrease) — hence zero gradient.
Consider $f(x) = \frac{1}{2}x^T H x - c^T x$ where $H$ is positive definite.
The gradient is: $\nabla f = Hx - c$
Setting $\nabla f = 0$: $Hx = c \implies x^* = H^{-1}c$
This is a linear system — easy to solve!
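As a quick numerical check, here's that computation in Python (a minimal NumPy sketch; the specific $H$ and $c$ are made-up illustrative values):

```python
import numpy as np

# A positive definite H and an arbitrary c (illustrative values)
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])
c = np.array([1.0, -1.0])

# Unconstrained minimum of f(x) = 0.5 x^T H x - c^T x:
# solve H x = c rather than forming H^{-1} explicitly
x_star = np.linalg.solve(H, c)

# At the minimum the gradient Hx - c should vanish
grad = H @ x_star - c
print(x_star, np.linalg.norm(grad))
```

In practice one solves the linear system directly (or via a Cholesky factorization of $H$) instead of inverting $H$.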
For unconstrained problems, we just solve $\nabla f = 0$. But what happens when we add constraints?
The Problem: Constraints Change Everything
Now consider minimizing $f(x)$ subject to a constraint $g(x) = 0$:

$$\min_x f(x) \quad \text{subject to} \quad g(x) = 0$$
The constraint $g(x) = 0$ defines a surface (or curve, or hyperplane) in the space. We can only move along this surface — the solution must lie on it.
Why $\nabla f = 0$ No Longer Works
The unconstrained minimum (where $\nabla f = 0$) might violate the constraint! We need a new condition.
The blue dot is the unconstrained minimum (where $\nabla f = 0$). The red line is the constraint $g(x) = 0$. The orange dot is the constrained minimum — the best point on the constraint. Notice how they're different!
The Key Geometric Insight
At the constrained minimum, we're on the constraint surface. If we could improve (decrease $f$) by moving along the constraint, we would — so we're not at the minimum yet.
At the true constrained minimum, the gradient $\nabla f$ must point perpendicular to the constraint. Why? Because if $\nabla f$ had any component along the constraint, we could move in that direction (staying on the constraint) and decrease $f$.
Geometric Condition: At a constrained minimum, $\nabla f$ is perpendicular to the constraint surface. Since $\nabla g$ is also perpendicular to the constraint surface (it's the normal), we must have $\nabla f$ parallel to $\nabla g$.
"Parallel" means one is a scalar multiple of the other:
The scalar $\lambda$ is the Lagrange multiplier. It tells us how much the gradients are "aligned."
Deriving Lagrange Multipliers Rigorously
Let's derive this more carefully. We'll see where the Lagrangian comes from and why it works.
The Lagrangian Function
Define the Lagrangian:

$$\mathcal{L}(x, \lambda) = f(x) - \lambda g(x)$$
This combines the objective and constraint into one function with an extra variable $\lambda$.
Stationary Points of the Lagrangian
Now we find where the Lagrangian has zero gradient with respect to both $x$ and $\lambda$:

$$\frac{\partial \mathcal{L}}{\partial x} = \nabla f(x) - \lambda \nabla g(x) = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = -g(x) = 0$$
Look at what these equations say:
- First equation: $\nabla f = \lambda \nabla g$ — the gradients are parallel (our geometric condition!)
- Second equation: $g(x) = 0$ — the constraint is satisfied
By finding stationary points of $\mathcal{L}$, we automatically get both conditions!
The Lagrangian Trick: Instead of solving the constrained problem directly, we solve the unconstrained problem of finding stationary points of $\mathcal{L}(x, \lambda)$. The constraint "reappears" as the condition $\frac{\partial \mathcal{L}}{\partial \lambda} = 0$.
What Does $\lambda$ Mean?
The Lagrange multiplier $\lambda$ has a beautiful interpretation: it measures the sensitivity of the optimal cost to the constraint.
If we slightly relax the constraint from $g(x) = 0$ to $g(x) = \epsilon$, how much does the optimal cost change? By the envelope theorem, to first order:

$$f^*(\epsilon) \approx f^*(0) + \lambda \epsilon$$
So $\lambda$ tells us the "price" or "shadow cost" of the constraint:
- $\lambda > 0$: The constraint is "expensive" — shifting it to $g(x) = \epsilon$ with $\epsilon > 0$ raises the optimal cost
- $\lambda < 0$: The same shift lowers the optimal cost
- $\lambda = 0$: The constraint doesn't affect the optimal cost, at least to first order
Minimize $f(x,y) = x^2 + y^2$ subject to $g(x,y) = x + y - 1 = 0$.
Step 1: Form the Lagrangian: $\mathcal{L} = x^2 + y^2 - \lambda(x + y - 1)$
Step 2: Take derivatives:
$\frac{\partial \mathcal{L}}{\partial x} = 2x - \lambda = 0 \implies x = \lambda/2$
$\frac{\partial \mathcal{L}}{\partial y} = 2y - \lambda = 0 \implies y = \lambda/2$
$\frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 1) = 0 \implies x + y = 1$
Step 3: Solve: From the first two equations, $x = y$. Substituting into the constraint: $2x = 1 \implies x = y = 1/2$, and $\lambda = 1$.
Solution: $(x^*, y^*) = (1/2, 1/2)$ with $\lambda^* = 1$.
The multiplier $\lambda = 1$ tells us that if we changed the constraint to $x + y = 1 + \epsilon$, the optimal cost would increase by approximately $\epsilon$.
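Both the solution and the sensitivity interpretation are easy to verify numerically (a small NumPy sketch; the stationarity system is the one derived above, and the perturbation $\epsilon = 10^{-4}$ is an arbitrary choice):

```python
import numpy as np

def solve(eps):
    # Stationarity of L = x^2 + y^2 - lam*(x + y - 1 - eps):
    #   2x - lam = 0,  2y - lam = 0,  x + y = 1 + eps
    A = np.array([[2.0, 0.0, -1.0],
                  [0.0, 2.0, -1.0],
                  [1.0, 1.0,  0.0]])
    b = np.array([0.0, 0.0, 1.0 + eps])
    x, y, lam = np.linalg.solve(A, b)
    return x, y, lam, x**2 + y**2

x, y, lam, cost = solve(0.0)
print(x, y, lam)        # approximately (0.5, 0.5, 1.0)

# Sensitivity check: relaxing the constraint to x + y = 1 + eps
# changes the optimal cost by roughly lam * eps
eps = 1e-4
d_cost = solve(eps)[3] - cost
print(d_cost / eps)     # close to lam = 1
```

The ratio `d_cost / eps` approaches $\lambda = 1$ as $\epsilon \to 0$, exactly as the shadow-cost interpretation predicts.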
Inequality Constraints and KKT Conditions
In practice, most constraints are inequalities: $g(x) \leq 0$ rather than $g(x) = 0$. This changes things significantly.
The Key Difference: Active vs. Inactive
An inequality constraint can be in two states:
- Active (binding): $g(x^*) = 0$ — we're right on the boundary
- Inactive: $g(x^*) < 0$ — we have "slack," the constraint doesn't affect us
If a constraint is inactive at the solution, it's as if it doesn't exist — we could remove it and get the same answer.
If a constraint is active, it becomes an equality constraint, and Lagrange multipliers apply.
The constraint is $x + y \leq c$ (gray region is feasible). When $c$ is small, the constraint is active — the optimum sits on the boundary. When $c$ is large, the constraint is inactive — the unconstrained optimum is already feasible.
The KKT Conditions
For optimization with inequality constraints, the generalization of Lagrange conditions is called the Karush-Kuhn-Tucker (KKT) conditions.
Consider: $\min_x f(x)$ subject to $g_i(x) \leq 0$ for $i = 1, \ldots, m$.
The KKT conditions are:
- Stationarity: $\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0$ (the gradient of $f$ is balanced by a combination of constraint gradients; with the Lagrangian written as $\mathcal{L} = f + \sum_i \lambda_i g_i$, this sign convention is what makes $\lambda_i \geq 0$ the right requirement for $g_i \leq 0$)
- Primal feasibility: $g_i(x^*) \leq 0$ for all $i$ (the solution satisfies all constraints)
- Dual feasibility: $\lambda_i \geq 0$ for all $i$ (multipliers for inequalities are non-negative)
- Complementary slackness: $\lambda_i g_i(x^*) = 0$ for all $i$ (either the constraint is active, $g_i = 0$, or the multiplier is zero, $\lambda_i = 0$, or both)
Understanding Complementary Slackness
Condition 4 is the crucial one. It says: for each constraint, at least one of $\lambda_i$ or $g_i$ must be zero.
- If $g_i(x^*) < 0$ (constraint inactive, we have slack), then $\lambda_i = 0$ (no "cost" to the constraint)
- If $\lambda_i > 0$ (constraint has positive cost), then $g_i(x^*) = 0$ (constraint is active/binding)
This makes intuitive sense: why would a constraint that isn't binding have any influence on the solution?
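The four conditions can be packaged into a small checker (illustrative NumPy code; `check_kkt`, its tolerances, and the demo problem $\min x^2 + y^2$ s.t. $x + y \geq 1$ are my own, not from the text above):

```python
import numpy as np

def check_kkt(grad_f, g_vals, g_grads, lam, tol=1e-8):
    """Check the four KKT conditions for min f s.t. g_i(x) <= 0,
    with the convention grad f + sum_i lam_i * grad g_i = 0."""
    stationarity = np.linalg.norm(grad_f + g_grads.T @ lam) < tol
    primal = np.all(g_vals <= tol)          # all constraints satisfied
    dual = np.all(lam >= -tol)              # multipliers non-negative
    comp_slack = np.all(np.abs(lam * g_vals) < tol)
    return bool(stationarity and primal and dual and comp_slack)

# Demo: min x^2 + y^2 s.t. 1 - x - y <= 0; optimum (0.5, 0.5), lam = 1
x = np.array([0.5, 0.5])
grad_f = 2 * x
g_vals = np.array([1 - x[0] - x[1]])        # active: equals 0
g_grads = np.array([[-1.0, -1.0]])
lam = np.array([1.0])
print(check_kkt(grad_f, g_vals, g_grads, lam))   # True
```

A wrong multiplier (say $\lambda = 2$) breaks stationarity, and the checker returns `False`.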
The Sign of $\lambda$: For inequality constraints $g(x) \leq 0$, the multiplier $\lambda \geq 0$. This is because the constraint can only "push back" against decreasing $f$, never help it. If you found $\lambda < 0$, the constraint shouldn't be in your active set!
The Active Set Method
Now we have all the ingredients. The Active Set Method is an algorithm that solves inequality-constrained optimization by cleverly managing which constraints are active.
The Core Idea
If we knew exactly which constraints were active at the solution, we could:
- Treat active constraints as equalities
- Ignore inactive constraints entirely
- Solve the resulting equality-constrained problem using Lagrange multipliers
But we don't know which constraints are active! The active set method guesses and refines:
1. Initialize: Start with a feasible point $x^{(0)}$ and an initial "working set" $\mathcal{W}$ of constraints we think are active
2. Solve equality subproblem: Minimize $f(x)$ treating constraints in $\mathcal{W}$ as equalities. This gives candidate solution $x^*$ and multipliers $\lambda_i$ for $i \in \mathcal{W}$
3. Check multiplier signs:
   - If all $\lambda_i \geq 0$ for $i \in \mathcal{W}$: go to step 4
   - If some $\lambda_j < 0$: remove constraint $j$ from $\mathcal{W}$ (it shouldn't be active), go to step 2
4. Check feasibility:
   - If $x^*$ satisfies all constraints: DONE! $x^*$ is optimal
   - If $x^*$ violates some constraint $g_k(x^*) > 0$: move toward $x^*$ only until hitting constraint $k$, add $k$ to $\mathcal{W}$, go to step 2
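The loop above can be sketched in Python for a quadratic objective with linear constraints $Ax \leq b$ (a minimal teaching implementation; the variable names, tolerances, and the small demo QP are my own choices, and a production solver would factorize the KKT matrix and handle degeneracy):

```python
import numpy as np

def active_set_qp(H, c, A, b, x0, max_iter=50):
    """Primal active-set sketch for min 0.5 x^T H x + c^T x  s.t.  A x <= b.
    x0 must be feasible. For teaching, not production use."""
    x = x0.astype(float).copy()
    n = len(x)
    W = []                                   # indices of working constraints
    lam = np.zeros(0)
    for _ in range(max_iter):
        # Equality subproblem: step p with A_W p = 0, via the KKT system
        #   [H   A_W^T] [p  ]   [-(Hx + c)]
        #   [A_W  0   ] [lam] = [    0    ]
        g = H @ x + c
        nW = len(W)
        K = np.zeros((n + nW, n + nW))
        K[:n, :n] = H
        if nW:
            K[:n, n:] = A[W].T
            K[n:, :n] = A[W]
        sol = np.linalg.solve(K, np.concatenate([-g, np.zeros(nW)]))
        p, lam = sol[:n], sol[n:]
        if np.linalg.norm(p) < 1e-10:        # no progress possible in W
            if nW == 0 or np.all(lam >= -1e-10):
                return x, W, lam             # KKT satisfied: done
            W.pop(int(np.argmin(lam)))       # drop most negative multiplier
            continue
        # Ratio test: largest alpha in (0, 1] keeping every constraint feasible
        alpha, block = 1.0, None
        for i in range(len(b)):
            ai_p = A[i] @ p
            if i not in W and ai_p > 1e-12:  # moving toward this boundary
                a = (b[i] - A[i] @ x) / ai_p
                if a < alpha:
                    alpha, block = a, i
        x = x + alpha * p
        if block is not None:
            W.append(block)                  # hit a new constraint: add it
    return x, W, lam

# Demo QP: min 0.5||x||^2 - 2x1 - 2x2  s.t.  x1 + x2 <= 2, x1 >= 0, x2 >= 0
H = np.eye(2)
c = np.array([-2.0, -2.0])
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([2.0, 0.0, 0.0])
x, W, lam = active_set_qp(H, c, A, b, np.zeros(2))
print(x, W, lam)     # optimum near (1, 1), active set [0], lambda near 1
```

Starting from the feasible origin, the method steps toward the unconstrained minimum $(2, 2)$, hits the boundary $x_1 + x_2 = 2$ at half step, and terminates there with a non-negative multiplier.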
Solving the Equality Subproblem
At each iteration, we solve an equality-constrained problem. For quadratic objectives (like in QP/MPC), this is a linear system.
Given objective $f(x) = \frac{1}{2}x^T H x + c^T x$ and active constraints $A_{eq} x = b_{eq}$:
The Lagrangian is:

$$\mathcal{L}(x, \lambda) = \frac{1}{2}x^T H x + c^T x + \lambda^T (A_{eq} x - b_{eq})$$

Setting gradients to zero:

$$\nabla_x \mathcal{L} = Hx + c + A_{eq}^T \lambda = 0, \qquad \nabla_\lambda \mathcal{L} = A_{eq} x - b_{eq} = 0$$

This gives the KKT system:

$$\begin{bmatrix} H & A_{eq}^T \\ A_{eq} & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} -c \\ b_{eq} \end{bmatrix}$$
This is a linear system! Standard linear algebra (LU decomposition, etc.) solves it efficiently.
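For concreteness, here is that KKT solve in NumPy (a sketch; the helper `solve_eq_qp` and the small illustrative problem are my own):

```python
import numpy as np

def solve_eq_qp(H, c, A_eq, b_eq):
    """Solve min 0.5 x^T H x + c^T x  s.t.  A_eq x = b_eq
    via the KKT system [H A^T; A 0][x; lam] = [-c; b]."""
    n, m = H.shape[0], A_eq.shape[0]
    K = np.block([[H, A_eq.T],
                  [A_eq, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b_eq])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]                  # primal x, multipliers lam

# Illustrative problem: min 0.5(x1^2 + x2^2)  s.t.  x1 + x2 = 1
H = np.eye(2)
c = np.zeros(2)
A_eq = np.array([[1.0, 1.0]])
b_eq = np.array([1.0])
x, lam = solve_eq_qp(H, c, A_eq, b_eq)
print(x, lam)       # x = [0.5, 0.5]
```

Note that for an equality constraint the multiplier's sign carries no meaning; the $\lambda \geq 0$ test applies only when the row came from an inequality.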
Why Check $\lambda_i < 0$? If $\lambda_i < 0$ for an "active" constraint, it means the constraint is actually helping us — the unconstrained direction would go further into the feasible region. But inequality constraints can only block us, not help. So a negative multiplier means we've wrongly included this constraint in the active set.
Geometric Interpretation
Picture the cost function as a bowl. You're trying to roll a ball to the bottom, but there are walls (constraint boundaries). The active set method:
- Rolls the ball downhill
- When it hits a wall, slides along the wall (constraint becomes active)
- If it reaches a corner, checks if it can leave any wall (remove from active set)
- Stops when at the lowest point that doesn't go through any wall
Minimize $f(x,y) = (x-1.5)^2 + 2(y-1)^2$ subject to $x \geq 0$, $y \geq 0$, $x + y \leq 1.5$. Green point: current iterate. Red highlighted lines: active constraints. The algorithm walks along boundaries toward the optimum.
Detailed Iterative Examples
Now let's work through the Active Set Method step-by-step with concrete numbers, showing every calculation. First a simple 2D example, then an MPC problem.
Example 1: Simple 2D Quadratic Program
Minimize:

$$f(x) = \frac{1}{2}(x_1^2 + 2x_2^2) - 2x_1 - 3x_2$$
Subject to:
- $g_1: x_1 + x_2 \leq 2$
- $g_2: -x_1 \leq 0$ (i.e., $x_1 \geq 0$)
- $g_3: -x_2 \leq 0$ (i.e., $x_2 \geq 0$)
In standard form $g_i(x) \leq 0$:
- $g_1(x) = x_1 + x_2 - 2 \leq 0$
- $g_2(x) = -x_1 \leq 0$
- $g_3(x) = -x_2 \leq 0$
Setup: Matrices and Gradients
First, let's identify the quadratic structure. Our objective is:

$$f(x) = \frac{1}{2}x^T H x + c^T x$$

where:

$$H = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \qquad c = \begin{bmatrix} -2 \\ -3 \end{bmatrix}$$

The gradient of $f$ is:

$$\nabla f = Hx + c = \begin{bmatrix} x_1 - 2 \\ 2x_2 - 3 \end{bmatrix}$$

The constraint gradients are:

$$\nabla g_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \nabla g_2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \qquad \nabla g_3 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}$$
Find the Unconstrained Minimum
Setting $\nabla f = 0$:

$$x_1 - 2 = 0, \qquad 2x_2 - 3 = 0$$
Unconstrained minimum: $x^{unc} = (2, 1.5)$
Check feasibility:
- $g_1(2, 1.5) = 2 + 1.5 - 2 = 1.5 > 0$ ❌ Violated!
- $g_2(2, 1.5) = -2 < 0$ ✓
- $g_3(2, 1.5) = -1.5 < 0$ ✓
The unconstrained minimum violates $g_1$, so we need the Active Set Method.
Iteration 1: Initialize
Starting point: $x^{(0)} = (0.5, 0.5)$ (feasible)
Working set: $\mathcal{W} = \emptyset$ (empty - no constraints assumed active)
Iteration 2: Solve Unconstrained Subproblem
With $\mathcal{W} = \emptyset$, we solve the unconstrained problem:

$$\nabla f = 0 \implies x^* = x^{unc} = (2, 1.5)$$
Check feasibility from $x^{(0)}$ to $x^*$:
We move along the line from $(0.5, 0.5)$ to $(2, 1.5)$:

$$x(t) = (1-t)\,x^{(0)} + t\,x^* = (0.5 + 1.5t,\; 0.5 + t), \qquad t \in [0, 1]$$

Check when we hit $g_1 = 0$ (the constraint $x_1 + x_2 = 2$):

$$(0.5 + 1.5t) + (0.5 + t) - 2 = 0 \implies 2.5t = 1 \implies t = 0.4$$

At $t = 0.4$:

$$x^{(1)} = (0.5 + 1.5 \cdot 0.4,\; 0.5 + 0.4) = (1.1,\; 0.9)$$
Update working set: $\mathcal{W} = \{g_1\}$
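This step-length computation is just a ratio test over the constraints; here it is numerically (NumPy sketch; the rows of `A` and entries of `b` restate the three constraints of this example, and the loop is my own illustrative code):

```python
import numpy as np

x0 = np.array([0.5, 0.5])
x_target = np.array([2.0, 1.5])        # unconstrained minimum
A = np.array([[1.0, 1.0],              # rows are the gradients of g_i,
              [-1.0, 0.0],             # written as g_i(x) = A_i x - b_i <= 0
              [0.0, -1.0]])
b = np.array([2.0, 0.0, 0.0])

p = x_target - x0
t = 1.0
for Ai, bi in zip(A, b):
    if Ai @ p > 0:                     # moving toward this boundary
        t = min(t, (bi - Ai @ x0) / (Ai @ p))
x_hit = x0 + t * p
print(t, x_hit)                        # t = 0.4, x_hit near (1.1, 0.9)
```

Only $g_1$ lies in the direction of motion, so it determines the step: $t = 0.4$.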
Iteration 3: Solve with $g_1$ Active
Now we minimize $f$ subject to $g_1(x) = 0$, i.e., $x_1 + x_2 = 2$.
Form the KKT system:

Since $g_1$ is an inequality in standard form $g_1(x) \leq 0$, we use the Lagrangian $\mathcal{L} = f(x) + \lambda_1 g_1(x)$, so that a legitimately active constraint gets $\lambda_1 \geq 0$. Stationarity, $Hx + c + \lambda_1 \nabla g_1 = 0$, together with the active constraint gives a linear system. Here $A = \nabla g_1^T = [1, 1]$ and $b = 2$ (from $x_1 + x_2 = 2$):

$$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \lambda_1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}$$

Solve the system:

From row 1: $x_1 + \lambda_1 = 2 \implies x_1 = 2 - \lambda_1$

From row 2: $2x_2 + \lambda_1 = 3 \implies x_2 = \frac{3 - \lambda_1}{2}$

From row 3: $x_1 + x_2 = 2$

Substituting:

$$(2 - \lambda_1) + \frac{3 - \lambda_1}{2} = 2 \implies 4 - 2\lambda_1 + 3 - \lambda_1 = 4 \implies \lambda_1 = 1$$

Therefore:

$$x^* = (1, 1), \qquad \lambda_1 = 1$$

Candidate solution: $x^* = (1, 1)$ with $\lambda_1 = 1$

Iteration 3: Check Multiplier Sign

We found $\lambda_1 = 1 > 0$ ✓, so $g_1$ legitimately belongs in the active set. Had we found $\lambda_1 < 0$, it would mean the constraint was "helping" us: removing it would raise the cost. But inequality constraints can only restrict where we can go, never help, so a negative multiplier signals that the constraint should be dropped from $\mathcal{W}$. One word of warning on conventions: if you instead write $\mathcal{L} = f - \lambda_1 g_1$, the same calculation yields $\lambda_1 = -1$ and the sign test flips. Pick one convention and use it consistently.
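A quick NumPy check of the hand calculation, solving the same $3 \times 3$ KKT system:

```python
import numpy as np

# KKT system for g1 active, convention Hx + c + A^T lam = 0
H = np.array([[1.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -3.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

K = np.block([[H, A.T], [A, np.zeros((1, 1))]])
rhs = np.concatenate([-c, b])
x1, x2, lam1 = np.linalg.solve(K, rhs)
print(x1, x2, lam1)      # 1.0, 1.0, 1.0 up to rounding
```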
Iteration 3: Check Feasibility
Check all constraints at $x^* = (1, 1)$:
- $g_1(1, 1) = 1 + 1 - 2 = 0$ ✓ (active, by construction)
- $g_2(1, 1) = -1 < 0$ ✓ (satisfied with slack)
- $g_3(1, 1) = -1 < 0$ ✓ (satisfied with slack)
All constraints satisfied!
Optimality Check
We have:
- $\lambda_1 = 1 > 0$ ✓ (multiplier non-negative)
- All constraints satisfied ✓
- KKT stationarity satisfied ✓
Final Answer: $x^* = (1, 1)$ with optimal cost $f(1,1) = \frac{1}{2}(1 + 2) - 2 - 3 = -3.5$
The multiplier $\lambda_1 = 1$ tells us: relaxing the constraint $x_1 + x_2 \leq 2$ to $x_1 + x_2 \leq 2.01$ would decrease the optimal cost by approximately $0.01$.
Watch the algorithm: start at $(0.5, 0.5)$, move toward unconstrained minimum $(2, 1.5)$, hit constraint $x_1 + x_2 = 2$, then solve constrained problem to find optimum at $(1, 1)$.
Example 2: MPC for a Simple System
Now let's see how Active Set applies to Model Predictive Control. We'll control a simple first-order system with input constraints.
System dynamics: $x_{k+1} = 0.8 x_k + 0.5 u_k$
Initial state: $x_0 = 3$
Prediction horizon: $N = 2$
Cost function: $J = x_1^2 + x_2^2 + 0.1u_0^2 + 0.1u_1^2$
Constraints: $-1 \leq u_k \leq 1$ for all $k$
Goal: Drive state to zero while respecting input limits
Step 1: Write States in Terms of Inputs
Using the dynamics recursively:

$$x_1 = 0.8 x_0 + 0.5 u_0 = 2.4 + 0.5 u_0$$

$$x_2 = 0.8 x_1 + 0.5 u_1 = 1.92 + 0.4 u_0 + 0.5 u_1$$
Step 2: Write Cost in Terms of Inputs Only
Substituting into $J$:

$$J = (2.4 + 0.5u_0)^2 + (1.92 + 0.4u_0 + 0.5u_1)^2 + 0.1u_0^2 + 0.1u_1^2$$

Expanding term by term:

First term: $(2.4 + 0.5u_0)^2 = 5.76 + 2.4u_0 + 0.25u_0^2$

Second term: $(1.92 + 0.4u_0 + 0.5u_1)^2 = 3.6864 + 1.536u_0 + 1.92u_1 + 0.16u_0^2 + 0.4u_0u_1 + 0.25u_1^2$

Collecting:

$$J = 0.51u_0^2 + 0.35u_1^2 + 0.4u_0u_1 + 3.936u_0 + 1.92u_1 + 9.4464$$

In matrix form, $J = \frac{1}{2}\mathbf{u}^T H \mathbf{u} + f^T \mathbf{u} + \text{const}$, where $\mathbf{u} = [u_0, u_1]^T$ and:

$$H = \begin{bmatrix} 1.02 & 0.4 \\ 0.4 & 0.7 \end{bmatrix}, \qquad f = \begin{bmatrix} 3.936 \\ 1.92 \end{bmatrix}$$
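This condensing step generalizes to any horizon; here is the same construction in NumPy (a sketch: `F` and `G` are my own names for the prediction matrices, and the factor of 2 comes from matching $\frac{1}{2}\mathbf{u}^T H \mathbf{u} + f^T\mathbf{u}$ to the expanded squares):

```python
import numpy as np

# Condensed QP for the horizon-2 problem: x_{k+1} = a x_k + bu u_k
a, bu, x0, N = 0.8, 0.5, 3.0, 2
q, r = 1.0, 0.1                 # weights on x_k^2 and u_k^2

# Stack predictions: [x1; x2] = F x0 + G [u0; u1]
F = np.array([a, a**2])
G = np.array([[bu, 0.0],
              [a * bu, bu]])

# J = q||F x0 + G u||^2 + r||u||^2  =>  0.5 u^T H u + f^T u + const
H = 2 * (q * G.T @ G + r * np.eye(N))
f = 2 * q * G.T @ (F * x0)
print(H)                        # approx [[1.02, 0.4], [0.4, 0.7]]
print(f)                        # approx [3.936, 1.92]

u_unc = np.linalg.solve(H, -f)
print(u_unc)                    # approx [-3.59, -0.69]
```

The printed matrices reproduce the hand expansion, and the unconstrained solve matches the numbers in the next step.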
Step 3: Write Constraints in Standard Form
Input constraints $-1 \leq u_k \leq 1$ become:
- $u_0 \leq 1 \implies g_1: u_0 - 1 \leq 0$
- $-u_0 \leq 1 \implies g_2: -u_0 - 1 \leq 0$
- $u_1 \leq 1 \implies g_3: u_1 - 1 \leq 0$
- $-u_1 \leq 1 \implies g_4: -u_1 - 1 \leq 0$
Step 4: Find Unconstrained Minimum
Solve $H\mathbf{u} = -f$:

$$\begin{bmatrix} 1.02 & 0.4 \\ 0.4 & 0.7 \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \end{bmatrix} = \begin{bmatrix} -3.936 \\ -1.92 \end{bmatrix}$$

Using Cramer's rule:

$\det(H) = 1.02 \times 0.7 - 0.4 \times 0.4 = 0.714 - 0.16 = 0.554$

$$u_0 = \frac{(-3.936)(0.7) - (0.4)(-1.92)}{0.554} = \frac{-1.9872}{0.554} \approx -3.59, \qquad u_1 = \frac{(1.02)(-1.92) - (-3.936)(0.4)}{0.554} = \frac{-0.384}{0.554} \approx -0.69$$

Unconstrained solution: $(u_0, u_1) = (-3.59, -0.69)$
Check constraints:
- $g_1: -3.59 - 1 = -4.59 < 0$ ✓
- $g_2: -(-3.59) - 1 = 2.59 > 0$ ❌ Violated! (means $u_0 < -1$)
- $g_3: -0.69 - 1 = -1.69 < 0$ ✓
- $g_4: -(-0.69) - 1 = -0.31 < 0$ ✓
The unconstrained $u_0 = -3.59$ violates the lower bound $u_0 \geq -1$.
Iteration 1: Add $g_2$ to Working Set
Set $u_0 = -1$ (constraint $g_2$ active) and minimize over $u_1$:

With $u_0 = -1$ fixed:

$$J(u_1) = 0.35u_1^2 + 1.52u_1 + 6.0204$$

Minimize: $\frac{dJ}{du_1} = 0.7u_1 + 1.52 = 0 \implies u_1 = -2.17$
Check constraints on $u_1$:
- $u_1 = -2.17 < -1$ ❌ Violated!
Iteration 2: Add $g_4$ to Working Set
Now both $u_0 = -1$ and $u_1 = -1$ (both at lower bounds).
Solution: $\mathbf{u}^* = (-1, -1)$
Check Multipliers

At $\mathbf{u}^* = (-1, -1)$, compute $\nabla J = H\mathbf{u} + f$:

$$\nabla J = \begin{bmatrix} 1.02(-1) + 0.4(-1) + 3.936 \\ 0.4(-1) + 0.7(-1) + 1.92 \end{bmatrix} = \begin{bmatrix} 2.516 \\ 0.82 \end{bmatrix}$$

The active constraints are $g_2: -u_0 - 1 \leq 0$ and $g_4: -u_1 - 1 \leq 0$, with gradients $\nabla g_2 = [-1, 0]^T$ and $\nabla g_4 = [0, -1]^T$.

KKT stationarity (for constraints $g_i(x) \leq 0$, using $\nabla J + \sum_i \lambda_i \nabla g_i = 0$):

$$\begin{bmatrix} 2.516 \\ 0.82 \end{bmatrix} + \lambda_2 \begin{bmatrix} -1 \\ 0 \end{bmatrix} + \lambda_4 \begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This gives: $\lambda_2 = 2.516 > 0$ ✓ and $\lambda_4 = 0.82 > 0$ ✓
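The same multiplier recovery in NumPy (a small check of the hand calculation):

```python
import numpy as np

H = np.array([[1.02, 0.4], [0.4, 0.7]])
f = np.array([3.936, 1.92])
u = np.array([-1.0, -1.0])        # candidate from the active set method

grad_J = H @ u + f                # gradient at the candidate
# Active constraint gradients: grad g2 = (-1, 0), grad g4 = (0, -1),
# so stationarity grad_J - lam2*(1,0)... i.e. grad_J = (lam2, lam4):
lam2, lam4 = grad_J
print(lam2, lam4)                 # approximately 2.516 and 0.82
```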
MPC Solution: The optimal control sequence is $u_0^* = -1$, $u_1^* = -1$ (both saturated at the lower bound).
This makes physical sense: starting at $x_0 = 3$, we want to drive to zero as fast as possible, so we apply maximum negative control.
Verify: Predicted Trajectory
With $u_0 = u_1 = -1$:

$$x_1 = 0.8(3) + 0.5(-1) = 1.9, \qquad x_2 = 0.8(1.9) + 0.5(-1) = 1.02$$

The state decreases: $3 \to 1.9 \to 1.02$. In real MPC, we'd apply only $u_0^* = -1$, measure the new state, and re-solve.
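Simulating the prediction confirms the trajectory (plain Python sketch):

```python
# Roll the dynamics x_{k+1} = 0.8 x_k + 0.5 u_k forward with the
# saturated inputs found by the active set method
a, b = 0.8, 0.5
x, u_seq = 3.0, [-1.0, -1.0]
traj = [x]
for uk in u_seq:
    x = a * x + b * uk
    traj.append(x)
print(traj)      # states go 3 -> 1.9 -> 1.02
```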
Left: the QP in $(u_0, u_1)$ space. The box shows input constraints. Blue contours are cost levels. Orange dot is the constrained optimum. Right: resulting state trajectory.
Key Takeaways from the Examples
- The algorithm is systematic: Start unconstrained, hit boundaries, solve constrained subproblems, check multiplier signs, repeat.
- Multipliers tell you if constraints belong: $\lambda_i \geq 0$ means constraint is legitimately active; $\lambda_i < 0$ means remove it.
- MPC naturally leads to QPs: Linear dynamics + quadratic cost + linear constraints = Quadratic Program.
- Saturation is common: When you want aggressive control, actuator limits often become active.
Summary
Lagrange Multipliers
Transform constrained optimization into finding stationary points of $\mathcal{L}(x, \lambda) = f(x) - \lambda g(x)$. The multiplier $\lambda$ measures the constraint's "cost."
Geometric Meaning
At the constrained optimum, $\nabla f$ is perpendicular to the constraint surface — parallel to $\nabla g$. Moving along the constraint can't improve $f$.
KKT Conditions
For inequalities: stationarity, primal feasibility, dual feasibility ($\lambda \geq 0$), and complementary slackness ($\lambda g = 0$).
Active Set Idea
Guess which constraints are active, solve the equality problem, then refine the guess based on multiplier signs and feasibility.
Why $\lambda < 0$ Means Remove
A negative multiplier means the constraint is "helping" us — but inequalities can only block, not help. So it shouldn't be in the active set.
The KKT System
For quadratic objectives with linear constraints, each subproblem is just a linear system. Standard linear algebra applies!
The Big Picture: Active Set Methods are elegant because they decompose a hard inequality-constrained problem into a sequence of easier equality-constrained problems. The art is in managing the active set — adding constraints we hit, removing ones with negative multipliers. For convex QPs, this converges finitely to the global optimum.