A Conceptual Tutorial

Model Predictive Control

Understanding MPC from the ground up — how predicting the future enables smarter control decisions today.

Introduction

Model Predictive Control (MPC) is one of the most powerful and widely used advanced control strategies in modern engineering. From chemical plants to self-driving cars, from building climate control to robotic locomotion, MPC has become the go-to approach when you need optimal control that respects real-world constraints.

What makes MPC special? In a nutshell, MPC uses a model of your system to predict what will happen in the future, then solves an optimization problem to find the best control actions. Unlike simpler controllers that only react to the current error, MPC thinks ahead.

The MPC Philosophy: "If I can predict the consequences of my actions, I can choose the best action by optimizing over those predictions."

This tutorial will build your understanding from the ground up. We'll start with the core idea, develop the mathematical formulation step by step, and build intuition for why MPC works so well in practice.

The Core Idea: Predict, Optimize, Apply

Imagine you're driving a car on a winding road. A simple reactive controller (like cruise control) only looks at your current speed and adjusts the throttle. But you, as a human driver, do something smarter: you look ahead at the road, anticipate the curves, and plan your speed and steering in advance.

MPC works exactly like this. At every moment, it:

The MPC Algorithm (Conceptual)
  1. Measure the current state of the system
  2. Predict future behavior over a horizon using the system model
  3. Optimize a sequence of future control inputs to minimize cost while respecting constraints
  4. Apply only the first control input from the optimal sequence
  5. Repeat at the next time step

The key insight is that while we compute an entire sequence of optimal future controls, we only actually apply the first one. Then we measure again, re-optimize, and repeat. This is called the receding horizon approach, and it's what makes MPC robust to model errors and disturbances.
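To make the loop concrete, here is a minimal Python sketch of the receding-horizon structure. The functions `measure_state`, `solve_mpc`, and `apply_input` are hypothetical placeholders for the plant interface and the optimizer we develop later in this tutorial.

```python
# A conceptual sketch of the MPC loop, not a full implementation.
# measure_state(), solve_mpc(), and apply_input() are hypothetical placeholders.

def mpc_loop(model, cost, constraints, N, n_steps):
    for k in range(n_steps):
        x = measure_state()                        # 1. measure the current state
        u_seq = solve_mpc(x, model, cost,          # 2-3. predict over the horizon and
                          constraints, horizon=N)  #      optimize the input sequence
        apply_input(u_seq[0])                      # 4. apply only the first input
        # 5. repeat: at the next step we re-measure and re-solve, discarding
        #    the rest of u_seq — this is the receding horizon
```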

Why Predict and Re-plan?

You might wonder: why not just compute the optimal control for all future time and be done with it? The answer is that real systems have:

• Model errors — the model never matches reality exactly
• Disturbances — unmeasured influences (wind gusts, load changes, sensor noise) that push the system off the predicted path
• Changing conditions — setpoints, constraints, and the environment can change over time

By re-planning at each step, MPC incorporates new information and corrects for any deviations. It's like constantly updating your GPS route based on current traffic.

The System Model

MPC needs a model to make predictions. In discrete-time (the most common setting for MPC), we typically use a state-space model:

$$x_{k+1} = f(x_k, u_k)$$

where $x_k \in \mathbb{R}^n$ is the state (everything we need to know about the system at time $k$), and $u_k \in \mathbb{R}^m$ is the control input (what we can manipulate). The function $f$ describes how the state evolves.

Linear Models

For many systems, or as an approximation around an operating point, we use linear models:

$$x_{k+1} = A x_k + B u_k$$

Here $A$ is the state transition matrix (how the state evolves on its own) and $B$ is the input matrix (how control affects the state). Linear MPC is computationally efficient because the resulting optimization problem is convex.
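As a tiny concrete example, here is a simulation of a linear model in numpy, using a double integrator (position and velocity driven by acceleration). The matrices are illustrative choices, not from any specific application.

```python
import numpy as np

# Double integrator with sample time dt: state x = [position, velocity],
# input u = [acceleration]. An illustrative example system.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.5 * dt**2],
              [dt]])

x = np.array([1.0, 0.0])   # initial state
u = np.array([-0.5])       # constant braking input, for illustration
for k in range(5):
    x = A @ x + B @ u      # x_{k+1} = A x_k + B u_k
    print(f"k={k+1}: x={x}")
```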

What Goes in the State?

The state $x_k$ should capture everything about the system's "memory" — everything needed to predict future behavior. For example:

• A car: position, heading, and speed
• A mass-spring system: position and velocity
• A chemical reactor: temperatures, pressures, and concentrations

Model Quality Matters: MPC is only as good as its model. A better model means better predictions, which means better control. This is both a strength (you can incorporate detailed physics) and a challenge (you need to develop and validate that model).

Prediction Over a Horizon

The heart of MPC is predicting the future. Given the current state $x_k$ and a sequence of planned control inputs $u_k, u_{k+1}, \ldots, u_{k+N-1}$, we can roll out the model to predict future states:

$$\begin{aligned} x_{k+1} &= A x_k + B u_k \\ x_{k+2} &= A x_{k+1} + B u_{k+1} \\ &\vdots \\ x_{k+N} &= A x_{k+N-1} + B u_{k+N-1} \end{aligned}$$

The number $N$ is called the prediction horizon. It determines how far ahead we look. (Some formulations use a separate, shorter control horizon after which the input is held constant; in this tutorial we take the two to be equal.)
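A minimal sketch of this rollout, reusing the linear model from above:

```python
import numpy as np

def rollout(A, B, x0, U):
    """Predict x_1, ..., x_N from x_0 under the input sequence U = [u_0, ..., u_{N-1}]."""
    xs = []
    x = x0
    for u in U:
        x = A @ x + B @ u   # one step of the model
        xs.append(x)
    return np.array(xs)     # shape (N, n): the predicted trajectory
```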

Choosing the Horizon

The prediction horizon $N$ is a crucial design choice:

• Too short: the controller is myopic — decisions that look good now may cause problems just beyond the horizon
• Too long: computation grows with $N$, and predictions far into the future become unreliable anyway

A good rule of thumb: the horizon should cover the system's "settling time" — how long it takes to respond to inputs.

Interactive: Prediction Horizon

The blue line shows the current state trajectory. The green dashed region shows the prediction horizon — the "window" into the future that MPC uses to optimize control actions.

The Cost Function

Once we can predict the future, we need to define what "good" means. This is done through a cost function (or objective function) that we want to minimize.

A typical MPC cost function penalizes:

• Tracking error — deviation of the state from its reference (where we want to be)
• Control effort — how much actuation we use to get there

The standard quadratic cost function over the prediction horizon is:

$$J = \sum_{i=0}^{N-1} \Big[ (x_{k+i} - x_{ref})^T Q (x_{k+i} - x_{ref}) + u_{k+i}^T R \, u_{k+i} \Big] + (x_{k+N} - x_{ref})^T Q_f (x_{k+N} - x_{ref})$$

Let's break this down:

• $(x_{k+i} - x_{ref})^T Q (x_{k+i} - x_{ref})$ is the stage cost on tracking error; the weight matrix $Q \succeq 0$ sets how much each state component matters
• $u_{k+i}^T R \, u_{k+i}$ is the stage cost on control effort, weighted by $R \succ 0$
• The final term, weighted by $Q_f$, is the terminal cost — a penalty on where the state ends up at the end of the horizon

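A sketch of evaluating this cost for a predicted trajectory, matching the formula above:

```python
import numpy as np

def mpc_cost(xs, U, x_ref, Q, R, Qf):
    """Quadratic MPC cost. xs = [x_k, ..., x_{k+N}] (N+1 states),
    U = [u_k, ..., u_{k+N-1}] (N inputs)."""
    J = 0.0
    for x, u in zip(xs[:-1], U):       # stage terms, i = 0 .. N-1
        e = x - x_ref
        J += e @ Q @ e + u @ R @ u
    e_N = xs[-1] - x_ref               # terminal term
    return J + e_N @ Qf @ e_N
```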
Tuning Q and R

The matrices $Q$ and $R$ let you express priorities:

• Large $Q$ relative to $R$: aggressive tracking — reach the target quickly, even at the cost of large control moves
• Large $R$ relative to $Q$: gentle control — conserve actuation and accept a slower response

Different diagonal entries in $Q$ let you prioritize some states over others (e.g., "temperature matters more than pressure").

Why Quadratic Costs? Quadratic costs are mathematically convenient (they lead to convex optimization problems with linear models) and intuitive (they create a smooth "bowl" where being twice as far from the target costs four times as much). They're the most common choice in practice.

Constraints: The MPC Superpower

Here's where MPC truly shines compared to simpler controllers. Real systems have constraints — physical limits that must not be violated:

• Actuator limits — valves saturate, motors have maximum torque, heaters have maximum power
• State limits — temperatures, pressures, and speeds must stay within safe operating ranges
• Rate limits — actuators can only change so fast

MPC handles these directly in the optimization:

$$\begin{aligned} u_{min} &\leq u_{k+i} \leq u_{max} & \text{(input bounds)} \\ x_{min} &\leq x_{k+i} \leq x_{max} & \text{(state bounds)} \\ \Delta u_{min} &\leq u_{k+i} - u_{k+i-1} \leq \Delta u_{max} & \text{(rate limits)} \end{aligned}$$

These constraints must hold for all $i = 0, 1, \ldots, N-1$ in the prediction horizon.

Why Constraints Matter

Consider a classic example: controlling the temperature of a chemical reactor. Without constraints, an optimal controller might command the heater to go to 200% power (impossible) or cool so fast that thermal shock damages the equipment (dangerous).

MPC finds the best control that respects all constraints. It automatically backs off when hitting limits, and it plans ahead to avoid constraint violations before they happen.

Interactive: Control with Constraints

Orange shows the control input (constrained to ±limit). Blue shows the resulting state trajectory. Notice how tighter constraints slow down the response — MPC finds the best achievable trajectory.

Constraint Satisfaction is Guaranteed: Unlike controllers where you add saturation "after the fact," MPC knows about constraints during optimization. It won't ask for the impossible, and it plans trajectories that stay within limits throughout the horizon.

The Optimization Problem

Putting it all together, at each time step $k$, MPC solves (written here for regulation to the origin, i.e. $x_{ref} = 0$, to reduce clutter):

$$\begin{aligned} \min_{u_k, \ldots, u_{k+N-1}} \quad & \sum_{i=0}^{N-1} \Big[ x_{k+i}^T Q \, x_{k+i} + u_{k+i}^T R \, u_{k+i} \Big] + x_{k+N}^T Q_f \, x_{k+N} \\[1em] \text{subject to} \quad & x_{k+i+1} = A x_{k+i} + B u_{k+i}, \quad i = 0, \ldots, N-1 \\ & u_{min} \leq u_{k+i} \leq u_{max} \\ & x_{min} \leq x_{k+i} \leq x_{max} \\ & x_k = x_{current} \quad \text{(initial condition)} \end{aligned}$$

This is a constrained optimization problem. The decision variables are the $N$ future control inputs. The dynamics act as equality constraints linking states to inputs.
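As one way to make this concrete, here is a sketch using the cvxpy modeling library (assuming cvxpy is installed; the box-style bounds and the default solver are illustrative simplifications):

```python
import cvxpy as cp
import numpy as np

def solve_mpc(A, B, x_current, N, Q, R, Qf, u_max, x_max):
    """Solve one MPC step; returns the first optimal input u_0*."""
    n, m = B.shape
    x = cp.Variable((N + 1, n))
    u = cp.Variable((N, m))

    cost = cp.quad_form(x[N], Qf)                         # terminal cost
    constraints = [x[0] == x_current]                     # initial condition
    for i in range(N):
        cost += cp.quad_form(x[i], Q) + cp.quad_form(u[i], R)
        constraints += [x[i + 1] == A @ x[i] + B @ u[i],  # dynamics
                        cp.abs(u[i]) <= u_max,            # illustrative box input bounds
                        cp.abs(x[i + 1]) <= x_max]        # illustrative box state bounds
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return u.value[0]                                     # apply only this one
```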

Solving the Problem

For linear MPC (linear model, quadratic cost, linear constraints), this is a Quadratic Program (QP) — a well-studied problem class with efficient solvers. Modern QP solvers can handle hundreds of variables in milliseconds, enabling real-time control.

For nonlinear MPC (nonlinear model or constraints), we get a general nonlinear program (NLP). These are harder but still tractable with modern solvers for many applications.

Dense vs. Sparse Formulations

There are two common ways to set up the QP:

• Dense (condensed): eliminate the states by substituting $\mathbf{X} = \mathbf{A} x_0 + \mathbf{B} \mathbf{U}$, leaving only the $Nm$ inputs as decision variables. Fewer variables, but a dense Hessian. This is the formulation developed below.
• Sparse: keep both states and inputs as decision variables, with the dynamics as equality constraints. More variables, but the problem matrices are sparse and banded, which structure-exploiting solvers handle efficiently.

How MPC Actually Solves the Problem

Now let's get into the mathematical heart of MPC — how do we actually find the optimal control sequence? There are two elegant approaches: Dynamic Programming (working backwards via Bellman's equation) and Batch Optimization (solving one big quadratic program). Both give the same answer, but offer different insights.

Approach 1: Dynamic Programming and the Bellman Equation

Just like in LQR, we can use Bellman's principle of optimality. Define the cost-to-go (or value function) $V_i(x)$ as the optimal cost starting from state $x$ with $i$ steps remaining in the horizon:

$$V_i(x) = \min_{u} \Big[ x^T Q x + u^T R u + V_{i-1}(Ax + Bu) \Big]$$

This says: the optimal cost with $i$ steps to go equals the immediate cost plus the optimal future cost, minimized over the current control $u$.

We solve this backwards in time, starting from the terminal condition:

$$V_0(x) = x^T Q_f x$$

This is the terminal cost — when there are zero steps remaining, we just pay the final state penalty.

The Quadratic Structure Propagates Backwards

Here's the beautiful part (just like LQR): if we assume the cost-to-go is quadratic, $V_i(x) = x^T P_i x$, then the Bellman equation preserves this structure. Let's verify by substituting into the Bellman equation:

$$V_i(x) = \min_{u} \Big[ x^T Q x + u^T R u + (Ax + Bu)^T P_{i-1} (Ax + Bu) \Big]$$

Expanding the quadratic term $(Ax + Bu)^T P_{i-1} (Ax + Bu)$:

$$= x^T A^T P_{i-1} A x + 2 x^T A^T P_{i-1} B u + u^T B^T P_{i-1} B u$$

So our minimization becomes:

$$\min_{u} \Big[ x^T(Q + A^T P_{i-1} A)x + u^T(R + B^T P_{i-1} B)u + 2x^T A^T P_{i-1} B u \Big]$$

This is a quadratic function in $u$ with no constraints (for now). To minimize, we take the gradient of the bracketed expression with respect to $u$ and set it to zero:

$$2(R + B^T P_{i-1} B)u + 2B^T P_{i-1} A x = 0$$

Solving for the optimal control:

$$u^*_i = -\underbrace{(R + B^T P_{i-1} B)^{-1} B^T P_{i-1} A}_{K_i} x = -K_i x$$

Key Result: The optimal control at each stage is linear in the state: $u^*_i = -K_i x$. The gain $K_i$ depends on how many steps remain — it's time-varying over the finite horizon.

The Riccati Recursion

Substituting $u^*_i$ back into the cost expression, after some algebra, we find that $V_i(x) = x^T P_i x$ where $P_i$ satisfies the discrete-time Riccati recursion:

$$P_i = Q + A^T P_{i-1} A - A^T P_{i-1} B (R + B^T P_{i-1} B)^{-1} B^T P_{i-1} A$$

Starting from $P_0 = Q_f$ and iterating backwards $i = 1, 2, \ldots, N$, we get the sequence of cost matrices $P_0, P_1, \ldots, P_N$ and corresponding gains $K_1, K_2, \ldots, K_N$.

This is exactly the same Riccati equation as LQR, but we run it for a finite number of steps rather than until convergence.
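A direct numpy translation of the backward recursion (a sketch under the setup above):

```python
import numpy as np

def riccati_gains(A, B, Q, R, Qf, N):
    """Backward Riccati recursion: returns the time-varying gains K_1, ..., K_N."""
    P = Qf                                     # P_0 = Q_f: zero steps remaining
    gains = []
    for i in range(1, N + 1):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)    # K_i = (R + B'PB)^{-1} B'PA
        P = Q + A.T @ P @ A - A.T @ P @ B @ K  # Riccati update for P_i
        gains.append(K)
    return gains                               # optimal control: u*_i = -K_i x
```

Running this with a large $N$, the gains settle to a constant — which is exactly the LQR connection discussed next.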

Connection to LQR

What happens if we let $N \to \infty$? The Riccati recursion converges to a steady-state $P_\infty$, and all gains become identical: $K_i \to K_\infty$. This is exactly the infinite-horizon LQR solution!

MPC ↔ LQR Connection: Unconstrained, infinite-horizon MPC with quadratic costs is mathematically equivalent to LQR. The optimal policy is $u = -K_\infty x$ where $K_\infty$ is the steady-state LQR gain. MPC generalizes LQR by handling finite horizons and constraints.

But What About Constraints?

The dynamic programming approach gives us the unconstrained solution elegantly. But MPC's real power is handling constraints. With constraints like $u_{min} \leq u \leq u_{max}$, we can no longer just set the gradient to zero — the optimal $u$ might be at a constraint boundary.

This is where the batch QP formulation becomes essential.

Approach 2: The Batch Quadratic Program

Instead of solving backwards in time, we can stack everything into one big optimization problem. Define the stacked vectors:

$$\mathbf{U} = \begin{bmatrix} u_0 \\ u_1 \\ \vdots \\ u_{N-1} \end{bmatrix} \in \mathbb{R}^{Nm}, \quad \mathbf{X} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} \in \mathbb{R}^{Nn}$$

The dynamics $x_{k+1} = Ax_k + Bu_k$ can be written in matrix form. Starting from $x_0$:

$$\mathbf{X} = \underbrace{\begin{bmatrix} A \\ A^2 \\ \vdots \\ A^N \end{bmatrix}}_{\mathbf{A}} x_0 + \underbrace{\begin{bmatrix} B & 0 & \cdots & 0 \\ AB & B & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix}}_{\mathbf{B}} \mathbf{U}$$

Or compactly: $\mathbf{X} = \mathbf{A} x_0 + \mathbf{B} \mathbf{U}$
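A sketch of building these stacked matrices in numpy:

```python
import numpy as np

def batch_matrices(A, B, N):
    """Build A_bar, B_bar so that X = A_bar @ x0 + B_bar @ U."""
    n, m = B.shape
    A_bar = np.vstack([np.linalg.matrix_power(A, i) for i in range(1, N + 1)])
    B_bar = np.zeros((N * n, N * m))
    for i in range(N):            # block row i predicts x_{i+1}
        for j in range(i + 1):    # column block j carries the effect of u_j
            B_bar[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    return A_bar, B_bar
```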

The cost function becomes a quadratic in $\mathbf{U}$:

$$J = \mathbf{X}^T \bar{Q} \mathbf{X} + \mathbf{U}^T \bar{R} \mathbf{U}$$

where $\bar{Q} = \text{diag}(Q, Q, \ldots, Q, Q_f)$ and $\bar{R} = \text{diag}(R, R, \ldots, R)$.

Substituting $\mathbf{X} = \mathbf{A} x_0 + \mathbf{B} \mathbf{U}$:

$$J = (\mathbf{A} x_0 + \mathbf{B} \mathbf{U})^T \bar{Q} (\mathbf{A} x_0 + \mathbf{B} \mathbf{U}) + \mathbf{U}^T \bar{R} \mathbf{U}$$

Expanding and collecting terms:

$$J = \mathbf{U}^T \underbrace{(\mathbf{B}^T \bar{Q} \mathbf{B} + \bar{R})}_{H} \mathbf{U} + 2 x_0^T \underbrace{\mathbf{A}^T \bar{Q} \mathbf{B}}_{F^T} \mathbf{U} + \text{const}$$

This is a quadratic function in $\mathbf{U}$: $J = \mathbf{U}^T H \mathbf{U} + 2 x_0^T F^T \mathbf{U} + \text{const}$
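Continuing the sketch — forming $H$ and $f$ and recovering the unconstrained minimizer (this version assumes scipy is available for block_diag):

```python
import numpy as np
from scipy.linalg import block_diag

def condensed_qp(A_bar, B_bar, Q, R, Qf, N, x0):
    """Form H and f for the condensed QP, plus the unconstrained optimum."""
    Q_bar = block_diag(*([Q] * (N - 1) + [Qf]))   # diag(Q, ..., Q, Qf)
    R_bar = block_diag(*([R] * N))                # diag(R, ..., R)
    H = B_bar.T @ Q_bar @ B_bar + R_bar
    f = B_bar.T @ Q_bar @ A_bar @ x0
    U_star = -np.linalg.solve(H, f)               # bottom of the bowl (no constraints)
    return H, f, U_star
```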

The QP with Constraints

Now we add constraints. Input constraints $u_{min} \leq u_k \leq u_{max}$ become:

$$\mathbf{u}_{min} \leq \mathbf{U} \leq \mathbf{u}_{max}$$

State constraints $x_{min} \leq x_k \leq x_{max}$ become (using $\mathbf{X} = \mathbf{A} x_0 + \mathbf{B} \mathbf{U}$):

$$\mathbf{x}_{min} \leq \mathbf{A} x_0 + \mathbf{B} \mathbf{U} \leq \mathbf{x}_{max}$$

Rearranging: $\mathbf{x}_{min} - \mathbf{A} x_0 \leq \mathbf{B} \mathbf{U} \leq \mathbf{x}_{max} - \mathbf{A} x_0$

The complete MPC problem is now a standard Quadratic Program (QP):

$$\begin{aligned} \min_{\mathbf{U}} \quad & \frac{1}{2} \mathbf{U}^T H \mathbf{U} + f^T \mathbf{U} \\[0.5em] \text{subject to} \quad & G \mathbf{U} \leq w \end{aligned}$$

where $H = \mathbf{B}^T \bar{Q} \mathbf{B} + \bar{R}$, $f = \mathbf{B}^T \bar{Q} \mathbf{A} x_0$, and $G, w$ encode all the inequality constraints.

The QP Structure: MPC with linear dynamics, quadratic costs, and linear constraints reduces to a convex QP. The Hessian $H$ is positive definite (since $R \succ 0$), guaranteeing a unique global minimum. This is why linear MPC is so tractable — we're solving a well-understood convex optimization problem.

Solving the QP: How the Algorithms Actually Work

We've reduced MPC to solving a Quadratic Program at each time step. But how do QP solvers actually find the optimal $\mathbf{U}^*$? Let's build intuition for the three main approaches.

The Geometry of Constrained Optimization

First, visualize what we're doing. The cost function $J(\mathbf{U}) = \frac{1}{2}\mathbf{U}^T H \mathbf{U} + f^T \mathbf{U}$ is a bowl in the space of control sequences. Without constraints, the minimum is at the bottom of the bowl:

$$\mathbf{U}^*_{unconstrained} = -H^{-1} f$$

But constraints carve out a feasible region — a polytope (multi-dimensional polygon) defined by the linear inequalities $G\mathbf{U} \leq w$. The constrained optimum is either:

• the unconstrained minimum itself, if it happens to lie inside the feasible region, or
• a point on the boundary, where the cost contours first touch the polytope.

The Key Insight: At the constrained optimum, some constraints are active (binding, satisfied with equality) and others are inactive (satisfied with slack). If we knew which constraints were active, we could solve an easier equality-constrained problem!

Method 1: Active Set Methods

Core idea: Guess which constraints are active, solve the resulting equality-constrained problem, then update our guess.

Imagine you're at a point inside the feasible region. You want to roll downhill on the cost bowl, but walls (constraint boundaries) might block you. The active set method works like this:

Active Set Algorithm (Conceptual)
  1. Start with a feasible point and an initial guess of which constraints are active (the "working set")
  2. Solve the equality-constrained subproblem: minimize $J(\mathbf{U})$ treating active constraints as equalities
  3. Check optimality: Compute Lagrange multipliers $\lambda_i$ for active constraints
    • If all $\lambda_i \geq 0$: we're optimal! The active constraints are "pushing" in the right direction
    • If some $\lambda_i < 0$: that constraint wants to be inactive — remove it from the working set
  4. Check feasibility: If the solution violates an inactive constraint, add that constraint to the working set
  5. Repeat until optimal

The equality-constrained subproblem is easy to solve. If active constraints are $A_{eq}\mathbf{U} = b_{eq}$, we form the KKT system:

$$\begin{bmatrix} H & A_{eq}^T \\ A_{eq} & 0 \end{bmatrix} \begin{bmatrix} \mathbf{U} \\ \lambda \end{bmatrix} = \begin{bmatrix} -f \\ b_{eq} \end{bmatrix}$$

This is just a linear system — solve it with standard linear algebra!
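In numpy, one such KKT solve is a few lines (a sketch):

```python
import numpy as np

def solve_eq_qp(H, f, A_eq, b_eq):
    """Minimize 1/2 U'HU + f'U subject to A_eq U = b_eq.
    Returns the minimizer U and the Lagrange multipliers lam."""
    p = A_eq.shape[0]
    KKT = np.block([[H,    A_eq.T],
                    [A_eq, np.zeros((p, p))]])
    rhs = np.concatenate([-f, b_eq])
    sol = np.linalg.solve(KKT, rhs)
    return sol[:H.shape[0]], sol[H.shape[0]:]     # (U, lam)
```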

Lagrange multipliers $\lambda_i$ have a beautiful interpretation: they measure the "cost" of a constraint. If $\lambda_i > 0$, the constraint is actively limiting us (removing it would improve the cost). If $\lambda_i < 0$, something's wrong — the constraint is in our working set but shouldn't be active.

Interactive: Active Set Method

2D example: minimize a quadratic (elliptical contours) subject to linear constraints (gray region). Green point is current iterate. Red lines show active constraints. Watch how the algorithm walks along constraint boundaries toward the optimum.

Pros: Finite convergence (for convex QPs), exact solution, exploits warm-starting (if constraints don't change much between MPC steps, the active set is similar)

Cons: Worst-case exponential in number of constraints (rare in practice), needs a feasible starting point

Method 2: Interior Point Methods

Core idea: Instead of walking along constraint boundaries, stay strictly inside the feasible region and approach the boundary smoothly.

Interior point methods replace the hard inequality constraints with a soft barrier function that goes to infinity as you approach the boundary:

$$\min_{\mathbf{U}} \; \frac{1}{2}\mathbf{U}^T H \mathbf{U} + f^T \mathbf{U} - \mu \sum_{i} \log(w_i - g_i^T \mathbf{U})$$

The log barrier $-\log(w_i - g_i^T \mathbf{U})$ is:

• nearly flat when far from the constraint boundary, so it barely affects the solution there
• increasingly steep as $w_i - g_i^T \mathbf{U} \to 0^+$, rising to infinity at the boundary

The parameter $\mu > 0$ controls the "strength" of the barrier. For large $\mu$, the barrier dominates and we stay far from boundaries. As $\mu \to 0$, the barrier weakens and we can approach the true constrained optimum.

Interior Point Algorithm (Conceptual)
  1. Start with a strictly feasible point (inside all constraints) and large $\mu$
  2. Solve the barrier problem using Newton's method — this gives us a point on the "central path"
  3. Reduce $\mu$ (typically $\mu \leftarrow \mu / 10$)
  4. Repeat until $\mu$ is tiny and we're close enough to the true optimum

Newton's method for the barrier subproblem uses the gradient and Hessian of the barrier-augmented cost. Writing $s_i = w_i - g_i^T \mathbf{U}$ for the slack in constraint $i$, each Newton step solves the linear system:

$$\Big( H + \mu \sum_i \frac{g_i g_i^T}{s_i^2} \Big) \Delta \mathbf{U} = -\Big( H \mathbf{U} + f + \mu \sum_i \frac{g_i}{s_i} \Big)$$

Don't worry about the details — the key point is that each step is still just a linear system, and the barrier's curvature automatically pushes iterates away from the boundary.
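A sketch of one such Newton step for the polytope constraints $G\mathbf{U} \leq w$, assuming the current iterate is strictly feasible (a practical solver would add a feasibility-preserving line search):

```python
import numpy as np

def barrier_newton_step(H, f, G, w, U, mu):
    """One Newton step on the log-barrier subproblem."""
    s = w - G @ U                                   # slacks; all > 0 strictly inside
    grad = H @ U + f + mu * G.T @ (1.0 / s)         # gradient of barrier-augmented cost
    hess = H + mu * G.T @ np.diag(1.0 / s**2) @ G   # its Hessian
    return U - np.linalg.solve(hess, grad)          # full step (real solvers line-search)
```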

Interactive: Interior Point Method

Same problem as above. The blue curve shows the "central path" — the set of barrier-problem solutions as $\mu$ varies. As $\mu \to 0$, the path approaches the true constrained optimum (red star). Interior point follows this path.

Pros: Polynomial complexity (guaranteed efficient), handles many constraints well, numerically stable

Cons: Each iteration is expensive (large linear systems), less suited for warm-starting, needs strictly feasible start

Method 3: ADMM and First-Order Methods

Core idea: Avoid solving big linear systems altogether. Use only matrix-vector multiplications and simple operations.

ADMM (Alternating Direction Method of Multipliers) splits the problem into easier pieces. Rewrite our QP as:

$$\min_{\mathbf{U}, \mathbf{z}} \; \frac{1}{2}\mathbf{U}^T H \mathbf{U} + f^T \mathbf{U} + \mathcal{I}_{\mathcal{C}}(\mathbf{z}) \quad \text{subject to} \quad G\mathbf{U} = \mathbf{z}$$

where $\mathcal{I}_{\mathcal{C}}(\mathbf{z})$ is the indicator function of the constraint set: 0 if $\mathbf{z} \leq w$, infinity otherwise.

ADMM forms the augmented Lagrangian and alternates between optimizing over $\mathbf{U}$ and $\mathbf{z}$:

ADMM Algorithm
  1. U-update: $\mathbf{U}^{k+1} = \arg\min_{\mathbf{U}} \frac{1}{2}\mathbf{U}^T H \mathbf{U} + f^T \mathbf{U} + \frac{\rho}{2}\|G\mathbf{U} - \mathbf{z}^k + \mathbf{y}^k\|^2$
    → This is an unconstrained QP: solve $(H + \rho G^T G)\mathbf{U} = -f + \rho G^T(\mathbf{z}^k - \mathbf{y}^k)$
  2. z-update: $\mathbf{z}^{k+1} = \Pi_{\mathcal{C}}(G\mathbf{U}^{k+1} + \mathbf{y}^k)$
    → Just project onto the constraint set! For the one-sided bounds here: $z_i = \min\big(w_i, (G\mathbf{U}^{k+1} + \mathbf{y}^k)_i\big)$
  3. y-update: $\mathbf{y}^{k+1} = \mathbf{y}^k + G\mathbf{U}^{k+1} - \mathbf{z}^{k+1}$
    → Update the dual variable (running sum of constraint violations)
  4. Repeat until convergence
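These three updates translate almost line-for-line into numpy. A sketch, with a fixed penalty $\rho$ and a naive linear solve where a real implementation would factorize $(H + \rho G^T G)$ once offline:

```python
import numpy as np

def admm_qp(H, f, G, w, rho=1.0, iters=200):
    """ADMM for min 1/2 U'HU + f'U subject to G U <= w (scaled dual form)."""
    nU, nc = H.shape[0], G.shape[0]
    U, z, y = np.zeros(nU), np.zeros(nc), np.zeros(nc)
    M = H + rho * G.T @ G                                 # constant across iterations
    for _ in range(iters):
        U = np.linalg.solve(M, -f + rho * G.T @ (z - y))  # U-update: linear solve
        z = np.minimum(w, G @ U + y)                      # z-update: projection (clip)
        y = y + G @ U - z                                 # y-update: dual variable
    return U
```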

Why is this good?

• The U-update solves a linear system whose matrix $(H + \rho G^T G)$ never changes — factorize it once and reuse the factorization every iteration
• The z-update is a cheap elementwise projection (clipping)
• The y-update is a simple vector addition

Embedded Systems Love ADMM: The operations are simple (matrix-vector multiply, clipping), memory footprint is small, and you can pre-compute the matrix factorization offline. This makes ADMM ideal for microcontrollers running MPC at high rates. OSQP, a popular open-source QP solver, uses ADMM.

Interactive: ADMM Convergence

ADMM alternates between an unconstrained minimization (moving toward bowl center) and projection onto constraints (snapping back to feasible region). The iterates zigzag but converge to the optimum.

Pros: Simple operations, low memory, excellent for embedded systems, warm-starting friendly

Cons: Slow convergence to high accuracy (but often good enough for MPC), needs tuning of $\rho$

Which Solver Should You Use?

| Scenario | Recommended | Why |
| --- | --- | --- |
| Small problems, high accuracy needed | Active Set | Exact solution, fast for few constraints |
| Large problems, many constraints | Interior Point | Polynomial complexity, handles scale well |
| Embedded systems, real-time control | ADMM | Simple operations, pre-computation, warm start |
| MPC with similar problems each step | Active Set or ADMM | Warm-starting from previous solution |

What the Solution Looks Like

The QP solver returns the optimal control sequence $\mathbf{U}^* = [u_0^*, u_1^*, \ldots, u_{N-1}^*]^T$. We extract and apply only $u_0^*$.

When no constraints are active, the solution is the unconstrained optimum:

$$\mathbf{U}^* = -H^{-1} f = -H^{-1} \mathbf{B}^T \bar{Q} \mathbf{A} x_0$$

This is linear in $x_0$, so extracting the first component gives $u_0^* = -K x_0$ — exactly what dynamic programming predicted!

When constraints are active, the solution is more complex — it's piecewise affine in $x_0$. The state space is partitioned into regions, and in each region the optimal control is an affine function of the state (but with different coefficients in each region).

Interactive: Unconstrained vs Constrained Solution

Blue dashed: unconstrained optimal control (linear in state). Orange solid: constrained optimal (saturates at limits). The constrained solution equals the unconstrained when constraints aren't active.

Summary: Two Views, One Solution

| Aspect | Dynamic Programming | Batch QP |
| --- | --- | --- |
| Direction | Backwards in time | All at once |
| Key equation | Bellman + Riccati | QP: $\min \frac{1}{2}\mathbf{U}^T H \mathbf{U} + f^T \mathbf{U}$ |
| Handles constraints | Complex (requires modifications) | Natural (linear inequalities) |
| Insight | Structure of optimal solution | Practical computation |
| Result (unconstrained) | $u^* = -K_i x$ | $u^* = -K x$ (same!) |

Both approaches reveal that MPC finds a control that balances immediate cost against future cost, respecting the system dynamics. The batch QP is what's actually implemented in practice because it handles constraints directly.

The Receding Horizon Principle

Remember: although we compute $N$ optimal control inputs $u_k^*, u_{k+1}^*, \ldots, u_{k+N-1}^*$, we only apply the first one $u_k^*$. Then we:

  1. Wait for the next time step
  2. Measure the new state $x_{k+1}$
  3. Solve a new optimization starting from $x_{k+1}$
  4. Apply the first input of the new sequence, $u_{k+1}^*$
  5. Repeat forever

The horizon "recedes" — it always extends $N$ steps into the future from the current time. This is also called moving horizon or rolling horizon control.

Interactive: Receding Horizon Animation

Watch the prediction horizon (green) move forward with each time step. The solid blue line is the actual trajectory; the dashed line is the current prediction. Only the first control action is applied before re-planning.

Why Does This Work?

The receding horizon approach provides feedback. Even though each optimization is open-loop (it assumes no future disturbances), repeatedly re-solving creates closed-loop behavior:

• Every re-solve starts from the measured state, not the predicted one, so disturbances and model errors are corrected at each step
• New information — changed setpoints, updated constraints — enters the problem as soon as it is available

Implicit Feedback: MPC transforms open-loop optimal control into closed-loop control through repeated re-optimization. This is sometimes called "implicit feedback" because the feedback law isn't given by a formula — it's implicitly defined by the optimization problem.

MPC vs. LQR: What's the Difference?

If you know about the Linear Quadratic Regulator (LQR), you might wonder how MPC compares. They're related but serve different purposes:

| Aspect | LQR | MPC |
| --- | --- | --- |
| Constraints | Cannot handle directly | Handles explicitly |
| Horizon | Infinite (or finite with terminal cost) | Finite, receding |
| Solution | Closed-form: $u = -Kx$ | Online optimization at each step |
| Computation | Offline (compute $K$ once) | Online (solve QP each step) |
| Nonlinear systems | Requires linearization | Can use nonlinear models |
| Time-varying setpoints | Requires modification | Natural to include |

The Connection

Interestingly, unconstrained linear MPC with infinite horizon is equivalent to LQR. The MPC solution would be $u_k = -Kx_k$ where $K$ is the LQR gain. In this sense, MPC generalizes LQR to handle constraints and finite horizons.

In practice, when constraints aren't active, a well-tuned MPC behaves similarly to LQR. The power of MPC emerges when you need to push the system hard and constraints become active.

Summary: The MPC Recipe

Let's consolidate everything into the complete MPC picture:

1. Model

You need a dynamic model $x_{k+1} = f(x_k, u_k)$ that predicts how your system evolves. Better models give better control.

2. Cost Function

Define what "good" means via $J = \sum (x^T Q x + u^T R u)$. Tune $Q$ and $R$ to balance tracking vs. effort.

3. Constraints

Specify physical limits on inputs and states. MPC will respect these while optimizing performance.

4. Horizon

Choose $N$ long enough to see relevant dynamics but short enough for tractable computation.

5. Solve Online

At each time step, solve the constrained optimization to get the optimal control sequence.

6. Apply First Input

Use only $u_k^*$, then re-measure and re-solve. The receding horizon provides robustness.

The MPC Mindset: MPC is about looking ahead and planning optimally within physical limits. It trades computational effort for performance and constraint satisfaction. When you need to operate a system near its limits while respecting hard constraints, MPC is often the answer.

Where is MPC Used?

MPC has found success in countless applications:

• Process industries — chemical plants and refineries, where MPC was first adopted
• Automotive — trajectory planning and control for self-driving cars
• Buildings — climate control that anticipates occupancy and weather
• Robotics — locomotion and manipulation under actuator limits

Next Steps

To deepen your understanding, explore:

• Nonlinear MPC — using nonlinear models and NLP solvers
• Stability theory — how terminal costs and terminal constraints guarantee closed-loop stability
• Explicit MPC — pre-computing the piecewise-affine control law offline
• Solver internals — e.g., the ADMM-based OSQP solver mentioned above