Model Predictive Control
Feedback: because the universe refuses to wait for you to get it right.
Prerequisites
- Linear Algebra
- Differential equations
- Probability and statistics
- Control theory:
- First- and second-order system response
- Transfer functions and feedback
- Linear Time Invariant (LTI) Systems
- Signals and Systems:
- Time-domain analysis
- Frequency-domain analysis
- Optimization:
- Convex sets and functions
- Unconstrained and constrained optimization
- Numerical optimization methods
Notations
Throughout these lecture notes, we will use the following subscripts, superscripts, and accents to denote specific meanings:
- $x \in \mathbb{R}^{n}$: vector $x$ of dimension $n$
- $x_i$ with $i \in \mathbb{N}$, $i=1,\cdots,n$ for $x\in\mathbb{R}^{n}$: component $i$ of vector $x$
- $x_k$: value of $x$ at time step $k$ (discrete time)
- $x^+$ or $x_{k+1}$: value of $x$ at next time step (discrete time)
- $\dot{x}$: time derivative of $x$ (continuous time)
- $x^*$: optimal value of $x$
- $x^\top$: transpose of vector or matrix $x$
- $\hat{x}$: estimate of variable $x$ (see chapter on State Estimation)
- $\hat{x}^-$: estimate of variable $x$ before measurement update
- $\tilde{x}$: estimation error of variable $x$
- $x_s$: steady state value of variable $x$
General Motivation
Even though feedback control has been applied for more than two millennia, the systematic analysis of dynamical systems is relatively recent, beginning with James Clerk Maxwell’s pioneering work about 150 years ago. Since then, the field has advanced spectacularly, driven by contributions from mathematicians, engineers, and physicists alike. Laplace, Lyapunov, Kolmogorov, Wiener, Nyquist, Bode, and Bellman are just a few of the towering figures who shaped what we know today as control theory.
Despite these advancements, classical control methods often struggle with complex, high-dimensional systems, particularly those with constraints and uncertainties. This has led to the exploration of more advanced control strategies, such as Model Predictive Control (MPC), which explicitly considers system constraints and optimizes control actions over a prediction horizon.
In the pursuit of optimality one is therefore forced to consider approximate solutions, and this is perhaps the single most important reason behind the phenomenal success of model predictive control (MPC). MPC is arguably the most widely accepted modern control strategy because it offers, through its receding horizon implementation, an eminently sensible compromise between optimality and speed of computation.
MPC has found applications in various fields, including chemical process control, automotive systems, aerospace, and robotics. Its ability to handle multivariable systems and constraints makes it a powerful tool for modern control challenges.
Course Content
Introduction to MPC
In control theory, we usually talk about controllers and regulators. A controller is a device or algorithm that manages the behavior of a system to achieve a desired outcome. Controllers can be simple, like a thermostat that maintains room temperature, or complex, like those used in aerospace applications to stabilize aircraft. A regulator, on the other hand, is a specific type of controller that focuses on maintaining a system’s output at a desired setpoint, often by minimizing deviations from that setpoint. Regulators are commonly used in systems where stability and precision are crucial, such as in power supply systems or industrial processes.
Linear Quadratic Regulator (LQR)
In this introductory chapter, we will explore a specific type of regulator known as the Linear Quadratic Regulator (LQR). The LQR is a fundamental concept in control theory that provides an optimal solution for controlling linear systems with quadratic cost functions. We will delve into the mathematical formulation of LQR, its properties, and its applications in various engineering fields. To help illustrate these concepts, we will refer to a video lecture by Christopher Lum, which provides a comprehensive overview of LQR.
You might ask yourself, why quadratic? The quadratic cost function is chosen for several reasons:
- Mathematical Convenience: Quadratic functions are mathematically tractable, allowing for analytical solutions in many cases. This makes it easier to derive optimal control laws.
- Convexity: Quadratic functions are convex, which ensures that any local minimum is also a global minimum. This property is crucial for optimization problems, as it guarantees that the solution found is the best possible one. (see the mathematical foundation on optimization)
Setting up the optimization problem:
To understand the Linear Quadratic Regulator (LQR), let's consider a practical example: a dynamical system with multiple states and multiple control inputs, such as a satellite. Our system has several states that need to be controlled or monitored, for example its orientation, position, etc. We represent the state of the satellite as a vector $x(t) \in \mathbb{R}^n$, where $t$ is the time and $n$ is the dimension of the vector, i.e. the number of states we identified in the system. We will denote by $x_i$, with $i \in \mathbb{N}$, $i = 1, \cdots, n$, the state $i$ of the satellite. The state vector is represented as follows:
The satellite might also have multiple control inputs that we can use to influence its state, for example main thrusters, electrical thrusters, momentum wheels, etc. We represent the control inputs as a vector $u(t) \in \mathbb{R}^m$, with $m$ input commands identified by the components $u_i$ with $i=1,\cdots, m$. The control vector is represented as follows:
We will assume that the dynamics of this system is linear, governed by the following state-space representation: $$ \dot{x}(t) = Ax(t) + Bu(t)\tag{1.1.1}\label{eq:1_1_dynamics} $$
With $A \in \mathbb{R}^{n \times n}$ the state matrix and $B \in \mathbb{R}^{n \times m}$ the input matrix. These matrices define how the state of the system evolves over time and how the control inputs affect the state; they are usually fixed and defined by the physics of the system. The goal of the controller is to determine the appropriate control inputs $u(t)$ that will drive the state $x(t)$ to a desired state, often the origin (zero state), while minimizing a cost function that penalizes deviations from the desired state and excessive control effort.
Positive semi-definite and positive definite matrices
Positive semi-definite: $$ x^\top Q x \geq 0, \quad \forall\ x $$
Similarly, positive definite: $$ u^\top R u > 0, \quad \forall\ u $$
The terms $x(t)^\top Q x(t)$ and $u(t)^\top R u(t)$ are scalars. This is because:
- $x(t)^\top$ is a $1 \times n$ vector
- $Q$ is a $n \times n$ matrix
- $x(t)$ is a $n \times 1$ vector
Thus, the multiplication $x(t)^\top Q x(t)$ results in a $1 \times 1$ matrix, which is a scalar. The same reasoning applies to the term $u(t)^\top R u(t)$.
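As a quick numerical sanity check (a small sketch with illustrative matrices; note that $M^\top M$ is positive semi-definite for any $M$), we can confirm that $x^\top Q x$ collapses to a nonnegative scalar:

```python
import numpy as np

# Illustrative positive semi-definite Q: M^T M is PSD for any matrix M
M = np.array([[1.0, 2.0], [0.0, 3.0]])
Q = M.T @ M

x = np.array([[-1.5], [0.7]])  # n x 1 state vector

# (1 x n) @ (n x n) @ (n x 1) -> 1 x 1
cost = x.T @ Q @ x

print(cost.shape)           # (1, 1): a 1 x 1 matrix, i.e. a scalar
print(cost.item() >= 0)     # True for any x when Q is positive semi-definite
```

The same shape argument applies verbatim to $u^\top R u$.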
The way the cost function $J$ is set up here, the integrand is always nonnegative (due to the properties of $Q$ and $R$) for any $x(t)$ and $u(t)$ combination. The matrices $Q$ and $R$ are weighting matrices that allow us to tune the cost function, trading off between non-zero states and non-zero control inputs. We will think of $Q$ and $R$ as weights that determine how much we value the state relative to the control input.
The next step is to formulate the optimization problem, which consists in finding the control input $u(t)$ that minimizes the cost function $J$ while satisfying the dynamics of the system \eqref{eq:1_1_dynamics}. This can be expressed mathematically by the value function $V(x)$ defined in the box below.
In words, we seek the control input $u(t)$ that achieves the smallest possible cost $J$ among all inputs consistent with the dynamics \eqref{eq:1_1_dynamics}.
If we take a non-zero initial state for our satellite, i.e. $x(0) \neq 0$ (the satellite is at some unwanted orientation and position), then the term $x(t)^\top Q x(t)$ in the cost function \eqref{eq:cost_func_ct} is positive. If the system is left in that state without any control input, the cost function $J$ blows up to infinity as time goes on, because the integral from time 0 to infinity in \eqref{eq:cost_func_ct} accumulates the positive value of $x(t)^\top Q x(t)$ over time.
This is not an optimal solution: we cannot leave the satellite in that state, as it yields an infinite cost. Instead, it is better to try to bring the system back to the origin, i.e. $x(t) \to 0$. This makes the term $x(t)^\top Q x(t)$ in the cost function \eqref{eq:cost_func_ct} decrease over time, so the integral converges to a finite value. This is a much better solution, as it minimizes the cost function $J$.
$^*$The plots are illustrative and not based on actual numerical simulations.
The flip side of the story is how to bring the state back to zero. In our case, we can use thrusters, momentum wheels, etc. to influence the state of the satellite, meaning we can use the control input $u(t)$. However, using $u(t)$ also comes with a cost, represented by the term $u(t)^\top R u(t)$ in the cost function \eqref{eq:cost_func_ct}. If we use too much control input, this term will become large and will also contribute to increasing the cost function $J$.
The cost function is a combination of how long the system is away from the origin (non-zero state) and how much non-zero control input we are using to bring it back to the origin. The goal is to find a balance between these two competing objectives, minimizing the overall cost function $J$.
The matrices $Q$ and $R$ allow us to determine how much we value the state being zero compared to how much we value the control input to be zero. For example, if we set $Q$ to be very “large” and $R$ to be very “small”, we are saying that we care a lot about bringing the state back to zero, even if it means using a lot of control input. Conversely, if we set $Q$ to be very “small” and $R$ to be very “large”, we are saying that we care more about minimizing the control input, even if it means the state takes longer to return to zero. We say that a control policy is aggressive when $Q$ is large and $R$ is small, and conservative when $Q$ is small and $R$ is large.
Note: the terms “large” and “small” here are relative knowing that $Q$ and $R$ are matrices and can be of different sizes.
Example: 2 states, 2 control system
Let's consider a simple example with 2 states and 2 control inputs. The state vector $x(t)$ and control vector $u(t)$ can be represented as: \[ x(t) = \begin{bmatrix} x_1(t) \\\ x_2(t) \end{bmatrix} \quad u(t) = \begin{bmatrix} u_1(t) \\\ u_2(t) \end{bmatrix} \] In this case, the matrices $Q$ and $R$ will be $2\times 2$ matrices. Let's choose: \[ Q = \begin{bmatrix} q_{11} & 0 \\\ 0 & q_{22} \end{bmatrix} \quad R = \begin{bmatrix} r_{11} & 0 \\\ 0 & r_{22} \end{bmatrix} \] Where $q_{11}, q_{22}, r_{11}, r_{22}$ are positive scalars. Let's compute the integrand of the cost function $J$: \[ x(t)^\top Q x(t) + u(t)^\top R u(t) = q_{11} x_1(t)^2 + q_{22} x_2(t)^2 + r_{11} u_1(t)^2 + r_{22} u_2(t)^2 \] We can identify $q_{11}$ as the penalty or weight on the non-zero state $x_1(t)$, and $q_{22}$ as the penalty on the non-zero state $x_2(t)$. By tuning its entries appropriately, the $Q$ matrix allows us to choose how much we care about each state being non-zero. Similarly, $r_{11}$ is the penalty on the non-zero control input $u_1(t)$, and $r_{22}$ is the penalty on the non-zero control input $u_2(t)$. Hence, the $R$ matrix allows us to tune how much we care about each control input being non-zero. By adjusting these weights, we can shape the behavior of the optimal control policy to meet our specific requirements.
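This expansion is easy to verify numerically (a sketch with hypothetical weight and sample values):

```python
import numpy as np

# Hypothetical diagonal weights for the 2-state, 2-input example
q11, q22, r11, r22 = 3.0, 0.5, 1.0, 2.0
Q = np.diag([q11, q22])
R = np.diag([r11, r22])

x = np.array([1.2, -0.4])   # sample state
u = np.array([0.3, -2.0])   # sample input

# Quadratic-form integrand vs. the expanded weighted sum of squares
quadratic_form = x @ Q @ x + u @ R @ u
expanded = q11*x[0]**2 + q22*x[1]**2 + r11*u[0]**2 + r22*u[1]**2

print(np.isclose(quadratic_form, expanded))  # True: the two forms agree
```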
This allows us to set up the optimization problem that we want to solve and to have a good understanding of what the cost function represents and how the matrices $Q$ and $R$ influence the solution.
We will see in the next chapter that in order to solve the optimization problem, we will need to solve the Riccati equation.
While the LQR provides an elegant and powerful solution for unconstrained linear systems with quadratic costs, it falls short in many practical scenarios. Real-world systems often have constraints on states and inputs (such as actuator limits, safety boundaries, or physical restrictions) that LQR cannot handle directly. Moreover, LQR assumes perfect model knowledge and does not account for disturbances or uncertainties.
Model Predictive Control (MPC) extends the ideas of LQR by explicitly incorporating constraints and optimizing control actions over a finite prediction horizon. MPC can handle multivariable systems, constraints, and even nonlinearity, making it a much more versatile and practical approach for modern control problems. Understanding MPC is therefore essential for advancing in control theory and tackling real-world engineering challenges.
The Riccati Equation
Video transcript:
Now that we have set up the optimization problem for the Linear Quadratic Regulator (LQR), we want to solve it. We want to find a control law $u(t)$ which will make the whole system optimal.
It is beyond the scope of this lecture to derive the solution of this optimization problem, but the optimal control law is given by a state feedback law of the form:
Where $K$ is the feedback gain matrix, which is given by: $$ K = R^{-1} B^\top S $$ Where $S$ is the solution of the (continuous-time) algebraic Riccati equation (CARE) ($S$ is $n \times n$ and symmetric):
If we look back at \eqref{eq:feedback_law_riccati} alone, we can see that it is just a full state feedback controller; in other words, the optimal way to solve the optimization problem is to use a full state feedback controller. But in order to compute the gain matrix $K$, we need to go through several steps.
- Define the system dynamics: $A$, $B$ (known from the plant)
- Define the cost function weights: $Q$, $R$ (tuning parameters)
- Solve the Riccati equation \eqref{eq:riccati} for $S$
- Compute the optimal gain matrix: $K = R^{-1} B^\top S$
- Choose the $K$ solution that yields a stable system
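These steps can be sketched in Python, assuming SciPy is available (its `solve_continuous_are` returns the stabilizing solution directly, which takes care of the last step; the double-integrator matrices below are purely illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    """Solve the CARE for the stabilizing S and return K = R^-1 B^T S."""
    S = solve_continuous_are(A, B, Q, R)   # stabilizing Riccati solution
    K = np.linalg.solve(R, B.T @ S)        # K = R^{-1} B^T S
    return K, S

# Illustrative system: a double integrator with identity weights
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
K, S = lqr_gain(A, B, np.eye(2), np.eye(1))

# The closed loop A - BK should have all eigenvalues in the left half plane
print(np.all(np.linalg.eigvals(A - B @ K).real < 0))  # True
```

For this particular system the CARE can also be solved by hand, giving $K = [1, \sqrt{3}]$, which the code reproduces.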
Let's take an example to illustrate the procedure. Consider a mass-damper system, in which a mass $m$ is on a smooth surface (ice for example) and there is some viscous damping with coefficient $c$ between the surface and the block. We will only consider the horizontal position and velocity of the block as states, and a unique control input which is a force $F(t)$ that we can apply to the block. The dynamics of this system are given by Newton's second law of motion:
Where $p(t)$ is the position of the block, $\dot{p}(t)$ is the velocity and $\ddot{p}(t)$ is the acceleration.
Let's define the state and control vectors as:
With:
- $x(t) \in \mathbb{R}^2$ because there are two states: position and velocity.
- $u(t) \in \mathbb{R}^1$ because there is a single control input, the force $F(t)$.
We can rewrite the dynamics in state-space form as:
With:
- $A$ is $2\times 2$ because it maps a 2-dimensional state to a 2-dimensional derivative.
- $B$ is $2\times 1$ because it maps a single input to the 2-dimensional state derivative.
Let’s choose some numerical values for the parameters: $m = 1$ and $c = 0.2$. Thus the matrices $A$ and $B$ are given by:
With this, we completed the first step of the procedure.
In step two, we need to define the cost function weights $Q$ and $R$. Let’s choose:
With:
- $Q$ must be $2 \times 2$ to match the 2-dimensional state vector $x(t)$.
- $R$ must be $1 \times 1$ to match the 1-dimensional control vector $u(t)$.
For the sake of this example we choose $Q$ as the identity matrix, meaning that we care equally about both states being zero. We choose $R$ to be a small value, meaning that we are willing to use a lot of control input to bring the states back to zero. We will later solve problems with matrices that are not identities.
The next step is to solve the Riccati equation \eqref{eq:riccati} for $S$. This can be done using numerical methods or software tools like MATLAB, Python, etc. For this example, let’s use Mathematica in order to solve the Riccati equation.
In Mathematica, we need functions like Transpose, Simplify, and Inverse to manipulate matrices. The Riccati equation is a matrix equation, so we need to express it in a form that Mathematica can understand. After setting up the equation in Mathematica, we can use the Solve function to find the matrix $S$ that satisfies the Riccati equation. As mentioned before, we get several solutions for $S$, but we will only keep the one that yields a stable system.
In order to determine which of the solutions for $S$ yields a stable system, we need to compute the gain matrix $K$ for each solution using the formula $K = R^{-1} B^\top S$. Then, we can analyze the closed-loop system dynamics given by $\dot{x}(t) = (A - BK)x(t)$. A system is considered stable if all the eigenvalues of the matrix $(A - BK)$ have negative real parts. We can compute the eigenvalues in Mathematica using the Eigenvalues function. By checking the eigenvalues for each solution of $S$, we can identify which one leads to a stable closed-loop system.
From running the calculations in Mathematica, we find that the gain matrix $K$ is given by: $$ K = \begin{bmatrix} 10.0 & 10.76 \end{bmatrix} $$
Another way to solve this problem is to use MATLAB, which has a built-in function lqr that directly computes the gain matrix $K$ given the matrices $A$, $B$, $Q$, and $R$. Using MATLAB's lqr function, we obtain the gain $K$, the Riccati solution $S$, and the closed-loop eigenvalues $E$: $$ [K, S, E] = \text{lqr}(A, B, Q, R) $$
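The same computation can be sketched in Python with SciPy. The numeric value of $R$ is not stated above; $R = 0.01$ together with $Q = I$ is an assumed value that reproduces the gain $K$ shown:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Mass-damper parameters from the example
m, c = 1.0, 0.2
A = np.array([[0.0, 1.0], [0.0, -c / m]])
B = np.array([[0.0], [1.0 / m]])
Q = np.eye(2)
R = np.array([[0.01]])   # assumed weight: small R = cheap control

S = solve_continuous_are(A, B, Q, R)   # stabilizing CARE solution
K = np.linalg.solve(R, B.T @ S)        # K = R^{-1} B^T S

print(np.round(K, 2))                         # K is approximately [10.0, 10.76]
print(np.linalg.eigvals(A - B @ K).real < 0)  # both True: stable closed loop
```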
Dynamic Programming
In the previous video, we used the continuous-time algebraic Riccati equation (CARE) to solve the Linear Quadratic Regulator (LQR) problem. However, the derivation of the Riccati equation from the LQR problem was not covered in detail. Here, we will look at the discrete-time version of the LQR problem and derive the discrete-time Riccati equation (DARE) using Dynamic programming.
Key assumptions:
Markovian assumption:
- There exists a Markovian state that evolves according to $$ x_{t+1} = f_t(x_t, u_t) \quad \text{discrete-time dynamical system} $$
- The initial state $x_0$ is known.
- The cost is additive over time $$ J = \sum_{t} g_t(x_t, u_t) $$
Bellman’s principle of optimality
- “An optimal policy has the property that whatever the initial state and initial decisions are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”
Meaning we can optimize the first decision only, rather than all decisions at once, and then make the optimal decision at each subsequent time step. We can break down the optimization problem into smaller subproblems and solve them recursively using the value function $V_t(x)$ defined in the previous section:
Note that $$ \min_{\substack{u_{1}, \ldots, u_{T} \\ x_{1}, \ldots, x_{T}}} \sum_{t=1}^{T} g_{t}(x_{t}, u_{t}) = V_{1}(x_{1}) = V_{1}(f_{0}(x_{0}, u_{0})) $$ The problems are nested minimizations, so we can solve them recursively.
Derivation of the discrete-time Riccati equation (DARE):
We use dynamic programming to solve the discrete-time LQR problem. The idea is to break down the optimization problem into smaller subproblems and solve them recursively. We first look at the last stage of the optimization problem; at that terminal step, the solution is trivial as there is no future cost to consider (no input anymore). $$ V_T(x_T) = g_T(x_T) $$ We can then move one step back in time and compute the cost function for $T-1$:
That is $$ \min_{u_{T-1}} \left( g_{T-1}(x_{T-1}, u_{T-1}) + V_T(f_{T-1}(x_{T-1}, u_{T-1})) \right) $$
We can continue this process recursively until we reach the initial time step. At each time step, we solve a smaller optimization problem that considers the current state, control input, and the cost-to-go from the next time step. In this way, the optimal control problem is decomposed into stage problems that can be solved using backward induction.
Now we introduce the specific quadratic cost function and linear dynamics of the discrete-time LQR problem: $$ V_t(x) = \min_{u} \left( x^\top Q x + u^\top R u + V_{t+1}(A x + B u) \right) $$ with the terminal cost $V_T(x)=x^\top S x$.
In the following section we will derive the induction step of the dynamic programming algorithm, which will lead us to the discrete-time Riccati equation (DARE).
The assumptions are the following:
- the cost function is quadratic,
- the dynamics are linear,
- there are no constraints on the system.
$$ V_T(x) = x^\top S x $$ Assuming $V_{t+1}(x) = x^\top P_{t+1} x$, one can show that $V_t(x)=x^\top P_t x$ (see proof in the note below) and then derive a recursion for $P_t$.
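This backward induction is easy to sketch in code (illustrative matrices; the update used below is the standard DARE recursion $P_t = Q + A^\top P_{t+1}A - A^\top P_{t+1}B(R+B^\top P_{t+1}B)^{-1}B^\top P_{t+1}A$):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def backward_riccati(A, B, Q, R, S, T):
    """Backward induction: start from P_T = S and apply the DARE
    recursion T times, collecting the feedback gain at each stage."""
    P = S
    gains = []
    for _ in range(T):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # stage gain
        P = Q + A.T @ P @ A - A.T @ P @ B @ G              # Riccati update
        gains.append(G)
    return P, gains[::-1]   # P_0 and gains ordered forward in time

# Illustrative discrete-time system (a sampled double integrator)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P0, gains = backward_riccati(A, B, Q, R, S=np.eye(2), T=200)

# For a long horizon, P_0 approaches the stationary DARE solution
print(np.allclose(P0, solve_discrete_are(A, B, Q, R), atol=1e-6))  # True
```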
Proof that $V_t(x)=x^\top P_t x$
How to remove $\min_{u}(\cdots)$
Controllability and Observability
Controllability
We consider the following system
$$ \dot{x} = Ax + Bu $$
We want to place the eigenvalues of the closed-loop system in the left half plane, so that the system is stable. We can use feedback through the matrix $B$ and the input $u$ to do so. But first, we need to check whether the dynamics of the system allow us to move the eigenvalues using the input $u$. This is where the concept of controllability comes into play. If the system is controllable, then we can design a control law that places the closed-loop eigenvalues in the left half plane.
For linear systems, we know that the optimal control law is a linear state feedback $u=-Kx$, where $K \in \mathbb{R}^{m \times n}$, which leads to the dynamics: $$ \dot{x} = Ax - BKx = (A-BK)x $$
The system is controllable if choosing $u=-Kx$ allows us to place the eigenvalues of the closed-loop system wherever we want in the complex plane. Moreover, a controllable system means that we can drive the state $x$ from any initial state $x(0)$ to any final state $x_f$ in finite time using an appropriate input $u(t)$.
In most systems, the matrices $A$ and $B$ are given and fixed by the physical properties of the system, but we get to choose the input $u(t)$. Whether the system is controllable depends only on the matrices $A$ and $B$; depending on those matrices, the system can be easy to control or impossible to control.
To check if a system is controllable, we can use the controllability matrix defined as: $$ \mathcal{C} = \begin{bmatrix} B & AB & A^2B & \cdots & A^{n-1}B \end{bmatrix} $$
The system is controllable if the controllability matrix $\mathcal{C}$ has full row rank, i.e. $\text{rank}(\mathcal{C}) = n$. If the rank of $\mathcal{C}$ is less than $n$, then the system is not controllable.
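A minimal sketch of this rank test, reusing the mass-damper matrices from the Riccati chapter ($m=1$, $c=0.2$):

```python
import numpy as np

def controllability_matrix(A, B):
    """Build [B, AB, A^2 B, ..., A^{n-1} B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

# Mass-damper example: force input reaches both position and velocity
A = np.array([[0.0, 1.0], [0.0, -0.2]])
B = np.array([[0.0], [1.0]])
Cmat = controllability_matrix(A, B)

print(np.linalg.matrix_rank(Cmat) == A.shape[0])  # True: controllable
```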
Observability
Similarly to controllability, we consider the following system
$$ \dot{x} = Ax + Bu $$
We also consider the output equation: $$ y = Cx + Du $$
We want to be able to estimate the state $x$ of the system using the output $y$. This is where the concept of observability comes into play. If the system is observable, then we can design an observer that will allow us to estimate the state $x$ using the output $y$.
To check if a system is observable, we can use the observability matrix defined as: $$ \mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix} $$
The system is observable if the observability matrix $\mathcal{O}$ has full column rank, i.e. $\text{rank}(\mathcal{O}) = n$. If the rank of $\mathcal{O}$ is less than $n$, then the system is not observable.
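The observability test mirrors the controllability one. As a sketch, take the same mass-damper dynamics with a hypothetical position-only measurement $y = p$ (i.e. $C = [1 \;\; 0]$, an assumption for illustration):

```python
import numpy as np

def observability_matrix(A, C):
    """Build [C; CA; CA^2; ...; CA^{n-1}] by stacking rows."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

# Measuring only position: velocity can still be inferred from its evolution
A = np.array([[0.0, 1.0], [0.0, -0.2]])
C = np.array([[1.0, 0.0]])
O = observability_matrix(A, C)

print(np.linalg.matrix_rank(O) == A.shape[0])  # True: observable
```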
Exercises
Exercise 1.1: State space form for chemical reaction model
Consider the following chemical reaction kinetics for a two-step series reaction: $$ A \xrightarrow{\;k_1\;} B \quad B\xrightarrow{\;k_2\;} C $$ We wish to follow the reaction in a constant volume, well-mixed, batch reactor. As taught in the undergraduate chemical engineering curriculum, we proceed by writing material balances for the three species giving $$ \frac{dc_A}{dt}=-r_1 \quad \frac{dc_B}{dt}=r_1-r_2 \quad \frac{dc_C}{dt}=r_2 $$ in which $c_j$ is the concentration of species $j$, and $r_1$ and $r_2$ are the rates (mol/(time$\cdot$vol)) at which the two reactions occur. We then assume some rate law for the reaction kinetics, such as $$ r_1=k_1c_A \quad r_2 = k_2c_B $$ We substitute the rate laws into the material balances and specify the starting concentrations to produce three differential equations for the three species concentrations.
- Write the linear state space model for the deterministic series chemical reaction model. Assume we can measure the component A concentration. What are $x$, $y$, $A$, $B$, $C$, and $D$ for this model?
- (Optional) Simulate this model with initial conditions and parameters given by \[ c_{A0}=1 \quad c_{B0}=c_{C0}=0 \quad k_1=2 \quad k_2=1\]
Correction
Note: This correction reflects my interpretation of the exercise as a master’s student. I am not a professional in the field, and while I have done my best, errors may still be present.
import control as ctrl
import numpy as np
import matplotlib.pyplot as plt
# Define the values (the exercise statement uses c_A0 = 1; this simulation starts from A_0 = 5 mol/L)
k1 = 2.0   # Rate constant for A -> B (1/min)
k2 = 1.0   # Rate constant for B -> C (1/min)
A_0 = 5.0  # Initial concentration of A (mol/L)
T = np.linspace(0, 10, 100) # time vector
U = np.zeros_like(T) # input (no input)
X0 = [A_0, 0, 0] # initial state
# Define the state-space representation
A = [[-k1, 0, 0],
[k1, -k2, 0],
[0, k2, 0]]
B = [[0], [0], [0]] # No input
C = [[0, 0, 1]]  # Output: concentration of C (the exercise's measured output c_A would be C = [[1, 0, 0]])
D = [[0]]
# Create the state-space system
system = ctrl.StateSpace(A, B, C, D)
T, yout, xout = ctrl.forced_response(system, T, U, X0, return_states=True)
def plot_dynamics(T, yout, xout):
"""Plot the dynamics of the system."""
plt.figure(figsize=(10, 6))
plt.plot(T, xout[0, :], label='Concentration of A', color='blue', linestyle='--')
plt.plot(T, xout[1, :], label='Concentration of B', color='green', linestyle='--')
plt.plot(T, yout.T, label='Concentration of C', color='red', linewidth=2)
plt.xlabel('Time')
plt.ylabel('Concentration (mol/L)')
plt.title('System Dynamics initial state A=5 mol/L')
plt.legend()
plt.show()
# Plot the results
plot_dynamics(T, yout, xout)
This gives the following result for a starting concentration of component $A_0 = 5$ (mol/L):
% Define the values (the exercise statement uses c_A0 = 1; this simulation starts from A_0 = 5 mol/L)
k1 = 2.0;  % Rate constant for A -> B (1/min)
k2 = 1.0;  % Rate constant for B -> C (1/min)
A_0 = 5.0; % Initial concentration of A (mol/L)
% Time vector
T = linspace(0, 10, 100);
U = zeros(size(T)); % Input (none, zero vector)
X0 = [A_0; 0; 0]; % Initial state
% Define the state-space representation
A = [-k1 0 0;
k1 -k2 0;
0 k2 0];
B = [0; 0; 0]; % No input
C = [0 0 1]; % Output: concentration of C (the exercise's measured output c_A would be C = [1 0 0])
D = 0;
% Create the state-space system
system = ss(A, B, C, D);
% Simulate system response with initial conditions
[yout, T, xout] = initial(system, X0, T);
% Plot dynamics
figure;
plot(T, xout(:,1), '--b', 'DisplayName', 'Concentration of A'); hold on;
plot(T, xout(:,2), '--g', 'DisplayName', 'Concentration of B');
plot(T, yout, '-r', 'LineWidth', 2, 'DisplayName', 'Concentration of C');
xlabel('Time (min)');
ylabel('Concentration (mol/L)');
title('System Dynamics initial state A = 5 mol/L');
legend('show');
grid on;
This gives the following result for a starting concentration of component $A_0 = 5$ (mol/L):
Exercise 1.2: Time to Laplace domain
Take the Laplace transform of the following set of differential equations and find the transfer function $G(s)$ connecting $u(s)$ and $y(s)$, $y=Gu$.
For $x\in\mathbb{R}^n$, $y\in\mathbb{R}^p$ and $u\in\mathbb{R}^m$, what is the dimension of the $G$ matrix? What happens to the initial condition, $x(0)=x_0$?
Solution
We know that $\mathcal{L}(x(t))=X(s)$ and $\mathcal{L}(\dot{x}(t))=sX(s)-x(0)$, then
- $x\in\mathbb R^n$ so $A$ is $n\times n$.
- $u\in\mathbb R^m$ so $B$ is $n\times m$.
- $y\in\mathbb R^p$ so $C$ is $p\times n$
- $D$ is $p\times m$.
Thus $G(s)=C(sI-A)^{-1}B+D$ is a $p\times m$ matrix (maps m-vectors $U(s)$ to p-vectors $Y(s)$).
The initial state $x_0$ produces the extra term $C(sI-A)^{-1}x_0$ in $Y(s)$. This term is independent of the input $U(s)$ and represents the system’s natural (homogeneous) response due to the initial condition. If the initial condition is zero ($x_0=0$), the extra term vanishes and $Y(s)=G(s)U(s)$.
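As a quick symbolic check of $G(s)=C(sI-A)^{-1}B+D$ (a sketch with an illustrative one-state system, not one taken from the exercise):

```python
import sympy as sp

s = sp.symbols('s')

# Illustrative single-state system: x' = -2x + u, y = x
A = sp.Matrix([[-2]])
B = sp.Matrix([[1]])
C = sp.Matrix([[1]])
D = sp.Matrix([[0]])

n = A.shape[0]
G = C * (s * sp.eye(n) - A).inv() * B + D  # G(s) = C (sI - A)^{-1} B + D

print(sp.simplify(G[0, 0]))  # 1/(s + 2)
```

With $p=m=1$ here, $G$ is $1\times 1$, consistent with the $p\times m$ dimension argument above.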
Exercise 1.3: Sum of quadratic functions
Consider the two quadratic functions given by: $$ V_1(x)=\frac{1}{2}(x-a)^\top A(x-a) \quad V_2(x)=\frac{1}{2}(x-b)^\top B(x-b) $$ in which $A$, $B>0$ are positive definite matrices and $a$ and $b$ are $n$-vectors locating the minimum of each function. The figure below displays the ellipses defined by the level sets $V_1(x)=\frac{1}{4}$ and $V_2(x)=\frac{1}{4}$ for the following parameters: $$ A=\begin{bmatrix} 1.25 & 0.75 \\\ 0.75 & 1.25 \end{bmatrix} \quad a=\begin{bmatrix} -1 \\\ 0 \end{bmatrix} \quad B=\begin{bmatrix} 1.5 & -0.5 \\\ -0.5 & 1.5 \end{bmatrix} \quad b=\begin{bmatrix} 1 \\\ 1 \end{bmatrix} $$
(a) Show that the sum $V(x)=V_1(x)+V_2(x)$ is also quadratic $$ V(x)=\frac{1}{2}((x-v)^\top H(x-v)+d) $$ in which $$ H=A+B \quad v=H^{-1}(Aa+Bb) $$ $$ d=-(Aa+Bb)^\top H^{-1}(Aa+Bb)+a^\top Aa+b^\top Bb $$ and verify the three ellipses given in the figure above.
(b) Consider the generalization useful in the discussion of the upcoming regulation and estimation problems. Let $$ V_1(x)=\frac{1}{2}(x-a)^\top A(x-a) \quad V_2(x)=\frac{1}{2}(Cx-b)^\top B(Cx-b) $$ Derive the expressions for $H$, $v$ and $d$ in this case.
(c) Use the matrix inversion lemma and show that $V(x)$ of part (b) can also be expressed in an inverse form, which is useful in state estimation problems
$$ V(x)=\frac{1}{2}((x-v)^\top\tilde{H}^{-1}(x-v)+\text{constant}) $$ $$ \tilde{H}=A^{-1}-A^{-1}C^\top(CA^{-1}C^\top+B^{-1})^{-1}CA^{-1} $$ $$ v=a+A^{-1}C^\top(CA^{-1}C^\top+B^{-1})^{-1}(b-Ca) $$
Matrix inversion lemma
This lemma is presented in the form of an exercise to be solved. Let the matrix $Z$ be defined as $$ Z=\begin{bmatrix} B & C \\\ D & E \end{bmatrix} $$ and assume that $Z^{-1}$, $B^{-1}$ and $E^{-1}$ exist.
(a) Perform row elimination and show that $$ Z^{-1}=\begin{bmatrix} B^{-1} + B^{-1} C (E - D B^{-1} C)^{-1} D B^{-1} & -B^{-1} C (E - D B^{-1} C)^{-1} \\\ -(E - D B^{-1} C)^{-1} D B^{-1} & (E - D B^{-1} C)^{-1} \end{bmatrix} $$ Note that this result is still valid if $E$ is singular.
(b) Perform column elimination and show that $$ Z^{-1}=\begin{bmatrix} (B - C E^{-1} D)^{-1} & -(B - C E^{-1} D)^{-1} C E^{-1} \\\ -E^{-1} D (B - C E^{-1} D)^{-1} & E^{-1} + E^{-1} D (B - C E^{-1} D)^{-1} C E^{-1} \end{bmatrix} $$ Note that this result is still valid if $B$ is singular.
(c) A host of other useful control-related inversion formulas follow from these results. Equate the (1,1) or (2,2) entries of $Z^{-1}$ and derive the identity $$ (A + B C D)^{-1} = A^{-1} - A^{-1} B (DA^{-1} B + C^{-1})^{-1} D A^{-1} \tag{1.54} \label{1.54} $$ A useful special case of this result is $$ (I+X^{-1})^{-1}=I-(I+X)^{-1} $$
(d) Equate the (1,2) or (2,1) entries of $Z^{-1}$ and derive the identity $$ (A+BCD)^{-1} BC= A^{-1} B (D A^{-1} B + C^{-1})^{-1} \tag{1.55} \label{1.55} $$ Equations \eqref{1.54} and \eqref{1.55} prove especially useful in rearranging formulas in least squares estimation.
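Identities \eqref{1.54} and \eqref{1.55} are easy to spot-check numerically (a sketch using random, almost surely invertible matrices):

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 2
# Well-conditioned test matrices: A and C must be invertible for (1.54)
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = np.eye(m) + 0.1 * rng.standard_normal((m, m))
D = rng.standard_normal((m, n))

Ainv, Cinv = np.linalg.inv(A), np.linalg.inv(C)
core = np.linalg.inv(D @ Ainv @ B + Cinv)

# (1.54): (A + BCD)^{-1} = A^{-1} - A^{-1} B (D A^{-1} B + C^{-1})^{-1} D A^{-1}
lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ainv - Ainv @ B @ core @ D @ Ainv
print(np.allclose(lhs, rhs))                      # True

# (1.55): (A + BCD)^{-1} B C = A^{-1} B (D A^{-1} B + C^{-1})^{-1}
print(np.allclose(lhs @ B @ C, Ainv @ B @ core))  # True
```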
Solution of the exercise
(a) The sum of two quadratics is also quadratic, so we parametrize the sum as $$ V(x)=\frac{1}{2}((x-v)^\top H(x-v)+d) $$ and solve for $v$, $H$ and $d$. Comparing the expansion of the quadratic of the right- and left-hand sides gives $$ x^\top Hx-2x^\top Hv+v^\top Hv+d=x^\top(A+B)x-2x^\top(Aa+Bb)+a^\top Aa+b^\top Bb $$ Equating terms at each order gives $$ H=A+B \quad v=H^{-1}(Aa+Bb) \quad d=-(Aa+Bb)^\top H^{-1}(Aa+Bb)+a^\top Aa+b^\top Bb $$
Notice that $H$ is positive definite since $A$ and $B$ are positive definite. Substituting the values of $a$, $A$, $b$ and $B$ gives $$ H=\begin{bmatrix} 2.75 & 0.25 \\\ 0.25 & 2.75 \end{bmatrix} \quad v=\begin{bmatrix} -0.1 \\\ 0.1 \end{bmatrix} \quad d=3.2 $$ The level set $V(x)=2$ is also plotted in figure above.
(b) Expanding and comparing terms as before, we obtain $$ H=A+C^\top BC \quad v=H^{-1}(Aa+C^\top Bb) \quad d=a^\top Aa+b^\top Bb-v^\top Hv $$
Notice that $H$ is positive definite since $A$ is positive definite and $C^\top BC$ is positive semi-definite for any $C$.
(c) Define $\bar{x}=x-a$ and $\bar{b}=b-Ca$ and express the problem as $$ V(x)=\frac{1}{2}\bar{x}^\top A\bar{x}+\frac{1}{2}(C(\bar{x}+a)-b)^\top B(C(\bar{x}+a)-b)=\frac{1}{2}\bar{x}^\top A\bar{x}+\frac{1}{2}(C\bar{x}-\bar{b})^\top B(C\bar{x}-\bar{b}) $$ Apply the solution from part (b) (with $a=0$ and $b$ replaced by $\bar{b}$) to obtain $$ V(x)=\frac{1}{2}((\bar{x}-\bar{v})^\top H(\bar{x}-\bar{v})+d) \\\ $$ $$ H=A+C^\top BC \quad \bar{v}=H^{-1}C^\top B\bar{b}\\\ $$ and we do not need to evaluate $d$. From the matrix inversion lemma, use \eqref{1.54} on $H$ and \eqref{1.55} on $\bar{v}$ to obtain $$ \tilde{H}=H^{-1}=A^{-1}-A^{-1}C^\top (CA^{-1}C^\top +B^{-1})^{-1}CA^{-1} $$ $$ \bar{v}=A^{-1}C^\top (CA^{-1}C^\top +B^{-1})^{-1}(b-Ca) $$ The function $V(x)$ can therefore be expressed in the original coordinates as $$ V(x)=\frac{1}{2}((x-v)^\top \tilde{H}^{-1}(x-v)+\text{constant}) $$
where $$ v=a+A^{-1}C^\top(CA^{-1}C^\top+B^{-1})^{-1}(b-Ca) \quad \quad □ $$
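The equivalence between the direct form of part (b) and the inverse form of part (c) is easy to check numerically. A minimal sketch assuming NumPy, with randomly generated positive definite $A$ and $B$ (the latter named `Bp` in code):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 3, 2
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)       # positive definite A
Bp = rng.standard_normal((p, p))
Bp = Bp @ Bp.T + p * np.eye(p)    # positive definite B
C = rng.standard_normal((p, n))
a, b = rng.standard_normal(n), rng.standard_normal(p)

# direct form of part (b): H = A + C' B C, v = a + H^{-1} C' B (b - C a)
H = A + C.T @ Bp @ C
v_direct = a + np.linalg.solve(H, C.T @ Bp @ (b - C @ a))

# inverse form of part (c), via the matrix inversion lemma
Ai = np.linalg.inv(A)
inner = np.linalg.inv(C @ Ai @ C.T + np.linalg.inv(Bp))
H_tilde = Ai - Ai @ C.T @ inner @ C @ Ai
v_inverse = a + Ai @ C.T @ inner @ (b - C @ a)

assert np.allclose(np.linalg.inv(H), H_tilde)
assert np.allclose(v_direct, v_inverse)
```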
Classical MPC
Introduction
Model Predictive Control (MPC) is an advanced method of process control that has been widely adopted in various industries due to its ability to handle multivariable control problems with constraints. The core idea of MPC is to use a model of the system to predict future behavior and optimize control actions over a finite time horizon.
MPC is a form of control in which the control action is obtained by solving, at each sampling instant (online), a finite horizon optimal control problem, using the current state of the plant as the initial state. The optimization yields an optimal control sequence and the first control in this sequence is applied to the plant. This process is repeated at the next sampling instant, creating a feedback loop. MPC differs, therefore, from traditional control methods, which typically rely on a fixed control law derived offline.
The great advantage of MPC is that open-loop optimal control problems often can be solved rapidly enough, using standard mathematical programming algorithms, to permit the use of MPC even though the system being controlled is nonlinear, and hard constraints on states and controls must be satisfied.
In this chapter we study MPC for the case when the state is known. This case is particularly important, even though it rarely arises in practice, because important properties, such as stability and performance, may be relatively easily established.
Invariant Sets
Invariance
The invariant set will provide a set of initial states for which the trajectories will never violate the system constraints. In order to increase the applicability of the MPC algorithm, and in particular to increase the size of the set of initial conditions $x_{0}$ for which the terminal condition $x_{N} \in \mathcal{X}_T$ can be met, it is important to choose the maximal positively invariant set as the terminal constraint set. This set is defined as follows.
Proof
Let us consider the sets
$$ \mathcal{X}^{(n)}=\{x : (F+GK)\Phi^{i}x\leq \mathbf{1}, \ i=0,\cdots,n\}, \quad n\geq 0 $$
Then it can be shown that \eqref{eq:mpi_set} holds for some finite $v$ by using the definition of the maximal positively invariant set to show that the MPI set $\mathcal{X}^{\text{MPI}}$ is equal to $\mathcal{X}^{(v)}$ for finite $v$.
In particular, if $x_{0\mid k} \notin \mathcal{X}^{(n)}$ for a given $n$, then the constraint \eqref{eq:constraints} must be violated under the dynamics of \eqref{eq:u_control} and \eqref{eq:x_dynamics}. By Definition 2.2 therefore, any $x\notin \mathcal{X}^{(n)}$ cannot lie in $\mathcal{X}^{\text{MPI}}$, so $\mathcal{X}^{(n)}$ must contain $\mathcal{X}^{\text{MPI}}$, for all $n\geq0$.
Furthermore, if $(F+GK)\Phi^{v+1}x\leq \mathbf{1}$ for all $x \in \mathcal{X}^{(v)}$, then $\Phi x\in\mathcal{X}^{(v)}$ must hold whenever $x \in \mathcal{X}^{(v)}$ (since $x \in \mathcal{X}^{(v)}$ and $(F+GK)\Phi^{v+1}x \leq \mathbf{1}$ imply $(F+GK)\Phi^{i}(\Phi x) \leq \mathbf{1}$ for $i=0, \cdots, v$). But from the definition of $\mathcal{X}^{(v)}$ we have $(F+GK)x\leq \mathbf{1}$ for all $x\in\mathcal{X}^{(v)}$, and therefore $\mathcal{X}^{(v)}$ is positively invariant under \eqref{eq:u_control}, \eqref{eq:x_dynamics} and \eqref{eq:constraints}. From Definition 2.2 it can be concluded that $\mathcal{X}^{(v)}$ is a subset of $\mathcal{X}^{\text{MPI}}$, and therefore equal to $\mathcal{X}^{\text{MPI}}$.
Finally, for $v \geq n_x$, the set $\mathcal{X}^{(v)}$ is necessarily bounded if $(\Phi,F+GK)$ is observable, and, since $\Phi$ is strictly stable, the set
$$ \{x : (F+GK)\Phi^{v+1}x \leq \mathbf{1}\} $$
must contain $\mathcal{X}^{(v)}$ for finite $v$; therefore $\mathcal{X}^{\text{MPI}}$ must be defined by \eqref{eq:mpi_set} for some finite $v$.
Pre-Sets
The dynamics of the pendulum can be described as follows
The dynamics are discretized with forward Euler at 1 Hz; $x_{1}$ is the position and $x_{2}$ is the velocity.
The target set is defined as
Consider the phase diagram below, which shows the target set $T$ and its pre-set, represented by the red area. The pre-set of $T$ is the set of states that evolve into the target set $T$ in one step. For example, the state $x=[0, 1]^\top$ is in the pre-set of $T$ since it evolves into $T$ in one step (see the red arrow), whereas the state $x=[1, 0]^\top$ is not in the pre-set of $T$ since it does not (orange arrow).
We prove the contrapositive for both the necessary and sufficient conditions.
Necessary condition: Assume that $\mathcal{X} \not\subseteq \text{pre}(\mathcal{X})$. Then there exists an $x \in \mathcal{X}$ such that $x \notin \text{pre}(\mathcal{X})$. From the definition of pre-set, this means that $(F+GK)x \leq \mathbf{1}$ but $\Phi x \notin \mathcal{X}$. Therefore, from Definition 2.1, $\mathcal{X}$ is not positively invariant.
Sufficient condition: Assume that $\mathcal{X}$ is not positively invariant. Then, from Definition 2.1, $\exists x \in \mathcal{X}$ such that $(F+GK)x \leq \mathbf{1}$ but $\Phi x \notin \mathcal{X}$. From the definition of pre-set, this means that $x \notin \text{pre}(\mathcal{X})$. Therefore, $\mathcal{X} \not\subseteq \text{pre}(\mathcal{X})$.
Note that $\mathcal{X} \subseteq \text{pre}(\mathcal{X})$ is equivalent to $\text{pre}(\mathcal{X}) \cap \mathcal{X} = \mathcal{X}$.
\[ \begin{aligned} &\textbf{Input: } ( f, \mathbb{X} ) & \\\ &\textbf{Output: } ( \mathcal{X}_{\infty} ) \\\ & \\\ &\Omega_{0} \leftarrow \mathbb{X} \\\ &\textbf{loop} \\\ &\quad \Omega_{k+1} \leftarrow \text{pre}(\Omega_{k}) \cap \Omega_{k} \\\ &\quad \textbf{if } \Omega_{k+1} = \Omega_{k} \textbf{ then} \\\ &\qquad \text{return } \mathcal{X}_{\infty} = \Omega_{k} \\\ &\quad \textbf{end if} \\\ &\textbf{end loop} \end{aligned} \] The algorithm generates the set sequence $\{\Omega_k\}$ satisfying $\Omega_{k+1} \subseteq \Omega_k$ for all $k \geq 0$. If the sequence converges in finite time, i.e., $\exists k^* \geq 0$ such that $\Omega_{k^*+1} = \Omega_{k^*}$, then the limit set $\mathcal{X}_{\infty} = \Omega_{k^*}$ is the maximal positively invariant set contained in $\mathbb{X}$.
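For linear dynamics $x^+=\Phi x$ and polytopic constraints $\mathbb{X}=\{x : Hx\leq h\}$, the pre-set has the closed form $\text{pre}(\mathbb{X})=\{x : H\Phi x\leq h\}$, and the stopping test $\Omega_{k+1}=\Omega_{k}$ reduces to one redundancy LP per newly added halfspace. The sketch below implements the iteration under these assumptions, using SciPy's `linprog` and a hypothetical stable system as an example:

```python
import numpy as np
from scipy.optimize import linprog

def max_invariant_set(Phi, H, h, max_iter=100):
    """Maximal positively invariant set for x+ = Phi x inside X = {x : H x <= h}.

    Iterates Omega_{k+1} = pre(Omega_k) ∩ Omega_k in halfspace form. Since
    Omega_k = {x : H Phi^i x <= h, i = 0..k}, only the rows H Phi^(k+1) are new
    at each step, and the iteration stops when they are all redundant.
    """
    n = Phi.shape[0]
    Hk, hk = H.copy(), h.copy()          # Omega_0 = X
    Hnew = H @ Phi                       # candidate rows H Phi^(k+1)
    for _ in range(max_iter):
        redundant = True
        for i in range(Hnew.shape[0]):
            # redundancy test: max of (Hnew x)_i over Omega_k must not exceed h_i
            res = linprog(-Hnew[i], A_ub=Hk, b_ub=hk, bounds=[(None, None)] * n)
            if res.status == 3 or (res.success and -res.fun > h[i] + 1e-9):
                redundant = False        # the new row cuts into Omega_k
                break
        if redundant:
            return Hk, hk                # Omega_{k+1} = Omega_k: MPI set found
        Hk, hk = np.vstack([Hk, Hnew]), np.concatenate([hk, h])
        Hnew = Hnew @ Phi
    raise RuntimeError("no finite determination within max_iter iterations")

# hypothetical stable system with box state constraints |x_i| <= 1
Phi = np.array([[0.9, 0.5], [0.0, 0.9]])
H = np.vstack([np.eye(2), -np.eye(2)])
h = np.ones(4)
H_inf, h_inf = max_invariant_set(Phi, H, h)
```

Each iteration adds at most as many halfspaces as the original constraint set, so the representation grows linearly with the determination index rather than doubling at every step.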
Controlled Invariance
This defines the set of states for which there exists a controller that will satisfy the constraints at all times.
The concept of pre-sets introduced in the previous section can be extended to control invariant sets in order to compute the maximal control invariant set (MCPI).
The maximal control invariant set characterizes the best any controller can do: starting outside it, no control law can satisfy the constraints for all time.
From a control invariant set we can derive a control law: a control law $\kappa(x)$ will guarantee that the system $x^+=f(x,\kappa(x))$ satisfies the constraints at all times if: $$ f(x, \kappa(x)) \in \mathcal{C} \quad \text{for all } x\in\mathcal{C} \quad \text{with } \mathcal{C} \text{ a control invariant set of the system} $$
We can use this to synthesize a control law from a control invariant set by solving an optimization problem:
where $g$ is any function.
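As a toy illustration, consider the hypothetical unstable scalar system $x^+=2x+u$ with $|u|\leq 1$: the interval $\mathcal{C}=[-1,1]$ is control invariant, and choosing $g(u)=u^2$ in the optimization above yields a minimum-effort control law with a simple closed form:

```python
import numpy as np

def kappa(x, c=1.0):
    """Minimum-effort control law from the control invariant set C = [-c, c]
    for the hypothetical scalar system x+ = 2x + u with |u| <= c.
    Solves min_u u^2 s.t. 2x + u in C: bring 2x back onto C as cheaply as possible."""
    return float(np.clip(2 * x, -c, c) - 2 * x)

# closed loop: starting anywhere in C, the state never leaves C
x = 0.75
for _ in range(20):
    x = 2 * x + kappa(x)
    assert -1.0 <= x <= 1.0
```

Note that this law only enforces constraint satisfaction, not convergence: from $x=0.75$ the closed loop settles on the boundary $x=1$, which is exactly what control invariance promises and nothing more.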
Why don't we compute the maximal control invariant set for all systems?!
We can’t! Even for linear systems, these sets are usually too complex to compute and represent, and for nonlinear system dynamics the computation is almost always intractable.
That’s where MPC comes into play: we will use MPC to approximate the control invariant set in a way that is easier to represent and compute.
Model Predictive Control
To study model predictive control, we consider the following system dynamics
where $x\in\mathbb{R}^{n}$ is the state vector and $u\in\mathbb{R}^{m}$ is the control input vector. The function $f$ is assumed to be continuous and continuously differentiable with respect to its arguments.
The goal is to design a control law $u=\kappa(x)$ such that the following requirements are respected
- the system satisfies the constraints: $\{x_k\}\subset \mathbb{X}$, $\{u_k\} \subset \mathbb{U}$
- it is stable: $\lim_{k\to\infty}x_k=0$
- it optimizes “performance” (we will develop this idea later on)
- it maximizes the set of feasible initial states $\{x_0\}$
- (later) it is robust to noise
- (later) it can be computed efficiently and reliably
The control law with the best properties is the solution of the infinite horizon, constrained optimal control problem for which the cost function is
The function $\ell(x,u)$ is the stage cost: it describes the “cost” of being in state $x$ and applying input $u$. It is assumed to be continuous and continuously differentiable with respect to its arguments. The vector $g(x,u)$ collects the constraint functions, which are also assumed to be continuous and continuously differentiable with respect to their arguments.
The goal of the optimal control problem is to find a control law
that minimizes the cost function while satisfying the system dynamics and constraints, from which the input $u^{*}_0$ will be applied to the system.
The optimal control problem is defined as follows:
If $\ell(\cdot)$ is positive definite, then the optimal cost $V_\infty^{*}(x_0)$ is a Lyapunov function for the closed-loop system obtained by applying the optimal control law $u^{*}$. The optimal control law therefore stabilizes the system at the origin.
However, solving the infinite horizon optimal control problem is often intractable, especially when the system is nonlinear and/or when there are constraints on the states and inputs. Therefore, we approximate the infinite horizon problem by a finite horizon optimal control problem, which is more tractable. This leads to the following formulation:
where $N$ is the prediction horizon, $V_f(x)$ is the terminal cost, and $\mathcal{X}_f$ is the terminal constraint set. The terminal cost and terminal constraint set are used to approximate the cost of the system from $N$ to $\infty$ and to ensure stability of the closed-loop system.
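To make this formulation concrete, here is a minimal receding-horizon sketch for a hypothetical double-integrator example with quadratic stage cost $\ell(x,u)=x^\top Qx+u^\top Ru$, terminal cost $V_f(x)=x^\top Px$, and a box input constraint. SciPy's general-purpose SLSQP routine stands in for a dedicated QP solver; at every sampling instant only the first optimal input is applied:

```python
import numpy as np
from scipy.optimize import minimize

# assumed example: double integrator x+ = A x + B u with input constraint |u| <= u_max
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R, P = np.eye(2), 1.0, 10 * np.eye(2)   # stage and terminal weights
N, u_max = 10, 0.5                         # horizon and input bound

def cost(u_seq, x0):
    """Finite-horizon cost: sum of stage costs l(x_k, u_k) plus terminal cost V_f(x_N)."""
    x, J = x0, 0.0
    for u in u_seq:
        J += x @ Q @ x + R * u**2
        x = A @ x + B[:, 0] * u
    return J + x @ P @ x

def mpc_step(x0):
    """Solve the horizon-N problem from the current state; return only the first input."""
    res = minimize(cost, np.zeros(N), args=(x0,),
                   bounds=[(-u_max, u_max)] * N, method="SLSQP")
    return res.x[0]

# receding-horizon closed loop: re-solve the optimization at every sampling instant
x = np.array([2.0, 0.0])
for _ in range(30):
    x = A @ x + B[:, 0] * mpc_step(x)
```

Re-solving from the measured state at each instant is what turns the open-loop optimal sequence into a feedback law.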
Example: Toy problem
In this example we consider a simple scalar integrator system with a prediction horizon of $N=1$:
- Quadratic problem with linear constraints
- Parametric solution $u_0^{*}(x_0)$
- The equality constraints (the model) allow us to eliminate $x_1$
- Convex problem $\Rightarrow$ KKT conditions are necessary and sufficient
The optimization problem for the toy problem can be rewritten as
We can identify two cases for the KKT conditions:
- Case 1: $\mu=0$ and $g(u_0)<0$ (constraint not active)
- Case 2: $\mu>0$ and $g(u_0)=0$ (constraint active)
Case 1: $\mu=0$ and $g(u_0)<0$ (constraint not active)
Case 2: $\mu>0$ and $g(u_0)=0$ (constraint active)
The optimal control law is therefore given by
Stability and Convergence
Examples of MPC
Exercises
Exercise 2.1: Maximal invariant set
A discrete-time system is described by the model $x_{k+1}=Ax_k+Bu_k$ with $$ A=\begin{bmatrix} 0.3 & -0.9 \\\ -0.4 & -2.1 \end{bmatrix} \quad B=\begin{bmatrix} 0.5 \\\ 1 \end{bmatrix} $$ where $u_k=Kx_k$ for $K = \begin{bmatrix} 0.224 & 1.751 \end{bmatrix}$, and for all $k=0,1,\cdots$ the state $x_k$ is subject to the constraints $$ \mid\begin{bmatrix} 1 & -1 \end{bmatrix}x\mid \, \leq \, 1 $$
(a) Describe a procedure based on linear programming for determining the largest invariant set compatible with constraints $\mid\begin{bmatrix} 1 & -1 \end{bmatrix}x\mid\,\leq \,1$.
(b) Demonstrate by solving a linear program that the maximal invariant set is defined by
$$ \mathcal{S}_{1}=\{x : F\Phi^{i}x \leq \mathbf{1}, \ i=0,1\} $$ $$ \text{where } F = \begin{bmatrix} 1 & -1 \\\ -1 & 1 \end{bmatrix} \text{ and } \Phi = A+BK = \begin{bmatrix} 0.412 & -0.0245 \\\ -0.176 & -0.349 \end{bmatrix} $$
Solution
(a) The largest invariant set compatible with the constraints is given by $$ \mathcal{S}_{v}=\{x : F\Phi^{i}x \leq \mathbf{1},\ i=0,\cdots,v\} $$ where $v$ is such that $F\Phi^{v+1}x \leq \mathbf{1}$ for all $x \in \mathcal{S}_v$. Since $\mathcal{S}_v$ is symmetric about $x=0$, this condition can be checked by solving the linear program: $$ \mu = \max_{x}\ \begin{bmatrix} 1 & -1 \end{bmatrix}\Phi^{v+1}x \quad \text{subject to} \quad F\Phi^{i}x \leq \mathbf{1},\ i=0,\cdots,v $$ and verifying that $\mu \leq 1$.
(b) To check $v=1$, we solve $$ \mu = \max_{x}\ \begin{bmatrix} 1 & -1 \end{bmatrix}\Phi^{2}x \quad \text{subject to} \quad F\Phi^{i}x \leq \mathbf{1},\ i=0,1 $$ which gives $\mu=0.211$, so $\mu\leq 1$ as required. The set $\mathcal{S}_{1}$ is therefore invariant.
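This check can be reproduced numerically; a sketch assuming SciPy's `linprog`, with $\Phi=A+BK$ computed from the data of the exercise:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[0.3, -0.9], [-0.4, -2.1]])
B = np.array([[0.5], [1.0]])
K = np.array([[0.224, 1.751]])
Phi = A + B @ K                              # closed-loop dynamics x+ = Phi x
F = np.array([[1.0, -1.0], [-1.0, 1.0]])

# halfspace description of S_1: F Phi^i x <= 1 for i = 0, 1
A_ub = np.vstack([F, F @ Phi])
b_ub = np.ones(4)

# mu = max over S_1 of the first row of F Phi^2 x (by symmetry, one LP suffices)
c = (F @ Phi @ Phi)[0]
res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)
mu = -res.fun                                # mu <= 1 confirms invariance of S_1
```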
Robust MPC - In Progress
Introduction
A Game Theoretic Approach
Prediction Dynamics in Robust MPC
Robust Min-Max MPC
To go further - Tube MPC
Exercises
Economic MPC - In Progress
Stage cost
Horizon
Economic MPC vs MPC
Exercises
Credits
- Saverio Bolognani’s lectures: Computational Control at ETH Zurich in spring 2024
- Colin Jones’ lectures: Model Predictive Control (ME-425) at EPFL in Autumn 2024
- Model Predictive Control: Classical, Robust and Stochastic by Basil Kouvaritakis and Mark Cannon, 2016
- Model Predictive Control: Theory, Computation, and Design by James B. Rawlings, David Q. Mayne, and Moritz M. Diehl, 2nd Edition, 2022, available for free here
- An Introduction to Optimization by Edwin K. P. Chong and Stanislaw H. Zak, 2nd Edition, available for free here