Chapter 3 Convex Optimization
3.1 Optimization in standard form
\[\begin{align} \tag{3.1} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\]
A standard form of describing optimization problems is (3.1), where
- \(x\in R^n\) is the optimization variable
- \(f_0: R^n \rightarrow R\) is the objective or cost function
- \(f_i: R^n \rightarrow R\) are the inequality constraints
- \(h_i: R^n \rightarrow R\) are the equality constraints
For such a description, the optimal value is defined as \[\begin{equation} \tag{3.2} p^* = \inf \{f_0(x) | f_i(x)\leq 0, h_i(x) = 0 \}. \end{equation}\]
- When the problem is infeasible, \(p^*=\infty\),
- when it’s unbounded below, \(p^* = -\infty\).
3.2 Optimality and local optimality
- feasible: in domain and satisfy all constraints
- optimal: \(f_0(x) = p^*\)
- locally optimal: exists \(R>0\) that \(x\) is optimal under \(\| z-x \|_2 \leq R\).
3.3 Implicit constraints
((3.1)) has an implicit constraint \[\begin{equation} \tag{3.3} x\in \mathcal D = \bigcap_{i=0}^m \mathbf{dom} f_i \cup \bigcap_{i=1}^p \mathbf{dom} h_i \end{equation}\]
- \(\mathcal D\) is called the domain of the problem
- a problem is unconstrained if it has no explict constraints (\(m=p=0\))
3.4 Feasibility problem
\[\begin{align} \tag{3.4} \text{find}& & x & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\] can be actually written as \[\begin{align} \tag{3.5} \text{minimize}& & 0 & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\]
- \(p^*=0\) if constraints are feasible
- \(p^*=\infty\) otherwise (\(\inf \varnothing = \infty\))
3.5 Convex optimization
\[\begin{align} \tag{3.6} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\]
Boyd defines convex optimization to be:
- \(f_i\) are all convex, equality constraints are all affine.
- also can define quasiconvex problems where \(f_0\) is quasiconvex, and \(f_i (i>0)\) are convex.
Note that the feasible set of a convex optimization problem is convex.
Example 3.2 \[\begin{align} \text{minimize}& & f_0(x)=x_1^2+x_2^2 & \\ \text{subject to} & & f_i(x)=x_1/(1+x_2^2) \leq 0 & \\ & & h_1(x) = (x_1 + x_2)^2=0 & \end{align}\]
- \(f_0\) is convex, feasible set \(\{ x_1 = -x_2 \}\) is also convex
- however not a convex problem according to (3.6)
- we can rewrite to equivalent (not identical) convex problem \[\begin{align} \text{minimize}& & f_0(x)=x_1^2+x_2^2 & \\ \text{subject to} & & x_1 \leq 0 & \\ & & h_1(x) = x_1 + x_2=0 & \end{align}\]
3.6 Optimality in convex optimization
One can easily prove (3.1) by contradiction.
3.6.1 Optimality criterion for differentiable \(f_0\)
We have seen this before in 2.2.
Proof. Assume \(x\) is the optimum. \(\nabla f_0(x)^T (z-x) \geq 0\) for all \(z\), \(Az=b\). Also \(Ax=b\).
Thus \(z-x \in \mathrm{Null}(A)\), i.e. \(\nabla f_0(x) \perp \mathrm{Null}(A)\).
Consequently, \(f_0(x) \in \operatorname{im} (A^T)\),
Thus \(\nabla f_0(x) = A^Tu\), \(\nabla f_0(x) + A^T(-u) = 0\).Example 3.5 minimization over non-negative orthant (\(R_{++}^n\)):
\[\begin{align} \text{minimize}& & f_0(x) & \\ \text{subject to} & & x \succeq 0 & \end{align}\]
\(x\) is optimal iff \[\begin{align} \end{align}\]Proof. Assume \(x\) is the optimum. \(\nabla f_0(x)^T (z-x) \geq 0\) for all \(z\geq 0\).
Plugging in \(z=0\), \(\nabla f(x)^Tx \leq 0\).
If any of the components (\(i\)-th) of \(\nabla f(x)\) is negative, then we can take \(z_i\rightarrow \infty\), and LHS will be negative, thus we know that \(\nabla f(x) \geq 0\).
Since \(\nabla f(x)^Tx \leq 0\) and \(\nabla f(x) \geq 0\), we know that \(\nabla f(x)_i x_i = 0\) for all \(i\) (complementarity).
3.7 Equivalent convex problems
This section is full of trickery that you will regret if you don’t know.
3.7.1 eliminating equality constraints
\[\begin{align} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\] is equivalent to \[\begin{align} \tag{3.6} \text{minimize (over z)}& & f_0(Fz+x_0) & & \\ \text{subject to} & & f_i(Fz+x_0) \leq 0 & , & i=1,\dots,m \\ \end{align}\] where \(F\) and \(x_0\) are such that \[\begin{align} \tag{3.8} Ax=b & \Longleftrightarrow & x=Fz+x_0 \text{ for some }z \end{align}\]
How to get \(F\) and \(x_0\)? Well \(F\) spans the null space of \(A\), and \(x_0\) is any particular solution of \(Ax=b\).
Note affine operation preserves convexity, so the eliminated problem is also convex.
Boyd: An intriguing fact is that in reality not only it’s bad to eliminate constraints, it’s often a good idea to add (un-eliminate) equality constraints!
Boyd: Convex problems (by definition here) only has affine equality constraints.
3.7.2 Introducing equality constraints
\[\begin{align} \text{minimize}& & f_0(A_0x+b_0) & & \\ \text{subject to} & & f_i(A_0x+b_0) \leq 0 & , & i=1,\dots,m \\ \end{align}\] is equivalent to \[\begin{align} \text{minimize over $x$, $y_i$}& & f_0(y_0) & & \\ \text{subject to} & & f_i(y_i) \leq 0 & , & i=1,\dots,m \\ & & y_i=A_ix+b_i & & i=1,\dots,m \end{align}\]
3.7.3 Introducing slack variables for linear inequalities
\[\begin{align} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & a_i^Tx \leq b_i & , & i=1,\dots,m \\ \end{align}\] is equivalent to is equivalent to \[\begin{align} \text{minimize over $x$, $s$}& & f_0(x) & & \\ \text{subject to} & & a_i^Tx+s_i=b_i & , & i=1,\dots,m \\ & & -s_i \leq 0 & & i=1,\dots,m \end{align}\]
Boyd: this looks stupid but is actually useful.
3.7.4 The epigraph form (trick)
The standard form (3.6) is equivalent to \[\begin{align} \text{minimize over $x$, $t$}& & t & & \\ \text{subject to} & & f_0(x) - t \leq 0 & & \\ & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\]
Note that this actually converts any objective to a linear objective!
3.7.5 Minimizing over some variables
\[\begin{align} \text{minimize}& & f_0(x_1, x_2) & & \\ \text{subject to} & & f_i(x_1) \leq 0 & & i=1,\dots,m \end{align}\] is equivalent to \[\begin{align} \text{minimize}& & \tilde f_0(x_1) & & \\ \text{subject to} & & f_i(x_1) \leq 0 & & i=1,\dots,m \end{align}\] where \(\tilde f_0 (x_1) = \inf_{x_2} f_0(x_1, x_2)\)
a.k.a. dynamic programming preserves convexity.
3.8 Dummy
Lorem
Lorem Lorem
Lorem
\(f: \mathbf R^n \rightarrow \mathbf R\) \(f: \mathbf R^n \rightarrow \mathbf R\) \(f: \mathbf R^n \rightarrow \mathbf R\) \(f: \mathbf R^n \rightarrow \mathbf R\)