Chapter 3 Convex Optimization

3.1 Optimization in standard form

\[\begin{align} \tag{3.1} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\]

A standard form of describing optimization problems is (3.1), where

  • \(x\in \mathbf R^n\) is the optimization variable
  • \(f_0: \mathbf R^n \rightarrow \mathbf R\) is the objective or cost function
  • \(f_i: \mathbf R^n \rightarrow \mathbf R\) are the inequality constraint functions
  • \(h_i: \mathbf R^n \rightarrow \mathbf R\) are the equality constraint functions

For such a description, the optimal value is defined as \[\begin{equation} \tag{3.2} p^* = \inf \{f_0(x) | f_i(x)\leq 0, h_i(x) = 0 \}. \end{equation}\]

  • When the problem is infeasible, \(p^*=\infty\),
  • when it’s unbounded below, \(p^* = -\infty\).
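As a concrete illustration (my addition, with invented data), here is a minimal cvxpy sketch of a problem in the standard form (3.1); cvxpy reports \(p^*=\infty\) for an infeasible minimization and \(-\infty\) for an unbounded one, matching the conventions above.

```python
# A minimal sketch of the standard form (3.1) in cvxpy.
# The problem data below is invented for illustration.
import cvxpy as cp
import numpy as np

n = 3
x = cp.Variable(n)                   # optimization variable x in R^n
f0 = cp.sum_squares(x - np.ones(n))  # objective f_0
constraints = [
    cp.norm(x, 2) - 2 <= 0,          # inequality constraint f_1(x) <= 0
    cp.sum(x) == 1,                  # equality constraint h_1(x) = 0
]

prob = cp.Problem(cp.Minimize(f0), constraints)
prob.solve()
print(prob.status)   # 'optimal' if feasible and bounded
print(prob.value)    # the optimal value p^*
```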

3.2 Optimality and local optimality

  • feasible: \(x\) is in the domain and satisfies all constraints
  • optimal: \(f_0(x) = p^*\)
  • locally optimal: there exists \(R>0\) such that \(x\) is optimal for the problem with the added constraint \(\| z-x \|_2 \leq R\).

3.3 Implicit constraints

Problem (3.1) has an implicit constraint \[\begin{equation} \tag{3.3} x\in \mathcal D = \bigcap_{i=0}^m \mathbf{dom} f_i \cap \bigcap_{i=1}^p \mathbf{dom} h_i \end{equation}\]

  • \(\mathcal D\) is called the domain of the problem
  • a problem is unconstrained if it has no explicit constraints (\(m=p=0\))
Example 3.1 \[\begin{align} \text{minimize} & & f_0(x) = -\sum_{i=1}^k \log (b_i-a_i^Tx) \end{align}\] is an unconstrained problem with the implicit constraints \(a_i^Tx<b_i\), \(i=1,\dots,k\).
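A minimal cvxpy sketch of Example 3.1 (the data \(a_i\), \(b_i\) is invented): no explicit constraints are passed, yet the solver respects \(a_i^Tx<b_i\) because it is implicit in \(\mathbf{dom} \log\).

```python
# Sketch of Example 3.1 (invented data): minimize the log barrier
#   f_0(x) = -sum_i log(b_i - a_i^T x)
# The constraint a_i^T x < b_i is implicit via dom(log).
import cvxpy as cp
import numpy as np

A = np.vstack([np.eye(2), -np.eye(2)])  # rows a_i^T; domain is the open box |x_j| < 1
b = np.ones(4)

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(-cp.sum(cp.log(b - A @ x))))
prob.solve()
print(x.value)  # the analytic center of {x : Ax < b}; here x = 0
```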

3.4 Feasibility problem

\[\begin{align} \tag{3.4} \text{find}& & x & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\] can equivalently be written as \[\begin{align} \tag{3.5} \text{minimize}& & 0 & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & h_i(x) = 0 & , & i=1,\dots,p \end{align}\]

  • \(p^*=0\) if constraints are feasible
  • \(p^*=\infty\) otherwise (\(\inf \varnothing = \infty\))
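A hedged sketch of the feasibility-as-minimization trick (3.5) in cvxpy, with made-up constraints:

```python
# A feasibility problem (3.4) written as (3.5): minimize the constant 0
# subject to the constraints. Data invented for the sketch.
import cvxpy as cp

x = cp.Variable(2)
constraints = [x[0] + x[1] == 1, cp.sum_squares(x) <= 1]
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()
print(prob.status)  # 'optimal' -> feasible (p* = 0); 'infeasible' -> p* = +inf
print(x.value)      # a feasible point, if one exists
```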

3.5 Convex optimization

\[\begin{align} \tag{3.6} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\]

Boyd defines a convex optimization problem to be of the form (3.6), where:

  • the \(f_i\) are all convex and the equality constraints are all affine
  • one can also define quasiconvex problems, where \(f_0\) is quasiconvex and the \(f_i\) (\(i\geq 1\)) are convex

Note that the feasible set of a convex optimization problem is convex.

Example 3.2 \[\begin{align} \text{minimize}& & f_0(x)=x_1^2+x_2^2 & \\ \text{subject to} & & f_1(x)=x_1/(1+x_2^2) \leq 0 & \\ & & h_1(x) = (x_1 + x_2)^2=0 & \end{align}\]

  • \(f_0\) is convex, and the feasible set \(\{(x_1, x_2) : x_1 = -x_2 \leq 0\}\) is also convex
  • however, this is not a convex problem in the sense of (3.6): \(f_1\) is not convex and \(h_1\) is not affine
  • we can rewrite it as the equivalent (not identical) convex problem below (see also the cvxpy sketch that follows) \[\begin{align} \text{minimize}& & f_0(x)=x_1^2+x_2^2 & \\ \text{subject to} & & x_1 \leq 0 & \\ & & h_1(x) = x_1 + x_2=0 & \end{align}\]
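As a minimal sketch (cvxpy is my choice here, not from the notes): cvxpy's DCP rules would reject the original formulation, while the equivalent convex problem above is accepted and solved.

```python
# Example 3.2 as code: the original form violates DCP (x1/(1+x2^2) is not
# convex, (x1+x2)^2 = 0 is not affine), but the rewrite is accepted.
import cvxpy as cp

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x)),
                  [x[0] <= 0, x[0] + x[1] == 0])
prob.solve()
print(prob.value, x.value)  # optimum at x = (0, 0)
```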

3.6 Optimality in convex optimization

Theorem 3.1 For a convex problem, any locally optimal point is globally optimal.

One can easily prove Theorem 3.1 by contradiction: if a feasible \(z\) had \(f_0(z) < f_0(x)\), then points on the segment between \(x\) and \(z\) arbitrarily close to \(x\) would be feasible (by convexity) with smaller objective value, contradicting local optimality.

3.6.1 Optimality criterion for differentiable \(f_0\)

Theorem 3.2 \(x\) is optimal iff \(x\) is feasible and \[\begin{align} \tag{3.7} \nabla f_0(x)^T (y-x) \geq 0 & \text{ for all feasible }y \end{align}\]

We have seen this before in Section 2.2.

Example 3.3 unconstrained problem: \(x\) is optimal iff \(\nabla f_0(x) = 0\)
Example 3.4 equality constrained problem (only \(Ax=b\)): \(x\) is optimal iff there exists \(\nu\) such that \(x\in \mathbf{dom} f_0\), \(Ax=b\), \(\nabla f_0(x) + A^T\nu=0\)

Proof. Assume \(x\) is optimal. Then \(\nabla f_0(x)^T (z-x) \geq 0\) for all \(z\) with \(Az=b\); also \(Ax=b\).

As \(z\) ranges over the solution set of \(Az=b\), \(v=z-x\) ranges over \(\mathrm{Null}(A)\); since both \(v\) and \(-v\) occur, \(\nabla f_0(x)^T v = 0\) for all \(v\in \mathrm{Null}(A)\), i.e. \(\nabla f_0(x) \perp \mathrm{Null}(A)\).

Consequently, \(\nabla f_0(x) \in \operatorname{im} (A^T)\) (as \(\mathrm{Null}(A)^\perp = \operatorname{im}(A^T)\)),

so \(\nabla f_0(x) = A^Tu\) for some \(u\); taking \(\nu=-u\) gives \(\nabla f_0(x) + A^T\nu = 0\).
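A small numeric sanity check of Example 3.4 (my addition, invented data): for \(f_0(x)=\tfrac12\|x\|_2^2\) we have \(\nabla f_0(x)=x\), so the optimality conditions become a linear (KKT) system in \((x,\nu)\).

```python
# Numeric check of Example 3.4 (invented data): minimize (1/2)||x||^2
# subject to Ax = b. Optimality reads  x + A^T nu = 0,  A x = b,
# a linear system in (x, nu).
import numpy as np

np.random.seed(1)
n, p = 4, 2
A = np.random.randn(p, n)   # p equality constraints, n variables
b = np.random.randn(p)

# KKT matrix: [[I, A^T], [A, 0]] @ [x; nu] = [0; b]
K = np.block([[np.eye(n), A.T], [A, np.zeros((p, p))]])
sol = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))
x, nu = sol[:n], sol[n:]

print(np.allclose(A @ x, b))          # feasibility
print(np.allclose(x + A.T @ nu, 0))   # gradient condition: grad f0(x) = x
```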

Example 3.5 minimization over the non-negative orthant \(\mathbf R_{+}^n\):

\[\begin{align} \text{minimize}& & f_0(x) & \\ \text{subject to} & & x \succeq 0 & \end{align}\]

\(x\) is optimal iff \[\begin{align} x \in \mathbf{dom} f_0, & & x \succeq 0, & & \nabla f_0(x) \succeq 0, & & \nabla f_0(x)_i \, x_i = 0 \text{ for all } i \end{align}\]

Proof. Assume \(x\) is optimal. Then \(\nabla f_0(x)^T (z-x) \geq 0\) for all \(z\succeq 0\).

Plugging in \(z=0\) gives \(\nabla f_0(x)^Tx \leq 0\).

If the \(i\)-th component of \(\nabla f_0(x)\) were negative, we could take \(z_i\rightarrow \infty\) (keeping the other components of \(z\) fixed), making the left-hand side arbitrarily negative; thus \(\nabla f_0(x) \succeq 0\).

Since \(\nabla f_0(x)^Tx \leq 0\) while each term \(\nabla f_0(x)_i x_i\) is non-negative, every term must vanish: \(\nabla f_0(x)_i x_i = 0\) for all \(i\) (complementarity).
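A quick numeric check of the complementarity condition (my addition; the quadratic objective and data are invented):

```python
# Complementarity check for Example 3.5 (invented data): minimize a convex
# quadratic over x >= 0 and verify grad f0(x)_i * x_i = 0 at the optimum.
import cvxpy as cp
import numpy as np

np.random.seed(2)
n = 4
c = np.random.randn(n)

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - c)), [x >= 0])
prob.solve()

grad = 2 * (x.value - c)              # gradient of f0(x) = ||x - c||^2
print(np.round(grad * x.value, 6))    # componentwise ~0 (complementarity)
```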

3.7 Equivalent convex problems

This section is full of tricks that you will regret not knowing.

3.7.1 eliminating equality constraints

\[\begin{align} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\] is equivalent to \[\begin{align} \tag{3.8} \text{minimize (over z)}& & f_0(Fz+x_0) & & \\ \text{subject to} & & f_i(Fz+x_0) \leq 0 & , & i=1,\dots,m \\ \end{align}\] where \(F\) and \(x_0\) are such that \[\begin{align} \tag{3.9} Ax=b & \Longleftrightarrow & x=Fz+x_0 \text{ for some }z \end{align}\]

How do we get \(F\) and \(x_0\)? The columns of \(F\) span \(\mathrm{Null}(A)\), and \(x_0\) is any particular solution of \(Ax=b\).

Note that composition with an affine map preserves convexity, so the eliminated problem is also convex.
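A sketch of the elimination recipe in numpy/scipy (my addition; the data and the least-squares objective are invented):

```python
# Sketch of eliminating Ax = b (invented data): take x0 as a particular
# solution and the columns of F as a basis of Null(A); then x = F z + x0
# is free in z.
import numpy as np
from scipy.linalg import null_space

np.random.seed(3)
A = np.random.randn(2, 4)
b = np.random.randn(2)

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution of Ax = b
F = null_space(A)                           # columns span Null(A)

# Eliminated version of: minimize ||x - 1||^2 subject to Ax = b.
# The reduced problem in z is unconstrained least squares.
target = np.ones(4)
z = np.linalg.lstsq(F, target - x0, rcond=None)[0]
x = F @ z + x0

print(np.allclose(A @ x, b))                # x is feasible for every z
```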

Boyd: An intriguing fact is that in practice it is often not a good idea to eliminate equality constraints; it is often a good idea to add (un-eliminate) them instead!

Boyd: Convex problems (by the definition here) only have affine equality constraints.

3.7.2 Introducing equality constraints

\[\begin{align} \text{minimize}& & f_0(A_0x+b_0) & & \\ \text{subject to} & & f_i(A_ix+b_i) \leq 0 & , & i=1,\dots,m \\ \end{align}\] is equivalent to \[\begin{align} \text{minimize over $x$, $y_i$}& & f_0(y_0) & & \\ \text{subject to} & & f_i(y_i) \leq 0 & , & i=1,\dots,m \\ & & y_i=A_ix+b_i & , & i=0,\dots,m \end{align}\]
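A minimal cvxpy sketch of this transformation (invented data; only the \(i=0\) term is shown):

```python
# Sketch of introducing equality constraints (invented data): replace
# f_0(A_0 x + b_0) by f_0(y_0) with a new variable y_0 = A_0 x + b_0.
import cvxpy as cp
import numpy as np

np.random.seed(5)
A0, b0 = np.random.randn(3, 2), np.random.randn(3)

x, y0 = cp.Variable(2), cp.Variable(3)
prob = cp.Problem(cp.Minimize(cp.sum_squares(y0)),   # f_0(y_0)
                  [y0 == A0 @ x + b0])               # y_0 = A_0 x + b_0
prob.solve()
print(prob.value)   # same optimal value as minimizing f_0(A_0 x + b_0)
```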

3.7.3 Introducing slack variables for linear inequalities

\[\begin{align} \text{minimize}& & f_0(x) & & \\ \text{subject to} & & a_i^Tx \leq b_i & , & i=1,\dots,m \\ \end{align}\] is equivalent to \[\begin{align} \text{minimize over $x$, $s$}& & f_0(x) & & \\ \text{subject to} & & a_i^Tx+s_i=b_i & , & i=1,\dots,m \\ & & -s_i \leq 0 & , & i=1,\dots,m \end{align}\]

Boyd: this looks stupid but is actually useful.
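A minimal cvxpy sketch (invented data; the quadratic objective is my choice to keep the problem bounded):

```python
# Sketch of the slack-variable rewrite (invented data): a_i^T x <= b_i
# becomes a_i^T x + s_i = b_i with s_i >= 0.
import cvxpy as cp
import numpy as np

np.random.seed(4)
m, n = 5, 3
A = np.random.randn(m, n)
b = np.random.rand(m) + 1.0

x, s = cp.Variable(n), cp.Variable(m)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - 1)),
                  [A @ x + s == b, s >= 0])  # same feasible x as A @ x <= b
prob.solve()
print(prob.value)
```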

3.7.4 The epigraph form (trick)

The standard form (3.6) is equivalent to \[\begin{align} \text{minimize over $x$, $t$}& & t & & \\ \text{subject to} & & f_0(x) - t \leq 0 & & \\ & & f_i(x) \leq 0 & , & i=1,\dots,m \\ & & Ax=b & & \end{align}\]

Note that this actually converts any objective to a linear objective!
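A hedged cvxpy sketch comparing the two forms (invented data):

```python
# Epigraph trick (invented data): minimize f0(x) becomes minimize t
# subject to f0(x) - t <= 0; the objective is now linear in (x, t).
import cvxpy as cp
import numpy as np

x, t = cp.Variable(3), cp.Variable()
f0 = cp.norm(x - np.array([1.0, 2.0, 3.0]), 2)

direct = cp.Problem(cp.Minimize(f0), [cp.sum(x) == 1])
epi = cp.Problem(cp.Minimize(t), [f0 - t <= 0, cp.sum(x) == 1])

direct.solve()
epi.solve()
print(direct.value, epi.value)   # the two optimal values agree
```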

3.7.5 Minimizing over some variables

\[\begin{align} \text{minimize}& & f_0(x_1, x_2) & & \\ \text{subject to} & & f_i(x_1) \leq 0 & & i=1,\dots,m \end{align}\] is equivalent to \[\begin{align} \text{minimize}& & \tilde f_0(x_1) & & \\ \text{subject to} & & f_i(x_1) \leq 0 & & i=1,\dots,m \end{align}\] where \(\tilde f_0 (x_1) = \inf_{x_2} f_0(x_1, x_2)\).

a.k.a. partial minimization; in this sense, dynamic programming preserves convexity.
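A tiny cvxpy sketch (my construction): for \(f_0(x_1,x_2)=x_1^2+(x_2-x_1)^2\), minimizing over \(x_2\) gives \(\tilde f_0(x_1)=x_1^2\) (set \(x_2=x_1\)), and both formulations agree.

```python
# Partial minimization sketch (invented data):
#   f_0(x1, x2) = x1^2 + (x2 - x1)^2  ==>  tilde_f0(x1) = x1^2,
# which is again convex.
import cvxpy as cp

x1, x2 = cp.Variable(), cp.Variable()

joint = cp.Problem(cp.Minimize(cp.square(x1) + cp.square(x2 - x1)),
                   [x1 >= 1])
reduced = cp.Problem(cp.Minimize(cp.square(x1)), [x1 >= 1])

joint.solve()
reduced.solve()
print(joint.value, reduced.value)   # both equal 1
```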
