Main idea
Instead of minimizing $c^Tx$ subject to hard constraints, we define a barrier function $F(x)$ which blows up to infinity as we approach the boundary of the feasible region. Then we minimize $t c^T x + F(x)$ for some scalar $t > 0$, which indicates how much we care about the objective vs. the barrier.
Now the problem is unconstrained inside the feasible region, so we can use Newton's method and other gradient-based methods.
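To make the idea concrete, here is a minimal sketch of the barrier objective in Python. It assumes linear constraints $a_i^T x \le b_i$ and the log barrier $F(x) = -\sum_i \log(b_i - a_i^T x)$; the function names and the tiny one-dimensional problem are made up for illustration.

```python
import math

def log_barrier(x, A, b):
    """F(x) = -sum_i log(b_i - a_i . x); +inf outside the strict interior."""
    total = 0.0
    for a_i, b_i in zip(A, b):
        slack = b_i - sum(a_ij * x_j for a_ij, x_j in zip(a_i, x))
        if slack <= 0.0:
            return math.inf  # on or beyond the boundary
        total -= math.log(slack)
    return total

def barrier_objective(x, c, A, b, t):
    """t * c^T x + F(x): larger t puts more weight on the true objective."""
    return t * sum(c_j * x_j for c_j, x_j in zip(c, x)) + log_barrier(x, A, b)

# Illustrative problem: minimize x over 0 <= x <= 1,
# written as x <= 1 and -x <= 0.
A = [[1.0], [-1.0]]
b = [1.0, 0.0]
c = [1.0]

for x in ([0.5], [1e-3], [1e-9]):
    print(x, barrier_objective(x, c, A, b, t=1.0))
```

The printed values grow as $x$ moves toward the boundary at $0$, which is exactly the "blows up" behaviour we want from $F$.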
The Lagrangian dual
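The KKT conditions state that, for convex $f$ and $g$ (under a constraint qualification such as Slater's condition), $x$ solves $\min f(x)$ subject to $g(x) \le 0$ if and only if there is a multiplier $u$ such that
$$\nabla f(x) + u \nabla g(x) = 0, \quad u\, g(x) = 0, \quad u \ge 0, \quad g(x) \le 0.$$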
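We can modify the complementary slackness condition to $u g(x) = -t$ for $t > 0$ (the sign keeps $u \ge 0$ when $g(x) < 0$), which gives $u = -\frac{t}{g(x)}$. Substituting into the gradient equation $\nabla f(x) + u \nabla g(x) = 0$ we get
$$\nabla f(x) - t \frac{\nabla g(x)}{g(x)} = \nabla \left( f(x) - t \log(-g(x)) \right) = 0.$$
Note that this $t$ corresponds to $1/t$ in the objective $t c^T x + F(x)$ above.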
The log term is typically called the barrier function.
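As a small sanity check, here is a one-dimensional worked example. The problem is made up for illustration, and it takes the perturbed condition with the sign $u g(x) = -t$ so that $u \ge 0$ on the feasible side. Take $f(x) = x$ and $g(x) = -x \le 0$, whose true minimizer is $x = 0$:
$$\nabla \left( x - t \log x \right) = 1 - \frac{t}{x} = 0 \implies x(t) = t, \qquad u = -\frac{t}{g(x(t))} = \frac{t}{x(t)} = 1.$$
So $x(t) \to 0$ as $t \to 0$, and $u = 1$ satisfies the unperturbed gradient condition $1 + u \cdot (-1) = 0$.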
Newton's method
The second-order Taylor approximation for a multivariate function is
$$f(x + h) \approx f(x) + \nabla f(x)^T h + \frac{1}{2} h^T \nabla^2 f(x) h.$$
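We want to pick $h$ such that the quadratic approximation is minimized^{1}, so we take the gradient^{2} with respect to $h$:
$$\nabla_h \left( f(x) + \nabla f(x)^T h + \frac{1}{2} h^T \nabla^2 f(x) h \right) = \nabla f(x) + \nabla^2 f(x) h.$$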
Setting $\nabla f(x) + \nabla^2 f(x) h = 0$ and solving for $h$ gives
$$h = -\left(\nabla^2 f(x)\right)^{-1} \nabla f(x).$$
Therefore the update rule is
$$x_{k+1} = x_k - \left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k).$$
A key fact about Newton's method is that if we're close enough to a local optimum, we get quadratic convergence.
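The following is a minimal sketch of the Newton update on a two-dimensional problem, using only the standard library. The test function $f(x) = e^{x_0 + x_1} + x_0^2 + x_1^2$ and the helper names are assumptions chosen for illustration; the point is to watch the gradient norm collapse once we are near the minimum.

```python
import math

def solve_2x2(H, g):
    """Solve H h = g for a 2x2 matrix H via Cramer's rule."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return [(d * g[0] - b * g[1]) / det, (a * g[1] - c * g[0]) / det]

def newton(grad, hess, x, steps):
    """Iterate x <- x - hess(x)^{-1} grad(x); also record gradient norms."""
    norms = []
    for _ in range(steps):
        g = grad(x)
        norms.append(math.hypot(g[0], g[1]))
        h = solve_2x2(hess(x), g)
        x = [x[0] - h[0], x[1] - h[1]]
    return x, norms

# Hypothetical smooth, strictly convex test function:
# f(x) = exp(x0 + x1) + x0^2 + x1^2
def grad(x):
    e = math.exp(x[0] + x[1])
    return [e + 2 * x[0], e + 2 * x[1]]

def hess(x):
    e = math.exp(x[0] + x[1])
    return [[e + 2, e], [e, e + 2]]

x_star, norms = newton(grad, hess, [0.0, 0.0], 6)
for k, n in enumerate(norms):
    print(k, n)
```

The number of correct digits in the gradient norm roughly doubles per iteration, which is what quadratic convergence looks like in practice.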
Let $y = Ax$ for an invertible matrix $A$, and $\phi(y) = f(A^{-1}y) = f(x)$. Applying Newton's method to $\phi(y)$ is the same as applying it to $f(x)$.
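Performing a change of basis^{3} we can find the new gradient and Hessian
$$\nabla \phi(y) = (A^{-1})^T \nabla f(x), \qquad \nabla^2 \phi(y) = (A^{-1})^T \nabla^2 f(x) A^{-1}.$$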
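Now if we perform Newton's method on $\phi(y)$, after some nice cancellation we get
$$y_{+} = y - \left(\nabla^2 \phi(y)\right)^{-1} \nabla \phi(y) = y - A \left(\nabla^2 f(x)\right)^{-1} \nabla f(x) = A \left( x - \left(\nabla^2 f(x)\right)^{-1} \nabla f(x) \right).$$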
Which is just performing Newton's method in the $x$ world, then transforming back to $y$.
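This invariance is easy to check numerically. The sketch below takes $\phi(y) = f(A^{-1}y)$, builds its gradient and Hessian with the chain rule, and compares one Newton step in the $y$ world against the mapped step $A x_+$. The matrix $A$, the starting point, and the test function are arbitrary choices for illustration.

```python
import math

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def inverse(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def newton_step(g, H, p):
    """One Newton step from point p with gradient g and Hessian H."""
    h = matvec(inverse(H), g)
    return [p[0] - h[0], p[1] - h[1]]

# Hypothetical test function f(x) = exp(x0 + x1) + x0^2 + x1^2.
def grad_f(x):
    e = math.exp(x[0] + x[1])
    return [e + 2 * x[0], e + 2 * x[1]]

def hess_f(x):
    e = math.exp(x[0] + x[1])
    return [[e + 2, e], [e, e + 2]]

A = [[2.0, 1.0], [0.0, 3.0]]          # any invertible matrix
A_inv_T = transpose(inverse(A))

x = [0.5, -1.0]
y = matvec(A, x)

# Newton step in the x world.
x_next = newton_step(grad_f(x), hess_f(x), x)

# Newton step in the y world, with derivatives from the chain rule.
grad_phi = matvec(A_inv_T, grad_f(x))
hess_phi = matmul(matmul(A_inv_T, hess_f(x)), inverse(A))
y_next = newton_step(grad_phi, hess_phi, y)

print(y_next, matvec(A, x_next))  # the two should agree up to rounding
```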
How much can we increase t
Recall our objective function $t c^Tx + F(x)$. We want to find how large we can make the new value $t'$ while the current iterate stays within the radius of convergence for Newton's method.
We define the Newton decrement^{4} for some function $f$ as
$$\lambda(x) = \sqrt{\nabla f(x)^T \left(\nabla^2 f(x)\right)^{-1} \nabla f(x)}.$$
Loosely speaking, this measures the distance from a local optimum^{5}. Notice $\lambda(x^*) = 0$.
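Here is a small sketch that evaluates the Newton decrement, assuming the standard definition $\lambda(x) = \sqrt{\nabla f(x)^T (\nabla^2 f(x))^{-1} \nabla f(x)}$. The quadratic $f(x) = x_0^2 + 4x_1^2$ is a made-up example, chosen because the answer can be checked by hand.

```python
import math

def newton_decrement(g, H):
    """lambda = sqrt(g^T H^{-1} g) for a 2x2 symmetric positive definite H."""
    (a, b), (c, d) = H
    det = a * d - b * c
    # H^{-1} g via the closed-form 2x2 inverse.
    hg = [(d * g[0] - b * g[1]) / det, (a * g[1] - c * g[0]) / det]
    return math.sqrt(g[0] * hg[0] + g[1] * hg[1])

# Hypothetical quadratic f(x) = x0^2 + 4 x1^2, minimized at the origin.
def f(x):
    return x[0] ** 2 + 4 * x[1] ** 2

def grad(x):
    return [2 * x[0], 8 * x[1]]

def hess(x):
    return [[2.0, 0.0], [0.0, 8.0]]

x = [1.0, 0.5]
lam = newton_decrement(grad(x), hess(x))
lam_at_opt = newton_decrement(grad([0.0, 0.0]), hess([0.0, 0.0]))
print(lam, lam_at_opt)
```

For a quadratic, $\lambda(x)^2 / 2$ equals the gap $f(x) - \min f$ exactly, which is one way to see why the decrement behaves like a distance to the optimum.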
Quadratic convergence^{6} is written as
For our purposes $f_t(x) = t c^Tx + F(x)$
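For concreteness, here are the derivatives Newton's method needs when $F$ is the log barrier over linear constraints $a_i^T x \le b_i$, $i = 1, \dots, m$. This particular choice of constraints is an assumption made for illustration:
$$F(x) = -\sum_{i=1}^{m} \log(b_i - a_i^T x),$$
$$\nabla f_t(x) = t c + \sum_{i=1}^{m} \frac{a_i}{b_i - a_i^T x}, \qquad \nabla^2 f_t(x) = \sum_{i=1}^{m} \frac{a_i a_i^T}{(b_i - a_i^T x)^2}.$$
The Hessian does not depend on $t$, so changing $t$ only changes the gradient term inside the Newton decrement.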
(TODO: Finish analysis/simplify lecture. for now I'm just stealing the t update rule.)
$\nu$ is defined here in the lecture
A result in the lecture is
Which means we can increase $t$ by a relative amount of $4^{-1} \nu^{-1/2}$ per step, that is, $t' = t \left(1 + 4^{-1} \nu^{-1/2}\right)$.
What is $\nu$ for the log barrier? (shown below)?
TODO: Find $\nu$
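Below is a rough sketch of the resulting outer loop on a toy problem: minimize $x$ over $0 < x < 1$ with the log barrier $F(x) = -\log x - \log(1 - x)$. Two things are assumptions rather than results from these notes: the update is read as $t' = t(1 + \frac{1}{4\sqrt{\nu}})$, and $\nu$ is taken to be the number of log terms (here $2$), which is the usual value quoted for the log barrier. The inner loop re-centers with Newton steps until the Newton decrement is tiny, which is more conservative than a single step per update.

```python
import math

def f_prime(x, t):
    # Derivative of f_t(x) = t*x - log(x) - log(1 - x).
    return t - 1.0 / x + 1.0 / (1.0 - x)

def f_second(x):
    # Second derivative of f_t; it does not depend on t.
    return 1.0 / x ** 2 + 1.0 / (1.0 - x) ** 2

def center(x, t, tol=1e-8, max_steps=50):
    """Newton steps on f_t until the Newton decrement drops below tol."""
    for _ in range(max_steps):
        g, h = f_prime(x, t), f_second(x)
        if abs(g) / math.sqrt(h) < tol:  # 1-D Newton decrement
            break
        x -= g / h
    return x

nu = 2.0                                     # assumption: one per log term
growth = 1.0 + 1.0 / (4.0 * math.sqrt(nu))   # assumed reading of the t update
t, x = 0.1, 0.5                              # x = 0.5 minimizes F alone
x = center(x, t)

outer_steps = 0
while t < 1e4:
    t *= growth          # put more weight on the objective
    x = center(x, t)     # re-center, warm-started from the previous x
    outer_steps += 1

print(outer_steps, t, x)
```

The true minimizer is $x = 0$, and the centered point for a given $t$ sits just below $1/t$, so the iterate approaches the optimum as $t$ grows.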
TODO
 Prove the error of the Taylor approximation is $O(h^2)$
 Prove quadratic convergence of Newton's method
Footnotes

1. Actually we're going to the closest stationary point; we assume we're close to a minimum, though.

3. We transpose the inverse $(A^{-1})^T$ because (TODO)

4. The reason we restate in terms of the Newton decrement is that the standard Newton's method analysis isn't invariant under linear transformations.

5. We don't need to worry about local optima, though, since we're optimizing a convex function.

6. This requires that the function is self-concordant.