# Optimizing the KL

## Outline

### Topics

- Black-box / automatic differentiation variational inference (ADVI)
- Coordinate ascent variational inference (CAVI)

### Rationale

We have now identified our objective function, the ELBO. We still need to pick a numerical method to optimize it.
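As a reminder, assuming the standard definition of the ELBO for a variational family \(q_\phi\) (and writing \(L(\phi)\) for the corresponding loss that the methods below minimize):

\[
\mathrm{ELBO}(\phi) = \mathbb{E}_{q_\phi(z)}\!\left[\log p(x, z) - \log q_\phi(z)\right],
\qquad
L(\phi) = -\,\mathrm{ELBO}(\phi).
\]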

## Overview

- Before ~2015, the user had to carry out a model-specific mathematical derivation each time they wanted to apply VI to a new model.
- This changed with the advent of “black box methods” such as ADVI.

- In this course we focus on black-box methods since they are easier to use.
- However, CAVI is still useful as it can be much faster in practice.

## Black box methods

**Idea:** use a gradient descent method to minimize \(L(\phi)\).

**Difficulty:** the objective function \(L\) involves an integral (an expectation with respect to \(q\)). How can we compute its gradient?

**Solution:**

- Approximate the gradient using a Monte Carlo method.
- Feed that gradient into a Stochastic Gradient Descent (SGD) algorithm.
- Convergence guarantees typically require that this approximation be unbiased.
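The steps above can be sketched in a few lines. This is a minimal illustration, not the ADVI algorithm in full: it assumes a toy unnormalized target (a \(\mathcal{N}(3, 1)\) posterior, chosen here for illustration), a Gaussian variational family \(q = \mathcal{N}(\mu, \sigma^2)\), and hand-coded reparameterization gradients (with the Gaussian entropy handled analytically). The Monte Carlo gradient estimate is unbiased, as the convergence guarantees require.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: gradient of the unnormalized log-posterior of N(3, 1).
def grad_log_p(z):
    return -(z - 3.0)

mu, log_sigma = 0.0, 0.0   # variational parameters of q = N(mu, sigma^2)
lr, n_mc = 0.05, 32        # SGD step size, Monte Carlo samples per step

for step in range(2000):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_mc)
    z = mu + sigma * eps                       # reparameterized samples z ~ q
    g = grad_log_p(z)
    # Unbiased Monte Carlo estimates of the ELBO gradient; the Gaussian
    # entropy term contributes +1 to the log_sigma gradient analytically.
    grad_mu = g.mean()
    grad_log_sigma = (g * sigma * eps).mean() + 1.0
    mu += lr * grad_mu                         # stochastic gradient ascent
    log_sigma += lr * grad_log_sigma           # on the ELBO

print(mu, np.exp(log_sigma))  # should approach the true posterior mean 3 and std 1
```

In a real black-box implementation the hand-coded `grad_log_p` would be replaced by automatic differentiation of the model's log-joint, which is what makes the approach "automatic".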

## References

- See Blei et al., 2018.