Lecture Notes: https://diffusion.csail.mit.edu/docs/lecture-notes.pdf

Lab One

Lab Two

Lab Three

Generative Modeling

The quality of a generation can be interpreted as the probability that the generated object came from a target distribution. For example, when generating images of dogs, we want to measure the probability that a generated image came from the distribution of dog images.

The probability distribution can be represented as a density: a mapping from vectors to non-negative numbers.

$$ p_{\mathrm{data}}: \mathbb{R}^d \rightarrow \mathbb{R}_{\geq0} $$

We can't explicitly define $p_{\mathrm{data}}$ but we can access samples from it (dataset): $z_1, z_2, \ldots, z_n \sim p_{\mathrm{data}}$.

Unconditional generation (sampling from the data distribution):

$$ z \sim p_{\mathrm{data}} $$

Conditional generation (conditioning the sampling with $y$, which can be a prompt):

$$ z \sim p_{\mathrm{data}}(\cdot|y) $$
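A hypothetical sketch of the two sampling modes using the empirical distribution of a toy labeled dataset (the dataset, labels, and helper names below are illustrative assumptions, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((6, 2))     # toy "dataset" of 6 samples (assumption)
labels = np.array([0, 1, 0, 1, 1, 0])  # toy condition y for each sample (assumption)

def sample_unconditional() -> np.ndarray:
    """z ~ p_data: draw uniformly from the empirical data distribution."""
    return data[rng.integers(len(data))]

def sample_conditional(y: int) -> np.ndarray:
    """z ~ p_data(.|y): draw only from samples whose label matches y."""
    idx = np.flatnonzero(labels == y)
    return data[rng.choice(idx)]

s = sample_conditional(1)
# s is one of the rows of `data` whose label is 1
```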

Sampling starts from a simple initial distribution, typically a standard Gaussian:

$$ p_{\text{init}} = \mathcal{N}(0, I_d) $$

The generative model transforms this into the data distribution.
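As a minimal sketch of the setup, we can only draw samples from $p_{\text{init}} = \mathcal{N}(0, I_d)$; a trained generative model would then transform these samples into samples from $p_{\mathrm{data}}$ (dimension and sample count below are illustrative choices):

```python
import numpy as np

d = 2  # data dimension (illustrative choice)
rng = np.random.default_rng(0)

def sample_p_init(n: int) -> np.ndarray:
    """Draw n samples from the initial distribution N(0, I_d)."""
    return rng.standard_normal((n, d))

samples = sample_p_init(1000)
print(samples.shape)  # (1000, 2)
# The empirical mean is close to 0 and the covariance close to I_d.
```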

Flow Models

Trajectory

A trajectory is a function that maps time to a vector, with time constrained to $[0, 1]$. We call each point $X_t$ along the trajectory a state (similar to RL terminology).

$$ X:[0,1] \rightarrow \mathbb{R}^d, \quad t\mapsto X_t $$
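A concrete example of such a trajectory, assuming a straight line from an initial state $x_0$ to a target point $z$ (the linear form and the stand-in endpoint are assumptions for illustration only):

```python
import numpy as np

d = 2
rng = np.random.default_rng(1)
x0 = rng.standard_normal(d)  # initial state X_0 (e.g., a sample from p_init)
z = np.ones(d)               # stand-in target point (assumption)

def X(t: float) -> np.ndarray:
    """Trajectory X_t = (1 - t) * x0 + t * z, a map [0, 1] -> R^d."""
    return (1.0 - t) * x0 + t * z

assert np.allclose(X(0.0), x0)  # the trajectory starts at the initial state
assert np.allclose(X(1.0), z)   # and ends at the target point
```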