- Generative models learn the probability distribution of the data within each class
- Discriminative models learn the decision boundary between the classes

## Simple Multinomial Generative Model

Let $\theta_w$ denote the probability of model $M$ generating a word $w \in W$. Its value must lie between 0 and 1, and the parameters must sum to one:

$$0 \le \theta_w \le 1, \qquad \sum_{w \in W} \theta_w = 1$$
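A minimal sketch of such a model, with an assumed toy vocabulary and parameter values (not from the text), that samples a document word by word:

```python
import random

# Assumed toy parameters theta_w for each word w; these are illustrative.
theta = {"a": 0.5, "b": 0.3, "c": 0.2}

assert abs(sum(theta.values()) - 1.0) < 1e-9  # probabilities must sum to 1

def generate_document(theta, length):
    """Generate a document by sampling `length` words independently."""
    words = list(theta)
    probs = [theta[w] for w in words]
    return "".join(random.choices(words, weights=probs, k=length))

doc = generate_document(theta, 10)  # a random 10-word document, e.g. "abacaabacb"
```

Each position is drawn independently from the same distribution, which is exactly the independence assumption used in the likelihood computations below.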

## Likelihood Function

For simplicity, let's consider $W = \{0, 1\}$. We want to estimate a multinomial model that generates the document $D = $ "0101".

For this task, we consider two multinomial models, $M_1$ and $M_2$, with parameters $\theta^{(1)}$ and $\theta^{(2)}$ respectively. Since the words of $D$ are generated independently,

$$P(D \mid \theta^{(1)}) = \theta^{(1)}_0 \, \theta^{(1)}_1 \, \theta^{(1)}_0 \, \theta^{(1)}_1 = \left(\theta^{(1)}_0\right)^2 \left(\theta^{(1)}_1\right)^2$$

denotes the probability of Model 1 generating $D$, and likewise for Model 2.

Again, if we consider the two likelihoods and find

$$P(D \mid \theta^{(1)}) > P(D \mid \theta^{(2)}),$$

then Model 1 is better than Model 2 at explaining the document $D$.
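The comparison can be sketched in a few lines. The parameter values for the two models below are assumed for illustration; they are not given in the text:

```python
D = "0101"

def likelihood(D, theta):
    """P(D | theta) for a multinomial model with independently generated words."""
    p = 1.0
    for w in D:
        p *= theta[w]
    return p

theta1 = {"0": 0.5, "1": 0.5}  # Model 1 (assumed values)
theta2 = {"0": 0.1, "1": 0.9}  # Model 2 (assumed values)

p1 = likelihood(D, theta1)  # 0.5^4 = 0.0625
p2 = likelihood(D, theta2)  # (0.1 * 0.9)^2 = 0.0081
print(p1 > p2)  # True: Model 1 explains D better
```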

## Maximum Likelihood Estimate

Consider the vocabulary $W = \{a, b, \dots, z\}$ of the 26 English letters.

Our model $M$ needs only 25 free parameters to express the probability of each letter, since the constraint $\sum_{w \in W} \theta_w = 1$ determines the last one.

Let $\theta^*$ be the parameters of the maximum likelihood model $M^*$; then:

$$\theta^* = \arg\max_{\theta} P(D \mid \theta)$$

## MLE for Multinomial Distribution

Let $n_w$ denote the number of occurrences of word $w$ in the document $D$, and let

$$P(D \mid \theta) = \prod_{w \in W} \theta_w^{\,n_w}$$

be the probability of $D$ being generated by the simple model described above.

## Stationary Points of the Lagrange Function

Maximizing $P(D \mid \theta)$ is equivalent to maximizing its logarithm, the log-likelihood:

$$\log P(D \mid \theta) = \sum_{w \in W} n_w \log \theta_w$$

We know that:

$$\sum_{w \in W} \theta_w = 1$$

Define the Lagrange function:

$$L(\theta, \lambda) = \sum_{w \in W} n_w \log \theta_w + \lambda \left( \sum_{w \in W} \theta_w - 1 \right)$$

Then, find the stationary points of $L$ by solving the equation

$$\frac{\partial L}{\partial \theta_w} = \frac{n_w}{\theta_w} + \lambda = 0$$

for all $w \in W$. This gives $\theta_w = -n_w/\lambda$; substituting into the constraint yields $\lambda = -\sum_{w'} n_{w'}$, and therefore the maximum likelihood estimate is

$$\hat{\theta}_w = \frac{n_w}{\sum_{w' \in W} n_{w'}}$$
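The closed-form solution $\hat{\theta}_w = n_w / n$ (counts divided by document length) can be checked numerically. The example document and the alternative parameter setting below are assumed for illustration:

```python
import math
from collections import Counter

D = "abacabaa"  # assumed example document
counts = Counter(D)  # n_w for each word w
n = len(D)

def log_likelihood(theta):
    """sum_w n_w * log(theta_w)"""
    return sum(counts[w] * math.log(theta[w]) for w in counts)

# The MLE: normalized counts.
theta_mle = {w: counts[w] / n for w in counts}  # {'a': 0.625, 'b': 0.25, 'c': 0.125}

# Any other valid parameter setting should score lower.
theta_other = {"a": 0.5, "b": 0.3, "c": 0.2}
print(log_likelihood(theta_mle) > log_likelihood(theta_other))  # True
```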

## Predictions of a Generative Multinomial Model

Also, suppose that we classify a new document $D$ to belong to the positive class iff:

$$\log \frac{P(D \mid \theta^+)}{P(D \mid \theta^-)} \ge 0$$

Expanding the likelihoods, the document is classified as positive iff

$$\sum_{w \in W} n_w \log \frac{\theta_w^+}{\theta_w^-} \ge 0$$

The generative classifier $M$ can therefore be shown to be equivalent to a linear classifier given by

$$\sum_{w \in W} n_w \hat{w}_w \ge 0, \qquad \text{where } \hat{w}_w = \log \frac{\theta_w^+}{\theta_w^-}$$
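A sketch of this equivalence over the toy vocabulary $W = \{0, 1\}$; the per-class parameters are assumed for illustration:

```python
import math

theta_pos = {"0": 0.2, "1": 0.8}  # assumed positive-class parameters
theta_neg = {"0": 0.7, "1": 0.3}  # assumed negative-class parameters

def classify_generative(D):
    """Positive iff log P(D|theta+) - log P(D|theta-) >= 0."""
    score = sum(math.log(theta_pos[w]) - math.log(theta_neg[w]) for w in D)
    return score >= 0

def classify_linear(D):
    """Same decision via a linear classifier with weights log(theta+_w / theta-_w)."""
    weights = {w: math.log(theta_pos[w] / theta_neg[w]) for w in theta_pos}
    return sum(D.count(w) * weights[w] for w in weights) >= 0

# The two classifiers agree on every document.
for D in ["0101", "1111", "0000", "110"]:
    assert classify_generative(D) == classify_linear(D)
```

The weights are fixed once the per-class parameters are estimated, so prediction reduces to a single dot product with the word counts.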

## Prior, Posterior and Likelihood

Consider a binary classification task with two labels '+' (positive) and '-' (negative).

Let $y$ denote the classification label assigned to a document $D$ by a multinomial generative model $M$ with parameters $\theta^+$ for the positive class and $\theta^-$ for the negative class.

$P(y = + \mid D)$ is the posterior distribution.

$P(y = +)$ is the prior distribution, and $P(D \mid \theta^+)$ is the likelihood.

Example: by Bayes' rule,

$$P(y = + \mid D) = \frac{P(D \mid \theta^+) \, P(y = +)}{P(D \mid \theta^+) \, P(y = +) + P(D \mid \theta^-) \, P(y = -)}$$

and

$$P(y = - \mid D) = 1 - P(y = + \mid D)$$
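The posterior can be sketched directly from Bayes' rule; all parameter and prior values below are assumed for illustration:

```python
theta_pos = {"0": 0.2, "1": 0.8}  # assumed positive-class parameters
theta_neg = {"0": 0.7, "1": 0.3}  # assumed negative-class parameters
prior_pos, prior_neg = 0.5, 0.5   # assumed class priors

def likelihood(D, theta):
    """P(D | theta) with independently generated words."""
    p = 1.0
    for w in D:
        p *= theta[w]
    return p

def posterior_pos(D):
    """P(y=+|D) via Bayes' rule."""
    num = likelihood(D, theta_pos) * prior_pos
    den = num + likelihood(D, theta_neg) * prior_neg
    return num / den

print(round(posterior_pos("11"), 4))  # 0.8767
```

With equal priors the decision reduces to comparing the likelihoods; unequal priors shift the threshold of the linear classifier derived above.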

## Gaussian Generative models

### MLE for the Gaussian Distribution

The probability density function for a Gaussian random variable is given as follows:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$$

Let $X_1, \dots, X_n$ be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$.

Then their joint probability density function is given by:

$$f(x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right) = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right)$$

Taking the logarithm of the above function, we get:

$$\log f = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2$$

### MLE for the Mean and Variance

Setting the derivatives of the log-likelihood with respect to $\mu$ and $\sigma^2$ to zero gives:

$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2$$
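These estimators are just the sample mean and the (biased) sample variance; a minimal sketch with an assumed sample:

```python
xs = [2.0, 4.0, 4.0, 6.0]  # assumed i.i.d. sample
n = len(xs)

mu_hat = sum(xs) / n                                 # MLE mean: sample average
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # MLE variance: divides by n, not n - 1

print(mu_hat, sigma2_hat)  # 4.0 2.0
```

Note the division by $n$ rather than $n - 1$: the MLE of the variance is biased, unlike the usual unbiased sample variance.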