Revisiting MDP Fundamentals

Policy Function and Value Function The goal of the optimal policy function is to maximize the expected discounted reward, even if this means taking actions that would lead to lower immediate next-step rewards from few states. Recall that from the previous lecture that for all s, the (optimal) value function is: where Estimating Transition Probabilities … Read more

Generative vs Discriminative models

Generative models model the probability distribution of each classDiscriminative models learn the decision boundary between the classes Simple Multinomial Generative model denotes the probability of model M choosing a word w. its value must lie between Likelihood Function For simplicity let’s consider W={0,1}. We want to estimate a multinomial model to generate a document D=”0101″. … Read more