Binary classification
Problem description:
Determine whether a person makes over 50K a year
Logistic regression
- Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable
- Step 1: Function Set fw,b(x)=σ(∑iwixi+b) Sigmoid function: σ(z)=11+e−z
- Step 2: Goodness of a Function Training data: (xn,ˆ(y)n) L(f)=∑nC(f(xn),ˆ(y)n)
- Step 3: Find the best function wi←wi−η∑n(ˆ(y)n−fw,b(xn))xni
Generative model
- In General, A Discriminative model models the decision boundary between the classes. A Generative Model explicitly models the actual distribution of each class.
- Step 1: Find μ1,μ1,Σ1,Σ2
- Step 2: Compute the corresponding w,b wT=(μ1−μ2)TΣ−1 b=−12(μ1)T(Σ1)−1μ1+12(μ2)T(Σ2)−1μ2+lnN1N2
Result
- Achieved 97/402 (Top 25%) rank in the Kaggle competition