17-08-28 Interesting Facts in Machine Learning (Logistic Regression)

Category: Idea Lists (Upon Request)

1. Two major ways to do multiclass (multinomial) classification:
  1. Softmax loss
  2. One-vs.-all with the binary (logistic) function
2. Naming -
  1. “Logistic” regression, after the sigmoid (logistic) function
  2. “Softmax” regression, after the softmax function
3. No closed-form solution, despite convexity
4. Many, many optimizers:
  1. Newton / Newton-CG
  2. BFGS
    1. L-BFGS
  3. IRLS (iteratively reweighted least squares)
  4. Trust-region conjugate gradient
  5. Gradient descent
    1. GD + line search
  6. Stochastic Average Gradient (SAG)
5. Bayesian treatment is difficult (no convenient conjugate prior for the logistic likelihood)
6. Discriminative: learns P(Y|X) directly, rather than first learning the joint P(Y, X) and then conditioning on X (the generative approach)
7. Without regularization the weights can grow arbitrarily large (e.g., when the classes are linearly separable), damaging generalization, so penalties matter more here than in the linear-regression setting
8. You can get better generalization with a stochastic solver [https://arxiv.org/pdf/1708.05070.pdf]
9. Feature scaling can still matter, for the optimizer’s sake - even though the problem is convex and the same solution is reached eventually, poor conditioning slows convergence
10. Linear models generalize better than almost every other model class on unstructured data (trees and neural networks overfit more readily)
11. Every relationship between a feature and the label should be as close to linear as possible
12. You can use the Box-Cox transform to automatically get close to linear
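The two namesake functions from the naming point above can be written down directly, and with two classes softmax reduces to the sigmoid of the logit difference, which is why the binary and multinomial models coincide. A minimal numpy sketch:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: the source of the name "logistic" regression."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax over the last axis: the source of the name "softmax" regression."""
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# With K = 2 classes, softmax gives the sigmoid of the logit difference.
logits = np.array([2.0, -1.0])
p_softmax = softmax(logits)[0]
p_sigmoid = sigmoid(logits[0] - logits[1])
print(p_softmax, p_sigmoid)  # the two probabilities agree
```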
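Because there is no closed-form solution, every fit is iterative. A minimal sketch of plain gradient descent (the simplest optimizer on the list above), on made-up synthetic data with an assumed learning rate:

```python
import numpy as np

# Hypothetical synthetic data: two features, labels from a noisy linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(w):
    """Mean negative log-likelihood (the convex logistic loss)."""
    p = sigmoid(X @ w)
    eps = 1e-12  # guard the logs against p hitting exactly 0 or 1
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

w = np.zeros(2)
lr = 0.5  # assumed step size, safely below 1/L for this problem scale
losses = [nll(w)]
for _ in range(500):
    grad = X.T @ (sigmoid(X @ w) - y) / len(y)  # gradient of the mean NLL
    w -= lr * grad
    losses.append(nll(w))
print(w, losses[0], losses[-1])  # loss decreases; weight signs match true_w
```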
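The weight blow-up without regularization is easy to demonstrate on linearly separable toy data, where the unregularized maximum-likelihood estimate does not exist: gradient descent pushes the weight toward infinity, while a small L2 penalty keeps it finite. A sketch with made-up 1-D data and assumed hyperparameters:

```python
import numpy as np

# Perfectly separable 1-D toy data: y = 1 exactly when x > 0.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(l2, steps=20000, lr=0.1):
    """Gradient descent on the logistic loss with an optional L2 penalty."""
    w = 0.0
    for _ in range(steps):
        grad = np.mean((sigmoid(w * x) - y) * x) + l2 * w
        w -= lr * grad
    return w

w_unreg = fit(l2=0.0)   # keeps growing (roughly logarithmically) with steps
w_l2 = fit(l2=0.1)      # settles at a finite weight
print(w_unreg, w_l2)
```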
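The scaling point can be made concrete through conditioning: first-order solvers converge at a rate governed by the condition number of the Gram matrix, and standardizing shrinks it dramatically. A sketch with hypothetical feature names and scales:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(20, 70, size=n)        # hypothetical feature on a ~50 scale
income = rng.uniform(2e4, 2e5, size=n)   # hypothetical feature on a ~1e5 scale
X = np.column_stack([age, income])

# Condition number of the Gram matrix drives first-order convergence speed.
cond_raw = np.linalg.cond(X.T @ X)

X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column
cond_scaled = np.linalg.cond(X_std.T @ X_std)

print(cond_raw, cond_scaled)  # scaling shrinks the condition number hugely
```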
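For real use, scipy.stats.boxcox does the exponent selection; purely to illustrate the idea, here is a numpy sketch that picks the Box-Cox exponent by a grid search over the standard profile log-likelihood (a normality assumption on the transformed values), applied to hypothetical right-skewed data:

```python
import numpy as np

def boxcox(x, lambdas=np.linspace(-2, 2, 401)):
    """Minimal Box-Cox sketch: pick the exponent maximizing the profile
    log-likelihood. (scipy.stats.boxcox does this properly in practice.)"""
    log_x = np.log(x)  # requires strictly positive x
    best_lam, best_ll = None, -np.inf
    for lam in lambdas:
        y = log_x.copy() if abs(lam) < 1e-12 else (x**lam - 1) / lam
        # Profile log-likelihood under normality of the transformed values.
        ll = -0.5 * len(x) * np.log(y.var()) + (lam - 1) * log_x.sum()
        if ll > best_ll:
            best_lam, best_ll = lam, ll
    y = log_x if abs(best_lam) < 1e-12 else (x**best_lam - 1) / best_lam
    return y, best_lam

rng = np.random.default_rng(0)
x = np.exp(rng.normal(size=2000))  # log-normal: heavily right-skewed
y, lam = boxcox(x)
print(lam)  # close to 0, i.e. roughly a log transform
```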

Source: Original Google Doc
