Mathematically Proving That They Are All the Same

Introduction

Very often, data scientists and machine learning practitioners don't appreciate the mathematical and intuitive relationships between loss metrics such as Negative Log Likelihood, Cross Entropy, Maximum Likelihood Estimation, Kullback-Leibler (KL) divergence, and, most importantly, Mean Square Error. Wouldn't you be surprised if I said that KL divergence and Mean Square Error…
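As a quick preview of the kind of equivalence this article is about, here is a minimal NumPy sketch (all variable names are illustrative, not from the article) showing a standard fact: the negative log likelihood of a Gaussian model with fixed unit variance differs from the mean square error only by a scale factor and an additive constant that does not depend on the predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=100)                      # observed targets
y_pred = y_true + rng.normal(scale=0.1, size=100)  # model predictions

n = len(y_true)

# Gaussian negative log likelihood with fixed variance sigma^2 = 1:
# NLL = 0.5 * sum((y - yhat)^2) + 0.5 * n * log(2*pi)
nll = 0.5 * np.sum((y_true - y_pred) ** 2) + 0.5 * n * np.log(2 * np.pi)

mse = np.mean((y_true - y_pred) ** 2)

# NLL = (n/2) * MSE + constant, so minimizing one minimizes the other
const = 0.5 * n * np.log(2 * np.pi)
print(np.isclose(nll, 0.5 * n * mse + const))  # True
```

Because the constant term is independent of `y_pred`, minimizing the Gaussian NLL and minimizing MSE pick out exactly the same predictions.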