Posts

Showing posts from March, 2020

MSE Vs Entropy

A loss function represents the error (loss) of a machine learning (ML) model and is used to estimate its parameters. The main goal of every ML model is to minimize this error. The loss is a numerical value that tells us how poor a prediction is, considering only one example. If the prediction is perfect, the loss is 0; otherwise, the worse the prediction, the higher the model's loss/error. A loss function is also called a "cost function", "objective function", "optimization score function" or just an "error function", so don't get confused! 😀 A loss function is a useful tool by means of which we can estimate the weights and biases that suit the model best. By finding the most appropriate parameters, the model achieves a low error with respect to all data samples in the dataset. The function is called a "loss" because it penalizes the model for making poor predictions.
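To make this concrete, here is a minimal sketch (not from the original post) of one such loss, the mean squared error, computed for a handful of invented predictions and targets; lower values mean better predictions, and a perfect model would score 0.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean squared error: average squared difference between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical targets and model predictions, purely for illustration.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

print(mse_loss(y_true, y_pred))  # 0.375 -- lower is better; a perfect model scores 0
```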

Example of Cross-Entropy as a Loss Function

Cross-Entropy as a Loss Function

Let's say we have a dataset of animal images containing five different animals, and each image has only one animal in it. (Image source: https://www.freeimages.com/) Each image is labeled with the corresponding animal using one-hot encoding, and we can treat that one-hot encoding as a probability distribution for each image. Let's see a few examples: the probability of the first image being a dog is 1.0 (=100%); for the second image, the label tells us it is a fox with 100% certainty; …and so on. As such, the entropy of each label is zero. In other words, the one-hot encoded labels tell us which animal each image contains with 100% certainty. It is not as if the first image could be a dog with 90% probability and a cat with 10%; it is always a dog, and there is no surprise. Now, let's say we have a machine learning model that classifies those images. When the model has not been adequately trained, it may classify some images incorrectly.
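To make the "one-hot label as a probability distribution" idea concrete, here is a small sketch (assuming the five-animal setup above; the class order and the predicted probabilities are invented for illustration). It shows that a one-hot label has zero entropy and computes the cross-entropy between that label and a model's predicted distribution.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy H(p) = -sum(p * log(p)); the eps avoids log(0)."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum(p * log(q)) between true distribution p and prediction q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + eps))

# One-hot label for the first image: it is a dog with 100% certainty.
label = [1.0, 0.0, 0.0, 0.0, 0.0]         # hypothetical class order: dog, fox, cat, horse, bird
prediction = [0.7, 0.1, 0.1, 0.05, 0.05]  # invented model output

print(entropy(label))                    # ~0.0 -- a one-hot label carries no surprise
print(cross_entropy(label, prediction))  # ~0.357 -- only the probability assigned to "dog" matters
```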

Cross-Entropy, Hinge, Huber, Kullback-Leibler, MAE (L1), MSE (L2)

Loss Functions: Cross-Entropy, Hinge, Huber, Kullback-Leibler, MAE (L1), MSE (L2).

Cross-Entropy

Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label, so predicting a probability of 0.012 when the actual observation label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0. The graph in the post shows the range of possible loss values given a true observation (isDog = 1): as the predicted probability approaches 1, log loss slowly decreases; as the predicted probability decreases, however, log loss increases rapidly. Log loss penalizes both types of errors, but especially predictions that are confident and wrong! Cross-entropy and log loss are slightly different depending on context, but in machine learning, when calculating error rates between 0 and 1, they resolve to the same thing.
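As a rough illustration of that penalty (a sketch, not code from the post), the snippet below evaluates binary log loss for a true label isDog = 1 at the confident-wrong probability mentioned above (0.012) and at a hypothetical confident-right one (0.98).

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy / log loss for a single example; p_pred is clipped to avoid log(0)."""
    p_pred = min(max(p_pred, eps), 1 - eps)
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))

# True observation: the image is a dog (isDog = 1).
print(log_loss(1, 0.012))  # ~4.42 -- confident and wrong is penalized heavily
print(log_loss(1, 0.98))   # ~0.02 -- confident and right gives a small loss
```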