Cross Entropy
Intuition: the average message length (in bits) when events drawn from p are encoded with a code built for q. https://youtu.be/ErfnhcEV1O8?t=300
H(p, q) = - ∑ₓ p(x) log q(x)
p : the true distribution of events
q : the distribution used to encode events (or the model's predicted distribution); encoding event x costs -log q(x) bits
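A minimal sketch (not from the source) of the formula above: cross entropy in bits between two small discrete distributions, using base-2 logs. The example values of p and q are made up.

```python
import numpy as np

def cross_entropy(p, q):
    # -sum over x of p(x) * log2 q(x); log2 gives the result in bits.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.sum(p * np.log2(q))

p = [0.5, 0.25, 0.25]   # true distribution of events
q = [0.25, 0.25, 0.5]   # distribution the code (or model) assumes

print(cross_entropy(p, q))   # 1.75 bits on average per event
```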
H(p,q) = H(p) + DKL(p || q)
i.e. cross entropy is always at least the entropy of the true distribution, H(p), and the gap between them is the Kullback-Leibler divergence (DKL).
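A quick check of the decomposition, assuming the same made-up p and q as above: H(p) plus DKL(p || q) should equal the cross entropy H(p, q).

```python
import numpy as np

def entropy(p):
    # H(p) = -sum p(x) * log2 p(x)
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log2(p))

def kl_divergence(p, q):
    # DKL(p || q) = sum p(x) * log2(p(x) / q(x))
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log2(p / q))

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]

# H(p) = 1.5 bits, DKL(p || q) = 0.25 bits, so H(p, q) = 1.75 bits.
print(entropy(p) + kl_divergence(p, q))   # 1.75, matching cross_entropy(p, q)
```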