Cross-Entropy
The cross-entropy between two distributions $p$ and $q$ is defined as

$$H(p, q) = -\sum_{x} p(x) \log q(x).$$
Since the cross-entropy is just the latter term of the expanded KL divergence, it also measures how different $q$ is from $p$. In other words, $H(p, q) = H(p) + D_{\mathrm{KL}}(p \,\|\, q)$, where the entropy $H(p) = -\sum_x p(x) \log p(x)$ does not depend on $q$. Minimizing the cross-entropy with respect to $q$ is therefore equivalent to minimizing the KL divergence, because $q$ does not appear in the omitted term.
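To make the decomposition concrete, here is a minimal Python sketch that checks $H(p, q) = H(p) + D_{\mathrm{KL}}(p \,\|\, q)$ numerically; the distributions `p` and `q` are arbitrary illustrative values, not taken from the text.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_x p(x) log p(x), in nats."""
    return -sum(px * math.log(px) for px in p if px > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_x p(x) log q(x), in nats."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) = sum_x p(x) log(p(x) / q(x)), in nats."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Example distributions over three outcomes (hypothetical values).
p = [0.7, 0.2, 0.1]   # "actual" distribution
q = [0.5, 0.3, 0.2]   # "approximate" distribution

# H(p, q) equals H(p) + D_KL(p || q); the entropy term does not depend on q.
print(cross_entropy(p, q))               # ~0.8869
print(entropy(p) + kl_divergence(p, q))  # same value
```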
Whether we use the KL divergence or the cross-entropy, $p$ is usually the actual/data-based/precise distribution, while $q$ is the theoretical/approximate distribution. So if we want the theoretical distribution to be as close to the real distribution as possible, we need to minimize the cross-entropy or the KL divergence. In these cases, the cross-entropy can be seen as an “error”, or information loss, incurred by the theoretical/approximate distribution relative to the actual/precise distribution.
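As a sketch of that equivalence (not tied to any particular framework; the softmax parameterization, learning rate, and step count below are arbitrary choices), one can fit $q$ to a fixed $p$ by gradient descent on the cross-entropy alone and watch the KL divergence shrink toward zero while $H(p)$ stays fixed.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]    # fixed "actual" distribution (hypothetical values)
z = [0.0, 0.0, 0.0]    # logits parameterizing the approximate distribution q
lr = 0.5               # arbitrary learning rate

for step in range(200):
    q = softmax(z)
    # Gradient of H(p, q) with respect to the logits z is simply q - p.
    z = [zi - lr * (qi - pi) for zi, qi, pi in zip(z, q, p)]

q = softmax(z)
# Minimizing the cross-entropy over q drives q toward p, so the gap
# H(p, q) - H(p) = D_KL(p || q) shrinks toward zero.
print(q)                    # approximately [0.7, 0.2, 0.1]
print(kl_divergence(p, q))  # close to 0
print(cross_entropy(p, q))  # approaches H(p) ~ 0.8018
```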