Me: how to characterize the difference or distance between two distribution configurations?
Bing: There are different ways to measure the difference or distance between two distribution configurations, depending on what aspects of the distributions you want to compare. Some common methods are:
Kolmogorov-Smirnov distance: This measures the maximum difference between the cumulative distribution functions of two distributions. It is a metric that ranges from 0 (identical distributions) to 1 (completely different distributions).
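For example, a minimal sketch in Python using SciPy's two-sample KS test (the sample setup below is illustrative, not from the original answer):

```python
# Two-sample Kolmogorov-Smirnov statistic via SciPy (illustrative setup).
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
sample_a = rng.normal(loc=0.0, scale=1.0, size=1000)
sample_b = rng.normal(loc=0.5, scale=1.0, size=1000)

# The statistic is the maximum absolute difference between the two
# empirical cumulative distribution functions; it lies in [0, 1].
result = ks_2samp(sample_a, sample_b)
print(result.statistic, result.pvalue)
```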
Total variation distance: This measures the maximum difference between the probabilities that two distributions assign to any event. It is also a metric that ranges from 0 (identical distributions) to 1 (completely different distributions).
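A small sketch for the discrete case (assumed probability vectors, not from the original answer), where the total variation distance reduces to half the L1 distance:

```python
# Total variation distance between two discrete distributions on the
# same support: TV(p, q) = 0.5 * sum |p_i - q_i|, which lies in [0, 1].
import numpy as np

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])

tv = 0.5 * np.abs(p - q).sum()
print(tv)
```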
Hellinger distance: This measures the Euclidean distance between the square roots of two probability density functions or mass functions. It ranges from 0 (identical distributions) to √2 (completely different distributions).
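An illustrative sketch for discrete distributions (assumed probability vectors, not from the original answer):

```python
# Hellinger distance: Euclidean distance between element-wise square
# roots of two probability vectors; it lies in [0, sqrt(2)].
import numpy as np

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])

hellinger = np.linalg.norm(np.sqrt(p) - np.sqrt(q))
print(hellinger)
```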
Kullback-Leibler divergence: This measures the amount of information lost when using one distribution to approximate another. It is not a metric, as it is not symmetric and does not satisfy the triangle inequality. It ranges from 0 (identical distributions) to infinity (very different distributions).
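A sketch for the discrete case (assumed probability vectors; SciPy's entropy function computes the same quantity):

```python
# KL divergence D(p || q) for discrete probability vectors. Note that it
# is asymmetric, and infinite wherever q is zero but p is not.
import numpy as np
from scipy.stats import entropy

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])

kl_pq = np.sum(p * np.log(p / q))
print(kl_pq, entropy(p, q))   # same value (natural log)
print(entropy(q, p))          # differs: KL is not symmetric
```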
There are many other methods, such as Wasserstein distance, Bhattacharyya distance, Jensen-Shannon divergence, etc. You can read more about them online or in textbooks on statistics or information theory.
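For instance, two of these are available in SciPy; the following sketch uses assumed toy inputs, not anything from the original answer:

```python
# Jensen-Shannon distance and 1-D Wasserstein (earth mover's) distance.
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import jensenshannon

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])

# Jensen-Shannon distance (square root of the JS divergence); with
# base 2 it lies in [0, 1].
print(jensenshannon(p, q, base=2))

# 1-D Wasserstein distance between two empirical samples.
rng = np.random.default_rng(0)
print(wasserstein_distance(rng.normal(0, 1, 500), rng.normal(0.5, 1, 500)))
```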