Learning Interpretable Features to Compare Distributions
Arthur Gretton, UCL
1-2pm 7th Jun 2017
The goal of this talk is to describe methods for testing whether two sets of samples come from the same probability distribution (a two-sample test). I will present adaptive two-sample tests which learn interpretable features to improve testing power, i.e., to increase the number of true positives. These features are used in constructing two divergence measures: the maximum mean discrepancy (MMD) and the mean embedding (ME) distance. In both cases, the key point in choosing meaningful features is that variance matters: it is not enough to have a large empirical divergence; we also need to have high confidence in the value of our divergence. The tests are demonstrated in benchmarking and troubleshooting generative models. We detect subtle differences between the distribution of model outputs and real hand-written digits which humans are unable to find (for instance, small imbalances in the proportions of certain digits, or minor distortions that are implausible in normal handwriting). We also use the linear-time ME test to distinguish positive and negative emotions on a facial expression database, showing that a distinguishing feature reveals the facial areas most relevant to emotion.
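As background for the divergences named above, the following is a minimal sketch of an unbiased empirical MMD^2 estimate between two samples, using a Gaussian kernel. The helper names and the fixed bandwidth `sigma` are illustrative assumptions; in the adaptive tests discussed in the talk, the kernel (or the ME test's feature locations) would instead be chosen to maximize test power.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel matrix between rows of a and b.
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    # Unbiased estimate of MMD^2 between samples x ~ P and y ~ Q.
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    # Drop diagonal (self-similarity) terms to keep the estimate unbiased.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(500, 2)), rng.normal(size=(500, 2)))
diff = mmd2_unbiased(rng.normal(size=(500, 2)),
                     rng.normal(loc=1.0, size=(500, 2)))
print(same, diff)
```

With both samples drawn from the same Gaussian, the estimate hovers near zero (it may be slightly negative, being unbiased); with a mean shift it is clearly positive. The "variance matters" point in the abstract is exactly about this: a test must compare the estimate against its sampling variability, not just its raw magnitude.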
Interpretable Distribution Features with Maximum Testing Power, NIPS, 2016.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy, ICLR, 2017.