...
Choosing a drift monitor for a business model depends in practice on the particular model under consideration. For example, a binary classification model is best monitored for concept drift by running a Summary test (basic statistics) instead of a 2-sample test, since there are only two possible outcomes and thus a very small range for the random variable. In addition, feature types (numerical vs. categorical, also referred to in MOC terminology as dataClass) play an important role in choosing the right monitor. Some monitors, such as Kullback-Leibler (KL) divergence, accommodate both numerical and categorical data, whereas others (usually 2-sample tests such as Kolmogorov-Smirnov or Epps-Singleton) work only on numerical features.
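As an illustration of this selection logic, the sketch below picks a monitor based on a feature's dtype. The helper name and the monitor mapping are illustrative assumptions following the text above, not part of the MOC API:

```python
import pandas as pd

def pick_drift_monitor(feature: pd.Series) -> str:
    """Return a drift-monitor name for a feature based on its data class.

    Hypothetical helper: the mapping mirrors the text above, not the MOC API.
    """
    if pd.api.types.is_numeric_dtype(feature):
        # Numerical features can use a 2-sample test.
        return "kolmogorov-smirnov"
    # Categorical features need a monitor such as KL divergence.
    return "kullback-leibler"

# Example: a categorical feature gets the KL monitor.
print(pick_drift_monitor(pd.Series(["a", "b", "a"])))  # kullback-leibler
```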
...
If the output of the Epps-Singleton test on two distributions is a p-value that is less than a certain threshold (e.g., 0.05), then we can reject the null hypothesis that the two samples come from the same underlying distribution. When applied to a feature (or a target variable) of a dataset, we can determine if there is drift between a baseline and a sample dataset in that feature (or target variable).
Remarks:
Null values in the samples will cause the Epps-Singleton test to fail. As such, null values are dropped when calculating the Epps-Singleton test.
The Epps-Singleton test will fail when there are fewer than five values in each sample. In such cases, the Epps-Singleton test will return a null metric.
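A minimal sketch of such a check, using scipy's epps_singleton_2samp with guards matching the remarks above; the wrapper itself is illustrative, not the monitor's actual implementation:

```python
import numpy as np
from scipy.stats import epps_singleton_2samp

def es_drift_check(baseline, sample):
    """Run the Epps-Singleton test with the guards described in the remarks."""
    baseline = np.asarray(baseline, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Null values would cause the test to fail, so drop them first.
    baseline = baseline[~np.isnan(baseline)]
    sample = sample[~np.isnan(sample)]
    # The test fails with fewer than five values per sample: return a null metric.
    if baseline.size < 5 or sample.size < 5:
        return None
    result = epps_singleton_2samp(baseline, sample)
    return result.pvalue  # compare against a threshold such as 0.05

rng = np.random.default_rng(0)
p = es_drift_check(rng.normal(0.0, 1.0, 200), rng.normal(0.5, 1.0, 200))
print(p is not None and p < 0.05)  # True suggests drift
```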
Kolmogorov-Smirnov 2-Sample Test
...
If the output of the Kolmogorov-Smirnov test on two distributions is a p-value that is less than a certain threshold (e.g., 0.05), then we can reject the null hypothesis that the two samples have an identical underlying distribution. When applied to a feature (or a target variable) of a dataset, we can determine if there is drift between a baseline and a sample dataset in that feature (or target variable).
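A minimal sketch using scipy's ks_2samp; the 0.05 threshold is an example, not a mandated default:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 500)  # baseline feature values
sample = rng.normal(0.3, 1.0, 500)    # shifted sample feature values

result = ks_2samp(baseline, sample)
if result.pvalue < 0.05:  # example threshold
    print(f"Drift detected: statistic={result.statistic:.3f}, p={result.pvalue:.4f}")
```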
...
Computes the Jensen-Shannon distance between two distributions, which is the square root of the Jensen-Shannon divergence metric.
The output of the Jensen-Shannon distance calculation is not a p-value, like the Epps-Singleton or Kolmogorov-Smirnov tests, but a distance. As such, there is no one-size-fits-all or universally accepted value that shows that two distributions are significantly different. However, it is useful to track over time how the distance between two distributions changes.
Remarks:
Null values in the samples will cause the Jensen-Shannon distance to fail. As such, null values are dropped when calculating the Jensen-Shannon distance.
Because the Jensen-Shannon distance attempts to fit a Gaussian KDE on the samples, an error occurs when there is little to no variance in the samples (i.e. all constant values). In such cases, the Jensen-Shannon distance will return a null metric.
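A minimal sketch of the approach described above: fit a Gaussian KDE on each sample, evaluate both densities on a shared grid, then take the Jensen-Shannon distance between them. The grid size and the guards are illustrative choices, not the monitor's actual implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.spatial.distance import jensenshannon

def js_distance(baseline, sample, grid_points=100):
    """Jensen-Shannon distance via Gaussian KDEs, with the guards noted above."""
    baseline = np.asarray(baseline, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Null values would cause the calculation to fail, so drop them first.
    baseline = baseline[~np.isnan(baseline)]
    sample = sample[~np.isnan(sample)]
    # A Gaussian KDE cannot be fit on constant samples: return a null metric.
    if np.var(baseline) == 0 or np.var(sample) == 0:
        return None
    # Evaluate both density estimates on a shared grid, then take the distance.
    grid = np.linspace(min(baseline.min(), sample.min()),
                       max(baseline.max(), sample.max()), grid_points)
    p = gaussian_kde(baseline)(grid)
    q = gaussian_kde(sample)(grid)
    return jensenshannon(p, q)  # square root of the JS divergence
```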
Kullback-Leibler Divergence
https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.kl_div.html
Computes the Kullback-Leibler divergence metric (also called relative entropy) between two distributions. It does so by bucketing the samples, computing the element-wise Kullback-Leibler divergence per bucket, then summing over the buckets for the final divergence metric. Because the Kullback-Leibler divergence is asymmetric, the order in which the samples are input into the calculation might produce slightly differing results.
The output of the Kullback-Leibler divergence calculation is not a p-value (like the Epps-Singleton and Kolmogorov-Smirnov tests), nor is it a distance (like the Jensen-Shannon distance), but rather a metric that informs how divergent two distributions might be. Like the Jensen-Shannon distance, there is no one-size-fits-all or universally accepted value to determine if two distributions are significantly different, but the Kullback-Leibler divergence provides one more option for detecting possible drift.
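A minimal sketch of the bucketing approach described above, using scipy.special.kl_div; the bin count is an illustrative choice:

```python
import numpy as np
from scipy.special import kl_div

def kl_divergence(baseline, sample, bins=20):
    """Bucket both samples over a common range, then sum element-wise kl_div."""
    lo = min(np.min(baseline), np.min(sample))
    hi = max(np.max(baseline), np.max(sample))
    p, _ = np.histogram(baseline, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(sample, bins=bins, range=(lo, hi), density=True)
    # kl_div is element-wise and returns inf in buckets where q == 0 but p > 0,
    # i.e. when one sample's support is not contained in the other's.
    return np.sum(kl_div(p, q))
```

Note the asymmetry: kl_divergence(a, b) and kl_divergence(b, a) generally differ, which is why the order of the samples matters.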
Remarks:
It is possible that the Kullback-Leibler Divergence will return a value of Inf (when the support of one sample is not contained within the support of the other sample, or when one sample distribution has a much “wider tail” than the other). In such cases, the order of the samples will be reversed and the Kullback-Leibler Divergence will be recalculated (with an appropriate logger.warning raised). However, in the case that even the reversed order of samples returns Inf, the Kullback-Leibler Divergence will return a null metric.
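A sketch of this Inf handling, reusing the kl_divergence helper from the earlier sketch; the logger name and warning message are illustrative:

```python
import logging
import numpy as np

logger = logging.getLogger(__name__)

def kl_divergence_or_null(baseline, sample, bins=20):
    """Apply the Inf handling described in the remarks above."""
    metric = kl_divergence(baseline, sample, bins=bins)
    if np.isinf(metric):
        # Reverse the sample order and recalculate, warning as the text describes.
        logger.warning("KL divergence returned Inf; reversing sample order.")
        metric = kl_divergence(sample, baseline, bins=bins)
    # If even the reversed order returns Inf, fall back to a null metric.
    return None if np.isinf(metric) else metric
```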
Model Assumptions
Business models considered for drift monitoring have a couple of requirements:
...