Critical Values Analysis
- mlbugdetection.critical_values.find_critical_values(model, sample, feature: str, start: int, stop: int, step: float = 1, keep_n: int = 3)
- Critical Values Finder
Finds the highest changes (positive or negative) in predict_proba over a specified interval [start, stop].
- Parameters
model (sklearn model or str) – A scikit-learn model that has already been trained and tested. Can be a model object or a path to a model file.
sample (pandas DataFrame) – A single row of the dataframe that will be used for the analysis.
feature (str) – Feature of dataframe that will be analysed.
start (int) – The starting value of the feature’s interval.
stop (int) – The end value of the feature’s interval.
step (float, default=1) – Size of the step between “start” and “stop”. Ex: step = 0.1 between 0 and 1 will result in [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9].
keep_n (int, default=3) – Number of values to keep in each list.
- Returns
AnalysisReport object with the following attributes. For more information:
>>> from mlbugdetection.analysis_report import AnalysisReport
>>> help(AnalysisReport)
model_name (str) – Name of the model being analysed.
analysed_feature (str) – Name of the feature being analysed.
feature_range (tuple) – Range of values of the feature being analysed: (start, stop).
metrics (dictionary) – Dictionary with all the calculated metrics, such as:
- 'positive_changes_ranges' (List)
List of feature ranges that resulted in the biggest positive changes in the model’s prediction probability.
- 'positive_changes_proba' (List)
List of the biggest positive variations in the model’s prediction probability.
- 'negative_changes_ranges' (List)
List of feature ranges that resulted in the biggest negative changes in the model’s prediction probability.
- 'negative_changes_proba' (List)
List of the biggest negative variations in the model’s prediction probability.
- 'classification_change_ranges' (List)
List of feature ranges that resulted in a change of the model’s classification.
- 'classification_change_proba' (List)
List of prediction probability values before and after the classification change.
graphs (List) – List of all the figures created.
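The sweep described above can be illustrated without the library. The following is a minimal sketch under stated assumptions, not mlbugdetection's implementation: `toy_proba` is a hypothetical stand-in for a model's positive-class probability, `feature_grid` and `sweep_changes` are illustrative helper names, the dictionary keys are made up for the example, and the 0.5 classification threshold is an assumption. It builds the grid of feature values from `start`, `stop` and `step`, evaluates the probability at each point, and reports the ranges with the largest changes along with any range where the predicted class flips.

```python
import math


def toy_proba(x):
    """Hypothetical stand-in for a model's positive-class probability."""
    return 1.0 / (1.0 + math.exp(-(x - 4.5)))


def feature_grid(start, stop, step=1):
    """Values from start (inclusive) up to stop (exclusive), spaced by step."""
    n = math.ceil(round((stop - start) / step, 9))
    # Multiply rather than accumulate to limit floating-point drift.
    return [round(start + i * step, 10) for i in range(n)]


def sweep_changes(proba_fn, start, stop, step=1, keep_n=3, threshold=0.5):
    """Largest probability changes between neighbouring grid points."""
    xs = feature_grid(start, stop, step)
    probs = [proba_fn(x) for x in xs]
    pairs = list(zip(xs, xs[1:]))
    # deltas[i] is the change in probability across pairs[i].
    deltas = [b - a for a, b in zip(probs, probs[1:])]
    ranked = sorted(range(len(deltas)), key=lambda i: deltas[i])
    top_neg = [i for i in ranked[:keep_n] if deltas[i] < 0]
    top_pos = [i for i in ranked[::-1][:keep_n] if deltas[i] > 0]
    # A flip is a pair whose endpoints fall on opposite sides of the threshold.
    flips = [i for i in range(len(deltas))
             if (probs[i] >= threshold) != (probs[i + 1] >= threshold)]
    return {
        "positive_ranges": [pairs[i] for i in top_pos],
        "positive_changes": [deltas[i] for i in top_pos],
        "negative_ranges": [pairs[i] for i in top_neg],
        "negative_changes": [deltas[i] for i in top_neg],
        "class_flip_ranges": [pairs[i] for i in flips],
    }


result = sweep_changes(toy_proba, start=0, stop=10, step=1, keep_n=1)
```

Because the toy curve only rises, the negative lists come back empty here; a real model may produce entries in both directions.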
- mlbugdetection.critical_values.find_several_critical_values(model, samples, feature: str, start: int, stop: int, step: float = 1, bins: int = 15, keep_n: int = 5, log: bool = False)
- Critical Values Finder in Several Samples
Finds the mean, median, standard deviation, and variance of the critical values found in the samples over a specified interval [start, stop].
- Parameters
model (sklearn model or str) – A scikit-learn model that has already been trained and tested. Can be a model object or a path to a model file.
samples (pandas DataFrame) – Two or more rows of the dataframe that will be used for the analysis.
feature (str) – Feature of dataframe that will be analysed.
start (int) – The starting value of the feature’s interval.
stop (int) – The end value of the feature’s interval.
step (float, default=1) – Size of the step between “start” and “stop”. Ex: step = 0.1 between 0 and 1 will result in [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9].
bins (int, default=15) – Number of equal-width bins in the histogram range.
keep_n (int, default=5) – Number of the highest values used to calculate the mean, median, std, and var.
log (bool, default=False) – If True, the histogram axis will be set to a log scale.
- Returns
AnalysisReport object with the following attributes. For more information:
>>> from mlbugdetection.analysis_report import AnalysisReport
>>> help(AnalysisReport)
model_name (str) – Name of the model being analysed.
analysed_feature (str) – Name of the feature being analysed.
feature_range (tuple) – Range of values of the feature being analysed: (start, stop).
metrics (dictionary) – Dictionary with all the calculated metrics, such as:
- 'positive_means' (dictionary)
Contains the following:
- 'mean' (float)
Mean of all the positive change means.
- 'median' (float)
Median of all the positive change means.
- 'std' (float)
Standard deviation of all the positive change means.
- 'var' (float)
Variance of all the positive change means.
- 'negative_means' (dictionary)
Contains the following:
- 'mean' (float)
Mean of all the negative change means.
- 'median' (float)
Median of all the negative change means.
- 'std' (float)
Standard deviation of all the negative change means.
- 'var' (float)
Variance of all the negative change means.
graphs (List) – List of all the figures created.
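One plausible reading of the aggregated metrics is: take the mean of the `keep_n` largest changes in each sample, then summarise those per-sample means. The sketch below follows that reading using only the standard library. It is an assumption-laden illustration, not the library's code: `top_mean` and `summarize` are hypothetical helpers, the input numbers are invented, and the choice of population (rather than sample) standard deviation and variance is a guess.

```python
import statistics


def top_mean(changes, keep_n=5):
    """Mean of the keep_n largest values in one sample's list of changes."""
    top = sorted(changes, reverse=True)[:keep_n]
    return statistics.mean(top)


def summarize(values):
    """Mean, median, standard deviation and variance of a list of floats."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Population statistics are an assumption made for this sketch.
        "std": statistics.pstdev(values),
        "var": statistics.pvariance(values),
    }


# Invented positive changes for three samples.
per_sample_changes = [
    [0.10, 0.30, 0.20],
    [0.40, 0.20],
    [0.30, 0.30, 0.30],
]
per_sample_means = [top_mean(c, keep_n=2) for c in per_sample_changes]
positive_means = summarize(per_sample_means)
```

The same two steps would apply to the negative changes, with the sort direction reversed so that the most negative values are kept.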
- mlbugdetection.critical_values.highest_and_lowest_indexes(predictions: list, keep_n: int = 3)
- Return indexes of highest changes (positive or negative) in predictions
- Parameters
predictions (list) – Array that contains the predictions to be analysed.
keep_n (int, default=3) – Number of values to keep in each list.
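The return format of this function is not documented above, so the following is only a sketch of the idea, assuming that a "change" is the difference between consecutive predictions and that two lists of indexes are returned. `change_indexes` is a hypothetical name, not the library function.

```python
def change_indexes(predictions, keep_n=3):
    """Indexes of the largest rises and largest drops between neighbours."""
    # diffs[i] is the change from predictions[i] to predictions[i + 1].
    diffs = [b - a for a, b in zip(predictions, predictions[1:])]
    order = sorted(range(len(diffs)), key=lambda i: diffs[i])
    # Keep only genuine drops and rises, so zero changes are never reported.
    lowest = [i for i in order[:keep_n] if diffs[i] < 0]
    highest = [i for i in order[::-1][:keep_n] if diffs[i] > 0]
    return highest, lowest


highest, lowest = change_indexes([0.1, 0.5, 0.4, 0.9, 0.2], keep_n=2)
```

Here index `i` refers to the step from `predictions[i]` to `predictions[i + 1]`, so `highest` starts with the step that has the largest increase and `lowest` with the step that has the largest decrease.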