Audit Report: Summary

This document describes how to use mlr3fairness to create audit reports for different tasks throughout the fairness exploration process.

The document has three main sections, which describe the task details, the model details, and the interpretability of the model.

Task details

In this report, we investigate the fairness of the following task:

#> <TaskClassif:compas> (6172 x 12)
#> * Target: two_year_recid
#> * Properties: twoclass
#> * Features (11):
#>   - fct (6): age_cat, c_charge_degree, is_recid, race, score_text, sex
#>   - int (5): age, days_b_screening_arrest, decile_score,
#>     length_of_stay, priors_count
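
A minimal sketch of how this task could be set up, assuming the compas task bundled with mlr3fairness (setting the protected attribute explicitly via the "pta" column role):

library(mlr3)
library(mlr3fairness)

# Load the COMPAS task shipped with mlr3fairness
task = tsk("compas")

# Mark "sex" as the protected attribute ("pta" column role)
task$col_roles$pta = "sex"
task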

General Task Documentation:

Here we list the basic details of the task.

Audit Date: 2022-09-29
Task Name: compas
Number of observations: 6172
Number of features: 11
Target Name: two_year_recid
Feature Names: age, age_cat, c_charge_degree, days_b_screening_arrest, decile_score, is_recid, length_of_stay, priors_count, race, score_text, sex
Protected Attribute: sex

Exploratory Data Analysis:

We can also report the number of missing values, the type, and the factor levels of each feature:

Feature                   Missing  Type     Levels                                                                 fix_factor_levels
two_year_recid            0        factor   0, 1                                                                   FALSE
age                       0        integer  NULL                                                                   FALSE
age_cat                   0        factor   25 - 45, Greater than 45, Less than 25                                 FALSE
c_charge_degree           0        factor   F, M                                                                   FALSE
days_b_screening_arrest   0        integer  NULL                                                                   FALSE
decile_score              0        integer  NULL                                                                   FALSE
is_recid                  0        factor   0, 1                                                                   FALSE
length_of_stay            0        integer  NULL                                                                   FALSE
priors_count              0        integer  NULL                                                                   FALSE
race                      0        factor   African-American, Asian, Caucasian, Hispanic, Native American, Other   FALSE
score_text                0        factor   High, Low, Medium                                                      FALSE
sex                       0        factor   Female, Male                                                           FALSE
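
A sketch of how these summaries could be computed directly from the task with plain mlr3 accessors:

# Missing values per column
sapply(task$data(), function(x) sum(is.na(x)))

# Feature types and factor levels
task$feature_types
task$levels()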

We first look at the label distribution.
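
One way to plot it, assuming mlr3viz is available (for a classification task, autoplot() shows the target distribution by default):

library(mlr3viz)

# Bar plot of the target distribution
autoplot(task)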

Model details

We can inspect the model that was used in resample_result:

#> <LearnerClassifRpart:classif.rpart>: Classification Tree
#> * Model: -
#> * Parameters: xval=0
#> * Packages: mlr3, rpart
#> * Predict Types:  response, [prob]
#> * Feature Types: logical, integer, numeric, factor, ordered
#> * Properties: importance, missings, multiclass, selected_features,
#>   twoclass, weights
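
A hedged sketch of how resample_result could have been produced; the learner settings match the print above, while the 3-fold cross-validation is an assumption:

learner = lrn("classif.rpart", xval = 0, predict_type = "prob")

# Resample the learner on the task
resample_result = resample(task, learner, rsmp("cv", folds = 3))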

Fairness Metrics

We can report more than one fairness metric. Keep in mind that the metrics below are the mean over all resampling iterations.

Metric                    Value
fairness.acc              0.0135012
fairness.equalized_odds   0.0158807
fairness.fnr              0.0317613
fairness.fpr              0.0000000
fairness.npv              0.0195894
fairness.ppv              0.0000000
fairness.tnr              0.0000000
fairness.tpr              0.0317613
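
These values can be reproduced by aggregating the corresponding measures over the resample result (measure ids as printed above; availability may vary across mlr3fairness versions):

# Aggregate fairness measures over all resampling iterations
measures = msrs(c("fairness.acc", "fairness.equalized_odds", "fairness.fnr",
                  "fairness.fpr", "fairness.npv", "fairness.ppv",
                  "fairness.tnr", "fairness.tpr"))
resample_result$aggregate(measures)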

We can also use visualizations to report on fairness: for example, the fairness-accuracy trade-off, the metric comparison plot, and the fairness prediction density of the classif.rpart model. For more detailed usage and examples, see the visualization vignette.
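
Hedged examples of the three plot helpers exported by mlr3fairness (exact arguments may differ across package versions):

# Fairness vs. accuracy trade-off across resampling iterations
fairness_accuracy_tradeoff(resample_result, msr("fairness.fpr"))

# Compare several fairness metrics in one plot
compare_metrics(resample_result, msrs(c("fairness.fpr", "fairness.tpr")))

# Density of predicted probabilities per protected group
fairness_prediction_density(resample_result$prediction(), task = task)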

Interpretability

Finally, we can use an external package to analyze the interpretability of the model. In the following example, we use iml as a demonstration. We first need to extract the learner from resample_result and retrain it on the task.
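
A sketch of that workflow; iml's Predictor can wrap a trained mlr3 learner, and retraining on the full task is shown for simplicity:

library(iml)

# Extract the learner from the resample result and retrain it on the full task
learner = resample_result$learners[[1]]
learner$train(task)

# Wrap the trained learner and data for iml; y names the target column
predictor = Predictor$new(learner, data = task$data(), y = "two_year_recid")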

You can then generate the variable importance plot, for example:
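
Here we use iml's permutation feature importance with the classification error ("ce") loss, an assumed choice:

# Permutation feature importance; "ce" = classification error
importance = FeatureImp$new(predictor, loss = "ce")
plot(importance)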