Bias Audit and Mitigation in Julia with Fairness.jl

Ashrya Agrawal

JuliaCon 2021

Slides also available at

www.ashrya.in/juliacon

So, who's the culprit here?

So close,

yet so far :-/

Why is there so much fuss about it? 


Can't you simply remove the protected attribute from the dataset?

No!

Zip codes are a proxy for race!

Protected attributes like race can also be inferred indirectly by the model (a toy sketch follows the examples below).

  • In a major healthcare algorithm, prior healthcare expenditure served as a proxy for race, leading to reduced spending on the Black population.
  • Spending on pedicures, beauty products, etc. serves as an indicator of gender.
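
As a toy, purely synthetic sketch (hypothetical numbers, not real data) of how a proxy leaks the protected attribute even after that attribute is dropped:

using Statistics

# Synthetic toy data: two zip codes, strongly segregated by a binary "race" flag
n = 10_000
race = rand(n) .< 0.5                     # hypothetical protected attribute
zipcode = [r ? (rand() < 0.9 ? 1 : 2) : (rand() < 0.9 ? 2 : 1) for r in race]

# Guessing race from the zip code alone is right ~90% of the time here,
# so a model trained without the race column still effectively sees it
guessed_race = zipcode .== 1
mean(guessed_race .== race)               # ≈ 0.9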

How hard could it really be?

1. How would you define fairness for your problem? There are numerous fairness definitions, and a model that is fair with respect to one definition might be unfair with respect to another.

2. Fairness can come at the cost of accuracy. How do you stay fair while minimizing that cost?

3. Different countries have different disadvantaged groups, so the system must handle those differences.

4. A fairness algorithm could even end up making the model more unfair!

Fairness.jl

  • Bias audit and mitigation toolkit
    • Collection of standard datasets
    • Model-agnostic and metric-agnostic fairness algorithms
    • Framework for extensive benchmarking of fairness algorithms
  • Bias can be mitigated at different steps in the pipeline. Depending on the step, fairness algorithms are categorized as follows (a minimal sketch appears after this list):
    • Preprocessing    (modifies input/training data)
    • Inprocessing      (changes training procedure)
    • Postprocessing  (alters/flips predictions directly)
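
As a rough sketch of how these categories compose (assuming MLJ's built-in ConstantClassifier as a stand-in for any probabilistic classifier; the wrapper names are the ones used later in this talk):

using Fairness, MLJ

# Stand-in for any probabilistic MLJ classifier
model = ConstantClassifier()

# Preprocessing: resamples/reweights the training data per group
preprocessed = ReweighingSamplingWrapper(classifier=model, grp=:race)

# Postprocessing: adjusts final predictions via a linear program on the chosen measure
postprocessed = LinProgWrapper(classifier=preprocessed, grp=:race,
                               measure=false_positive_rate)

Wrappers are themselves MLJ models, so they can be nested and passed to machine or evaluate like any other model.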

There is no one-size-fits-all definition of fairness

In criminal justice, you might want to use the false positive rate, while in university admissions you could use recall parity

  • Provides numerous observational fairness metrics along with common aliases:
TruePositive, TrueNegative, FalsePositive, FalseNegative, TruePositiveRate, 
TrueNegativeRate, FalsePositiveRate, FalseNegativeRate, FalseDiscoveryRate, 
Precision, NPV, TPR, TNR, FPR, FNR, FDR, PPV,
truepositive, truenegative, falsepositive, falsenegative,
true_positive, true_negative, false_positive, false_negative,
truepositive_rate, truenegative_rate, falsepositive_rate,
true_positive_rate, true_negative_rate, false_positive_rate,
falsenegative_rate, negativepredictive_value, false_negative_rate, 
negative_predictive_value, positivepredictive_value, positive_predictive_value,
tpr, tnr, fpr, fnr, falsediscovery_rate, false_discovery_rate, fdr, npv, ppv,
recall, sensitivity, hit_rate, miss_rate, specificity, selectivity, f1score, fallout
PredictedPositiveRate, PP, predicted_positive_rate,
ppr, FalseOmissionRate, FOR, false_omission_rate, foar
TruePositiveRateDifference, TPRD, true_positive_rate_difference, tprd, 
FalsePositiveRateDifference, FPRD, false_positive_rate_difference, fprd, 
FalseNegativeRateDifference, FNRD, false_negative_rate_difference, fnrd, 
FalseDiscoveryRateDifference, FDRD, false_discovery_rate_difference, fdr, 
FalsePositiveRateRatio, FPRR, false_positive_rate_ratio, fprr, FalseNegativeRateRatio, 
FNRR, false_negative_rate_ratio, fnrr, FalseOmissionRateRatio, FORR, false_omission_rate_ratio, 
forr, FalseDiscoveryRateRatio, FDRR, false_discovery_rate_ratio, fdrr, AverageOddsDifference, AOD, 
average_odds_difference, aod

… and many more are in progress under JSoC '21
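
As a quick, hedged illustration of the metric-agnostic design (constructor forms taken from the example on the next slide), the same audit can swap in whichever metric fits the problem, e.g. recall parity for admissions instead of false-positive-rate parity:

# Either of these can be passed to MLJ's evaluate via measures=[...]
Disparity(true_positive_rate, refGrp="Caucasian", grp=:race)   # recall (TPR) parity
MetricWrapper(false_positive_rate, grp=:race)                  # per-group FPR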

using Fairness, MLJ

# COMPAS recidivism dataset shipped with Fairness.jl
X, y = @load_compas

@load NeuralNetworkClassifier

model = @pipeline ContinuousEncoder NeuralNetworkClassifier

# Preprocessing: resample the training data to balance the groups
wrappedModel = ReweighingSamplingWrapper(
	classifier=model,
	grp=:race)

# Postprocessing: adjust predictions via a linear program on the chosen measure
wrappedModel2 = LinProgWrapper(
	classifier=wrappedModel,
	grp=:race,
	measure=false_positive_rate)

# Cross-validated audit: per-group FPR disparity and per-group accuracy
evaluate(wrappedModel2,
  X, y,
  measures=[
    Disparity(
    	false_positive_rate,
    	refGrp="Caucasian",
    	grp=:race),
    MetricWrapper(
    	accuracy,
    	grp=:race)])
                                
Evaluating over 6 folds: 100%[=========================] Time: 0:03:26
(measure = [FalsePositiveRate @609_disparity, Accuracy @043],
 measurement = [Dict("Caucasian" => 1.0,
                     "Asian" => 0.6197066145130318,
                     "Other" => 1.2464987907810878,
                     "overall" => 1.114471136936518,
                     "Native American" => 0.28708622756683033,
                     "African-American" => 1.1550342115441452,
                     "Hispanic" => 1.3381981057630148),
                Dict("Caucasian" => 0.5639544393597365,
                     "overall" => 0.591280830678637,
                     "Asian" => 0.721031746031746,
                     "Other" => 0.5373151188558659,
                     "Native American" => 0.7666666666666666,
                     "African-American" => 0.6263915433136312,
                     "Hispanic" => 0.5145313984977116)],
 per_fold = [[Dict("Caucasian" => 1.0,
                   "Asian" => 0.8063380281690136,
                   "Other" => 0.6143527833668678,
                   "overall" => 0.8997877996455554,
                   "Native American" => 0.8063380281690136,
                   "African-American" => 0.8875968992248061,
                   "Hispanic" => 0.7949811545328308),
              Dict("Caucasian" => 1.0,
                   "Asian" => 1.3742690058479528,
                   "Other" => 1.0923676713150396,
                   "overall" => 1.0623461064943058,
                   "Native American" => 0.9161793372319684,
                   "African-American" => 1.0983934433357248,
                   "Hispanic" => 1.109059197701857),
              Dict("Caucasian" => 1.0,
                   "Asian" => 0.0,
                   "Other" => 1.862348178137652,
                   "overall" => 1.4039028817789883,
                   "Native American" => 0.0,
                   "African-American" => 1.6301855326245573,
                   "Hispanic" => 1.8282051282051288),
              Dict("Caucasian" => 1.0,
                   "Asian" => 0.45599999999999996,
                   "Other" => 1.1551999999999998,
                   "overall" => 1.066387434554974,
                   "Native American" => 0.0,
                   "African-American" => 1.0915275590551181,
                   "Hispanic" => 1.2377142857142858),
              Dict("Caucasian" => 1.0,
                   "Asian" => 1.0816326530612241,
                   "Other" => 1.802721088435374,
                   "overall" => 1.2562119584675977,
                   "Native American" => 0.0,
                   "African-American" => 1.2457714603351067,
                   "Hispanic" => 1.9355531686358753),
              Dict("Caucasian" => 1.0,
                   "Asian" => 0.0,
                   "Other" => 0.9520030234315947,
                   "overall" => 0.9981906406776871,
                   "Native American" => 0.0,
                   "African-American" => 0.9767303746895583,
                   "Hispanic" => 1.1236756997881119)],
             [Dict("Caucasian" => 0.5367088607594936,
                   "overall" => 0.6145833333333334,
                   "Asian" => 0.75,
                   "Other" => 0.6615384615384615,
                   "Native American" => 0.6,
                   "African-American" => 0.6591304347826087,
                   "Hispanic" => 0.6296296296296297),
              Dict("Caucasian" => 0.48866498740554154,
                   "overall" => 0.5377932232841007,
                   "Asian" => 0.14285714285714285,
                   "Other" => 0.43859649122807015,
                   "Native American" => 0.5,
                   "African-American" => 0.6037099494097807,
                   "Hispanic" => 0.41935483870967744),
              Dict("Caucasian" => 0.6598984771573604,
                   "overall" => 0.635968722849696,
                   "Asian" => 1.0,
                   "Other" => 0.6142857142857143,
                   "Native American" => 1.0,
                   "African-American" => 0.6348408710217756,
                   "Hispanic" => 0.5402298850574713),
              Dict("Caucasian" => 0.5412621359223301,
                   "overall" => 0.5873153779322329,
                   "Asian" => 0.8333333333333334,
                   "Other" => 0.54,
                   "Native American" => 1.0,
                   "African-American" => 0.634974533106961,
                   "Hispanic" => 0.4891304347826087),
              Dict("Caucasian" => 0.6481012658227848,
                   "overall" => 0.6020851433536055,
                   "Asian" => 0.6,
                   "Other" => 0.45161290322580644,
                   "Native American" => 0.5,
                   "African-American" => 0.612736660929432,
                   "Hispanic" => 0.45544554455445546),
              Dict("Caucasian" => 0.509090909090909,
                   "overall" => 0.5699391833188532,
                   "Asian" => 1.0,
                   "Other" => 0.5178571428571429,
                   "Native American" => 1.0,
                   "African-American" => 0.6129568106312292,
                   "Hispanic" => 0.5533980582524272)]],
 per_observation = [missing, missing],
 fitted_params_per_fold =
     [(fitresult = [[[0.4811044092793654,
                      0.06147223597164733,
                      0.5278109857658507,
                      0.3046377224179335,
                      0.018528882735561157,
                      0.0002801118924418782],
                     [0.303965142150392,
                      0.6904750642640544,
                      0.4427945875213156,
                      0.3431853738004899,
                      0.30956315697359355,
                      0.2567525971191049]],
                    (predict = Node{Machine{Pipeline402}} @279,),
                    [0, 1]],),
      (fitresult = [[[0.5588051604392756,
                      0.9556043818169735,
                      0.30639748848311293,
                      0.348612754649609,
                      0.4365068766941083,
                      0.4043445298515746],
                     [0.6882039866276403,
                      0.9920755748997635,
                      0.5849444936357443,
                      0.6792907535014425,
                      0.6099969142272105,
                      0.7289919307542917]],
                    (predict = Node{Machine{Pipeline402}} @725,),
                    [0, 1]],),
      (fitresult = [[[0.40708147682432333,
                      0.5472590139999254,
                      0.00955587838555424,
                      0.5293113191526267,
                      0.3552522591071384,
                      0.4478887177634008],
                     [0.2977844036279724,
                      0.7682509248063255,
                      0.05509930942045507,
                      0.49756351369147145,
                      0.035100375228515814,
                      0.49015757611103855]],
                    (predict = Node{Machine{Pipeline402}} @689,),
                    [0, 1]],),
      (fitresult = [[[0.999925812049108,
                      0.9802483169919184,
                      0.2054159403529095,
                      0.839965280106501,
                      0.49298770252097157,
                      0.8249553395297932],
                     [0.4922045406296876,
                      0.5332368553861355,
                      0.40410283450061674,
                      0.6090512967687178,
                      0.5041069457300733,
                      0.6026248260664203]],
                    (predict = Node{Machine{Pipeline402}} @717,),
                    [0, 1]],),
      (fitresult = [[[0.2500591412753533,
                      0.14463186095615735,
                      0.000675110602308825,
                      0.996812375697747,
                      0.06304608072440798,
                      0.9883761052997926],
                     [0.3944706843277948,
                      0.6190896139428711,
                      0.3552048695305871,
                      0.8320196710497626,
                      0.3052623265756199,
                      0.7936049831530788]],
                    (predict = Node{Machine{Pipeline402}} @829,),
                    [0, 1]],),
      (fitresult = [[[0.518636286340774,
                      0.1877741203729742,
                      0.5192511278401304,
                      0.9747796181144148,
                      0.06419120209953995,
                      0.40781493654940687],
                     [0.4702239197114813,
                      0.6825522720605911,
                      0.5473230730072678,
                      0.7427667489585572,
                      0.22103521074555144,
                      0.5441393848904584]],
                    (predict = Node{Machine{Pipeline402}} @768,),
                    [0, 1]],)],
 report_per_fold = [nothing, nothing, nothing, nothing, nothing, nothing])

Fairness.jl workflow

Result: fairness improves while accuracy decreases, but the decrease is low

Finer control is also possible

# Manual train/test split instead of evaluate's automatic resampling
train, test = partition(
	eachindex(y), 0.7, shuffle=true)

mach = machine(wrappedModel2, X, y)

fit!(mach, rows=train)

ŷ = predict(mach, rows=test)

# Fairness tensor built from predictions, ground truth, and group labels
ft = fair_tensor(ŷ, y[test],
		X[test, :race])

# Per-group disparity of each metric relative to the reference group
df_disparity = disparity(
  [accuracy, false_positive_rate],
  ft, refGrp="Caucasian",
  func=(x, y)->(x - y)/y
)

# Check each disparity value against a custom parity criterion
parity(df_disparity,
  func= (x) -> abs(x) < 0.4
)

Our research on generalization in fairness

Plots of fairness ratios (vertical axes) against accuracy ratios (horizontal axes) using random forest classifiers, showing the failure of fairness algorithms to generalize out of distribution (OOD).

How did the Julia ecosystem help us?

  • Speed (goes without saying), flexibility, and composability (multiple dispatch).

  • We leverage MLJ's interfaces and structure, which lets us study 150+ ML models with various fairness algorithms and datasets.

  • The JuMP ecosystem made it easy to formulate optimization problems with fairness constraints.

  • For our research, we needed to create an extensive benchmark of fairness algorithms. Distributed.jl made the distributed computing very easy (a minimal sketch follows below).
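
A minimal, hypothetical sketch of that benchmarking pattern, reusing only names that appear in this talk (@load_compas, ReweighingSamplingWrapper, LinProgWrapper, MetricWrapper) plus MLJ's ConstantClassifier as a stand-in model; the real benchmark sweeps far more models and datasets:

using Distributed
addprocs(2)                                  # start two local worker processes
@everywhere using Fairness, MLJ

# Audit one wrapped model; each call can run on a different worker
@everywhere function audit(make_wrapper)
    X, y = @load_compas
    wrapped = make_wrapper(ConstantClassifier())
    evaluate(wrapped, X, y,
             measures=[MetricWrapper(accuracy, grp=:race)])
end

wrappers = [
    m -> ReweighingSamplingWrapper(classifier=m, grp=:race),
    m -> LinProgWrapper(classifier=m, grp=:race, measure=false_positive_rate),
]
results = pmap(audit, wrappers)   # one cross-validated audit per fairness algorithm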

Outlook

  • JSoC '21 (Sumantrak)
    • Inprocessing algorithms
    • Plotting library for visualizing fairness performance
    • More pre-processing and post-processing algorithms
  • GSoC '21 (Archana)
    • Leverage Omega.jl and CausalInference.jl to create a Julia package for causal analysis of fairness
    • Tools for counterfactual fairness


Thank You