As AI becomes increasingly integrated into our day-to-day lives, the implications of bias in AI grow more worrisome. Training data that appears impartial is often shaped by historical and socioeconomic factors that render it biased, sometimes to the detriment of marginalized groups, particularly in sectors like healthcare and criminal justice. Now, LinkedIn is introducing a tool to help combat these biases: the LinkedIn Fairness Toolkit, or LiFT.
LiFT is an open-source Scala/Spark library that LinkedIn says “enables the measurement of fairness, according to a multitude of fairness definitions, in large-scale machine learning workflows.” LinkedIn says that LiFT is both flexible and scalable, enabling use in scenarios ranging from exploratory analysis to production workflows and allowing the distribution of workloads over several nodes when handling large datasets.
“It can be deployed in training and scoring workflows to measure biases in training data, evaluate different fairness notions for ML models, and detect statistically significant differences in their performance across different subgroups,” explained LinkedIn machine learning researchers Sriram Vasudevan, Cyrus DiCiccio, and Kinjal Basu in a blog post announcing LiFT. “It can also be used for ad hoc fairness analysis or as part of a large-scale A/B testing system.”
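To make "statistically significant differences" across subgroups concrete, here is a minimal Python sketch of a permutation test, one common way to ask whether a gap in a per-group metric could plausibly be due to chance. It is not LiFT's API (LiFT is a Scala/Spark library), and the article does not say which test LiFT uses; the function name and the toy 0/1 outcome lists are hypothetical.

```python
import random

def permutation_test_mean_diff(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means between two groups.

    `a` and `b` hold per-example outcomes for each subgroup, e.g. 1 if a
    flagged message was truly harassing and 0 otherwise, so that the mean
    of each list is that subgroup's precision. (Hypothetical helper.)
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis group membership is irrelevant,
        # so reassign examples to the two groups at random.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value

# Toy data: subgroup A is correct 45 times out of 50, subgroup B 30 out of 50.
group_a = [1] * 45 + [0] * 5
group_b = [1] * 30 + [0] * 20
gap, p = permutation_test_mean_diff(group_a, group_b)
print(f"observed gap = {gap:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the observed gap is unlikely under random group assignment; whether a gap of that size matters in practice is a separate judgment.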
Currently, LiFT supports metrics including distances between observed and expected distributions, fairness metrics like demographic parity and equalized odds, and other fairness measures like Atkinson’s index. LiFT also offers both higher- and lower-level APIs, giving developers more granular access.
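As an illustration of two of the metrics named above, the following Python sketch computes a demographic parity gap and an equalized odds gap on toy data. It uses the standard textbook definitions rather than LiFT's Scala/Spark API; the function names, the "largest gap between groups" summary, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict

def demographic_parity_gap(groups, y_pred):
    """Largest difference in positive-prediction rate between any two groups."""
    by_group = defaultdict(list)
    for g, p in zip(groups, y_pred):
        by_group[g].append(p)
    rates = {g: sum(v) / len(v) for g, v in by_group.items()}
    return max(rates.values()) - min(rates.values())

def equalized_odds_gap(groups, y_true, y_pred):
    """Larger of the between-group gaps in true-positive and false-positive rate.

    Assumes every group has at least one positive and one negative label.
    """
    pos = defaultdict(list)  # predictions on truly positive examples
    neg = defaultdict(list)  # predictions on truly negative examples
    for g, t, p in zip(groups, y_true, y_pred):
        (pos if t == 1 else neg)[g].append(p)
    tpr = {g: sum(v) / len(v) for g, v in pos.items()}
    fpr = {g: sum(v) / len(v) for g, v in neg.items()}
    tpr_gap = max(tpr.values()) - min(tpr.values())
    fpr_gap = max(fpr.values()) - min(fpr.values())
    return max(tpr_gap, fpr_gap)

# Toy data: two groups of four examples each (labels and predictions are 0/1).
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print(demographic_parity_gap(groups, y_pred))      # 0.5
print(equalized_odds_gap(groups, y_true, y_pred))  # 0.5
```

Here group "a" receives positive predictions 75% of the time and group "b" 25% of the time, so the demographic parity gap is 0.5. A gap of 0 on either metric would mean the groups are treated identically by that definition.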
LinkedIn has already deployed LiFT internally, using the tool to measure the fairness of multiple training datasets prior to their use in model training and using it for the last year while developing its prototype anti-harassment classification systems.
“We looked at the model’s performance across geographic regions and gender,” the researchers wrote. “While the model showed the highest precision when identifying harassing content in messages from men, across all regions, we found that the model was slightly more precise among English-speaking female members in the U.S. versus those in the U.K. The system relies on human-in-the-loop verification of flagged content, so this was determined to be an acceptable design trade-off.”
For LinkedIn, LiFT is part of an ongoing effort to improve fairness both inside and outside its organization – which, for a company of its nature, is a pressing concern. “Our imperative is to create economic opportunity for every member of the global workforce,” the researchers wrote, “something that would be impossible to accomplish without leveraging AI at scale.”