April 6, 2023

John Snow Labs Puts Responsible AI to Practice with Release of the NLP Test Library

LEWES, Del., April 6, 2023 — John Snow Labs, the healthcare AI and NLP company and developer of the Spark NLP library, has announced the release of NLP Test, an open-source Python software library that enables data scientists to more easily deliver reliable, safe and effective models.

The need for safer, more equitable, and robust AI models is clear, but there are few tools available to help data scientists achieve this. As a result, current Natural Language Processing (NLP) models in production are not living up to their promise. Instead, some of the best-known models fail in important ways, from leaking personally identifiable information and reversing their answers because of typos or capitalization changes to showing biases around race, gender, physical appearance, disability, and religion. These issues are prevalent in some of the most popular state-of-the-art models in use today.

Rigorous and frequent testing is the antidote, and good tests should be specific, comprehensive, and easy to maintain. Additionally, they should be versioned and executable so that they can be part of an automated build or MLOps workflow. John Snow Labs’ NLP Test library offers a simple framework to make this easier. It does this in several ways: it is open source, lightweight, and extensible, supports multiple libraries, and offers a comprehensive testing strategy for both models and data.

The NLP Test library can automatically generate and run 50+ test types out-of-the-box, covering accuracy, fairness, bias, representation, and robustness. Multiple NLP tasks can be tested across three of the most popular open-source NLP libraries: Spark NLP, Transformers, and spaCy. The NLP Test library also provides automated data augmentation, which in some cases can automatically improve failing models, especially for issues around robustness and fairness.
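To illustrate what a robustness test of this kind checks, here is a minimal sketch in plain Python. It does not use the NLP Test API. The function names, the toy keyword model, and the sample sentences are all hypothetical and chosen only for illustration. The idea is to perturb each input (for example, by uppercasing it or introducing a typo) and measure how often the model's prediction stays the same.

```python
import random
from typing import Callable, List


def uppercase(text: str) -> str:
    """Capitalization perturbation: convert the whole text to upper case."""
    return text.upper()


def swap_adjacent_chars(text: str, seed: int = 0) -> str:
    """Typo perturbation: swap two adjacent characters at a seeded random position."""
    if len(text) < 2:
        return text
    rng = random.Random(seed)
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def robustness_pass_rate(
    predict: Callable[[str], str],
    texts: List[str],
    perturb: Callable[[str], str],
) -> float:
    """Return the fraction of texts whose prediction is unchanged after perturbation."""
    if not texts:
        raise ValueError("texts must not be empty")
    unchanged = sum(predict(t) == predict(perturb(t)) for t in texts)
    return unchanged / len(texts)


def toy_sentiment(text: str) -> str:
    """Hypothetical stand-in model: a case-sensitive keyword lookup."""
    return "positive" if "good" in text else "negative"


# Hypothetical sample inputs, not taken from any real dataset.
samples = ["The treatment was good", "The outcome was poor"]

uppercase_rate = robustness_pass_rate(toy_sentiment, samples, uppercase)
typo_rate = robustness_pass_rate(toy_sentiment, samples, swap_adjacent_chars)
print(f"uppercase pass rate: {uppercase_rate:.2f}")
print(f"typo pass rate: {typo_rate:.2f}")
```

In this sketch the toy model flips its answer on the first sample once the text is uppercased, which is the kind of failure such a test is meant to surface. A real test suite would compare the pass rate against a minimum threshold and run as part of an automated build.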

The news comes on the first day of the company’s annual Healthcare NLP Summit, a free, virtual event focused on NLP applications in healthcare and life sciences, and will be explored in a keynote session titled “Introducing the Open-Source Testing Library for NLP Models.”

“Despite their hype, many AI systems simply don’t work. It is time to set higher standards for engineering AI systems: They should work reliably, and you should be able to prove this to yourself, your customers, and your regulator,” said David Talby, CTO of John Snow Labs. “The NLP Test library provides the open-source community with a free, production-grade resource that lets data scientists apply best practices for Responsible AI, and embodies a lot of what we’ve learned over the years in delivering regulatory-grade NLP systems.”

The NLP Test library is now live and freely available. To get started, visit nlptest.org. With a full development team allocated to the project, John Snow Labs is committed to improving the library with frequent releases of new test types, tasks, languages, and platforms.

About John Snow Labs

John Snow Labs, the AI and NLP for healthcare company, provides state-of-the-art software, models, and data to help healthcare and life science organizations put AI to good use. Developer of Spark NLP, the world’s most widely used NLP library in the enterprise, John Snow Labs’ award-winning healthcare NLP software powers the world’s largest pharmaceutical companies, healthcare systems, and health IT providers. The company is the creator and host of The NLP Summit, further educating and advancing the NLP community.


Source: John Snow Labs

Datanami