July 20, 2016

Melania Trump and the Anti-Plagiarism Algorithm

(a katz/Shutterstock)

Donald Trump’s presidential campaign ran into controversy this week when it was revealed that his wife, Melania, plagiarized parts of Michelle Obama’s 2008 campaign speech. With anti-plagiarism algorithms patrolling the waters, stealing a line or two has never been more problematic.

The revelation that Melania Trump’s speech at the Republican National Convention Monday evening contained several phrases that were first uttered by Michelle Obama in a 2008 speech caused a media uproar. It was also quite an embarrassment to Trump’s campaign, which is seeking to frame the Obama presidency as a failure.

But this raises the question: Without reading thousands of speeches overnight, how did somebody discover that Trump lifted a 23-word phrase from Obama’s speech whole hog, and borrowed heavily from about 40 more?


Examples of “find and replace” plagiarism in Melania Trump’s speech (Source: Turnitin blog)

The answer, of course, is that nobody read them: high-speed computer algorithms did the work. A human could theoretically have done it, but it likely would have taken many days. For algorithms working with a digitized transcript of the speech, it’s like shooting fish in a barrel.

The episode also alerted us to the existence of anti-plagiarism algorithms. One company making such products is Turnitin. The company developed a tool called the Plagiarism Spectrum with the goal of giving teachers and professors a way to automatically identify plagiarism.


The “Writeprint” for Melania Trump’s 2016 speech, according to Expert System

In a blog post Tuesday, the Oakland, California company broke down its analysis of Trump’s speech. The company found that Trump’s speech used a “cloning” type of plagiarism (copying word-for-word) against a 23-word segment of Obama’s work. Trump (or Trump’s speechwriter, who has taken the fall for the mistake) also used the “find and replace” form of plagiarism in other segments.
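Turnitin’s post does not spell out how its matching works, so the following is only a minimal sketch of the general idea, using Python’s standard `difflib` module rather than anything Turnitin has published. The function names and the two toy sentences are made up for illustration. The longest shared run of words corresponds roughly to “cloning,” while several shorter runs separated by swapped words resemble the “find and replace” pattern.

```python
import re
from difflib import SequenceMatcher

# Toy texts invented for this sketch; they are not taken from either speech.
SOURCE = "The quick brown fox jumps over the lazy dog near the quiet river."
SUSPECT = "A quick brown fox leaps over the lazy dog near the noisy river."


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z0-9']+", normalized)


def longest_shared_run(suspect, source):
    """Return the longest run of consecutive words found in both texts."""
    a, b = tokenize(suspect), tokenize(source)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def shared_runs(suspect, source, min_words=3):
    """Return every aligned run of at least min_words shared words."""
    a, b = tokenize(suspect), tokenize(source)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks()
            if m.size >= min_words]


print(" ".join(longest_shared_run(SUSPECT, SOURCE)))
# prints: over the lazy dog near the
print(len(shared_runs(SUSPECT, SOURCE)))
# prints: 2
```

A production checker would presumably compare a text against a large indexed corpus rather than a single known source, which is a much harder search problem than this pairwise comparison.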

While parts of Trump’s speech were plagiarized, most of it was original, according to an analysis of the two speeches by Expert System, a provider of text analytics and cognitive computing tools.

“This will likely be surprising information to many out there, but our analysis shows that in fact there are stark differences between First Lady Michelle Obama’s and Melania Trump’s speeches,” said Daniel Mayer, CEO of Expert System. “We found differences in their emotions, topics of importance, main concepts and main speech elements.”


The “Writeprint” for Michelle Obama’s 2008 speech, according to Expert System

The company also analyzed the types of words and sentence structure. Trump’s speech was easier to understand, with a 39.9% readability score versus 23.35% for Michelle Obama, who is now the First Lady. According to the analysis, Trump’s sentences are 45% shorter than Obama’s sentences, and 68% easier to understand.
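Expert System does not say how it measures sentence length or readability, so the sketch below shows only one simple way a “percent shorter” figure could be computed: average words per sentence, compared between two texts. The function names and sample sentences are hypothetical, and the sentence splitter is deliberately naive (it splits on periods, exclamation points and question marks).

```python
import re


def average_sentence_length(text):
    """Mean number of words per sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


def percent_shorter(text_a, text_b):
    """Percent by which text_a's average sentence is shorter than text_b's."""
    len_a = average_sentence_length(text_a)
    len_b = average_sentence_length(text_b)
    return (len_b - len_a) / len_b * 100


# Toy inputs invented for this sketch; they are not taken from either speech.
short_text = "We came home. We ate well."
long_text = "We came home and ate well."

print(percent_shorter(short_text, long_text))
# prints: 50.0
```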

“In simple terms,” Mayer says, “these speeches had very different messages and it really comes down to truly understanding the context, not just the words. That is the beauty of data science: we can test assumptions and prove otherwise.”
