October 21, 2019

The Trade War, Supply Chain Risk, and AI


When President Trump implemented new tariffs on Chinese imports earlier this year, it set off a frenzy of activity as manufacturers scoured the globe looking for new suppliers. From solar panels to handbags to cellular radios, we’re in the midst of a major disruption in global supply chains. But who, exactly, are these new suppliers, and who takes the profit? That’s not always an easy question to answer, but thanks to a new generation of machine learning and AI technology, it’s at least becoming possible.

In the past, the creation of business relationships was a simpler affair. Two representatives might sit down for a meal, discuss the deal, and seal it with a handshake (and maybe even a cigar). Many successful partnerships have been formed that way, but times have changed. It may sound odd, but it can be downright difficult for companies today to know who exactly they’re doing business with, particularly for retailers with a global footprint.

Take Walmart, the world’s largest company, for example. More than 275 million people shop at Walmart each week, buying products sourced from 2,800 suppliers, according to CEO Doug McMillon. That would be a big Excel spreadsheet to look at, but nothing too extraordinary. However, when you go down one more level, and try to find out who those 2,800 suppliers do business with – that is, what companies supply them with raw materials, bulk goods, or finished products – you’ll find many of them have a similar number of suppliers.

Depending on the product, some supply chains may go down to seven levels, each with hundreds or thousands of suppliers of their own. Now instead of a few thousand entities, you’re dealing with millions of entities spread around the world. And then you realize how utterly complex and connected the global supply chain really is.
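A rough back-of-the-envelope calculation shows how quickly the numbers grow. The sketch below starts from the 2,800 first-tier suppliers cited above and assumes, purely for illustration, that each entity brings in an average of three suppliers not already counted; that branching factor is an assumption, not a figure from Walmart or EastBanc.

```python
def entities_per_tier(first_tier, new_suppliers_per_entity, depth):
    """Return the number of suppliers at each tier, starting with tier 1."""
    counts = [first_tier]
    for _ in range(depth - 1):
        # Each entity in the previous tier contributes its own suppliers.
        counts.append(counts[-1] * new_suppliers_per_entity)
    return counts


# 2,800 comes from the article; 3 is an illustrative assumption.
tiers = entities_per_tier(first_tier=2_800, new_suppliers_per_entity=3, depth=7)
total = sum(tiers)

for level, count in enumerate(tiers, start=1):
    print(f"tier {level}: {count:,} suppliers")
print(f"total across 7 tiers: {total:,}")  # 3,060,400
```

Even with that deliberately small branching factor, seven tiers add up to roughly 3 million entities. With hundreds of new suppliers per entity, the total would be far larger.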

De-Risking Supply Chains

One of the companies tasked with figuring out how to untangle the global supply chain is EastBanc Technologies. The Washington, D.C.-based software development firm has created an array of solutions for private and public companies, as well as defense contractors.

To adequately assess supply chain risk, a company must identify the originators of parts or services, which can be tough to do in a huge supply chain, says EastBanc Tech Chairman Wolf Ruzicka. For a buyer like the US Navy, which has contracted with Lockheed Martin to build hundreds of F-35 Lightning II aircraft, understanding which entities are supplying parts and services is a matter of national security.


“Lockheed Martin, as they’re building out F-35 bombers, they may have 20 to 30, maybe 100 suppliers of the parts and services and maintenance contracts that need to flow into what it takes to build an F-35 bomber,” Ruzicka says. “Those dozens or hundreds of companies, they have their own stock suppliers. And when you go down seven levels deep into the supply chain, you’re suddenly confronted with a lot of complexity really, really quickly.”

Figuring out who is supplying the suppliers for the F-35 requires quite a bit of raw textual data, often sourced from public documents on the Internet, and the ability to make sense of that data.

“To really understand risk,” Ruzicka says, “you need to understand not only hard data and structured data, but you also need to understand soft factors, like relationships between companies or ownership structure or locations, maybe social media emotions around people that are associated with the supply chain.”

One of Ruzicka’s current contracts is to build an entity resolution and relationship mapping solution to help an unnamed client untangle global supply chains to better understand risk. Polina Reshetova, an EastBanc Tech data scientist, recently talked with Datanami about the work that went into the undertaking.

“What we came up with was a model, or a set of models or algorithms, that allows you to discover relationships from different structures, or text, on the Internet: articles, news, anything you can put your hands on,” she said. “So whenever there’s a mention of the company and their relationship, we catch it.”

EastBanc’s solution doesn’t scan the entire Internet for this information – at least, not yet. But the data volumes are big enough to warrant a careful approach to automation.

A Model Approach

The first step in the supply chain de-risking contract involves amassing this raw data in an Elasticsearch database running in AWS. The corpus that EastBanc is working with has on the order of 2 million articles.


The next step is to perform some entity resolution on the data to ascertain the exact identity that’s mentioned in the scraped data. In EastBanc’s case, it was identifying something on the order of 200,000 entities. This may sound like a straightforward process, but it comes with its own set of challenges, Reshetova said.

“Let’s say you are looking for one company, like Microsoft,” she said. “It can be mentioned in many ways, for example, ‘Microsoft’, ‘MS’, ‘Microsoft Corporation’. A mention also can refer to a company product: ‘MS Word’, ‘Microsoft Azure’, etc. So you not only have to find mentions in the text, but you also have to resolve problems like whether a particular mention means Microsoft the company or it means something else.”
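The Microsoft example can be made concrete with a small sketch. The code below is a deliberately naive, dictionary-based illustration rather than EastBanc’s method: the alias table, the product list, and the function name are all hypothetical. It masks known product names first, then maps the remaining surface forms to one canonical company name.

```python
import re

# Hypothetical alias table: canonical entity -> surface forms seen in text.
ALIASES = {"Microsoft": ["Microsoft Corporation", "Microsoft", "MS"]}
# Hypothetical product names that contain a company alias but are not the company.
PRODUCTS = ["MS Word", "Microsoft Azure"]


def find_company_mentions(text):
    """Return (position, surface form, canonical name) for company mentions."""
    # Blank out product names, keeping character offsets unchanged.
    masked = text
    for product in PRODUCTS:
        masked = re.sub(re.escape(product), lambda m: " " * len(m.group()), masked)

    mentions = []
    for canonical, forms in ALIASES.items():
        # Try longer aliases first so "Microsoft Corporation" beats "Microsoft".
        ordered = sorted(forms, key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(re.escape(f) for f in ordered) + r")\b"
        for match in re.finditer(pattern, masked):
            mentions.append((match.start(), match.group(), canonical))
    return sorted(mentions)


sample = ("Microsoft Corporation said MS Word sales rose, "
          "while MS expanded Microsoft Azure.")
mentions = find_company_mentions(sample)
for position, surface, canonical in mentions:
    print(position, repr(surface), "->", canonical)
```

On this sample, only “Microsoft Corporation” and the standalone “MS” resolve to the company, while the two product mentions are skipped. A fixed dictionary cannot handle aliases it has never seen or ambiguous abbreviations, which is one reason the problem is harder than it sounds.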

The next step is the hardest — figuring out the relationships that a given known entity may have with other discovered entities. This is where the actual supply chain starts getting fleshed out. But because of the data volumes, it’s a task that cannot be done manually. Today, Reshetova is using the latest machine learning and neural network techniques to get the work done.

The task complexity, high computational requirements and amount of data forced Reshetova to take a certain approach to the relationship discovery, which she calls minimum viable prediction (MVP). “We start with a problem that we can solve, as small as possible, and then we go from there,” she said. “We iterate. We try to use the data that we have and see how we can solve it.”
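To give a sense of what a smallest-possible first iteration might look like, here is a hedged sketch of a rule-based baseline. It is an assumption for illustration only, not one of EastBanc’s models: the entity names, the trigger phrases, and the function name are all hypothetical. It flags a supplier relationship whenever a known entity appears before a trigger phrase and another appears after it in the same sentence.

```python
import re

# Hypothetical entities, standing in for the output of entity resolution.
KNOWN_ENTITIES = ["Acme Aerospace", "Borealis Metals", "Cobalt Fasteners"]
# Hypothetical trigger phrases that suggest "X supplies Y".
SUPPLY_TRIGGERS = re.compile(r"\b(?:supplies|supplier to|sells parts to)\b")


def extract_relationships(article):
    """Return (supplier, 'supplies', buyer) triples found sentence by sentence."""
    relationships = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", article):
        trigger = SUPPLY_TRIGGERS.search(sentence)
        if trigger is None:
            continue
        left = sentence[:trigger.start()]
        right = sentence[trigger.end():]
        suppliers = [e for e in KNOWN_ENTITIES if e in left]
        buyers = [e for e in KNOWN_ENTITIES if e in right]
        for supplier in suppliers:
            for buyer in buyers:
                relationships.append((supplier, "supplies", buyer))
    return relationships


article = ("Borealis Metals supplies titanium to Acme Aerospace. "
           "Cobalt Fasteners opened a new plant. "
           "Cobalt Fasteners is a supplier to Borealis Metals.")
relationships = extract_relationships(article)
for triple in relationships:
    print(triple)
```

A baseline like this misses passive constructions, negations, and relationships that span sentences. The point of starting small is that each later iteration can replace one weak piece with a better model while the end-to-end pipeline keeps producing relationships.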

MVP requires each iteration of the machine learning model for understanding relationships to go through the entire corpus of data: 2 million articles and 200,000 known entities. That’s no small task, and it can take the better part of a day running on a handful of AWS machines.

“MVP reminds us that our final task is not to make entity recognition perfect. Our final task is to discover relationships, so we always have to get to the end,” Reshetova said. “So while the first iteration can be very simple, it assesses just maybe one model. When we build the next one, we try to solve each particular problem, and in the end you can have a set of many models: one model to resolve the entity recognition problem, and maybe 10 different models to discover relationships.”

AI In Action

Once the relationships are discovered, they can be loaded into a relational database so they can be queried. According to Reshetova, there are upwards of 1 million distinct relationships among those 200,000 entities. Having those relationships available via SQL to a business analyst is the end product that EastBanc’s client wants, which is something that actually doesn’t exist on the market, Ruzicka said.
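As an illustration of the kind of query an analyst might run, the sketch below loads a handful of made-up relationships into an in-memory SQLite database and walks the chain tier by tier with a recursive query. The table layout, column names, and company names are assumptions; the article does not describe EastBanc’s actual schema or database product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relationships (supplier TEXT, buyer TEXT);
INSERT INTO relationships VALUES
    ('Tier1 Avionics', 'Prime Contractor'),
    ('Tier1 Castings', 'Prime Contractor'),
    ('Tier2 Circuits', 'Tier1 Avionics'),
    ('Tier3 Minerals', 'Tier2 Circuits');
""")

# Start from the direct suppliers of one buyer, then repeatedly add the
# suppliers of every supplier found so far, stopping at a maximum depth.
QUERY = """
WITH RECURSIVE chain(supplier, tier) AS (
    SELECT supplier, 1 FROM relationships WHERE buyer = ?
    UNION
    SELECT r.supplier, c.tier + 1
    FROM relationships AS r
    JOIN chain AS c ON r.buyer = c.supplier
    WHERE c.tier < ?
)
SELECT supplier, tier FROM chain ORDER BY tier, supplier;
"""

rows = conn.execute(QUERY, ("Prime Contractor", 7)).fetchall()
for supplier, tier in rows:
    print(f"tier {tier}: {supplier}")
```

The depth limit of seven mirrors the seven levels mentioned earlier and also guards against endless recursion if the data contains circular relationships.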

“There are others that are attempting to do this, but they don’t have the capability of going deeper,” he said. “The data volumes and the complexities are exponential, the deeper you go. It’s not linear. It really goes up almost like a hockey stick. So you face huge volumes and complexities really quickly.”

Being able to trace the beneficial ownership of overseas entities is of critical importance as companies look to avoid doing business with other companies that face tariffs or are otherwise being sanctioned by a governmental entity. When tariffs on Chinese manufacturers were raised for solar panels, for example, the supply chain shifted south to Vietnam. Importers need to vet these new suppliers to ensure they’re staying in compliance with the law.

Ruzicka recalls a Turkish supplier “that suddenly had a majority ownership by a Russian company that should at least raise some yellow flags, if not a red flag,” he said. “It may be a non-issue. But if you don’t know of the issue, you don’t know what kind of flag you should raise.”
