How Four Financial Giants Crunch Big Data
As many already know, the term “big data” refers not only to the mounting volumes of information companies are collecting but just as much to variety and velocity: in other words, the massive amounts of unstructured data that need to be stored, managed, cleaned, and then moved around to interact with other data in near real time.
That aside, one cannot overlook the issue of volume; estimates contend that financial and securities organizations are juggling around 3.8 petabytes per firm. Following behind the investment institutions, the banking industry is contending with around 1.9 petabytes.
According to Allen Weinberg, co-leader of the Banking and Securities Technology and Operations practice at research group McKinsey & Co, financial services organizations often scratch their heads when it comes to rationalizing investments in big data. After all, they argue, isn’t this something they’ve been investing in for years, albeit without the fancy buzzword?
He says the big data era is absolutely something new for banks and that volume alone is only a small part of the equation. The variety of this data and the need for speed are the real crux of the issue, and big banks are starting to see clearer paths as they look to advanced analytics platforms outside of traditional databases, as well as to frameworks like Hadoop.
To put the real challenges in some context, today we’ll take a look at a few examples of large financial and banking institutions that are hitting the upper limits of their traditional systems and looking beyond to new analytics and framework solutions.
You’ll notice a trend as you read: Hadoop appears in a number of the examples. This is not because we set out looking for Hadoop case studies in the financial arena. Even a deep dig into the operations at banks large and small reveals that this is a movement gaining momentum, even though the tools backing it haven’t pushed it into the mainstream quite yet.
Morgan Stanley’s Big Data Approach
As one of the largest financial services organizations in the world, with over $300 billion in assets under its care, Morgan Stanley keeps close tabs on new frameworks and tools to manage the complex information pools that back high-stakes decisions.
The financial services giant has been vocal about how it is solving the challenges of the industry, most recently by looking to the Hadoop framework. What started off as a small departmental experiment on a small Hadoop cluster has blossomed into a growing reliance on the mighty elephant for mission-critical investment projects.
According to Morgan Stanley’s Gary Bhattachariee, who directs the company’s enterprise information management operations, the traditional databases and grid computing paradigms that served the financial giant for years had been stretched to their limits.
Like several other investment banks, Morgan Stanley started to look to Hadoop as the framework of choice to support growing data size, but more importantly, data complexity and the need for solid speed. Bhattachariee said that the adoption of Hadoop allowed Morgan Stanley to “bring really cheap infrastructure into a framework” that let them install Hadoop and let it handle the tasks.
The experiment started with 15 tired commodity servers, which they strung together and tested before shifting to more cores. He said that the company now has a “very scalable solution for portfolio analysis.” Bhattachariee says that eventually the company might extend its use of Hadoop beyond the mission-critical portfolio analysis operations to include more functions, including the management of customer information.
At the core of the Hadoop future at Morgan Stanley is the matter of scalability. “The differentiator that Hadoop brings is that now you can do the same things on a much larger scale and get better results. It allows you to manage petabytes of data, which is unheard of in the traditional database world.”
Bank of America Tackles Big Data
As one of the largest banks in the States, Bank of America has been in good company with others of the same ilk that are seeking to tap into Hadoop to manage large amounts of transaction and customer data.
Abhishek Mehta serves as managing director for big data and analytics at Bank of America and has been one of the more vocal proponents of the power of big data capabilities for financial services firms. He thinks that big data will create a new era for businesses of all types, spawning what he calls a “second Industrial Revolution” which will be driven by open source frameworks, including (surprise) Hadoop.
Mehta says there are many parallels between where Hadoop is today and where Linux was 20 years ago; in his view, it is in the exact same place. He claims that “Hadoop will be equally disruptive, not just to existing systems, but it will enable you to do things you couldn’t do before. It’s good to be occupying the front seat with it and being the leader in thinking about it.”
The Bank of America big data lead says that Hadoop will be a massive disruptor, “because be it Bank of America, be it Wal*Mart, Be it Verizon, they are all data companies. You don’t push cash around, it’s moving bits & bytes. And we realize that, we want to be good custodians of it, & increase transparency that we have in the bank & in the larger system to drive positive change.”
Credit Suisse Eyes Big Data Trends
Even though the financial firm certainly has what can be classified as “big data” (in terms of volume, variety and the need for velocity), the company’s VP, Ed Dabagian-Paul, says that it’s all still just a marketing term, and that size alone doesn’t justify all the hype being thrown around the buzzword.
The Credit Suisse VP told an audience at an HPC conference that the company indeed has tens of petabytes of structured data stored in relational databases. “We’re not analyzing Twitter feeds or websites,” he says, “we’re looking at financial transactions.”
Still, these financial transactions require complex compute and data-intensive systems. Currently, the company is dividing its storage across a continuum, which consists of bulk storage, big data wells, analytics operations, relational databases and in-memory databases.
When it comes to leveraging big data, Dabagian-Paul says that the company uses “bulk storage for giant flat files consisting of emails and other files that are stored on backup tapes,” while they use Hadoop for big data that might be structured, unstructured or semi-structured.
He says that data used in high frequency trading is kept in in-memory databases, while data held for the long term is typically stored on something less expensive. His advice to other firms evaluating their big data requirements: “Consider the size of your data set, your needs and how it needs to be processed…Big data tends to be very batch-oriented.”
Buzzword or not, Dabagian-Paul says that one challenge the financial firm is facing is finding ways to make good use of the streams of information available. At the heart of this more general challenge is the matter of big data management—and while the need is there, Dabagian-Paul thinks that the tools are still catching up with demand.
ANZ Bank Bonds on Big Data
When it comes to big banking data, it’s not all about Hadoop and financial risk decision-making. This week ANZ Bank said it’s tapping into its big data wells to bond with customers outside of the traditional in-person meetings.
According to Leslie Howatt, who serves as the Australian bank’s technology lead, “One of the big shifts for us is the sale side of relationship management. Our customers expect us to turn up fully understanding everything about their business, and with a value proposition for them that’s already tailored, before you walk in the door.”
One of the advantages banks have is that, being the place where their customers bank, they automatically have access to a wealth of data on their customers’ finances. So, to meet its customers’ expectations, ANZ now employs a team of big data analysts to study every aspect of those finances, including both their businesses’ expenditure and their personal outgoings, using a combination of analytics products and services and, of course, smart people.
Howatt says it used to take several lengthy in-person meetings with business owners to determine how to proceed with business plans for corporate customers. Now, however, the company has analytics groups at its headquarters and off-site to “crunch a lot of that data” so that teams can go to business customers with tailor-made plans well in advance.