Being Realistic about Data Analytics

It’s January, which means that it’s time for putting new plans and resolutions to work. What will banks, credit unions, and other financial institutions do differently this year? What should they do differently? What goals should they set? What should they invest in, and what should they leave alone?

Jim Marous, as usual, has a good list of do’s and don’ts for the year on his blog, Bank Marketing Strategy. One of his suggested resolutions for bankers for 2014 is not to get distracted by “big data.” For most institutions, that’s probably good advice.

Some larger financial institutions (FIs) probably have the resources to tackle adventurous projects involving some of the technologies commonly associated with Big Data, such as Hadoop clusters that create on-the-fly data stores for massive number-crunching and rapid business intelligence. There’s nothing wrong with these “shiny new” projects, but Marous is absolutely right that the glitziest of these projects are beyond the technical means and constrained budgets of most FIs.

Instead, Marous advises FIs “to capitalize on the ever growing silo of data already available within your organization.” He notes that “competitive advantage is achievable through the better use of data in the development of lifestage trigger cross-sell programs, optimal branch configuration, pricing decisions and risk and fraud monitoring.”

We agree with Marous’s advice, with one important qualification. We think that FIs should capitalize on data that’s readily available to them. Not all of it is available “already,” but fortunately, data that’s useful but not yet in hand can be obtained fairly easily without major investments in in-house Big Data projects. We’ll say more about this below.

First, we’d like to take a short detour and propose a simple thought exercise. Even if FIs prudently choose to avoid cutting-edge Big Data projects, we think they could benefit from evaluating their use of and need for data in terms of some Big Data metrics. By breaking “Big Data” down to some of its constituent components, FIs might find a useful road map for upgrading some critical IT services without having to embark on bold, expensive, and risky IT initiatives.

To begin, we need to get beyond that catch phrase, Big Data. It’s time to dispense with the hype and ask what Big Data really is.

What is Big Data? Meet the Three V’s

What is Big Data and how is it different from the databases and business intelligence tools of yore? Undoubtedly, some of the revolutionary claims about this technology are hype, but in many cases so-called Big Data really does outpace and outperform traditional data analysis. Big Data differs from previous data technologies in terms of three key attributes—the so-called three V’s:

  • Volume
    Big Data enables organizations to analyze exponentially larger volumes of data. This is true not just in scientific fields like astronomy and particle physics, but also in areas like retail. For example, Walmart stores its business transactions in databases that have now reached 2.5 petabytes (2560 terabytes), roughly 167 times the amount of data found in all the books in the Library of Congress. That’s obviously well beyond the scale of typical enterprise relational databases from a decade ago.
  • Variety
    Big Data often involves collecting and analyzing more types of data than were collected and analyzed before. This includes not only a broader range of structured data but also unstructured data, such as images, videos, and sensor data. Think about Facebook, the service that ended up launching many important Big Data technologies such as Apache Cassandra, which was originally developed to power Facebook’s inbox search. Facebook routinely manages text, photos, videos, games, ads, and polls. Facebook’s data set is far more varied than those found in traditional databases.
  • Velocity
    Big Data enables calculations and analysis to proceed at jaw-dropping speeds. Working with the most advanced data analysis technology available at the time, it took scientists about a decade to decode the first human genome. Now scientists can analyze a human genome in about a week. That’s about a 500X improvement.
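
The figures in the list above can be sanity-checked with a few lines of arithmetic. The sketch below assumes binary units (1 petabyte = 1,024 terabytes) and treats “a decade” and “a week” as round numbers; the variable names are illustrative only.

```python
# Back-of-the-envelope checks for the Volume and Velocity figures above.
# Assumption: binary units, so 1 petabyte = 1,024 terabytes.
TB_PER_PB = 1024

walmart_pb = 2.5
walmart_tb = walmart_pb * TB_PER_PB  # 2.5 PB expressed in terabytes

# Assumption: "a decade" is 10 years of 52 weeks; "about a week" is 1 week.
first_genome_weeks = 10 * 52
current_genome_weeks = 1
speedup = first_genome_weeks / current_genome_weeks

print(f"{walmart_pb} PB = {walmart_tb:.0f} TB")
print(f"Genome analysis speedup: about {speedup:.0f}x")
```

Under these assumptions the conversion gives 2,560 terabytes and the speedup comes to roughly 520 times, which is consistent with the rounded “about 500X” figure.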

Applying the Three V’s to Your Institution

Now that we’ve discussed the three V’s, we’re ready for the thought experiment.

Instead of asking whether your institution needs Big Data, ask whether your institution could benefit from any of the three V’s.

Let’s take each of them in turn.


Volume

You can break volume into two parts: there’s the volume going into your IT systems, and the volume of data coming out. Do you need more data coming into your systems, data coming out, both, or neither?

Most FIs are probably struggling with the idea of increasing data inputs. Mobile apps, social media, and the growing popularity of online banking have given them more than enough input to handle. It’s fine if they don’t want to add more inputs for now.

But let’s consider data output—whether you choose to call it information, insights, or business intelligence. Does your institution have all the information it needs from its IT systems and the data they already collect? Would you like more information about customers and potential customers, for example?

Specifically, do you have all the information you could possibly want about customers and potential customers when you:

  • Market to them
  • Open an account for them
  • Cross-sell to them (or decide not to cross-sell to them)

If your institution is like most, you probably would like to get more insights about your customers, especially if you could get those insights from the data you already have.

For example, CIP compliance requires that when you open accounts for customers, you record their names, current addresses, dates of birth, and identification numbers (usually a tax identification number, which for most people is a Social Security Number). Are you 100% sure you have access to all the business intelligence that can possibly be derived from these simple inputs?

When we analyze account-opening data for financial institutions (not just banks and credit unions), we find we can predict account profitability and the likelihood of losses from charge-offs and fraud using analysis that begins with just these four data types.

We find that most institutions have data blind spots, and these blind spots are costly. Sometimes they lead to institutions turning down applicants who would have been profitable. Sometimes they lead to institutions offering the wrong product to a customer; for example, offering a DDA account that includes overdraft protection, when the FI would have reduced losses by offering this customer a more limited “Second Chance” account that omitted overdraft protection. Will a customer still be a customer a year later? We can predict the answer to that question, too, with a high degree of accuracy.

Without changing data inputs at all, these institutions are able to achieve better business results by applying insights gained from having more information about customers.

So, how about your institution: would you like more information about applicants and customers, or do you have enough?


Variety

Data variety, too, can be broken down into data input and data output.

Even if you’re content with (or overwhelmed by) the variety of data your institution is collecting, are you content with the variety of data you’re relying on to make business decisions? Would you like to know different sorts of things about applicants, customers, accounts, and potential markets?

New data types could include risk/confidence scores that are far more granular, instructive, and actionable than the simple approved/disapproved ratings that many ID verification services provide.

We’ve found cases where the profitability profile of a customer pool is surprisingly complex and in some cases not at all intuitive. By moving beyond binary (yes/no) data types to more nuanced metrics, many institutions will be able to gain insights that enable them to grow profits by 10-15% or more annually.
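
To make the contrast concrete, here is a minimal sketch of how a granular score could drive a product decision where a binary rating could not. Everything in it is hypothetical: the 0-to-1 score scale, the thresholds, and the function name recommend_product are assumptions for illustration, not the output of any particular ID verification service.

```python
# Hypothetical illustration: a granular risk score supports more than two outcomes.
# Assumptions: scores run from 0.0 (lowest risk) to 1.0 (highest risk),
# and the 0.30 / 0.70 thresholds are invented for this example.

def recommend_product(risk_score: float) -> str:
    """Map a risk score to an account offer instead of a yes/no decision."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0.0 and 1.0")
    if risk_score < 0.30:
        return "standard account with overdraft protection"
    if risk_score < 0.70:
        return "Second Chance account without overdraft protection"
    return "refer for manual review"

for score in (0.10, 0.55, 0.90):
    print(f"{score:.2f} -> {recommend_product(score)}")
```

A binary rating would force the middle band into either approval or rejection. With a graded score, that band is where a more limited offer, such as an account without overdraft protection, can fit.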


Velocity

Velocity is an area where almost every institution could improve. A great deal of finance still relies on overnight batch processes. Marous is absolutely right to recommend making more use of real-time data services.

Real-time data analysis is critical for operations like account opening. If your institution gets critical data about profitability and risk only several days after an account has been opened and checks and debit cards have been issued, it’s too late to tactfully or cost-effectively change or reverse the decisions that were made when the customer was meeting face to face with a banker.

Fortunately, real-time data services are readily available for bankers. Many of them can even be accessed securely in a Web browser, enabling institutions to choose the pace and scope of any eventual integration.

Summing Up

As you begin 2014, it might be best to forget about “Big Data” and instead ask questions about:

  • The volume of data you’re getting: Is it enough?
  • The variety of data you’re getting: Does it let you answer the questions that matter?
  • The velocity of data delivery: Is it fast enough for employees to make the best decisions?

By ignoring the hype and focusing on the three V’s, you might be able to quickly identify some worthwhile projects that are manageable, affordable, and profitable.

And that would be an excellent way to begin the New Year.