Why Data Lakes Are Prime Environments for AI

By: Eric Truntz, Senior Principal

The ability to produce purposeful information from the correct data can be a differentiator for any organization, reducing the time to realize meaningful business results using modern technology. As business leaders look to add an Artificial Intelligence (AI) strategy or enrich an existing one, it’s imperative to first look at how business information is produced from your data assets.

The proliferation of AI in our industry has shifted the automation paradigm. In many ways, the balance of business value has shifted from the applications developed to the data assets that those applications establish and feed. Well-governed data is widely considered a standalone business asset.

Whether it’s financial, claims, policy, social media, or customer data, easy access to data via a data lake or data warehouse can be a critical competitive advantage for any insurance company. And the ability to assemble a lot of data—with controls that normalize and model it—is especially important for Artificial Intelligence.

Different Types of Data Storage Architectures

When it comes to storing data, many businesses opt to have both a data lake and a data warehouse in their ecosystems. To some, data lakes are simply (and incorrectly) regarded as vast pools chock full of raw data with an undefined purpose and structure. In a data warehouse, by contrast, the data is structured, defined, and typically highly processed from the point of ingestion onward for a specific, predetermined purpose.

Which Type is Best for AI?

The primary criterion for selecting a data architecture for AI is its ability to support frequently changing business “purposes.” It’s often only a matter of time before even the most carefully thought-out warehouse becomes orphaned as the business’s “purpose” moves in a direction that the warehouse’s rigid structure cannot support. Typically, warehoused data requires end-to-end design before it can be made available to users, and complex changes can require comprehensive refitting of an existing warehouse. In a data lake approach, the “rigidity and structure” lies less in the data sets produced and more in the patterns and mechanisms used to curate and deliver them. In practice, when business needs change, those mechanisms and patterns are brought to bear to acquire new data elements and produce new data sets quickly and efficiently. Additionally, while the boundaries between curated “zones” are well defined, the data’s structure doesn’t have to be defined before the data is available for controlled use. This ability to acquire, curate, and deliver high-quality data sets regularly and efficiently sets the data lake architecture apart in AI applications.
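To make the idea of applying structure only when the data is used more concrete, here is a minimal sketch in Python. It is an illustration rather than a description of any particular product: the `RAW_ZONE` records, the field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Hypothetical raw-zone contents: records are stored as they arrived,
# with no shared schema enforced at write time.
RAW_ZONE = [
    '{"claim_id": "C1", "amount": "1200.50", "state": "OH"}',
    '{"claim_id": "C2", "amount": "310", "channel": "mobile"}',
    '{"policy_id": "P9", "premium": "88.00"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    Keeps only the records that contain every requested field and
    casts each field with the function given in the schema.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if all(name in record for name in schema):
            rows.append({name: cast(record[name]) for name, cast in schema.items()})
    return rows


# A new business question needs a new schema, not a redesign of the storage.
claims_view = read_with_schema(RAW_ZONE, {"claim_id": str, "amount": float})
premium_view = read_with_schema(RAW_ZONE, {"policy_id": str, "premium": float})
```

The same stored records serve both views; only the schema passed in at read time changes when the business “purpose” changes.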

How AI Leverages Data Lakes

Predictive analytics and machine learning both need a large quantity of quality data to be effective. How to best feed them? Dump your data into a lake. Data lakes give you the ability to store a vast amount of any digital data in its raw format, which allows immediate consumption, provided that expectations for usage and quality are managed. As the data passes through the various zones and is curated, cleaned, and contextualized, its suitability for broader usage increases. Data scientists can run AI against raw data in the lake for experimentation and discovery. As the insights from AI are realized, implementations can be hardened to run against more mature data in the subsequent zones.
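The progression from raw to more mature data can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two zone names (`raw` and `curated`), the record fields, and the cleaning rules are hypothetical, and a real lake would typically have more zones and far richer curation logic.

```python
from dataclasses import dataclass, field


@dataclass
class Lake:
    """A toy lake with two hypothetical zones."""
    raw: list = field(default_factory=list)      # data exactly as ingested
    curated: list = field(default_factory=list)  # cleaned, standardized data


def ingest(lake, record):
    # Raw data is available for experimentation as soon as it lands.
    lake.raw.append(record)


def is_clean(record):
    # Assumed quality rules: a customer id is present and the amount
    # is a non-negative number.
    amount = record.get("amount")
    return (
        record.get("customer_id") is not None
        and isinstance(amount, (int, float))
        and amount >= 0
    )


def curate(lake):
    # Promote records that pass the rules, standardizing the state code.
    # The raw zone is left untouched.
    lake.curated = [
        {**r, "state": r.get("state", "").strip().upper()}
        for r in lake.raw
        if is_clean(r)
    ]


lake = Lake()
ingest(lake, {"customer_id": "A1", "amount": 250.0, "state": " oh "})
ingest(lake, {"customer_id": None, "amount": 75.0, "state": "NY"})
ingest(lake, {"customer_id": "B2", "amount": -10.0, "state": "tx"})
curate(lake)
```

Exploratory work can read `lake.raw`, which keeps all three records, while a hardened model would read `lake.curated`, which holds only the record that passed the quality rules.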

Support Business Outcomes First, AI Second

Many insurers have made substantial investments in applying AI in a variety of forms, with a variety of results. In general, AI investments go awry when an insurer sets out to fit its business to a technology instead of the other way around. The best results come when insurers start with the desired business outcome and then apply technologies if and only if they are fit for purpose. Although this adage has stood the test of time, it is surprising how often the principle is overlooked.

As the technology ecosystem grows, beware of the “tech hype” cycle. Step back and understand first how AI can fuel your business to more efficiently connect to the desired business outcome—and then recognize the data repository needed to support it.

With the right data in place, AI is serving as a key differentiator for insurance companies looking to provide a faster, more efficient, and better user experience for customers and employees.

Ready to add AI/ML to your business? Consult first with a team of experts with a proven track record in data lakes. Let’s get started.



