By Telha Ghanchi and Steve Molsberry
The upstream oil and gas industry is primed to deploy artificial intelligence (AI) and machine learning (ML) solutions for time-series data analysis. Cloud-based data management platforms like Snowflake are accelerating the adoption of these transformative tools.
What is Time-Series Data?
Time-series data is a sequence of data points captured at a set interval, whether seconds, minutes, days, or weeks. In the oil and gas industry, sensors and other IoT devices in the field have made it feasible to record an enormous amount of raw data from across the E&P ecosystem.
Raw time-series data is referred to as “raw” for a reason: it’s a stream of data points that can quickly become a massive headache from both a data management and an analytics perspective. Data wrangling and initial quality checking are required before the newer AI and machine learning tools can be applied to oil and gas use cases such as predictive maintenance or completion optimization.
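As a rough illustration of what initial quality checking can involve, the sketch below flags missing samples and implausible values in a short run of sensor readings. The one-minute interval, the pressure values, the valid range, and the function names `find_gaps` and `out_of_range` are all assumptions made for this example, not part of any particular tool.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval):
    """Return (previous, current) timestamp pairs spaced wider than expected."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > expected_interval:
            gaps.append((prev, curr))
    return gaps


def out_of_range(readings, low, high):
    """Return indices of readings that are missing (None) or outside [low, high]."""
    return [
        i for i, value in enumerate(readings)
        if value is None or not (low <= value <= high)
    ]


# Hypothetical one-minute pressure readings (psi): the 00:03 sample is absent,
# one value is null, and one is a sentinel "bad data" value.
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(minutes=m) for m in (0, 1, 2, 4, 5)]
pressures = [1510.0, 1512.5, None, 1509.8, -999.0]

gaps = find_gaps(timestamps, timedelta(minutes=1))
bad = out_of_range(pressures, low=0.0, high=5000.0)

print("Gaps:", gaps)
print("Suspect reading indices:", bad)
```

Checks like these are typically run before any modeling, so that gaps and sentinel values do not get mistaken for real operating behavior.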
Some common data wrangling and analytic techniques for time-series data include:
- Segmentation: splitting a time series into several “meaningful” segments.
- Clustering: finding natural groupings of time series or time-series patterns.
- Classification: assigning given time series or time-series patterns to one of several predefined classes.
- Summarization: generating a short description of a time series while retaining the essential features of a considered problem.
- Anomaly Detection: identifying surprising, unexpected patterns.
- Motif Discovery: isolating frequently occurring patterns.
- Forecasting: predicting time-series values based on time-series history or human expertise.
- Discovery of Association Rules: finding rules relating to patterns in a time series.
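To make one of these techniques concrete, here is a minimal anomaly detection sketch that uses a rolling z-score: each reading is compared against the mean and standard deviation of the readings just before it. This is only one of many possible approaches, and the window size, threshold, sample values, and the function name `rolling_zscore_anomalies` are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the preceding window
    by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A flat window gives no scale to compare against; skip it.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical sensor readings with a single spike at index 6.
readings = [100.0, 101.0, 99.5, 100.5, 100.2, 100.1, 135.0, 100.3, 99.9, 100.4]

anomalies = rolling_zscore_anomalies(readings, window=5, threshold=3.0)
print("Anomalous indices:", anomalies)
```

Note that a spike inflates the standard deviation of the windows that follow it, so a simple detector like this can miss anomalies that occur shortly after another one.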
Snowflake Enables Cost-Effective Time-Series Data Management and Analytics
Managing and analyzing time-series data requires a powerful, scalable, robust data storage and retrieval solution. In the past, organizations built out their own server environments to manage information in relational databases. More recently, they’ve used various flavors of Hadoop. These solutions, however, are relatively expensive and require significant upfront investment, with only “the promise” of valuable future insights.
Snowflake, a cloud-built data platform, has changed the game for time-series data. With Snowflake, extremely large datasets from sensors and IoT devices in the field can now be economically loaded, managed and accessed for analytics. Here are a few of the key capabilities.
- From a data management perspective, Snowflake can serve as the central time-series data repository, or data can be kept in inexpensive cloud storage such as Amazon S3, Azure Data Lake Storage or Google Cloud Storage.
- From a scalability perspective, Snowflake’s multi-cluster shared data architecture provides strong performance that is only paid for when it is used. Compute resources can be spun up in sizes ranging from XS to 4XL, based on analytic workloads.
- From a data wrangling perspective, Snowflake allows time-series data to be prepared and combined with other content in a common data platform. Repetitive quality checking, restructuring and integrations can be automated as they mature so that data scientists spend more time analyzing data rather than manipulating it.
- From an analytics and data science perspective, Snowflake has connectors to a wide variety of tools to support exploratory activities. Tools such as Power BI and Spotfire are supported out of the box.
- From a training perspective, Snowflake can query both structured and machine-generated, semi-structured data (e.g., JSON, Avro, XML) using familiar SQL operators and natural extensions.
Final Thoughts re: Snowflake and Time-Series Data Analysis
The Snowflake data platform allows E&P organizations to begin extracting insight from time-series data with minimal upfront investment. As initial analytics results come in, however, it’s also wise to recall the words of James R. Evans, from his book Business Analytics: “Qualitative and judgmental techniques rely on experience and intuition.” Indeed, both qualitative and quantitative analysis are required in order to make good judgments on time-series data. While AI and ML will certainly supplement the intuition of the oil and gas data analyst, these tools won’t entirely replace the human factor.
NEXT STEP: To learn how Stonebridge can help you transform your operations with innovative data management solutions like Snowflake, contact us.