NOTE: This article on oil and gas data science was also published in Foundations, the official publication of the Professional Petroleum Data Management Association (PPDM).
The Oil and Gas industry is missing the boat when it comes to Data Science, that is, reimagining data and its inherent value as a strategic asset. Like every industry today, Oil and Gas seeks ways to improve efficiency, reduce operating costs and increase revenues. However, unlike many industries, Oil and Gas organizations also face unique safety, environmental and regulatory reporting requirements. Data Science offers numerous advantages that, when embraced by our industry, will be instrumental in improving data efficiencies and increasing revenues.
Let’s start with the basics: What is Data Science? For sure, Data Science is an overused and confusing buzzword used to promote concepts like Big Data and digital transformation. It is often thrown around as a catchphrase for anything data or analytics related. At a high level, it is more accurately defined as a progressive approach to data that uses analysis of past and current data to predict future outcomes. This ability to use the past and present to better understand the future can identify data value that can be translated into business value.
The principles and tools behind Data Science have been around for decades, including:
- Computer Science
- Machine Learning
Today, the term Data Science is the unifying umbrella encompassing these principles and applying them to data. When we discuss Data Science, we are referring to the use of these principles and tools from various sciences to explore a company’s past and current data, find patterns, and then use those patterns to develop models or algorithms that predict future business outcomes.
Historically, business data was structured and limited in quantity. It was not uncommon for data to either be maintained in spreadsheets or to be entered manually into spreadsheets. Some business data was encrypted in proprietary databases and not even available for download. Companies were able to manage this limited and structured data by using Business Intelligence tools to analyze the available data.
This is not possible today. More and more of our business data is unstructured and huge in volume. It is generated from diverse data sources—from text files, financial records, multimedia, instrument sensors, etc. More complex and advanced analytic tools and algorithms are required for processing and analyzing this data. Digital tools can provide Oil and Gas companies an avenue upon which they can define, connect and use their data regardless of data source.
A common misconception is that Data Science is the same thing as Business Intelligence. Business Intelligence is the process of using technology to analyze data for the presentation of deliverables such as graphs, charts, reports and spreadsheets. Business Intelligence asks the question “What happened and what should be changed?” Data Science asks, “Why did it happen and what can happen in the future?” It’s the difference in “What”, “Why” and “How” that differentiates Business Intelligence and Data Science.
Why We Need Data Science
Data volume in the Oil and Gas industry has grown exponentially through the advancement of information technology. This includes everything from recording sensors in exploration, drilling, production and seismic operations to Logging While Drilling (LWD) technology, which allows drilling data to be recorded in real time. It also includes fiber optic solutions that provide a wide range of data about environmental conditions such as temperature, oil reserve levels and equipment performance or status. Managing this data and using it as a strategic asset significantly impacts the financial performance of the company.
Business Intelligence tools are no longer capable of providing the level of analysis required. Applying Data Science, mathematics, statistics, computer science, machine learning, and probability can make the data manageable. Data Science can help move organizations from reactive remedial solutions to proactive decision making. This is enabled by integrating different types of data into predictive models, which can then be used to predict future outcomes.
Predictive models are statistical models used to predict outcomes – data is collected, a predictive model is defined, predictions are made, and the model is validated or revised as new data is available. Data Science uses predictive models to interpret and organize big data.
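As a rough illustration of that collect, define, predict and validate loop, the sketch below fits a straight line to a few production readings, forecasts the next reading, compares the forecast with a new observation and then refits. The well data, the choice of a linear model and the helper names `fit_line` and `predict` are all assumptions made for this example; they are not taken from any particular tool or from a real well.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept

# 1. Collect data: hypothetical production for one well (month, barrels per day).
months = [1, 2, 3, 4, 5, 6]
rates = [1000.0, 960.0, 930.0, 880.0, 850.0, 810.0]

# 2. Define the model from the data collected so far.
model = fit_line(months, rates)

# 3. Predict the next month.
forecast = predict(model, 7)

# 4. Validate against the new observation, then revise the model with it.
observed = 700.0  # hypothetical new measurement
error = abs(forecast - observed)
months.append(7)
rates.append(observed)
revised_model = fit_line(months, rates)

print(f"forecast={forecast:.1f} observed={observed:.1f} error={error:.1f}")
print(f"slope before={model[0]:.1f} after={revised_model[0]:.1f}")
```

A large gap between forecast and observation is the signal that the model needs revising. Here the refit produces a steeper decline because the newest reading fell below the earlier trend.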
The oil-price slump has forced Oil and Gas companies to look beyond traditional methods and to seek broader business practice changes to increase performance and cut costs. Better data analytics and technology provide the key to determining whether Oil and Gas companies thrive.
Standard Life Cycle of Data Science Projects
Data Science is an ever-evolving field, which leaves the Data Science project life cycle open to interpretation and customization. Until standards have been defined and accepted, a basic iterative Data Science life cycle is recommended as a starting point.
- Business Understanding
  - Identify problems – These will become the model targets.
  - Define business goals
    - Regression – How much?
    - Classification – Which category?
    - Clustering – Which group?
    - Anomaly detection – Is this expected or unusual?
    - Recommendation – Which option?
- Data Understanding
  - Identify data source – Is the required data available? If not, can it be obtained?
  - Ingest the data – Import the data into an analytical sandbox.
  - Explore the data – Use data summarization to audit the quality of the data.
  - Set up a data pipeline – Define a process to regularly refresh the data.
- Group data into a training data set and a test data set.
- Build models using the training data set.
- Evaluate model results.
- Deploy the model to a production or test environment for consumption.
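The last four steps of the life cycle (split, build, evaluate, deploy) can be sketched in a few lines. The example below answers a "Which category?" question with a nearest-centroid classifier, a deliberately simple technique chosen here for brevity. The rock-property records, the 25% test fraction and the 0.9 deployment threshold are hypothetical values invented for this illustration.

```python
import random
from statistics import mean

# Hypothetical labelled records: (porosity %, permeability mD) -> rock class.
records = [
    ((22.0, 310.0), "sandstone"), ((24.5, 350.0), "sandstone"),
    ((21.0, 290.0), "sandstone"), ((23.5, 330.0), "sandstone"),
    ((25.0, 360.0), "sandstone"), ((20.5, 300.0), "sandstone"),
    ((6.0, 2.0), "shale"), ((7.5, 4.0), "shale"),
    ((5.0, 1.5), "shale"), ((8.0, 5.0), "shale"),
    ((6.5, 3.0), "shale"), ((7.0, 2.5), "shale"),
]

def split(rows, test_fraction, seed):
    """Shuffle reproducibly, then hold back a fraction of rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train(rows):
    """Build the model: one centroid (mean feature vector) per class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*feature_rows))
        for label, feature_rows in by_label.items()
    }

def classify(model, features):
    """Return the class whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

def accuracy(model, rows):
    """Evaluate: share of held-out rows classified correctly."""
    hits = sum(classify(model, features) == label for features, label in rows)
    return hits / len(rows)

train_rows, test_rows = split(records, test_fraction=0.25, seed=0)
model = train(train_rows)
score = accuracy(model, test_rows)
ready_to_deploy = score >= 0.9  # hypothetical acceptance threshold

print(f"test accuracy={score:.2f} deploy={ready_to_deploy}")
```

Keeping the test rows out of training is what makes the accuracy figure a fair estimate of how the model would behave on data it has not seen.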
Benefits of Data Science in the Oil and Gas Industry
Here are a few high-level examples of how the Oil and Gas industry can benefit from Data Science:
- Exploration and discovery – Seismic data and geological data, such as rock types in nearby wells, can be used to predict oil pockets.
- Production accounting – Production data can be linked with alarms.
- Drilling and completions – Predictive analytics can make use of geological, completion and drilling data to determine preferred drilling locations.
- Equipment maintenance – Real-time streaming data from rigs can be compared with historical drilling data to help predict and prevent problems and better understand operational risks.
These examples demonstrate the operational goal of Data Science in Oil and Gas: to continuously maximize the life cycle value of Oil and Gas assets through real-time monitoring, continuous updating of predictive models with the latest data and continuous optimization of multiple long- and short-term decisions.
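To make the equipment maintenance example more concrete, here is a minimal sketch of comparing streaming readings with a historical baseline. It flags any reading that sits more than three standard deviations from the historical mean, which is one simple form of anomaly detection among many. The pressure values, the threshold of 3.0 and the function name `find_anomalies` are assumptions made for this illustration.

```python
from statistics import mean, stdev

def find_anomalies(history, stream, threshold=3.0):
    """Flag streamed readings that deviate strongly from the historical baseline.

    Returns (index, value) pairs for readings whose z-score against the
    historical mean and sample standard deviation exceeds the threshold.
    """
    baseline = mean(history)
    spread = stdev(history)
    flagged = []
    for index, value in enumerate(stream):
        z_score = (value - baseline) / spread
        if abs(z_score) > threshold:
            flagged.append((index, value))
    return flagged

# Hypothetical historical pump pressure readings under normal operation.
history = [101.0, 99.5, 100.2, 98.8, 100.9, 99.7, 100.4, 99.9]

# Hypothetical readings arriving from the rig.
stream = [100.1, 99.6, 104.9, 100.3, 92.0]

alerts = find_anomalies(history, stream)
for index, value in alerts:
    print(f"reading {index} looks unusual: {value}")
```

In practice the baseline would be refreshed as new data arrives, in line with the continuous updating described above.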
As with any technological advancement, there are barriers to the successful use of Data Science, including:
- Taxing computing resources – There may not be enough resources to hold and process large amounts of structured and unstructured data.
- Poor data quality – Data may be maintained in multiple locations and subject to inconsistent governance.
- Incorrect modeling – The right questions may not have been asked or may have been misunderstood.
- Intransigent corporate culture – C-suite support is imperative from the get-go. Communication between collaborators, SMEs and data scientists is critical.
- Talent gaps – Data Science and data engineering talent is new to the Oil and Gas industry. These skillsets are still developing, and it can be difficult to assemble the right team.
All things said, we are living in a historic period of explosive growth for the Oil and Gas industry, with mind-boggling growth in both the production of hydrocarbons and digital data. Data Science and the new and emerging technologies around it enable the discovery of new opportunities, more efficient workflows, increased safety and significant reductions in operational cost.
As the Oil and Gas industry grows and becomes receptive to big data and the use of Data Science, it can only move forward. Huge volumes of unused and undervalued data that are simply stored have little worth. For data to be a true asset, it must be identified, aggregated, stored, analyzed and perfected. This ability to draw insights from large datasets can make the Oil and Gas industry more profitable and efficient.
About the Author
Charity Queret is a senior consultant at Stonebridge Consulting. Charity has over 20 years of experience in designing and developing end-to-end Business Intelligence and data warehousing solutions. Her data management expertise includes Business Intelligence services, such as Cognos and Crystal development, requirements gathering, data verification, data mapping and testing. She also provides documentation of existing systems, user manuals and training and BI roadmaps for future development.