Replication data for: Assessment of Data-Driven Techniques for Flow Rate Predictions in Sub-sea Oil Production

Version 1.0

Neville Aloysius D’Souza; Pfeiffer, Carlos; Mirlekar, Gaurav, 2026, "Replication data for: Assessment of Data-Driven Techniques for Flow Rate Predictions in Sub-sea Oil Production", https://doi.org/10.18710/KIJEWJ, DataverseNO, V1

Description	The data set consists of simulated time-series measurements from two gas-lifted subsea oil wells, used to develop and evaluate data-driven virtual flow metering (VFM) models for oil and gas flow rate prediction.

Purpose: To assess ten machine learning algorithms (including LSTM, MLP, XGBoost, SVR, tree-based and linear methods) for predicting multiphase flow rates in subsea oil production, and to identify which give the lowest prediction error. The study also examines the impact of measurement noise, the effect of noise filtering (median filter), and the quantification of prediction uncertainty (via 95% confidence intervals in XGBoost) in a VFM context.

Scope: Two wells (Well 1 and Well 2) are considered, each represented by an open-loop simulation model of a gas-lifted oil well derived from Janatian et al. (2022). For each well, 5 762 samples of process data are generated and split into 70% training and 30% test sets using a time-series split. Key input variables include bottom-hole and wellhead pressures and temperatures plus choke opening, with oil and gas flow rates as targets. The study covers the full workflow: data collection from the simulator, preprocessing (scaling, time-series splitting, noise injection and filtering), model training and hyperparameter tuning, performance comparison via MAPE, and uncertainty quantification.

Nature of the data: Synthetic, model-generated process data rather than field measurements. The data come from a validated dynamic model of gas-lifted wells, not directly from a physical asset. They are multivariate time-series data at sample-level resolution, comprising sensor-like inputs (pressures, temperatures, choke openings) and the corresponding oil and gas flow rates over time for each well. The data are used primarily as a benchmarking set for supervised learning: different regression algorithms are trained and tested on identical data to compare prediction accuracy, robustness to impulse noise, and the effect of noise reduction and uncertainty quantification techniques.
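The preprocessing and evaluation steps named in the description (chronological 70/30 split, impulse noise injection, median filtering, MAPE) can be illustrated with a minimal, standard-library-only sketch. This is not the code from the deposited notebooks: the function names, the ramp-shaped placeholder signal, the noise probability and magnitude, and the filter window of 5 are all assumptions chosen for illustration, and the exact rounding of the 70% cut in the original study is not stated.

```python
import random
from statistics import median


def chronological_split(samples, train_fraction=0.7):
    """Split an ordered sequence without shuffling: earliest part trains, latest part tests."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def add_impulse_noise(signal, probability, magnitude, rng):
    """Add a spike of +magnitude or -magnitude to a random subset of samples."""
    noisy = list(signal)
    for i in range(len(noisy)):
        if rng.random() < probability:
            noisy[i] += magnitude if rng.random() < 0.5 else -magnitude
    return noisy


def median_filter(signal, window=5):
    """Sliding-window median; the window is truncated at both ends of the signal."""
    half = window // 2
    return [median(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Placeholder signal standing in for one simulated measurement (assumption, not real data).
n_samples = 5762  # sample count per well, as stated in the description
clean = [100.0 + 0.01 * k for k in range(n_samples)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = add_impulse_noise(clean, probability=0.02, magnitude=50.0, rng=rng)
filtered = median_filter(noisy, window=5)

train, test = chronological_split(clean)
print("train/test sizes:", len(train), len(test))
print("MAPE of noisy signal vs clean:", mape(clean, noisy))
print("MAPE of filtered signal vs clean:", mape(clean, filtered))
```

A median filter suits impulse noise because an isolated spike never becomes the median of its window, whereas a moving average would smear the spike across neighbouring samples.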
Subject	Engineering; Computer and Information Science; Mathematical Sciences
Keyword	Machine learning techniques, Data-driven estimations, Uncertainty quantification, Measurement noise, Oil production
License/Data Use Agreement	CC0 1.0

Files (9)

File	Type	Size	Published	MD5
00_ReadMe.txt	Plain Text	4.1 KB	Feb 24, 2026	d876283a934a820f7703dd26e950e133
calculate_output.m	MATLAB Source Code	459 B	Feb 24, 2026	bfc538b18e25ef6c92be4f828244e4a7
make_w_gc.m	MATLAB Source Code	1.1 KB	Feb 24, 2026	61bda33eb19fff36c06dc94044f2e65a
ML_dev_part1.ipynb	Jupyter Notebook	255.9 KB	Feb 24, 2026	15e085e50a122a36c8cc28635f956db7
ML_dev_part2.ipynb	Jupyter Notebook	240.4 KB	Feb 24, 2026	aec99d2af4b73cb120944552410a6f16
oil_field_model.m	MATLAB Source Code	4.2 KB	Feb 24, 2026	3afdf981934adb3916a638c92b3e0ef4
openloop_simulation.m	MATLAB Source Code	4.4 KB	Feb 24, 2026	ab8287accb12c0c577c099c750b6910e
Readme.md	Markdown Text	155 B	Feb 24, 2026	0c05268200c7ad6592a1f7d9edf5d792
update_states.m	MATLAB Source Code	319 B	Feb 24, 2026	22c8cc35407cf3e6460003bceedc8911
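
The MD5 values listed for each file can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_downloads` and the assumption that all files sit in one local folder are illustrative, not part of the dataset; the checksums are copied from the file list on this page.

```python
import hashlib
from pathlib import Path

# Checksums as published on the dataset page.
EXPECTED_MD5 = {
    "00_ReadMe.txt": "d876283a934a820f7703dd26e950e133",
    "calculate_output.m": "bfc538b18e25ef6c92be4f828244e4a7",
    "make_w_gc.m": "61bda33eb19fff36c06dc94044f2e65a",
    "ML_dev_part1.ipynb": "15e085e50a122a36c8cc28635f956db7",
    "ML_dev_part2.ipynb": "aec99d2af4b73cb120944552410a6f16",
    "oil_field_model.m": "3afdf981934adb3916a638c92b3e0ef4",
    "openloop_simulation.m": "ab8287accb12c0c577c099c750b6910e",
    "Readme.md": "0c05268200c7ad6592a1f7d9edf5d792",
    "update_states.m": "22c8cc35407cf3e6460003bceedc8911",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(folder, expected=EXPECTED_MD5):
    """Map each filename to True (match), False (mismatch) or None (file missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(folder) / name
        results[name] = (md5_of_file(path) == checksum) if path.exists() else None
    return results


if __name__ == "__main__":
    # Hypothetical usage: check files saved in the current directory.
    for name, status in verify_downloads(".").items():
        print(name, {True: "OK", False: "MISMATCH", None: "missing"}[status])
```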

Citation Metadata

Persistent Identifier	doi:10.18710/KIJEWJ
Publication Date	2026-02-24
Title	Replication data for: Assessment of Data-Driven Techniques for Flow Rate Predictions in Sub-sea Oil Production
Author	Neville Aloysius D’Souza (https://ror.org/05ecg5h20); University of South-Eastern Norway (https://orcid.org/0000-0002-2265-6495); University of South-Eastern Norway (https://orcid.org/0000-0002-4070-0795)
Point of Contact	Mirlekar, Gaurav (University of South-Eastern Norway)
Subject	Engineering; Computer and Information Science; Mathematical Sciences
Keyword	Machine learning techniques, Data-driven estimations, Uncertainty quantification, Measurement noise, Oil production
Related Publication	doi:10.3384/ecp212.014 (https://doi.org/10.3384/ecp212.014)
Producer	University of South-Eastern Norway (USN) https://www.usn.no/english/
Distributor	University of South-Eastern Norway (USN) https://dataverse.no/dataverse/usn
Depositor	Mirlekar, Gaurav
Deposit Date	2026-02-21

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Creative Commons CC0 1.0 Universal Public Domain Dedication (CC0 1.0).
