| Title | Historical Data Prediction for Geothermal Systems Using Data-Driven Modelling |
|---|---|
| Authors | Daniel CLARK, Michael TEREKHIN, Andreas KEMPA-LIEHR, John O'SULLIVAN, Michael O'SULLIVAN, Michael GRAVATT |
| Year | 2025 |
| Conference | Stanford Geothermal Workshop |
| Keywords | bayesian, optimisation, production, data-inference |
| Abstract | Geothermal reservoir modelling requires detailed historical extraction data to predict a geothermal system's future state accurately. However, this data is rarely available to modellers. Often, wells are sparsely measured, while grouped measurements, such as mass flow at a separator, occur more frequently. Previously, this data history was manually estimated, leading to an inaccurate and subjective dataset with unquantifiable uncertainty. This is a serious problem for geothermal reservoir models relying on accurate, well-by-well data to generate predictions. This paper outlines the process undertaken to develop methods of predicting this unknown data history and introduces two data-driven models to address the issue. The first method is an optimal Tikhonov-regularized Linear Least Squares Optimization (TRLLSO) model, which is a computationally efficient and accurate way to predict a geothermal system's historical mass extraction data objectively. This method relies on fitting an arbitrarily high-order polynomial to each well's mass production. The approach calculates the weights of the terms in the polynomial such that the misfit to the data is minimized. The uncertainty of model predictions can be quantified through Monte Carlo uncertainty propagation. The second method uses Gaussian Process Regression (GPR) to solve this sparse data problem. GPR is a Bayesian approach that assumes a time correlation between dense data points based on a kernel function while respecting the sparse data for each well. This approach was modified so that the sum of wells also respected the dense time history data measured at the separator. Both approaches were tested on synthetic data and data from an operational geothermal field. The results of these methods are compared in this paper, and both show merit in providing solutions to this problem. |
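
The abstract does not give the exact TRLLSO formulation, so the following is only a minimal sketch of the general idea under stated assumptions: each well's mass flow is a polynomial in time, dense separator measurements constrain the sum of all wells, sparse measurements constrain individual wells, and a ridge (Tikhonov) penalty `lam` on the polynomial weights is added to the least-squares misfit. The function names, the synthetic two-well data, and the parameters `degree` and `lam` are hypothetical and not taken from the paper.

```python
import numpy as np

def poly_basis(t, degree):
    """Vandermonde matrix with columns 1, t, t^2, ..., t^degree."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

def fit_wells(t_dense, total, sparse_obs, n_wells, degree=2, lam=1e-6):
    """Fit one polynomial per well (assumed formulation, not the paper's).

    sparse_obs is a list of (well_index, time, mass_flow) tuples.
    Returns an array of weights with shape (n_wells, degree + 1).
    """
    p = degree + 1
    # Dense rows: the sum of all well polynomials should match the separator total.
    A_dense = np.hstack([poly_basis(t_dense, degree)] * n_wells)
    # Sparse rows: a single well's polynomial should match its own measurement.
    A_sparse = np.zeros((len(sparse_obs), n_wells * p))
    b_sparse = np.zeros(len(sparse_obs))
    for row, (k, t, q) in enumerate(sparse_obs):
        A_sparse[row, k * p:(k + 1) * p] = poly_basis([t], degree)[0]
        b_sparse[row] = q
    A = np.vstack([A_dense, A_sparse])
    b = np.concatenate([total, b_sparse])
    # Tikhonov regularization: minimize ||A w - b||^2 + lam * ||w||^2,
    # solved here as an augmented ordinary least-squares problem.
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n_wells * p)])
    b_aug = np.concatenate([b, np.zeros(n_wells * p)])
    w, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return w.reshape(n_wells, p)

def predict(weights, t):
    """Per-well predictions, shape (len(t), n_wells)."""
    return poly_basis(t, weights.shape[1] - 1) @ weights.T

# Synthetic, noise-free example with two hypothetical wells.
t_dense = np.linspace(0.0, 1.0, 50)
well0 = 10 + 2 * t_dense
well1 = 5 - t_dense + 3 * t_dense**2
total = well0 + well1  # dense "separator" record
sparse_obs = [(0, t, 10 + 2 * t) for t in (0.1, 0.5, 0.9)]
sparse_obs += [(1, t, 5 - t + 3 * t**2) for t in (0.2, 0.6, 0.8)]

weights = fit_wells(t_dense, total, sparse_obs, n_wells=2)
pred = predict(weights, t_dense)
```

In this sketch, uncertainty could be explored in the Monte Carlo spirit the abstract mentions by perturbing `total` and `sparse_obs` with sampled noise and refitting, though how the paper performs that propagation is not stated here.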