Efficient estimation of natural gas compressibility factor using a rigorous method
Amir Fayazi a, Milad Arabloo a, Amir H. Mohammadi b,c,*
a Department of Petroleum Engineering, Petroleum University of Technology, Ahwaz, Iran
b Institut de Recherche en Génie Chimique et Pétrolier (IRGCP), Paris Cedex, France
c Thermodynamics Research Unit, School of Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa
Article info
Article history:
Received 3 August 2013
Received in revised form 6 October 2013
Accepted 28 October 2013
Available online
Keywords:
Natural gas
Compressibility factor
Least square support vector machine
Sour gas
Abstract
The compressibility factor (Z-factor) of natural gases is required in many gas reservoir engineering calculations. Accurate determination of this parameter is crucial and remains a challenge for many of the simulators used in petroleum engineering. Although numerous studies on prediction of the gas compressibility factor have been reported, its accurate prediction is still a topic of debate in the literature. To address this, a soft computing approach, namely least square support vector machine (LSSVM) modeling optimized with the coupled simulated annealing technique, is implemented. The model is developed and tested using a large database consisting of more than 2200 samples of sour and sweet gas compositions. The developed model predicts the natural gas compressibility factor as a function of the gas composition (mole percent of C1-C7+, H2S, CO2, and N2), molecular weight of the C7+ fraction, pressure and temperature. The Z-factor values calculated by the developed model are also compared with predictions of other well-known empirical correlations. Statistical error analysis shows that the developed LSSVM model outperforms all of the predictive models considered, with an average absolute relative error of 0.19% and a correlation coefficient of 0.999. The results of the present study show that implementation of LSSVM can lead to more accurate and reliable estimation of the natural gas compressibility factor.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
The role of natural gas in meeting the world energy demand has been increasing because of its abundance, versatility, and clean burning (Wang and Economides, 2009). Natural gas often contains some heavier hydrocarbon and non-hydrocarbon components that contribute to its properties. Accurate and reliable estimates of the physical properties of natural gas are important for optimal exploitation and usage. The compressibility factor of natural gases is required in most upstream and downstream petroleum and natural gas engineering calculations, such as gas metering, gas compression, and the design of pipelines and surface facilities (Azizi et al., 2010; Elsharkawy, 2004).
The common sources of Z-factor values are experimental measurements, equations of state (EoS) and empirical correlations. The most reliable way to obtain physical properties is through accurate experimental measurements. However, these experiments are expensive and time-consuming, and it is impossible to measure properties for all possible compositions of natural gases (Ahmed, 2001). When laboratory analyses are not available, empirical correlations and EoS are used to estimate the petroleum fluid properties as a function of the reservoir's readily available characteristics (Ahmed, 1989). Empirical correlations for the natural gas Z-factor are much easier and faster to apply than EoS, and sometimes have comparable accuracy (Elsharkawy, 2004). EoS, in contrast, are more complex, involve a large number of parameters, and require longer and more complicated computations.
The recent success of support vector machine modeling in solving various difficult engineering problems has drawn attention to its potential applications in the petroleum industry (Arabloo et al., 2013; Farasat et al., 2013; Shokrollahi et al., 2013). This study presents a new compositional model for estimation of the gas compressibility factor based on the support vector machine modeling approach. A total of 2249 data points for a variety of natural gases, ranging from lean and sweet gases to rich and acid or sour gases (containing H2S and CO2), were collected from the open literature. The performance of the proposed model is compared with five commonly used empirical correlations (Beggs and Brill, 1973; Kumar, 2004; Heidaryan et al., 2010; Azizi et al., 2010; Sanjari and Lay, 2012). Several criteria are used to evaluate the developed model, including the coefficient of determination ($R^2$), average relative error (ARE), average absolute relative error (AARE), and root mean square error (RMSE).
The following section reviews some existing Z-factor estimation techniques. The background of the proposed model and the computational procedure are then discussed in Section 3. The accuracy and validity of the proposed model are examined in Section 4, and the key findings of the present work are summarized in Section 5.
* Corresponding author. Institut de Recherche en Génie Chimique et Pétrolier (IRGCP), Paris Cedex, France.
E-mail address: a.h.m@irgcp.fr (A.H. Mohammadi).
Contents lists available at ScienceDirect
Journal of Natural Gas Science and Engineering 16 (2014) 8-17
journal homepage: www.elsevier.com/locate/jngse
1875-5100/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jngse.2013.10.004
2. Natural gas compressibility factor
The compressibility factor is the ratio of the real volume of a gas to its ideal volume at the same pressure and temperature, and it measures how much the gas deviates from ideal behavior. It is also called the gas deviation factor and is denoted by the symbol Z. Gas properties such as gas volume, density and viscosity can be estimated using the gas deviation factor.
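As a brief worked illustration of how Z enters such estimates, the real-gas law gives the gas density directly from Z. The numerical values below (pressure, temperature, molar mass and Z) are hypothetical and chosen only to show the arithmetic; they are not taken from the data bank of this study.

```latex
\[
Z = \frac{V_{\text{real}}}{V_{\text{ideal}}} = \frac{PV}{nRT}
\quad\Longrightarrow\quad
\rho = \frac{PM}{ZRT}
\]
% Assumed values: P = 10 MPa, T = 300 K, M = 17 g/mol, Z = 0.85
\[
\rho = \frac{(10\times10^{6}\ \text{Pa})\,(0.017\ \text{kg/mol})}
            {(0.85)\,(8.314\ \text{J mol}^{-1}\,\text{K}^{-1})\,(300\ \text{K})}
     \approx 80.2\ \text{kg/m}^3
\]
```

With Z = 1 the same conditions would give about 68.2 kg/m³, so in this assumed case neglecting the deviation factor would underestimate the density by roughly 15%.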
The principle underlying the development of all early correlations for the gas compressibility factor is the law of corresponding states, originally proposed by van der Waals (1873). This law proposes that all gases exhibit the same behavior, e.g. the same Z-factor, when viewed in terms of reduced pressure and reduced temperature. Mathematically, this principle can be expressed as:
$$Z = f(T_r, P_r) \tag{1}$$
By definition,
$$P_r = \frac{P}{P_c} \tag{2}$$
$$T_r = \frac{T}{T_c} \tag{3}$$
where Pc and Tc are the critical pressure and critical temperature of the gas, respectively. Only single-component gases have distinct, single-valued critical pressures and temperatures. A natural gas mixture liquefies over a range of pressures at a given temperature, and a liquid may exist over a range of temperatures at a given pressure, so it is often very difficult to determine the exact critical properties of a natural gas mixture (Calhoun, 1951). Consequently, the petroleum industry has adopted pseudo-critical properties as correlating parameters for natural gas mixtures. If the composition of the gas and the critical properties of the individual components are known, the pseudo-critical properties can be calculated via a mixing rule. Kay (1936), SBV (Stewart et al., 1959), and SSBV (SBV modified by Sutton, 1985) are three mixing rules widely used in the petroleum industry. Otherwise, the pseudo-critical temperature and pressure may be estimated using correlations based on gas specific gravity.
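The simplest of these mixing rules, Kay's rule, takes the pseudo-critical properties as mole-fraction-weighted averages of the component critical properties. The following is a minimal sketch of that rule only; the component critical constants are approximate handbook values and the gas composition and conditions are hypothetical, none of them taken from this paper.

```python
# Kay's (1936) mixing rule: mole-fraction-weighted pseudo-critical properties.
# Critical constants are approximate handbook values (assumption, not from the paper).
CRITICAL = {
    # component: (Tc in K, Pc in MPa)
    "C1": (190.56, 4.599),
    "C2": (305.32, 4.872),
    "CO2": (304.13, 7.377),
    "N2": (126.19, 3.396),
}


def kay_pseudo_critical(mole_fractions):
    """Return (Tpc in K, Ppc in MPa) for a dict of component mole fractions."""
    total = sum(mole_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    t_pc = sum(y * CRITICAL[c][0] for c, y in mole_fractions.items())
    p_pc = sum(y * CRITICAL[c][1] for c, y in mole_fractions.items())
    return t_pc, p_pc


def pseudo_reduced(temperature_k, pressure_mpa, mole_fractions):
    """Return (Tpr, Ppr) from temperature, pressure and composition."""
    t_pc, p_pc = kay_pseudo_critical(mole_fractions)
    return temperature_k / t_pc, pressure_mpa / p_pc


# Hypothetical lean gas at 330 K and 20 MPa.
gas = {"C1": 0.90, "C2": 0.05, "CO2": 0.03, "N2": 0.02}
t_pr, p_pr = pseudo_reduced(330.0, 20.0, gas)
```

The resulting pseudo-reduced pair (Tpr, Ppr) is the input expected by the chart-based correlations reviewed in Section 2.2.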
The compressibility factor of a natural gas at a given pressure and temperature can therefore be obtained from the pseudo-reduced pressure and temperature by using either an EoS or an experimental chart. For natural hydrocarbon gases in particular, the charts of Standing and Katz (1942) and Katz et al. (1959) are standards in the oil and gas industry. Several attempts have been made to fit the Standing-Katz chart mathematically (Dranchuk and Abou-Kassem, 1975; Hall and Iglesias-Silva, 2007; Heidaryan et al., 2010; Londono et al., 2005). However, these charts were prepared for binary mixtures of low molecular weight sweet gases.
2.1. Equations of state
Several forms of EoS have been presented to the petroleum industry to calculate hydrocarbon reservoir fluid properties. Volumetric behavior is calculated by solving the cubic equation, usually expressed in terms of Z:
$$Z^3 + A_1 Z^2 + A_2 Z + A_3 = 0 \tag{4}$$
where the constants $A_1$, $A_2$ and $A_3$ are functions of pressure, temperature and phase composition. The most widely used EoS are those of Soave-Redlich-Kwong (Soave, 1972) and Peng and Robinson (1976).
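Eq. (4) is generic. As one concrete instance, the sketch below fills in the coefficients with the standard textbook Peng-Robinson expressions for a pure component and finds the vapour-like root by Newton iteration from the ideal-gas guess Z = 1. The coefficient forms, the methane constants and the choice of Newton's method are assumptions made for illustration; the paper does not give them, and a mixture calculation would additionally need mixing rules and binary interaction parameters.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def pr_z_factor(temperature_k, pressure_pa, tc_k, pc_pa, omega, tol=1e-12):
    """Vapour-like Z root of the Peng-Robinson cubic for a pure component."""
    tr = temperature_k / tc_k
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(tr))) ** 2
    a = 0.45724 * R**2 * tc_k**2 / pc_pa * alpha
    b = 0.07780 * R * tc_k / pc_pa
    big_a = a * pressure_pa / (R * temperature_k) ** 2
    big_b = b * pressure_pa / (R * temperature_k)
    # Coefficients of Z^3 + A1 Z^2 + A2 Z + A3 = 0, i.e. the form of Eq. (4).
    a1 = -(1.0 - big_b)
    a2 = big_a - 3.0 * big_b**2 - 2.0 * big_b
    a3 = -(big_a * big_b - big_b**2 - big_b**3)
    z = 1.0  # ideal-gas starting guess
    for _ in range(100):
        f = z**3 + a1 * z**2 + a2 * z + a3
        df = 3.0 * z**2 + 2.0 * a1 * z + a2
        step = f / df
        z -= step
        if abs(step) < tol:
            return z
    raise RuntimeError("Newton iteration did not converge")


# Methane at 300 K and 10 MPa (Tc, Pc, omega are approximate handbook values).
z_methane = pr_z_factor(300.0, 10.0e6, 190.56, 4.599e6, 0.011)
```

Starting from Z = 1 is adequate for a single-phase gas well above its critical temperature, as here; near or inside the two-phase region all real roots should be found and the physically relevant one selected.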
2.2. Empirical correlations
The critical properties and acentric factors of the plus-fraction components, as well as the binary interaction parameters required in EoS calculations, are often poorly known. This has led to the use of empirical correlations, which simplify the computations and are more user-friendly. This section reviews several widely used empirical correlations.
2.2.1. Beggs and Brill (1973)
Beggs and Brill (1973) introduced an equation generated from the Standing and Katz (1942) Z-factor chart. This correlation is a function of pseudo-reduced pressure and temperature. Their proposed equation is as follows:
$$Z = A + (1 - A)\exp(-B) + C P_{pr}^{D} \tag{5}$$
where
$$A = 1.39\,(T_{pr} - 0.92)^{0.5} - 0.36\,T_{pr} - 0.101 \tag{6}$$
$$B = (0.62 - 0.23\,T_{pr})\,P_{pr} + \left(\frac{0.066}{T_{pr} - 0.86} - 0.037\right) P_{pr}^{2} + \frac{0.32}{10^{9(T_{pr} - 1)}}\,P_{pr}^{6} \tag{7}$$
$$C = 0.132 - 0.32\,\log(T_{pr}) \tag{8}$$
$$D = 10^{\left(0.3106 - 0.49\,T_{pr} + 0.1824\,T_{pr}^{2}\right)} \tag{9}$$
This correlation is not recommended for pseudo-reduced temperatures ($T_{pr}$) below 0.92.
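Because Eqs. (5)-(9) are explicit, the correlation can be evaluated without iteration. The sketch below is a direct transcription of those equations; it assumes that "log" in Eq. (8) denotes the base-10 logarithm, which the text does not state, and the function name is illustrative.

```python
import math


def z_beggs_brill(t_pr, p_pr):
    """Z-factor from Eqs. (5)-(9); log in Eq. (8) is assumed to be base 10."""
    if t_pr < 0.92:
        raise ValueError("correlation is not recommended for Tpr < 0.92")
    a = 1.39 * math.sqrt(t_pr - 0.92) - 0.36 * t_pr - 0.101
    b = (
        (0.62 - 0.23 * t_pr) * p_pr
        + (0.066 / (t_pr - 0.86) - 0.037) * p_pr**2
        + 0.32 / 10 ** (9.0 * (t_pr - 1.0)) * p_pr**6
    )
    c = 0.132 - 0.32 * math.log10(t_pr)
    d = 10 ** (0.3106 - 0.49 * t_pr + 0.1824 * t_pr**2)
    return a + (1.0 - a) * math.exp(-b) + c * p_pr**d


z_example = z_beggs_brill(1.5, 2.0)  # roughly 0.82
```

A quick sanity check on the transcription is the low-pressure limit: as Ppr approaches zero, B and the last term vanish, so Z tends to A + (1 - A) = 1, the ideal-gas value.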
2.2.2. Shell oil company
Kumar (2004) referenced the Shell Oil Company model for estimation of the Z-factor as:
$$Z = A + B P_{pr} + (1 - A)\exp(-C) - D\left(\frac{P_{pr}}{10}\right)^{4} \tag{10}$$
where
$$A = -0.101 - 0.36\,T_{pr} + 1.3868\sqrt{T_{pr} - 0.919} \tag{11}$$
$$B = 0.021 + \frac{0.04275}{T_{pr} - 0.65} \tag{12}$$
$$C = P_{pr}\left(E + F P_{pr} + G P_{pr}^{4}\right) \tag{13}$$
$$D = 0.122\exp\left(-11.3\,(T_{pr} - 1)\right) \tag{14}$$
$$E = 0.6222 - 0.224\,T_{pr} \tag{15}$$
$$F = \frac{0.0657}{T_{pr} - 0.85} - 0.037 \tag{16}$$
$$G = 0.32\exp\left(-19.53\,(T_{pr} - 1)\right) \tag{17}$$
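Eqs. (10)-(17) can be transcribed in the same explicit way. The sketch below does so and sweeps pressure along one isotherm; the function name and the lower Tpr guard (taken from the square root in Eq. (11)) are illustrative choices, not part of the cited model description.

```python
import math


def z_shell(t_pr, p_pr):
    """Z-factor from the Shell Oil Company model, Eqs. (10)-(17)."""
    if t_pr < 0.919:
        raise ValueError("Eq. (11) requires Tpr >= 0.919")
    a = -0.101 - 0.36 * t_pr + 1.3868 * math.sqrt(t_pr - 0.919)
    b = 0.021 + 0.04275 / (t_pr - 0.65)
    e = 0.6222 - 0.224 * t_pr
    f = 0.0657 / (t_pr - 0.85) - 0.037
    g = 0.32 * math.exp(-19.53 * (t_pr - 1.0))
    c = p_pr * (e + f * p_pr + g * p_pr**4)
    d = 0.122 * math.exp(-11.3 * (t_pr - 1.0))
    return a + b * p_pr + (1.0 - a) * math.exp(-c) - d * (p_pr / 10.0) ** 4


# Z along the Tpr = 1.5 isotherm for a few pseudo-reduced pressures.
isotherm = [(p_pr, z_shell(1.5, p_pr)) for p_pr in (0.0, 1.0, 2.0, 4.0, 8.0)]
```

At Ppr = 0 the expression reduces exactly to Z = 1, and at Tpr = 1.5, Ppr = 2 it gives a value close to that of the Beggs and Brill equation, as expected for two fits of the same chart.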
2.2.3. Heidaryan et al. (2010)
Heidaryan et al. (2010) carried out a multiple regression analysis on 1220 data points in the range $0.2 \le P_{pr} \le 15$ and $1.2 \le T_{pr} \le 3$ to develop a correlation. Their proposed Z-factor correlation has absolute average errors of 0.40% and 1.37% versus the Standing and Katz (1942) chart and experimental data, respectively. This correlation is given by:
$$Z = \ln\left(\frac{A_1 + A_3 \ln(P_{pr}) + \dfrac{A_5}{T_{pr}} + A_7 (\ln P_{pr})^2 + \dfrac{A_9}{T_{pr}^2} + \dfrac{A_{11}}{T_{pr}} \ln(P_{pr})}{1 + A_2 \ln(P_{pr}) + \dfrac{A_4}{T_{pr}} + A_6 (\ln P_{pr})^2 + \dfrac{A_8}{T_{pr}^2} + \dfrac{A_{10}}{T_{pr}} \ln(P_{pr})}\right) \tag{18}$$
2.2.4. Azizi et al. (2010)
Azizi et al. (2010) developed a model based on a linear genetic programming approach to estimate the compressibility factor of sweet gases over the range $0.2 \le P_{pr} \le 11$ (217 $P_{pr}$ values) and $1.1 \le T_{pr} \le 2$ (14 $T_{pr}$ values) as:
$$Z = A + \frac{B + C}{D + E} \tag{19}$$
where
$$A = a\,T_{pr}^{2.16} + b\,P_{pr}^{1.028} + c\,P_{pr}^{1.58}\,T_{pr}^{-2.1} + d\,\ln(T_{pr})^{-0.5} \tag{20}$$
$$B = e + f\,T_{pr}^{2.4} + g\,P_{pr}^{1.56} + h\,P_{pr}^{0.124}\,T_{pr}^{3.033} \tag{21}$$
$$C = i\,\ln(T_{pr})^{-1.28} + j\,\ln(T_{pr})^{1.37} + k\,\ln(P_{pr}) + l\,\ln(P_{pr})^{2} + m\,\ln(P_{pr})\ln(T_{pr}) \tag{22}$$
$$D = 1 + n\,T_{pr}^{5.55} + o\,P_{pr}^{0.68}\,T_{pr}^{0.33} \tag{23}$$
$$E = p\,\ln(T_{pr})^{1.18} + q\,\ln(T_{pr})^{2.1} + r\,\ln(P_{pr}) + s\,\ln(P_{pr})^{2} + t\,\ln(P_{pr})\ln(T_{pr}) \tag{24}$$
2.2.5. Sanjari and Lay (2012)
Using multiple regression analysis, Sanjari and Lay (2012) developed an empirical correlation based on the virial equation of state within the ranges $1.01 \le T_{pr} \le 3$ and $0.01 \le P_{pr} \le 15$. It divides the pressure region into two sections, resulting in two sets of coefficients for $0.01 \le P_{pr} \le 3$ and $3 \le P_{pr} \le 15$. This model (Eq. (25)) has two independent variables ($T_{pr}$ and $P_{pr}$) and eight fitted coefficients ($A_1$ to $A_8$):
$$Z = 1 + A_1 P_{pr} + A_2 P_{pr}^{2} + \frac{A_3 P_{pr}^{A_4}}{T_{pr}^{A_5}} + \frac{A_6 P_{pr}^{(A_4 + 1)}}{T_{pr}^{A_7}} + \frac{A_8 P_{pr}^{(A_4 + 2)}}{T_{pr}^{(A_7 + 1)}} \tag{25}$$
The application range of some empirical correlations is limited to the experimental conditions used to build them, and they fail badly close to and beyond these limits. Also, some correlations, such as that of Dranchuk and Abou-Kassem (1975), require an iterative procedure to obtain the Z-factor and may even give different results depending on the initial guess.
Therefore, the main objective of this study is to present a reliable
predictive compositional model based on Least Squares Support
Vector Machine (LSSVM) (Suykens and Vandewalle, 1999)
modeling approach to predict gas compressibility factor without
the need for estimation of critical properties, acentric factors of
plus-fraction components, and the binary interaction parameters.
3. Support vector machine (SVM) model
3.1. Background
The SVM is a relatively recent supervised machine learning technique based on statistical learning theory (Cortes and Vapnik, 1995; Suykens and Vandewalle, 1999; Vapnik, 2000). It has been studied extensively for both classification and regression analysis (Amendolia et al., 2003; Baylar et al., 2009; Chen et al., 2011; Rafiee-Taghanaki et al., 2013; Shokrollahi et al., 2013; Übeyli, 2010). The SVM algorithm builds a separating hyper-surface in the input space. This process is performed as follows (Amendolia et al., 2003; Bazzani et al., 2001; Cortes and Vapnik, 1995; Suykens et al., 2002; Suykens and Vandewalle, 1999):
1) It maps the input patterns into a higher dimensional feature space through nonlinear mapping.
2) It builds a separating hyper-plane with maximum margin.
Consider a given training sample $\{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ with input data $x_i \in \mathbb{R}^n$ and output data $y_i \in \mathbb{R}$, with class labels $-1$ and $+1$ for classes 1 and 2, respectively. If this data sample is linearly separable in the feature space, then the following model can be constructed:
$$y = w^T \Phi(x) + b \tag{26}$$
where $\Phi(x)$ represents the nonlinear function that maps x into the feature space, in which the linear model is applied; w and b are the weight vector and bias term, respectively. When the data of the two classes are separable, one can say:
$$\begin{cases} w^T \Phi(x_k) + b \ge +1 & \text{if } y_k = +1 \\ w^T \Phi(x_k) + b \le -1 & \text{if } y_k = -1 \end{cases} \tag{27}$$
which is equivalent to:
$$y_k\left[w^T \Phi(x_k) + b\right] \ge +1, \quad k = 1, 2, \ldots, N \tag{28}$$
The extension of linear SVMs to the non-separable case was also made by Cortes and Vapnik (1995). It is done by introducing additional slack variables into Eq. (28) as follows:
$$y_k\left[w^T \Phi(x_k) + b\right] \ge 1 - \zeta_k, \quad k = 1, 2, \ldots, N \tag{29}$$
$$\zeta_k \ge 0, \quad k = 1, \ldots, N \tag{30}$$
The generalized optimal separating hyper-plane is determined by the vector w that minimizes the cost function:
$$\text{Cost}(w, \zeta) = \frac{1}{2} w^T w + \frac{C}{2} \sum_{i=1}^{N} \zeta_i^{p} \tag{31}$$
subject to the constraints:
$$y_k\left[w^T \Phi(x_k) + b\right] \ge 1 - \zeta_k, \quad k = 1, 2, \ldots, N \tag{32}$$
where C is a positive real constant that determines the tradeoff between the maximum margin and the minimum classification error (Suykens et al., 2002; Suykens and Vandewalle, 1999; Übeyli, 2010). In the conventional SVM, the optimal separating hyper-plane is obtained by solving the above quadratic programming problem. The solution to the optimization problem of Eq. (31) under the constraints of Eq. (32) is given by the saddle point of the Lagrangian (Minoux, 1986):
$$J(w, b, \alpha, \zeta, \beta) = \frac{1}{2} w^T w + \frac{C}{2} \sum_{i=1}^{N} \zeta_i - \sum_{i=1}^{N} \alpha_i \left( y_i\left[w^T \Phi(x_i) + b\right] - 1 + \zeta_i \right) - \sum_{i=1}^{N} \beta_i \zeta_i \tag{33}$$
where $\alpha$ and $\beta$ are the Lagrange multipliers. A modified version of SVM, the least square SVM (LSSVM), was developed by Suykens and Vandewalle (1999) to reduce the complexity of the SVM model and improve it. In the LSSVM algorithm, the solution is obtained by solving a set of linear equations instead of the quadratic programming problem involved in standard SVM (Suykens et al., 2002; Suykens and Vandewalle, 1999).
In contrast to SVM, the LSSVM is trained by minimizing the cost function defined as follows (Suykens and Vandewalle, 1999):
$$Q(w, \zeta) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} \zeta_i^{2} \tag{34}$$
subject to the constraints (Suykens and Vandewalle, 1999):
$$y_i\left[w^T \Phi(x_i) + b\right] = 1 - \zeta_i, \quad i = 1, 2, \ldots, N \tag{35}$$
In the LSSVM, one works with equality instead of inequality constraints. Therefore, the optimal solution can be obtained by solving a set of linear equations instead of a quadratic programming problem (Suykens and Vandewalle, 1999). To derive the dual problem for the LSSVM non-linear classification problem, the Lagrange function is defined as:
$$L(w, b, \zeta, \alpha) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} \zeta_i^{2} - \sum_{i=1}^{N} \alpha_i \left\{ y_i\left[w^T \Phi(x_i) + b\right] - 1 + \zeta_i \right\} \tag{36}$$
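where the $\alpha_i$ are Lagrange multipliers, which can be positive or negative in the LSSVM formulation. The conditions for optimality of the above function yield (Suykens et al., 2002):
$$\begin{cases}
\dfrac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i \Phi(x_i) \\[2mm]
\dfrac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \\[2mm]
\dfrac{\partial L}{\partial \zeta_i} = 0 \;\Rightarrow\; \alpha_i = \gamma \zeta_i, \quad i = 1, \ldots, N \\[2mm]
\dfrac{\partial L}{\partial \alpha_i} = 0 \;\Rightarrow\; y_i\left[w^T \Phi(x_i) + b\right] = 1 - \zeta_i, \quad i = 1, \ldots, N
\end{cases} \tag{37}$$
By defining $Y = [y_1, \ldots, y_N]$, $1_N = [1, \ldots, 1]$, $\zeta = [\zeta_1, \ldots, \zeta_N]$, $\alpha = [\alpha_1, \ldots, \alpha_N]$ and eliminating w and $\zeta$, the following Karush-Kuhn-Tucker system is obtained (Suykens et al., 2002; Suykens and Vandewalle, 1999):
$$\begin{bmatrix} 0 & 1_N^T \\ 1_N & \Omega + \gamma^{-1} I_N \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ Y \end{bmatrix} \tag{38}$$
where $I_N$ is an $N \times N$ identity matrix, and $\Omega \in \mathbb{R}^{N \times N}$ is the kernel matrix defined by:
$$\Omega_{ij} = \Phi(x_i)^T \Phi(x_j) = K(x_i, x_j) \tag{39}$$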
For LSSVM, many kernel functions are available, including linear, polynomial, spline, radial basis function (RBF) and sigmoid kernels (Gunn, 1998; Muller et al., 2001). The most widely used kernel functions are the RBF (Eq. (40)) and the polynomial (Eq. (41)):
$$K(x_i, x_j) = \exp\left(-\frac{\left\|x_i - x_j\right\|^{2}}{\sigma^{2}}\right) \tag{40}$$
$$K(x_i, x_j) = \left(1 + \frac{x_i^T x_j}{c}\right)^{d} \tag{41}$$
where $\sigma^2$ is the squared bandwidth of the Gaussian function and d is the polynomial degree, both of which should be optimized by the user to obtain the support vector.
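To make the training step concrete, the sketch below assembles the linear system of Eq. (38) with the RBF kernel of Eq. (40) and solves it for the bias b and the multipliers. It is a minimal illustration and not the authors' implementation: the toy one-dimensional data, the values of gamma and sigma squared, and the small Gaussian-elimination solver are assumptions, and the prediction formula f(x) = sum of alpha_i K(x, x_i) + b is the standard LS-SVM function-estimation form, which the text above does not write out.

```python
import math


def rbf_kernel(x, z, sigma2):
    """RBF kernel of Eq. (40): exp(-||x - z||^2 / sigma^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sigma2)


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol


def lssvm_fit(xs, ys, gamma, sigma2):
    """Assemble and solve the (N+1) x (N+1) system of Eq. (38)."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]  # first row: [0, 1_N^T]
    for i in range(n):
        row = [1.0] + [rbf_kernel(xs[i], xs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # adds gamma^-1 I_N on the diagonal
        system.append(row)
    solution = solve_linear(system, [0.0] + list(ys))
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(x, xs, alphas, bias, sigma2):
    """Standard LS-SVM estimate: sum_i alpha_i K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma2) for a, xi in zip(alphas, xs)) + bias


# Toy data (hypothetical): five samples of sin(x); gamma and sigma2 are arbitrary.
xs = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
ys = [math.sin(x[0]) for x in xs]
gamma, sigma2 = 1.0e4, 1.0
bias, alphas = lssvm_fit(xs, ys, gamma, sigma2)
fitted = [lssvm_predict(x, xs, alphas, bias, sigma2) for x in xs]
```

The system size grows with the number of training points, so a real data bank of a few thousand samples would call for a numerical linear algebra library rather than this hand-written solver.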
Table 1
Statistical description of the data bank used for modeling.
Property          Max.     Min.     Avg.     SD
N2, mole %        20.00    0.00     2.39     5.03
CO2, mole %       40.16    0.00     3.04     7.14
H2S, mole %       22.60    0.00     1.57     4.47
C1, mole %        99.50    30.64    84.27    13.28
C2, mole %        35.33    0.00     5.78     7.22
C3, mole %        20.69    0.00     2.07     4.41
i-C4, mole %      5.87     0.00     0.25     0.75
n-C4, mole %      3.76     0.00     0.17     0.47
i-C5, mole %      0.91     0.00     0.08     0.17
n-C5, mole %      0.66     0.00     0.03     0.08
C6, mole %        1.09     0.00     0.07     0.14
C7+, mole %       1.31     0.00     0.27     0.45
MW C7+            236.71   0        80.95    93.62
Temperature, K    441.80   240.00   334.87   42.16
Pressure, MPa     118.89   0.66     37.88    31.95
Z-factor          2.1927   0.4230   1.0866   0.3354
Gas gravity       1.0817   0.5625   0.6753   0.0034
Table 2
Statistical quality measures of the developed LSSVM model to determine the compressibility factor.
Statistical parameter                      Training set   Validation set   Test set   Total
Coefficient of determination ($R^2$)       0.9999         0.9998           0.9997     0.9999
Average absolute relative error (AARE %)   0.16           0.27             0.26       0.19
Root mean square error (RMSE)              0.0032         0.0052           0.0052     0.0039
Number of experimental data sets           1574           337              338        2249
3.2. Data collection
To carry out this study, a large number of data for a variety of natural gases were collected from the open literature (Buxton and Campbell, 1967; Capla et al., 2002; Chamorro et al., 2006; Li and Guo, 1991; Liu et al., 2013; May et al., 2001; McElroy et al., 2001; McLeod, 1968; Satter and Campbell, 1963; Sun et al., 2012; Yan et al., 2013). The data bank contains 2249 data points for gases ranging from lean and sweet to rich and acid or sour (containing H2S and CO2). The measurements include gas compositions (mole percent of C1-C7+, H2S, CO2, and N2), molecular weight and specific gravity of the C7+ fraction, experimentally measured compressibility factors, pressures and temperatures. A complete statistical description of the data bank is reported in Table 1.
The database was first divided into three sets. The first part, the training set (70% of the main data set), is used for construction and training of the model. The second part, the validation set (15% of the main data set), is used for selecting the optimal parameters of the LSSVM model and for avoiding over-fitting. The remaining data, i.e. the test set, are used to evaluate the capability of the proposed model to predict data not used during model development (Arabloo et al., 2013; Mohammadi et al., 2011). The division of the database into these three sets is performed randomly, so that each subset contains enough representative data over the whole range of operating conditions.
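A random 70/15/15 partition of this kind can be sketched as follows. The function name, the fixed seed and the truncation of fractional counts are illustrative assumptions; the paper does not describe how the random split was implemented.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition sample indices into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded only for reproducibility
    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains (about 15%)
    return train, val, test


train_idx, val_idx, test_idx = split_indices(2249)
```

For 2249 data points this truncation yields subsets of 1574, 337 and 338 points, which matches the set sizes listed in Table 2.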
3.3. Designing the LSSVM model
To build the LSSVM model for precise prediction of the gas compressibility factor, the gas composition (mole percent of C1-C7+, H2S, CO2, and N2), molecular weight of C7+, pressure and temperature are taken as the correlating variables:
$$Z = f\left(y_i, MW_{C_{7+}}, P, T\right), \quad y_i \in \left\{y_{C_1}, y_{C_2}, \ldots, y_{C_6}, y_{C_{7+}}, y_{H_2S}, y_{CO_2}, y_{N_2}\right\} \tag{42}$$
The mean square error (MSE) between the model results and the corresponding experimental values, as defined by Eq. (43), is used as the objective function during model computation:
$$MSE = \frac{\sum_{j=1}^{N} \left(t_j - o_j\right)^{2}}{N} \tag{43}$$
where t and o are the target and estimated values, respectively.
Fig. 1. Comparison between the results of the developed LSSVM model and the experimental data.
Fig. 2. Relative errors of the gas compressibility factor values obtained by the developed model from experimental data base values. [Plot of (Zexp - Zpred)/Zexp × 100 versus experimental Z-factor for the training, validation and test sets.]
4. Results and discussion
4.1. Accuracy of the model
There are generally two parameters in the LSSVM algorithm, $\sigma^2$ and $\gamma$, which must be optimized for the specified problem (Farasat et al., 2013; Hemmati-Sarapardeh et al., 2014; Rafiee-Taghanaki et al., 2013). In this work, the Coupled Simulated Annealing (CSA) optimization technique (Xavier-de-Souza et al., 2010) has been applied, and the optimization procedure has been repeated several times in an attempt to reach the global optimum of the problem. The optimized values of the LSSVM parameters are:
$$\gamma = 6.048046 \times 10^{4}$$
$$\sigma^{2} = 8.647087 \times 10^{-1}$$
Table 2 reports the statistical parameters of the developed model for prediction of the compressibility factor, including the coefficient of determination ($R^2$), average absolute relative error (AARE), and root mean square error (RMSE).
The scatter diagram comparing the model outputs with the experimental values is shown in Fig. 1. The tight cloud of points about the 45° line for the training, validation and testing data sets indicates the robustness of the proposed model.
Fig. 2 shows the relative error distribution for all experimental data points. The results illustrate excellent agreement between the predictions of the LSSVM model and the experimental data. To assess the performance and accuracy of the proposed model against existing correlations, the data sets used to develop the LSSVM model were also used to evaluate the following correlations: Beggs and Brill (BB) (Beggs and Brill, 1973), Shell Oil Company (S) (Kumar, 2004), Heidaryan-Moghadasi-Rahimi (HMR) (Heidaryan et al., 2010), Azizi-Behbahani-Isazadeh (ABI) (Azizi et al., 2010), and Sanjari and Nemati Lay (SN) (Sanjari and Lay, 2012).
Fig. 3. Comparison between the results of the Beggs and Brill (BB) correlation and the experimental data.
Fig. 4. Comparison between the results of the Shell Oil Company (S) correlation and the experimental data.
Fig. 5. Comparison between the results of the Heidaryan-Moghadasi-Rahimi (HMR) correlation and the experimental data.
Fig. 6. Comparison between the results of the Azizi-Behbahani-Isazadeh (ABI) correlation and the experimental data.
[Figs. 3-6 are cross-plots of predicted Z-factor versus experimental Z-factor.]
Figs. 3-7 show the results predicted by the above-mentioned correlations versus the experimental values of the Z-factor for all of the 2249 data sets used for developing the LSSVM model. These cross-plots show the degree of agreement between the experimentally measured data and the predicted values. As can be seen from Figs. 1 and 3-7, the predictions of the Z-factor made by the developed LSSVM model are in closer agreement with the experimental data than those of the selected correlations.
Furthermore, the statistical errors of these correlations and of the proposed model are reported in Tables 3 and 4. The compositional LSSVM model developed in this study clearly has the smallest average relative error (ARE), average absolute relative error (AARE) and root mean square error (RMSE), and the highest coefficient of determination ($R^2$), for all types of natural gases considered.
4.2. Validity of the model
To make sure that the proposed model is physically correct, its validity should be checked (Chamkalani et al., 2013). For this purpose, the experimental data and the Z-factor values computed by the LSSVM model and by the other empirical correlations are plotted in Fig. 8 versus pseudo-reduced pressure at constant pseudo-reduced temperature for four different natural gas mixtures (see Table 5). Real gases may deviate negatively or positively from ideality, depending on the effect of the intermolecular forces of the gas. As can be seen from Fig. 8, the model successfully captures the physical trend of the gas compressibility factor versus pseudo-reduced pressure at constant temperature.
4.3. Case study
The ability of the new method to calculate the gas compressibility factor as a function of pressure has been investigated for a gas sample (Zhou et al., 2006) that was not used during model development. The composition of this sample is reported in Table 6.
Fig. 9 shows the comparison between the experimental Z-factor and the values predicted by all models considered in this study for this gas sample (see Table 7). As shown in Fig. 9, the developed LSSVM model is much more accurate than the empirical methods in predicting the Z-factor of a natural gas stream containing non-hydrocarbon components.
Table 3
Statistical parameters for each Z-factor correlation versus experimental data.
Correlation                        ARE %    AARE %   RMSE    $R^2$
Beggs and Brill (1973)             -1.02    3.61     0.055   0.970
Shell Oil Company (Kumar, 2004)    -0.52    2.67     0.036   0.988
Heidaryan et al. (2010)            0.85     3.03     0.038   0.987
Azizi et al. (2010)                1.04     3.07     0.040   0.987
Sanjari and Lay (2012)             1.55     3.57     0.047   0.980
LSSVM (this study)                 -0.01    0.19     0.004   0.999
Table 4
Average absolute relative error (AARE %) of the developed LSSVM model compared with other predictive correlations.
Ref.                          Data points  No. of gas mixtures  P, MPa        T, K           MW           BB    S     HMR   ABI   SN    LSSVM (this study)
(Buxton and Campbell, 1967)   165          5                    7.07-48.44    310.93-344.26  18.17-23.68  2.14  2.00  2.41  2.09  2.98  0.37
(Satter and Campbell, 1963)   105          5                    7.07-48.44    311.54-344.87  18.11-20.86  2.60  2.60  2.04  4.02  3.64  0.08
(Li and Guo, 1991)            47           5                    0.66-7.53     310.20-359.40  16.37-24.42  0.96  0.77  0.80  0.82  0.98  0.13
(Liu et al., 2013)            92           2                    35.00-95.04   347.70-419.20  17.04-17.07  3.00  1.72  1.78  1.50  2.55  0.05
(Yan et al., 2013)            234          2                    10.00-116.50  313.20-441.80  16.51-19.43  3.83  1.99  1.35  1.96  1.78  0.04
(Sun et al., 2012)            535          4                    22.03-118.89  303.20-418.60  17.05-20.51  4.04  1.71  1.54  2.13  2.18  0.02
(McLeod, 1968)                597          25                   3.45-48.44    266.48-366.48  17.12-26.72  3.22  2.70  3.99  3.58  5.36  0.4
(May et al., 2001)            87           5                    0.94-10.18    278.30-313.16  17.92-21.85  5.43  3.80  5.20  5.03  5.92  0.18
(Capla et al., 2002)          84           3                    0.99-15.02    253.15-323.15  16.31-17.84  2.44  1.57  2.22  2.18  3.44  0.35
(Chamorro et al., 2006)       242          2                    0.90-20.07    240.00-400.07  17.24-18.43  5.48  6.31  5.64  5.63  3.77  0.12
(McElroy et al., 2001)        61           6                    0.67-8.61     283.14-333.17  29.93-31.37  3.28  4.59  4.63  4.73  4.30  0.22
Total                         2249         64                                                             3.62  2.67  3.03  3.07  3.57  0.19
Table 5
Compositions of four natural gas mixtures used for validation (mole %).
Component   No. 1    No. 2   No. 3   No. 4
N2          0.52     0       0.52    5.84
CO2         1.31     0       20.16   0
H2S         5.7      19.7    0       0
C1          91.51    71.3    74.58   54.35
C2          0.84     9       4.74    16.32
C3          0.08     0       0       16.2
i-C4        0.02     0       0       5.87
n-C4        0.02     0       0       0
i-C5        0        0       0       0.91
n-C5        0        0       0       0
C6          0        0       0       0.18
Fig. 7. Comparison between the results of the Sanjari and Nemati Lay (SN) correlation and the experimental data.
5. Conclusion
In this study, the least square support vector machine technique, a supervised learning method, has been applied to predict the Z-factor of natural gases. Coupled simulated annealing (CSA) optimization was used to determine the LSSVM hyper-parameters. To achieve the research objectives, 2249 data sets covering a wide range of experimental conditions were gathered from the open literature to construct and test the model. The average absolute relative error (AARE) and coefficient of determination ($R^2$) between the model predictions and the corresponding experimental data were found to be 0.19% and 0.999, respectively. Moreover, a comparison between the predictions of the developed LSSVM model and those of the empirical correlations shows that the developed model is more reliable than the conventional methods considered for predicting the natural gas Z-factor. In addition, the validity of the model was examined, and the results indicate that the model reproduces the actual physical trend of the Z-factor as a function of pseudo-reduced pressure and temperature. The results of the present study show that the proposed compositional LSSVM model can be easily implemented in any reservoir simulation software and provides superior accuracy and performance for gas reservoir engineering calculations.
Fig. 8. Trend plot of Z-factor vs. Ppr. (a): gas mixture 1 at Tpr = 1.52; (b): gas mixture 2 at Tpr = 1.59; (c): gas mixture 3 at Tpr = 1.45; (d): gas mixture 4 at Tpr = 1.28. [Each panel compares the experimental data with the Beggs and Brill, Shell Oil Company, Heidaryan et al., Azizi et al., Sanjari and Lay, and LSSVM (this study) predictions.]
Table 6
Composition of the case studied gas sample.
Component   Mole (%)
N2          2.031
CO2         0.403
C1          90.991
C2          2.949
C3          1.513
i-C4        0.755
n-C4        0.755
i-C5        0.299
n-C5        0.304
Fig. 9. Experimental and predicted compressibility factor for the gas sample. [Z-factor versus P (MPa) for the experimental data and the Beggs and Brill, Shell Oil Company, Heidaryan et al., Azizi et al., Sanjari and Lay, and LSSVM (this study) predictions.]
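Appendix A. Statistical formulas
Coefficient of determination
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(Z_i^{\mathrm{pred}} - Z_i^{\mathrm{exp}}\right)^2}{\sum_{i=1}^{N}\left(Z_i^{\mathrm{pred}} - \mathrm{average}\left(Z_i^{\mathrm{exp}}\right)\right)^2}$$
Average relative error
$$\mathrm{ARE}\% = \frac{100}{N}\sum_{i=1}^{N}\left(\frac{Z_i^{\mathrm{pred}} - Z_i^{\mathrm{exp}}}{Z_i^{\mathrm{exp}}}\right)$$
Average absolute relative error
$$\mathrm{AARE}\% = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{Z_i^{\mathrm{pred}} - Z_i^{\mathrm{exp}}}{Z_i^{\mathrm{exp}}}\right|$$
Root mean square error (RMSE)
$$\mathrm{RMSE} = \left(\frac{\sum_{i=1}^{N}\left(Z_i^{\mathrm{pred}} - Z_i^{\mathrm{exp}}\right)^2}{N}\right)^{1/2}$$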
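The four statistical measures of this appendix can be sketched in a few lines of Python. The function names and the three-point data set below are hypothetical and serve only to show the arithmetic; the coefficient of determination follows the form printed in this appendix, with the predicted values in the denominator.

```python
from math import sqrt


def r_squared(z_pred, z_exp):
    """Coefficient of determination in the form written in Appendix A."""
    mean_exp = sum(z_exp) / len(z_exp)
    ss_res = sum((p - e) ** 2 for p, e in zip(z_pred, z_exp))
    # Appendix A uses Z_pred minus the mean of Z_exp in the denominator;
    # the more common definition uses sum((e - mean_exp) ** 2) instead.
    ss_ref = sum((p - mean_exp) ** 2 for p in z_pred)
    return 1.0 - ss_res / ss_ref


def are_percent(z_pred, z_exp):
    """Average relative error, in percent (signed, so errors can cancel)."""
    return 100.0 / len(z_exp) * sum((p - e) / e for p, e in zip(z_pred, z_exp))


def aare_percent(z_pred, z_exp):
    """Average absolute relative error, in percent."""
    return 100.0 / len(z_exp) * sum(abs((p - e) / e) for p, e in zip(z_pred, z_exp))


def rmse(z_pred, z_exp):
    """Root mean square error, in the units of Z (dimensionless)."""
    return sqrt(sum((p - e) ** 2 for p, e in zip(z_pred, z_exp)) / len(z_exp))


# Hypothetical three-point example (not data from the paper).
z_exp = [0.80, 0.90, 1.00]
z_pred = [0.81, 0.89, 1.00]

for name, func in [("R2", r_squared), ("ARE%", are_percent),
                   ("AARE%", aare_percent), ("RMSE", rmse)]:
    print(f"{name}: {func(z_pred, z_exp):.4f}")
```

In this example one prediction is too high and one too low, so ARE% is much smaller than AARE%; this is why AARE, rather than ARE, is the usual headline accuracy measure.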
Nomenclature
A^T         transpose of matrix A
b           bias term
d           the polynomial degree
exp         experimental
I_N         N × N identity matrix
K(x_i, x_j) kernel function
L           Lagrangian
MW C7+      molecular weight of heptane-plus fraction
Pc          critical pressure
Ppr         pseudo-reduced pressure
Pr          reduced pressure
pred        predicted
Tc          critical temperature
Tpr         pseudo-reduced temperature
Tr          reduced temperature
w           weight vector
α_i         Lagrange multipliers
Φ           map from input space into feature space
γ           relative weight of the summation of the regression errors
σ²          squared bandwidth
Ω           kernel matrix
ζ           slack variable
References
Ahmed, T., 2001. Reservoir Engineering Handbook, second ed. Gulf Professional Publishing Company, Houston, Texas, USA.
Ahmed, T.H., 1989. Hydrocarbon Phase Behavior. Gulf Publishing Company, Houston, TX.
Amendolia, S.R., et al., 2003. A comparative study of K-nearest neighbour, support vector machine and multi-layer perceptron for Thalassemia screening. Chemometr. Intell. Lab. Syst. 69 (1–2), 13–20.
Arabloo, M., Shokrollahi, A., Gharagheizi, F., Mohammadi, A.H., 2013. Toward a predictive model for estimating dew point pressure in gas condensate systems. Fuel Process. Technol. 116 (0), 317–324.
Azizi, N., Behbahani, R., Isazadeh, M.A., 2010. An efficient correlation for calculating compressibility factor of natural gases. J. Nat. Gas Chem. 19 (6), 642–645.
Baylar, A., Hanbay, D., Batan, M., 2009. Application of least square support vector machines in the prediction of aeration performance of plunging overfall jets from weirs. Expert Syst. Appl. 36 (4), 8368–8374.
Bazzani, A., et al., 2001. An SVM classifier to separate false signals from microcalcifications in digital mammograms. Phys. Med. Biol. 46 (6), 1651.
Beggs, D.H., Brill, J.P., 1973. A study of two-phase flow in inclined pipes. J. Petrol. Technol. 25 (5), 607–617.
Buxton, T.S., Campbell, J.M., 1967. Compressibility factors for lean natural gas–carbon dioxide mixtures at high pressure. SPE J. 7 (1), 80–86.
Calhoun, J.C., 1951. Reservoir Fluids. University of Oklahoma Press, Norman, OK.
Capla, L., Buryan, P., Jedelský, J., Rottner, M., Linek, J., 2002. Isothermal pVT measurements on gas hydrocarbon mixtures using a vibrating-tube apparatus. J. Chem. Thermodyn. 34 (5), 657–667.
Chamkalani, A., Arabloo, M., Chamkalani, R., Zargari, M.H., Dehestani-Ardakani, M.R., Farzam, M., 2013. Soft computing method for prediction of CO2 corrosion in flow lines based on neural network approach. Chem. Eng. Commun. 200, 731–747.
Chamorro, C.R., et al., 2006. Measurement of the (pressure, density, temperature) relation of two (methane + nitrogen) gas mixtures at temperatures between 240 and 400 K and pressures up to 20 MPa using an accurate single-sinker densimeter. J. Chem. Thermodyn. 38 (7), 916–922.
Chen, T.-S., et al., 2011. A novel knowledge protection technique base on support vector machine model for anti-classification. In: Zhu, M. (Ed.), Electrical Engineering and Control. Lecture Notes in Electrical Engineering. Springer, Berlin Heidelberg, pp. 517–524.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20 (3), 273–297.
Dranchuk, P.M., Abou-Kassem, J.H., 1975. Calculation of z factors for natural gases using equations of state. J. Can. Petrol. Technol. 14 (3), 34–36.
Elsharkawy, A.M., 2004. Efficient methods for calculations of compressibility, density and viscosity of natural gases. Fluid Phase Equilibr. 218 (1), 1–13.
Farasat, A., Shokrollahi, A., Arabloo, M., Gharagheizi, F., Mohammadi, A.H., 2013. Toward an intelligent approach for determination of saturation pressure of crude oil. Fuel Process. Technol. 115, 201–214.
Gunn, S.R., 1998. Support Vector Machines for Classification and Regression. University of Southampton, Faculty of Engineering, Science and Mathematics School of Electronics and Computer Science.
Table 7
Experimental data and results of the LSSVM model and the other studied correlations for the gas sample.
T, K     P, MPa   Experimental   Predicted Z-factor
                  Z-factor       BB       S        HMR      ABI      SN       LSSVM (this study)
305.15   18.88    0.7937         0.7986   0.7903   0.8172   0.8131   0.8357   0.7952
         15.94    0.7884         0.8017   0.7910   0.8086   0.8117   0.8235   0.7921
         13.03    0.7979         0.8173   0.8049   0.8218   0.8230   0.8380   0.8035
         10.79    0.8169         0.8379   0.8249   0.8431   0.8409   0.8551   0.8221
         7.02     0.8680         0.8877   0.8754   0.8902   0.8864   0.8973   0.8681
         3.72     0.9272         0.9413   0.9323   0.9380   0.9351   0.9436   0.9203
BB: Beggs and Brill; S: Shell oil company; HMR: Heidaryan et al.; ABI: Azizi et al.; SN: Sanjari and Lay. All rows are at T = 305.15 K.
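The comparison in Table 7 can be summarized by computing the AARE of each method over the six pressure points. The short script below is a sketch of that calculation; the variable and function names are illustrative, and the numbers are copied from Table 7.

```python
# Experimental Z-factors from Table 7 (T = 305.15 K, P from 18.88 to 3.72 MPa).
z_exp = [0.7937, 0.7884, 0.7979, 0.8169, 0.8680, 0.9272]

# Predicted Z-factors from Table 7, keyed by the column abbreviations.
z_pred = {
    "BB":    [0.7986, 0.8017, 0.8173, 0.8379, 0.8877, 0.9413],
    "S":     [0.7903, 0.7910, 0.8049, 0.8249, 0.8754, 0.9323],
    "HMR":   [0.8172, 0.8086, 0.8218, 0.8431, 0.8902, 0.9380],
    "ABI":   [0.8131, 0.8117, 0.8230, 0.8409, 0.8864, 0.9351],
    "SN":    [0.8357, 0.8235, 0.8380, 0.8551, 0.8973, 0.9436],
    "LSSVM": [0.7952, 0.7921, 0.8035, 0.8221, 0.8681, 0.9203],
}


def aare_percent(pred, exp):
    """Average absolute relative error, in percent (Appendix A)."""
    return 100.0 / len(exp) * sum(abs((p - e) / e) for p, e in zip(pred, exp))


aare = {name: aare_percent(values, z_exp) for name, values in z_pred.items()}
ranking = sorted(aare, key=aare.get)  # most accurate method first

for name in ranking:
    print(f"{name:6s} AARE = {aare[name]:.2f}%")
```

On these six points the LSSVM column gives the lowest AARE (about 0.46%), followed by the Shell oil company correlation, which is consistent with the trend shown in Fig. 9.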
Hall, K.R., Iglesias-Silva, G.A., 2007. Improved equations for the Standing–Katz tables. Hydrocarbon Process 86 (4), 107–110.
Heidaryan, E., Moghadasi, J., Rahimi, M., 2010. New correlations to predict natural gas viscosity and compressibility factor. J. Petrol. Sci. Eng. 73 (1–2), 67–72.
Hemmati-Sarapardeh, A., et al., 2014. Reservoir oil viscosity determination using a rigorous approach. Fuel 116, 39–48.
Katz, D.L., et al., 1959. Handbook of Natural Gas Engineering. McGraw-Hill Book Company, New York City.
Kay, W., 1936. Gases and vapors at high temperature and pressure – density of hydrocarbon. Indust. Eng. Chem. 28 (9), 1014–1019.
Kumar, N., 2004. Compressibility Factor for Natural and Sour Reservoir Gases by Correlations and Cubic Equations of State. MS thesis. Texas Tech University, Lubbock, Tex, USA.
Li, Q., Guo, T.-M., 1991. A study on the supercompressibility and compressibility factors of natural gas mixtures. J. Petrol. Sci. Eng. 6 (3), 235–247.
Liu, H., et al., 2013. Phase behavior and compressibility factor of two China gas condensate samples at pressures up to 95 MPa. Fluid Phase Equilibr. 337, 363–369.
Londono, F.E., Archer, R.A., Blasingame, T.A., 2005. Correlations for hydrocarbon-gas viscosity and gas density-validation and correlation of behavior using a large-scale database. SPE Reservoir Eval. Eng. 8 (6), 561–572.
May, E.F., Miller, R.C., Shan, Z., 2001. Densities and dew points of vapor mixtures of methane + propane and methane + propane + hexane using a dual-sinker densimeter. J. Chem. Eng. Data 46 (5), 1160–1166.
McElroy, P.J., Fang, J., Williamson, C.J., 2001. Second and third virial coefficients for (methane + ethane + carbon dioxide). J. Chem. Thermodyn. 33 (2), 155–163.
McLeod, W.R., 1968. Application of Molecular Refraction to the Principle of Corresponding States. Ph.D. thesis. University of Oklahoma.
Minoux, M., 1986. Mathematical Programming: Theory and Algorithms. John Wiley and Sons.
Mohammadi, A.H., et al., 2011. Gas hydrate phase equilibrium in porous media: mathematical modeling and correlation. Indust. Eng. Chem. Res. 51 (2), 1062–1072.
Muller, K.-R., Mika, S., Ratsch, G., Tsuda, K., Scholkopf, B., 2001. An introduction to kernel-based learning algorithms. IEEE Trans. Neural Networks 12 (2), 181–201.
Peng, D.-Y., Robinson, D.B., 1976. A new two-constant equation of state. Indust. Eng. Chem. Fundament. 15 (1), 59–64.
Rafiee-Taghanaki, S., et al., 2013. Implementation of SVM framework to estimate PVT properties of reservoir oil. Fluid Phase Equilibr. 346 (0), 25–32.
Sanjari, E., Lay, E.N., 2012. An accurate empirical correlation for predicting natural gas compressibility factors. J. Nat. Gas Chem. 21 (2), 184–188.
Satter, A., Campbell, J.M., 1963. Non-ideal behavior of gases and their mixtures. SPE J. 3 (4), 333–347.
Shokrollahi, A., Arabloo, M., Gharagheizi, F., Mohammadi, A.H., 2013. Intelligent model for prediction of CO2–reservoir oil minimum miscibility pressure. Fuel 112, 375–384.
Soave, G., 1972. Equilibrium constants from a modified Redlich–Kwong equation of state. Chem. Eng. Sci. 27 (6), 1197–1203.
Standing, M.B., Katz, D.L., 1942. Density of natural gases. Trans. AIME 146, 140–149.
Stewart, W.F., Burkhardt, S.F., Voo, D., 1959. Prediction of Pseudo Critical Parameters for Mixtures, AIChE Meeting, Kansas City, MO.
Sun, C.-Y., et al., 2012. Experiments and modeling of volumetric properties and phase behavior for condensate gas under ultra-high-pressure conditions. Indust. Eng. Chem. Res. 51 (19), 6916–6925.
Sutton, R.P., 1985. Compressibility Factors for High-molecular-weight Reservoir Gases, SPE Annual Technical Conference and Exhibition. Society of Petroleum Engineers, Las Vegas, NV.
Suykens, J.A.K., Gestel, T.V., Brabanter, J.D., Moor, B.D., Vandewalle, J., 2002. Least Squares Support Vector Machines. World Scientific Pub. Co., Singapore.
Suykens, J.A.K., Vandewalle, J., 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9 (3), 293–300.
Übeyli, E.D., 2010. Least squares support vector machine employing model-based methods coefficients for analysis of EEG signals. Expert Syst. Appl. 37 (1), 233–239.
van der Waals, J.D., 1873. Continuity of the Gaseous and Liquid State of Matter. Ph.D. dissertation thesis. University of Leiden.
Vapnik, V., 2000. The Nature of Statistical Learning Theory. Springer.
Wang, X., Economides, M., 2009. Advanced Natural Gas Engineering. Gulf Publishing Company, Houston, Texas, p. 400.
Xavier-de-Souza, S., Suykens, J.A.K., Vandewalle, J., Bolle, D., 2010. Coupled simulated annealing. IEEE Trans. Syst. Man Cybernet. B: Cybernet. 40 (2), 320–335.
Yan, K.-L., et al., 2013. Measurement and calculation of gas compressibility factor for condensate gas and natural gas under pressure up to 116 MPa. J. Chem. Thermodyn. 63, 38–43.
Zhou, J., et al., 2006. (p, Vm, T) and phase equilibrium measurements for a natural gas-like mixture using an automated isochoric apparatus. J. Chem. Thermodyn. 38 (11), 1489–1494.