Detecting anomalies in the operational data of oil and gas
pipelines is essential for reducing the risk of disasters that
can harm human safety, the environment, and finances. Past
catastrophic events at oil and gas production facilities in
several regions underline the importance of this problem. To
address it, a suitable monitoring system is needed to prevent
potential losses caused by leaks or
over-pressurization of natural gas pipelines. Many machine
learning algorithms are available for anomaly detection,
including supervised models such as Feed Forward Neural
Network, Linear Regression, KNN, Random Forest, and Support
Vector Machine, and unsupervised models such as Principal
Component Analysis (PCA) and Hierarchical Clustering. Among
them, One-Class SVM and Isolation Forest are the most prominent.
However, each algorithm has its own advantages and
disadvantages in terms of performance. This study aims to
compare the performance of machine learning algorithms in
classifying and detecting data anomalies in offshore natural gas
pipeline operational datasets. The assessment is based on
ROC-AUC, the confusion matrix, sensitivity, and specificity. The
findings indicate that the Isolation Forest model outperforms
the One-Class SVM, with an ROC-AUC of 90% compared to only 61%
for the One-Class SVM. Furthermore, the Isolation Forest
achieves a sensitivity of 98%, in contrast to the One-Class
SVM's 41%, and a specificity of 81%, compared to the One-Class
SVM's 80%.
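
The evaluation criteria named above can be illustrated with a minimal sketch. It is not the study's implementation: the labels, anomaly scores, and the 0.5 threshold below are hypothetical values chosen only for illustration, and the helper names are assumptions. It treats label 1 as "anomaly" and 0 as "normal", derives sensitivity and specificity from the confusion matrix counts, and computes ROC-AUC as the fraction of (anomaly, normal) pairs in which the anomaly receives the higher score.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), with 1 = anomaly and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Share of true anomalies that were flagged (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of normal points that were left unflagged (true negative rate)."""
    return tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise ROC-AUC: chance that a random anomaly outscores a random
    normal point, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and anomaly scores (higher = more anomalous).
y_true = [0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.5, 0.9]

# Hypothetical decision threshold: flag a point when its score is >= 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
auc = roc_auc(y_true, scores)

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ROC-AUC={auc:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the three values are reported together.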