Extracting events from Daily Drilling Reports using Fuzzy String Matching
Mariana S. Oliveira A B *, Adriano Mourthe A B and Maria Clara Duque A C
A Intelie, Rio de Janeiro, RJ, Brazil.
B Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, RJ, Brazil.
C Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, RJ, Brazil.
The APPEA Journal 62 S158-S161 https://doi.org/10.1071/AJ21118
Accepted: 4 March 2022 Published: 13 May 2022
© 2022 The Author(s) (or their employer(s)). Published by CSIRO Publishing on behalf of APPEA.
Abstract
Continuous monitoring of oil drilling operations reduces process interruptions and equipment failure. It also contributes to the development of Key Performance Indicators, which lead to more efficient resource management. Daily Drilling Reports (DDRs) have long been the primary way of recording notable events, such as stuck pipe, and have come to constitute a valuable information base for most oil drilling companies. However, extracting knowledge from DDRs can be costly and time-consuming. This work proposes an approach to recognise drilling events in DDRs using a rule-based language processing method called Fuzzy String Matching (FSM). We applied the FSM algorithm to search DDRs for a set of predefined keywords and key phrases in order to extract possible Invisible Lost Time (ILT) events, which may indicate risks or low operational efficiency. The fuzzy part of the algorithm allows the identification of terms or expressions that match the pre-established ones approximately rather than exactly, accounting for typos and different suffixes or prefixes. The proposed solution was applied to a data set of 392 real-world DDR records from a drilling company, using a set of six ILT event key phrases annotated by Subject Matter Specialists. The process can be readily replicated for other events. The results show that, of 116 reports tagged as normal, 92 records were identified as possible ILT events, which represents, in hours, 56% of the total normal drilling time. These promising results suggest that the approach can substantially improve the identification and extraction of drilling events from DDRs.
Keywords: Daily Drilling Reports, drilling, Fuzzy String Matching, Invisible Lost Time, keyword extraction, natural language processing, oil and gas, textual data.
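The approximate matching described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure, threshold or key phrases were used, so everything below is an assumption made for illustration: the Levenshtein edit distance as the metric, a normalised similarity threshold of 0.8, the hypothetical key phrases "high torque" and "tight hole", and the helper names `levenshtein`, `similarity` and `find_key_phrases`.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the edit distance to a score in [0, 1]; 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def find_key_phrases(text, key_phrases, threshold=0.8):
    """Return (phrase, matched_text, score) for each phrase found in text.

    The text is split into lowercase tokens and every window of as many
    tokens as the key phrase has words is compared against the phrase.
    The threshold value is an illustrative assumption.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    hits = []
    for phrase in key_phrases:
        target = phrase.lower()
        n = len(target.split())
        best_score, best_window = 0.0, ""
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            score = similarity(window, target)
            if score > best_score:
                best_score, best_window = score, window
        if best_score >= threshold:
            hits.append((phrase, best_window, best_score))
    return hits


if __name__ == "__main__":
    # Hypothetical report line containing a typo ("hgh" for "high").
    report = "Worked string due to hgh torque at 2450 m"
    for phrase, matched, score in find_key_phrases(
            report, ["high torque", "tight hole"]):
        print(f"{phrase!r} matched {matched!r} with score {score:.2f}")
```

In this sketch a typo or a differing suffix costs only one or two edits, so the window still scores above the threshold, whereas an exact search would miss it. A lower threshold catches more variants at the cost of more false positives, so in practice it would need tuning against annotated reports.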
Mariana Oliveira has been a Data Scientist at Intelie, a Viasat Company, for 3 years. She works on the development and implementation of Machine Learning and Natural Language Processing models in the Oil and Gas area. She holds a bachelor’s degree in Computer Information Systems from the Federal University of the State of Rio de Janeiro (UNIRIO) and is currently a master’s student in Computing at UNIRIO.
Adriano Mourthe has been a Data Scientist at Intelie, a Viasat Company, for 5 years. He works on the development and implementation of Machine Learning models for the Oil and Gas area. He holds a bachelor’s degree in Computer Information Systems from the Federal University of the State of Rio de Janeiro (UNIRIO) and is currently a doctoral student in Informatics at UNIRIO.
Maria Clara Duque has been a Data Scientist at Intelie, a Viasat Company, since August 2021. She works on the development and implementation of Machine Learning and Natural Language Processing models in the Oil and Gas area. She holds a bachelor’s degree in Petroleum Engineering from the Federal University of Rio de Janeiro (UFRJ) and a master’s degree in Industrial Engineering. She is currently a PhD candidate in Industrial Engineering at UFRJ.