
    Oil and Gas flow Anomaly Detection on offshore naturally flowing wells using Deep Neural Networks

    Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. The Oil and Gas industry faces multiple challenges as never before. It is criticized as dirty and polluting, and demand for green alternatives keeps growing. Nevertheless, the world still relies heavily on hydrocarbons, as they remain the most traditional and stable source of energy compared with the extensively promoted hydro, solar or wind power. Major operators are challenged to produce oil more efficiently, with a smaller climate footprint and more closely scrutinized expenditure, in order to compete with newly arising energy sources, while facing high skepticism regarding the industry's future. The industry has to become greener, and hence to act in ways not required previously. While most of the tools used by the hydrocarbon E&P industry are expensive and have been in use for many years, it is paramount for the industry's survival and prosperity to apply predictive maintenance technologies that can foresee potential failures, making production safer, lowering downtime, increasing productivity and diminishing maintenance costs. Much effort has been devoted to defining the most accurate and effective predictive methods; however, data scarcity limits the speed and capacity for further experimentation. While it would be highly beneficial for the industry to invest in Artificial Intelligence, this research explores the subject of Anomaly Detection in depth, using open public data from Petrobras that was developed by experts. Deep recurrent neural networks with LSTM and GRU backbones were implemented for multi-class classification of undesirable events on naturally flowing wells. Several hyperparameter optimization tools were then explored, focusing mainly on Genetic Algorithms as among the most advanced methods for such tasks. The best-performing configuration used 2 stacked GRU layers with the hyperparameter vector [1, 47, 40, 14], standing for a timestep of 1, 47 hidden units, 40 epochs and a batch size of 14, and produced an F1 score of 0.97. As the world faces many issues, one of which is the detrimental effect of heavy industries on the environment and the resulting adverse global climate change, this project is an attempt to contribute to the field of applied Artificial Intelligence in the Oil and Gas industry, with the intention of making it more efficient, transparent and sustainable.
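
    As an illustration of the configuration the abstract reports, the following is a minimal sketch (not the authors' code) of a 2-stacked-GRU multi-class classifier with the stated hyperparameters [1, 47, 40, 14]; the number of sensor channels and event classes are placeholders, and Keras is assumed as the framework.

        # Minimal sketch of the reported best configuration: 2 stacked GRU
        # layers, timestep 1, 47 hidden units, 40 epochs, batch size 14.
        # n_features and n_classes are hypothetical dataset dimensions.
        from tensorflow import keras
        from tensorflow.keras import layers

        timestep, hidden_units, epochs, batch_size = 1, 47, 40, 14
        n_features, n_classes = 8, 9  # placeholders: sensor channels / event classes

        model = keras.Sequential([
            layers.Input(shape=(timestep, n_features)),
            layers.GRU(hidden_units, return_sequences=True),  # first stacked GRU
            layers.GRU(hidden_units),                         # second stacked GRU
            layers.Dense(n_classes, activation="softmax"),    # multi-class output
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

        # X_train: (samples, timestep, n_features) windows; y_train: integer labels.
        # model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size)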

    Efficient Learning Machines

    Computer science

    Advances in Remote Sensing-based Disaster Monitoring and Assessment

    Remote sensing data and techniques have been widely used for disaster monitoring and assessment. In particular, recent advances in sensor technologies and artificial intelligence-based modeling are very promising for monitoring disasters and readying responses aimed at reducing the damage they cause. This book contains eleven scientific papers that study novel approaches applied to a range of natural disasters such as forest fires, urban land subsidence, floods, and tropical cyclones.

    Statistical methods for NHS incident reporting data

    The National Reporting and Learning System (NRLS) is the English and Welsh NHS’ national repository of incident reports from healthcare. It aims to capture details of incident reports at national level and to facilitate clinical review and learning to improve patient safety. These incident reports range from minor ‘near-misses’ to critical incidents that may lead to severe harm or death. NRLS data are currently reported as crude counts and proportions, but their major use is clinical review of the free-text descriptions of incidents. Few well-developed quantitative analysis approaches exist for NRLS, and this thesis investigates such methods. A literature review revealed a wealth of clinical detail, but also systematic constraints of NRLS’ structure, including non-mandatory reporting, missing data and misclassification. Summary statistics for reports from 2010/11 to 2016/17 supported this and suggested NRLS was not suitable for statistical modelling in isolation. Modelling methods were advanced by creating a hybrid dataset using other sources of hospital casemix data from Hospital Episode Statistics (HES). A theoretical model was established based on ‘exposure’ variables (using casemix proxies) and ‘culture’ as a random effect. The initial modelling approach examined Poisson regression, mixture and multilevel models. Overdispersion was significant, generated mainly by clustering and aggregation in the hybrid dataset, but models were chosen to reflect these structures. Further modelling approaches were examined, using Generalized Additive Models to smooth predictor variables, regression tree-based models including Random Forests, and Artificial Neural Networks. Models were also extended to examine a subset of death and severe-harm incidents, exploring how sparse counts affect models. Text mining techniques were examined for analysis of incident descriptions and showed how term frequency might be used. Terms were used to generate latent topic models, used in turn to predict the harm level of incidents. Model outputs were used to create a ‘Standardised Incident Reporting Ratio’ (SIRR), cast in the mould of current regulatory frameworks using process control techniques such as funnel plots and cusum charts. A prototype online reporting tool was developed to allow NHS organisations to examine their SIRRs, provide supporting analyses, and link data points back to individual incident reports.
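
    As an illustration of the SIRR and funnel-plot idea described above, the following is a minimal sketch assuming a Poisson model supplies the expected counts; the organisation counts and the normal-approximation control limits are illustrative assumptions, not the thesis' code.

        # Minimal sketch of a Standardised Incident Reporting Ratio (SIRR)
        # with approximate Poisson funnel-plot control limits.
        import numpy as np

        def sirr(observed, expected):
            """SIRR = observed reports / model-expected reports per organisation."""
            return observed / expected

        def funnel_limits(expected, z=1.96):
            """Approximate control limits around SIRR = 1.

            Under a Poisson model, var(O/E) ~= 1/E, so the limits narrow
            as the expected count (the funnel plot's x-axis) grows.
            """
            se = np.sqrt(1.0 / expected)
            return 1.0 - z * se, 1.0 + z * se

        expected = np.array([50.0, 200.0, 800.0])  # hypothetical expected counts
        observed = np.array([65.0, 190.0, 900.0])  # hypothetical observed counts
        lo, hi = funnel_limits(expected)
        ratios = sirr(observed, expected)
        print(ratios, (ratios < lo) | (ratios > hi))  # flag out-of-funnel points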

    Mapping drought stress in commercial eucalyptus forest plantations using remotely sensed techniques in Southern Africa.

    Master's Degree. University of KwaZulu-Natal, Pietermaritzburg. Drought is one of the least understood yet most hazardous natural disasters, leaving many parts of the world devastated. To improve understanding and detection of drought onset, remote sensing technology is required to map drought-affected areas, as it covers large geographical areas. The study aimed to evaluate the utility of cost-effective Landsat 8 imagery in mapping the spatial extent of drought-prone Eucalyptus dunnii plantations. The first objective was to compare the utility of Landsat spectra alone with a combination of spectra and vegetation indices in detecting drought-affected plantations using the stochastic gradient boosting algorithm. On the test datasets, Landsat 8 spectra alone produced an overall accuracy of 74.70% and a kappa value of 0.59, while the integration of Landsat 8 spectra with vegetation indices produced an overall accuracy of 83.13% and a kappa of 0.76. The second objective was a trend analysis of vegetation health during drought. The normalized difference vegetation index (NDVI) values fluctuated over the years, with 2013 having the highest value of 0.68 and 2015 the lowest of 0.55, and the normalized difference water index (NDWI) also had its lowest value in 2015. Most indices showed a similar trend, with 2013 having the highest index value and 2015 the lowest. The third objective was a trend analysis of rainfall and temperature during drought. The rainfall trend analysis from 2013 to 2017 indicated that February 2017 received the highest rainfall, 154 mm. In addition, July 2016 received the highest July rainfall compared to 2013, 2014, 2015 and 2017, which received 6.4 mm, 0.6 mm, 28 mm, and 1 mm, respectively. The temperature trend analysis from 2013 to 2017 indicated that December 2015 had the highest temperature, 28°C, compared to December of 2013, 2014, 2016 and 2017, with temperatures of 24°C, 25°C, 27°C and 24°C, respectively. Furthermore, June 2017 had the highest June temperature of 23°C, while June 2015 had the lowest at 20°C. The fourth objective was to compare the utility of topographical variables combined with Landsat vegetation indices in detecting drought-affected plantations using the one-class support vector machine algorithm. The multiclass support vector machine using Landsat vegetation indices and topographical variables produced an overall accuracy of 73.86% and a kappa value of 0.71, with user's and producer's accuracies ranging from 61% to 69% for drought-damaged trees and from 84% to 90% for healthy trees. The one-class support vector machine using Landsat vegetation indices and topographical variables produced an overall accuracy of 82.35% and a kappa value of 0.73, the highest overall accuracy compared to the multiclass SVM and the stochastic gradient boosting algorithm. The use of topographical variables further improved the accuracies compared to the combination of Landsat spectra with vegetation indices.
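
    For reference, a minimal sketch of the two indices named above, assuming Landsat 8 OLI band numbering (red = band 4, NIR = band 5, SWIR1 = band 6) and Gao's formulation of NDWI for canopy water content; the reflectance arrays are stand-ins for real rasters.

        # Minimal sketch of the NDVI and NDWI computations on Landsat 8 bands.
        import numpy as np

        def ndvi(red, nir):
            """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
            return (nir - red) / (nir + red + 1e-10)  # epsilon avoids divide-by-zero

        def ndwi(nir, swir1):
            """Normalized Difference Water Index (Gao): (NIR - SWIR1) / (NIR + SWIR1)."""
            return (nir - swir1) / (nir + swir1 + 1e-10)

        red = np.random.rand(100, 100)    # stand-in for the band 4 reflectance raster
        nir = np.random.rand(100, 100)    # stand-in for the band 5 reflectance raster
        swir1 = np.random.rand(100, 100)  # stand-in for the band 6 reflectance raster
        print(ndvi(red, nir).mean(), ndwi(nir, swir1).mean())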

    Evolutionary Computation

    This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms, for applications ranging from optimization and learning to pattern recognition and bioinformatics. It also presents new algorithms based on several analogies and metaphors, one of them rooted in philosophy, specifically the philosophy of praxis and dialectics. Interesting applications in bioinformatics are presented as well, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. The book therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research in this field.
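
    Since particle swarm optimization is mentioned, a minimal generic PSO sketch follows (not any chapter's algorithm), minimizing a toy sphere function; the inertia and acceleration coefficients are common defaults assumed here.

        # Minimal generic particle swarm optimization: each particle is pulled
        # toward its personal best and the swarm's global best position.
        import numpy as np

        rng = np.random.default_rng(0)
        dim, n_particles, iters = 5, 30, 200
        w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, social weights (assumed defaults)

        def f(x):                  # toy objective: sphere function, minimum at 0
            return np.sum(x**2, axis=-1)

        pos = rng.uniform(-5, 5, (n_particles, dim))
        vel = np.zeros_like(pos)
        pbest, pbest_val = pos.copy(), f(pos)
        gbest = pbest[np.argmin(pbest_val)]

        for _ in range(iters):
            r1, r2 = rng.random((2, n_particles, dim))
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos += vel
            vals = f(pos)
            improved = vals < pbest_val
            pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
            gbest = pbest[np.argmin(pbest_val)]

        print("best value:", pbest_val.min())  # approaches 0 as the swarm converges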

    Innovative Techniques for the Retrieval of Earth’s Surface and Atmosphere Geophysical Parameters: Spaceborne Infrared/Microwave Combined Analyses

    With the advent of the first Earth Observation satellites, Landsat-1 in July 1972 and ERS-1 in May 1991, the discipline of environmental remote sensing has become, over time, increasingly fundamental for the study of phenomena characterizing the planet Earth. The goal of environmental remote sensing is to perform detailed analyses and to monitor the temporal evolution of different physical phenomena, exploiting the mechanisms of interaction between the objects present in an observed scene and the electromagnetic radiation detected by sensors, placed at a distance from the scene, operating at different frequencies. The physical phenomena analyzed are those related to climate change, weather forecasting, global ocean circulation, greenhouse gas profiling, earthquakes, volcanic eruptions, soil subsidence, and the effects of rapid urbanization processes. Generally, remote sensing sensors are of two primary types: active and passive. Active sensors use their own source of electromagnetic radiation to illuminate and analyze an area of interest: they emit radiation toward the area to be investigated and then detect and measure the radiation backscattered from the objects contained in that area. Passive sensors, on the other hand, detect natural electromagnetic radiation (e.g., from the Sun in the visible band and the Earth in the infrared and microwave bands) emitted or reflected by the objects contained in the observed scene. The scientific community has dedicated many resources to developing techniques to estimate, study and analyze Earth's geophysical parameters. These techniques differ for active and passive sensors because they depend strictly on the type of physical quantity measured. In my Ph.D. work, inversion techniques for estimating Earth's surface and atmosphere geophysical parameters are addressed, emphasizing methods based on machine learning (ML). In particular, the study of cloud microphysics and the characterization of Earth's surface change phenomena are the critical points of this work.
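
    As an illustration of the kind of ML-based inversion described, a minimal sketch follows: a regressor is trained to map simulated multichannel brightness temperatures back to a geophysical parameter. The synthetic forward model, channel count and random forest choice are all assumptions, not the thesis' method.

        # Minimal sketch of an ML-based retrieval (inversion): learn a mapping
        # from simulated brightness temperatures to a geophysical parameter.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(42)
        n_samples, n_channels = 5000, 6              # hypothetical IR/MW channels

        param = rng.uniform(0.0, 1.0, n_samples)     # target parameter (normalized)
        weights = rng.uniform(-20, 20, n_channels)   # toy channel sensitivities
        # Purely illustrative linear forward model plus instrument noise:
        tb = 250 + np.outer(param, weights) + rng.normal(0, 1, (n_samples, n_channels))

        X_train, X_test, y_train, y_test = train_test_split(tb, param, random_state=0)
        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X_train, y_train)                  # learn the inverse mapping
        print("R^2 on held-out simulations:", model.score(X_test, y_test))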

    Countering internet packet classifiers to improve user online privacy

    Internet traffic classification, or packet classification, is the act of classifying packets using statistical data extracted from the packets transmitted on a computer network. Internet traffic classification is an essential tool for Internet service providers to manage network traffic, provide users with the intended quality of service (QoS), and perform surveillance. QoS measures prioritize one type of network traffic over other traffic based on preset criteria; for instance, they give higher priority or bandwidth to video traffic than to website browsing traffic. Internet packet classification methods are also used for automated intrusion detection: they analyze incoming traffic patterns and identify malicious packets used for denial of service (DoS) or similar attacks. Internet traffic classification may also be used for website fingerprinting attacks, in which an intruder analyzes the encrypted traffic of a user to find behavior or usage patterns and infer the user's online activities. Protecting users' online privacy against traffic classification attacks is the primary motivation of this work. This dissertation shows the effectiveness of machine learning algorithms in identifying user traffic by comparing 11 state-of-the-art classifiers, and proposes three anonymization methods for masking generated user network traffic to counter Internet packet classifiers: equalized packet length, equalized packet count, and equalized inter-arrival times of TCP packets. This work compares the results of these anonymization methods to show their effectiveness in reducing the performance of machine learning algorithms for traffic classification; the results are validated using newly generated user traffic. Additionally, a novel model based on a generative adversarial network (GAN) is introduced to automate countering adversarial traffic classifiers. This model, called the GAN tunnel, generates pseudo traffic patterns imitating the distributions of real traffic generated by actual applications and encapsulates the actual network packets into the generated traffic packets. The GAN tunnel's performance is tested against random forest and extreme gradient boosting (XGBoost) traffic classifiers; in the scenarios tested in this thesis, these classifiers were unable to detect the actual source application of data exchanged through the GAN tunnel.
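
    As an illustration of the first countermeasure, equalized packet length, a minimal sketch follows: every payload is padded to one fixed on-wire size so observed lengths carry no application signal. The length-prefix framing and the PAD_TO constant are illustrative assumptions, not the dissertation's implementation.

        # Minimal sketch of "equalized packet length": pad every payload to one
        # fixed size before sending; strip the padding on the receiving side.
        import struct

        PAD_TO = 1400  # hypothetical fixed on-wire payload size, below typical MTU

        def equalize(payload: bytes) -> bytes:
            """Prefix the true length, then zero-pad to PAD_TO bytes."""
            if len(payload) > PAD_TO - 4:
                raise ValueError("payload too large; fragment before equalizing")
            header = struct.pack("!I", len(payload))  # 4-byte big-endian length
            return (header + payload).ljust(PAD_TO, b"\x00")

        def restore(wire: bytes) -> bytes:
            """Recover the original payload from an equalized packet."""
            (true_len,) = struct.unpack("!I", wire[:4])
            return wire[4 : 4 + true_len]

        pkt = equalize(b"GET / HTTP/1.1\r\n")
        assert len(pkt) == PAD_TO and restore(pkt) == b"GET / HTTP/1.1\r\n"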

    Industrial Applications: New Solutions for the New Era

    This book reprints articles from the Special Issue "Industrial Applications: New Solutions for the New Age", published online in the open-access journal Machines (ISSN 2075-1702). It consists of twelve published articles and belongs to the journal's "Mechatronic and Intelligent Machines" section.