
    Interactive visualization of event logs for cybersecurity

    Hidden cyber threats revealed with new visualization software EventPad

    Big Data and Analysis of Data Transfers for International Research Networks Using NetSage

    Modern science is increasingly data-driven and collaborative in nature. Many scientific disciplines, including genomics, high-energy physics, astronomy, and atmospheric science, produce petabytes of data that must be shared with collaborators all over the world. The National Science Foundation-supported International Research Network Connection (IRNC) links have been essential to enabling this collaboration, but as data sharing has increased, so has the amount of information being collected to understand network performance. New capabilities to measure and analyze the performance of international wide-area networks are essential to ensure end users can take full advantage of such infrastructure for their big data applications. NetSage is a project to develop a unified, open, privacy-aware network measurement and visualization service to address the needs of monitoring today's high-speed international research networks. NetSage collects data on both backbone links and exchange points, which can amount to as much as 1 Tb per month. This puts a significant strain on hardware, not only in terms of the storage needed to hold multi-year historical data, but also in terms of the processor and memory capacity needed to analyze the data and understand network behaviors. This paper describes the basic NetSage architecture and its current data collection and archiving approach, and details the constraints of handling vast amounts of monitoring data while providing useful, extensible visualization to end users.
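
    The storage-versus-analysis strain described above is, at bottom, a time-series retention problem: keep recent measurements at full resolution and roll older ones up. Below is a minimal sketch of such a rollup in Python; the column names and the hourly/daily aggregation policy are illustrative assumptions, not NetSage's actual pipeline.

```python
import pandas as pd

# Hypothetical flow records: one row per measured transfer on a link.
flows = pd.DataFrame(
    {
        "timestamp": pd.date_range("2024-01-01", periods=10_000, freq="min"),
        "bytes": 1_000_000,  # placeholder payload sizes
    }
).set_index("timestamp")

# Recent data stays at full resolution; older data is rolled up so that
# multi-year archives remain small enough to query interactively.
hourly = flows["bytes"].resample("1h").sum()  # for dashboards over months
daily = flows["bytes"].resample("1D").sum()   # for multi-year history

print(hourly.head())
print(daily.head())
```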

    Oil and Gas flow Anomaly Detection on offshore naturally flowing wells using Deep Neural Networks

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. The Oil and Gas industry faces multiple challenges as never before. It is criticized as dirty and polluting, fueling demand for green alternatives. Nevertheless, the world still has to rely heavily on hydrocarbons, since they remain the most traditional and stable source of energy, as opposed to extensively promoted hydro, solar, or wind power. Major operators are challenged to produce oil more efficiently, with a smaller climate footprint and more scrutinized expenditure, while facing high skepticism about the industry's future. While most of the tools used by the hydrocarbon E&P industry are expensive and have been in use for many years, it is paramount for the industry's survival and prosperity to apply predictive maintenance technologies that foresee potential failures, making production safer, lowering downtime, increasing productivity, and reducing maintenance costs. Much effort has gone into identifying the most accurate and effective predictive methods; however, data scarcity limits the speed and scope of further experimentation. Since it would be highly beneficial for the industry to invest in Artificial Intelligence, this research explores, in depth, the subject of Anomaly Detection, using open public data from Petrobras that was developed by experts. Deep neural networks, namely Recurrent Neural Networks with LSTM and GRU backbones, were implemented for multi-class classification of undesirable events on naturally flowing wells. Further, several hyperparameter optimization tools were explored, focusing mainly on Genetic Algorithms as among the most advanced methods for such tasks. The best performing model used 2 stacked GRU layers with the hyperparameter vector [1, 47, 40, 14], standing for timestep 1, 47 hidden units, 40 epochs, and batch size 14, producing an F1 score of 0.97. As the world faces many issues, one of which is the detrimental effect of heavy industry on the environment and the resulting adverse global climate change, this project attempts to contribute to the field of applying Artificial Intelligence in the Oil and Gas industry, with the intention of making it more efficient, transparent, and sustainable.
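
    The reported best configuration maps directly onto a stacked recurrent classifier. Below is a minimal Keras sketch of a 2-stacked-GRU model using the abstract's hyperparameters (timestep 1, 47 hidden units, 40 epochs, batch size 14); the feature and class counts and the random placeholder data are assumptions, since the Petrobras dataset's schema is not given here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

TIMESTEPS, N_FEATURES, N_CLASSES = 1, 8, 9  # feature/class counts are assumptions

# Two stacked GRU layers with 47 units each, as in the reported best model.
model = tf.keras.Sequential([
    layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    layers.GRU(47, return_sequences=True),
    layers.GRU(47),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random placeholder data standing in for windowed well-sensor readings.
X = np.random.rand(500, TIMESTEPS, N_FEATURES)
y = np.random.randint(0, N_CLASSES, size=500)
model.fit(X, y, epochs=40, batch_size=14, verbose=0)
```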

    Impact Of Artificial Intelligence And Big Data On The Oil And Gas Industry In Nigeria

    This paper examines Artificial Intelligence and Big Data as fields of study and their impact on the oil and gas industry. Artificial Intelligence refers to computer systems that can perform tasks that would typically require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages. "Big Data," or Big Data analytics, is a term often used to describe data volumes so large as to exceed the capacity of both humans and traditional software to process within an acceptable time and at acceptable value. The two concepts are closely intertwined: AI does not stand alone; it requires big data to be effective. AI and Big Data have had great impact across different industries and organizations. In the oil and gas industry, there has been increasing installation of data-recording sensors, and hence growing data acquisition, in the exploration, drilling, and production aspects of the industry. The industry is gradually making use of these huge data sets by processing them with AI-enabled tools and software to arrive at smart decisions that bring efficiency to operations. Such areas include analysis of seismic and micro-seismic data, improvement in reservoir characterization and simulation, reduction in drilling time and increased drilling safety, and optimization of pump performance, amongst others. Some of the solutions listed above have been successfully implemented in Nigeria, mostly by the international oil companies, and some additional areas have also been impacted: managing asset integrity, tubular tally for drilling operations using RFID, and the licensing and permit system by DPR. The industry has embraced the AI and Big Data concept, and the future is bright for more innovative solutions. However, there are still challenges, especially in Nigeria, including a lack of local skilled manpower, poor data culture, security challenges in the industry's operating areas, limited availability of good quality data, and the complexity of the concept itself.

    Potential utility of future satellite magnetic field data

    The requirements for a program of geomagnetic field studies are examined which will satisfy a wide range of user needs in the interim period between now and the time at which data from the Geopotential Research Mission (GRM) becomes available, and the long-term needs for NASA's program in this area are considered. An overview of the subject, a justification for the recommended activities in the near and long term, and a summary of the recommendations reached by the contributors are included.

    Making intelligent systems team players: Case studies and design issues. Volume 1: Human-computer interaction design

    Initial results are reported from a multi-year, interdisciplinary effort to provide guidance and assistance for designers of intelligent systems and their user interfaces. The objective is to achieve more effective human-computer interaction (HCI) for systems with real-time fault management capabilities. Intelligent fault management systems within NASA were evaluated for insight into the design of systems with complex HCI. Preliminary results include: (1) a description of real-time fault management in aerospace domains; (2) recommendations and examples for improving intelligent system design and user interface design; (3) identification of issues requiring further research; and (4) recommendations for a development methodology integrating HCI design into intelligent system design.

    Data mining for anomaly detection in maritime traffic data

    For the past few years, oceans have once again become an important means of communication and transport. Traffic density throughout the globe has grown substantially, which has raised some concerns. With this expansion, the need to achieve high Maritime Situational Awareness (MSA) is imperative. At present, this need may be more easily fulfilled thanks to the vast amount of data available regarding maritime traffic. However, this brings another issue: data overload. There are currently so many data sources, and so much data to extract information from, that operators cannot handle it all. There is a pressing need for systems that help sift through all the data, analysing and correlating it, and thereby supporting the decision-making process. The main goal of this dissertation is to use different sources of data to detect anomalies and contribute to a clear Recognised Maritime Picture (RMP). To do so, it is necessary to know what types of data exist and which are available for further analysis. The data chosen for this dissertation were Automatic Identification System (AIS) data and Monitorização Contínua das Atividades da Pesca (MONICAP) data, also known as Vessel Monitoring System (VMS) data. A PostgreSQL database was created to store one year's worth of AIS and MONICAP data. To analyse and draw conclusions from the data, a data mining tool, Orange, was used. Tests were conducted to assess the correlation between data sources and to find anomalies. Data correlation has never been more important, and this dissertation aims to show that there is a simple and effective way to get answers from large amounts of data.
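
    Cross-checking AIS against MONICAP/VMS reduces to joining the two position feeds per vessel and flagging disagreements. Below is a minimal sketch, assuming hypothetical table and column names (ais, monicap, mmsi, ts, lat, lon) and a crude distance threshold; the dissertation's actual schema and tests are not specified here.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string and table/column names.
engine = create_engine("postgresql://user:pass@localhost/maritime")
ais = pd.read_sql("SELECT mmsi, ts, lat, lon FROM ais", engine)
vms = pd.read_sql("SELECT mmsi, ts, lat, lon FROM monicap", engine)

# Pair each VMS report with the nearest-in-time AIS report per vessel.
ais = ais.sort_values("ts")
vms = vms.sort_values("ts")
pairs = pd.merge_asof(vms, ais, on="ts", by="mmsi",
                      suffixes=("_vms", "_ais"),
                      tolerance=pd.Timedelta("10min"))

# Flag vessels whose two feeds disagree by more than a crude threshold
# (~0.1 degree, roughly 11 km at the equator) as anomaly candidates.
pairs["gap"] = ((pairs.lat_vms - pairs.lat_ais) ** 2 +
                (pairs.lon_vms - pairs.lon_ais) ** 2) ** 0.5
anomalies = pairs[pairs["gap"] > 0.1]
print(anomalies[["mmsi", "ts", "gap"]].head())
```

    merge_asof pairs each report with its nearest neighbour in time per vessel, which tolerates the two feeds' different reporting rates.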

    Predicting Pilot Misperception of Runway Excursion Risk Through Machine Learning Algorithms of Recorded Flight Data

    The research used predictive models to determine pilot misperception of runway excursion risk associated with unstable approaches. The Federal Aviation Administration defined a runway excursion as a veer-off or overrun of the runway surface, and a stable approach as an aircraft meeting the following criteria: (a) on target approach airspeed, (b) correct attitude, (c) landing configuration, (d) nominal descent angle/rate, and (e) on a straight flight path to the runway touchdown zone. Continuing an unstable approach to landing was defined in this research as Unstable Approach Risk Misperception. A review of the literature revealed that an unstable approach followed by the failure to execute a rejected landing was a common contributing factor in runway excursions. Flight Data Recorder data were archived and made available by the National Aeronautics and Space Administration for public use. These data were collected over a four-year period from the flight data recorders of a fleet of 35 regional jets operating in the National Airspace System. The archived data were processed and explored for evidence of unstable approaches and to determine whether or not a rejected landing was executed. Once identified, the data revealing evidence of unstable approaches were processed for the purpose of building predictive models. SAS Enterprise Miner was used to explore the data, as well as to build and assess predictive models. The advanced machine learning algorithms utilized included: (a) support vector machine, (b) random forest, (c) gradient boosting, (d) decision tree, (e) logistic regression, and (f) neural network. The models were evaluated and compared to determine the best prediction model; based on this comparison, the decision tree model was determined to have the highest predictive value. The Flight Data Recorder data were then analyzed to determine the predictive accuracy of the target variable, Unstable Approach Risk Misperception, and to identify its important predictors. Results indicated that the predictive accuracy of the best performing model, the decision tree, was 99%. Six variables stood out in the prediction, listed here in order of importance from the decision tree analysis: (1) glideslope deviation, (2) selected approach speed deviation, (3) localizer deviation, (4) flaps not extended, (5) drift angle, and (6) approach speed deviation. The results are of interest to aviation researchers as well as airline pilot training managers: the ability to predict the probability of pilot misperception of runway excursion risk could influence the development of new pilot simulator training scenarios and strategies, and the research aids avionics providers in developing predictive runway excursion alerting display technologies.
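
    The winning model is a plain decision-tree classifier over approach-stability features. Below is a minimal scikit-learn sketch using the six predictors named above; the synthetic data and the target encoding (1 = unstable approach continued to landing) are illustrative assumptions, not the study's actual data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
features = ["glideslope_dev", "sel_app_speed_dev", "localizer_dev",
            "flaps_not_extended", "drift_angle", "app_speed_dev"]

# Synthetic stand-in for approach features derived from flight-recorder data.
X = pd.DataFrame(rng.normal(size=(1000, len(features))), columns=features)
y = (X["glideslope_dev"].abs() + X["localizer_dev"].abs() > 2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)

print("accuracy:", accuracy_score(y_te, tree.predict(X_te)))
# Feature importances, analogous to the study's ranking of predictors.
print(dict(zip(features, tree.feature_importances_.round(3))))
```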

    ALOJA-ML: a framework for automating characterization and knowledge discovery in Hadoop deployments

    This article presents ALOJA-Machine Learning (ALOJA-ML), an extension to the ALOJA project that uses machine learning techniques to interpret Hadoop benchmark performance data and support performance tuning; here we detail the approach, the efficacy of the model, and initial results. The ALOJA-ML project is the latest phase of a long-term collaboration between BSC and Microsoft to automate the characterization of cost-effectiveness of Big Data deployments, focusing on Hadoop. Hadoop presents a complex execution environment, where costs and performance depend on a large number of software (SW) configurations and on multiple hardware (HW) deployment choices. Recently the ALOJA project presented an open, vendor-neutral repository featuring over 16,000 Hadoop executions. These results are accompanied by a test bed and tools to deploy and evaluate the cost-effectiveness of different hardware configurations, parameter tunings, and Cloud services. Despite early success within ALOJA from expert-guided benchmarking, it became clear that a genuinely comprehensive study requires automation of modeling procedures to allow a systematic analysis of large and resource-constrained search spaces. ALOJA-ML provides such an automated system, allowing knowledge discovery by modeling Hadoop executions from observed benchmarks across a broad set of configuration parameters. The resulting empirically-derived performance models can be used to forecast the execution behavior of various workloads; they allow a priori prediction of execution times for new configurations and HW choices, and they offer a route to model-based anomaly detection. In addition, these models can guide benchmarking exploration efficiently by automatically prioritizing candidate future benchmark tests. Insights from ALOJA-ML's models can be used to reduce operational time on clusters, speed up the data acquisition and knowledge discovery process, and, importantly, reduce running costs. Beyond the methodology presented in this work, the community can benefit from the ALOJA data-sets, framework, and derived insights to improve the design and deployment of Big Data applications. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595). This work is partially supported by the Ministry of Economy of Spain under contracts TIN2012-34557 and 2014SGR105.
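
    ALOJA-ML's central step is learning a regression from configuration parameters to execution time and then querying it for unseen configurations. Below is a minimal sketch of that idea, with made-up knob names and toy numbers; the real ALOJA feature set and model family are not specified in the abstract beyond "empirically-derived performance models".

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical benchmark log: configuration knobs -> observed exec time (s).
runs = pd.DataFrame({
    "mappers":   [4, 4, 8, 8, 16, 16],
    "net_gbit":  [1, 10, 1, 10, 1, 10],
    "ssd":       [0, 1, 0, 1, 0, 1],
    "exec_time": [820, 610, 540, 390, 400, 310],
})

X, y = runs.drop(columns="exec_time"), runs["exec_time"]
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# A-priori prediction for a configuration never benchmarked; large
# residuals between predictions and later runs can flag anomalies,
# and low-confidence regions suggest which benchmarks to run next.
new_cfg = pd.DataFrame({"mappers": [32], "net_gbit": [10], "ssd": [1]})
print("predicted exec time (s):", model.predict(new_cfg)[0])
```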