9 research outputs found

    ALGORITHM FOR PROCESSING DATA OF GEOLOGICAL SURVEYS USING GIS TECHNOLOGIES (ON THE EXAMPLE OF THE MATERIALS OF DRILLING STUDY OF BREST REGION TERRITORY)

    Get PDF
    The problem of "big data" is considered, together with approaches to its classification and the main methods used in its processing. An analysis of the most common techniques for the preliminary statistical analysis of spatial data is given. Using information obtained from geological surveys carried out in the Brest region as an example, an algorithm was developed that makes it possible to process geological drilling data with the help of geoinformation technologies. The presented algorithm comprises several sequential stages and takes existing approaches to spatial data analysis into account. To automate the data processing steps, the «processing of geological data» toolset was created
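
    As a rough illustration of the preliminary statistical screening of spatial drilling data described above, the following Python sketch uses pandas and geopandas. It is not the «processing of geological data» toolset itself; the column names, coordinate system and statistics are assumptions made for the example.

        # Minimal sketch (not the authors' toolset) of preliminary statistical
        # screening of borehole data before GIS processing; columns are assumed.
        import pandas as pd
        import geopandas as gpd

        # Hypothetical drilling records: coordinates, hole depth, lithological layer.
        records = pd.DataFrame({
            "x": [23.65, 23.70, 23.80, 23.90, 24.05],
            "y": [52.10, 52.12, 52.08, 52.15, 52.11],
            "depth_m": [12.5, 18.0, 9.7, 22.3, 15.1],
            "layer": ["sand", "clay", "sand", "clay", "sand"],
        })

        # Attach point geometry so the table can be handed to GIS tools later.
        gdf = gpd.GeoDataFrame(
            records,
            geometry=gpd.points_from_xy(records["x"], records["y"]),
            crs="EPSG:4326",  # assumed coordinate reference system
        )

        # Stage 1 of a typical pipeline: descriptive statistics per layer.
        print(gdf.groupby("layer")["depth_m"].describe())

        # Stage 2: a crude spatial check - the distance (in degrees here) from each
        # borehole to its nearest neighbour, useful for spotting isolated points.
        nearest = gdf.geometry.apply(
            lambda p: gdf.geometry[~gdf.geometry.geom_equals(p)].distance(p).min())
        print(nearest.describe())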

    Converting a Water Pressurized Network in a Small Town into a Solar Power Water System

    Get PDF
    The efficient management of water and energy is a key challenge for managers of pressurized water systems. Under high environmental pressure, solar power appears as an opportunity to reduce nonrenewable energy expenditure and eliminate emissions. In Spain, new legislation eliminating old taxes associated with solar energy production, a drop in the cost of solar photovoltaic modules, and higher values of irradiance have turned solar-powered water systems into one of the trendiest topics in the water industry. One way to store energy (compulsory in standalone photovoltaic systems) when managing pressurized urban water networks is the use of head tanks, which accumulate water during the day and release it at night. This work compares the pressurized network running as a standalone system with a hybrid solution that combines a solar energy supply and the electricity grid. The indicator used for finding the best choice is the net present value over the solar-powered water system's lifespan. This study also analyzed the possibility of transferring the energy surplus obtained at midday to the electricity grid, an option introduced into Spanish legislation in April 2019. We developed a real case study in a small town in the Alicante Province, whose findings provide planning policymakers with very useful information for this and similar case studies. Antonio Jodar-Abellán acknowledges financial support received from the Spanish FPU scholarship for the training of university teachers. This work has also been partially funded by the Cátedra del Agua of the University of Alicante and the Diputación Provincial de Alicante (https://catedradelaguaua.org/)
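
    Since the selection criterion in the study is the net present value over the system lifespan, a minimal Python sketch of that comparison is given below. The discount rate, lifespan and all cash flow figures are invented placeholders, not data or results from the paper.

        # Hedged sketch of an NPV comparison between a standalone photovoltaic
        # water system and a hybrid PV + grid solution. All figures are
        # illustrative placeholders, not values from the study.

        def npv(rate, cash_flows):
            """Net present value of yearly cash flows; cash_flows[0] is year 0."""
            return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))

        LIFESPAN_YEARS = 25
        DISCOUNT_RATE = 0.04

        # Standalone PV: larger investment (panels plus head-tank storage), no grid bill.
        standalone = [-250_000.0] + [18_000.0] * LIFESPAN_YEARS

        # Hybrid PV + grid: smaller investment, yearly savings plus revenue from
        # selling the midday surplus to the grid (possible in Spain since April 2019).
        hybrid = [-150_000.0] + [14_000.0 + 3_000.0] * LIFESPAN_YEARS

        for name, flows in [("standalone", standalone), ("hybrid", hybrid)]:
            print(f"{name:>10}: NPV = {npv(DISCOUNT_RATE, flows):,.0f} EUR")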

    DENCAST: distributed density-based clustering for multi-target regression

    Get PDF
    Recent developments in sensor networks and mobile computing have led to a huge increase in generated data that needs to be processed and analyzed efficiently. In this context, many distributed data mining algorithms have recently been proposed. Following this line of research, we propose the DENCAST system, a novel distributed algorithm implemented in Apache Spark, which performs density-based clustering and exploits the identified clusters to solve both single- and multi-target regression tasks (and thus solves complex tasks such as time series prediction). Contrary to existing distributed methods, DENCAST does not require a final merging step (usually performed on a single machine) and is able to handle large-scale, high-dimensional data by taking advantage of locality-sensitive hashing. Experiments show that DENCAST performs clustering more efficiently than a state-of-the-art distributed clustering algorithm, especially when the number of objects increases significantly. The quality of the extracted clusters is confirmed by the predictive capabilities of DENCAST on several datasets: it is able to significantly outperform (p-value < 0.05) state-of-the-art distributed regression methods, in both single- and multi-target settings
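
    DENCAST itself is not reproduced here, but the locality-sensitive hashing step it relies on to find neighbours without a single-machine merge can be sketched with the PySpark API. The data, bucket length and distance threshold below are arbitrary assumptions for illustration.

        # Sketch of LSH-based neighbour search in Apache Spark (a building block
        # of density-based clustering), not the DENCAST implementation itself.
        from pyspark.sql import SparkSession
        from pyspark.ml.feature import BucketedRandomProjectionLSH
        from pyspark.ml.linalg import Vectors

        spark = SparkSession.builder.appName("lsh-neighbours-sketch").getOrCreate()

        # Tiny synthetic dataset of feature vectors (ids are arbitrary).
        data = [(0, Vectors.dense([0.0, 0.1])),
                (1, Vectors.dense([0.1, 0.0])),
                (2, Vectors.dense([5.0, 5.1])),
                (3, Vectors.dense([5.1, 5.0]))]
        df = spark.createDataFrame(data, ["id", "features"])

        # Hash vectors into buckets so that close points tend to collide.
        lsh = BucketedRandomProjectionLSH(inputCol="features", outputCol="hashes",
                                          bucketLength=1.0, numHashTables=3)
        model = lsh.fit(df)

        # Approximate self-join: candidate neighbour pairs within distance 1.0.
        # A density-based step would then connect points with enough close
        # neighbours, and the clusters could feed per-cluster regression models.
        pairs = model.approxSimilarityJoin(df, df, threshold=1.0, distCol="dist")
        pairs.filter("datasetA.id < datasetB.id") \
             .select("datasetA.id", "datasetB.id", "dist").show()

        spark.stop()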

    Anomaly Detection and Repair for Accurate Predictions in Geo-distributed Big Data

    No full text
    The increasing presence of geo-distributed sensor networks implies the generation of huge volumes of data from multiple geographical locations at an increasing rate. This raises important issues which become more challenging when the final goal is the analysis of the data for forecasting purposes or, more generally, for predictive tasks. This paper proposes a framework which supports predictive modeling tasks on streaming data coming from multiple geo-referenced sensors. In particular, we propose a distance-based anomaly detection strategy which considers objects described by embedding features learned via a stacked auto-encoder. We then devise a repair strategy which repairs the data detected as anomalous by exploiting non-anomalous data measured by sensors in nearby spatial locations. Subsequently, we adopt Gradient Boosted Trees (GBTs) to predict/forecast the values assumed by a target variable of interest for the repaired, newly arriving (unlabeled) data, using either the original feature representation or the embedding feature representation learned via the stacked auto-encoder. The workflow is implemented with distributed Apache Spark programming primitives and tested in a cluster environment. We perform experiments to assess the performance of each module, separately and in combination, considering the predictive modeling of one-day-ahead energy production for multiple renewable energy sites. Accuracy results show that the proposed framework reduces the error by up to 13.56%. Moreover, scalability results demonstrate the efficiency of the proposed framework in terms of speedup, scaleup and execution time under a stress test
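
    A heavily compressed sketch of the detect-then-repair idea is shown below. The paper learns embeddings with a stacked auto-encoder and runs on Apache Spark; here plain NumPy arrays of raw readings stand in for both, and the threshold and neighbour count are arbitrary assumptions.

        # Hypothetical sketch: distance-based anomaly detection on sensor readings,
        # followed by repair from spatially nearby non-anomalous sensors.
        import numpy as np

        rng = np.random.default_rng(0)

        # Hourly production readings for 6 geo-referenced sites (rows = sites).
        readings = rng.normal(50.0, 5.0, size=(6, 24))
        readings[2] += 60.0                          # inject an anomalous site
        coords = rng.uniform(0, 10, size=(6, 2))     # site locations (arbitrary units)

        # Detection: a site is anomalous if its readings are unusually far, on
        # average, from every other site's readings (the 2-sigma rule is ad hoc).
        dists = np.linalg.norm(readings[:, None, :] - readings[None, :, :], axis=2)
        mean_dist = dists.sum(axis=1) / (len(readings) - 1)
        anomalous = mean_dist > mean_dist.mean() + 2 * mean_dist.std()

        # Repair: replace an anomalous site's readings with the mean of its two
        # spatially nearest non-anomalous neighbours.
        for i in np.where(anomalous)[0]:
            space = np.linalg.norm(coords - coords[i], axis=1)
            space[anomalous] = np.inf                # never repair from an anomaly
            neighbours = np.argsort(space)[:2]
            readings[i] = readings[neighbours].mean(axis=0)

        print("anomalous sites:", np.where(anomalous)[0])
        # A GBT regressor would then be trained on the repaired data to forecast
        # one-day-ahead production for each site.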

    Intelligence Analysis and Big Data. Big-Data-Driven ACH – Better Analyses or Just Another Time Sink?

    Get PDF
    By adopting new technologies related to big data analytics, this thesis argues that there is significant potential to make intelligence analysis faster and to enable the individual analyst to cover a larger volume of information with higher precision. This is made possible by technological developments in data processing and big data systems, which allow information to be transferred, analyzed, and combined faster and more efficiently, and thereby make it possible to interpret large volumes of data. The starting point for this thesis was the observation that big data analytics and intelligence analysis share a number of common features, and that big data analytics may therefore have the potential to contribute to more efficient intelligence analysis. The objective was therefore to explore whether, and in what way, big data analytics can support intelligence analysis. To do this, a method combining big data and intelligence analysis is proposed. The method has been named big-data-driven ACH. It uses big data analytics to uncover patterns and suggest conclusions, while ACH (Analysis of Competing Hypotheses) is used as a framework for developing hypotheses, assessing context, and making the final decisions. The method is evaluated through an experiment, which tests a hypothesis that illegal, unreported, or unregulated (IUU) fishing is taking place in Norwegian waters. Based on three indicators of IUU fishing, algorithms were specified that could answer those indicators. To carry out the experiment, a big data infrastructure was established, and Kystverket's (the Norwegian Coastal Administration's) open AIS stream was used as the big data source. During the experiment, more than 5,000,000 AIS messages were analyzed, and answers to the various indicators were presented in real time. The result of the experiment was the identification of one fishing vessel for which an increased likelihood of IUU fishing can be argued. A key finding is that establishing big data solutions is time-consuming and complicated. Big data infrastructure requires specific expertise to set up, configure, and operate, so close cooperation between the intelligence analyst and data engineer(s) is important when such solutions are built. The study shows that big-data-driven ACH is a powerful tool when applied correctly. The final conclusion is therefore that big-data-driven ACH can increase the scope and applicability of ACH in particular and of intelligence analysis in general, but this presupposes that it is applied to the right problems. It must be seen as a supplement to existing processes and systems, not a replacement
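
    As a concrete illustration of what one of the indicator algorithms might look like, the Python sketch below flags fishing vessels whose AIS positions fall inside a closed area. The bounding box, message fields and vessels are invented for the example; the thesis itself ran such checks against Kystverket's live AIS stream on a big data infrastructure rather than on an in-memory list.

        # Hypothetical indicator check: fishing vessels reporting positions
        # inside a closed area (all values below are invented).
        from dataclasses import dataclass

        @dataclass
        class AisMessage:
            mmsi: int          # vessel identifier
            lat: float
            lon: float
            ship_type: int     # 30 = fishing vessel in the AIS standard

        # Invented closed area (lat/lon bounding box) for illustration only.
        CLOSED_AREA = {"lat_min": 70.0, "lat_max": 71.0,
                       "lon_min": 18.0, "lon_max": 20.0}

        def in_closed_area(msg):
            return (CLOSED_AREA["lat_min"] <= msg.lat <= CLOSED_AREA["lat_max"]
                    and CLOSED_AREA["lon_min"] <= msg.lon <= CLOSED_AREA["lon_max"])

        def indicator_hits(messages):
            """Return MMSIs of fishing vessels observed inside the closed area."""
            return {m.mmsi for m in messages
                    if m.ship_type == 30 and in_closed_area(m)}

        messages = [
            AisMessage(mmsi=257000001, lat=70.5, lon=19.0, ship_type=30),  # inside
            AisMessage(mmsi=257000002, lat=69.0, lon=17.0, ship_type=30),  # outside
            AisMessage(mmsi=257000003, lat=70.6, lon=19.5, ship_type=70),  # cargo ship
        ]
        print(indicator_hits(messages))   # -> {257000001}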