Search CORE

11 research outputs found

Method for creating collections with embedded documents for key-document databases, taking executed queries into account

Author: Ха Ван Муон
Шичкина Юлия Александровна
Publication venue: St. Petersburg Federal Research Center of the Russian Academy of Sciences
Publication date: 01/08/2020

In recent decades, NoSQL databases have steadily grown in popularity, and developers and database administrators increasingly have to migrate databases from the relational model to a NoSQL model, such as the document-oriented database MongoDB. This article describes an approach to such data migration based on set theory. It proposes rules for determining a set of collections with embedded documents for a key-document NoSQL database that is optimal with respect to the execution time of search queries. The number of collections and their structure are optimized with regard to the attributes of database objects that take part in search queries. The input data are the properties of the objects (attributes and relationships between attributes) about which information is stored in the database, and the properties of the queries that are executed most often or that must run as fast as possible. The rules cover the basic relationship types (1-1, 1-M, M-M) typical of the relational model. The proposed set of rules complements the method for creating collections without embedded documents. The article also provides a procedure for deciding which method should be used in which case to make work with databases more efficient. Finally, the article presents the results of testing the proposed method on databases with different initial schemas. The experimental results show that, besides reducing query execution time, the proposed method also significantly reduces the amount of memory required to store the data in the new database
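As a rough illustration of what a collection with embedded documents looks like for a 1-M relationship, the following minimal Python sketch nests child rows inside their parent rows. It is not the authors' method: the function `embed_one_to_many`, the `customers` and `orders` tables, and all field names are hypothetical, and the query-aware optimization described in the abstract is not modeled.

```python
from collections import defaultdict


def embed_one_to_many(parents, children, parent_key, foreign_key, field):
    """Return parent rows as documents with matching child rows nested under `field`.

    Hypothetical helper for illustration only; it performs a plain 1-M embedding
    and does not take query statistics into account.
    """
    grouped = defaultdict(list)
    for child in children:
        # Drop the foreign key: the nesting itself already encodes the relationship.
        nested = {k: v for k, v in child.items() if k != foreign_key}
        grouped[child[foreign_key]].append(nested)
    # Parents without children get an empty list rather than a missing field.
    return [{**parent, field: grouped.get(parent[parent_key], [])} for parent in parents]


# Hypothetical relational tables represented as lists of rows.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
]

documents = embed_one_to_many(customers, orders, "id", "customer_id", "orders")
```

Each resulting document could then be stored in a single collection, so a query for a customer together with their orders would need no join. Whether embedding is preferable to separate collections depends on the queries, which is the question the abstract's rules are said to address.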

Информатика и автоматизация

Method for creating collections with embedded documents for key-document databases, taking executed queries into account

Author: Van Muon Ha
Yulia Aleksandrovna Shichkina
Publication venue: Russian Academy of Sciences, St. Petersburg Federal Research Center
Publication date: 01/08/2020

Directory of Open Access Journals

Analysis of Antenatal Care (ANC) in Maternal and Child Health Surveillance Using NoSQL Aggregation Pipeline Stages

Author: Bahrudin M. J. U. Haris
Fauziyah Anni Karimatul
Heksaputra Dadang
Wijaya Dhina Puspasari
Publication venue: 'Alma Ata University Press'
Publication date: 27/06/2021

In Indonesia, 30.8 percent of children under five are stunted. Bantul, a district in the Province of D.I. Yogyakarta, Indonesia, is a locus of stunting. Bantul has ten villages: Patalan Jetis Village, Canden Jetis Village, Terong Dlingo Village, Argodadi Sedayu Village, Triharjo Pandak Village, Triwidadi Pajangan Village, Jatimulyo Dlingo Village, Datangharjo Sewon Village, Sendangsari Pajangan Village, and Trimulyo Jetis Village. The research focuses on the village of Argodadi Sedayu, where the antenatal care (ANC) study was to be conducted. Antenatal care is a pregnancy check-up performed by a doctor or midwife. An ANC analysis is therefore needed to determine whether diet, parenting, and sanitation are well programmed. The ANC research framework was a method improvement model consisting of indicators, proposed methods, objectives, and measurements. The indicators consist of monitoring instruments and health visits. The proposed method uses aggregation pipeline stages to process the data, which were obtained from a time-series surveillance dataset. The research objective was to analyze the research results accurately according to the proposed method. The indicator analysis was measured using a dashboard application as a performance indicator of the research results. In practice, it is hoped that the health office and related institutions could consider the research results in reducing stunting, or even in elevating Argodadi Sedayu Village in Yogyakarta to a non-locus of stunting, through intensive, well-programmed monitoring of diet, parenting, and sanitation

Ejournal Alma Ata University Yogyakarta

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication date: 01/01/2019

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution to a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfill its requirements. However, every big data application has variable data characteristics, and thus the corresponding data fits into a different data model. This paper presents a feature and use case analysis and comparison of the four main data models, namely document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider when making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed

arXiv.org e-Print Archive

Online Research Database In Technology

MongoDB in Modern Application Development

Author: Kukkonen Mikko
Publication date: 03/03/2023

Over the past two decades, the explosive growth in the use of the internet and mobile devices has enormously increased the amount of data to be stored. This growth in data volume has increased the need to develop new, innovative database solutions. One of these new-generation database systems is MongoDB. This literature review examines the use of MongoDB in modern application development. Drawing on the published literature, it seeks to find out why MongoDB has become one of the most popular database systems over the past decade. The work begins with a general introduction to the basic properties of databases and the most common database types. The published literature is used to examine MongoDB's features and weaknesses and to identify the factors behind its popularity. MongoDB's performance is compared with that of other database solutions using peer-reviewed studies published on the topic. The review found that MongoDB's document-based data model, owing to its simplicity, offers fast data access and database scalability. As for problems, converting inherently relational data into a form suitable for MongoDB was found to cause extra work for the application developer. MongoDB's schemalessness was also found to cause problems, because the consistency of the data model is left to the application developer, which likewise causes extra work. The results of studies on MongoDB's performance showed that MongoDB is particularly efficient at search and insert operations. MongoDB can be a very efficient database in terms of performance, but the achievable efficiency gains vary with the data being handled and the purpose of the application. The study found that, thanks to its horizontal scalability, MongoDB is suited to applications that handle large volumes of data. It also found that, thanks to its schemalessness, MongoDB is well suited as a database for modern applications whose development is fast-paced and whose database architecture has to be changed several times at different stages of development. Thanks to these properties, MongoDB has become one of the most popular database systems over the past decade

Trepo - Institutional Repository of Tampere University

An Organized Repository of Ethereum Smart Contracts’ Source Codes and Metrics

Author: Marchesi Michele
Pierro Giuseppe Antonio
Tonelli Roberto
Publication venue: 'MDPI AG'
Publication date: 01/11/2020

Many empirical software engineering studies show that there is a need for repositories where source codes are acquired, filtered and classified. During the last few years, Ethereum block explorer services have emerged as popular tools for exploring and searching Ethereum blockchain data such as transactions, addresses, tokens, smart contracts’ source codes, prices and other activities taking place on the Ethereum blockchain. Despite the availability of this kind of service, retrieving specific information useful to empirical software engineering studies, such as the study of smart contracts’ software metrics, might require many subtasks, such as searching for specific transactions in a block, parsing files in HTML format, and filtering the smart contracts to remove duplicated code or unused smart contracts. In this paper, we address this problem by creating Smart Corpus, a corpus of smart contracts in an organized, reasoned and up-to-date repository where Solidity source code and other metadata about Ethereum smart contracts can easily and systematically be retrieved. We present Smart Corpus’s design and its initial implementation, and we show how the data set of smart contracts’ source codes in a variety of programming languages can be queried and processed to get useful information on smart contracts and their software metrics. Smart Corpus aims to create a smart-contract repository where smart-contract data (source code, application binary interface (ABI) and byte code) are freely and immediately available and are classified based on the main software metrics identified in the scientific literature. Smart contracts’ source codes have been validated by EtherScan, and each contract comes with its own associated software metrics as computed by the freely available software PASO. Moreover, Smart Corpus can be easily extended as the number of new smart contracts increases day by day

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Occam's Razor For Big Data?

Author: Dresp-Langley Birgitta
Publication date: 01/01/2019

Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam’s razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly from large sets of unstructured image or video data parsimoniously through relatively simple unsupervised machine learning algorithms. Why we still massively lack in expertise for exploiting big data wisely to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data is then reviewed, and examples illustrating why we need subjective views and pragmatic methods to analyze big data contents are brought forward. The review concludes on how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future. Keywords: big data; non-dimensionality; applied data science; paradigm shift; artificial intelligence; principle of parsimony (Occam’s razor)

PhilPapers

Asset control and the information system in one of the Integrated Health Network Directorates, Lima, Peru – 2021

Author: Santos García Francisco del Carmen
Publication venue: 'Universidad Cesar Vallejo'
Publication date: 01/01/2022

This study, titled “Asset Control and the Information System in One of the Integrated Health Network Directorates, Lima, Peru – 2021”, had the general objective of determining whether there is a relationship between asset control and the information system in one of the integrated health network directorates, Lima, Peru – 2021. The study was basic in type, with a non-experimental, descriptive-correlational, cross-sectional design and a quantitative approach. Information was collected through a survey of a population of 101 specialists in charge of the asset area at different health facilities of DIRIS Lima, Peru. The instruments used were developed by the author and yielded high reliability values for the questionnaire, with a Cronbach's alpha of 0.986 for the Asset Control variable and 0.987 for the Information System variable. According to the data obtained in the correlation test, r = 0.968 (Pearson) between the variables Asset Control and Information System, the correlation is high and positive; the significance of p < 0.001 shows that p is less than 0.05, so the null hypothesis is rejected and the alternative hypothesis is accepted

Repositorio Institucional Universidad César Vallejo