121 research outputs found
Impact of Fuzzy Logic in Object-Oriented Database Through Blockchain
In this article, we show that applying fuzzy reasoning to an object-oriented database produces noticeably better results than applying it to a relational database, by applying it to both. A Relational Database Management System (RDBMS) offers a practical and efficient way to locate, store, and retrieve precise data held in a data collection. In practice, however, users often need to pose vague, ambiguous, or imprecise queries. Our work gives users the freedom to query the database in everyday language through a fuzzy relational database (FRDB), enabling a range of answers that serve users in a variety of ways. The term "fuzzy" reflects the fact that attribute membership degrees in a fuzzy knowledge base range from 0 to 1, since the base's formalization relies on fuzzy reasoning. Because clinical healthcare information carries a great deal of uncertainty and imprecision, a fuzzy object-oriented database is designed here for the healthcare domain to reduce the fuzziness present in the fuzzy relational database. To validate the performance and adequacy of fuzzy logic on both databases, a set of fuzzy queries is then posed against the fuzzy relational database and the fuzzy object-oriented database.
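The idea of membership degrees between 0 and 1 can be made concrete with a small sketch of a fuzzy query over a relational-style table. The membership function, field names, and threshold below are illustrative assumptions, not taken from the paper:

```python
# Fuzzy query sketch: rank rows by their degree of membership in a fuzzy set.

def mu_high_bp(systolic):
    """Membership degree in the fuzzy set 'high blood pressure' (0..1),
    using an assumed linear ramp between 120 and 160 mmHg."""
    if systolic <= 120:
        return 0.0
    if systolic >= 160:
        return 1.0
    return (systolic - 120) / 40.0

patients = [
    {"name": "A", "systolic": 115},
    {"name": "B", "systolic": 138},
    {"name": "C", "systolic": 170},
]

# Fuzzy SELECT: keep rows whose membership exceeds a threshold, ranked by degree.
threshold = 0.3
result = sorted(
    ((p["name"], mu_high_bp(p["systolic"])) for p in patients
     if mu_high_bp(p["systolic"]) > threshold),
    key=lambda t: t[1], reverse=True,
)
print(result)  # [('C', 1.0), ('B', 0.45)]
```

Unlike a crisp `WHERE systolic > 140`, the fuzzy query returns a ranked set of partially matching rows, which is the behaviour the abstract attributes to everyday-language querying over an FRDB.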
Enhancing health risk prediction with deep learning on big data and revised fusion node paradigm
With recent advances in health systems, the amount of health data is expanding rapidly in various formats. This data originates from many new sources, including digital records, mobile devices, and wearable health devices. Big health data offers more opportunities for health data analysis and enhancement of health services via innovative approaches. The objective of this research is to develop a framework to enhance health prediction with the revised fusion node and deep learning paradigms. The fusion node is an information fusion model for constructing prediction systems. Deep learning involves the complex application of machine-learning algorithms, such as Bayesian fusion and neural networks, for data extraction and logical inference. Deep learning, combined with information fusion paradigms, can be utilized to provide more comprehensive and reliable predictions from big health data. Based on the proposed framework, an experimental system is developed as an illustration of the framework implementation.
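One simple reading of a fusion node is a component that combines the scores of several predictors into one output. The sketch below assumes a weighted-average fusion over two stand-in models; the models, features, and weights are illustrative, not the paper's actual design:

```python
# Minimal sketch of an information-fusion node combining two risk scores.

def model_vitals(x):
    # Stand-in for a neural-network score computed from vitals features.
    return 0.8 * x["heart_rate_norm"]

def model_history(x):
    # Stand-in for a Bayesian score computed from medical history.
    return 0.6 if x["prior_condition"] else 0.1

def fusion_node(x, w=(0.5, 0.5)):
    """Combine component scores into one risk prediction in [0, 1]."""
    scores = (model_vitals(x), model_history(x))
    return sum(wi * si for wi, si in zip(w, scores))

patient = {"heart_rate_norm": 0.9, "prior_condition": True}
print(round(fusion_node(patient), 2))  # 0.66
```

In a fuller system the weights themselves could be learned, which is where the deep-learning side of the framework would come in.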
Wiki-health: from quantified self to self-understanding
Today, healthcare providers are experiencing explosive growth in data, and medical imaging represents a significant portion of that data. Meanwhile, the pervasive use of mobile phones and the rising adoption of sensing devices, which enable people to collect data independently at any time and place, are leading to a torrent of sensor data. The scale and richness of the sensor data currently being collected and analysed are rapidly growing. The key challenge we face is how to effectively manage and make use of this abundance of easily generated and diverse health data.
This thesis investigates the challenges posed by the explosive growth of available healthcare data and proposes a number of potential solutions to the problem. As a result, a big data service platform, named Wiki-Health, is presented to provide a unified solution for collecting, storing, tagging, retrieving, searching and analysing personal health sensor data. Additionally, it allows users to reuse and remix data, along with analysis results and analysis models, to make health-related knowledge discovery more available to individual users on a massive scale.
To tackle the challenge of efficiently managing the high volume and diversity of big data, Wiki-Health introduces a hybrid data storage approach capable of storing structured, semi-structured and unstructured sensor data and sensor metadata separately. A multi-tier cloud storage system—CACSS has been developed and serves as a component for the Wiki-Health platform, allowing it to manage the storage of unstructured data and semi-structured data, such as medical imaging files. CACSS has enabled comprehensive features such as global data de-duplication, performance-awareness and data caching services. The design of such a hybrid approach allows Wiki-Health to potentially handle heterogeneous formats of sensor data.
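Global data de-duplication, one of the CACSS features mentioned above, is commonly built on content hashing: identical byte streams hash to the same digest and are stored only once. The sketch below illustrates that general technique; it is an assumption for illustration, not the platform's actual implementation:

```python
# Content-addressed store: duplicate uploads share one physical blob.

import hashlib

class DedupStore:
    def __init__(self):
        self._blobs = {}   # content hash -> bytes (stored once)
        self._names = {}   # user-visible key -> content hash

    def put(self, key, data: bytes):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # skip the write if already stored
        self._names[key] = digest

    def get(self, key) -> bytes:
        return self._blobs[self._names[key]]

    def unique_blobs(self) -> int:
        return len(self._blobs)

store = DedupStore()
store.put("scan_patient_1.dcm", b"<imaging bytes>")
store.put("scan_patient_1_copy.dcm", b"<imaging bytes>")  # duplicate upload
print(store.unique_blobs())  # 1
```

For large unstructured objects such as medical imaging files, de-duplicating at upload time saves both storage and cache space.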
To evaluate the proposed approach, we have developed an ECG-based health monitoring service and a virtual sensing service on top of the Wiki-Health platform. The two services demonstrate the feasibility and potential of using the Wiki-Health framework to enable better utilisation and comprehension of the vast amounts of sensor data available from different sources, and both show significant potential for real-world applications.
Study, selection and evaluation of an IoT platform for data collection and analysis for medical sensors
Integrated master's dissertation in Medical Informatics. Every day, huge amounts of data are generated in healthcare environments from several sources, such as medical sensors, EMRs, pharmacies, and medical imaging. All of this data presents a great opportunity for big data applications to discover and understand patterns or associations in the data, in order to support medical decision-making processes. Big data technologies bring several benefits to the healthcare sector, including preventive care, better diagnosis, personalized treatment for each patient, and even reduced medical costs. However, the storage and management of big data present a challenge that traditional database management systems cannot meet. NoSQL databases, by contrast, are distributed and horizontally scalable data stores, making them a suitable solution for handling big data.

Most medical data is generated by sensor-embedded devices. The concept of IoT in the healthcare environment enables the connection and communication of those devices and other available resources over the Internet, to perform or assist in healthcare activities such as diagnosis, monitoring, or even surgery. IoT technologies applied to the healthcare sector aim to improve the access to and quality of care for every patient, as well as to reduce medical costs.

This master's thesis presents the integration of both big data and IoT concepts by developing an IoT platform designed for data collection and analysis for medical sensors. For that purpose, an open-source platform, Kaa, was deployed with both HBase and Cassandra as NoSQL database solutions. Furthermore, a big data processing engine, Spark, was also implemented on the system.

From the results obtained by executing several performance experiments, it is possible to conclude that the developed platform is suitable for deployment in a healthcare environment, where huge amounts of data are rapidly generated. The results also made it possible to compare the performance of the platform with Cassandra and with HBase, showing that the latter delivers slightly better results in terms of average response time.
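The average-response-time comparison described above can be sketched as a simple timing harness. The two query functions below are placeholders standing in for reads against the two backends, not the thesis's actual Kaa/HBase/Cassandra setup:

```python
# Sketch of a latency benchmark comparing two database backends.

import statistics
import time

def benchmark(query_fn, runs=50):
    """Return the mean wall-clock latency of query_fn over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query_fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)

def query_backend_a():
    time.sleep(0.001)   # placeholder for e.g. an HBase read

def query_backend_b():
    time.sleep(0.02)    # placeholder for e.g. a Cassandra read

faster = "A" if benchmark(query_backend_a, runs=10) < benchmark(query_backend_b, runs=10) else "B"
print(f"backend {faster} has the lower average response time")
```

In a real evaluation the placeholder functions would issue actual reads and writes, and the run count would be large enough to smooth out network and JVM warm-up noise.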
A Framework for Resource Efficient Profiling of Spatial Model Performance
Summer 2022. We design models to understand phenomena, make predictions, and/or inform decision-making. This study targets models that encapsulate spatially evolving phenomena. Given a model M, our objective is to identify how well the model predicts across all geospatial extents. A modeler may expect these validations to occur at varying spatial resolutions (e.g., states, counties, towns, census tracts). Assessing a model with all available ground-truth data is infeasible due to the data volumes involved. We propose a framework to assess the performance of models at scale over diverse spatial data collections. Our methodology ensures orchestration of validation workloads while reducing memory strain, alleviating contention, enabling concurrency, and ensuring high throughput. We introduce the notion of a validation budget that represents an upper bound on the total number of observations used to assess the performance of models across spatial extents. The validation budget attempts to capture the distribution characteristics of observations and is informed by multiple sampling strategies. Our design decouples the validation from the underlying model-fitting libraries to interoperate with models designed using different libraries and analytical engines; our advanced research prototype currently supports Scikit-learn, PyTorch, and TensorFlow. We have conducted extensive benchmarks that demonstrate the suitability of our methodology.
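A validation budget can be illustrated as a cap on total observations, split across spatial extents. The sketch below assumes a simple proportional-allocation strategy (one of several sampling strategies the abstract alludes to); the extent names and counts are made up:

```python
# Allocate a fixed validation budget across spatial extents,
# proportionally to each extent's available ground-truth observations.

def allocate_budget(counts, budget):
    """counts: observations available per extent -> per-extent sampling quota."""
    total = sum(counts.values())
    return {
        extent: min(available, round(budget * available / total))
        for extent, available in counts.items()
    }

available = {"county_A": 50_000, "county_B": 30_000, "county_C": 20_000}
quotas = allocate_budget(available, budget=1_000)
print(quotas)  # {'county_A': 500, 'county_B': 300, 'county_C': 200}
```

Only the sampled quota per extent is fetched and scored, which keeps memory strain bounded regardless of how much ground truth exists overall.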
Large spatial datasets: Present Challenges, future opportunities
A key advantage of a well-designed multidimensional database is its ability to allow as many users as possible across an organisation to simultaneously access and view the same data. Large spatial datasets arise from scientific activities that tend to generate large databases, often approaching a terabyte in size and in most cases multidimensional. In this paper, we look at the issues pertaining to large spatial datasets: their features (for example, views), architecture, access methods and, most importantly, design technologies. We also look at some ways of improving the performance of some of the existing algorithms for managing large spatial datasets. The study reveals that the major challenges militating against effective management of large spatial datasets are storage utilisation and computational complexity, both of which are driven by the size of spatial big data, which now tends to exceed the capacity of commonly used spatial computing systems owing to its volume, variety and velocity. Fortunately, these problems can be combated by employing functional programming methods or parallelization techniques.
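The parallelization idea mentioned above amounts to partitioning a large spatial dataset into tiles and processing the tiles concurrently. The sketch below is an assumption for illustration (the tiling scheme and per-tile task are invented); a CPU-bound workload would typically use a process pool rather than threads:

```python
# Split a spatial dataset into tiles and process them across worker threads.

from concurrent.futures import ThreadPoolExecutor

def process_tile(tile):
    """Placeholder per-tile computation, e.g. a spatial aggregate."""
    points = tile["points"]
    return tile["id"], sum(points) / len(points)

def run(tiles, workers=4):
    # map() preserves tile order while tiles execute concurrently
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_tile, tiles))

tiles = [{"id": i, "points": list(range(i, i + 10))} for i in range(4)]
print(run(tiles))  # {0: 4.5, 1: 5.5, 2: 6.5, 3: 7.5}
```

Because each tile is independent, the same decomposition also fits the functional-programming approach the paper mentions: a pure per-tile function composed with a parallel map.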
Big Data and the Internet of Things
Advances in sensing and computing capabilities are making it possible to
embed increasing computing power in small devices. This has enabled the sensing
devices not just to passively capture data at very high resolution but also to
take sophisticated actions in response. Combined with advances in
communication, this is resulting in an ecosystem of highly interconnected
devices referred to as the Internet of Things - IoT. In conjunction, the
advances in machine learning have made it possible to build models on these
ever-increasing amounts of data. Consequently, devices all the way from heavy
assets such as aircraft engines to wearables such as health monitors can now
not only generate massive amounts of data but also draw on aggregate analytics
to "improve" their performance over time. Big data analytics has been
identified as a key enabler for the IoT. In this chapter, we discuss various
avenues of the IoT where big data analytics either is already making a
significant impact or is on the cusp of doing so. We also discuss social
implications and areas of concern.

Comment: 33 pages. Draft of an upcoming book chapter in Japkowicz and Stefanowski (eds.), Big Data Analysis: New Algorithms for a New Society, Springer Series on Studies in Big Data, to appear.
Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective
Machine Learning models are being deployed as parts of real-world systems
with the upsurge of interest in artificial intelligence. The design,
implementation, and maintenance of such systems are challenged by real-world
environments that produce larger amounts of heterogeneous data and users
requiring increasingly faster responses with efficient resource consumption.
These requirements push prevalent software architectures to the limit when
deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging
concept that equips systems better for integrating ML models. DOA extends
current architectures to create data-driven, loosely coupled, decentralised,
open systems. Even though papers on deployed ML-based systems do not mention
DOA, their authors made design decisions that implicitly follow DOA. The
reasons why, how, and the extent to which DOA is adopted in these systems are
unclear. Implicit design decisions limit the practitioners' knowledge of DOA to
design ML-based systems in the real world. This paper answers these questions
by surveying real-world deployments of ML-based systems. The survey shows the
design decisions of the systems and the requirements these satisfy. Based on
the survey findings, we also formulate practical advice to facilitate the
deployment of ML-based systems. Finally, we outline open challenges to
deploying DOA-based systems that integrate ML models.

Comment: Under review.
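The loose coupling the survey attributes to DOA can be shown in miniature: components exchange data through a shared medium rather than calling each other directly. The bus, topics, and score are toy assumptions, not a system from the survey:

```python
# Toy data-oriented wiring: publishers and consumers only know topics,
# never each other, so components can be added or replaced independently.

from collections import defaultdict

class DataBus:
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subs[topic].append(handler)

    def publish(self, topic, record):
        for handler in self._subs[topic]:
            handler(record)

bus = DataBus()
predictions = []

# An ML component consumes raw events and publishes model outputs;
# it never references the downstream consumer directly.
bus.subscribe("events", lambda e: bus.publish("scores", {"id": e["id"], "score": 0.9}))
bus.subscribe("scores", predictions.append)

bus.publish("events", {"id": 1})
print(predictions)  # [{'id': 1, 'score': 0.9}]
```

Swapping the ML component for a retrained model only requires re-subscribing a new handler to "events"; nothing downstream changes, which is the decentralised, loosely coupled property DOA aims for.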