9,189 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
As far as the storage aspect of a big data system is
concerned, the primary facet is the storage infrastructure, and
NoSQL seems to be the right technology to fulfill its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
a feature and use case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed
The CUBIST Project: Combining and Uniting Business Intelligence with Semantic Technologies
As a preface to this Special 'CUBIST' Edition of the International Journal of Intelligent Information Technologies (IJIIT), this article describes the European Framework Seven Combining and Unifying Business Intelligence with Semantic Technologies (CUBIST) project, which ran from October 2010 to September 2013. The project aimed to combine the best elements of traditional BI with the newer semantic technologies of the Semantic Web, in the form of the Resource Description Framework (RDF), and Formal Concept Analysis (FCA). CUBIST's purpose was to provide end-users with "conceptually relevant and user friendly visual analytics" to allow them to explore their data in new ways, discovering hidden meaning and solving hitherto difficult problems. To this end, three of the partners in CUBIST were use-cases: recruitment consultancy, computational biology and the space industry. Each use-case provided its own requirements and problems, which were finally addressed by the prototype CUBIST visual analytics developed in the project
A Conceptual Framework for Adaptation
This paper presents a white-box conceptual framework for adaptation that promotes a neat separation of the adaptation logic from the application logic through a clear identification of control data and their role in the adaptation logic. The framework provides an original perspective from which we survey archetypal approaches to (self-)adaptation ranging from programming languages and paradigms, to computational models, to engineering solutions
A Conceptual Framework for Adaptation
We present a white-box conceptual framework for adaptation. We call it CODA, for COntrol Data Adaptation, since it is based on the notion of control data. CODA promotes a neat separation between application and adaptation logic through a clear identification of the set of data that is relevant for the latter. The framework provides an original perspective from which we survey a representative set of approaches to adaptation ranging from programming languages and paradigms, to computational models and architectural solutions
Feature Extraction and Duplicate Detection for Text Mining: A Survey
Text mining, also known as Intelligent Text Analysis, is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature extraction is one of the important techniques in data reduction to discover the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. The survey covers different text summarization, classification and clustering methods to discover useful features, and also the discovery of query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (remove and replace duplicates). The survey presents existing text mining techniques to extract relevant features, detect duplicates and replace the duplicate data to deliver fine-grained knowledge to the user
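The "remove and replace duplicates" idea in the abstract above can be illustrated with a minimal sketch. It is not taken from the survey: the shingle size `k`, the `threshold` value, the function names, and the rule of keeping the longer text as a crude stand-in for data fusion are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text (punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Short texts: treat the whole word sequence as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs, threshold=0.8, k=3):
    """Greedy near-duplicate removal (hypothetical helper).

    A document is dropped when its shingle set is at least `threshold`
    similar to one already kept. As a simple replacement rule, the longer
    of the two texts is kept as the representative.
    """
    kept = []  # list of (text, shingle set) pairs
    for doc in docs:
        sig = shingles(doc, k)
        for i, (text, kept_sig) in enumerate(kept):
            if jaccard(sig, kept_sig) >= threshold:
                if len(doc) > len(text):
                    kept[i] = (doc, sig)  # replace with the longer variant
                break
        else:
            kept.append((doc, sig))
    return [text for text, _ in kept]
```

Calling `deduplicate` on a list of abstracts returns one representative per group of near-identical texts. The pairwise comparison is quadratic in the number of documents, so this sketch only suits small collections.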
An Ensemble Framework Approach to Crop Type Prediction Using Feature Selection and Multiclass Classification
Crop type classification plays a crucial role in modern agriculture, aiding in yield prediction, resource management, and land-use planning. This paper presents a comprehensive framework for crop type classification utilizing a combination of feature selection techniques, a robust classification algorithm, and a Support Vector Machine (SVM)-based multiclass classification approach. The proposed framework begins with a novel feature selection process that identifies the most relevant attributes from the agricultural and rainfall data. This feature selection step is essential for reducing data dimensionality, enhancing classification accuracy, and improving model interpretability. Following feature selection, a state-of-the-art multiclass classification strategy based on Support Vector Machines is employed. SVMs are known for their capability to handle high-dimensional data and have demonstrated superior performance in various classification tasks. In this framework, SVMs are adapted to handle multiclass crop type classification efficiently. The model is trained on the selected features and optimized using hyperparameter tuning techniques to ensure robust performance
Smart campuses: extensive review of the last decade of research and current challenges
Novel intelligent systems to assist energy transition and improve sustainability can be deployed at different scales, ranging from a house to an entire region. University campuses are an interesting intermediate size (big enough to matter and small enough to be tractable) for research, development, test and training on the integration of smartness at all levels, which has led to the emergence of the concept of “smart campus” over the last few years. This review article proposes an extensive analysis of the scientific literature on smart campuses from the last decade (2010-2020). The 182 selected publications are distributed into seven categories of smartness: smart building, smart environment, smart mobility, smart living, smart people, smart governance and smart data. The main open questions and challenges regarding smart campuses are presented at the end of the review and deal with sustainability and energy transition, acceptability and ethics, learning models, open data policies and interoperability. The present work was carried out within the framework of the Energy Network of the Regional Leaders Summit (RLS-Energy) as part of its multilateral research efforts on smart region