6,205 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
Vectorwise: Beyond Column Stores
textabstractThis paper tells the story of Vectorwise, a high-performance analytical database system, from multiple perspectives: its history from academic project to commercial product, the evolution of its technical
architecture, customer reactions to the product and its future research and development roadmap. One take-away from this story is that the novelty in Vectorwise is much more than just column-storage:
it boasts many query processing innovations in its vectorized execution model, and an adaptive mixed
row/column data storage model with indexing support tailored to analytical workloads. Another one is that there is a long road from research prototype to commercial product, though database research continues to achieve a strong innovative influence on product development
A Technology Proposal for a Management Information System for the Director’s Office, NAL.
This technology proposal attempts in giving a viable solution for a Management Information System (MIS) for the Director's Office. In today's IT scenario, an Organization's success greatly depends on its ability to get accurate and timely data on its operations of varied nature and to manage this data effectively to guide its activities and meet its goals. To cater to the information needs of an Organization or an Office like the Director's Office, information systems are developed and deployed to gather and process data in ways that produce a variety of information to the end-user. MIS can therefore can be defined as an integrated user-machine system for providing information to support operations, management and decision-making functions in an Organization. The system in a nutshell, utilizes computer hardware and software, manual procedures, models for analysis planning, control and decision-making and a database. Using state-of-the-art front-end and back-end web based tools, this technology proposal attempts to provide a single-point Information Management, Information Storage, Information Querying and Information Retrieval interface to the Director and his office for handling all information traffic flow in and out of the Director's Office
CaosDB - Research Data Management for Complex, Changing, and Automated Research Workflows
Here we present CaosDB, a Research Data Management System (RDMS) designed to
ensure seamless integration of inhomogeneous data sources and repositories of
legacy data. Its primary purpose is the management of data from biomedical
sciences, both from simulations and experiments during the complete research
data lifecycle. An RDMS for this domain faces particular challenges: Research
data arise in huge amounts, from a wide variety of sources, and traverse a
highly branched path of further processing. To be accepted by its users, an
RDMS must be built around workflows of the scientists and practices and thus
support changes in workflow and data structure. Nevertheless it should
encourage and support the development and observation of standards and
furthermore facilitate the automation of data acquisition and processing with
specialized software. The storage data model of an RDMS must reflect these
complexities with appropriate semantics and ontologies while offering simple
methods for finding, retrieving, and understanding relevant data. We show how
CaosDB responds to these challenges and give an overview of the CaosDB Server,
its data model and its easy-to-learn CaosDB Query Language. We briefly discuss
the status of the implementation, how we currently use CaosDB, and how we plan
to use and extend it
Principles of Multimedia News Systems for Business Applications
In the past few years considerable demand for business oriented multimedia information systems has developed. A multimedia information system is one that can create, import, integrate, store, retrieve, edit, and delete two or more types of media materials in digital form, such as audio, image, full-motion video, and text information. Multimedia information systems play a central role in many business activities. They represent a very special class complex computing systems. This paper surveys a special type of multimedia information systems: multimedia news systems. Multimedia news systems deal with architectures to manage complex multimedia news databases, online presentation and distribution services or the integration of several existing services to meta-services using intelligent news retrieval engines. The leading presentation platform in multimedia news presentation is news networks providing television services and Internet content distribution. The primary focus is on advanced multimedia news systems infrastructure, document standards, application architecture and principles for multimedia news on the Web that suggest long-term trends in this increasingly important area.multimedia, information systems, business applications
RETHINK big: European roadmap for hardware anc networking optimizations for big data
This paper discusses the results of the RETHINK big Project, a 2-year Collaborative Support Action funded by the European Commission in order to write the European Roadmap for Hardware and Networking optimizations for Big Data. This industry-driven project was led by the Barcelona Supercomputing Center (BSC), and it included large industry partners, SMEs and academia. The roadmap identifies business opportunities from 89 in-depth interviews with 70 European industry stakeholders in the area of Big Data and predicts the future technologies that will disrupt the state of the art in Big Data processing in terms of hardware and networking optimizations. Moreover, it presents coordinated technology development recommendations (focused on optimizations in networking and hardware) that would be in the best interest of European Big Data companies to undertake in concert as a matter of competitive advantage.This project has received funding from the European Union’s Seventh Framework Programme for research, technological
development and demonstration under grant agreement n° 619788. It has also been supported by the Spanish Government (grant SEV2015-0493 of the Severo Ochoa
Program), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and by Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272).Peer ReviewedPostprint (author's final draft
Extract, Transform, and Load data from Legacy Systems to Azure Cloud
Internship report presented as partial requirement for obtaining the Master’s degree in Information
Management, with a specialization in Knowledge Management and Business IntelligenceIn a world with continuously evolving technologies and hardened competitive markets, organisations need to continually be on guard to grasp cutting edge technology and tools that will help them to surpass any competition that arises. Modern data platforms that incorporate cloud technologies, support organisations to strive and get ahead of their competitors by providing solutions that help them capture and optimally use untapped data, and scalable storages to adapt to ever-growing data quantities. Also, adopt data processing and visualisation tools that help to improve the decision-making process. With many cloud providers available in the market, from small players to major technology corporations, this offers much flexibility to organisations to choose the best cloud technology that will align with their use cases and overall products and services strategy. This internship came up at the time when one of Accenture’s significant client in the financial industry decided to migrate from legacy systems to a cloud-based data infrastructure that is Microsoft Azure cloud. During this internship, development of the data lake, which is a core part of the MDP, was done to understand better the type of challenges that can be faced when migrating data from on-premise legacy systems to a cloud-based infrastructure. Also, provided in this work, are the main recommendations and guidelines when it comes to performing a large scale data migration
- …