226 research outputs found
Data Integration over NoSQL Stores Using Access Path Based Mappings
Due to the large amount of data generated by user interactions on the Web, some companies are currently innovating in data management by designing their own systems. Many of these are referred to as NoSQL databases, standing for 'Not only SQL'. With their wide adoption, new needs will emerge, and data integration will certainly be one of them. In this paper, we adapt a framework originally designed for the integration of relational data to a broader context where both NoSQL and relational databases can be integrated. One important extension is the efficient answering of queries expressed over these data sources. The highly denormalized nature of NoSQL databases results in varying performance costs across the possible translations of a query, so a data integration system targeting NoSQL databases needs to generate an optimized translation for each query. Our contributions are (i) an access path based mapping solution that takes advantage of the design choices of each data source, (ii) the integration of preferences to handle conflicts between sources, and (iii) a query language that bridges the gap between the SQL query expressed by the user and the query languages of the data sources. We also present a prototype implementation in which the target schema is represented as a set of relations and which enables the integration of the two most popular NoSQL database models, namely document and column family stores.
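The access-path idea in this abstract can be sketched in a few lines of Python (all names here are hypothetical, not the paper's actual interface): each source advertises the attribute selections it answers cheaply, and a mediator picks the cheapest covering source, using per-source preferences to break ties and resolve conflicts.

```python
# Minimal sketch of access-path-based source selection (hypothetical names):
# each source advertises the access paths it answers efficiently, and the
# mediator picks the cheapest source covering the query's bound attribute.

class Source:
    def __init__(self, name, access_paths, preference):
        self.name = name
        self.access_paths = access_paths  # {attribute: estimated cost}
        self.preference = preference      # higher wins on conflicts

def pick_source(sources, bound_attr):
    """Return the source answering a selection on bound_attr most cheaply,
    breaking cost ties with the per-source preference."""
    candidates = [s for s in sources if bound_attr in s.access_paths]
    if not candidates:
        return None
    return min(candidates,
               key=lambda s: (s.access_paths[bound_attr], -s.preference))

# A document store and a column-family store, each fast on different keys.
doc_store = Source("documents", {"user_id": 1, "email": 5}, preference=2)
col_store = Source("columns", {"user_id": 1, "timestamp": 1}, preference=1)

best = pick_source([doc_store, col_store], "user_id")
# both cost 1 for user_id, so the preferred document store wins
```

In a real mediator the cost estimates would come from each store's design (e.g. a row key lookup is cheap, a full scan is expensive); the tuple ordering in `min` is just one simple way to encode "cheapest first, preference as tie-breaker".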
Analyzing performance of Apache Tez and MapReduce with hadoop multinode cluster on Amazon cloud
Bilateral Ovarian Endometriomas Presenting as Nonprogress of Labor: First Case Report in the Literature. Is Concomitant Surgical Excision during Cesarean Section Advisable?
Objective. To report the first case of bilateral ovarian endometriomas, leading to nonprogress of labour, successfully excised during cesarean section. Design. Case report. Setting. Department of Obstetrics & Gynecology of Dr. RPGMC Tanda, Kangra, India. Patients. A primigravida in labour at term gestation. Interventions. Surgical management. Main Outcome Measures. Description and treatment of a pregnant woman with bilateral ovarian endometriomas during cesarean section. Results. Successful excision of ovarian endometriomas and reconstruction of the ovaries during cesarean section. Conclusion. Management of incidentally detected endometriomas during cesarean section should be individualized, taking into account the symptoms, size, bilaterality, and adhesion to adjacent organs.
Big Data Analysis
The value of big data is predicated on the ability to detect trends and patterns and, more generally, to make sense of large volumes of data that often comprise a heterogeneous mix of formats, structures, and semantics. Big data analysis is the component of the big data value chain that focuses on transforming raw acquired data into a coherent, usable resource suitable for analysis. Drawing on a range of interviews with key stakeholders in small and large companies and in academia, this chapter outlines key insights, the state of the art, emerging trends, future requirements, and sectorial case studies for data analysis.
Brand audit based on marketing mix variables and brand equity. Case: IKARUS
This research analyzes the corporate brand Ikarus through a brand audit built on the marketing mix variables and David Aaker's brand equity dimensions. The study is relevant given the competitive market currently facing Peruvian start-ups, which forces these companies to pursue new actions in quality, differentiation, segmentation, and pricing policy that contribute to their growth.
The research is conducted as a case study of a Peruvian venture in the textile and apparel sector, specifically urban clothing. It sets out to understand the actions taken by the company and the perspective of the brand's customers, and to present recommendations for improvement. The Ikarus corporate brand manufactures and sells urban clothing whose added value lies in its unique designs. Operating in such a competitive context, an evaluation is considered necessary to let the company identify both its strengths and its areas for improvement, so that it can make strategic decisions about its brands aimed at growth in the sector.
The analysis follows a mixed-methods approach, using tools such as focus groups, point-of-sale observations, in-depth interviews with internal and external collaborators, and surveys of the brand's customers, all of which provided useful information for the study. The research also posed two main hypotheses: that the actions of the Ikarus brand have supported its growth, and that the brand's equity is positive.
Based on the marketing mix analysis of the actions taken by the brand, and on the analysis of the brand equity dimensions from the customer's perspective, both hypotheses are confirmed. The Ikarus brand has a set of actions that generate growth, as well as a high level across the different brand equity dimensions: perceived quality/leadership measures, loyalty, awareness, and association/differentiation measures.
Just-In-Time Data Distribution for Analytical Query Processing
Distributed processing commonly requires data to be spread across machines using a priori static or hash-based allocation. In this paper, we explore an alternative approach that starts from a master node in control of the complete database and a variable number of worker nodes for delegated query processing. Data is shipped just-in-time to the worker nodes under a need-to-know policy and is reused, where possible, in subsequent queries. A bidding mechanism among the workers yields a schedule with the most efficient reuse of previously shipped data, minimizing data transfer costs.
Just-in-time data shipment allows our system to benefit from locally available idle resources to boost overall performance. The system is maintenance-free and allocation is fully transparent to users. Our experiments show that the proposed adaptive distributed architecture is a viable and flexible alternative for small-scale MapReduce-type settings.
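The bidding mechanism described above can be sketched as follows (an illustrative Python toy, not the paper's actual protocol): each worker bids the volume of data it would still need shipped for a query, the master assigns the query to the lowest bidder, and the winner's cache grows so later queries reuse what was shipped.

```python
# Illustrative sketch of bid-based scheduling with data reuse.

def bid(cached_parts, needed_parts, part_sizes):
    """Transfer cost a worker would incur: total size of needed parts
    it does not already hold."""
    return sum(part_sizes[p] for p in needed_parts if p not in cached_parts)

def schedule(workers, needed_parts, part_sizes):
    """Assign the query to the worker with the cheapest bid and ship it
    the missing parts, which it caches for subsequent queries."""
    winner = min(workers,
                 key=lambda w: bid(workers[w], needed_parts, part_sizes))
    shipped = set(needed_parts) - workers[winner]
    workers[winner] |= shipped  # reuse on later queries
    return winner, shipped

part_sizes = {"A": 100, "B": 50, "C": 10}
workers = {"w1": {"A"}, "w2": {"B", "C"}}  # parts already shipped earlier

winner, shipped = schedule(workers, ["A", "B"], part_sizes)
# w1 would need B (50 units), w2 would need A (100 units),
# so w1 wins the bid and only B is shipped
```

Rerunning the same query afterwards would make `w1` bid zero, so no data moves at all; that is the reuse effect the abstract attributes to the bidding scheme.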
Hadoop-BAM: directly manipulating next generation sequencing data in the cloud
Summary: Hadoop-BAM is a novel library for the scalable manipulation of aligned next-generation sequencing data in the Hadoop distributed computing framework. It acts as an integration layer between analysis applications and BAM files that are processed using Hadoop. Hadoop-BAM solves the issues related to BAM data access by presenting a convenient API for implementing map and reduce functions that can operate directly on BAM records. It builds on top of the Picard SAM JDK, so tools that rely on the Picard API are expected to be easily convertible to support large-scale distributed processing. In this article we demonstrate the use of Hadoop-BAM by building a coverage summarizing tool for the Chipster genome browser. Our results show that Hadoop offers good scalability, and that moving data in and out of Hadoop between analysis steps should be avoided.
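The coverage-summarizing idea lends itself to a compact map/reduce sketch. The Python below is a stand-alone illustration with a deliberately simplified read model of `(start, length)` tuples, not Hadoop-BAM's actual Java API; in Hadoop-BAM the same map and reduce logic would run over real BAM records.

```python
# Per-base coverage as map/reduce: map each aligned read to (position, 1)
# pairs, then reduce by summing counts per position.

from collections import Counter

def map_read(read):
    """Emit (position, 1) for every reference position a read covers."""
    start, length = read
    return [(pos, 1) for pos in range(start, start + length)]

def reduce_coverage(pairs):
    """Sum the counts per position to get per-base coverage."""
    coverage = Counter()
    for pos, count in pairs:
        coverage[pos] += count
    return dict(coverage)

reads = [(100, 3), (101, 3)]  # simplified reads: (start, length)
pairs = [p for read in reads for p in map_read(read)]
coverage = reduce_coverage(pairs)
# -> {100: 1, 101: 2, 102: 2, 103: 1}
```

In the distributed setting the framework performs the shuffle between the two functions, so each reducer sees all counts for its positions; the logic itself stays this small.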
Challenges in managing real-time data in health information system (HIS)
© Springer International Publishing Switzerland 2016. In this paper, we discuss the challenges of handling real-time medical big data collection and storage in a health information system (HIS). Based on these challenges, we propose a model for real-time analysis of medical big data, and we exemplify the approach with Spark Streaming and Apache Kafka processing a stream of health big data. Apache Kafka works very well for transporting data among different systems, such as relational databases, Apache Hadoop, and non-relational databases. However, Apache Kafka cannot analyze the stream itself, whereas the Spark Streaming framework can perform operations on the stream. We identify the challenges in current real-time systems and propose our solution for coping with medical big data streams.
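The micro-batch flavor of such a pipeline can be illustrated without the actual frameworks. The pure-Python sketch below (with a hypothetical record shape) groups an incoming stream into fixed-size batches and reduces each batch to an alert list, mirroring what a Spark Streaming job consuming a Kafka topic would do once per interval.

```python
# Toy micro-batch pipeline: records arrive in order, are grouped into
# batches, and each batch is reduced to a list of alerts.

def micro_batches(records, batch_size):
    """Group an incoming record stream into fixed-size micro-batches."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

def process_batch(batch, hr_limit=120):
    """Flag patients whose heart rate exceeds the threshold in this batch."""
    return [r["patient"] for r in batch if r["hr"] > hr_limit]

stream = [
    {"patient": "p1", "hr": 80}, {"patient": "p2", "hr": 130},
    {"patient": "p3", "hr": 95}, {"patient": "p1", "hr": 125},
]
alerts = [process_batch(b) for b in micro_batches(stream, 2)]
# -> [["p2"], ["p1"]]
```

In the real pipeline, Kafka provides the ordered, replayable record stream and Spark Streaming provides the batching and fault tolerance; the per-batch analysis function is where domain logic like this threshold check would live.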
Supply chain hybrid simulation: From Big Data to distributions and approaches comparison
The uncertainty and variability of Supply Chains pave the way for simulation to be employed to mitigate such risks. Given the amounts of data generated by the systems that manage relevant Supply Chain processes, it is widely recognized that Big Data technologies may benefit Supply Chain simulation models. Nevertheless, a simulation model should also support statistical distributions, which allow it to be used for purposes such as testing risk scenarios or prediction. However, when Supply Chains are complex and of huge scale, distribution fitting may not be feasible, which often results in users focusing on subsets of problems or selecting samples of elements, such as suppliers or materials. This paper proposes a hybrid simulation model that runs on data stored in a Big Data Warehouse, on statistical distributions, or on a combination of both. The results show that the data-driven approach benefits the simulations and is essential when configuring the model to run on statistical distributions. Furthermore, the paper compares the two approaches, emphasizing the pros and cons of each as well as their differences in computational requirements, thereby establishing a milestone for future research in this domain. This work has been supported by national funds through FCT - Fundação para a Ciência e a Tecnologia within the Project Scope: UID/CEC/00319/2019 and by the Doctoral scholarship PDE/BDE/114566/2016 funded by FCT, the Portuguese Ministry of Science, Technology and Higher Education, through national funds, and co-financed by the European Social Fund (ESF) through the Operational Programme for Human Capital (POCH).
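The hybrid run mode described in the abstract can be sketched roughly as follows (illustrative Python with assumed function and parameter names, not the paper's actual model): sample a value from warehouse history when an element has enough records, and fall back to a fitted distribution otherwise.

```python
# Hybrid sampling sketch: empirical data first, fitted distribution fallback.

import random

def sample_lead_time(history, dist_params, min_records=30, rng=random):
    """Draw one lead time: from recorded history if there is enough of it,
    otherwise from a fitted normal distribution (mu, sigma)."""
    if len(history) >= min_records:
        return rng.choice(history)   # data-driven path
    mu, sigma = dist_params
    return rng.gauss(mu, sigma)      # distribution-driven path

rng = random.Random(42)
rich_history = [5, 6, 7] * 20        # 60 records: enough to use the data
sparse_history = [5]                 # too few: fall back to the distribution

lt_data = sample_lead_time(rich_history, (6.0, 1.0), rng=rng)
lt_dist = sample_lead_time(sparse_history, (6.0, 1.0), rng=rng)
```

The `min_records` threshold is the knob that decides, per supplier or material, which of the two regimes the simulation uses; combining both in one model is what makes the approach hybrid.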
On the use of simulation as a Big Data semantic validator for supply chain management
Simulation stands out as an appropriate method for the Supply Chain Management (SCM) field. Nevertheless, to produce accurate simulations of Supply Chains (SCs), several business processes must be considered. Thus, when using real data in these simulation models, Big Data concepts and technologies become necessary, as the involved data sources generate data at increasing volume, velocity, and variety, in what is known as a Big Data context. While developing such a solution, several data issues were found, and simulation proved more efficient than traditional data profiling techniques at identifying them. This paper therefore proposes the use of simulation as a semantic validator of the data, proposes a classification for such issues, and quantifies their impact on the volume of data used in the final solution. The paper concludes that, while SC simulations using Big Data concepts and technologies are within the grasp of organizations, their data models still require considerable improvement in order to produce faithful mimics of their SCs. It was also found that simulation can help identify and bypass some of these issues. This work has been supported by FCT (Fundação para a Ciência e a Tecnologia) within the Project Scope: UID/CEC/00319/2019 and by the Doctoral scholarship PDE/BDE/114566/2016 funded by FCT, the Portuguese Ministry of Science, Technology and Higher Education, through national funds, and co-financed by the European Social Fund (ESF) through the Operational Programme for Human Capital (POCH).
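The idea of simulation as a semantic validator can be sketched with a few assumed constraint rules (hypothetical record fields, not the paper's actual issue classification): each record is checked against conditions the simulation needs to hold, and violations are labeled rather than silently dropped.

```python
# Sketch of semantic validation: classify, per record, the constraints a
# supply-chain simulation would trip over when consuming that record.

def validate(record):
    """Return a list of issue labels for one movement record."""
    issues = []
    if record["quantity"] <= 0:
        issues.append("non-positive quantity")
    if record["ship_date"] > record["arrival_date"]:
        issues.append("arrival before shipment")
    if record["supplier"] is None:
        issues.append("unknown supplier")
    return issues

records = [
    {"supplier": "s1", "quantity": 10, "ship_date": 1, "arrival_date": 3},
    {"supplier": None, "quantity": -2, "ship_date": 5, "arrival_date": 4},
]
report = {i: validate(r) for i, r in enumerate(records)}
# record 0 is clean; record 1 trips all three issue classes
```

Counting labels across the whole dataset is then a direct way to quantify each issue class's impact on the volume of usable data, which is the kind of measurement the abstract describes.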
