
118 research outputs found

Multidimensional models: A state of art

Author: Carpani Fernando
Publication venue: UR. FI – INCO.

In the last four years, several multidimensional models have been defined. Some of these models focus on query specification, but only a few take a truly conceptual approach. The next section presents some fundamentals of these "query models", which are summarized in [Car 97] and [Sap 99]. For such models, only a few comparisons are made, with special emphasis on data structures. The following sections are dedicated to relevant works on multidimensional conceptual and logical modeling ([Cab 97], [Gol 98], [Fra 99a], [Sap 99a]). Finally, some conclusions about these works are presented.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CMDM: un método conceptual para la especificación de bases multidimensionales

Author: Carpani Fernando
Publication venue: UR. FI-INCO

In recent years, the area of Data Warehouses and OLAP applications has developed significantly. In this type of application, a database is built with a multidimensional view of reality. Despite the broad development of the area, there are still no mechanisms for specifying this type of system that take into account most of the relevant details of a given portion of reality. Relational specifications [Kim 96] leave aside some multidimensional aspects such as generic dimensionality [Cod 93]. Multidimensional specifications that do handle generic dimensionality either treat the loading and/or cleaning of data from lower-level data in a way that is external to the model, or do not address them at all. Few proposals allow specifying which manipulations are authorized and which are not for a given aspect of reality. This work presents a model that allows the detailed specification of a multidimensional database. The specification is built using a graphical language, which describes the data structures and some integrity constraints, and an integrity constraint language, which gives a precise description of the relationships among the data.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A Biased Topic Modeling Approach for Case Control Study from Health Related Social Media Postings

Publication date: 01/01/2017

Online social networks are the hubs of social activity in cyberspace, and using them to exchange knowledge, experiences, and opinions is common. In this work, an advanced topic modeling framework is designed to analyse complex longitudinal health information from social media with minimal human annotation, and Adverse Drug Events and Reaction (ADR) information is extracted and automatically processed using a biased topic modeling method. This framework improves and extends existing topic modeling algorithms that incorporate background knowledge. Using this approach, background knowledge such as ADR terms and other biomedical knowledge can be incorporated during the text mining process, and scores indicating the presence of ADR are generated. A case control study has been performed on a data set of Twitter timelines of women who announced their pregnancy; the goal of the study is to compare the ADR risk of medication usage for each medication category during pregnancy. In addition, to evaluate the predictive power of this approach, another important aspect of personalized medicine was addressed: the prediction of medication usage through the identification of risk groups. During the prediction process, the health information from the Twitter timeline, such as diseases, symptoms, treatments, and effects, is summarized by the topic modeling processes, and the summarization results are used for prediction. Dimension reduction and topic similarity measurement are integrated into this framework for timeline classification and prediction. This work could be applied to provide guidelines for FDA drug risk categories. Currently, this process is done based on laboratory results and reported cases. Finally, a multi-dimensional text data warehouse (MTD) to manage the output from the topic modeling is proposed. Some attempts have also been made to incorporate topic structure (ontology) and the MTD hierarchy. Results demonstrate that the proposed methods show promise and that this system represents a low-cost approach for drug safety early warning.
Dissertation/Thesis: Doctoral Dissertation, Computer Science 201

ASU Digital Repository

ExpRalytics: analyse expressive et efficace de graphes RDF

Author: Guzewicz Pawel
Publication venue: HAL CCSD
Publication date: 06/10/2021

Large (Linked) Open Data are increasingly shared as RDF graphs today. However, such data has not yet reached its full potential in terms of sharing and reuse. The main obstacle lies in users' ability to explore, discover, and grasp the content of RDF graphs; this task is hard because the graphs are naturally heterogeneous and can be both large and complex. We provide new methods to meaningfully summarize data graphs, with a particular focus on RDF graphs. One class of tools for this task is structural RDF graph summaries, which allow users to grasp the different connections between RDF graph nodes. To this end, we introduce our novel RDFQuotient tool, which finds compact yet informative RDF graph summaries that can serve as first-sight visualizations of an RDF graph's structure. We also consider the problem of automatically identifying the k most interesting aggregate queries that can be evaluated on an RDF graph, given an integer k and a user-specified interestingness function. Aggregate queries are routinely used to learn insights from relational data warehouses, and some prior research has addressed the problem of automatically recommending interesting aggregate queries.

INRIA a CCSD electronic archive server

Analytic Extensions to the Data Model for Management Analytics and Decision Support in the Big Data Environment

Author: Akpakpan Nsikak Etim
Publication venue: IUScholarWorks
Publication date: 01/01/2018

From 2006 to 2016, an estimated average of 50% of big data analytics and decision support projects failed to deliver acceptable and actionable outputs to business users. The resulting management inefficiency came at a high cost, with wasted investments estimated at $2.7 trillion in 2016 for companies in the United States. The purpose of this quantitative descriptive study was to examine the data model of a typical data analytics project in a big data environment for opportunities to improve the information created for management problem-solving. The research questions focused on finding artifacts within enterprise data to model key business scenarios for management action. The foundations of the study were information and decision sciences theories, especially information entropy and high-dimensional utility theories. Design-based research in a nonexperimental format was used to examine the data model for the functional forms that mapped the available data to the conceptual formulation of the management problem by combining ontology learning, data engineering, and analytic formulation methodologies. Semantic, symbolic, and dimensional extensions emerged as key functional forms of analytic extension of the data model. The data-modeling approach was applied to a 15-terabyte secondary data set from a multinational medical product distribution company with a profit growth problem. The extended data model simplified the composition of acceptable analytic insights, the derivation of business solutions, and the design of programs to address the ill-defined management problem. The implication for positive social change was the potential for overall improvement in management efficiency and increased participation in advocacy and sponsorship of social initiatives.

Walden University

Personnalisation d'analyses décisionnelles sur des données multidimensionnelles

Author: Jerbi Houssem
Publication venue: HAL CCSD
Publication date: 20/01/2012

This thesis investigates the personalization of OLAP analysis within multidimensional databases. An OLAP analysis is modeled as a graph whose nodes represent analysis contexts and whose edges represent user operations. An analysis context groups the user query together with its result. It is described by a specific tree structure that is independent of the data visualization structures and of query languages. We provide a model for user preferences on the multidimensional schema and values. Each preference is associated with a specific analysis context. Based on these models, we propose a generic framework that includes two personalization processes. The first process, query personalization, enhances the user query with related preferences in order to produce a new query that generates a personalized result. The second process, query recommendation, assists the user throughout the OLAP data exploration phase. Our recommendation framework supports three recommendation scenarios: assisting the user in query composition, suggesting the forthcoming query, and suggesting alternative queries. Recommendations are built progressively based on user preferences. To implement our framework, we developed a prototype system that supports the query personalization and query recommendation processes. We present experimental results showing the efficiency and effectiveness of our approaches.
Keywords: OLAP, decision analysis, query personalization, recommender system, user preference, analysis context, context tree matching

Thèses en Ligne

Scientific Publications of the University of Toulouse II Le Mirail

Toulouse Capitole Publications

Toulouse 1 Capitole Publications

New Fundamental Technologies in Data Mining

Publication venue: IntechOpen
Publication date: 20/04/2021

The progress of data mining technology and its wide public popularity have established a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

Directory of Open Access Books (DOAB)

Yavaa: supporting data workflows from discovery to visualization

Author: Schindler Sirko
Publication date: 01/01/2022

Recent years have witnessed an increasing number of data silos being opened up, both within organizations and to the general public. Scientists publish their raw data as supplements to articles or even as standalone artifacts to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information poses many challenges for users, though. Data is often provided as tables whose seemingly endless rows of daunting numbers are barely accessible. InfoVis can mitigate this gap. However, the visualization options offered are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling: only very few options exist to adjust the data to the current needs, and barely any protection is in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation gets even bleaker. Only recently have tools emerged that can reasonably search for datasets across institutional borders. Easy-to-use ways to combine these datasets are still missing, though. Finally, results generally lack proper documentation of their provenance, so even the most compelling visualizations can be called into question when it remains unclear how they came about. The foundations for a lively exchange and exploitation of open data are set, but the barrier to entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance, reducing the amount of prior experience and skills required. It covers the whole workflow, from identifying proper datasets, through possible transformations, to the export of the result in the form of suitable visualizations.

Digitale Bibliothek Thüringen

Economic indicators used for EU projects, in other criteria of aggregation than national / regional

Author: Săvoiu Gheorghe

Economic and social indicators are created and published at the national and regional levels. Nowadays, both local and territorial indicators can define the stage of social and economic development more adequately and illustrate the impact of European programs and projects in fields such as long-lasting development, entrepreneurial development, scientific research development and strategies, education and learning resources, IT resources, and the dissemination of European culture. While the first part contains only quantitative information provided by our National Institute of Statistics (NIS), the following part gives a few examples of useful economic and social indicators that provide a dynamic view for defining objectives, methods, and implementation. Thus, the need for a quantitative framework of local and territorial indicators calls for an original statistical methodology.
Keywords: gross domestic product, indicators in macro, meso and micro economics, weight of selected factors, representative methodology

Research Papers in Economics

Modélisation des bases de données multidimensionnelles : analyse par fonctions d'agrégation multiples

Author: Hassan Ali
Publication date: 01/12/2014

No abstract in French was provided by the author. No abstract in English was provided by the author.

Toulouse Capitole Publications