118 research outputs found

    Multidimensional models: A state of art

    In the last four years, several multidimensional models have been defined. Some of these models focus on query specification, but only a few take a truly conceptual approach. The next section presents some fundamentals of several "query models". These models are summarized in [Car 97] and [Sap 99]. For such models, only some comparisons are made, with special emphasis on data structures. The following sections are dedicated to relevant works on multidimensional conceptual and logical modeling ([Cab 97], [Gol 98], [Fra 99a], [Sap 99a]). Finally, some conclusions about these works are presented.

    CMDM: un método conceptual para la especificación de bases multidimensionales

    In recent years, the area of Data Warehouses and OLAP applications has seen significant development. In this type of application, a database is built with a multidimensional view of reality. Despite the extensive development of the area, there are still no specification mechanisms for this type of system that take into account most of the relevant details of a given portion of reality. Relational specifications [Kim 96] leave aside some multidimensional aspects such as generic dimensionality [Cod 93]. Multidimensional specifications that handle generic dimensionality either treat the loading and/or cleaning of data from lower-level data externally to the model or do not address them at all. Few proposals allow specifying which manipulations are permitted and which are not on a given aspect of reality. This work presents a model that allows the detailed specification of a multidimensional database. The specification is built using a graphical language that describes the data structures and some integrity constraints, and an integrity constraint language that gives a precise description of the relationships among the data.

    A Biased Topic Modeling Approach for Case Control Study from Health Related Social Media Postings

    Online social networks are the hubs of social activity in cyberspace, and using them to exchange knowledge, experiences, and opinions is common. In this work, an advanced topic modeling framework is designed to analyze complex longitudinal health information from social media with minimal human annotation, and Adverse Drug Events and Reaction (ADR) information is extracted and automatically processed using a biased topic modeling method. This framework improves and extends existing topic modeling algorithms that incorporate background knowledge. Using this approach, background knowledge such as ADR terms and other biomedical knowledge can be incorporated during the text mining process, generating scores that indicate the presence of ADR. A case control study was performed on a data set of Twitter timelines of women who announced their pregnancy; the goal of the study is to compare the ADR risk of medication usage for each medication category during pregnancy. In addition, to evaluate the predictive power of this approach, another important aspect of personalized medicine was addressed: the prediction of medication usage through the identification of risk groups. During the prediction process, the health information from Twitter timelines, such as diseases, symptoms, treatments, and effects, is summarized by the topic modeling processes, and the summarization results are used for prediction. Dimension reduction and topic similarity measurement are integrated into this framework for timeline classification and prediction. This work could be applied to provide guidelines for FDA drug risk categories. Currently, this process is done based on laboratory results and reported cases. Finally, a multi-dimensional text data warehouse (MTD) to manage the output from the topic modeling is proposed. Some attempts have also been made to incorporate topic structure (ontology) and the MTD hierarchy.
Results demonstrate that the proposed methods show promise and that this system represents a low-cost approach for drug safety early warning. Dissertation/Thesis. Doctoral Dissertation, Computer Science, 201

    ExpRalytics: analyse expressive et efficace de graphes RDF

    Large (Linked) Open Data are increasingly shared as RDF graphs today. However, such data have not yet reached their full potential in terms of sharing and reuse. We provide new methods to meaningfully summarize data graphs, with a particular focus on RDF graphs. One class of tools for this task is structural RDF graph summaries, which allow users to grasp the different connections between RDF graph nodes. To this end, we introduce our novel RDFQuotient tool, which finds compact yet informative RDF graph summaries that can serve as first-sight visualizations of an RDF graph's structure. We also consider the problem of automatically identifying the k most interesting aggregate queries that can be evaluated on an RDF graph, given an integer k and a user-specified interestingness function. Aggregate queries are routinely used to learn insights from relational data warehouses, and some prior research has addressed the problem of automatically recommending interesting aggregate queries.
Open data are often shared as RDF graphs, which embody the Linked Open Data principle. However, such data have not reached their full potential for use and sharing. The main obstacle lies in users' ability to explore, discover, and grasp the content of RDF graphs; this task is difficult because the graphs are inherently heterogeneous and can be both large and complex. We propose new methods for summarizing large data graphs, with a particular focus on RDF graphs. To this end, we proposed a new approach for building structural summaries of RDF graphs, namely RDFQuotient. We also consider the problem of automatically identifying the most interesting aggregate queries that can be evaluated on an RDF graph.

    Analytic Extensions to the Data Model for Management Analytics and Decision Support in the Big Data Environment

    From 2006 to 2016, an estimated average of 50% of big data analytics and decision support projects failed to deliver acceptable and actionable outputs to business users. The resulting management inefficiency came at a high cost, with wasted investments estimated at $2.7 trillion in 2016 for companies in the United States. The purpose of this quantitative descriptive study was to examine the data model of a typical data analytics project in a big data environment for opportunities to improve the information created for management problem-solving. The research questions focused on finding artifacts within enterprise data to model key business scenarios for management action. The foundations of the study were information and decision sciences theories, especially information entropy and high-dimensional utility theories. Design-based research in a nonexperimental format was used to examine the data model for the functional forms that mapped the available data to the conceptual formulation of the management problem by combining ontology learning, data engineering, and analytic formulation methodologies. Semantic, symbolic, and dimensional extensions emerged as key functional forms of analytic extension of the data model. The data-modeling approach was applied to a 15-terabyte secondary data set from a multinational medical product distribution company with a profit growth problem. The extended data model simplified the composition of acceptable analytic insights, the derivation of business solutions, and the design of programs to address the ill-defined management problem. The implication for positive social change was the potential for overall improvement in management efficiency and increasing participation in advocacy and sponsorship of social initiatives.

    Personnalisation d'analyses décisionnelles sur des données multidimensionnelles

    This thesis investigates OLAP analysis personalization within multidimensional databases. An OLAP analysis is modeled as a graph in which nodes represent the analysis contexts and edges represent the user operations. The analysis context groups the user query together with its result. It is described by a specific tree structure that is independent of the data visualization structures and query languages. We provide a model for user preferences on the multidimensional schema and values. Each preference is associated with a specific analysis context. Based on these models, we propose a generic framework that includes two personalization processes. The first process, query personalization, aims to enhance the user query with related preferences in order to produce a new query that generates a personalized result. The second personalization process is query recommendation, which helps the user throughout the OLAP data exploration phase. Our recommendation framework supports three recommendation scenarios: assisting the user in query composition, suggesting the forthcoming query, and suggesting alternative queries. Recommendations are built progressively based on user preferences. To implement our framework, we developed a prototype system that supports the query personalization and query recommendation processes. We present experimental results showing the efficiency and the effectiveness of our approaches.
The work presented in this thesis addresses the problem of personalizing OLAP analyses within multidimensional databases. An OLAP analysis is modeled by a graph whose nodes represent the analysis contexts and whose edges reflect the user's operations. The analysis context groups the query and the result. It is described by a specific tree that is independent of data visualization structures and query languages. In addition, we propose a model of user preferences expressed on the multidimensional schema and on the values. Each preference is associated with a particular analysis context. Building on these models, we propose a generic framework comprising two personalization mechanisms. The first mechanism is query personalization. It enriches the user query with the corresponding preferences in order to generate a result that best satisfies the user's needs. The second personalization mechanism is query recommendation, which assists the user throughout the exploration of OLAP data. Three recommendation scenarios are defined: assistance with query formulation, proposal of the next query, and suggestion of alternative queries. These recommendations are built progressively using the user's preferences. To validate our contributions, we developed a prototype that integrates the proposed query personalization and recommendation mechanisms. We present experimental results showing the performance and effectiveness of our approaches. Keywords: OLAP, decision analysis, query personalization, recommender system, user preference, analysis context, context tree matching
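
The graph model described in this abstract (nodes are analysis contexts, edges are user operations) can be illustrated with a minimal sketch. This is not the thesis's implementation: the class names `AnalysisContext` and `AnalysisGraph`, the string representation of queries, and the `"drill-down"` operation label are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AnalysisContext:
    """A node: a user query together with its result (hypothetical, simplified)."""
    query: str
    result: tuple = ()


@dataclass
class AnalysisGraph:
    """An OLAP analysis as a graph: nodes are contexts, edges are user operations."""
    # Maps each context to a list of (operation, next context) pairs.
    edges: dict = field(default_factory=dict)

    def apply(self, source, operation, target):
        # Record that `operation` led the user from `source` to `target`.
        self.edges.setdefault(source, []).append((operation, target))
        self.edges.setdefault(target, [])

    def next_contexts(self, source):
        # Contexts reachable from `source` through a single user operation.
        return [target for _, target in self.edges.get(source, [])]


# Hypothetical two-step analysis: the user drills down from years to months.
by_year = AnalysisContext("sales by year")
by_month = AnalysisContext("sales by month")
graph = AnalysisGraph()
graph.apply(by_year, "drill-down", by_month)
```

Keeping the context immutable (`frozen=True`) lets it serve as a dictionary key, which is one simple way to store the edges; the abstract itself describes contexts as tree structures, which this sketch does not attempt to reproduce.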

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    Yavaa: supporting data workflows from discovery to visualization

    Recent years have witnessed an increasing number of data silos being opened up both within organizations and to the general public: Scientists publish their raw data as supplements to articles or even as standalone artifacts to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information holds many challenges for users, though. Oftentimes data is provided as tables whose seemingly endless rows of daunting numbers are barely accessible. InfoVis can mitigate this gap. However, the visualization options offered are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling. Very few options exist to adjust the data to current needs, and barely any safeguards are in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation gets even bleaker. Only recently have tools emerged to search reasonably for datasets across institutional borders. Easy-to-use ways to combine these datasets are still missing, though. Finally, results generally lack proper documentation of their provenance. So even the most compelling visualizations can be called into question when their origin remains unclear. The foundations for a vivid exchange and exploitation of open data are set, but the barrier to entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance, reducing the amount of prior experience and skills required. It covers the whole workflow, from identifying proper datasets, through possible transformations, to the export of the result in the form of suitable visualizations.

    Economic indicators used for EU projects, in other criteria of aggregation than national / regional

    Economic and social indicators are created and published for national and regional dimensions. Nowadays, both local and territorial indicators can define the stage of social and economic development more adequately and illustrate the impact of European programs and projects in fields such as long-lasting development, entrepreneurial development, scientific research development and strategies, education and learning resources, IT resources, and dissemination of European culture. While the first part contains only quantitative information provided by our National Institute of Statistics (NIS), the following part gives a few examples of useful economic and social indicators that provide a dynamic view for defining objectives, methods, and implementation. Thus the need for a quantitative framework of local and territorial indicators demands an original statistical methodology. Keywords: gross domestic product, indicators in macro, mezo and micro economics, weight of selected factors, representative methodology

    Modélisation des bases de données multidimensionnelles : analyse par fonctions d'agrégation multiples

    The French abstract was not provided by the author. The English abstract was not provided by the author.
