1,696 research outputs found

    Model-to-Model Transformation - From UML Class Diagrams to Labeled Property Graphs

    Conceptual schemas are the basis for building well-grounded information systems: they represent the main concepts of a knowledge domain and the relationships among them. Because conceptual schemas focus on concepts, they are independent of the specific technological platform used to implement them. This allows a single conceptual schema to be transformed into different platform-specific models according to the implementation requirements. The transformation is a non-trivial process that is crucial for the performance and maintainability of the system, as well as for meeting the domain's data requirements. Much research has been done on transforming conceptual schemas into relational data models. Far less work has addressed transforming conceptual schemas into property graphs, a data structure indispensable to building appropriate and efficient systems based on graph databases. This work proposes a systematic approach to transforming conceptual schemas, represented as UML class diagrams, into property graphs by applying a set of transformation rules and patterns. In addition to a practical example that illustrates the proposed approach, the approach is evaluated by measuring quality dimensions such as semantic equivalence, readability, maintainability, complexity, size, and performance.

    “What will become of my work?”: Genius, Gender, and Legacy in the Life of Clara Wieck/Schumann

    Clara Wieck/Schumann (1819–1896) was a musician living in an era increasingly concerned with posterity and canon formation, yet she believed that as a performer, she was destined for posthumous obscurity. On this matter, she clearly misjudged her historical significance. More than 125 years after her death, Wieck/Schumann is still remembered as a child prodigy whose father trained her to become one of the greatest pianists of the day and whose dedication to the highest artistic ideals was matched only by her unconditional devotion to her husband and children. Wieck/Schumann’s artistic accomplishments as an individual (her remarkable success as a virtuosa and her significant compositional output) are frequently juxtaposed with her roles as a mother of eight and the romantic(ized) partner to fellow composer Robert Schumann. The seeming incongruity of her public and private lives has also lent a certain ambiguity and openness to biographical treatments. As such, Wieck/Schumann has been a canvas upon which writers could project contradictory dogmatic images: eternally faithful and adulterous wife, selfless and neglectful mother, humble and glamorous performer, contentedly domestic and artistically stifled. Through close reading of archival material (diaries and correspondence of Wieck/Schumann and her closest associates) and secondary sources (biographies, historical news and entertainment media, and music analyses), this dissertation traces the origins and evolutions of Wieck/Schumann’s legacies through the dual lenses of gender and genius. This project presents a historiographic study of prescriptive ideologies in Wieck/Schumann biographies with attention to shifts in genre conventions and feminist ideas, engages in psychobiographic analysis of her attitudes toward and justification of her own creativity as a function of her personal relationships, investigates how her image was used in Nazi propaganda with consequent backlash, and culminates in critical consideration of the limitations of biography as a tool to analyze her compositions. Taken together, these components demonstrate the (potentially dangerous) cultural power of biography to perpetuate narratives of gender values and genius with direct implications for the musical and cultural reception of creative women like Wieck/Schumann.

    DuckPGQ: Efficient property graph queries in an analytical RDBMS

    In the past decade, property graph databases have emerged as a growing niche in data management. Many native graph systems and query languages have been created, but the functionality and performance still leave much room for improvement. The upcoming SQL:2023 will introduce the Property Graph Queries (SQL/PGQ) sub-language, giving relational systems the opportunity to standardize graph queries and provide mature graph query functionality. We argue that (i) competent graph data systems must build on all technology that makes up a state-of-the-art relational system, (ii) the graph use case requires adding to that a many-source/destination path-finding algorithm and a compact graph representation, and (iii) the graph use case incites research in practical worst-case-optimal joins and factorized query processing techniques. We outline our design of DuckPGQ that follows this recipe, by adding efficient SQL/PGQ support to the popular open-source “embeddable analytics” relational database system DuckDB, also originally developed at CWI. Our design aims at minimizing technical debt using an approach that relies on efficient vectorized UDFs. We benchmark DuckPGQ, showing encouraging performance and scalability on large graph data sets, but also reinforcing the need for future research under (iii).

    The Modernization Process of a Data Pipeline

    Data plays an integral part in a company’s decision-making. Therefore, decision-makers must have the right data available at the right time. Data volumes grow constantly, and new data is continuously needed for analytical purposes. Many companies use data warehouses to store data in an easy-to-use format for reporting and analytics. The challenge with data warehousing is presenting data in one unified structure, because the source data is often gathered from many systems that are structured in various ways. A process called extract, transform, and load (ETL) or extract, load, and transform (ELT) is used to load data into the data warehouse. This thesis describes the modernization of one such pipeline. The previous solution, which used an on-premises Teradata platform for computation and SQL stored procedures for the transformation logic, is replaced by a new solution. The goal of the new solution is a process that uses modern tools, is scalable, and follows programming best practices. The cloud-based Databricks platform is used for computation, and dbt is used as the transformation tool. Lastly, the new and old solutions are compared, and their benefits and drawbacks are discussed.

    Economic and Financial Performance of the Portuguese Wine Market: Business Intelligence Approach

    Dissertation presented as partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. In the past years, the international wine industry has been subject to intensive globalization and international competition as a result of the aggressive entrance of new players (New World Wine) as well as the consolidation of traditional producer countries (Old World Wine), which brings both challenges and opportunities to the industry. In Portugal, winemaking is one of the most relevant socio-economic activities, which makes it one of the country's most critical industries. This study aims to develop a prototype business analytics solution that delivers the necessary inputs to evaluate Portuguese winemaking enterprises’ financial and economic health. Consequently, three research questions were defined: How can one create segments of Portuguese winemaking companies based on their financial performance? What was the evolution of the main financial indicators and margins within each segment and vineyard region? What has been the impact of COVID-19 on these companies’ profitability? Multiple financial ratios for 148 Portuguese companies were computed based on information from 2016 through 2020. Additionally, data for the entire sector was retrieved from the Bank of Portugal for the same period. This study uses analytical tools and dashboards to analyze the performance of the wine sector and of the firms in each region by comparing their financial results. This study contributes to the literature by expanding on previous research on specific regions of Portugal. Additionally, this study presents the results through a business analytics solution, including clustering techniques and AI insights. The analysis of economic-financial performance led to the conclusion that large companies have higher financial performance. Additionally, the sector in general suffered a drop in financial results because of the impacts of the COVID-19 pandemic.

    Utilizing Forest Machine Data in a Cloud Service Environment (Metsäkonedatan hyödyntäminen pilvipalveluympäristössä)

    As forestry work becomes mechanized, logging companies are required to produce timber ever more efficiently, and the quality of their work is monitored nationwide. Companies are obliged to report their production to supervisory bodies, but the existing systems are not particularly efficient for this purpose. Cloud services can make the process more efficient and make information easier to access and compile. Cloud services also make system development smoother than older methods. Development is no longer tied to a single complex solution; instead, tools can be used more flexibly depending on the use case. Essential features of cloud services are a unified data source and inexhaustible resources. Software releases are seamless for developers and users alike. The purpose of this thesis is to develop the software offered by John Deere Forestry Oy and to give an overview of the standard related to forest machine work and of cloud service architecture. The central topic is migrating features of the TimberOffice 5 desktop application to cloud services and developing solutions to problem areas, such as distinguishing operators from one another across a logging organization and making standard-compliant reports easier to use. The scope of the thesis is limited to the data produced by the timber bucking methods used in the Nordic countries and to cloud service architecture. The thesis first discusses the changes to and problem areas of the international standard, after which it narrows the focus to processing forest machine data in cloud service environments. Finally, it reviews possible solutions, from which the applications best suited to the purpose have been selected.

    Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes

    Many real-world applications require making multiple predictions from the same text. Fine-tuning a large pre-trained language model for each downstream task creates a computational burden at inference time because multiple forward passes are needed. To amortize the computational cost, a common solution is to freeze the language model and build lightweight models for downstream tasks on top of fixed text representations. Accordingly, learning fixed but general text representations that generalize well to unseen downstream tasks becomes a challenge. Previous work has shown that the generalizability of representations can be improved by fine-tuning the pre-trained language model on some source tasks in a multi-tasking way. In this work, we propose a prefix-based method to learn the fixed text representations with source tasks. We learn a task-specific prefix for each source task independently and combine them to get the final representations. Our experimental results show that prefix-based training performs better than multi-tasking training and can update the text representations at a smaller computational cost than multi-tasking training. Comment: Preprint

    Using cloud infrastructure to facilitate data collection and conversion of HLA diagnostic data for the 18th International HLA and Immunogenetics Workshop

    The International HLA and Immunogenetics Workshop (IHIW) is a recurring gathering of researchers, technologists and clinicians where participants contribute to collaborative projects with a variety of goals, and reach consensus on definitions and standards for representing HLA and immunogenic determinants. The collaborative and international nature of these workshops, combined with the multifaceted goals of several specific workshop components, necessitates the collection and curation of a wide assortment of data, as well as an adaptable platform for export and analysis. To ensure data quality and the creation of reusable datasets, specific standards and nomenclature conventions are continuously being developed and are an integral part of IHIW. Here we present the 18th IHIW Database, a purpose-built and extensible cloud-based file repository and web application for collecting and analyzing project-specific data. This platform is based on open-source software and uses established HLA data standards and web technologies to facilitate decentralized data repository ownership, reduce duplicated efforts, and promote continuity for future IHIWs.