1,696 research outputs found
Model-to-Model Transformation - From UML Class Diagrams to Labeled Property Graphs
Conceptual schemas are the basis to build well-grounded Information Systems, by representing the main concepts of a domain of knowledge, as well as the relationships among them. Since conceptual schemas focus on the concepts, they are independent of the specific technological platform used to implement them. This allows a single conceptual schema to be transformed into different platform-specific models according to the implementation requirements. This is a non-trivial process that is crucial for the performance and maintainability of the system, as well as for the accomplishment of the domain data requirements. Much research has been done on transforming conceptual schemas into relational data models. Nevertheless, less work has been done on transforming conceptual schemas into property graphs, a data structure indispensable to building appropriate and efficient systems based on graph databases. The work proposes a systematic approach to transform conceptual schemas, represented as UML class diagrams, into property graphs by using a set of transformation rules and patterns applied in a systematic way. Besides a practical example used to help the presentation of the proposed approach, the evaluation has been done by measuring different quality dimensions such as semantic equivalence, readability, maintainability, complexity, size, and performance
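The rule-based transformation described in this abstract can be illustrated with a minimal sketch. The abstract does not list the rules themselves, so the two mappings used here (each class becomes a node label whose attributes become properties; each association becomes a directed edge type) are assumptions chosen for illustration, and every name in the code (`UmlClass`, `UmlAssociation`, `GraphSchema`, `transform`) is hypothetical rather than taken from the work.

```python
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    name: str
    attributes: dict  # attribute name -> type name


@dataclass
class UmlAssociation:
    name: str
    source: str  # name of the class at the source end
    target: str  # name of the class at the target end


@dataclass
class GraphSchema:
    # node label -> {property name: type name}
    node_labels: dict = field(default_factory=dict)
    # (source label, edge type, target label)
    edge_types: list = field(default_factory=list)


def transform(classes, associations):
    """Apply two assumed rules: class -> node label, association -> edge type."""
    schema = GraphSchema()
    # Assumed rule 1: every class becomes a node label; attributes become properties.
    for cls in classes:
        schema.node_labels[cls.name] = dict(cls.attributes)
    # Assumed rule 2: every association becomes a directed edge type.
    for assoc in associations:
        for end in (assoc.source, assoc.target):
            if end not in schema.node_labels:
                raise ValueError(f"association {assoc.name!r} references unknown class {end!r}")
        schema.edge_types.append((assoc.source, assoc.name.upper(), assoc.target))
    return schema


# Hypothetical two-class diagram used only to exercise the sketch.
classes = [
    UmlClass("Person", {"name": "String"}),
    UmlClass("Company", {"title": "String"}),
]
associations = [UmlAssociation("worksFor", "Person", "Company")]
schema = transform(classes, associations)
```

A full approach would also need rules for generalization, multiplicities, and association classes, which this sketch leaves out.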
“What will become of my work?”: Genius, Gender, and Legacy in the Life of Clara Wieck/Schumann
Clara Wieck/Schumann (1819–1896) was a musician living in an era increasingly concerned with posterity and canon formation, yet she believed that as a performer, she was destined for posthumous obscurity. On this matter, she clearly misjudged her historical significance. More than 125 years after her death, Wieck/Schumann is still remembered as a child prodigy whose father trained her to become one of the greatest pianists of the day and whose dedication to the highest artistic ideals was matched only by her unconditional devotion to her husband and children. Wieck/Schumann's artistic accomplishments as an individual (her remarkable success as a virtuosa and her significant compositional output) are frequently juxtaposed with her roles as a mother of eight and the romantic(ized) partner to fellow composer Robert Schumann. The seeming incongruity of her public and private lives has also lent a certain ambiguity and openness in biographical treatments. As such, Wieck/Schumann has been a canvas upon which writers could project contradictory dogmatic images: eternally faithful and adulterous wife, selfless and neglectful mother, humble and glamorous performer, contentedly domestic and artistically stifled. Through close reading of archival material (diaries and correspondence of Wieck/Schumann and her closest associates) and secondary sources (biographies, historical news and entertainment media, and music analyses), this dissertation traces the origins and evolutions of Wieck/Schumann's legacies through the dual lenses of gender and genius.
This project presents a historiographic study of prescriptive ideologies in Wieck/Schumann biographies with attention to shifts in genre conventions and feminist ideas, engages in psychobiographic analysis of her attitudes toward and justification of her own creativity as a function of her personal relationships, investigates how her image was used in Nazi propaganda with consequent backlash, and culminates in critical consideration of the limitations of biography as a tool to analyze her compositions. Taken together, these components demonstrate the (potentially dangerous) cultural power of biography to perpetuate narratives of gender values and genius with direct implications for the musical and cultural reception of creative women like Wieck/Schumann
DuckPGQ: Efficient property graph queries in an analytical RDBMS
In the past decade, property graph databases have emerged as a growing niche in data management. Many native graph systems and query languages have been created, but their functionality and performance still leave much room for improvement. The upcoming SQL:2023 will introduce the Property Graph Queries (SQL/PGQ) sub-language, giving relational systems the opportunity to standardize graph queries and provide mature graph query functionality. We argue that (i) competent graph data systems must build on all technology that makes up a state-of-the-art relational system, (ii) the graph use case requires adding to that a many-source/destination path-finding algorithm and a compact graph representation, and (iii) it incites research in practical worst-case-optimal joins and factorized query processing techniques.
We outline our design of DuckPGQ, which follows this recipe by adding efficient SQL/PGQ support to the popular open-source "embeddable analytics" relational database system DuckDB, also originally developed at CWI. Our design aims at minimizing technical debt using an approach that relies on efficient vectorized UDFs. We benchmark DuckPGQ, showing encouraging performance and scalability on large graph data sets, but also reinforcing the need for future research under (iii).
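Point (ii) of this abstract names two building blocks: a compact graph representation and path-finding from many sources. The sketch below illustrates both ideas in plain Python, assuming compressed sparse row (CSR) as the compact layout and an unweighted breadth-first search as the path-finding step. It is not DuckPGQ's implementation; the function names are hypothetical, and it simply runs one search per source, whereas a production system would likely share work across sources.

```python
from collections import deque


def build_csr(num_vertices, edges):
    """Build CSR arrays (offsets, targets) from a list of (src, dst) pairs."""
    offsets = [0] * (num_vertices + 1)
    # Count the outgoing edges of each vertex.
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum: offsets[v] is where the neighbours of v start in targets.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot per vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def bfs_distances(offsets, targets, source):
    """Return hop counts from source to every vertex (-1 if unreachable)."""
    dist = [-1] * (len(offsets) - 1)
    dist[source] = 0
    queue = deque([source])
    while queue:
        v = queue.popleft()
        # The neighbours of v occupy one contiguous slice of targets.
        for i in range(offsets[v], offsets[v + 1]):
            w = targets[i]
            if dist[w] == -1:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def many_source_distances(offsets, targets, sources):
    """Naive many-source variant: one independent BFS per source."""
    return {s: bfs_distances(offsets, targets, s) for s in sources}


# Hypothetical five-vertex directed graph used only to exercise the sketch.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
offsets, targets = build_csr(5, edges)
distances = many_source_distances(offsets, targets, [0, 3])
```

CSR stores every adjacency list in a single contiguous array, so a neighbour scan is a linear read over one slice rather than a chase through per-vertex containers.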
The Modernization Process of a Data Pipeline
Data plays an integral part in a company's decision-making. Therefore, decision-makers must have the right data available at the right time. Data volumes grow constantly, and new data is continuously needed for analytical purposes. Many companies use data warehouses to store data in an easy-to-use format for reporting and analytics. The challenge with data warehousing is displaying data using one unified structure. The source data is often gathered from many systems that are structured in various ways.
A process called extract, transform, and load (ETL) or extract, load, and transform (ELT) is used to load data into the data warehouse. This thesis describes the modernization process of one such pipeline. The previous solution, which used an on-premises Teradata platform for computation and SQL stored procedures for the transformation logic, is replaced by a new solution. The goal of the new solution is a process that uses modern tools, is scalable, and follows programming best practices. The cloud-based Databricks platform is used for computation, and dbt is used as the transformation tool. Lastly, a comparison is made between the new and old solutions, and their benefits and drawbacks are discussed.
Economic and Financial Performance of the Portuguese Wine Market: Business Intelligence Approach
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
In the past years, the international wine industry has been subject to intensive globalization and international competition as a result of the aggressive entrance of new players (New World Wine) as well as the consolidation of traditional producer countries (Old World Wine), which brings both challenges and opportunities to the industry. In Portugal, winemaking is one of the most relevant socio-economic activities, which makes it one of the country's most critical industries. This study aims to develop a prototype business analytics solution that delivers the necessary inputs to evaluate Portuguese winemaking enterprises' financial and economic health. Consequently, three research questions were defined: How can one create segments of Portuguese winemaking companies based on their financial performance? What was the evolution of the main financial indicators and margins within each segment and vineyard region? What has been the impact of COVID-19 on these companies' profitability?
Multiple financial ratios for 148 Portuguese companies were computed based on information from 2016 through 2020. Additionally, data for the entire sector was retrieved from the Bank of Portugal for the same period. This study uses analytical tools and dashboards to analyze the wine sector's performance and the firms' performance in each region by comparing their financial results.
This study contributes to the literature by expanding on previous research on specific regions of Portugal. Additionally, this study presents its results through a business analytics solution that includes clustering techniques and AI insights.
It was concluded, through the analysis of the economic-financial performance, that large companies have higher financial performance. Additionally, the sector, in general, suffered a drop in financial results because of the impacts of the COVID-19 pandemic.
Metsäkonedatan hyödyntäminen pilvipalveluympäristössä (Utilizing Forest Machine Data in a Cloud Service Environment)
As forestry work has become mechanized, logging companies are required to produce timber ever more efficiently, and the quality of the work is monitored nationwide. Companies are obliged to report their production to supervisory bodies, but the existing systems are not particularly efficient for this purpose. Cloud services can make the process more efficient and make information easier to access and compile. Cloud services also make system development smoother than older methods. Development is no longer tied to a single complex solution; instead, tools can be used more flexibly depending on the use case. Essential features of cloud services are a unified data source and effectively inexhaustible resources. Software releases are seamless for developers and users alike.
The purpose of this thesis is to develop the software offered by John Deere Forestry Oy and to give an overview of the standard related to forest machine work and of cloud service architecture. The central topic is moving features of the TimberOffice 5 desktop application to cloud services and developing solutions to problem areas, such as distinguishing operators from one another across a logging organization, as well as making standard-compliant reports easier to use. The scope of the thesis is limited to data produced by the timber bucking methods used in the Nordic countries and to cloud service architecture. The thesis first discusses the changes to and problem areas of the international standard, and then narrows the topic to processing forest machine data in cloud service environments. Finally, possible solutions are reviewed, from which the applications best suited to the purpose have been selected.
Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes
Many real-world applications require making multiple predictions from the
same text. Fine-tuning a large pre-trained language model for each downstream
task imposes a computational burden at inference time, because each task then
needs its own forward pass. To amortize the computational cost, freezing the language model
and building lightweight models for downstream tasks based on fixed text
representations are common solutions. Accordingly, how to learn fixed but
general text representations that can generalize well to unseen downstream
tasks becomes a challenge. Previous works have shown that the generalizability
of representations can be improved by fine-tuning the pre-trained language
model with some source tasks in a multi-tasking way. In this work, we propose a
prefix-based method to learn the fixed text representations with source tasks.
We learn a task-specific prefix for each source task independently and combine
them to get the final representations. Our experimental results show that
prefix-based training performs better than multi-tasking training and can
update the text representations at a smaller computational cost than
multi-tasking training.
Using cloud infrastructure to facilitate data collection and conversion of HLA diagnostic data for the 18th International HLA and Immunogenetics Workshop
The International HLA and Immunogenetics Workshop (IHIW) is a recurring gathering of researchers, technologists and clinicians where participants contribute to collaborative projects with a variety of goals, and come to consensus on definitions and standards for representing HLA and immunogenic determinants. The collaborative and international nature of these workshops, combined with the multifaceted goals of several specific workshop components, necessitates the collection and curation of a wide assortment of data, as well as an adaptable platform for export and analysis. With the aim of ensuring data quality and creation of reusable datasets, specific standards and nomenclature conventions are continuously being developed, and are an integral part of IHIW. Here we present the 18th IHIW Database, a purpose-built and extensible cloud-based file repository and web application for collecting and analyzing project-specific data. This platform is based on open-source software and uses established HLA data standards and web technologies to facilitate de-centralized data repository ownership, reduce duplicated efforts, and promote continuity for future IHIWs