On Making in the Digital Humanities
On Making in the Digital Humanities fills a gap in our understanding of digital humanities projects and craft by exploring the processes of making as much as the products that arise from it.
The volume draws focus to the interwoven layers of human and technological textures that constitute digital humanities scholarship. To do this, it assembles a group of well-known, experienced and emerging scholars in the digital humanities to reflect on various forms of making (we privilege here the creative and applied side of the digital humanities). The volume honours the work of John Bradley, as his work is totemic of a practice of making that is deeply informed by critical perspectives. A special chapter also honours the profound contributions that this volume's co-editor, Stéfan Sinclair, made to the creative, applied and intellectual praxis of making and the digital humanities. Stéfan Sinclair passed away on 6 August 2020.
The chapters gathered here are individually important, but together they provide a very human view of what it is to do the digital humanities, in the past, present and future. This book will accordingly be of interest to researchers, teachers and students of the digital humanities; creative humanities, including maker spaces and culture; information studies; the history of computing and technology; and the history of science and the humanities.
Identifying reusable knowledge in developer instant messaging communication.
Context and background: Software engineering is a complex and knowledge-intensive activity. Required knowledge (e.g., about technologies, frameworks, and design decisions) changes fast, and the knowledge needs of those who design, code, test and maintain software constantly evolve. At the same time, software developers use a wide range of processes, practices and tools in which they explicitly and implicitly "produce" and capture different types of knowledge.

Problem: Software developers use instant messaging tools (e.g., Slack, Microsoft Teams and Gitter) to discuss development-related problems, share experiences and collaborate in projects. This communication takes place in chat rooms that accumulate potentially relevant knowledge to be reused by other developers. In this research we therefore analyze whether there is reusable knowledge in developer instant messaging communication by exploring (a) which instant messaging platforms can be a source of reusable knowledge, and (b) which software engineering themes represent the main discussions of developers in instant messaging communication. We also analyze how this reusable knowledge can be identified with the use of topic modeling (a natural language processing technique for discovering abstract topics in text) by (c) surveying the literature on how topic modeling has been applied in software engineering research, and (d) evaluating how topic models perform with developer instant messages.

Method: First, we conducted a Field Study through an exploratory case study and a reflexive thematic analysis to check whether there is reusable knowledge in developer instant messaging communication and, if so, what this knowledge (the main themes discussed) is. Then, we conducted a Sample Study to explore how reusable knowledge in developer instant messaging communication can be identified. In this study, we applied a literature survey and software repository mining (i.e., short-text topic modeling).

Findings and contributions: We (a) developed a comparison framework for instant messaging tools, (b) identified a map of the main themes discussed in chat rooms of an instant messaging tool (Gitter, a platform used by software developers), (c) provided a comprehensive literature review that offers insights and references on the use of topic modeling in software engineering, and (d) provided an evaluation of the performance of topic models applied to developer instant messages, based on topic coherence metrics and human judgment of topic quality.
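As a concrete illustration of step (d), the sketch below shows what topic modeling with coherence scoring can look like in Python. It assumes the gensim library and uses standard LDA as a generic stand-in for the short-text topic models evaluated in the thesis; the tokenized messages are invented examples, not data from the study.

    # Minimal sketch: LDA over developer chat messages, scored with a
    # topic coherence metric (illustrative; not the thesis's pipeline).
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel
    from gensim.models.coherencemodel import CoherenceModel

    # Hypothetical, already-tokenized chat messages.
    messages = [
        ["maven", "build", "fails", "dependency", "conflict"],
        ["docker", "container", "port", "mapping", "compose"],
        ["maven", "dependency", "version", "upgrade"],
        ["docker", "compose", "network", "bridge"],
    ]

    dictionary = Dictionary(messages)
    corpus = [dictionary.doc2bow(msg) for msg in messages]

    # Train a small LDA model; the number of topics would normally be tuned.
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
                   random_state=42, passes=10)

    # Topic coherence (c_v), one way to judge topic quality automatically.
    coherence = CoherenceModel(model=lda, texts=messages,
                               dictionary=dictionary, coherence="c_v")
    print("c_v coherence:", coherence.get_coherence())
    for topic_id, words in lda.print_topics():
        print(topic_id, words)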
An Environment for the Analysis and Quality Assessment of Big and Linked Data
Linking and publishing data in the Linked Open Data format increases the interoperability and discoverability of resources over the Web. To accomplish this, the process involves several design decisions based on the Linked Data principles, which recommend, on the one hand, using standards for representing and accessing data on the Web and, on the other hand, setting hyperlinks between data from different sources.
Despite the efforts of the World Wide Web Consortium (W3C), the main international standards organization for the World Wide Web, there is no single tailored formula for publishing data as Linked Data. In addition, the quality of the published Linked Open Data (LOD) is a fundamental issue that has yet to be thoroughly managed and considered.
In this doctoral thesis, the main objective is to design and implement a novel framework for selecting, analyzing, converting, interlinking, and publishing data from diverse sources, while paying close attention to quality assessment throughout all steps and modules of the framework. The goal is to examine whether, and to what extent, Semantic Web technologies are applicable for merging data from different sources and enabling end-users to obtain additional information that was not available in the individual datasets, in addition to integration into the Semantic Web community space. Additionally, the thesis intends to validate the applicability of the process in a specific and demanding use case, namely creating and publishing an Arabic Linked Drug Dataset based on open drug datasets from selected Arabic countries, and to discuss the quality issues observed throughout the linked data life-cycle. To that end, a Semantic Data Lake was established in the pharmaceutical domain that allows further integration and the development of different business services on top of the integrated data sources. Through data representation in an open, machine-readable format, the approach offers an optimal solution for information and data dissemination, for building domain-specific applications, and for enriching and gaining value from the original dataset. This thesis showcases how the pharmaceutical domain benefits from evolving research trends to build competitive advantages. However, as elaborated in this thesis, a better understanding of the specifics of the Arabic language is required to extend the use of linked data technologies in the targeted Arabic organizations.

Linking and publishing data in the Linked Open Data format increases the interoperability and discoverability of resources over the Web. The process is based on the Linked Data principles (W3C, 2006), which on the one hand elaborate standards for representing and accessing data on the Web (RDF, OWL, SPARQL) and on the other hand suggest the use of hyperlinks between data from different sources.

Despite the efforts of the W3C consortium (the main international standards organization for the Web), there is no single formula for implementing the process of publishing data in the Linked Data format. Considering that the quality of published linked open data is decisive for the future development of the Web, the main goals of this doctoral dissertation are (1) the design and implementation of an innovative framework for the selection, analysis, conversion, interlinking and publishing of data from different sources, and (2) an analysis of the application of this approach in the pharmaceutical domain.

The dissertation investigates in detail the question of the quality of big and linked data ecosystems (Linked Data Ecosystems), taking into account the possibility of reusing open data. The work is motivated by the need to enable researchers from Arabic countries to use Semantic Web technologies to link their data with open data such as DBpedia. The goal is to examine whether open data from Arabic countries enable end-users to obtain additional information that is not available in the individual datasets, in addition to integration into the Semantic Web space.

The dissertation proposes a methodology for developing applications that work with Linked data and implements a software solution that enables querying of a consolidated dataset of drugs from selected Arabic countries. The consolidated dataset is implemented in the form of a Semantic Data Lake.

This thesis shows how the pharmaceutical industry benefits from the application of innovative technologies and research trends from the field of semantic technologies. However, as elaborated in this thesis, a better understanding of the specifics of the Arabic language is needed in order to implement Linked Data tools and apply them to data from Arabic countries.
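To make the core conversion-and-interlinking step concrete, the following minimal Python sketch (rdflib assumed; the drug record and the example.org namespace are illustrative, not the framework's actual vocabulary) shows how a local drug entry can be represented as RDF and linked to DBpedia:

    # Minimal sketch: one drug record as RDF, interlinked with DBpedia via
    # owl:sameAs (illustrative data and namespace, not the thesis's schema).
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import OWL, RDF, RDFS

    EX = Namespace("http://example.org/drug/")   # hypothetical namespace
    DBR = Namespace("http://dbpedia.org/resource/")

    g = Graph()
    g.bind("ex", EX)
    g.bind("owl", OWL)

    drug = EX["paracetamol_500_jo"]              # made-up local identifier
    g.add((drug, RDF.type, EX.Drug))
    g.add((drug, RDFS.label, Literal("Paracetamol 500 mg", lang="en")))
    # The hyperlink between sources required by the Linked Data principles:
    g.add((drug, OWL.sameAs, DBR.Paracetamol))

    print(g.serialize(format="turtle"))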
Computational Methods for Interactive and Explorative Study Design and Integration of High-throughput Biological Data
The increase in the use of high-throughput methods to gain insights into biological systems has come with new challenges. Genomics, transcriptomics, proteomics, and metabolomics lead to a massive amount of data and metadata. While this wealth of information has resulted in many scientific discoveries, new strategies are needed to cope with the ever-growing variety and volume of metadata. Despite efforts to standardize the collection of study metadata, many experiments cannot be reproduced or replicated. One reason for this is the difficulty of providing the necessary metadata. The large sample sizes that modern omics experiments enable also make it increasingly complicated for scientists to keep track of every sample and the needed annotations. The many data transformations that are often needed to normalize and analyze omics data require a further collection of all parameters and tools involved. A second possible cause is missing knowledge about the statistical design of studies, related both to study factors and to the sample size required to make significant discoveries.
In this thesis, we develop a multi-tier model for experimental design and a portlet for interactive web-based study design. Through the input of experimental factors and the number of replicates, users can easily create large factorial experimental designs. Changes or additional metadata can be quickly uploaded via user-defined spreadsheets including sample identifiers. In order to comply with existing standards and provide users with a quick way to import existing studies, we provide full interoperability with the ISA-Tab format. We show that both the data model and the portlet are easily extensible to create additional tiers of samples annotated with technology-specific metadata.
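A full-factorial design of this kind can be enumerated mechanically, as the following minimal Python sketch shows; the factor names, levels and replicate count are illustrative and not tied to the portlet's data model.

    # Minimal sketch: enumerating a full-factorial design with replicates.
    from itertools import product

    factors = {
        "genotype": ["wild type", "mutant"],
        "treatment": ["control", "drug"],
        "timepoint": ["4h", "24h"],
    }
    replicates = 3

    samples = []
    for combination in product(*factors.values()):
        condition = dict(zip(factors, combination))
        for rep in range(1, replicates + 1):
            samples.append({**condition, "replicate": rep})

    print(len(samples))  # 2 * 2 * 2 level combinations * 3 replicates = 24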
We tackle the problem of unwieldy experimental designs by creating an aggregation graph. Based on our multi-tier experimental design model, similar samples, their sources, and analytes are summarized, creating an interactive summary graph that focuses on study factors and replicates. Thus, we give researchers a quick overview of sample sizes and the aims of different studies. This graph can be included in our portlets or used as a standalone application and is compatible with the ISA-Tab format. We show that this approach can be used to explore the quality of publicly available experimental designs and metadata annotation.
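The aggregation idea can be pictured with a small sketch: samples that share the same factor levels collapse into a single node weighted by replicate count. Python with networkx is assumed here, and the sample tuples are invented; the thesis's graph operates on the full multi-tier model.

    # Minimal sketch: collapsing similar samples into a summary graph.
    from collections import Counter
    import networkx as nx

    # Hypothetical flat samples: (source organism, tissue, condition).
    samples = [
        ("mouse", "liver", "control"), ("mouse", "liver", "control"),
        ("mouse", "liver", "drug"), ("mouse", "liver", "drug"),
        ("mouse", "liver", "drug"),
    ]

    G = nx.DiGraph()
    for (source, tissue, condition), n in Counter(samples).items():
        condition_node = f"{tissue}/{condition}"
        G.add_node(source)
        G.add_node(condition_node, replicates=n)  # n similar samples collapsed
        G.add_edge(source, condition_node)

    for node, data in G.nodes(data=True):
        print(node, data)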
The third part of this thesis contributes to more statistically sound experiment planning for differential gene expression experiments. We integrate two tools for the prediction of statistical power and sample size estimation into our portal. This integration enables the use of existing data to arrive at more accurate estimates of sample variability. Additionally, the statistical power of existing experimental designs with certain sample sizes can be analyzed. All results and parameters are stored and can be used for later comparison.
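As a generic stand-in for this kind of calculation (the portal integrates dedicated tools for expression data; this sketch instead uses a plain two-sample t-test power analysis from statsmodels with illustrative parameters):

    # Minimal sketch: sample size and power for a two-group comparison.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Samples per group needed to detect a standardized effect size of 0.8
    # at alpha = 0.05 with 80% power:
    n = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8)
    print(f"samples per group: {n:.1f}")

    # Conversely, the power achieved by an existing design with 10 samples
    # per group:
    power = analysis.solve_power(effect_size=0.8, alpha=0.05, nobs1=10)
    print(f"achieved power: {power:.2f}")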
Even perfectly planned and annotated experiments cannot eliminate human error. Based on our model, we develop an automated workflow for microarray quality control, enabling users to inspect the quality of normalization and to cluster samples by study factor levels. We import a publicly available microarray dataset to assess our contributions to reproducibility and explore alternative analysis methods based on statistical power analysis.
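The two quality-control steps mentioned here, checking normalization and clustering samples by factor level, might be sketched as follows. This is a simplified stand-in with synthetic data (numpy and scipy assumed), not the thesis's actual microarray workflow.

    # Minimal sketch: quantile normalization of an expression matrix,
    # then hierarchical clustering of samples.
    import numpy as np
    from scipy.cluster.hierarchy import linkage

    rng = np.random.default_rng(0)
    expr = rng.lognormal(mean=3, sigma=1, size=(1000, 6))  # genes x samples

    # Quantile normalization: force every sample to a common distribution
    # (simplified; ties are not handled specially here).
    ranks = expr.argsort(axis=0).argsort(axis=0)
    mean_quantiles = np.sort(expr, axis=0).mean(axis=1)
    normalized = mean_quantiles[ranks]

    # Cluster samples; in QC, samples should group by study factor level,
    # and outlier samples become visible.
    Z = linkage(normalized.T, method="average", metric="correlation")
    print(Z)  # pass Z to scipy's dendrogram() with factor-level labels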
Cyber Security of Critical Infrastructures
Critical infrastructures are vital assets for public safety, economic welfare, and the national security of countries. The vulnerabilities of critical infrastructures have increased with the widespread use of information technologies. As critical national infrastructures become more vulnerable to cyber-attacks, their protection becomes a significant issue for organizations as well as nations. The risks to continued operations from failing to upgrade aging infrastructure or not meeting mandated regulatory regimes are considered highly significant, given the demonstrable impact of such circumstances. Due to the rapid increase of sophisticated cyber threats targeting critical infrastructures with significant destructive effects, the cybersecurity of critical infrastructures has become an agenda item for academics, practitioners, and policymakers. A holistic view that covers technical, policy, human, and behavioural aspects is essential to handle the cyber security of critical infrastructures effectively. Moreover, the ability to attribute crimes to criminals is a vital element of avoiding impunity in cyberspace. In this book, both research and practical aspects of cyber security considerations in critical infrastructures are presented. Aligned with the interdisciplinary nature of cyber security, authors from academia, government, and industry have contributed 13 chapters. The issues discussed and analysed include cybersecurity training, maturity assessment frameworks, malware analysis techniques, ransomware attacks, security solutions for industrial control systems, and privacy preservation methods.
Assessing Comment Quality in Object-Oriented Languages
Previous studies have shown that high-quality code comments support developers in software maintenance and program comprehension tasks. However, the semi-structured nature of comments, the several conventions for writing comments, and the lack of tools for assessing all aspects of comment quality make comment evaluation and maintenance a non-trivial problem. To understand what specifies high-quality comments and to build effective assessment tools, our thesis emphasizes acquiring a multi-perspective view of comments, which can be approached by analyzing (1) the academic support for comment quality assessment, (2) developer commenting practices across languages, and (3) developer concerns about comments.
Our findings regarding academic support for assessing comment quality showed that researchers have primarily focused on Java over the last decade, even though the trend of using polyglot environments in software projects is increasing. Similarly, the trend of analyzing specific types of code comments (method comments or inline comments) is increasing, but studies rarely analyze class comments. We found 21 quality attributes that researchers consider when assessing comment quality, and manual assessment is still the most commonly used technique for assessing various quality attributes. Our analysis of developer commenting practices showed that developers embed a mixed level of detail in class comments, ranging from high-level class overviews to low-level implementation details, across programming languages. They follow style guidelines regarding what information to write in class comments but violate structure and syntax guidelines. They primarily face problems locating relevant guidelines for writing consistent and informative comments, verifying the adherence of their comments to the guidelines, and evaluating the overall state of comment quality.
To help researchers and developers build comment quality assessment tools, we contribute: (i) a systematic literature review (SLR) of ten years (2010–2020) of research on assessing comment quality, (ii) a taxonomy of quality attributes used to assess comment quality, (iii) an empirically validated taxonomy of class comment information types from three programming languages, (iv) a multi-programming-language approach to automatically identify these comment information types, (v) an empirically validated taxonomy of comment convention-related questions and recommendations from various Q&A forums, and (vi) a tool to gather discussions from multiple developer sources, such as Stack Overflow and mailing lists.
Our contributions provide various kinds of empirical evidence of developers' interest in reducing effort in the software documentation process, of the limited support developers get in automatically assessing comment quality, and of the challenges they face in writing high-quality comments. This work lays the foundation for future effective comment quality assessment tools and techniques.
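To illustrate what contribution (iv) involves, a comment-classification step might look like the sketch below. scikit-learn is assumed, and the sentences, labels and information types are invented examples, not the validated taxonomy from the thesis.

    # Minimal sketch: classifying comment sentences into information types.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    sentences = [
        "Represents a user account in the system.",   # summary
        "See also the AccountManager class.",         # pointer
        "TODO: handle concurrent modification.",      # todo
        "This class models a bank transaction.",      # summary
        "Refer to Session for lifecycle details.",    # pointer
        "TODO: remove the deprecated constructor.",   # todo
    ]
    labels = ["summary", "pointer", "todo",
              "summary", "pointer", "todo"]

    classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                               LogisticRegression(max_iter=1000))
    classifier.fit(sentences, labels)
    print(classifier.predict(["TODO: add input validation."]))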
Multimodal Emotion Recognition among Couples from Lab Settings to Daily Life using Smartwatches
Couples generally manage chronic diseases together, and the management takes an emotional toll on both patients and their romantic partners. Consequently, recognizing the emotions of each partner in daily life could provide insight into their emotional well-being in chronic disease management. The emotions of partners are currently inferred in the lab and in daily life using self-reports, which are not practical for continuous emotion assessment, or observer reports, which are manual, time-intensive, and costly. Currently, there exists no comprehensive overview of work on emotion recognition among couples. Furthermore, approaches for emotion recognition among couples have (1) focused on English-speaking couples in the U.S., (2) used data collected in the lab, and (3) performed recognition using observer ratings rather than partners' self-reported/subjective emotions. In the body of work contained in this thesis (8 papers: 5 published and 3 currently under review in various journals), we fill the current literature gap on couples' emotion recognition, develop emotion recognition systems using 161 hours of data from a total of 1,051 individuals, and make contributions towards taking couples' emotion recognition from the lab, which is the status quo, to daily life. This thesis contributes toward building automated emotion recognition systems that would eventually enable partners to monitor their emotions in daily life and enable the delivery of interventions to improve their emotional well-being.

PhD Thesis, 2022, ETH Zurich
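A minimal sketch of the kind of pipeline such systems build on appears below: windowed features from a physiological signal feed a classifier of self-reported emotion. Python with scikit-learn is assumed, and the signal, features and labels are synthetic stand-ins, far simpler than the multimodal smartwatch data used in the thesis.

    # Minimal sketch: window features -> binary emotion classifier.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)

    def window_features(window):
        # Simple summary statistics of, e.g., heart-rate samples.
        return [window.mean(), window.std(), window.min(), window.max()]

    # Synthetic stand-in: 200 one-minute windows of heart rate at 1 Hz,
    # labeled 0 = negative, 1 = positive self-reported emotion.
    labels = rng.integers(0, 2, size=200)
    X = np.array([window_features(rng.normal(70 + 10 * y, 5, size=60))
                  for y in labels])

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, labels)
    print("training accuracy:", clf.score(X, labels))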
Agile Processes in Software Engineering and Extreme Programming – Workshops
This open access book constitutes the papers from the five research workshops, the poster presentations, and the two panel discussions presented at XP 2021, the 22nd International Conference on Agile Software Development, which was held online during June 14-18, 2021. XP is the premier agile software development conference combining research and practice. It is a unique forum where agile researchers, practitioners, thought leaders, coaches, and trainers get together to present and discuss their most recent innovations, research results, experiences, concerns, challenges, and trends. XP conferences provide an informal environment to learn and trigger discussions, and they welcome both people new to agile and seasoned agile practitioners. The 18 papers included in this volume were carefully reviewed and selected from 37 submissions. They stem from the following workshops: the 3rd International Workshop on Agile Transformation; the 9th International Workshop on Large-Scale Agile Development; the 1st International Workshop on Agile Sustainability; the 4th International Workshop on Software-Intensive Business; and the 2nd International Workshop on Agility with Microservices Programming.
Utilizing educational technology in computer science and programming courses : theory and practice
There is one thing Computer Science Education researchers seem to agree on: programming is a difficult skill to learn. Educational technology can potentially solve a number of difficulties associated with programming and computer science education by automating assessment, providing immediate feedback, and gamifying the learning process. Still, there are two very important issues to solve regarding the use of technology: what tools to use, and how to apply them?
In this thesis, I present a model for successfully adapting educational technology to computer science and programming courses. The model is based on several years of studies conducted while developing and utilizing an exercise-based educational tool in various courses. The focus of the model is in improving student performance, measured by two easily quantifiable factors: the pass rate of the course and the average grade obtained from the course.
The final model consists of five features that need to be considered in order to adapt technology effectively into a computer science course: active learning and continuous assessment, heterogeneous exercise types, electronic examination, tutorial-based learning, and a continuous feedback cycle. Additionally, I recommend that student mentoring be provided and that the cognitive load of adapting the tools be considered when applying the model. The features are classified as core components, supportive components or evaluation components based on their role in the complete model.
Based on the results, it seems that adapting the complete model can increase the pass rate in a statistically significant way and produce higher grades compared with a "traditional" programming course. The results also indicate that although adapting the model partially can create some improvements in performance, all features are required for the full effect to take place.
Naturally, there are some limits to the model. First, I do not consider it the only possible model for adapting educational technology into programming or computer science courses. Second, in addition to students' performance, there are various other factors in creating a satisfying learning experience that need to be considered when refactoring courses. Still, the model presented can provide significantly better results and, as such, works as a base for future improvements in computer science education.

The difficulty of learning programming is one of the few things on which nearly all computing education researchers are more or less unanimous. Educational technology makes it possible to solve several problems related to learning programming, for example by utilizing automatic assessment, immediate feedback and gamification. Technology, however, involves two essential questions: which tools to use, and how to take them into effective use on courses?

This dissertation presents a model for the effective use of educational technology in computing and programming courses. The model is based on a development and research process, spanning over a decade, around an exercise-based learning system. The focus of the model is on improving student performance, which is assessed with two quantitative measures: the course pass rate and the grade average.

The model consists of five factors that must be taken into account when bringing educational technology into programming courses: active learning and continuous assessment, heterogeneous exercise types, electronic examination, tutorial-based learning, and a continuous feedback cycle. In addition, organizing student mentoring on courses and assessing the cognitive load related to adopting the system support the use of the model. The factors are classified here into three categories: core components, supportive components and evaluation-related components.

Based on the results, it appears that adopting the model improves course pass rates in a statistically significant way and raises the grade average compared with a "traditional" course model. Although adopting even individual features of the model can in itself improve course outcomes, the studies included in this dissertation suggest that the best results are achieved by adopting the model in its entirety.

Clearly, the model does not resolve all questions related to adopting educational technology. First, the presented model is not intended to be the only possible way of utilizing educational technology on programming and computing courses. Second, in addition to student performance, a satisfying learning experience involves many other factors that must be taken into account when redesigning courses. Nevertheless, the presented model enables the achievement of significantly better results on courses and, as such, provides a foundation for ever better teaching.
Front-Line Physicians' Satisfaction with Information Systems in Hospitals
Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved, and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data selected with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information, nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process.
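As a small illustration of the stratified sampling design mentioned above (pandas assumed; the data frame is invented):

    # Minimal sketch: stratified random sampling of survey recipients,
    # with hospitals as strata.
    import pandas as pd

    physicians = pd.DataFrame({
        "physician_id": range(1, 11),
        "hospital": ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"],
    })

    # Draw the same fraction from every stratum so each hospital is
    # proportionally represented in the sample.
    sample = physicians.groupby("hospital").sample(frac=0.5, random_state=0)
    print(sample)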