961 research outputs found
Machine Learning practices and infrastructures
Machine Learning (ML) systems, particularly when deployed in high-stakes
domains, are deeply consequential. They can exacerbate existing inequities,
create new modes of discrimination, and reify outdated social constructs.
Accordingly, the social context (i.e. organisations, teams, cultures) in which
ML systems are developed is a site of active research for the field of AI
ethics, and intervention for policymakers. This paper focuses on one aspect of
social context that is often overlooked: interactions between practitioners and
the tools they rely on, and the role these interactions play in shaping ML
practices and the development of ML systems. In particular, through an
empirical study of questions asked on the Stack Exchange forums, the use of
interactive computing platforms (e.g. Jupyter Notebook and Google Colab) in ML
practices is explored. I find that interactive computing platforms are used in
a host of learning and coordination practices, which constitutes an
infrastructural relationship between interactive computing platforms and ML
practitioners. I describe how ML practices are co-evolving alongside the
development of interactive computing platforms, and highlight how this risks
making invisible aspects of the ML life cycle that AI ethics researchers' have
demonstrated to be particularly salient for the societal impact of deployed ML
systems
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain
An HMM-Based Framework for Supporting Accurate Classification of Music Datasets
open3In this paper, we use Hidden Markov Models (HMM) and Mel-Frequency Cepstral Coecients (MFCC) to build statistical models of classical music composers directly from the music datasets. Several
musical pieces are divided by instruments (String, Piano, Chorus, Orchestra), and, for each instrument, statistical models of the composers are computed.We selected 19 dierent composers spanning four centuries by using a total number of 400 musical pieces. Each musical piece is classied as belonging to a composer if the corresponding HMM gives the highest likelihood for that piece. We show that the so-developed models can be used to obtain useful information on the correlation between the composers. Moreover, by using the maximum likelihood approach, we also classied the instrumentation used by the same composer. Besides as an analysis tool, the described approach has been used as a classier. This overall originates an HMM-based framework for supporting accurate classication of music datasets. On a dataset of String Quartet movements, we obtained an average composer classication accuracy of more than 96%. As regards instrumentation classication, we obtained an average classication of slightly less than 100% for Piano, Orchestra and String Quartet. In this paper, the most signicant results coming from our experimental assessment and analysis are reported and discussed in detail.openCuzzocrea, Alfredo; Mumolo, Enzo; Vercelli, GianniCuzzocrea, Alfredo; Mumolo, Enzo; Vercelli, Giann
Bots in software engineering: a systematic mapping study
Bots have emerged from research prototypes to deployable systems due to the recent developments in machine learning, natural language processing and understanding techniques. In software engineering, bots range from simple automated scripts to decision-making autonomous systems. The spectrum of applications of bots in software engineering is so wide and diverse, that a comprehensive overview and categorization of such bots is needed. Existing works considered selective bots to be analyzed and failed to provide the overall picture. Hence it is significant to categorize bots in software engineering through analyzing why, what and how the bots are applied in software engineering. We approach the problem with a systematic mapping study based on the research articles published in this topic. This study focuses on classification of bots used in software engineering, the various dimensions of the characteristics, the more frequently researched area, potential research spaces to be explored and the perception of bots in the developer community. This study aims to provide an introduction and a broad overview of bots used in software engineering. Discussions of the feedback and results from several studies provide interesting insights and prospective future directions
Do Judge a Test by its Cover: Combining Combinatorial and Property-Based Testing
Property-based testing uses randomly generated inputs to validate high-level program specifications. It can be shockingly effective at finding bugs, but it often requires generating a very large number of inputs to do so. In this paper, we apply ideas from combinatorial testing, a powerful and widely studied testing methodology, to modify the distributions of our random generators so as to find bugs with fewer tests. The key concept is combinatorial coverage, which measures the degree to which a given set of tests exercises every possible choice of values for every small combination of input features. In its “classical” form, combinatorial coverage only applies to programs whose inputs have a very particular shape—essentially, a Cartesian product of finite sets. We generalize combinatorial coverage to the richer world of algebraic data types by formalizing a class of sparse test descriptions based on regular tree expressions. This new definition of coverage inspires a novel combinatorial thinning algorithm for improving the coverage of random test generators, requiring many fewer tests to catch bugs. We evaluate this algorithm on two case studies, a typed evaluator for System F terms and a Haskell compiler, showing significant improvements in both
The Extent and Coverage of Current Knowledge of Connected Health: Systematic Mapping Study
Background: This paper examines the development of the Connected Health research landscape with a view on providing a historical perspective on existing Connected Health research. Connected Health has become a rapidly growing research field as our healthcare system is facing pressured to become more proactive and patient centred. Objective: We aimed to identify the extent and coverage of the current body of knowledge in Connected Health. With this, we want to identify which topics have drawn the attention of Connected health researchers, and if there are gaps or interdisciplinary opportunities for further research. Methods: We used a systematic mapping study that combines scientific contributions from research on medicine, business, computer science and engineering. We analyse the papers with seven classification criteria, publication source, publication year, research types, empirical types, contribution types research topic and the condition studied in the paper. Results: Altogether, our search resulted in 208 papers which were analysed by a multidisciplinary group of researchers. Our results indicate a slow start for Connected Health research but a more recent steady upswing since 2013. The majority of papers proposed healthcare solutions (37%) or evaluated Connected Health approaches (23%). Case studies (28%) and experiments (26%) were the most popular forms of scientific validation employed. Diabetes, cancer, multiple sclerosis, and heart conditions are among the most prevalent conditions studied. Conclusions: We conclude that Connected Health research seems to be an established field of research, which has been growing strongly during the last five years. There seems to be more focus on technology driven research with a strong contribution from medicine, but business aspects of Connected health are not as much studied
A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness
People increasingly use videos on the Web as a source for learning. To
support this way of learning, researchers and developers are continuously
developing tools, proposing guidelines, analyzing data, and conducting
experiments. However, it is still not clear what characteristics a video should
have to be an effective learning medium. In this paper, we present a
comprehensive review of 257 articles on video-based learning for the period
from 2016 to 2021. One of the aims of the review is to identify the video
characteristics that have been explored by previous work. Based on our
analysis, we suggest a taxonomy which organizes the video characteristics and
contextual aspects into eight categories: (1) audio features, (2) visual
features, (3) textual features, (4) instructor behavior, (5) learners
activities, (6) interactive features (quizzes, etc.), (7) production style, and
(8) instructional design. Also, we identify four representative research
directions: (1) proposals of tools to support video-based learning, (2) studies
with controlled experiments, (3) data analysis studies, and (4) proposals of
design guidelines for learning videos. We find that the most explored
characteristics are textual features followed by visual features, learner
activities, and interactive features. Text of transcripts, video frames, and
images (figures and illustrations) are most frequently used by tools that
support learning through videos. The learner activity is heavily explored
through log files in data analysis studies, and interactive features have been
frequently scrutinized in controlled experiments. We complement our review by
contrasting research findings that investigate the impact of video
characteristics on the learning effectiveness, report on tasks and technologies
used to develop tools that support learning, and summarize trends of design
guidelines to produce learning video
Semantic information for robot navigation: a survey
There is a growing trend in robotics for implementing behavioural mechanisms based on human psychology, such as the processes associated with thinking. Semantic knowledge has opened new paths in robot navigation, allowing a higher level of abstraction in the representation of information. In contrast with the early years, when navigation relied on geometric navigators that interpreted the environment as a series of accessible areas or later developments that led to the use of graph theory, semantic information has moved robot navigation one step further. This work presents a survey on the concepts, methodologies and techniques that allow including semantic information in robot navigation systems. The techniques involved have to deal with a range of tasks from modelling the environment and building a semantic map, to including methods to learn new concepts and the representation of the knowledge acquired, in many cases through interaction with users. As understanding the environment is essential to achieve high-level navigation, this paper reviews techniques for acquisition of semantic information, paying attention to the two main groups: human-assisted and autonomous techniques. Some state-of-the-art semantic knowledge representations are also studied, including ontologies, cognitive maps and semantic maps. All of this leads to a recent concept, semantic navigation, which integrates the previous topics to generate high-level navigation systems able to deal with real-world complex situationsThe research leading to these results has received funding from HEROITEA: Heterogeneous 480 Intelligent Multi-Robot Team for Assistance of Elderly People (RTI2018-095599-B-C21), funded by Spanish 481 Ministerio de EconomĂa y Competitividad. The research leading to this work was also supported project "Robots sociales para estimulacĂłn fĂsica, cognitiva y afectiva de mayores"; funded by the Spanish State Research Agency under grant 2019/00428/001. It is also funded by WASP-AI Sweden; and by Spanish project Robotic-Based Well-Being Monitoring and Coaching for Elderly People during Daily Life Activities (RTI2018-095599-A-C22)
- …