17 research outputs found

    A complex network analysis of the Comprehensive R Archive Network (CRAN) package ecosystem

    Free and open source software package ecosystems have existed for a long time and are among the most sophisticated human-made systems. One of the oldest and most popular is CRAN, the package repository of the statistical language R, which is itself one of the most popular environments for statistical computing. CRAN stores a large number of packages that are updated regularly and depend on other packages in a complex graph of relations. This article studies that graph empirically from the perspective of complex network analysis (CNA), showing how network theory and measures proposed in previous work can help profile the ecosystem and detect strengths, good practices, and potential risks from three perspectives: macroscopic properties of the ecosystem (structure and complexity of the network), microscopic properties of individual packages (represented as nodes), and modular properties (community detection). Results show how complex network analysis tools can be used to assess a package ecosystem and, in particular, that of CRAN.
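
    As a rough illustration of the node-level ("microscopic") view described in this abstract, the sketch below computes two simple measures on a dependency graph: how many packages depend directly on each package, and which packages depend on a given package directly or transitively. The package names, the edges, and the function names `in_degree` and `reverse_reach` are hypothetical; they are not taken from CRAN or from the article, which may well use different measures.

```python
from collections import defaultdict

# Hypothetical toy dependency data: package -> packages it depends on.
# These edges are illustrative only, not taken from CRAN or the article.
DEPENDS = {
    "pkgA": ["pkgC"],
    "pkgB": ["pkgC", "pkgD"],
    "pkgC": ["pkgD"],
    "pkgD": [],
}


def in_degree(depends):
    """Count, for each package, how many packages depend on it directly."""
    counts = {pkg: 0 for pkg in depends}
    for deps in depends.values():
        for dep in deps:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


def reverse_reach(depends, target):
    """Return every package that depends on `target`, directly or transitively."""
    # Build the reversed graph: package -> packages that depend on it.
    reverse = defaultdict(set)
    for pkg, deps in depends.items():
        for dep in deps:
            reverse[dep].add(pkg)
    # Depth-first walk over the reversed edges.
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in reverse[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

    A package with a large reverse reach is one whose breakage would touch many others, which is one plausible way to read the "potential risks" mentioned above.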

    A systematic literature review on Wikidata

    This article reviews the current status of research on Wikidata, in particular articles that either describe applications of Wikidata or provide empirical evidence, in order to uncover the topics of interest, the fields that benefit from its applications, and the researchers and institutions leading the work.

    Alternative Computer Assisted Communicative Task-based Language Testing: New Communicational and Interactive Online Skills

    Computer-assisted language learning knowledge tests should no longer be designed to measure individual competence through traditional skills such as reading, comprehension, and writing; instead, they should diagnose interactive and communication skills in foreign languages. In recent years, online education has made it necessary to review the concept of interactive competence in digital environments as a complement to its traditional use. It is important to promote a new typology of alternative tasks and items in tests, where examinees can demonstrate real interactive performance in communication and interaction in a digital scenario. This should be done through tools that facilitate oral negotiation, the management and understanding of information extracted from online repositories, the search for suitable online digital material, and the use of new modes of audio-visual communication. Although some of these tasks have previously been used in a complementary way in the design of language tests, they have not been applied coherently as an assessment tool. A first approach was made by Miguel Alvarez, Garcia Laborda & Magal-Royo (2021) in the development of oral negotiation skills through the use of interactive tools. The current online assessment models analyzed by Garcia Laborda & Alvarez Fernandez (2021) indicate the need to seek new ways of assessing foreign languages through the design of tests that fit the current digital and interactive world.
    Magal-Royo, T.; García Laborda, J.; Mora Cantallops, M.; Sánchez Alonso, S. (2021). Alternative Computer Assisted Communicative Task-based Language Testing: New Communicational and Interactive Online Skills. International Journal of Emerging Technologies in Learning (Online). 16(19):251-259. https://doi.org/10.3991/ijet.v16i19.26035

    Network analysis for food safety: Quantitative and structural study of data gathered through the RASFF system in the European Union.

    This paper reports a quantitative and structural analysis of data gathered on the food issues reported by the European Union members over the last forty years. The study applies statistical measures and network analysis techniques. For this purpose, a graph was constructed of how different contaminated products have been distributed through countries. The work aims to leverage insights into the structure formed by the involvement of European countries in the exchange of goods that can cause problems for populations. The results obtained show the roles of different countries in the detection of sensitive routes. In particular, the analysis identifies problematic origin countries, such as China or Turkey, whereas European countries, in general, do have good border control policies for the import/export of food.

    Evolution and prospects of the Comprehensive R Archive Network (CRAN) package ecosystem

    Free and open source software package ecosystems have existed for a long time, but such collaborative development practice has surged in recent years. One of the oldest and most popular package ecosystems is the Comprehensive R Archive Network (CRAN), the repository of packages of the statistical language R, a popular statistical computing environment. CRAN stores a large number of packages that are updated regularly and depend on many other packages in a complex graph of relations. As the repository grows, its sustainability could be threatened by that complexity or by the nonuniform evolution of some packages. This paper provides an empirical analysis of the evolution of the CRAN repository over the last 20 years, considering the laws of software evolution and the effect of CRAN's policies on such development. Results show that the progress of CRAN is consistent with the laws of continuous growth and change, and that there seems to be a relevant increase in complexity in recent years. Significant challenges are arising related to the scale and scope of software package managers and the services they provide; understanding how they change over time and what might endanger their sustainability is key to their future improvement, maintenance, and policies and, eventually, to the sustainability of the ecosystem.

    Traceability for trustworthy AI: a review of models and tools

    Traceability is considered a key requirement for trustworthy artificial intelligence (AI), related to the need to maintain a complete account of the provenance of data, processes, and artifacts involved in the production of an AI model. Traceability in AI shares part of its scope with general-purpose recommendations for provenance such as W3C PROV, and it is also supported to different extents by specific tools that practitioners use as part of their efforts to make data analytic processes reproducible or repeatable. Here, we review relevant tools, practices, and data models for traceability in their connection to building AI models and systems. We also propose some minimal requirements for considering a model traceable according to the assessment list of the High-Level Expert Group on AI. Our review shows that, although a good number of reproducibility tools are available, a common approach is currently lacking and shared semantics are needed. In addition, we have detected that some tools have either not achieved full maturity or are already falling into obsolescence or a state of near abandonment by their developers, which might compromise the reproducibility of the research entrusted to them.

    Authority-based conversation tracking in Twitter: an unattended methodological approach

    Twitter is undoubtedly one of the most widely used data sources to analyze human communication. The literature is full of examples where Twitter is accessed and data are downloaded as a preliminary step to a more in-depth analysis in a wide variety of knowledge areas. Unfortunately, extracting relevant information from the opinions that users freely express on Twitter is complicated, both because of the volume generated (more than 6000 tweets per second) and because of the difficulty of filtering out everything except what is pertinent to the research. Inspired by the fact that a large share of users use Twitter to communicate or receive political information, we created a method that allows for the monitoring of a set of users (which we will call authorities) and the tracking of the information published by them about an event. Our approach consists of dynamically and automatically monitoring the hottest topics among all the conversations where the authorities are involved, and retrieving the tweets in connection with those topics, filtering other conversations out. Although our case study applies the method to the political discussions held during the Spanish general, local, and European elections of April/May 2019, the method is equally applicable to many other contexts, such as sporting events, marketing campaigns, or health crises.
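
    The two steps named in this abstract (finding the hottest topics in the authorities' conversations, then keeping only tweets related to those topics) can be sketched roughly as follows. This is a minimal sketch under strong assumptions: tweets are plain dictionaries with hypothetical `user` and `text` keys, and a "topic" is simply a hashtag. The abstract does not say how the authors define or detect topics, so the function names and the hashtag heuristic are illustrative only.

```python
from collections import Counter


def hot_topics(tweets, authorities, top_n=2):
    """Return the top_n most frequent hashtags in tweets written by authorities.

    Assumption: a "topic" is approximated by a lowercase hashtag.
    """
    counts = Counter()
    for tweet in tweets:
        if tweet["user"] in authorities:
            counts.update(
                word.lower() for word in tweet["text"].split() if word.startswith("#")
            )
    return [tag for tag, _ in counts.most_common(top_n)]


def filter_by_topics(tweets, topics):
    """Keep only the tweets (from any user) that mention at least one hot topic."""
    wanted = set(topics)
    return [
        tweet
        for tweet in tweets
        if wanted & {word.lower() for word in tweet["text"].split()}
    ]
```

    Re-running `hot_topics` on a sliding window of recent tweets would be one way to make the monitoring dynamic, as the abstract describes; that windowing is left out here.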

    Modeling Bacterial Species: Using Sequence Similarity with Clustering Techniques

    Existing studies have challenged the current definition of named bacterial species, especially in the case of highly recombinogenic bacteria. This has led to considering the use of computational procedures to examine potential bacterial clusters that are not identified by species naming. This paper describes the use of sequence data obtained from MLST databases as input for a k-means algorithm extended to deal with housekeeping gene sequences as a metric of similarity for the clustering process. An implementation of the k-means algorithm has been developed based on an existing source code implementation, and it has been evaluated against MLST data. Results point to potential bacterial clusters that are close to more than one named species and thus may become candidates for alternative classifications accounting for genotypic information. The use of hierarchical clustering with sequence comparison as a similarity metric has the potential to find clusters different from named species by using a more informed cluster formation strategy than a conventional nominal variant of the algorithm.
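
    To make the idea of clustering sequences by similarity concrete, here is a minimal sketch. It is not the authors' extended k-means: because a mean of sequences is not well defined, the sketch swaps in a k-medoids-style variant in which each cluster centre is an actual sequence, and it assumes equal-length sequences compared by Hamming distance. The function names, the naive initialisation, and the distance choice are all assumptions for illustration.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def k_medoids(seqs, k, max_iter=20):
    """Cluster sequences around k medoids using Hamming distance.

    Assumption: the first k sequences serve as initial medoids (naive and
    deterministic; real use would want a better initialisation).
    """
    medoids = list(seqs[:k])
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each sequence to its nearest medoid.
        clusters = [[] for _ in medoids]
        for s in seqs:
            nearest = min(range(len(medoids)), key=lambda i: hamming(s, medoids[i]))
            clusters[nearest].append(s)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = [
            min(c, key=lambda m: sum(hamming(m, s) for s in c)) if c else medoids[i]
            for i, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, clusters
```

    Comparing the resulting clusters with the species labels attached to each sequence would then show where sequence-based groups and named species disagree, which is the kind of finding the abstract reports.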