10,437 research outputs found

    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell detection. This paper proposes a systematic literature review and meta-analysis on code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10,000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deep investigation reveals 80 software tools, working with eight different techniques on five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while there is no support for many programming languages. A noteworthy point was the existence of 12 datasets related to source code similarity measurement and duplicate codes, of which only eight datasets were publicly accessible. The lack of reliable datasets, empirical evaluations, hybrid methods, and focuses on multi-paradigm languages are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to the maintenance. Comment: 49 pages, 10 figures, 6 tables

    Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations

    The local explanation provides heatmaps on images to explain how Convolutional Neural Networks (CNNs) derive their output. Due to its visual straightforwardness, the method has been one of the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective about the local explanation as a valuable and indispensable envision in building CNNs versus the process that exhausts them due to the heuristic nature of detecting vulnerability. Moreover, steering the CNNs based on the vulnerability learned from the diagnosis seemed highly challenging. To mitigate the gap, we designed DeepFuse, the first interactive design that realizes the direct feedback loop between a user and CNNs in diagnosing and revising CNN's vulnerability using local explanations. DeepFuse helps CNN engineers to systemically search "unreasonable" local explanations and annotate the new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotation such that the model doesn't introduce similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants made a more accurate and "reasonable" model than the current state-of-the-art. Also, participants found the way DeepFuse guides case-based reasoning can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward to make XAI-driven insights more actionable. Comment: 32 pages, 6 figures, 5 tables. Accepted for publication in the Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 202

    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci, establishing themselves as a valuable method for unravelling the complex biology of many diseases. Although GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from other highly correlated markers associated by linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method’s value into question. Although thousands of disease susceptibility loci have been reported, causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. ML models have ranged from using logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have shown important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across: selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested on re-application to blood lipid traits.

    Investigating neural differentiation capacity in Alzheimer’s disease iPSC-derived neural stem cells

    Neurodegeneration in Alzheimer’s disease (AD) may be exacerbated by dysregulated hippocampal neurogenesis. Neural stem cells (NSC) maintain adult neurogenesis and depletion of the NSC niche has been associated with age-related cognitive decline and dementia. We hypothesise that familial AD (FAD) mutations bias NSC toward premature neural specification, reducing the stem cell niche over time and accelerating disease progression. Somatic cells derived from patients with FAD (PSEN1 A246E and PSEN1 M146L heterozygous mutations) and healthy controls were reprogrammed to generate induced pluripotent stem cells (iPSC). Pluripotency for patient and control iPSC lines was confirmed, then cells were amplified and cryopreserved as stores. iPSC were subjected to neural specification to rosette-forming SOX2+/nestin+ NSCs for comparative evaluations between FAD and age-matched controls. FAD patient and control NSC were passaged under defined steady state culture conditions to assess stem cell maintenance using quantitative molecular markers (SOX2, nestin, NeuN, MAP2 and βIII-tubulin). We observed trends towards downregulated expression of the nestin coding gene NES (p=0.051) and upregulated expression of MAP2 (p=0.16) in PSEN1 NSC compared with control NSC, indicative of a premature differentiation phenotype induced by presence of the PSEN1 mutation. Cell cycle analysis of PSEN1 NSC showed that compared with controls, a greater number of PSEN1 NSC were retained in G0/G1 phase of the cell cycle (p=0.39), fewer progressed to S-phase (p=0.11) and fewer still reached G2 phase (p=0.23), suggesting cell cycle progression may be impaired in PSEN1 NSC. Nuclear DNA fragmentation was increased (p=0.10) in FAD NSC compared with controls, indicative of elevated cell death/apoptosis. 
Flow cytometry-based analysis of live, nestin+ NSC and NPC indicated increased apoptosis (p=0.14) in FAD NSC compared with controls, as well as increasing levels of apoptosis (p=0.33) in FAD NSC as they specified to neural progenitor cells. Global RNA sequencing was used to identify transcriptomic changes occurring during both disease and control neural specification. GO analysis of DEGs between PSEN1 and control NSC at P3 revealed significant upregulation (FDR<0.0000259) of 5 biological processes related to transcription and gene expression as well as significant upregulation (FDR<0.000000725) of 12 molecular functions related to DNA binding and transcription factor activity. These data suggest significant changes in gene expression were occurring in PSEN1 NSC at P3 compared with control NSC at the same stage in neural specification. The number of DEGs (p<0.05) between PSEN1 and control NSC at P3 was 9.92-fold higher than the number of DEGs between PSEN1 and control NSC at P2, suggesting transcriptomic differences between PSEN1 and control NSC become more pronounced as cells specify further down the neural lineage. Gene ontology (GO) analysis of differentially expressed genes (DEGs) specific to AD neural differentiation revealed significant dysregulation (FDR p<0.05) of genes related to neurogenesis, apoptosis, cell cycle, transcriptional control, and cell growth/maintenance as PSEN1 NSC matured from P2 to P3. The number of DEGs (p<0.05) in PSEN1 neural differentiation was 4.7-fold higher than the number of DEGs seen in control neural differentiation, indicating more transcriptional changes occurred in PSEN1 NSC than in controls at the same time point in neural specification. Dysregulation of Notch signalling was specific to PSEN1 neural differentiation and Notch related DEGs significantly upregulated (p<0.05) in PSEN1 NSC at P3 compared with P2 included NCOR2, JAG2, CHAC1 and RFNG. 
qPCR-based validation displayed significant upregulation of RFNG (p=0.04) in PSEN1 NSC at P3 compared with PSEN1 NSC at P2, and indicated a trend towards upregulation of JAG2 expression, correlating with RNA sequencing data. Data generated in this study indicate that presence of the PSEN1 mutation significantly increases the number of transcriptional changes occurring during neural differentiation. It is plausible that transcriptional changes to Notch signalling cause dysregulated neural specification and increased apoptosis in PSEN1 NSC, ultimately resulting in depletion of the NSC niche.

    Urbanised forested landscape: Urbanisation, timber extraction and forest care on the Vișeu Valley, northern Romania

    By looking at urbanisation processes from the vantage point of the forest, and the ways in which it both constitutes our living space while having been separated from the bounded space of the urban in modern history, the thesis asks: How can we (re)imagine urbanisation beyond the limits of the urban? How can a feminine line of thinking engage with the forest beyond the capitalist-colonial paradigm and its extractive project? and How can we “think with care” (Puig de la Bellacasa 2017) towards the forest as an inhabitant of our common world, instead of perpetuating the image of the forest as a space outside the delimited boundaries of the city? Through a case study research, introducing the Vișeu Valley in northern Romania as both a site engaged in the circulation of the global timber flow, a part of what Brenner and Schmid (2014) name “planetary urbanisation”, where the extractive logging operations beginning in the late XVIIIth century have constructed it as an extractive landscape, and a more than human landscape inhabited by a multitude of beings (animal, plant, and human) the thesis argues towards the importance of forest care and indigenous knowledge in landscape management understood as a trans-generational transmission of knowledge, that is interdependent with the persistence of the landscape as such. Having a trans-scalar approach, the thesis investigates the ways in which the extractive projects of the capitalist-colonial paradigm have and still are shaping forested landscapes across the globe in order to situate the case as part of a planetary forest landscape and the contemporary debates it is engaged in. 
By engaging with emerging paradigms within the fields of plant communication, forestry, legal scholarship and landscape urbanism that present trees and forests as intelligent beings, and look at urbanisation as a way of inhabiting the landscape in both indigenous and modern cultures, the thesis argues towards viewing forested landscapes as more than human living spaces. Thinking urbanisation through the case of the Vișeu Valley’s urbanised forested landscape, the thesis aligns with alternate ways of viewing urbanisation as co-habitation with more than human beings, particularly those emerging from interdisciplinary research in the Amazon river basin (Tavares 2017, Heckenberger 2012) and, in light of emerging discourses on the rights of nature, proposes an expanded concept of planetary citizenship, to include non-human personhood.

    Beam scanning by liquid-crystal biasing in a modified SIW structure

    A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, that radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, enabling the possibility to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.

    Diachronic profile of startup companies through social media

    Peixoto, A. R., Almeida, A. D., António, N., Batista, F., & Ribeiro, R. (2023). Diachronic profile of startup companies through social media. Social Network Analysis and Mining, 13(1), 1-18. [52]. https://doi.org/10.1007/s13278-023-01055-2
Funding: Open access funding provided by FCT|FCCN (b-on). This work was partially supported by Fundação para a Ciência e a Tecnologia, I.P. (FCT) namely by ISTAR Projects: UIDB/04466/2020 and UIDP/04466/2020; UIDB/04152/2020 (MagIC/NOVA IMS); and UIDB/50021/2020 (INESC-ID).
Social media platforms have become powerful tools for startups, helping them find customers and raise funding. In this study, we applied a social media intelligence-based methodology to analyze startups’ content and to understand how their communication strategies may differ during their scaling process. To understand if a startup’s social media content reflects its current business maturation position, we first defined an adequate life cycle model for startups based on funding rounds and product maturity. Using Twitter as the source of information and selecting a sample of known Portuguese IT startups at different phases of their life cycle, we analyzed their Twitter data. After preprocessing the data, using latent Dirichlet allocation, topic modeling techniques enabled the categorization of the data according to the topics arising in the published contents of the startups, making it possible to discover that contents can be grouped into five specific topics: “Fintech and ML,” “IT,” “Business Operations,” “Product/Service R&D,” and “Bank and Funding.” By comparing those profiles against the startup’s life cycle, we were able to understand how contents change over time. This provided a diachronic profile for each company, showing that while certain topics remain prevalent in the startup’s scaling, others depend on a particular phase of the startup’s cycle.
Our analysis revealed that startups’ social media content differs along their life cycle, highlighting the importance of understanding how startups use social media at different stages of their development.

    A review of abnormal behavior detection in activities of daily living

    Abnormal behavior detection (ABD) systems are built to automatically identify and recognize abnormal behavior from various input data types, such as sensor-based and vision-based input. Despite the attention ABD systems have received, the number of studies on ABD in activities of daily living (ADL) is limited. Owing to the increasing rate of elderly accidents in the home compound, ABD in ADL research deserves comparable attention, since it can help prevent accidents by sending out signals when abnormal behavior such as falling is detected. In this study, we compare and contrast the formation of the ABD system in ADL from input data types (sensor-based input and vision-based input) to modeling techniques (conventional and deep learning approaches). We scrutinize the public datasets available and provide solutions for one of the significant issues: the lack of datasets in ABD in ADL. This work aims to guide new research to better understand the field of ABD in ADL and to serve as a reference for future study of better Ambient Assisted Living with the growing smart home trend.

    KYT2022 Finnish Research Programme on Nuclear Waste Management 2019–2022 : Final Report

    KYT2022 (Finnish Research Programme on Nuclear Waste Management 2019–2022), organised by the Ministry of Economic Affairs and Employment, was a national research programme with the objective to ensure that the authorities have sufficient levels of nuclear expertise and preparedness that are needed for safety of nuclear waste management. The starting point for public research programmes on nuclear safety is that they create the conditions for maintaining the knowledge required for the continued safe and economic use of nuclear energy, developing new know-how and participating in international collaboration. The content of the KYT2022 research programme was composed of nationally important research topics, which are the safety, feasibility and acceptability of nuclear waste management. The KYT2022 research programme also functioned as a discussion and information-sharing forum for the authorities, those responsible for nuclear waste management and the research organizations, which helped to make use of the limited research resources. The programme aimed to develop national research infrastructure, ensure the continuing availability of expertise, produce high-level scientific research and increase general knowledge of nuclear waste management.

    The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions

    The Metaverse offers a second world beyond reality, where boundaries are non-existent, and possibilities are endless through engagement and immersive experiences using virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. Also, for each of these components, we examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies to support decentralization, interoperability, user experiences, interactions, and monetization. Our presented study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive and allows users, scholars, and entrepreneurs to get an in-depth understanding of the Metaverse ecosystem to find their opportunities and potentials for contribution.