119 research outputs found

    A Dataset and an Approach for Identity Resolution of 38 Million Author IDs extracted from 2B Git Commits

    The data collected from open source projects provide means to model large software ecosystems but often suffer from data quality issues; specifically, multiple author identification strings in code commits might actually be associated with one developer. While many methods have been proposed for addressing this problem, they are either heuristics requiring manual tweaking, or require too much calculation time to do pairwise comparisons for 38M author IDs in, for example, the World of Code collection. In this paper, we propose a method that finds all author IDs belonging to a single developer in this entire dataset, and share the list of all author IDs that were found to have aliases. To do this, we first create blocks of potentially connected author IDs and then use a machine learning model to predict which of these potentially related IDs belong to the same developer. We processed around 38 million author IDs and found around 14.8 million IDs to have an alias, which belong to 5.4 million different developers, with the median number of aliases being 2 per developer. This dataset can be used to create more accurate models of developer behaviour at the entire OSS ecosystem level and can be used to provide a service to rapidly resolve new author IDs.

    SAMScore: A Semantic Structural Similarity Metric for Image Translation Evaluation

    Image translation has wide applications, such as style transfer and modality conversion, usually aiming to generate images having both high degrees of realism and faithfulness. These problems remain difficult, especially when it is important to preserve semantic structures. Traditional image-level similarity metrics are of limited use, since the semantics of an image are high-level, and not strongly governed by pixel-wise faithfulness to an original image. Towards filling this gap, we introduce SAMScore, a generic semantic structural similarity metric for evaluating the faithfulness of image translation models. SAMScore is based on the recent high-performance Segment Anything Model (SAM), which can perform semantic similarity comparisons with standout accuracy. We applied SAMScore on 19 image translation tasks, and found that it is able to outperform all other competitive metrics on all of the tasks. We envision that SAMScore will prove to be a valuable tool that will help to drive the vibrant field of image translation, by allowing for more precise evaluations of new and evolving translation models. The code is available at https://github.com/Kent0n-Li/SAMScore

    Sculpting the maturation, softening and ethylene pathway: The influences of microRNAs on tomato fruits

    Background: MicroRNAs (miRNAs), a ubiquitous class of short RNAs, play vital roles in physiological and biochemical processes in plants by mediating gene silencing at the post-transcriptional (PTGS) level. Tomato is a model system for studying the molecular basis of fleshy fruit ripening and senescence, ethylene biosynthesis and signal transduction, owing to its genetic and molecular tractability. To study the functions of miRNAs in tomato fruit ripening and senescence, and their possible roles in ethylene response, next-generation sequencing was employed to identify miRNAs in tomato fruit. Bioinformatics and molecular biology approaches were combined to profile miRNA expression patterns at three different fruit ripening stages and under exogenous ethylene treatment. Results: In addition to 7 novel miRNA families, 103 conserved miRNAs belonging to 24 families and 10 non-conserved miRNAs matching 9 families were identified in our libraries. The targets of many of these miRNAs were predicted to be transcription factors. Other targets are known to play roles in the regulation of metabolic processes. Interestingly, some targets were predicted to be involved in fruit ripening and softening, such as Pectate Lyase and beta-galactosidase, while a few others were predicted to be involved in the ethylene biosynthesis and signaling pathway, such as ACS, EIN2 and CTR1. The expression patterns of a number of such miRNAs at three ripening stages were confirmed by stem-loop RT-PCR, which showed a strong negative correlation with those of their targets. The regulation of miRNA expression profiles by exogenous ethylene was analyzed simultaneously, and 3 down-regulated and 5 up-regulated miRNAs were found in this study. Conclusions: A combination of high-throughput sequencing and molecular biology approaches was used to explore the involvement of miRNAs during fruit ripening. Several miRNAs showed differential expression profiles during fruit ripening, and a number of miRNAs were influenced by ethylene treatment. The results suggest the importance of miRNAs in fruit ripening and ethylene response.

    Aboveground net primary productivity of vegetation along a climate-related gradient in a Eurasian temperate grassland: spatiotemporal patterns and their relationships with climate factors

    Accurate assessments of spatiotemporal patterns in net primary productivity and their links to climate are important to obtain a deeper understanding of the function, stability and sustainability of grassland ecosystems. We combined a satellite-derived NDVI time-series dataset and field-based samples to investigate spatiotemporal patterns in aboveground net primary productivity (ANPP), and we examined the effect of growing season air temperature (GST) and precipitation (GSP) on these patterns along a climate-related gradient in an eastern Eurasian grassland. Our results indicated that the ANPP fluctuated with no significant trend during 2001-2012. The spatial distribution of ANPP was heterogeneous and decreased from northeast to southwest. The interannual changes in ANPP were mainly controlled by year-to-year GSP; a strong correlation of interannual variability between ANPP and GSP was observed. Similarly, GSP strongly influenced spatial variations in ANPP, and the slopes of fitted linear functions of the GSP-ANPP relationship increased from arid temperate desert grassland to humid meadow grassland. An exponential function could be used to fit the GSP-ANPP relationship for the entire region. An improved moisture index that combines the effects of GST and GSP better explained the variations in ANPP compared with GSP alone. In comparisons with previous studies, we found that the relationships between spatiotemporal variations in ANPP and climate factors were probably scale dependent. We suggest that the quantity and spatial range of analyzed samples contribute to these different results. Multi-scale studies are necessary to improve our knowledge of the response of grassland ANPP to climate change. Environmental Earth Sciences 76(1):56 (2017), journal article.

    Rapid FRD determination for multiplexed fibre systems -- I. The quasi-near field model and its uncertainties

    Focal Ratio Degradation (FRD) in fibres is a crucial factor to control in astronomical instruments in order to minimize light loss. As astronomical instrumentation has advanced, the integration of large populations of fibres has become common. However, determining FRD in multiplexed fibre systems has become a challenging and time-consuming task. The Integral Field Unit for the Fiber Arrayed Solar Optical Telescope (FASOT-IFU) represents the most densely arranged fibre-based IFU in a single unit. Due to the close packing of fibres in the V-groove of the slit end, measuring FRD is particularly challenging as the output spots are prone to overlapping with adjacent fibres. In this paper, a novel method based on the quasi-near field model is proposed to enable rapid FRD measurement in highly multiplexed fibre systems like IFUs and multi-object observation systems. The principle and uncertainties associated with the method are investigated. The method's validity is demonstrated by applying it to determine the FRD in FASOT-IFU, with the achieved FRD performance meeting the acceptable requirements of FASOT-IFU, where the output focal ratio primarily falls within the range of 5.0-7.0. The results indicate that the proposed method offers several advantages, including the simultaneous and rapid measurement of FRD in multiple fibres with high accuracy (error smaller than 0.35 in F-ratio). Furthermore, besides FRD, the method exhibits potential for extensive measurements of throughput, scrambling, and spectral analysis. Comment: 10 pages, 12 figures, submitted to MNRA

    Experimental study on mechanical properties of filling-bulk cementing combination body

    In order to study the influence of caved rocks in the goaf on the backfilling body in backfilling mining, uniaxial compression tests are carried out on backfilling body-cemented granular body combinations with different granular heights, discrete element lithologies and backfilling body strengths. The uniaxial compression failure of the combination body specimen is monitored in real time using three-dimensional acoustic emission (AE) positioning technology. The deformation and failure corresponding to the AE events in the loading process are characterized by combining the time parameters of AE events with the starting time points of the four stages of the stress-strain curve. Based on this, the failure model for the interface of the combination body is established. The results show that the height of the granular body is negatively correlated with the strength of the combination body, and the uniaxial compressive strength of the combination body with the backfilling height ratio of 1:4 is only 55.0% of that of the single backfilling body. The discrete element lithology and the strength of the backfilling body are positively correlated with the strength of the combination body. Although a high-strength backfilling body can improve the uniaxial compressive strength of the combination body, the higher the strength of the filling body in the combination body, the more serious the strength reduction of the combination body. When the particle lithology in the cemented bulk is siltstone with low strength, the uniaxial compressive strength of the combination body is only 42.9% of that of single combination body. The siltstone with smaller compressive strength will have a fracture plane due to shear failure during the failure, and the limestone with larger compressive strength can withstand shear load by using the shear strength of the granular particles. When the cementing matrix in the cemented granular body fails or its particles break, the interface between the backfilling body and the cemented granular body undergoes non-uniform compression deformation. This concentrates stress on the backfilling body where the interface has been damaged by the cemented granular body, causing local shear failure of the upper backfilling body; the failure of the backfilling body thus results from both axial stress and non-uniform deformation of the interface.

    Analysis of Prognostic Risk Factors Determining Poor Functional Recovery After Comprehensive Rehabilitation Including Motor-Imagery Brain-Computer Interface Training in Stroke Patients: A Prospective Study

    Objective: Upper limb (UL) motor function recovery, especially distal function, is one of the main goals of stroke rehabilitation as this function is important to perform activities of daily living (ADL). The efficacy of the motor-imagery brain-computer interface (MI-BCI) has been demonstrated in patients with stroke. Most patients with stroke receive comprehensive rehabilitation, including MI-BCI and routine training. However, most aspects of MI-BCI training for patients with subacute stroke are based on routine training. Risk factors for inadequate distal UL functional recovery in these patients remain unclear; therefore, it is more realistic to explore the prognostic factors of this comprehensive treatment based on clinical practice. The present study aims to investigate the independent risk factors that might lead to inadequate distal UL functional recovery in patients with stroke after comprehensive rehabilitation including MI-BCI (CRIMI-BCI). Methods: This prospective study recruited 82 patients with stroke who underwent CRIMI-BCI. Motor-imagery brain-computer interface training was performed for 60 min per day, 5 days per week for 4 weeks. The primary outcome was improvement of the wrist and hand dimensionality of Fugl-Meyer Assessment (δFMA-WH). According to the improvement score, the patients were classified into the efficient group (EG, δFMA-WH > 2) and the inefficient group (IG, δFMA-WH ≤ 2). Binary logistic regression was used to analyze clinical and demographic data, including aphasia, spasticity of the affected hand [assessed by Modified Ashworth Scale (MAS-H)], initial UL function, age, gender, time since stroke (TSS), lesion hemisphere, and lesion location. Results: Seventy-three patients completed the study. After training, all patients showed significant improvement in FMA-UL (Z = 7.381, p = 0.000**), FMA-SE (Z = 7.336, p = 0.000**), and FMA-WH (Z = 6.568, p = 0.000**). There were 35 patients (47.9%) in the IG group and 38 patients (52.1%) in the EG group. Multivariate analysis revealed that presence of aphasia [odds ratio (OR) 4.617, 95% confidence interval (CI) 1.435–14.860; p < 0.05], initial FMA-UL score ≤ 30 (OR 5.158, 95% CI 1.150–23.132; p < 0.05), and MAS-H ≥ level I+ (OR 3.810, 95% CI 1.231–11.790; p < 0.05) were the risk factors for inadequate distal UL functional recovery in patients with stroke after CRIMI-BCI. Conclusion: We concluded that CRIMI-BCI improved UL function in stroke patients with varying effectiveness. Inferior initial UL function, significant hand spasticity, and presence of aphasia were identified as independent risk factors for inadequate distal UL functional recovery in stroke patients after CRIMI-BCI.