64 research outputs found

    CodeLL: A Lifelong Learning Dataset to Support the Co-Evolution of Data and Language Models of Code

    Motivated by recent work on lifelong learning applications for language models (LMs) of code, we introduce CodeLL, a lifelong learning dataset focused on code changes. Our contribution addresses a notable research gap marked by the absence of a long-term temporal dimension in existing code change datasets, which limits their suitability in lifelong learning scenarios. In contrast, our dataset aims to comprehensively capture code changes across the entire release history of open-source software repositories. In this work, we introduce an initial version of CodeLL, comprising 71 machine-learning-based projects mined from Software Heritage. This dataset enables the extraction and in-depth analysis of code changes spanning 2,483 releases at both the method and API levels. CodeLL enables researchers to study the behaviour of LMs in lifelong fine-tuning settings for learning code changes. Additionally, the dataset can help study data distribution shifts within software repositories and the evolution of API usages over time. Comment: 4+1 pages
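
As a rough illustration of what a method-level code change between two releases can look like, the sketch below compares two versions of a Python file and reports added, removed, and modified functions. It is not CodeLL's extraction pipeline, which the abstract does not describe; the helper names `method_sources` and `diff_releases` and the two sample sources are hypothetical.

```python
import ast


def method_sources(source: str) -> dict:
    """Map each function/method qualified name to a dump of its AST.

    ast.dump omits line numbers by default, and comments never reach the
    AST, so purely cosmetic edits do not register as changes.
    """
    found = {}

    def visit(node, prefix=""):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                name = prefix + child.name
                found[name] = ast.dump(child)
                visit(child, name + ".")  # nested functions
            elif isinstance(child, ast.ClassDef):
                visit(child, prefix + child.name + ".")
            else:
                visit(child, prefix)

    visit(ast.parse(source))
    return found


def diff_releases(old_src: str, new_src: str) -> dict:
    """Classify methods as added, removed, or modified between two versions."""
    old, new = method_sources(old_src), method_sources(new_src)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }


# Two hypothetical releases of the same file.
RELEASE_1 = '''
class Model:
    def fit(self, x):
        return x

def load(path):
    return open(path).read()
'''

RELEASE_2 = '''
class Model:
    def fit(self, x, epochs=1):
        return x

    def predict(self, x):
        return x

def load(path):
    # only a comment was added here
    return open(path).read()
'''

changes = diff_releases(RELEASE_1, RELEASE_2)
print(changes)
```

Comparing AST dumps rather than raw text is one possible design choice: it ignores whitespace and comment edits, at the cost of working only on sources that parse.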

    PlayMyData: a curated dataset of multi-platform video games

    Although video games have been predominant in digital entertainment for decades, they have only recently been recognized as valuable software artifacts by the software engineering (SE) community. This acknowledgment has unveiled several research opportunities, spanning from empirical studies to the application of AI techniques for classification tasks. In this respect, several curated game datasets have been disclosed for research purposes, even though the collected data are insufficient to support the application of advanced models or to enable interdisciplinary studies. Moreover, the majority of those are limited to PC games, thus excluding prominent gaming platforms, e.g., PlayStation, Xbox, and Nintendo. In this paper, we propose PlayMyData, a curated dataset composed of 99,864 multi-platform games gathered from the IGDB website. By exploiting a dedicated API, we collect relevant metadata for each game, e.g., description, genre, rating, gameplay video URLs, and screenshots. Furthermore, we enrich PlayMyData with the time needed to complete each game by mining the HLTB website. To the best of our knowledge, this is the most comprehensive dataset in the domain that can be used to support different automated tasks in SE. More importantly, PlayMyData can be used to foster cross-domain investigations built on top of the provided multimedia data. Comment: Accepted at the 21st Mining Software Repositories (MSR 2024)

    Supporting Early-Safety Analysis of IoT Systems by Exploiting Testing Techniques

    IoT systems' complexity and susceptibility to failures pose significant challenges in ensuring their reliable operation. Failures can be internally generated or caused by external factors, impacting both the system's correctness and its surrounding environment. To investigate these complexities, various modeling approaches have been proposed to raise the level of abstraction, facilitating automation and analysis. Failure-Logic Analysis (FLA) is a technique that helps predict potential failure scenarios by defining how a component's failure logic behaves and spreads throughout the system. However, manually specifying FLA rules can be arduous and error-prone, leading to incomplete or inaccurate specifications. In this paper, we propose adopting testing methodologies to improve the completeness and correctness of these rules. How failures may propagate within an IoT system can be observed by systematically injecting failures while running test cases to collect evidence useful to add, complete, and refine FLA rules.
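
The idea of injecting failures while running test cases can be sketched roughly as follows. This is a minimal illustration, not the paper's method or rule syntax: the `thermostat` component, the three failure modes, and the representation of a rule as an (input failure, output failure) pair are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical failure modes; real FLA vocabularies are richer.
NO_FAILURE, OMISSION, VALUE = "noFailure", "omission", "valueError"


def thermostat(reading: Optional[float]) -> Optional[str]:
    """Hypothetical component under test: maps a temperature to a heater command."""
    if reading is None:
        return None  # no input, so no command is produced
    return "ON" if reading < 20.0 else "OFF"


# Each injector perturbs a valid input to simulate one input failure mode.
INJECTORS = {
    NO_FAILURE: lambda v: v,
    OMISSION: lambda v: None,
    VALUE: lambda v: v + 50.0,  # grossly wrong sensor value
}


def classify(expected, observed) -> str:
    """Label the output failure by comparing it with the fault-free output."""
    if observed == expected:
        return NO_FAILURE
    if observed is None:
        return OMISSION
    return VALUE


def derive_rules(component, test_inputs) -> set:
    """Run every test input under every injected failure and collect observed rules."""
    rules = set()
    for value in test_inputs:
        expected = component(value)
        for mode, inject in INJECTORS.items():
            observed = component(inject(value))
            rules.add((mode, classify(expected, observed)))
    return rules


rules = derive_rules(thermostat, [15.0, 25.0])
for input_failure, output_failure in sorted(rules):
    print(f"input.{input_failure} -> output.{output_failure}")
```

With these two test inputs, the value error propagates for the 15.0 reading but is masked for the 25.0 reading, which is the kind of evidence that could prompt adding or refining a rule.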

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb⁻¹ of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.

    Search for massive, long-lived particles using multitrack displaced vertices or displaced lepton pairs in pp collisions at √s = 8 TeV with the ATLAS detector

    Many extensions of the Standard Model posit the existence of heavy particles with long lifetimes. This article presents the results of a search for events containing at least one long-lived particle that decays at a significant distance from its production point into two leptons or into five or more charged particles. This analysis uses a data sample of proton-proton collisions at √s = 8 TeV corresponding to an integrated luminosity of 20.3 fb⁻¹ collected in 2012 by the ATLAS detector operating at the Large Hadron Collider. No events are observed in any of the signal regions, and limits are set on model parameters within supersymmetric scenarios involving R-parity violation, split supersymmetry, and gauge mediation. In some of the search channels, the trigger and search strategy are based only on the decay products of individual long-lived particles, irrespective of the rest of the event. In these cases, the provided limits can easily be reinterpreted in different scenarios.

    Measurement of the CP-violating phase ϕs and the Bs0 meson decay width difference with Bs0 → J/ψϕ decays in ATLAS

    A measurement of the Bs0 decay parameters in the Bs0 → J/ψϕ channel using an integrated luminosity of 14.3 fb⁻¹ collected by the ATLAS detector from 8 TeV pp collisions at the LHC is presented. The measured parameters include the CP-violating phase ϕs, the decay width Γs and the width difference between the mass eigenstates ΔΓs. The values measured for the physical parameters are statistically combined with those from 4.9 fb⁻¹ of 7 TeV data, leading to the following: ϕs = −0.090 ± 0.078 (stat.) ± 0.041 (syst.) rad, ΔΓs = 0.085 ± 0.011 (stat.) ± 0.007 (syst.) ps⁻¹, Γs = 0.675 ± 0.003 (stat.) ± 0.003 (syst.) ps⁻¹. In the analysis the parameter ΔΓs is constrained to be positive. Results for ϕs and ΔΓs are also presented as 68% and 95% likelihood contours in the ϕs-ΔΓs plane. Also measured in this decay channel are the transversity amplitudes and corresponding strong phases. All measurements are in agreement with the Standard Model predictions.

    Measurement of the production cross-section of a single top quark in association with a W boson at 8 TeV with the ATLAS experiment

    The cross-section for the production of a single top quark in association with a W boson in proton-proton collisions at √s = 8 TeV is measured. The dataset corresponds to an integrated luminosity of 20.3 fb⁻¹, collected by the ATLAS detector in 2012 at the Large Hadron Collider at CERN. Events containing two leptons and one central b-jet are selected. The Wt signal is separated from the backgrounds using boosted decision trees, each of which combines a number of discriminating variables into one classifier. Production of Wt events is observed with a significance of 7.7σ. The cross-section is extracted in a profile likelihood fit to the classifier output distributions. The Wt cross-section, inclusive of decay modes, is measured to be 23.0 ± 1.3 (stat.) +3.2/−3.5 (syst.) ± 1.1 (lumi.) pb. The measured cross-section is used to extract a value for the CKM matrix element |Vtb| of 1.01 ± 0.10 and a lower limit of 0.80 at the 95% confidence level. The cross-section for the production of a top quark and a W boson is also measured in a fiducial acceptance requiring two leptons with pT > 25 GeV and |η| 20 GeV and |η| 20 GeV, including both Wt and top-quark pair events as signal. The measured value of the fiducial cross-section is 0.85 ± 0.01 (stat.) +0.07/−0.07 (syst.) ± 0.03 (lumi.) pb.
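
To get a feel for the overall precision of the inclusive cross-section quoted above, the separate uncertainty components can be combined. The sketch below adds them in quadrature, which assumes the statistical, systematic, and luminosity uncertainties are independent; that is a common convention but not something the abstract states, and the resulting totals are not quoted in the source.

```python
import math


def total_uncertainty(*components: float) -> float:
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))


# Values taken from the abstract, in pb.
central = 23.0
stat, syst_up, syst_down, lumi = 1.3, 3.2, 3.5, 1.1

# The systematic uncertainty is asymmetric, so the two sides are combined separately.
up = total_uncertainty(stat, syst_up, lumi)
down = total_uncertainty(stat, syst_down, lumi)

print(f"sigma = {central:.1f} +{up:.1f} -{down:.1f} pb")
print(f"relative uncertainty: +{up / central:.0%} / -{down / central:.0%}")
```

Under this assumption the systematic component dominates: removing the statistical and luminosity terms entirely would change the total by only a few tenths of a picobarn.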

    Measurement of the differential cross-section of highly boosted top quarks as a function of their transverse momentum in √s = 8 TeV proton-proton collisions using the ATLAS detector

    The differential cross-section for pair production of top quarks with high transverse momentum is measured in 20.3 fb⁻¹ of proton-proton collisions at a center-of-mass energy of 8 TeV. The measurement is performed for tt̄ events in the lepton+jets channel. The cross-section is reported as a function of the hadronically decaying top quark transverse momentum for values above 300 GeV. The hadronically decaying top quark is reconstructed as an anti-kt jet with radius parameter R = 1.0 and identified with jet substructure techniques. The observed yield is corrected for detector effects to obtain a cross-section at particle level in a fiducial region close to the event selection. A parton-level cross-section extrapolated to the full phase space is also reported for top quarks with transverse momentum above 300 GeV. The predictions of a majority of next-to-leading-order and leading-order matrix-element Monte Carlo generators are found to agree with the measured cross-sections. We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently.
We acknowledge the support of ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia; BMWFW and FWF, Austria; ANAS, Azerbaijan; SSTC, Belarus; CNPq and FAPESP, Brazil; NSERC, NRC and CFI, Canada; CERN; CONICYT, Chile; CAS, MOST and NSFC, China; COLCIENCIAS, Colombia; MSMT CR, MPO CR and VSC CR, Czech Republic; DNRF, DNSRC and Lundbeck Foundation, Denmark; IN2P3-CNRS, CEA-DSM/IRFU, France; GNSF, Georgia; BMBF, HGF, and MPG, Germany; GSRT, Greece; RGC, Hong Kong SAR, China; ISF, I-CORE and Benoziyo Center, Israel; INFN, Italy; MEXT and JSPS, Japan; CNRST, Morocco; FOM and NWO, Netherlands; RCN, Norway; MNiSW and NCN, Poland; FCT, Portugal; MNE/IFA, Romania; MES of Russia and NRC KI, Russian Federation; JINR; MESTD, Serbia; MSSR, Slovakia; ARRS and MIZS, Slovenia; DST/NRF, South Africa; MINECO, Spain; SRC and Wallenberg Foundation, Sweden; SERI, SNSF and Cantons of Bern and Geneva, Switzerland; MOST, Taiwan; TAEK, Turkey; STFC, United Kingdom; DOE and NSF, United States of America. In addition, individual groups and members have received support from BCKDF, the Canada Council, CANARIE, CRC, Compute Canada, FQRNT, and the Ontario Innovation Trust, Canada; EPLANET, ERC, FP7, Horizon 2020 and Marie Sklodowska-Curie Actions, European Union; Investissements d'Avenir Labex and Idex, ANR, Region Auvergne and Fondation Partager le Savoir, France; DFG and AvH Foundation, Germany; Herakleitos, Thales and Aristeia programmes co-financed by EU-ESF and the Greek NSRF; BSF, GIF and Minerva, Israel; BRF, Norway; the Royal Society and Leverhulme Trust, United Kingdom. The crucial computing support from all WLCG partners is acknowledged gratefully, in particular from CERN and the ATLAS Tier-1 facilities at TRIUMF (Canada), NDGF (Denmark, Norway, Sweden), CC-IN2P3 (France), KIT/GridKA (Germany), INFN-CNAF (Italy), NL-T1 (Netherlands), PIC (Spain), ASGC (Taiwan), RAL (UK) an