7 research outputs found

    TraceSim: An alignment method for computing stack trace similarity

    ABSTRACT: Software systems can automatically submit crash reports to a repository for investigation when program failures occur. A significant portion of these crash reports are duplicates, i.e., they are caused by the same software issue. Therefore, if the volume of submitted reports is very large, automatic grouping of duplicate crash reports can significantly ease and speed up the analysis of software failures. This task is known as crash report deduplication. Given a huge volume of incoming reports, improving the quality of deduplication is an important task. The majority of studies address it via information retrieval or sequence matching methods based on the similarity of stack traces from two crash reports. While information retrieval methods disregard the position of a frame in a stack trace, existing works based on sequence matching algorithms do not fully consider the global frequency of subroutines and unmatched frames. Besides, because data distributions differ among software projects, parameters learned with machine learning algorithms are necessary to make the methods more flexible. In this paper, we propose TraceSim, an approach for crash report deduplication which combines TF-IDF, optimum global alignment, and machine learning (ML) in a novel way. Moreover, we propose a new evaluation methodology for this task that is more comprehensive and robust than previously used evaluation approaches. TraceSim significantly outperforms seven baselines and state-of-the-art methods in the majority of the scenarios. It is the only approach that achieves competitive results on all datasets regarding all considered metrics. Moreover, we conduct an extensive ablation study that demonstrates the importance of each of TraceSim's elements to its final performance and robustness. Finally, we provide the source code for all considered methods and the evaluation methodology, as well as the created datasets.
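
The abstract names two ingredients, TF-IDF-style frame weighting and global alignment, without giving formulas. The following is only a minimal sketch of how such a combination could look, not TraceSim's actual scoring function: the function names, the `log(1 + n / df)` weighting, the gap penalties, and the normalization to [-1, 1] are all assumptions made for illustration, and the learned (ML) parameters mentioned in the abstract are left out.

```python
import math
from collections import Counter


def idf_weights(traces):
    """Assumed weighting: frames that occur in fewer traces get a larger weight."""
    n = len(traces)
    df = Counter(frame for trace in traces for frame in set(trace))
    return {frame: math.log(1 + n / count) for frame, count in df.items()}


def weighted_alignment_similarity(a, b, idf, default_weight=1.0):
    """Needleman-Wunsch-style global alignment of two stack traces.

    A matched frame adds twice its weight; every unmatched frame
    (a gap in the alignment) subtracts its weight. The score is divided
    by the total weight of both traces, giving a value in [-1, 1].
    """
    def w(frame):
        return idf.get(frame, default_weight)

    # dp[i][j] is the best score for aligning a[:i] with b[:j].
    dp = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dp[i][0] = dp[i - 1][0] - w(a[i - 1])
    for j in range(1, len(b) + 1):
        dp[0][j] = dp[0][j - 1] - w(b[j - 1])
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            best = max(dp[i - 1][j] - w(a[i - 1]),   # a[i-1] left unmatched
                       dp[i][j - 1] - w(b[j - 1]))   # b[j-1] left unmatched
            if a[i - 1] == b[j - 1]:
                best = max(best, dp[i - 1][j - 1] + 2 * w(a[i - 1]))
            dp[i][j] = best

    total = sum(w(f) for f in a) + sum(w(f) for f in b)
    return dp[-1][-1] / total if total else 1.0


# Hypothetical toy corpus; frame names are made up.
corpus = [
    ["main", "run", "parse"],
    ["main", "run", "render"],
    ["main", "init"],
]
idf = idf_weights(corpus)
sim_close = weighted_alignment_similarity(corpus[0], corpus[1], idf)
sim_far = weighted_alignment_similarity(corpus[0], corpus[2], idf)
```

In this sketch, position matters because alignment preserves frame order, and a frequent frame such as `main` contributes less to the score than a rare one, which is the general idea the abstract contrasts with pure information retrieval and plain sequence matching.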

    New Sulfamides Based on 1-Izopropil-3-α-Naftyl-5-Methoxymethyl-4-Aminopyrazole and Determination of Their Structure

    The previously obtained 1-isopropyl-3-α-naphthyl-5-methoxymethyl-4-nitrosopyrazole was reduced with hydrazine hydrate. 1-Isopropyl-3-α-naphthyl-5-methoxymethyl-4-aminopyrazole was synthesized for the first time and then sulfonylated with p-acetamidobenzenesulfonyl chloride and p-toluenesulfonyl chloride. As a result, previously unknown sulfonylated derivatives of N-alkylated aminopyrazoles were obtained. Their composition and structure were confirmed by modern methods of analysis, such as IR spectroscopy, 1H NMR spectroscopy, and mass spectrometry.

    S3M: Siamese Stack (Trace) Similarity Measure

    Automatic crash reporting systems have become a de facto standard in software development. These systems monitor target software, and if a crash occurs, they send details to a backend application. Later on, these reports are aggregated and used in the development process to 1) understand whether a crash is a new or an existing issue, 2) assign these bugs to appropriate developers, and 3) gain a general overview of the application's bug landscape. The efficiency of report aggregation and subsequent operations heavily depends on the quality of the report similarity metric. However, a distinctive feature of this kind of report is that no textual input from the user (i.e., a bug description) is available: it contains only stack trace information. In this paper, we present S3M ("extreme"), the first approach to computing stack trace similarity based on deep learning. It is based on a siamese architecture that uses a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state of the art on both open-source data and a private JetBrains dataset. Additionally, we review the impact of stack trace trimming on the quality of the results.

    All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs

    Integrated Development Environments (IDEs) are designed to make users more productive, as well as to make their work more comfortable. To achieve this, many diverse tools are embedded into IDEs, and the developers of IDEs can use anonymous usage logs to collect data about how these tools are used in order to improve them. A particularly important component that this can be applied to is code completion, since improving code completion using statistical learning techniques is a well-established research area. In this work, we propose an approach for collecting completion usage logs from the users in an IDE and using them to train a machine-learning-based model for ranking completion candidates. We developed a set of features that describe completion candidates and their context, and deployed their anonymized collection in the Early Access Program of IntelliJ-based IDEs. We used the logs to collect a dataset of code completions from users, and employed it to train a ranking CatBoost model. Then, we evaluated it in two settings: on a held-out set of the collected completions and in a separate A/B test on two different groups of users in the IDE. Our evaluation shows that using a simple ranking model trained on past user behavior logs significantly improves the code completion experience. Compared to the default heuristics-based ranking, our model decreased the number of typing actions necessary to perform a completion in the IDE from 2.073 to 1.832. The approach adheres to privacy requirements and legal constraints, since it does not require collecting personal information and performs all the necessary anonymization on the client's side. Importantly, it can be improved continuously by implementing new features, collecting new data, and evaluating new models. We have been using it in production since the end of 2020. Comment: 11 pages, 4 figures