995 research outputs found
A Survey on Automated Software Vulnerability Detection Using Machine Learning and Deep Learning
Software vulnerability detection is critical in software security because it
identifies potential bugs in software systems, enabling immediate remediation
and mitigation measures to be implemented before they may be exploited.
Automatic vulnerability identification is important because it can evaluate
large codebases more efficiently than manual code auditing. Many Machine
Learning (ML) and Deep Learning (DL) based models for detecting vulnerabilities
in source code have been presented in recent years. However, a survey that
summarises, classifies, and analyses the application of ML/DL models for
vulnerability detection is missing. It may be difficult to discover gaps in
existing research and potential for future improvement without a comprehensive
survey. This could result in essential areas of research being overlooked or
under-represented, leading to a skewed understanding of the state of the art in
vulnerability detection. This work addresses that gap by presenting a systematic
survey to characterize various features of ML/DL-based source code level
software vulnerability detection approaches via five primary research questions
(RQs). Specifically, our RQ1 examines the trend of publications that leverage
ML/DL for vulnerability detection, including the evolution of research and the
distribution of publication venues. RQ2 describes vulnerability datasets used
by existing ML/DL-based models, including their sources, types, and
representations, as well as analyses of the embedding techniques used by these
approaches. RQ3 explores the model architectures and design assumptions of
ML/DL-based vulnerability detection approaches. RQ4 summarises the type and
frequency of vulnerabilities that are covered by existing studies. Lastly, RQ5
presents a list of current challenges to be researched and an outline of a
potential research roadmap that highlights crucial opportunities for future
work.
Model-Driven Engineering for Artificial Intelligence - A Systematic Literature Review
Objective: This study aims to investigate the existing body of knowledge in the field of Model-Driven Engineering (MDE) in support of AI (MDE4AI) to sharpen future research and define the current state of the art. Method: We conducted a Systematic Literature Review (SLR), collecting papers from five major databases resulting in 703 candidate studies, eventually retaining 15 primary studies. Each primary study is evaluated and discussed with respect to the adoption of (1) MDE principles and practices and (2) the phases of AI development support aligned with the stages of the CRISP-DM methodology. Results: The study's findings show that the pillar concepts of MDE (metamodel, concrete syntax and model transformation) are leveraged to define domain-specific languages (DSL) explicitly addressing AI concerns. Different MDE technologies are used, leveraging different language workbenches. The most prominent AI-related concerns are training and modeling of the AI algorithm, while minor emphasis is given to the time-consuming preparation of the data sets. Early project phases that support interdisciplinary communication of requirements, such as the CRISP-DM Business Understanding phase, are rarely reflected. Conclusion: The study found that the use of MDE for AI is still in its early stages, and there is no single tool or method that is widely used. Additionally, current approaches tend to focus on specific stages of development rather than providing support for the entire development process. As a result, the study suggests several research directions to further improve the use of MDE for AI and to guide future research in this area.
Use and misuse of the term "Experiment" in mining software repositories research
The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use into the existing empirical SE body of knowledge. This is especially the case of MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences with experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are in fact not experiments at all but rather observational studies, so they use the term in a misleading way. Of the remaining 81% of the papers, only one refers to a genuine controlled experiment, while the others stand for experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments. This work has been partially supported by the Spanish project: MCI PID2020-117191RB-I00. Peer Reviewed. Postprint (author's final draft).
Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study
Many small to large organizations have adopted the Microservices Architecture
(MSA) style to develop and deliver their core businesses. Despite the
popularity of MSA in the software industry, there is limited evidence-based
and thorough understanding of the types of issues (e.g., errors, faults,
failures, and bugs) that microservices system developers experience, the causes
of the issues, and the solutions as potential fixing strategies to address the
issues. To address this gap, we conducted a mixed-methods empirical study
that collected data from 2,641 issues from the issue tracking systems of 15
open-source microservices systems on GitHub, 15 interviews, and an online
survey completed by 150 practitioners from 42 countries across 6 continents.
Our analysis led to comprehensive taxonomies for the issues, causes, and
solutions. The findings of this study show that Technical Debt, Continuous
Integration and Delivery, Exception Handling, Service Execution and
Communication, and Security are the most dominant issues in microservices
systems. Furthermore, General Programming Errors, Missing Features and
Artifacts, and Invalid Configuration and Communication are the main causes
behind the issues. Finally, we found 177 types of solutions that can be applied
to fix the identified issues. Based on our study results, we formulated future
research directions that could help researchers and practitioners to engineer
emergent and next-generation microservices systems. Comment: 35 pages, 5 images, 7 tables, Manuscript submitted to a Journal
(2023)
Social media mining under the COVID-19 context: Progress, challenges, and opportunities
Social media platforms allow users worldwide to create and share information, forging vast sensing networks that
allow information on certain topics to be collected, stored, mined, and analyzed in a rapid manner. During the
COVID-19 pandemic, extensive social media mining efforts have been undertaken to tackle COVID-19 challenges
from various perspectives. This review summarizes the progress of social media data mining studies in the
COVID-19 contexts and categorizes them into six major domains, including early warning and detection, human
mobility monitoring, communication and information conveying, public attitudes and emotions, infodemic and
misinformation, and hatred and violence. We further document essential features of publicly available COVID-19
related social media data archives that will benefit research communities in conducting replicable and reproducible studies. In addition, we discuss seven challenges in social media analytics associated with their potential
impacts on derived COVID-19 findings, followed by our visions for the possible paths forward in regard to social
media-based COVID-19 investigations. This review serves as a valuable reference that recaps social media mining
efforts in COVID-19 related studies and provides future directions along which the information harnessed from
social media can be used to address public health emergencies.
Quantinar: a blockchain p2p ecosystem for honest scientific research
Living in the Information Age, the power of data and correct statistical
analysis has never been more prevalent. Academics, practitioners and many other
professionals nowadays require an accurate application of quantitative methods.
However, many branches are subject to a crisis of integrity, which is shown in
improper use of statistical models, p-hacking, HARKing or failure to
replicate results. We propose the use of a peer-to-peer education network,
Quantinar, to spread quantitative analysis knowledge embedded with code in the
form of Quantlets. The integration of blockchain technology makes Quantinar a
decentralised autonomous organisation (DAO) that ensures fully transparent and
reproducible scientific research.