973 research outputs found

    A Systematic Review of Automated Query Reformulations in Source Code Search

    Full text link
    Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trials and errors during a code search. Over the years, many studies attempt to reformulate the ad hoc queries from developers to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss the best practices and future opportunities to advance the state of research in search query reformulations.Comment: 81 pages, accepted at TOSE

    Source Code Retrieval from Large Software Libraries for Automatic Bug Localization

    Get PDF
    This dissertation advances the state-of-the-art in information retrieval (IR) based approaches to automatic bug localization in software. In an IR-based approach, one first creates a search engine using a probabilistic or a deterministic model for the files in a software library. Subsequently, a bug report is treated as a query to the search engine for retrieving the files relevant to the bug. With regard to the new work presented, we first demonstrate the importance of taking version histories of the files into account for achieving significant improvements in the precision with which the files related to a bug are located. This is motivated by the realization that the files that have not changed in a long time are likely to have ``stabilized and are therefore less likely to contain bugs. Subsequently, we look at the difficulties created by the fact that developers frequently use abbreviations and concatenations that are not likely to be familiar to someone trying to locate the files related to a bug. We show how an initial query can be automatically reformulated to include the relevant actual terms in the files by an analysis of the files retrieved in response to the original query for terms that are proximal to the original query terms. The last part of this dissertation generalizes our term-proximity based work by using Markov Random Fields (MRF) to model the inter-term dependencies in a query vis-a-vis the files. Our MRF work redresses one of the major defects of the most commonly used modeling approaches in IR, which is the loss of all inter-term relationships in the documents

    Recovering fitness gradients for interprocedural Boolean flags in search-based testing

    Get PDF
    National Research Foundation (NRF) Singapore under Corp Lab @ University scheme; National Research Foundation (NRF) Singapore under its NSoE Programm

    Metadiscourse analysis of digital interpersonal interactions in academic settings in Turkey

    Get PDF
    Rapid technological advances, efficiency and easy access have firmly established emailing as a vital medium of communication in the last decades. Nowadays, all around the world, particularly in educational settings, the medium is one of the most widely used modes of interaction between students and university lecturers. Despite their important role in academic life, very little is known about the metadiscursive characteristics of these e-messages and as far as the author is aware there is no study that has examined metadiscourse in request emails in Turkish. This study aims to contribute to filling in this gap by focusing on the following two research questions: (i) How many and what type of interpersonal metadiscourse markers are used in request emails sent by students to their lecturers? (ii) Where are they placed and how are they combined with other elements in the text? In order to answer these questions a corpus of unsolicited request e-mails in Turkish was compiled. The data collection started in January 2010 and continued until March 2018. A total of 353 request emails sent from university students to their lecturers were collected. The data were first transcribed in CLAN CHILDES format and analysed using the interpersonal model. The metadiscourse categories that aimed to involve readers in the email were identified and classified. Next, their places in the text were determined and described in detail. Findings of the study show that request emails include a wide array of multifunctional interpersonal metadiscourse markers which are intricately combined and employed by the writers to reach their aims. The results also showed that there is a close relation between the “weight of the request” and number of the interpersonal metadiscourse markers in request mails

    Steps towards adaptive situation and context-aware access: a contribution to the extension of access control mechanisms within pervasive information systems

    Get PDF
    L'évolution des systèmes pervasives a ouvert de nouveaux horizons aux systèmes d'information classiques qui ont intégré des nouvelles technologies et des services qui assurent la transparence d'accès aux resources d'information à n'importe quand, n'importe où et n'importe comment. En même temps, cette évolution a relevé des nouveaux défis à la sécurité de données et à la modélisation du contrôle d'accès. Afin de confronter ces challenges, differents travaux de recherche se sont dirigés vers l'extension des modèles de contrôles d'accès (en particulier le modèle RBAC) afin de prendre en compte la sensibilité au contexte dans le processus de prise de décision. Mais la liaison d'une décision d'accès aux contraintes contextuelles dynamiques d'un utilisateur mobile va non seulement ajouter plus de complexité au processus de prise de décision mais pourra aussi augmenter les possibilités de refus d'accès. Sachant que l'accessibilité est un élément clé dans les systèmes pervasifs et prenant en compte l'importance d'assurer l'accéssibilité en situations du temps réel, nombreux travaux de recherche ont proposé d'appliquer des mécanismes flexibles de contrôle d'accès avec des solutions parfois extrêmes qui depassent les frontières de sécurité telle que l'option de "Bris-de-Glace". Dans cette thèse, nous introduisons une solution modérée qui se positionne entre la rigidité des modèles de contrôle d'accès et la flexibilité qui expose des risques appliquées pendant des situations du temps réel. Notre contribution comprend deux volets : au niveau de conception, nous proposons PS-RBAC - un modèle RBAC sensible au contexte et à la situation. Le modèle réalise des attributions des permissions adaptatives et de solution de rechange à base de prise de décision basée sur la similarité face à une situation importanteÀ la phase d'exécution, nous introduisons PSQRS - un système de réécriture des requêtes sensible au contexte et à la situation et qui confronte les refus d'accès en reformulant la requête XACML de l'utilisateur et en lui proposant une liste des resources alternatives similaires qu'il peut accéder. L'objectif est de fournir un niveau de sécurité adaptative qui répond aux besoins de l'utilisateur tout en prenant en compte son rôle, ses contraintes contextuelles (localisation, réseau, dispositif, etc.) et sa situation. Notre proposition a été validé dans trois domaines d'application qui sont riches des contextes pervasifs et des scénarii du temps réel: (i) les Équipes Mobiles Gériatriques, (ii) les systèmes avioniques et (iii) les systèmes de vidéo surveillance.The evolution of pervasive computing has opened new horizons to classical information systems by integrating new technologies and services that enable seamless access to information sources at anytime, anyhow and anywhere. Meanwhile this evolution has opened new threats to information security and new challenges to access control modeling. In order to meet these challenges, many research works went towards extending traditional access control models (especially the RBAC model) in order to add context awareness within the decision-making process. Meanwhile, tying access decisions to the dynamic contextual constraints of mobile users would not only add more complexity to decision-making but could also increase the possibilities of access denial. Knowing that accessibility is a key feature for pervasive systems and taking into account the importance of providing access within real-time situations, many research works have proposed applying flexible access control mechanisms with sometimes extreme solutions that depass security boundaries such as the Break-Glass option. In this thesis, we introduce a moderate solution that stands between the rigidity of access control models and the riskful flexibility applied during real-time situations. Our contribution is twofold: on the design phase, we propose PS-RBAC - a Pervasive Situation-aware RBAC model that realizes adaptive permission assignments and alternative-based decision-making based on similarity when facing an important situation. On the implementation phase, we introduce PSQRS - a Pervasive Situation-aware Query Rewriting System architecture that confronts access denials by reformulating the user's XACML access request and proposing to him a list of alternative similar solutions that he can access. The objective is to provide a level of adaptive security that would meet the user needs while taking into consideration his role, contextual constraints (location, network, device, etc.) and his situation. Our proposal has been validated in three application domains that are rich in pervasive contexts and real-time scenarios: (i) Mobile Geriatric Teams, (ii) Avionic Systems and (iii) Video Surveillance Systems

    Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey

    Full text link
    Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). Domain specification techniques are key to make large language models disruptive in many applications. Specifically, to solve these hurdles, there has been a notable increase in research and practices conducted in recent years on the domain specialization of LLMs. This emerging field of study, with its substantial potential for impact, necessitates a comprehensive and systematic review to better summarize and guide ongoing work in this area. In this article, we present a comprehensive survey on domain specification techniques for large language models, an emerging direction critical for large language model applications. First, we propose a systematic taxonomy that categorizes the LLM domain-specialization techniques based on the accessibility to LLMs and summarizes the framework for all the subcategories as well as their relations and differences to each other. Second, we present an extensive taxonomy of critical application domains that can benefit dramatically from specialized LLMs, discussing their practical significance and open challenges. Last, we offer our insights into the current research status and future trends in this area
    corecore