
    A Unified Checklist for Observational and Experimental Research in Software Engineering (Version 1)

    Current checklists for empirical software engineering cover either experimental research or case study research, but ignore the many commonalities that exist across all kinds of empirical research. Identifying these commonalities, and explaining why they exist, would enhance our understanding of empirical research in general and of the differences between experimental and case study research in particular. In this report we design a unified checklist for empirical research and identify commonalities and differences between experimental and case study research. We design the unified checklist as a specialization of the general engineering cycle, which itself is a special case of the rational choice cycle. We then compare the resulting empirical research cycle with two checklists for experimental research and with one checklist for case study research. The resulting checklist identifies important questions to be answered in experimental and case study research designs and reports. The checklist provides insights into two different types of empirical research design and their relationships. Its limitations are that it ignores other research methods, such as meta-research or surveys, and that it has so far been tested only in our own research designs and in teaching empirical methods. Future work includes expanding the comparison with other methods and applying the checklist in more cases, by researchers other than ourselves.

    and Cost/Benefits Opportunities

    Acquisition Research Program Sponsored Report Series. The acquisition of artificial intelligence (AI) systems is a relatively new challenge for the U.S. Department of Defense (DoD). Given the potential for high-risk failures of AI system acquisitions, it is critical for the acquisition community to examine new analytical and decision-making approaches to managing the acquisition of these systems in addition to the existing approaches (i.e., Earned Value Management, or EVM). In addition, many of these systems reside in small start-up or relatively immature system development companies, further clouding the acquisition process due to their unique business processes when compared to the large defense contractors. This can lead to limited access to data, information, and processes that are required in the standard DoD acquisition approach (i.e., the 5000 series). The well-known recurring problems in acquiring information technology automation within the DoD will likely be exacerbated in acquiring complex and risky AI systems. Therefore, more robust, agile, and analytically driven acquisition methodologies will be required to help avoid costly disasters in acquiring these kinds of systems. This research provides a set of analytical tools for acquiring organically developed AI systems through a comparison and contrast of the proposed methodologies, demonstrating when and how each method can be applied to improve the acquisition lifecycle for AI systems, and provides additional insights and examples of how some of these methods can be applied. This research identifies, reviews, and proposes advanced quantitative, analytically based methods within the integrated risk management (IRM) and knowledge value added (KVA) methodologies to complement the current EVM approach. It examines whether the various methodologies—EVM, KVA, and IRM—could be used within the Defense Acquisition System (DAS) to improve the acquisition of AI. While this paper does not recommend one of these methodologies over the others, certain methodologies, specifically IRM, may be more beneficial when used throughout the entire acquisition process instead of within only a portion of it. Due to the complexity of AI systems, this research looks at AI as a whole rather than at specific types of AI. Approved for public release; distribution is unlimited.
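
    The EVM baseline the report builds on reduces to a handful of standard formulas. A minimal sketch of those core earned-value computations follows, with illustrative figures; the KVA and IRM methodologies are not formalized here.

```python
# Minimal sketch of standard Earned Value Management (EVM) metrics.
# All input figures are illustrative; KVA and IRM are not formalized here.

def evm_metrics(pv: float, ev: float, ac: float, bac: float) -> dict:
    """Compute core EVM indicators from planned value (PV),
    earned value (EV), actual cost (AC), and budget at completion (BAC)."""
    cpi = ev / ac            # Cost Performance Index
    spi = ev / pv            # Schedule Performance Index
    return {
        "cost_variance": ev - ac,       # CV > 0 means under budget
        "schedule_variance": ev - pv,   # SV > 0 means ahead of schedule
        "cpi": cpi,
        "spi": spi,
        "estimate_at_completion": bac / cpi,  # assumes current cost efficiency persists
    }

if __name__ == "__main__":
    # Hypothetical acquisition milestone: $10M planned, $8M earned, $9M spent.
    print(evm_metrics(pv=10e6, ev=8e6, ac=9e6, bac=50e6))
```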

    A complex systems approach to connectivity to international markets

    PhD Thesis. Improving connectivity is increasingly a topic at the centre of the international trade and transport policy agendas. An examination of available documents and studies in both the policy-making and academic fields shows that the concept of connectivity has often been defined in different ways, and thus has taken on a variety of meanings. This poses two questions: what is freight connectivity, and what are its determinants in the context of international trade? The researcher is not aware of any study that has analysed, in a comprehensive and systematic way, the different perspectives, determinants and measures of connectivity to international markets. Using a mixed-methods approach that includes a systematic literature review encompassing literature in the fields of Transport Engineering and Economics, International Economics, Supply Chain Management, Physics and Transport Geography; a survey and in-depth interviews in three countries; comparative analysis of connectivity metrics in a variety of fields; and network analysis of over 100 networks, this Dissertation contributes to filling this gap by providing: (i) a complex systems approach to connectivity to international markets; (ii) a comprehensive definition of connectivity to international markets which encompasses the different factors that influence it; and (iii) a novel method to assess connectivity to international markets using network analysis. Further contributions of this research include insights into the multi-layered characteristics of both international trade flows and their support system; the perspective of emerging economies; and the study of a region – the Americas – mostly overlooked by the literature on complex systems applied to trade and transport networks. It is expected that a multi-disciplinary, comprehensive and more precise understanding and assessment of the determinants of connectivity will help identify and design more effective policies to address barriers impeding fast, smooth access to international markets, as well as guide future multi-disciplinary research and analysis in academia and policy-making.
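
    The network-analysis contribution is easiest to picture with a toy example. Below is a minimal sketch of the kind of connectivity indicators such an analysis builds on, using networkx on an invented weighted trade network; the dissertation's own metric is not reproduced here.

```python
# Toy illustration of network-analysis connectivity measures of the kind the
# thesis builds on; the dissertation's own metric is not reproduced here.
import networkx as nx

# Hypothetical weighted, directed trade network (edge weight = trade volume).
G = nx.DiGraph()
G.add_weighted_edges_from([
    ("BRA", "USA", 35.0), ("BRA", "CHN", 60.0), ("CHL", "USA", 20.0),
    ("USA", "CHN", 120.0), ("CHN", "BRA", 45.0), ("USA", "CHL", 15.0),
])

# Out-strength: total export volume per country.
strength = dict(G.out_degree(weight="weight"))

# For shortest-path measures, networkx treats weights as distances, so
# invert volumes first (higher volume = "closer" trading partners).
H = G.copy()
for u, v, d in H.edges(data=True):
    d["distance"] = 1.0 / d["weight"]
betweenness = nx.betweenness_centrality(H, weight="distance")

print("Out-strength:", strength)
print("Betweenness:", betweenness)
```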

    Changes and bugs: mining and predicting development activities

    Software development results in a huge amount of data: changes to source code are recorded in version archives, bugs are reported to issue tracking systems, and communications are archived in e-mails and newsgroups. In this thesis, we present techniques for mining version archives and bug databases to understand and support software development. First, we present techniques which mine version archives for fine-grained changes. We introduce the concept of co-addition of method calls, which we use to identify patterns that describe how methods should be called. We use dynamic analysis to validate these patterns and identify violations. The co-addition of method calls can also detect cross-cutting changes, which are an indicator for concerns that could have been realized as aspects in aspect-oriented programming. Second, we present techniques to build models that can successfully predict the most defect-prone parts of large-scale industrial software, in our experiments Windows Server 2003. This helps managers to allocate resources for quality assurance to those parts of a system that are expected to have the most defects. The proposed measures on dependency graphs outperformed traditional complexity metrics. In addition, we found empirical evidence for a domino effect: depending on defect-prone binaries increases the chances of having defects.
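
    The co-addition idea can be sketched as frequent-pair mining over the sets of method calls added per commit, flagging commits where one member of a strong pair appears without the other. The input data and thresholds below are hypothetical, not the thesis's actual mining pipeline.

```python
# Hedged sketch of co-addition mining: find pairs of method calls frequently
# added together across commits, then flag commits where one call appears
# without its usual partner. Data and thresholds are invented.
from collections import Counter
from itertools import combinations

# Hypothetical per-commit sets of method calls added (extracted from diffs).
commits = [
    {"lock", "unlock", "read"},
    {"lock", "unlock", "write"},
    {"open", "close"},
    {"lock", "read"},            # "unlock" missing: candidate violation
]

pair_counts = Counter()
call_counts = Counter()
for calls in commits:
    call_counts.update(calls)
    pair_counts.update(combinations(sorted(calls), 2))

# Keep pairs with enough support and confidence to count as usage patterns.
MIN_SUPPORT, MIN_CONF = 2, 0.6
patterns = {
    (a, b) for (a, b), n in pair_counts.items()
    if n >= MIN_SUPPORT and n / min(call_counts[a], call_counts[b]) >= MIN_CONF
}

# Report commits containing exactly one member of a mined pattern.
for i, calls in enumerate(commits):
    for a, b in patterns:
        if (a in calls) != (b in calls):
            print(f"commit {i}: pattern ({a}, {b}) violated")
```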

    Account Recovery Methods for Two-Factor Authentication (2FA): An Exploratory Study

    System administrators have started to adopt two-factor authentication (2FA) to increase user account resistance to cyber-attacks. Systems with 2FA require users to verify their identity using a password and a second-factor authentication device to gain account access. This research found that 60% of users enroll only one second-factor device to their account. If a user's second factor becomes unavailable, systems use different procedures to ensure its authorized owner recovers the account. Account recovery is essentially a bypass of the system's main security protocols and needs to be handled as an alternative authentication process (Loveless, 2018). The current research aimed to evaluate users' perceived security for four 2FA account recovery methods. Using Renaud's (2007) opportunistic equation, the present study determined that a fallback phone number recovery method provides user accounts with the most cyber-attack resistance, followed by system-generated recovery codes, a color grid pattern, and a graphical passcode. This study surveyed 103 participants about authentication knowledge, general risk perception aptitude, ability to correctly rank the recovery methods in terms of their attack resistance, and recovery method perceptions. Other survey inquiries related to previous 2FA, account recovery, and cybersecurity training experiences. Participants generally performed poorly when asked to rank the recovery methods by security strength. Results suggested that neither risk numeracy, authentication knowledge, nor cybersecurity familiarity impacted users' ability to rank recovery methods by security strength. However, the majority of participants ranked either generated recovery codes (39%) or a fallback phone number (25%) as most secure. A plurality of participants (45%) preferred the fallback phone number for account recovery; 38% expected it to be the easiest to use, and 46% expected it to be the most memorable. However, users' written descriptions of their recovery method preferences revealed that users are likely to disregard the setup instructions and use their own phone number instead of an emergency contact number. Overall, this exploratory study offers information that researchers and designers can use to improve users' 2FA and 2FA account recovery experiences.
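
    Of the four recovery methods studied, system-generated recovery codes are the most straightforward to illustrate. Here is a minimal sketch assuming a common design (a batch of single-use random codes, stored server-side only as hashes); it is an illustration, not the scheme evaluated in the study.

```python
# Minimal sketch of system-generated 2FA recovery codes, assuming a common
# design: single-use random codes, persisted server-side only as hashes.
import hashlib
import secrets

def generate_recovery_codes(n: int = 10) -> tuple[list[str], set[str]]:
    """Return (codes shown to the user once, hashes to persist)."""
    codes = [secrets.token_hex(5) for _ in range(n)]   # e.g. 'a3f19c0b2e'
    hashes = {hashlib.sha256(c.encode()).hexdigest() for c in codes}
    return codes, hashes

def redeem(code: str, stored: set[str]) -> bool:
    """Consume a code: valid once, then removed from the stored set."""
    h = hashlib.sha256(code.encode()).hexdigest()
    if h in stored:
        stored.discard(h)   # single use
        return True
    return False

codes, stored = generate_recovery_codes()
print(redeem(codes[0], stored))  # True
print(redeem(codes[0], stored))  # False: already used
```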

    Quality of Design, Analysis and Reporting of Software Engineering Experiments: A Systematic Review

    Background: Like any research discipline, software engineering research must be of a certain quality to be valuable. High quality research in software engineering ensures that knowledge is accumulated and helpful advice is given to the industry. One way of assessing research quality is to conduct systematic reviews of the published research literature. Objective: The purpose of this work was to assess the quality of published experiments in software engineering with respect to the validity of inference and the quality of reporting. More specifically, the aim was to investigate the level of statistical power, the analysis of effect size, the handling of selection bias in quasi-experiments, and the completeness and consistency of the reporting of information regarding subjects, experimental settings, design, analysis, and validity. Furthermore, the work aimed at providing suggestions for improvements, using the potential deficiencies detected as a basis. Method: The quality was assessed by conducting a systematic review of the 113 experiments published in nine major software engineering journals and three conference proceedings in the decade 1993–2002. Results: The review revealed that software engineering experiments were generally designed with unacceptably low power and that inadequate attention was paid to issues of statistical power. Effect sizes were sparsely reported and not interpreted with respect to their practical importance for the particular context. There seemed to be little awareness of the importance of controlling for selection bias in quasi-experiments. Moreover, the review revealed a need for more complete and standardized reporting of information, which is crucial for understanding software engineering experiments and judging their results. Implications: The consequence of low power is that the actual effects of software engineering technologies will not be detected to an acceptable extent. The lack of reporting of effect sizes and the improper interpretation of effect sizes result in ignorance of the practical importance, and thereby the relevance to industry, of experimental results. The lack of control for selection bias in quasi-experiments may make these experiments less credible than randomized experiments. This is an unsatisfactory situation, because quasi-experiments serve an important role in investigating cause-effect relationships in software engineering, for example, in industrial settings. Finally, the incomplete and unstandardized reporting makes it difficult for the reader to understand an experiment and judge its results. Conclusions: Insufficient quality was revealed in the reviewed experiments. This has implications for inferences drawn from the experiments and might in turn lead to the accumulation of erroneous information and the offering of misleading advice to the industry. Ways to improve this situation are suggested.
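
    The power problem the review documents is easy to make concrete. Below is a short sketch using statsmodels, with illustrative numbers rather than figures from the reviewed experiments.

```python
# Illustration of the power problem: with a medium effect (Cohen's d = 0.5)
# and small groups, power falls well below the conventional 0.8 target.
# Figures are illustrative, not taken from the reviewed experiments.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a two-group experiment with 15 subjects per group, alpha = 0.05.
power = analysis.solve_power(effect_size=0.5, nobs1=15, alpha=0.05, ratio=1.0)
print(f"Power with n=15 per group: {power:.2f}")      # roughly 0.26

# Sample size per group needed to reach 80% power for the same effect.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05, ratio=1.0)
print(f"n per group for 80% power: {n_needed:.0f}")   # roughly 64
```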

    Investigating the relationship between wisdom, intelligence, age, and gender and the role of mediators and moderators: an Australian setting

    Wisdom and intelligence are complex, distinct constructs which share some characteristics. Measures of wisdom should be distinguished from the construct of intelligence because, although intelligence helps us engage with our environment, wisdom assists us in dealing with life’s existential challenges. Yet wisdom, a master virtue, often lacks valid and reliable measures. This thesis investigated how wisdom and intelligence are influenced by age and gender, in two quantitative studies. Study One examined whether the structural validity of the popular 40-item, five-factor Self-Assessed Wisdom Scale (SAWS) would replicate in our sample. We also tested multigroup invariance, and the SAWS Openness dimension as a wisdom precursor proposed by other models. Data from 709 respondents aged 15–92 were randomly split into two samples. Confirmatory factor analysis (CFA) on Sample 1 showed that the SAWS factor structure did not fit the data. Exploratory factor analysis (EFA) on Sample 2 offered an alternative model, a 12-item, four-factor solution (SAWS-12) without a Humour facet. SAWS-12 demonstrated a good fit and measurement invariance (MI) across age groups and gender. With respect to age, all adults were wiser than adolescents, and young adults differed in wisdom from midlife adults; these two groups were similar to older persons. Although women were wiser than men, the effect size was small. In Study Two, CFA cross-validated the SAWS-12 structure with 457 participants aged 16–87 and compared the measure with the Three-Dimensional Wisdom Scale-12 (3D-WS-12). SAWS-12 displayed good discriminant validity, but the 3D-WS-12 did not, since the 3D-WS-12 shared a similar correlation (r = .34) with both SAWS-12 and crystallised intelligence (Gc). Again, women scored higher on SAWS-12, but there were no gender differences on the 3D-WS-12. On both measures, the wisdom–age trajectory was curvilinear with a peak at midlife, corroborating the current literature. Older adults’ mean wisdom scores did not differ from those of the younger or midlife groups. The highest wisdom scorers were older on both wisdom measures, but better educated only on the 3D-WS-12. On measures of Gc and fluid intelligence (Gf) there were no gender differences. While Gc increased linearly with ageing, Gf’s inverse U-curve ageing trajectory was almost flat. Although intelligence failed to mediate the relationship between age and SAWS-12, Gc mediated the relationship between the 3D-WS-12 and age. Age and gender did not moderate the relationship between intelligence and wisdom. This thesis established new findings. We confirmed that the SAWS Openness facet is a basic component of wisdom, whereas the Humour factor is not. We demonstrated ceiling and cohort effects, opposing and challenging the declining Gf with age reported in the contemporary literature. SAWS-12, as a new measure of wisdom, demonstrated excellent psychometrics superior to the 3D-WS-12, replicated in a new population across time, displayed convergent and discriminant validity, and showed MI across age groups and gender. This suggests SAWS-12 is a short, direct, reliable measure of wisdom, which offers distinct advantages for research where increments of time are the focus of the study, such as longitudinal studies, and for vulnerable population groups with short attention spans.
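
    The EFA step that produced the SAWS-12 solution can be sketched with the factor_analyzer package. Random placeholder data stand in for the item responses; this does not reproduce the thesis's actual analysis.

```python
# Sketch of an exploratory-factor-analysis step with the factor_analyzer
# package. Random placeholder data stand in for the 12 SAWS-12 items; this
# does not reproduce the thesis's actual solution.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
items = pd.DataFrame(rng.normal(size=(457, 12)),
                     columns=[f"saws_{i+1}" for i in range(12)])

# Four factors with an oblique rotation, matching the SAWS-12 structure.
fa = FactorAnalyzer(n_factors=4, rotation="oblimin")
fa.fit(items)

loadings = pd.DataFrame(fa.loadings_, index=items.columns)
print(loadings.round(2))
print("Proportional variance per factor:", fa.get_factor_variance()[1].round(2))
```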