Interaction-aware development environments: recording, mining, and leveraging IDE interactions to analyze and support the development flow
Nowadays, software development is largely carried out using Integrated Development Environments, or IDEs. An IDE is a collection of tools and facilities to support the most diverse software engineering activities, such as writing code, debugging, and program understanding. Because the tools are integrated, developers find everything needed for development in the same place. Each activity is composed of many basic events, such as clicking on a menu item in the IDE, opening a new user interface to browse the source code of a method, or adding a new statement in the body of a method. While working, developers generate thousands of these interactions, which we call fine-grained IDE interaction data. We believe this data is a valuable source of information that can be leveraged to enable better analyses and to offer novel support to developers. However, this data is largely neglected by modern IDEs. In this dissertation we propose the concept of "Interaction-Aware Development Environments": IDEs that collect, mine, and leverage the interactions of developers to support and simplify their workflow. We formulate our thesis as follows: Interaction-Aware Development Environments enable novel and in-depth analyses of the behavior of software developers and set the ground to provide developers with effective and actionable support for their activities inside the IDE. For example, by monitoring how developers navigate source code, the IDE could suggest the program entities that are potentially relevant for a particular task. Our research focuses on three main directions:
1. Modeling and Persisting Interaction Data. The first step to make IDEs aware of interaction data is to overcome its ephemeral nature. To do so, we have to model this new source of data and persist it, making it available for further use.
2. Interpreting Interaction Data. One of the biggest challenges of our research is making sense of the millions of interactions generated by developers. We propose several models to interpret this data, for example by reconstructing high-level development activities from interaction histories or by measuring the navigation efficiency of developers.
3. Supporting Developers with Interaction Data. Novel IDEs can use the potential of interaction data to support software development. For example, they can identify the UI components that are unlikely to be needed again and suggest that developers close them, reducing the visual clutter of the IDE.
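The first direction, modeling and persisting interaction data, can be illustrated with a minimal sketch. The names below (`InteractionEvent`, `InteractionRecorder`) are hypothetical and not the dissertation's actual implementation; the point is only that fine-grained events are modeled explicitly and persisted, here as JSON lines:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import json

@dataclass
class InteractionEvent:
    """A single fine-grained IDE interaction (e.g. a click or an edit)."""
    kind: str    # e.g. "menu-click", "window-open", "edit"
    target: str  # the UI component or program entity involved
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

class InteractionRecorder:
    """Collects events in memory and persists them as JSON lines."""
    def __init__(self):
        self.events = []

    def record(self, kind, target):
        self.events.append(InteractionEvent(kind, target))

    def to_jsonl(self):
        return "\n".join(json.dumps(vars(e)) for e in self.events)

recorder = InteractionRecorder()
recorder.record("menu-click", "Refactor > Rename")
recorder.record("window-open", "Browser on OrderedCollection>>add:")
print(len(recorder.events))  # prints 2
```

Once interactions are persisted in a durable, queryable form like this, the later mining and support stages can be built on top of them.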
Developers' Visuo-spatial Mental Model and Program Comprehension
Previous works from research and industry have proposed a spatial
representation of code in a canvas, arguing that a navigational code space
affords developers the freedom to organise elements according to their
understanding. By allowing developers to translate logical relatedness into
spatial proximity, this code representation could aid in code navigation and
comprehension. However, the association between developers' code comprehension
and their visuo-spatial mental model of the code is not yet well understood.
This mental model is affected on the one hand by the spatial code
representation and on the other by the visuo-spatial working memory of
developers.
We address this knowledge gap by conducting an online experiment with 20
developers following a between-subject design. The control group used a
conventional tab-based code visualization, while the experimental group used a
code canvas to complete three code comprehension tasks. Furthermore, we measured
the participants' visuo-spatial working memory using a Corsi Block test at the
end of the tasks. Our results suggest that, overall, neither the spatial
representation of code nor the visuo-spatial working memory of developers has a
significant impact on comprehension performance. However, we identified
significant differences in the time dedicated to different comprehension
activities such as navigation, annotation, and UI interactions.
Comment: To appear in the 2023 International Conference on Software Engineering
(ICSE 2023). Authors' version of the work.
Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation
Creating photo-realistic versions of sketched portraits of people is useful
for various entertainment purposes. Existing studies only generate portraits in
the 2D plane with fixed views, making the results less vivid. In this paper, we
present Stereoscopic Simplified Sketch-to-Portrait (SSSP), which explores the
possibility of creating Stereoscopic 3D-aware portraits from simple contour
sketches by involving 3D generative models. Our key insight is to design
sketch-aware constraints that can fully exploit the prior knowledge of a
tri-plane-based 3D-aware generative model. Specifically, our designed
region-aware volume rendering strategy and global consistency constraint
further enhance detail correspondences during sketch encoding. Moreover, to
make the system accessible to lay users, we propose a Contour-to-Sketch
module with vector quantized representations, so that easily drawn contours can
directly guide the generation of 3D portraits. Extensive comparisons show that
our method generates high-quality results that match the sketch. Our usability
study verifies that our system is strongly preferred by users.
Comment: Project Page on https://hangz-nju-cuhk.github.io
Data-Driven Decisions and Actions in Today’s Software Development
Today’s software development is all about data: data about the software product itself, about the process and its different stages, about the customers and markets, about the development, the testing, the integration, the deployment, or the runtime aspects in the cloud. We use static and dynamic data of various kinds and quantities to analyze market feedback, feature impact, code quality, architectural design alternatives, or effects of performance optimizations. Development environments are no longer limited to IDEs in a desktop application or the like but span the Internet using live programming environments such as Cloud9 or large-volume repositories such as BitBucket, GitHub, GitLab, or StackOverflow. Software development has become “live” in the cloud, be it the coding, the testing, or the experimentation with different product options on the Internet. The inherent complexity puts a further burden on developers, since they need to stay alert when constantly switching between tasks in different phases. Research has been analyzing the development process, its data and stakeholders, for decades and is working on various tools that can help developers in their daily tasks to improve the quality of their work and their productivity. In this chapter, we critically reflect on the challenges faced by developers in a typical release cycle, identify inherent problems of the individual phases, and present the current state of the research that can help overcome these issues.
An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models
Large language models (LLMs) have achieved significant success in interacting
with humans. However, recent studies have revealed that these models often
suffer from hallucinations, leading to overly confident but incorrect
judgments. This limits their application in the medical domain, where tasks
require the utmost accuracy. This paper introduces an automated evaluation
framework that assesses the practical capabilities of LLMs as virtual doctors
during multi-turn consultations. Consultation tasks are designed to require
LLMs to be aware of what they do not know, to inquire about missing medical
information from patients, and to ultimately make diagnoses. To evaluate the
performance of LLMs for these tasks, a benchmark is proposed by reformulating
medical multiple-choice questions from the United States Medical Licensing
Examinations (USMLE), and comprehensive evaluation metrics are developed and
evaluated on three constructed test sets. A medical consultation training set
is further constructed to improve the consultation ability of LLMs. The results
of the experiments show that fine-tuning with the training set can alleviate
hallucinations and improve LLMs' performance on the proposed benchmark.
Extensive experiments and ablation studies are conducted to validate the
effectiveness and robustness of the proposed framework.
Comment: 10 pages, 9 figures
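The consultation tasks described above can be sketched as a simple evaluation loop: the model must first ask about missing medical information and only then commit to a diagnosis, and the benchmark scores the final diagnoses. All names below (`ask_model`, `run_consultation`, the toy case) are hypothetical stand-ins, not the authors' actual framework or data:

```python
def ask_model(history):
    """Stand-in for an LLM call: ask about missing info, then diagnose."""
    asked = any(turn.startswith("Q:") for turn in history)
    if not asked:
        return "Q: Do you have a fever?"
    return "DIAGNOSIS: influenza"

def run_consultation(case, max_turns=5):
    """Run a multi-turn consultation and return the model's diagnosis."""
    history = [f"PATIENT: {case['complaint']}"]
    for _ in range(max_turns):
        reply = ask_model(history)
        if reply.startswith("DIAGNOSIS:"):
            return reply.split(":", 1)[1].strip()
        history.append(reply)
        history.append(f"PATIENT: {case['answers'].get(reply, 'No.')}")
    return None  # the model never committed to a diagnosis

# A single toy test case; a real benchmark would hold many reformulated
# USMLE-style questions with withheld findings.
cases = [
    {"complaint": "Cough and fatigue for three days.",
     "answers": {"Q: Do you have a fever?": "Yes, 38.5 C."},
     "gold": "influenza"},
]

correct = sum(run_consultation(c) == c["gold"] for c in cases)
accuracy = correct / len(cases)
print(f"accuracy = {accuracy:.2f}")
```

Replacing `ask_model` with a real LLM call, and `cases` with the constructed test sets, turns this loop into an automated multi-turn evaluation of the kind the paper proposes.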
Advanced Security Analysis for Emergent Software Platforms
Emergent software ecosystems, fueled by the advent of smartphones and Internet of Things (IoT) platforms, are increasingly sophisticated, deployed into highly dynamic environments, and facilitate interactions across heterogeneous domains. Accordingly, assessing their security is a pressing need, yet doing so requires high levels of scalability and reliability to handle the dynamism of such volatile ecosystems.
This dissertation seeks to enhance conventional security detection methods to cope with the emergent features of contemporary software ecosystems. In particular, it analyzes the security of Android and IoT ecosystems by developing rigorous vulnerability detection methods. A critical aspect of this work is the focus on detecting vulnerable and unsafe interactions between applications that share common components and devices. Contributions of this work include novel insights and methods for: (1) detecting vulnerable interactions between Android applications that leverage dynamic loading features for concealing the interactions; (2) identifying unsafe interactions between smart home applications by considering physical and cyber channels; (3) detecting malicious IoT applications that are developed to target numerous IoT devices; (4) detecting insecure patterns of emergent security APIs that are reused from open-source software. In all of the four research thrusts, we present thorough security analysis and extensive evaluations based on real-world applications. Our results demonstrate that the proposed detection mechanisms can efficiently and effectively detect vulnerabilities in contemporary software platforms.
Advisers: Hamid Bagheri and Qiben Yan
Modalities, Cohesion, and Information Flow
It is informally understood that the purpose of modal type constructors in
programming calculi is to control the flow of information between types. In
order to lend rigorous support to this idea, we study the category of
classified sets, a variant of a denotational semantics for information flow
proposed by Abadi et al. We use classified sets to prove multiple
noninterference theorems for modalities of a monadic and comonadic flavour. The
common machinery behind our theorems stems from the fact that classified
sets are a (weak) model of Lawvere's theory of axiomatic cohesion. In the
process, we show how cohesion can be used for reasoning about multi-modal
settings. This leads to the conclusion that cohesion is a particularly useful
setting for the study of both information flow and modalities in type
theory and programming languages at large.
Towards a Tool-based Development Methodology for Pervasive Computing Applications
Despite much progress, developing a pervasive computing application remains a
challenge because of a lack of conceptual frameworks and supporting tools. This
challenge involves coping with heterogeneous devices, overcoming the
intricacies of distributed systems technologies, working out an architecture
for the application, encoding it in a program, writing specific code to test
the application, and finally deploying it. This paper presents a design
language and a tool suite covering the development life-cycle of a pervasive
computing application. The design language allows one to define a taxonomy of
area-specific building-blocks, abstracting over their heterogeneity. This
language also includes a layer to define the architecture of an application,
following an architectural pattern commonly used in the pervasive computing
domain. Our underlying methodology assigns roles to the stakeholders, providing
separation of concerns. Our tool suite includes a compiler that takes design
artifacts written in our language as input and generates a programming
framework that supports the subsequent development stages, namely
implementation, testing, and deployment. Our methodology has been applied to a
wide spectrum of areas. Based on these experiments, we assess our approach
through three criteria: expressiveness, usability, and productivity.
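The idea of a taxonomy of area-specific building blocks that abstracts over device heterogeneity can be sketched as follows. The names (`MotionSensor`, `Light`, `lighting_controller`, the concrete devices) are illustrative assumptions, not the paper's design language; the sketch only shows the underlying principle that the application layer is written against abstract taxonomy entries, while concrete devices plug in underneath:

```python
from abc import ABC, abstractmethod

class MotionSensor(ABC):
    """Taxonomy entry: any device that reports motion, whatever its vendor."""
    @abstractmethod
    def motion_detected(self) -> bool: ...

class Light(ABC):
    """Taxonomy entry: any controllable light source."""
    @abstractmethod
    def switch(self, on: bool) -> None: ...

# Concrete devices implement taxonomy entries; the application never
# mentions them directly.
class InfraredSensor(MotionSensor):
    def __init__(self):
        self._motion = True  # pretend motion was just sensed
    def motion_detected(self):
        return self._motion

class DimmableBulb(Light):
    def __init__(self):
        self.on = False
    def switch(self, on):
        self.on = on

# Application layer, written only against the abstract taxonomy.
def lighting_controller(sensor: MotionSensor, light: Light) -> None:
    light.switch(sensor.motion_detected())

sensor, bulb = InfraredSensor(), DimmableBulb()
lighting_controller(sensor, bulb)
print(bulb.on)  # prints True: motion was detected, so the light went on
```

In the paper's approach, a compiler generates such a programming framework from the design artifacts, so that the architecture declared in the design language constrains the implementation, testing, and deployment stages.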