Detecting and Characterizing Propagation of Security Weaknesses in Puppet-based Infrastructure Management
Despite being beneficial for managing computing infrastructure automatically,
Puppet manifests are susceptible to security weaknesses, e.g., hard-coded
secrets and use of weak cryptography algorithms. Adequate mitigation of
security weaknesses in Puppet manifests is thus necessary to secure computing
infrastructure that is managed with Puppet manifests. A characterization of
how security weaknesses propagate and affect Puppet-based infrastructure
management can inform practitioners about the relevance of the detected security
weaknesses, as well as help them take necessary actions for mitigation. To that
end, we conduct an empirical study with 17,629 Puppet manifests mined from 336
open source repositories. We construct Taint Tracker for Puppet Manifests
(TaintPup), for which we observe 2.4 times higher precision than that of a
state-of-the-art security static analysis tool. TaintPup leverages
Puppet-specific information flow analysis using which we characterize
propagation of security weaknesses. From our empirical study, we observe
security weaknesses to propagate into 4,457 resources, i.e., Puppet-specific
code elements used to manage infrastructure. A single instance of a security
weakness can propagate into as many as 35 distinct resources. We observe
security weaknesses to propagate into 7 categories of resources, which include
resources used to manage continuous integration servers and network
controllers. According to our survey with 24 practitioners, propagation of
security weaknesses into data storage-related resources is rated to have the
most severe impact for Puppet-based infrastructure management.
Comment: 14 pages, currently under review
Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities
Large Language Models (LLMs) are being increasingly employed in data science
for tasks like data preprocessing and analytics. However, data scientists
encounter substantial obstacles when conversing with LLM-powered chatbots and
acting on their suggestions and answers. We conducted a mixed-methods study,
including contextual observations, semi-structured interviews (n=14), and a
survey (n=114), to identify these challenges. Our findings highlight key issues
faced by data scientists, including contextual data retrieval, formulating
prompts for complex tasks, adapting generated code to local environments, and
refining prompts iteratively. Based on these insights, we propose actionable
design recommendations, such as data brushing to support context selection, and
inquisitive feedback loops to improve communications with AI-based assistants
in data-science tools.
Comment: 24 pages, 8 figures
Correlates of programmer efficacy and their link to experience: a combined EEG and eye-tracking study
Background: Despite similar education and background, programmers can exhibit vast differences in efficacy. While research has
identified some potential factors, such as programming experience
and domain knowledge, the effect of these factors on programmers’
efficacy is not well understood.
Aims: We aim at unraveling the relationship between efficacy
(speed and correctness) and measures of programming experience.
We further investigate the correlates of programmer efficacy in
terms of reading behavior and cognitive load.
Method: For this purpose, we conducted a controlled experiment
with 37 participants using electroencephalography (EEG) and eye
tracking. We asked participants to comprehend up to 32 Java source-code snippets and observed their eye gaze and neural correlates of
cognitive load. We analyzed the correlation of participants’ efficacy
with popular programming experience measures.
Results: We found that programmers with high efficacy read source
code in a more targeted way and with lower cognitive load. Commonly used
experience levels do not predict programmer efficacy well, but self-estimation and indicators of learning eagerness are fairly accurate.
Implications: The identified correlates of programmer efficacy
can be used for future research and practice (e.g., hiring). Future
research should also consider efficacy as a group sampling method,
rather than using simple experience measures.
Continuous Deployment Transitions at Scale
Predictable, rapid, data-driven feature rollout and lightning-fast, automated fix deployment are some of the benefits most large software organizations worldwide are striving for. In the process, they are transitioning toward the use of continuous deployment practices. Continuous deployment enables companies to make hundreds or thousands of software changes to live computing infrastructure every day while maintaining service to millions of customers. Such ultra-fast changes create a new reality in software development. Over the past four years, the Continuous Deployment Summit has been held at Facebook, Netflix, Google, and Twitter. Representatives from companies like Cisco, Facebook, Google, IBM, Microsoft, Netflix, and Twitter have shared the triumphs and struggles of their transition to continuous deployment practices; each year the companies press on, getting ever faster. In this chapter, the authors share the common strategies and practices used by continuous deployment pioneers and adopted by newcomers as they transition to and use continuous deployment practices at scale.