Search CORE

279 research outputs found

Is Stack Overflow Overflowing With Questions and Tags

Author: K. Ranjitha R.
Singh Sanjay
Publication venue
Publication date: 01/01/2015
Field of study

Programming question and answer (Q & A) websites, such as Quora, Stack Overflow, and Yahoo! Answer etc. helps us to understand the programming concepts easily and quickly in a way that has been tested and applied by many software developers. Stack Overflow is one of the most frequently used programming Q\&A website where the questions and answers posted are presently analyzed manually, which requires a huge amount of time and resource. To save the effort, we present a topic modeling based technique to analyze the words of the original texts to discover the themes that run through them. We also propose a method to automate the process of reviewing the quality of questions on Stack Overflow dataset in order to avoid ballooning the stack overflow with insignificant questions. The proposed method also recommends the appropriate tags for the new post, which averts the creation of unnecessary tags on Stack Overflow.Comment: 11 pages, 7 figures, 3 tables Presented at Third International Symposium on Women in Computing and Informatics (WCI-2015

arXiv.org e-Print Archive

Crossref

IPhone Securtity Analysis

Author: Pandya Vaibhav Ranchhoddas
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2008
Field of study

The release of Apple’s iPhone was one of the most intensively publicized product releases in the history of mobile devices. While the iPhone wowed users with its exciting design and features, it also outraged many for not allowing installation of third party applications and for working exclusively with AT&T wireless services for the first two years. Software attacks have been developed to get around both limitations. The development of those attacks and further evaluation revealed several vulnerabilities in iPhone security. In this paper, we examine several of the attacks developed for the iPhone as a way of investigating the iPhone’s security structure. We also analyze the security holes that have been discovered and make suggestions for improving iPhone security

SJSU ScholarWorks

Asking Questions is Easy, Asking Great Questions is Hard: Constructing Effective Stack Overflow Questions

Author: Hsieh Jane W.
Publication venue: Digital Commons at Oberlin
Publication date: 01/01/2020
Field of study

This paper explores and seeks to improve the ways in which Stack Overflow question posts can elicit answers. Using statistical data analysis approaches and reviews of existing literature, we pin- point three key factors that are found in many previously success- ful/answerable questions. We then present a prototypical sidebar for the ask page that leverages these factors to dynamically (1) evaluate the quality of questions in construction (2) display answer previews of relevant questions and (3) scaffold the identified factors to subsequent askers during their question development processes

Digital Commons at Oberlin (Oberlin College)

Modeling Tag Prediction based on Question Tagging Behavior Analysis of CommunityQA Platform Users

Author: Chandrasekaran Nirupama
Cucerzan Silviu
Gamon Michael
Pal Kuntal Kumar
Publication venue
Publication date: 03/07/2023
Field of study

In community question-answering platforms, tags play essential roles in effective information organization and retrieval, better question routing, faster response to questions, and assessment of topic popularity. Hence, automatic assistance for predicting and suggesting tags for posts is of high utility to users of such platforms. To develop better tag prediction across diverse communities and domains, we performed a thorough analysis of users' tagging behavior in 17 StackExchange communities. We found various common inherent properties of this behavior in those diverse domains. We used the findings to develop a flexible neural tag prediction architecture, which predicts both popular tags and more granular tags for each question. Our extensive experiments and obtained performance show the effectiveness of our modelComment: 20 page

arXiv.org e-Print Archive

Fine-grained reasoning about the security and usability trade-off in modern security tools

Author: Al-Saleh Mohammed I
Publication venue: UNM Digital Repository
Publication date: 01/07/2011
Field of study

Defense techniques detect or prevent attacks based on their ability to model the attacks. A balance between security and usability should always be established in any kind of defense technique. Attacks that exploit the weak points in security tools are very powerful and thus can go undetected. One source of those weak points in security tools comes when security is compromised for usability reasons, where if a security tool completely secures a system against attacks the whole system will not be usable because of the large false alarms or the very restricted policies it will create, or if the security tool decides not to secure a system against certain attacks, those attacks will simply and easily succeed. The key contribution of this dissertation is that it digs deeply into modern security tools and reasons about the inherent security and usability trade-offs based on identifying the low-level, contributing factors to known issues. This is accomplished by implementing full systems and then testing those systems in realistic scenarios. The thesis that this dissertation tests is that we can reason about security and usability trade-offs in fine-grained ways by building and testing full systems. Furthermore, this dissertation provides practical solutions and suggestions to reach a good balance between security and usability. We study two modern security tools, Dynamic Information Flow Tracking (DIFT) and Antivirus (AV) software, for their importance and wide usage. DIFT is a powerful technique that is used in various aspects of security systems. It works by tagging certain inputs and propagating the tags along with the inputs in the target system. However, current DIFT systems do not track implicit information flow because if all DIFT propagation rules are directly applied in a conservative way, the target system will be full of tagged data (a problem called overtagging) and thus useless because the tags tell us very little about the actual information flow of the system. So, current DIFT systems drop some security for usability. In this dissertation, we reason about the sources of the overtagging problem and provide practical ways to deal with it, while previous approaches have focused on abstract descriptions of the main causes of the problem based on limited experiments. The second security tool we consider in this dissertation is antivirus (AV) software. AV is a very important tool that protects systems against worms and viruses by scanning data against a database of signatures. Despite its importance and wide usage, AV has received little attention from the security research community. In this dissertation, we examine the AV internals and reason about the possibility of creating timing channel attacks against AV software. The attacker could infer information about the AV based only on the scanning time the AV spends to scan benign inputs. The other aspect of AV this dissertation explores is the low-level AV performance impact on systems. Even though the performance overhead of AV is a well known issue, the exact reasons behind this overhead are not well-studied. In this dissertation, we design a methodology that utilizes Event Tracing for Windows technology (ETW), a technology that accounts for all OS events, to reason about AV performance impact from the OS point of view. We show that the main performance impact of the AV on a task is the longer waiting time the task spends waiting on events

The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation

Author: Kairouz Peter
Liu Ziyu
Steinke Thomas
Publication venue
Publication date: 12/02/2021
Field of study

We consider training models on private data that is distributed across user devices. To ensure privacy, we add on-device noise and use secure aggregation so that only the noisy sum is revealed to the server. We present a comprehensive end-to-end system, which appropriately discretizes the data and adds discrete Gaussian noise before performing secure aggregation. We provide a novel privacy analysis for sums of discrete Gaussians. We also analyze the effect of rounding the input data and the modular summation arithmetic. Our theoretical guarantees highlight the complex tension between communication, privacy, and accuracy. Our extensive experimental results demonstrate that our solution is essentially able to achieve a comparable accuracy to central differential privacy with 16 bits of precision per value

arXiv.org e-Print Archive

Simplifying Embedded System Development Through Whole-Program Compilers

Author: McCartney William Patrick
Publication venue: EngagedScholarship@CSU
Publication date: 01/01/2011
Field of study

As embedded systems embrace ever more complicated microcontrollers, they present both new capability and new complexity. To simplify their development, some lessons of computer application development will translate with additional work. This thesis offers one such translation. It shows how whole-program compilers - those that broadly analyze a program\u27s entire source code - can achieve performance gains and remove faults in embedded system applications. In so doing, this yields a novel stackless threading system named UnStacked C. UnStacked C enables cooperative multithreading without the risk of stack overflows in embedded system applications. We also propose a novel preemption system called Lazy Preemption. Unstacked C with Lazy Preemption enables stackless preemptive multithreading in embedded systems. These remove the possibility of thread stack overflows, but also significantly reduces the memory required for multithreading in embedded system

Cleveland-Marshall College of Law

Recommended from our members

Scalable hardware memory disambiguation

Author: Sethumadhavan Lakshminarasimhan, 1978-
Publication venue
Publication date: 01/12/2007
Field of study

This dissertation deals with one of the long-standing problems in Computer Architecture – the problem of memory disambiguation. Microprocessors typically reorder memory instructions during execution to improve concurrency. Such microprocessors use hardware memory structures for memory disambiguation, known as LoadStore Queues (LSQs), to ensure that memory instruction dependences are satisfied even when the memory instructions execute out-of-order. A typical LSQ implementation (circa 2006) holds all in-flight memory instructions in a physically centralized LSQ and performs a fully associative search on all buffered instructions to ensure that memory dependences are satisfied. These LSQ implementations do not scale because they use large, fully associative structures, which are known to be slow and power hungry. The increasing trend towards distributed microarchitectures further exacerbates these problems. As on-chip wire delays increase and high-performance processors become necessarily distributed, centralized structures such as the LSQ can limit scalability. This dissertation describes techniques to create scalable LSQs in both centralized and distributed microarchitectures. The problems and solutions described in this thesis are motivated and validated by real system designs. The dissertation starts with a description of the partitioned primary memory system of the TRIPS processor, of which the LSQ is an important component, and then through a series of optimizations describes how the power, area, and centralization problems of the LSQ can be solved with minor performance losses (if at all) even for large number of in flight memory instructions. The four solutions described in this dissertation — partitioning, filtering, late binding and efficient overflow management — enable power-, area-efficient, distributed and scalable LSQs, which in turn enable aggressive large-window processors capable of simultaneously executing thousands of instructions. To mitigate the power problem, we replaced the power-hungry, fully associative search with a power-efficient hash table lookup using a simple address-based Bloom filter. Bloom filters are probabilistic data structures used for testing set membership and can be used to quickly check if an instruction with the same data address is likely to be found in the LSQ without performing the associative search. Bloom filters typically eliminate more than 80% of the associative searches and they are highly effective because in most programs, it is uncommon for loads and stores to have the same data address and be in execution simultaneously. To rectify the area problem, we observe the fact that only a small fraction of all memory instructions are dependent, that only such dependent instructions need to be buffered in the LSQ, and that these instructions need to be in the LSQ only for certain parts of the pipelined execution. We propose two mechanisms to exploit these observations. The first mechanism, area filtering, is a hardware mechanism that couples Bloom filters and dependence predictors to dynamically identify and buffer only those instructions which are likely to be dependent. The second mechanism, late binding, reduces the occupancy and hence size of the LSQ. Both of these optimizations allows the number of LSQ slots to be reduced by up to one-half compared to a traditional organization without any performance degradation. Finally, we describe a new decentralized LSQ design for handling LSQ structural hazards in distributed microarchitectures. Decentralization of LSQs, and to a large extent distributed microarchitectures with memory speculation, has proved to be impractical because of the high performance penalties associated with the mechanisms for dealing with hazards. To solve this problem, we applied classic flow-control techniques from interconnection networks for handling resource con- flicts. The first method, memory-side buffering, buffers the overflowing instructions in a separate buffer near the LSQs. The second scheme, execution-side NACKing, sends the overflowing instruction back to the issue window from which it is later re-issued. The third scheme, network buffering, uses the buffers in the interconnection network between the execution units and memory to hold instructions when the LSQ is full, and uses virtual channel flow control to avoid deadlocks. The network buffering scheme is the most robust of all the overflow schemes and shows less than 1% performance degradation due to overflows for a subset of SPEC CPU 2000 and EEMBC benchmarks on a cycle-accurate simulator that closely models the TRIPS processor. The techniques proposed in this dissertation are independent, architectureneutral and their cumulative benefits result in LSQs that can be partitioned at a fine granularity and have low design complexity. Each of these partitions selectively buffers only memory instructions with true dependences and can be closely coupled with the execution units thus minimizing power, area, and latency. Such LSQ designs with near-ideal characteristics are well suited for microarchitectures with thousands of instructions in-flight and may enable even more aggressive microarchitectures in the future.Computer Science

Texas ScholarWorks

Predictions to Ease Users' Effort in Scalable Sharing

Author: Bartel Jacob
Publication venue: University of North Carolina at Chapel Hill Graduate School
Publication date: 01/01/2015
Field of study

Significant user effort is required to choose recipients of shared information, which grows as the scale of the number of potential or target recipients increases. It is our thesis that it is possible to develop new approaches to predict persistent named groups, ephemeral groups, and response times that will reduce user effort. We predict persistent named groups using the insight that implicit social graphs inferred from messages can be composed with existing prediction techniques designed for explicit social graphs, thereby demonstrating similar grouping patterns in email and communities. However, this approach still requires that users know when to generate such predictions. We predict group creation times based on the intuition that bursts of change in the social graph likely signal named group creation. While these recommendations can help create new groups, they do not update existing ones. We predict how existing named groups should evolve based on the insight that the growth rates of named groups and the underlying social graph will match. When appropriate named groups do not exist, it is useful to predict ephemeral groups of information recipients. We have developed an approach to make hierarchical recipient recommendations that groups the elements in a flat list of recommended recipients, and thus is composable with existing flat recipient-recommendation techniques. It is based on the insight that groups of recipients in past messages can be organized in a tree. To help users select among alternative sets of recipients, we have made predictions about the scale of response time of shared information, based on the insights that messages addressed to similar recipients or containing similar titles will yield similar response times. Our prediction approaches have been applied to three specific systems - email, Usenet and Stack Overflow - based on the insight that email recipients correspond to Stack Overflow tags and Usenet newsgroups. We evaluated these approaches with actual user data using new metrics for measuring the differences in scale between predicted and actual response times and measuring the costs of eliminating spurious named-group predictions, editing named-group recommendations for use in future messages, scanning and selecting hierarchical ephemeral group-recommendations, and manually entering recipients.Doctor of Philosoph

Carolina Digital Repository