Identifying Cloned Navigational Patterns in Web Applications
Web Applications are subject to continuous and rapid evolution. Often programmers indiscriminately
duplicate Web pages without considering systematic development and maintenance methods. This practice
creates code clones that make Web Applications hard to maintain and reuse. We present an approach to
identify duplicated functionalities in Web Applications through cloned navigational pattern analysis.
Cloned patterns can be generalized in a reengineering process, thereby simplifying the structure and future
maintenance of the Web Applications. The proposed method first identifies pairs of cloned pages by
analyzing their similarity in structure, content, and scripting code. Two pages are considered clones if their
similarity is greater than a given threshold. Cloned pages are then grouped into clusters and the links
connecting pages of two clusters are grouped too. An interconnection metric has been defined on the links
between two clusters to express the effort required to reengineer them as well as to select the patterns of
interest. To further reduce the comprehension effort, we filter out links and nodes of the clustered
navigational schema that do not contribute to the identification of cloned navigational patterns. A tool
supporting the proposed approach has been developed and validated in a case study.
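The pipeline described in this abstract (pairwise similarity, a threshold, clusters of clones, and grouped links between clusters) can be sketched roughly as follows. This is a minimal illustration, not the authors' tool: the abstract does not say how similarity is computed or how the interconnection metric is defined, so the Jaccard-based `page_similarity`, the `0.8` threshold, the page dictionary layout, and the plain link count in `cluster_links` are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def page_similarity(p, q):
    """Assumed measure: mean Jaccard similarity over three aspects."""
    aspects = ("structure", "content", "script")
    return sum(jaccard(p[k], q[k]) for k in aspects) / len(aspects)


def clone_clusters(pages, threshold=0.8):
    """Group pages whose pairwise similarity exceeds the threshold."""
    parent = {name: name for name in pages}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(pages, 2):
        if page_similarity(pages[a], pages[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in pages:
        groups.setdefault(find(name), []).append(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


def cluster_links(links, clusters):
    """Count page-level links per ordered pair of distinct clusters.

    A plain count is a stand-in for the paper's interconnection metric,
    which the abstract does not define.
    """
    index = {page: i for i, group in enumerate(clusters) for page in group}
    counts = Counter()
    for src, dst in links:
        if index[src] != index[dst]:
            counts[(index[src], index[dst])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical pages: two near-identical listings and one unrelated page.
    listing = {
        "structure": ["html", "body", "table", "tr", "td"],
        "content": ["products", "price"],
        "script": ["sortTable"],
    }
    pages = {
        "list_a.html": listing,
        "list_b.html": dict(listing),
        "about.html": {
            "structure": ["html", "body", "p"],
            "content": ["company", "history"],
            "script": [],
        },
    }
    clusters = clone_clusters(pages)
    links = [("list_a.html", "about.html"), ("list_b.html", "about.html")]
    print(clusters)
    print(cluster_links(links, clusters))
```

Union-find keeps the grouping transitive: if page A clones B and B clones C, all three end up in one cluster even when A and C fall just under the threshold.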
An Approach and an Eclipse Based Environment for Enhancing the Navigation Structure of Web Sites
This paper presents an approach based on information retrieval and clustering techniques for automatically enhancing the navigation structure of a Web site to improve navigability. The approach augments the set of navigation links provided in each page of the site with a semantic navigation map, i.e., a set of links that enable navigation from a given page to other pages of the site showing similar or related content. The approach uses Latent Semantic Indexing to compute a dissimilarity measure between the pages of the site and a graph-theoretic clustering algorithm to group pages showing similar or related content according to the calculated dissimilarity measure. AJAX code is finally used to extend each Web page with an associated semantic navigation map. The paper also presents a prototype of a tool developed to support the approach and the results from a case study conducted to assess the validity and feasibility of the proposal.
Metric Selection and Metric Learning for Matching Tasks
A quarter of a century after the world-wide web was born, we have grown accustomed to having easy access to a wealth of data sets and open-source software. The value of these resources is restricted if they are not properly integrated and maintained. A lot of this work boils down to matching: finding existing records about entities and enriching them with information from a new data source. In the realm of code this means integrating new code snippets into a code base while avoiding duplication.
In this thesis, we address two such matching problems. First, we leverage the diverse and mature set of string similarity measures in an iterative semisupervised learning approach to string matching. It is designed to query a user to make a sequence of decisions on specific cases of string matching. We show that we can find almost optimal solutions after only a small amount of such input. The low labelling complexity of our algorithm is due to addressing the cold start problem that is inherent to Active Learning, both by ranking queries by variance before the arrival of enough supervision information and by a self-regulating mechanism that counteracts initial biases.
Second, we address the matching of code fragments for deduplication. Programming code is not only a tool, but also a resource that itself demands maintenance. Code duplication is a frequent problem arising especially from modern development practice. There are many reasons to detect and address code duplicates, for example to keep a clean and maintainable codebase. In such more complex data structures, string similarity measures are inadequate. In their stead, we study a modern supervised Metric Learning approach to model code similarity with Neural Networks. We find that in such a model representing the elementary tokens with a pretrained word embedding is the most important ingredient. Our results show both qualitatively (by visualization) that relatedness is modelled well by the embeddings and quantitatively (by ablation) that the encoded information is useful for the downstream matching task.
As a non-technical contribution, we unify the common challenges arising in supervised learning approaches to Record Matching, Code Clone Detection and generic Metric Learning tasks. We give a novel account of string similarity measures from a psychological standpoint and point out and document one longstanding naming conflict in string similarity measures. Finally, we point out the overlap of the latest research in Code Clone Detection with the field of Natural Language Processing.
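One way to picture the cold-start step from the string-matching part of this abstract is to score each candidate pair with several string similarity measures and ask the user about the pairs on which the measures disagree most. The sketch below rests on assumptions: the abstract does not say which quantity's variance is used, so taking the variance across measures is one possible reading, and the three measures and the helper names (`scores`, `rank_queries`) are illustrative choices rather than the thesis's algorithm.

```python
from difflib import SequenceMatcher
from statistics import pvariance


def levenshtein_similarity(a, b):
    """1 minus the edit distance, normalised by the longer string."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return 1 - prev[-1] / max(len(a), len(b))


def bigram_jaccard(a, b):
    """Jaccard similarity over the sets of character bigrams."""
    ga = {a[i:i + 2] for i in range(len(a) - 1)}
    gb = {b[i:i + 2] for i in range(len(b) - 1)}
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def matcher_ratio(a, b):
    """Similarity ratio from difflib's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


MEASURES = (levenshtein_similarity, bigram_jaccard, matcher_ratio)


def scores(a, b):
    """Score one candidate pair with every measure."""
    return [measure(a, b) for measure in MEASURES]


def rank_queries(pairs):
    """Order pairs so that those with the most disagreement come first."""
    return sorted(pairs, key=lambda p: pvariance(scores(*p)), reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate pairs for illustration.
    candidates = [("abc", "abc"), ("john smith", "smith john")]
    for a, b in rank_queries(candidates):
        print(a, "|", b, "|", [round(s, 3) for s in scores(a, b)])
```

Pairs on which every measure agrees carry little information for choosing among the measures, which is why they sink to the bottom of the query list in this reading.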
Determinants of anti-PD-1 response and resistance in clear cell renal cell carcinoma
ADAPTeR is a prospective, phase II study of nivolumab (anti-PD-1) in 15 treatment-naive patients (115 multiregion tumor samples) with metastatic clear cell renal cell carcinoma (ccRCC) aiming to understand the mechanism underpinning therapeutic response. Genomic analyses show no correlation between tumor molecular features and response, whereas ccRCC-specific human endogenous retrovirus expression indirectly correlates with clinical response. T cell receptor (TCR) analysis reveals a significantly higher number of expanded TCR clones pre-treatment in responders, suggesting pre-existing immunity. Maintenance of highly similar clusters of TCRs post-treatment predicts response, suggesting ongoing antigen engagement and survival of families of T cells likely recognizing the same antigens. In responders, nivolumab-bound CD8+ T cells are expanded and express GZMK/B. Our data suggest nivolumab drives both maintenance and replacement of previously expanded T cell clones, but only maintenance correlates with response. We hypothesize that maintenance and boosting of a pre-existing response is a key element of anti-PD-1 mode of action.
Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution
The standard approach to analyzing 16S tag sequence data, which relies on
clustering reads by sequence similarity into Operational Taxonomic Units
(OTUs), underexploits the accuracy of modern sequencing technology. We present
a clustering-free approach to multi-sample Illumina datasets that can identify
independent bacterial subpopulations regardless of the similarity of their 16S
tag sequences. Using published data from a longitudinal time-series study of
human tongue microbiota, we are able to resolve within standard 97% similarity
OTUs up to 20 distinct subpopulations, all ecologically distinct but with 16S
tags differing by as little as 1 nucleotide (99.2% similarity). A comparative
analysis of oral communities of two cohabiting individuals reveals that most
such subpopulations are shared between the two communities at 100% sequence
identity, and that dynamical similarity between subpopulations in one host is
strongly predictive of dynamical similarity between the same subpopulations in
the other host. Our method can also be applied to samples collected in
cross-sectional studies and can be used with the 454 sequencing platform. We
discuss how the sub-OTU resolution of our approach can provide new insight into
factors shaping community assembly.
Comment: Updated to match the published version. 12 pages, 5 figures +
supplement. Significantly revised for clarity, references added, results not
change
Find Unique Usages: Helping Developers Understand Common Usages
When working in large and complex codebases, developers face challenges using
Find Usages to understand how to reuse classes and methods. To better
understand these challenges, we conducted a small exploratory study with 4
participants. We found that developers often wasted time reading long lists of
similar usages or prematurely focused on a single usage. Based on these
findings, we hypothesized that clustering usages by the similarity of their
surrounding context might enable developers to more rapidly understand how to
use a function. To explore this idea, we designed and implemented Find
Unique Usages, which extracts usages, computes a diff between pairs of usages,
generates similarity scores, and uses these scores to form usage clusters. To
evaluate this approach, we conducted a controlled experiment with 12
participants. We found that developers with Find Unique Usages were
significantly faster, completing their task in 35% less time.
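The steps named in this abstract (diff pairs of usages, turn the diff into a similarity score, form clusters) might look roughly like the sketch below. It is an illustration under assumptions, not the Find Unique Usages implementation: the abstract does not specify the diff algorithm or the clustering rule, so `difflib.SequenceMatcher` on whitespace-separated tokens, the `0.6` threshold, and the greedy first-fit grouping are all choices made for the example.

```python
from difflib import SequenceMatcher


def usage_similarity(a, b):
    """Token-level diff ratio between two usage contexts (0.0 to 1.0)."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def cluster_usages(usages, threshold=0.6):
    """Greedy clustering: join the first cluster whose representative
    (its first member) is similar enough, otherwise start a new one."""
    clusters = []
    for usage in usages:
        for cluster in clusters:
            if usage_similarity(cluster[0], usage) >= threshold:
                cluster.append(usage)
                break
        else:
            clusters.append([usage])
    return clusters


if __name__ == "__main__":
    # Hypothetical call sites of an imaginary open_file function.
    usages = [
        "reader = open_file(path) ; data = reader.read()",
        "reader = open_file(name) ; data = reader.read()",
        "with open_file(path) as f : process(f)",
    ]
    for i, cluster in enumerate(cluster_usages(usages), 1):
        print(f"cluster {i}: {len(cluster)} usage(s), e.g. {cluster[0]}")
```

Showing one representative per cluster instead of the full list is the point of the grouping: the two near-identical reads collapse into a single entry, and the context-manager usage stands out as a distinct pattern.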
Common bacterial responses in six ecosystems exposed to 10 years of elevated atmospheric carbon dioxide
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/91149/1/j.1462-2920.2011.02695.x.pd