DNA ANALYSIS USING GRAMMATICAL INFERENCE
An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance to the field of computational biology. The method proposed here uses positive sample grammatical inference and statistical information to infer languages for coding DNA.
An algorithm is proposed for searching for an optimal subset of input sequences for the inference of regular grammars, optimizing a relevant accuracy metric. The algorithm does not guarantee finding the optimal subset; however, testing shows improvements in accuracy and performance over the baseline algorithm.
Testing shows that languages inferred for components of DNA are consistently accurate. Using the proposed algorithm, languages are inferred for coding DNA with an average conditional probability above 80%. This shows that languages for components of DNA can be inferred and are useful independently of the process that created them. These languages can then be analyzed or used for other tasks in computational biology.
To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as post-processing to Hidden Markov model exon prediction, reducing the number of wrong exons detected and significantly improving the specificity of the model.
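The abstract does not spell out the inference procedure, but positive-sample regular inference typically starts from a prefix tree acceptor (PTA), which state-merging learners such as RPNI then generalize. A minimal sketch, using hypothetical coding-DNA fragments rather than the paper's data:

```python
# Minimal sketch: build a prefix tree acceptor (PTA) from positive samples.
# The PTA accepts exactly the training strings; state-merging learners
# (e.g. RPNI) would generalize it further. Sample sequences are hypothetical.

def build_pta(samples):
    """Return (transitions, accepting) for a prefix tree acceptor."""
    transitions = {}            # (state, symbol) -> state
    accepting = set()
    next_state = 1              # state 0 is the root
    for seq in samples:
        state = 0
        for symbol in seq:
            if (state, symbol) not in transitions:
                transitions[(state, symbol)] = next_state
                next_state += 1
            state = transitions[(state, symbol)]
        accepting.add(state)
    return transitions, accepting

def accepts(transitions, accepting, seq):
    """Follow the transition path for seq; accept only at a marked state."""
    state = 0
    for symbol in seq:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in accepting

coding_samples = ["ATGGCC", "ATGGTT", "ATGAAA"]   # hypothetical exon fragments
trans, acc = build_pta(coding_samples)
print(accepts(trans, acc, "ATGGCC"))   # True: seen in training
print(accepts(trans, acc, "TTTTTT"))   # False: no matching path
```

Because the PTA accepts only the training set, the statistical information the paper mentions would govern which states a learner may merge when generalizing.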
The effectiveness of refactoring, based on a compatibility testing taxonomy and a dependency graph
In this paper, we describe and then appraise a testing taxonomy proposed by van Deursen and Moonen (VD&M) based on the post-refactoring repeatability of tests. Four categories of refactoring are identified by VD&M, ranging from semantic-preserving to incompatible, where, for the former, no new tests are required and, for the latter, a completely new test set has to be developed. In our appraisal of the taxonomy, we heavily stress the need for the inter-dependence of the refactoring categories to be considered when making refactoring decisions, and we base that need on a refactoring dependency graph developed as part of the research. We demonstrate that while incompatible refactorings may be harmful and time-consuming from a testing perspective, semantic-preserving refactorings can have equally unpleasant hidden ramifications despite their advantages. In fact, refactorings which fall into neither category have the most interesting properties. We support our results with empirical refactoring data drawn from seven Java open-source systems (OSS) and, from the same analysis, form a tentative categorization of code smells.
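The refactoring dependency graph at the heart of the appraisal can be pictured as a directed graph whose nodes carry their VD&M testing category; dependencies that cross category boundaries are exactly the cases where inter-dependence matters for refactoring decisions. A minimal sketch, with illustrative refactorings and category assignments (not the paper's data):

```python
# Sketch: a refactoring dependency graph (edge A -> B means "A depends on B"),
# with each node tagged by its VD&M testing category. Names and categories
# below are illustrative, not drawn from the paper.

deps = {
    "ExtractMethod": ["RenameMethod"],
    "RenameMethod": [],
    "MoveMethod": ["ExtractMethod"],
}
category = {
    "ExtractMethod": "semantic-preserving",
    "RenameMethod": "semantic-preserving",
    "MoveMethod": "incompatible",
}

def cross_category_edges(deps, category):
    """Dependencies whose endpoints fall in different testing categories --
    the cases where a refactoring decision must consider its neighbours."""
    return [(a, b) for a, succs in deps.items()
            for b in succs if category[a] != category[b]]

print(cross_category_edges(deps, category))   # [('MoveMethod', 'ExtractMethod')]
```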
Compatible Remediation on Vulnerabilities from Third-Party Libraries for Java Projects
With the increasing disclosure of vulnerabilities in open-source software, software composition analysis (SCA) has been widely applied to reveal third-party libraries and the associated vulnerabilities in software projects. Beyond this revelation, SCA tools adopt various remediation strategies to fix vulnerabilities, the quality of which varies substantially. However, ineffective remediation can induce side effects, such as compilation failures, which impede acceptance by users. According to our studies, existing SCA tools do not correctly handle users' concerns about the compatibility of remediated projects. To this end, we propose Compatible Remediation of Third-party libraries (CORAL) for Maven projects to fix vulnerabilities without breaking the projects. The evaluation showed that CORAL not only fixed 87.56% of vulnerabilities, outperforming other tools (best: 75.32%), but also achieved a 98.67% successful compilation rate and a 92.96% successful unit test rate. Furthermore, we found that 78.45% of vulnerabilities in popular Maven projects could be fixed without breaking the compilation, while the rest (21.55%) could either be fixed only by upgrades that break the compilation or could not be fixed by upgrading at all.
Comment: 11 pages, conference
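CORAL's algorithm is not detailed in the abstract; one common compatibility-first heuristic, sketched below under a semantic-versioning assumption, is to prefer the smallest fixed version that stays within the current major version before resorting to a major-version jump (which is more likely to break compilation):

```python
# Sketch of a compatibility-first remediation heuristic (not CORAL's actual
# algorithm): among versions that fix the vulnerability, prefer the smallest
# upgrade within the current major version, falling back to the smallest
# fixed version otherwise. Version strings are hypothetical.

def pick_upgrade(current, fixed_versions):
    """Choose a fix version, favouring same-major (likely compatible) upgrades."""
    cur = tuple(map(int, current.split(".")))
    fixed = sorted(tuple(map(int, v.split("."))) for v in fixed_versions)
    same_major = [v for v in fixed if v > cur and v[0] == cur[0]]
    chosen = same_major[0] if same_major else fixed[0]
    return ".".join(map(str, chosen))

print(pick_upgrade("2.9.1", ["2.9.3", "3.0.0"]))   # 2.9.3: same major, minimal
print(pick_upgrade("2.9.1", ["3.0.0", "3.1.0"]))   # 3.0.0: no same-major fix exists
```

The second case corresponds to the 21.55% of vulnerabilities the abstract mentions, where every available fix crosses a major-version boundary.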
Corpus Annotation for Parser Evaluation
We describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.
Comment: 7 pages, LaTeX (uses eaclap.sty)
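The kind of parser evaluation such a scheme supports can be sketched as set comparison over (relation, head, dependent) triples, scoring parser output against the gold annotation; the relation labels below are illustrative, not the scheme's actual inventory:

```python
# Sketch: scoring parser output against gold grammatical-relation (GR)
# annotations, each relation a (type, head, dependent) triple.
# The sentence and relation labels are illustrative.

gold   = {("subj", "barked", "dog"), ("det", "dog", "the")}
parsed = {("subj", "barked", "dog"), ("obj", "barked", "the")}

correct = gold & parsed                    # triples the parser got exactly right
precision = len(correct) / len(parsed)     # fraction of output that is correct
recall = len(correct) / len(gold)          # fraction of gold that was found
print(f"precision={precision:.2f} recall={recall:.2f}")   # precision=0.50 recall=0.50
```

Matching on whole triples means a relation with the right words but the wrong label (here, `obj` for `det`) counts against both scores.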
The 3G standard-setting strategy and indigenous innovation policy in China: is TD-SCDMA a flagship?
In the era of the “network economy”, industry and the public have witnessed several “battles for dominance” between two or more rival technologies, often involving well-known firms operating in highly visible industries. In this paper, we focus on the Chinese self-developed standard TD-SCDMA to examine the implications and targets of the nation’s policy and strategy. The motivation for the research starts from an interesting fact we observed: TD-SCDMA is billed as a Chinese-made standard, yet the share of core patented technology held by Chinese firms is still only about 7%, with most of the remainder held by foreign companies. The gap between this small share and the dream of a self-made standard implies a carefully plotted strategy. To understand it, we first ask why the Chinese government postponed its 3G decision again and again. We then probe why the standard-setting of TD-SCDMA attracted wide attention as a strategic tool for “indigenous innovation” and finally became part of national science and technology policy to increase international competitiveness. We use economic theory to understand the essence of the creation of TD-SCDMA and its relation to China’s interests.
Keywords: 3G, standard, innovation, China
COMPATIBILITY TESTING FOR COMPONENT-BASED SYSTEMS
Many component-based systems are deployed in diverse environments, each with different components and with different component versions. To ensure the system builds correctly for all deployable combinations (or configurations), developers often perform compatibility testing by building their systems on various configurations. However, due to the large number of possible configurations, testing all configurations is often infeasible, and in practice, only a handful of popular configurations are tested; as a result, errors can escape to the field. This problem is compounded when components evolve over time and when test resources are limited.
To address these problems, in this dissertation I introduce a process, algorithms and a tool called Rachet. First, I describe a formal modeling scheme for capturing the system configuration space, and a sampling criterion that determines the portion of the space to test. I describe an algorithm to sample configurations satisfying the sampling criterion and methods to test the sampled configurations.
Second, I present an approach that incrementally tests compatibility between components, so as to accommodate component evolution. I describe methods to compute test obligations, and algorithms to produce configurations that test the obligations, attempting to reuse test artifacts.
Third, I present an approach that prioritizes and tests configurations based on developers' preferences. Configurations are tested, by default starting from the most preferred one as requested by a developer, but cost-related factors are also considered to reduce overall testing time.
The testing approaches presented are applied to two large-scale systems in the high-performance computing domain, and experimental results show that the approaches can (1) identify compatibility between components effectively and efficiently, (2) make the process of compatibility testing more practical under constant component evolution, and (3) help developers achieve preferred compatibility results early in the overall testing process when time and resources are limited.
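The sampling idea in the first contribution can be illustrated with a greedy pairwise-coverage criterion over a toy configuration space: pick configurations until every pair of component versions has been built together at least once. Rachet's actual model and criterion are richer, and the components below are hypothetical:

```python
# Sketch: greedy pairwise sampling of a component-configuration space --
# choose configurations until every pair of component versions is covered.
# Components and versions are hypothetical; Rachet's criterion is richer.

from itertools import combinations, product

space = {"compilerA": ["1.0", "2.0"], "libB": ["0.9", "1.1"], "mpiC": ["3.1", "3.2"]}
names = list(space)
all_configs = [dict(zip(names, vs)) for vs in product(*(space[n] for n in names))]

# Every (component, version) pair across distinct components must be tested.
pending = {((a, va), (b, vb))
           for a, b in combinations(names, 2)
           for va, vb in product(space[a], space[b])}
total_pairs = len(pending)

def covered(config, pairs):
    """Pairs from `pairs` that this configuration exercises."""
    return {((a, va), (b, vb)) for (a, va), (b, vb) in pairs
            if config[a] == va and config[b] == vb}

sample = []
while pending:                  # greedy: take the config covering most new pairs
    best = max(all_configs, key=lambda c: len(covered(c, pending)))
    sample.append(best)
    pending -= covered(best, pending)

print(f"{len(sample)} of {len(all_configs)} configurations cover all {total_pairs} pairs")
```

Even on this toy space, half of the configurations suffice, and the gap widens rapidly as the number of components and versions grows.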