PACE: Pattern Accurate Computationally Efficient Bootstrapping for Timely Discovery of Cyber-Security Concepts

Author: Bridges Robert A.
Czejdo Bogdan
Goodall John R.
Iannacone Michael D.
McNeil Nikki
Perez Nicolas
Publication date: 11/10/2013

Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources months before proper classification into structured databases. In order to facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction by employing a time-memory trade-off that simultaneously circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. An implementation in the cyber-security domain is discussed as well as challenges to Natural Language Processing imposed by the security domain.
Comment: 6 pages, 3 figures, ieeeTran conference. International Conference on Machine Learning and Applications 201
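As a rough illustration of the bootstrapping idea described in the abstract above, the sketch below builds a context index once and then reuses it in every round instead of rescanning the corpus. This is not the PACE algorithm itself: the one-word left/right context, the function names `build_context_index` and `bootstrap`, and the `min_precision` threshold are all assumptions made for this example.

```python
from collections import defaultdict


def build_context_index(sentences):
    """Map each (left word, right word) context to the tokens seen inside it.

    Built once up front (memory) so later rounds need no corpus search (time).
    """
    index = defaultdict(set)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            index[(tokens[i - 1], tokens[i + 1])].add(tokens[i])
    return index


def bootstrap(sentences, seeds, rounds=2, min_precision=0.5):
    """Grow an entity set from seeds using patterns that mostly match known entities."""
    index = build_context_index(sentences)
    entities = set(seeds)
    for _ in range(rounds):
        nominated = set()
        for fillers in index.values():
            hits = fillers & entities
            # Hypothetical scoring rule: share of a pattern's fillers already known.
            if hits and len(hits) / len(fillers) >= min_precision:
                nominated |= fillers
        if nominated <= entities:
            break  # nothing new was found, so stop early
        entities |= nominated
    return entities
```

Calling `bootstrap(corpus, {"openssl"})` on a small list of sentences returns the seed plus any token that shares a sufficiently reliable context with it.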

arXiv.org e-Print Archive

Crossref

Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets

Author: Akaike
Allan Tucker
Beissbarth
Conlon
Courcelle
DerSimonian
Eisen
Emma Steele
Faith
Friedman
Gasch
Grigull
Hanley
Hartemink
Jarvinen
Khil
Kuo
Matzkevich
Ng
Pearl
Pearl
Pennock
Pe’er
Pe’er
Quillardet
Salgado
Sangurdekar
Smyth
Soinov
Spellman
Stoica
Sutton
Teixeira
Wang
Yauk
Publication venue: 'Elsevier BV'
Publication date: 01/12/2008

Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult as experiments are often conducted on different microarray platforms, and in different laboratories leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation
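The two post-learning aggregation ideas in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: each dataset is assumed to yield a dictionary of edge confidences in [0, 1], the 0.5 threshold is arbitrary, and a plain mean stands in for whatever meta-analysis statistic the paper actually uses. The function names are hypothetical.

```python
def consensus_edges(edge_confidences, threshold=0.5):
    """Keep only edges that pass the threshold in every dataset."""
    per_dataset = [
        {edge for edge, conf in confidences.items() if conf >= threshold}
        for confidences in edge_confidences
    ]
    return set.intersection(*per_dataset) if per_dataset else set()


def mean_confidence_edges(edge_confidences, threshold=0.5):
    """Average each edge's confidence across datasets, then threshold.

    An edge missing from a dataset counts as confidence 0.0 there.
    A simple mean is an assumption standing in for a real meta-analysis.
    """
    all_edges = set().union(*edge_confidences)
    n = len(edge_confidences)
    combined = {
        edge: sum(conf.get(edge, 0.0) for conf in edge_confidences) / n
        for edge in all_edges
    }
    return {edge for edge, conf in combined.items() if conf >= threshold}
```

The consensus variant is the stricter of the two: an edge that is weak in one dataset is dropped even if it is strong elsewhere, whereas the averaged variant can still keep it.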

Elsevier - Publisher Connector

Crossref

Brunel University Research Archive

Determinants of linear judgment: A meta-analysis of lens model studies

Author: Natalia Karelaia
Robin Hogarth

The mathematical representation of Brunswik’s lens model has been used extensively to study human judgment and provides a unique opportunity to conduct a meta-analysis of studies that covers roughly five decades. Specifically, we analyze statistics of the “lens model equation” (Tucker, 1964) associated with 259 different task environments obtained from 78 papers. In short, we find – on average – fairly high levels of judgmental achievement and note that people can achieve similar levels of cognitive performance in both noisy and predictable environments. Although overall performance varies little between laboratory and field studies, both differ in terms of components of performance and types of environments (numbers of cues and redundancy). An analysis of learning studies reveals that the most effective form of feedback is information about the task. We also analyze empirically when bootstrapping is more likely to occur. We conclude by indicating shortcomings of the kinds of studies conducted to date, limitations in the lens model methodology, and possibilities for future research.
Keywords: Judgment, lens model, linear models, learning, bootstrapping
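For reference, the lens model equation mentioned in the abstract above is usually written in the following standard form. The symbol names follow common usage in the lens model literature and may differ from the notation in the paper itself.

```latex
r_a = G \, R_e \, R_s + C \sqrt{1 - R_e^2} \, \sqrt{1 - R_s^2}
```

Here $r_a$ is achievement (the correlation between judgments and the criterion), $R_e$ is environmental predictability (the multiple correlation of the criterion with the cues), $R_s$ is the consistency of the judge (the multiple correlation of the judgments with the cues), $G$ is the correlation between the predictions of the two linear models, and $C$ is the correlation between their residuals.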

Research Papers in Economics

A practical guide and software for analysing pairwise comparison experiments

Author: Mantiuk Rafal K.
Perez-Ortiz Maria
Publication date: 11/12/2017

Most popular strategies to capture subjective judgments from humans involve the construction of a unidimensional relative measurement scale, representing order preferences or judgments about a set of objects or conditions. This information is generally captured by means of direct scoring, either in the form of a Likert or cardinal scale, or by comparative judgments in pairs or sets. In this sense, the use of pairwise comparisons is becoming increasingly popular because of the simplicity of this experimental procedure. However, this strategy requires non-trivial data analysis to aggregate the comparison ranks into a quality scale and analyse the results, in order to take full advantage of the collected data. This paper explains the process of translating pairwise comparison data into a measurement scale, discusses the benefits and limitations of such scaling methods and introduces a publicly available software in Matlab. We improve on existing scaling methods by introducing outlier analysis, providing methods for computing confidence intervals and statistical testing and introducing a prior, which reduces estimation error when the number of observers is low. Most of our examples focus on image quality assessment.
Comment: Code available at https://github.com/mantiuk/pwcm
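To make the idea of turning pairwise comparison counts into a scale concrete, here is a minimal sketch of the classical least-squares solution for Thurstone's Case V model. It is not the paper's Matlab software and omits the outlier analysis, confidence intervals, and prior described in the abstract above. The function name `thurstone_case_v` and the clamping constant `eps` are assumptions made for this example.

```python
from statistics import NormalDist


def thurstone_case_v(counts, eps=0.01):
    """Scale conditions from a count matrix using Thurstone Case V.

    counts[i][j] is the number of times condition i was preferred over j.
    Returns one score per condition on a relative (zero-centred) scale.
    """
    n = len(counts)
    inverse_cdf = NormalDist().inv_cdf
    scores = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = counts[i][j] + counts[j][i]
            # Pairs that were never compared are treated as a tie.
            p = 0.5 if total == 0 else counts[i][j] / total
            # Clamp unanimous results so the inverse CDF stays finite.
            p = min(max(p, eps), 1 - eps)
            z_sum += inverse_cdf(p)
        # The diagonal contributes z = 0, so the row mean divides by n.
        scores.append(z_sum / n)
    return scores
```

For a count matrix such as `[[0, 8, 9], [2, 0, 7], [1, 3, 0]]`, the first condition receives the highest score and the last the lowest. Only score differences are meaningful, since the scale has no absolute origin.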

arXiv.org e-Print Archive

UCL Discovery