Extracting Conflict-free Information from Multi-labeled Trees
A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more
leaves share a label, e.g., a species name. A MUL-tree can imply multiple
conflicting phylogenetic relationships for the same set of taxa, but can also
contain conflict-free information that is of interest and yet is not obvious.
We define the information content of a MUL-tree T as the set of all
conflict-free quartet topologies implied by T, and define the maximal reduced
form of T as the smallest tree that can be obtained from T by pruning leaves
and contracting edges while retaining the same information content. We show
that any two MUL-trees with the same information content exhibit the same
reduced form. This introduces an equivalence relation in MUL-trees with
potential applications to comparing MUL-trees. We present an efficient
algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its
performance on empirical datasets in terms of both quality of the reduced tree
and the degree of data reduction achieved.
Comment: Submitted in Workshop on Algorithms in Bioinformatics 2012
(http://algo12.fri.uni-lj.si/?file=wabi
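The notion of information content defined in this abstract can be illustrated with a brute-force sketch. It is not the authors' algorithm (which the abstract describes as efficient) and it does not compute the reduced form; it only enumerates quartets. The nested-tuple tree encoding and the function names `leaf_clades` and `conflict_free_quartets` are assumptions made for illustration. A quartet topology is treated here as conflict-free when every choice of four leaves carrying those four labels that resolves at all resolves to the same split.

```python
from itertools import combinations

def leaf_clades(tree):
    """Return (leaves, clades) for a tree given as nested tuples of label strings.

    leaves: list of leaf labels in left-to-right order (labels may repeat).
    clades: for every internal node, the frozenset of leaf indices below it.
    """
    leaves, clades = [], []

    def walk(node):
        if isinstance(node, str):          # a leaf: record its label
            leaves.append(node)
            return frozenset([len(leaves) - 1])
        below = frozenset().union(*(walk(child) for child in node))
        clades.append(below)
        return below

    walk(tree)
    return leaves, clades

def conflict_free_quartets(tree):
    """Brute-force set of quartet topologies implied without conflict.

    Each topology is a frozenset of two frozensets of labels, e.g. bc|de.
    Assumption: a label set is conflict-free if all resolved leaf choices
    for it yield the same split.
    """
    leaves, clades = leaf_clades(tree)
    topologies = {}                        # label set -> set of splits seen
    for four in combinations(range(len(leaves)), 4):
        labels = frozenset(leaves[i] for i in four)
        if len(labels) < 4:                # need four distinct labels
            continue
        for clade in clades:
            inside = [i for i in four if i in clade]
            if len(inside) == 2:           # this edge separates the leaves 2|2
                side = frozenset(leaves[i] for i in inside)
                split = frozenset([side, labels - side])
                topologies.setdefault(labels, set()).add(split)
    return {next(iter(seen)) for seen in topologies.values() if len(seen) == 1}

# Hypothetical MUL-tree in which the label "a" appears on two leaves.
example = ((("a", "b"), "c"), ("d", ("a", "e")))
result = conflict_free_quartets(example)
```

In this example every quartet that involves the duplicated label `a` resolves differently depending on which copy of `a` is chosen, so the only topology left in `result` is bc|de.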
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather large amounts of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-based System
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category