External Lexical Information for Multilingual Part-of-Speech Tagging
Morphosyntactic lexicons and word vector representations have both proven
useful for improving the accuracy of statistical part-of-speech taggers. Here
we compare the performance of four systems on datasets covering 16 languages,
two of these systems being feature-based (MEMMs and CRFs) and two of them being
neural-based (bi-LSTMs). We show that, on average, all four approaches perform
similarly and reach state-of-the-art results. Yet better performance is
obtained with our feature-based models on lexically richer datasets (e.g. for
morphologically rich languages), whereas neural-based results are higher on
datasets with less lexical variability (e.g. for English). These conclusions
hold in particular for the MEMM models relying on our system MElt, which
benefited from newly designed features. This shows that, under certain
conditions, feature-based approaches enriched with morphosyntactic lexicons are
competitive with neural methods.
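As a rough illustration of what "enriching" a feature-based tagger with a morphosyntactic lexicon can mean, the sketch below builds string features for one token and adds the tags that a lexicon lists for the token and its neighbours. The function names, the feature templates and the toy lexicon are all hypothetical; this is not the feature set of MElt or of any system compared in the abstract.

```python
def lexicon_features(prefix, form, lexicon):
    """Features from an external lexicon (hypothetical template).

    `lexicon` maps a lowercased word form to the set of tags it may take.
    Forms missing from the lexicon get a single UNKNOWN feature.
    """
    tags = lexicon.get(form)
    if not tags:
        return {prefix + "=UNKNOWN"}
    return {prefix + "=" + tag for tag in tags}


def token_features(tokens, i, lexicon):
    """Return a set of string features for tokens[i]."""
    word = tokens[i]
    lower = word.lower()
    # Corpus-internal features: word form, short suffix, capitalization.
    feats = {
        "word=" + lower,
        "suffix3=" + lower[-3:],
        "capitalized=" + str(word[:1].isupper()),
    }
    # Lexicon-based features for the token and its immediate context.
    feats |= lexicon_features("lex", lower, lexicon)
    if i > 0:
        feats |= lexicon_features("prev_lex", tokens[i - 1].lower(), lexicon)
    if i + 1 < len(tokens):
        feats |= lexicon_features("next_lex", tokens[i + 1].lower(), lexicon)
    return feats


# Toy lexicon, invented for this example.
toy_lexicon = {"the": {"DET"}, "run": {"NOUN", "VERB"}}
features = token_features(["The", "run", "ended"], 1, toy_lexicon)
```

Such feature sets would then be passed to a classifier such as a MEMM or CRF. The ambiguity recorded in the lexicon (here, "run" as NOUN or VERB) is left for the model to resolve from context.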
A polynomial delay algorithm for the enumeration of bubbles with length constraints in directed graphs and its application to the detection of alternative splicing in RNA-seq data
We present a new algorithm for enumerating bubbles with length constraints in
directed graphs. This problem arises in transcriptomics, where the question is
to identify all alternative splicing events present in a sample of mRNAs
sequenced by RNA-seq. This is the first polynomial-delay algorithm for this
problem and we show that in practice, it is faster than previous approaches.
This enables us to deal with larger instances and therefore to discover novel
alternative splicing events, especially long ones, that were previously
overlooked by existing methods. Comment: Peer-reviewed and presented as part of the 13th Workshop on
Algorithms in Bioinformatics (WABI2013).
Efficiently listing bounded length st-paths
The problem of listing the K shortest simple (loopless) st-paths in a
graph has been studied since the early 1960s. For a non-negatively weighted
graph with n vertices and m edges, the most efficient solution is an
O(K(mn + n^2 log n)) algorithm for directed graphs by Yen and Lawler
[Management Science, 1971 and 1972], and an O(K(m + n log n)) algorithm for
the undirected version by Katoh et al. [Networks, 1982], both using O(Kn + m)
space. In this work, we consider a different parameterization for this problem:
instead of bounding the number of st-paths output, we bound their length. For
the bounded length parameterization, we propose new non-trivial algorithms
matching the time complexity of the classic algorithms but using only O(m + n)
space. Moreover, we provide a unified framework such that the solutions to both
parameterizations -- the classic K-shortest and the new length-bounded paths
-- can be seen as two different traversals of the same tree, a Dijkstra-like and
a DFS-like traversal, respectively. Comment: 12 pages, accepted to IWOCA 201
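To make the length-bounded parameterization concrete, here is a minimal sketch of a DFS-like listing of simple st-paths whose total weight stays within a bound. It prunes with shortest distances to the target, computed once by Dijkstra on the reversed graph. This is a simplified illustration, not the algorithm of the paper: the distances ignore the vertices already on the current path, so the search can still enter dead ends, and it does not carry the paper's time guarantees. The function names and the adjacency-list format are assumptions.

```python
import heapq


def distances_to_target(graph, t):
    """Dijkstra on the reversed graph: shortest distance from each vertex to t.

    `graph` maps a vertex to a list of (neighbour, non-negative weight) pairs.
    """
    reverse = {}
    for u, edges in graph.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))
    dist = {t: 0}
    heap = [(0, t)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in reverse.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bounded_st_paths(graph, s, t, bound):
    """Yield every simple s-t path of total weight at most `bound`."""
    dist = distances_to_target(graph, t)
    path, on_path = [s], {s}

    def extend(u, used):
        if u == t:
            yield list(path)
            return
        for v, w in graph.get(u, []):
            if v in on_path:
                continue  # keep the path simple
            if used + w + dist.get(v, float("inf")) > bound:
                continue  # even the shortest completion would exceed the bound
            path.append(v)
            on_path.add(v)
            yield from extend(v, used + w)
            path.pop()
            on_path.remove(v)

    if dist.get(s, float("inf")) <= bound:
        yield from extend(s, 0)


# Toy graph, invented for this example.
example = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 1), ("t", 5)],
    "b": [("t", 1)],
}
paths = list(bounded_st_paths(example, "s", "t", 5))
# paths == [["s", "a", "b", "t"], ["s", "b", "t"]]
```

Apart from the graph itself and the precomputed distances, the traversal stores only the current path, which gives a feel for how a DFS-like strategy can avoid keeping many candidate paths in memory at once.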
Extracting few representative reconciliations with Host-Switches (Extended Abstract)
Phylogenetic tree reconciliation is the approach commonly used to investigate the coevolution of sets of organisms such as hosts and symbionts. Given a phylogenetic tree for each such set, respectively denoted by H and S, together with a mapping φ of the leaves of S to the leaves of H, a reconciliation is a mapping ρ of the internal vertices of S to the vertices of H which extends φ with some constraints.
Given a cost for each reconciliation, a huge number of most parsimonious reconciliations may exist, possibly exponential in the size of the trees. Without further information, any biological interpretation of the underlying coevolution would require that all optimal solutions be enumerated and examined. The latter is however impossible without providing some sort of high-level view of the situation. One approach would be to extract a small number of representatives, based on some notion of similarity or of equivalence between the reconciliations.
In this paper, we define two equivalence relations that allow one to identify many reconciliations with a single one, thereby reducing their number. Extensive experiments indicate that the number of output solutions greatly decreases in general. The size of the reduction depends on the constraints that are given as input.
Strategies for confronting violence against women: feminist reflections from Latin America (Estrategias para enfrentar la violencia contra las mujeres: reflexiones feministas desde América Latina)
This essay presents violence against women as a social problem of great magnitude rooted in gender inequality. The author summarizes the main points of discussion and struggle of the Latin American feminist movement over recent decades. She emphasizes the conception of violence against women as a public problem, a denial of citizenship rights and a matter of justice. Despite the setbacks and the contradictory character of the relations between the feminist movement and social institutions, there is no doubt that a new social practice concerning violence perpetrated against women is taking shape.
DeLex, a freely available, large-scale and linguistically grounded morphological lexicon for German
We introduce DeLex, a freely available, large-scale and linguistically grounded morphological lexicon for German developed within the Alexina framework. We extracted lexical information from the German Wiktionary and developed a morphological inflection grammar for German, based on a linguistically sound model of inflectional morphology. Although the development of DeLex involved some manual work, we show that it represents a good tradeoff between development cost, lexical coverage and resource accuracy.
Multilingual part-of-speech tagging with MElt (Étiquetage multilingue en parties du discours avec MElt)
We describe recent evolutions of MElt, a discriminative part-of-speech tagging system. MElt is targeted at the optimal exploitation of information provided by external lexicons for improving its performance over models trained solely on annotated corpora. We have trained MElt on more than 40 datasets covering over 30 languages. Compared with the state-of-the-art system MarMoT, MElt's results are slightly worse on average when no external lexicon is used, but slightly better when such resources are available, resulting in state-of-the-art taggers for a number of languages.
Unraveling Ecological Effects on Social Behavior. Insights from Tent-roosting Bats
Although group living has been associated with high fitness costs, multiple lines of evidence suggest that it has evolved multiple times independently. Given the wide diversity of social systems, it appears that multiple explanations are necessary to understand this process. Although evidence indicates that multiple ecological and environmental factors might promote variation in the cohesion of social organisms, studies investigating how these factors interrelate and shape social structure have been limited. In the tropics, there are at least 23 bat species that roost in modified structures called tents. These species present a wide diversity of social systems. Moreover, they have divergent evolutionary origins but similar roosting habits, suggesting convergence in roost use. These characteristics make this group an ideal system to test hypotheses regarding the effects of ecological and environmental factors on the evolution and stability of social groups. Thus, my objectives were first to investigate the importance of habitat factors in predicting the presence and density of the tent-roosting bat Uroderma bilobatum. Additionally, I wanted to determine the relative contributions of habitat factors to group cohesion and stability. I found that the presence of coconut palms (Cocos nucifera) had the highest unique predictive power for the presence and density of U. bilobatum. Additionally, I found that roost characteristics contributed more to the explained variation in group relatedness. This pattern was driven by the relatedness of adult females within social groups, suggesting that females using roosts of specific characteristics exhibit higher relatedness. To determine whether this pattern holds across multiple tent-roosting bat species, I tested for correlated evolution between group stability and roost lifespan.
I found that most species that used tents of short lifespan also had stable groups, and most species that used tents of long lifespan had unstable groups, suggesting that group stability and tent lifespan did not evolve independently. The observed relationships between roosting ecology, group cohesion and stability in tent-roosting bats suggest that roosts play an important role in the evolution of group formation. Incorporating ecological and environmental factors in the study of sociality will allow a broader understanding of the forces that bring individuals together into cohesive social groups.