Comparison of Balancing Techniques for Multimedia IR over Imbalanced Datasets
A promising method to improve the performance of information retrieval systems is to approach retrieval tasks as a supervised classification problem. Previous user interactions, e.g. gathered from a thorough log file analysis, can be used to train classifiers which aim to infer the relevance of retrieved documents based on user interactions. A problem with this approach is, however, the large imbalance ratio between relevant and non-relevant documents in the collection. In standard test collections as used in academic evaluation frameworks such as TREC, non-relevant documents outnumber relevant documents by far. In this work, we address this imbalance problem in the multimedia domain. We focus on the logs of two multimedia user studies which are highly imbalanced. We compare a naïve solution of randomly deleting documents belonging to the majority class with various balancing algorithms coming from different fields: data classification and text classification. Our experiments indicate that all algorithms improve on the classification performance obtained by simply deleting documents at random from the dominant class.
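The naïve baseline mentioned in this abstract, randomly deleting majority-class documents until the classes are balanced, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_undersample`, the 1/0 relevance labels, and the toy data are assumptions made purely for illustration.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Size of the smallest (minority) class.
    n_min = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        # Keep the minority class whole; sample without replacement otherwise.
        kept = group if len(group) == n_min else rng.sample(group, n_min)
        balanced.extend((sample, label) for sample in kept)

    # Shuffle so the classes are interleaved in the output.
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]


# Toy data (assumed): 5 relevant (1) and 95 non-relevant (0) documents.
docs = list(range(100))
relevance = [1] * 5 + [0] * 95
balanced_docs, balanced_relevance = random_undersample(docs, relevance)
print(Counter(balanced_relevance))
```

Random deletion discards most of the majority-class information, which is one plausible reason why more informed balancing algorithms might outperform it.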
Complex patterns of male germline instability and somatic mosaicism in myotonic dystrophy type 1
The genetic basis of myotonic dystrophy type 1 (DM1) is the expansion of a CTG repeat in the 3' untranslated region of DM1PK. Once into the disease range, the repeat becomes highly unstable and is biased toward expansion in both somatic and germline tissues. Intergenerational differences usually reveal an increase in allele length, concordant with the clinical anticipation characteristic of DM1, but there have also been cases with intergenerational contractions of the repeat length, accompanied by apparent anticipation. In order to gain a better understanding of this intergenerational behaviour, we have obtained semen samples from six DM males and used single molecule analyses to compare the allele distributions present in their sperm and blood with those of their offspring. We have confirmed that the male germline mutational pathway is distinct from that of the soma, but the extent of variation differs markedly from one individual to another and is not obviously correlated with progenitor allele length. Nonetheless, in all cases the alleles present in the father's sperm overlap with those observed in his offspring. These data also provide further indications that the interpretation of intergenerational transmissions by standard analyses is frequently compromised by the masking of germline differences by age-dependent somatic expansion in the parent.
Genome-wide patterns of polymorphism in an inbred line of the African malaria mosquito Anopheles gambiae.
Anopheles gambiae is a major mosquito vector of malaria in Africa. Although increased use of insecticide-based vector control tools has decreased malaria transmission, elimination is likely to require novel genetic control strategies. It can be argued that the absence of an A. gambiae inbred line has slowed progress toward genetic vector control. In order to empower genetic studies and enable precise and reproducible experimentation, we set out to create an inbred line of this species. We found that amenability to inbreeding varied between populations of A. gambiae. After full-sib inbreeding for ten generations, we genotyped 112 individuals (56 saved prior to inbreeding and 56 collected after inbreeding) at a genome-wide panel of single nucleotide polymorphisms (SNPs). Although inbreeding dramatically reduced diversity across much of the genome, we discovered numerous, discrete genomic blocks that maintained high heterozygosity. For one large genomic region, we were able to definitively show that high diversity is due to the persistent polymorphism of a chromosomal inversion. Inbred lines in other eukaryotes often exhibit a qualitatively similar retention of polymorphism when typed at a small number of markers. Our whole-genome SNP data provide the first strong, empirical evidence supporting associative overdominance as the mechanism maintaining higher than expected diversity in inbred lines. Although creation of A. gambiae lines devoid of nearly all polymorphism may not be feasible, our results provide critical insights into how more fully isogenic lines can be created.
Composition and Self-Adaptation of Service-Based Systems with Feature Models
The adoption of mechanisms for reusing software in pervasive systems has not yet become standard practice. This is because the use of pre-existing software requires the selection, composition and adaptation of prefabricated software parts, as well as the management of some complex problems such as guaranteeing high levels of efficiency and safety in critical domains. In addition to the wide variety of services, pervasive systems are composed of many networked heterogeneous devices with embedded software. In this work, we promote the safe reuse of services in service-based systems using two complementary technologies, Service-Oriented Architecture and Software Product Lines. In order to do this, we extend both the service discovery and composition processes defined in the DAMASCo framework, which currently does not deal with the service variability that constitutes pervasive systems. We use feature models to represent the variability and to self-adapt the services during the composition in a safe way taking context changes into consideration. We illustrate our proposal with a case study related to the driving domain of an Intelligent Transportation System, handling the context information of the environment. Work partially supported by the projects TIN2008-05932, TIN2008-01942, TIN2012-35669, TIN2012-34840 and CSD2007-0004 funded by Spanish Ministry of Economy and Competitiveness and FEDER; P09-TIC-05231 and P11-TIC-7659 funded by Andalusian Government; and FP7-317731 funded by EU. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec