110 research outputs found

    An adaptive technique for content-based image retrieval

    We discuss an adaptive approach towards Content-Based Image Retrieval. It is based on the Ostensive Model of developing information needs—a special kind of relevance feedback model that learns from implicit user feedback and adds a temporal notion to relevance. The ostensive approach supports content-assisted browsing through visualising the interaction by adding user-selected images to a browsing path, which ends with a set of system recommendations. The suggestions are based on an adaptive query learning scheme, in which the query is learnt from previously selected images. Our approach is an adaptation of the original Ostensive Model based on textual features only, to include content-based features to characterise images. In the proposed scheme textual and colour features are combined using the Dempster-Shafer theory of evidence combination. Results from a user-centred, work-task oriented evaluation show that the ostensive interface is preferred over a traditional interface with manual query facilities. This is due to its ability to adapt to the user's need, its intuitiveness and the fluid way in which it operates. Studying and comparing the nature of the underlying information need, it emerges that our approach elicits changes in the user's need based on the interaction, and is successful in adapting the retrieval to match the changes. In addition, a preliminary study of the retrieval performance of the ostensive relevance feedback scheme shows that it can outperform a standard relevance feedback strategy in terms of image recall in category search
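
The evidence combination named in this abstract can be illustrated with Dempster's rule for two mass functions. This is a minimal sketch of the textbook rule only; the hypothesis names and the mass values for the text and colour evidence are made up for illustration and are not taken from the paper.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass
    (the masses of one function sum to 1).
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            # Product mass that falls on contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination undefined")
    # Renormalise so the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical evidence about whether an image is relevant or not.
R, N = frozenset({"relevant"}), frozenset({"not_relevant"})
EITHER = R | N  # mass on the whole frame expresses uncertainty
text_evidence = {R: 0.6, EITHER: 0.4}
colour_evidence = {R: 0.5, N: 0.2, EITHER: 0.3}
belief = combine(text_evidence, colour_evidence)
```

Mass placed on the whole frame (`EITHER`) lets a weak feature abstain rather than vote against relevance, which is one reason the rule suits combining heterogeneous features.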

    Is searching full text more effective than searching abstracts?

    BACKGROUND: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: bm25 and the ranking algorithm implemented in the open-source Lucene search engine. RESULTS: Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that the highest overall effectiveness may be achieved by combining evidence from spans and full articles. CONCLUSION: Users searching full text are more likely to find relevant articles than users searching only abstracts. This finding affirms the value of full-text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations.
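
For readers unfamiliar with bm25, the sketch below shows one common formulation of the scoring function. It is an illustration under assumptions, not the configuration used in the study: the parameter values `k1=1.2` and `b=0.75`, the particular IDF variant, and the toy documents are all chosen here for the example.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # One common smoothed IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus, for illustration only.
docs = [
    "gene expression in yeast".split(),
    "protein folding and gene regulation in yeast cells under stress".split(),
    "image retrieval with colour features".split(),
]
scores = bm25_scores("gene yeast".split(), docs)
```

The length normalisation term is the part most sensitive to the choice of indexing unit, because abstracts, paragraph-sized spans, and whole articles differ greatly in length.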

    Tag-Aware Recommender Systems: A State-of-the-art Survey

    In the past decade, Social Tagging Systems have attracted increasing attention from both physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, much effort has been devoted to unifying tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress on tag-aware recommender systems, emphasizing the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and topic-based methods. Finally, we outline some other tag-related works and future challenges of tag-aware recommendation algorithms. Comment: 19 pages, 3 figures

    Optimizing search strategies to identify randomized controlled trials in MEDLINE

    BACKGROUND: The Cochrane Highly Sensitive Search Strategy (HSSS), which contains three phases, is widely used to identify Randomized Controlled Trials (RCTs) in MEDLINE. Lefebvre and Clarke suggest that reviewers might consider using four revisions of the HSSS. The objective of this study is to validate these four revisions: combining the free text terms volunteer, crossover, versus, and the Medical Subject Heading CROSS-OVER STUDIES with the top two phases of the HSSS, respectively. METHODS: We replicated the subject search for 61 Cochrane reviews. The included studies of each review that were indexed in MEDLINE were pooled together by review and then combined with the subject search and each of the four proposed search strategies, the top two phases of the HSSS, and all three phases of the HSSS. These retrievals were used to calculate the sensitivity and precision of each of the six search strategies for each review. RESULTS: Across the 61 reviews, the search term versus combined with the top two phases of the HSSS was able to find 3 more included studies than the top two phases of the HSSS alone, or in combination with any of the other proposed search terms, but at the expense of missing 56 relevant articles that would be found if all three phases of the HSSS were used. The estimated time needed to finish a review is 1086 hours for all three phases of the HSSS, 823 hours for the strategy versus, 818 hours for the first two phases of the HSSS or any of the other three proposed strategies. CONCLUSION: This study shows that compared to the first two phases of the HSSS, adding the term versus to the top two phases of the HSSS balances the sensitivity and precision in the reviews studied here to some extent but the differences are very small. It is well known that missing relevant studies may result in bias in systematic reviews. Reviewers need to weigh the trade-offs when selecting the search strategies for identifying RCTs in MEDLINE
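
The two measures used to compare the search strategies can be computed per review as sketched below. The function name and the record IDs are hypothetical and only illustrate the standard definitions: sensitivity is the share of included studies that the search retrieves, and precision is the share of retrieved records that are included studies.

```python
def sensitivity_and_precision(retrieved, included):
    """Return (sensitivity, precision) for one review's search.

    retrieved: set of record IDs returned by the search strategy
    included:  set of record IDs of the review's included studies
    """
    hits = retrieved & included
    sensitivity = len(hits) / len(included) if included else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return sensitivity, precision

# Hypothetical record IDs, for illustration only.
included = {"a", "b", "c", "d"}
retrieved = {"a", "b", "c", "x", "y", "z", "w", "v"}
sens, prec = sensitivity_and_precision(retrieved, included)
```

Broadening a strategy typically raises sensitivity while lowering precision, which is the trade-off the abstract asks reviewers to weigh.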

    Rocchio Algorithm to Enhance Semantically Collaborative Filtering

    Recommender systems provide relevant items to users from a huge catalogue. Collaborative filtering and content-based filtering are the most widely used techniques in personalized recommender systems. Collaborative filtering uses only the user-ratings data to make predictions, while content-based filtering relies on semantic information of items for recommendation. Hybrid recommendation systems combine the two techniques. In this paper, we present another hybridization approach: User Semantic Collaborative Filtering. The aim of our approach is to predict users' preferences for items based on their inferred preferences for semantic information of items. To this end, we design a new user semantic model that describes user preferences using the Rocchio algorithm. Due to the high dimension of item content, we apply latent semantic analysis to reduce the dimension of the data. The user semantic model is then used in a user-based collaborative filtering to compute prediction ratings and to provide recommendations. Applying our approach to a real data set, the MovieLens 1M data set, a significant improvement can be noticed compared to the usage-only and content-based-only approaches.
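
As a rough illustration of how a Rocchio-style user profile can be built from item features, the sketch below uses the textbook Rocchio update without an initial query term. It is not the paper's exact user semantic model: the function name, the weights `beta=0.75` and `gamma=0.15`, and the three-dimensional feature vectors are assumptions made for the example.

```python
def rocchio_profile(liked, disliked, beta=0.75, gamma=0.15):
    """Build a user profile vector from item feature vectors (lists of floats)."""
    dim = len((liked or disliked)[0])

    def centroid(vectors):
        # Mean vector; an empty set contributes a zero vector.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    pos, neg = centroid(liked), centroid(disliked)
    # Move towards the liked centroid and away from the disliked centroid.
    return [beta * p - gamma * n for p, n in zip(pos, neg)]

# Hypothetical 3-dimensional item features (e.g. reduced semantic dimensions).
liked = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.5]]
disliked = [[0.0, 1.0, 0.0]]
profile = rocchio_profile(liked, disliked)
```

Profiles built this way live in the same space as the item features, so users can be compared with a vector similarity before the collaborative filtering step.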

    A new family of diprotodontian marsupials from the latest Oligocene of Australia and the evolution of wombats, koalas, and their relatives (Vombatiformes)

    We describe the partial cranium and skeleton of a new diprotodontian marsupial from the late Oligocene (~26–25 Ma) Namba Formation of South Australia. This is one of the oldest Australian marsupial fossils known from an associated skeleton and it reveals previously unsuspected morphological diversity within Vombatiformes, the clade that includes wombats (Vombatidae), koalas (Phascolarctidae) and several extinct families. Several aspects of the skull and teeth of the new taxon, which we refer to a new family, are intermediate between members of the fossil family Wynyardiidae and wombats. Its postcranial skeleton exhibits features associated with scratch-digging, but it is unlikely to have been a true burrower. Body mass estimates based on postcranial dimensions range between 143 and 171 kg, suggesting that it was ~5 times larger than living wombats. Phylogenetic analysis based on 79 craniodental and 20 postcranial characters places the new taxon as sister to vombatids, with which it forms the superfamily Vombatoidea as defined here. It suggests that the highly derived vombatids evolved from wynyardiid-like ancestors, and that scratch-digging adaptations evolved in vombatoids prior to the appearance of the ever-growing (hypselodont) molars that are a characteristic feature of all post-Miocene vombatids. Ancestral state reconstructions on our preferred phylogeny suggest that bunolophodont molars are plesiomorphic for vombatiforms, with full lophodonty (characteristic of diprotodontoids) evolving from a selenodont morphology that was retained by phascolarctids and ilariids, and wynyardiids and vombatoids retaining an intermediate selenolophodont condition. There appear to have been at least six independent acquisitions of very large (>100 kg) body size within Vombatiformes, several having already occurred by the late Oligocene

    An Inducer of VGF Protects Cells against ER Stress-Induced Cell Death and Prolongs Survival in the Mutant SOD1 Animal Models of Familial ALS

    Amyotrophic lateral sclerosis (ALS) is the most frequent adult-onset motor neuron disease, and recent evidence has suggested that endoplasmic reticulum (ER) stress signaling is involved in the pathogenesis of ALS. Here we identified a small molecule, SUN N8075, which has a marked protective effect on ER stress-induced cell death, in an in vitro cell-based screening, and its protective mechanism was mediated by an induction of VGF nerve growth factor inducible (VGF): VGF knockdown with siRNA completely abolished the protective effect of SUN N8075 against ER-induced cell death, and overexpression of VGF inhibited ER-stress-induced cell death. VGF level was lower in the spinal cords of sporadic ALS patients than in the control patients. Furthermore, SUN N8075 slowed disease progression and prolonged survival in mutant SOD1 transgenic mouse and rat models of ALS, preventing the decrease of VGF expression in the spinal cords of ALS mice. These data suggest that VGF plays a critical role in motor neuron survival and may be a potential new therapeutic target for ALS, and SUN N8075 may become a potential therapeutic candidate for treatment of ALS

    Text-derived concept profiles support assessment of DNA microarray data for acute myeloid leukemia and for androgen receptor stimulation

    BACKGROUND: High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts. RESULTS: The experimental validation was done in two steps. We first applied our method on a controlled test set. After this proved to be successful the datasets from two DNA microarray experiments were analyzed in the same way and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients the biological background of the cancer cells is largely unknown. Using our methodology we found an association of these cells to monocytes, which agreed with other experimental evidence. The second data set consisted of differentially expressed genes following androgen receptor stimulation in a prostate cancer cell line. Based on the analysis we put forward a hypothesis about the biological processes induced in these studied cells: secretory lysosomes are involved in the production of prostatic fluid and their development and/or secretion are androgen-regulated processes. CONCLUSION: Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose

    Biclustering via optimal re-ordering of data matrices in systems biology: rigorous methods and comparative studies

    BACKGROUND: The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters. RESULTS: In this article, we present a rigorous approach to biclustering, OREO, which is based on the Optimal RE-Ordering of the rows and columns of a data matrix so as to globally minimize the dissimilarity metric. The physical permutations of the rows and columns of the data matrix can be modeled as either a network flow problem or a traveling salesman problem. Cluster boundaries in one dimension are used to partition and re-order the other dimensions of the corresponding submatrices to generate biclusters. The performance of OREO is tested on (a) metabolite concentration data, (b) an image reconstruction matrix, (c) synthetic data with implanted biclusters, and gene expression data for (d) colon cancer data, (e) breast cancer data, as well as (f) yeast segregant data to validate the ability of the proposed method and compare it to existing biclustering and clustering methods. CONCLUSION: We demonstrate that this rigorous global optimization method for biclustering produces clusters with more insightful groupings of similar entities, such as genes or metabolites sharing common functions, than other clustering and biclustering algorithms and can reconstruct underlying fundamental patterns in the data for several distinct sets of data matrices arising in important biological applications.