309 research outputs found

    Quality information retrieval for the World Wide Web

    The World Wide Web is an unregulated communication medium with very limited means of quality control. Quality assurance has become a key issue for many information retrieval services on the Internet, e.g. web search engines. This paper introduces quality evaluation and assessment methods for web pages. The proposed quality evaluation mechanisms are based on a set of quality criteria extracted from a targeted user survey. A weighted algorithmic interpretation of the most significant user-quoted quality criteria is proposed. In addition, the paper uses machine learning methods to predict the quality of web pages before they are downloaded. The set of quality criteria allows us to implement a web search engine with quality ranking schemes, leading to web crawlers that can directly crawl quality web pages. The proposed approaches produce very promising results on a sizable web repository.
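
A weighted interpretation of quality criteria can be pictured as a weighted average of per-criterion scores. The sketch below is only an illustration of that idea: the criterion names, weights, and page scores are hypothetical placeholders, not the criteria or weights derived from the paper's user survey.

```python
# Hypothetical criteria and weights (assumptions for illustration only).
WEIGHTS = {"accuracy": 0.4, "currency": 0.3, "authority": 0.2, "presentation": 0.1}


def quality_score(criteria_scores, weights=WEIGHTS):
    """Weighted average of per-criterion scores in [0, 1].

    Criteria missing from `criteria_scores` count as 0.
    """
    total_weight = sum(weights.values())
    weighted = sum(w * criteria_scores.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


def rank_pages(pages):
    """Sort (page_id, criteria_scores) pairs by descending quality score."""
    return sorted(pages, key=lambda page: quality_score(page[1]), reverse=True)


# Made-up example pages with made-up criterion scores.
pages = [
    ("page_a", {"accuracy": 0.9, "currency": 0.2, "authority": 0.8, "presentation": 0.5}),
    ("page_b", {"accuracy": 0.6, "currency": 0.9, "authority": 0.4, "presentation": 0.9}),
]
ranking = rank_pages(pages)
```

A crawler could use such a score to prioritise its frontier, but how the paper combines the score with its machine-learned quality prediction is not stated in the abstract.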

    Breast cancer data analysis for survivability studies and prediction

    © 2017 Elsevier B.V. Background: Breast cancer is the most common cancer affecting females worldwide. Breast cancer survivability prediction is a challenging and complex research task. Existing approaches use statistical methods or supervised machine learning to assess or predict the survival prospects of patients. Objective: The main objective of this paper is to develop a robust data analytical model which can assist in (i) a better understanding of breast cancer survivability in the presence of missing data, (ii) providing better insights into factors associated with patient survivability, and (iii) establishing cohorts of patients that share similar properties. Methods: Unsupervised data mining methods, viz. the self-organising map (SOM) and density-based spatial clustering of applications with noise (DBSCAN), are used to create patient cohort clusters. These clusters, with associated patterns, were used to train a multilayer perceptron (MLP) model for improved patient survivability analysis. A large dataset available from the SEER program is used in this study to identify patterns associated with the survivability of breast cancer patients. Information gain was computed for the purpose of variable selection. All of these methods are data-driven and require little (if any) input from users or experts. Results: The SOM consolidated patients into cohorts with similar properties. From this, DBSCAN identified and extracted nine cohorts (clusters). Patients in each of the nine clusters were found to have different survivability times. The separation of patients into clusters improved the overall survival prediction accuracy based on MLP and revealed intricate conditions that affect the accuracy of a prediction. Conclusions: A new, entirely data-driven approach based on unsupervised learning methods improves understanding and helps identify patterns associated with the survivability of patients. The results of the analysis can be used to segment the historical patient data into clusters or subsets which share common variable values and survivability. The survivability prediction accuracy of an MLP is improved by using identified patient cohorts as opposed to using raw historical data. Analysis of variable values in each cohort provides better insights into the survivability of a particular subgroup of breast cancer patients.
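
The abstract mentions information gain for variable selection without giving details. As a minimal sketch of the standard definition (entropy of the outcome minus the expected entropy after splitting on a variable), the following uses made-up toy values; the variable names `stage` and `side` and the labels are hypothetical and are not taken from the SEER data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of `labels` minus the expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / n * entropy(group) for group in groups.values())
    return entropy(labels) - conditional


# Toy, made-up records (not real patient data).
outcome = ["survived", "survived", "died", "died"]
stage = ["I", "I", "III", "III"]   # perfectly separates the outcome
side = ["L", "R", "L", "R"]        # carries no information about the outcome

gain_stage = information_gain(stage, outcome)
gain_side = information_gain(side, outcome)
```

Variables would then be ranked by gain and the top ones retained; the threshold or number of variables the authors kept is not given in the abstract.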

    Automated functional testing of online search services

    Search services are the main interface through which people discover information on the Internet. A fundamental challenge in testing search services is the lack of oracles. The sheer volume of data on the Internet prohibits testers from verifying the results. Furthermore, it is difficult to objectively assess the ranking quality because different assessors can have very different opinions on the relevance of a Web page to a query. This paper presents a novel method for automatically testing search services without the need for a human oracle. The experimental findings reveal that some commonly used search engines, including Google, Yahoo!, and Live Search, are not as reliable as most users would expect. For example, they may fail to find pages that exist in their own repositories, or rank pages in a way that is logically inconsistent. Suggestions are made for search service providers to improve their service quality. Copyright © 2010 John Wiley & Sons, Ltd.
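
One way to test without an oracle is to check a logical relation between the results of related queries rather than the correctness of any single result. The sketch below assumes one such relation (adding a term to a conjunctive query must not introduce new results); the abstract does not say which relations the paper uses, and `toy_search`, `buggy_search`, and the index contents are hypothetical stand-ins for a real search service.

```python
def toy_search(index, query):
    """Hypothetical search service: ids of documents containing every query term."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in index.items()
        if all(term in text.lower().split() for term in terms)
    }


def conjunction_violations(search, query, extra_term):
    """Return results of the narrowed query that are missing from the base query.

    Under AND semantics the narrowed result set must be a subset of the base
    result set, so an empty return value means the relation holds.
    """
    base = search(query)
    narrowed = search(f"{query} {extra_term}")
    return narrowed - base


# Made-up three-document index.
index = {
    1: "quality web search",
    2: "web crawler design",
    3: "search engine testing",
}


def good_search(query):
    return toy_search(index, query)


def buggy_search(query):
    # Simulated fault: multi-term queries wrongly return document 3 as well.
    extra = {3} if " " in query else set()
    return toy_search(index, query) | extra


ok = conjunction_violations(good_search, "web", "search")
bad = conjunction_violations(buggy_search, "web", "search")
```

The check never needs to know which documents are actually relevant, which is what makes it usable when no human oracle is available.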

    Energy cost of physical activities and sedentary behaviors in young children

    Background: This study reports energy expenditure (EE) data for lifestyle and ambulatory activities in young children. Methods: Eleven children aged 3 to 6 years (mean age = 4.8 ± 0.9; 55% boys) completed 12 semistructured activities including sedentary behaviors (SB), light physical activities (LPA), and moderate-to-vigorous physical activities (MVPA) over 2 laboratory visits while wearing a portable metabolic system to measure EE. Results: Mean EE values for SB (TV, reading, tablet and toy play) were between 0.9 and 1.1 kcal/min. Standing art had an energy cost that was 1.5 times that of SB (mean = 1.4 kcal/min), whereas bike riding (mean = 2.5 kcal/min) was similar to LPA (cleaning-up, treasure hunt and walking) (mean = 2.3 to 2.5 kcal/min), whose EE was 2.5 times that of SB. EE for MVPA (running, active games and obstacle course) was 4.2 times that of SB (mean = 3.8 to 3.9 kcal/min). Conclusion: EE values reported in this study can contribute to the limited available data on the energy cost of lifestyle and ambulatory activities in young children.

    Self-Organizing Time Map: An Abstraction of Temporal Multivariate Patterns

    This paper adopts and adapts Kohonen's standard Self-Organizing Map (SOM) for exploratory temporal structure analysis. The Self-Organizing Time Map (SOTM) applies SOM-type learning to one-dimensional arrays for individual time units, preserves the orientation with short-term memory and arranges the arrays in ascending order of time. The two-dimensional representation of the SOTM thus attempts twofold topology preservation, where the horizontal direction preserves time topology and the vertical direction preserves data topology. This enables discovering the occurrence and exploring the properties of temporal structural changes in data. For representing qualities and properties of SOTMs, we adapt measures and visualizations from the standard SOM paradigm, and introduce a measure of temporal structural changes. The functioning of the SOTM, along with its visualizations and its quality and property measures, is illustrated on artificial toy data. The usefulness of the SOTM in a real-world setting is shown on poverty, welfare and development indicators.
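
The procedure described above can be sketched roughly as follows: train one one-dimensional SOM per time unit, start each from the array learned for the previous time unit (one reading of "short-term memory"), and place the arrays side by side in time order. This is a simplified illustration, not the authors' implementation: the random initialisation, the fixed-radius neighbourhood, the linear learning-rate decay, and all function and parameter names are assumptions.

```python
import random


def train_1d_som(data, units, epochs=20, lr=0.5, radius=1):
    """Train a one-dimensional SOM on `data`, starting from the weight vectors in `units`."""
    units = [list(u) for u in units]  # copy so the caller's array is not modified
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for x in data:
            # Best-matching unit: smallest squared Euclidean distance to x.
            bmu = min(
                range(len(units)),
                key=lambda i: sum((w - a) ** 2 for w, a in zip(units[i], x)),
            )
            # Move the BMU and its neighbours along the array towards x.
            for i in range(max(0, bmu - radius), min(len(units), bmu + radius + 1)):
                units[i] = [w + rate * (a - w) for w, a in zip(units[i], x)]
    return units


def sotm(data_by_time, n_units, seed=0):
    """Return one trained 1-D array per time unit, in ascending time order."""
    rng = random.Random(seed)
    dim = len(data_by_time[0][0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    columns = []
    for data in data_by_time:
        units = train_1d_som(data, units)  # initialise from the previous time unit
        columns.append(units)
    return columns


# Toy data: two time units of two-dimensional points (made up).
data_by_time = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
columns = sotm(data_by_time, n_units=3)
```

Reading `columns` as a grid, the horizontal index is time and the vertical index is position on the one-dimensional map, which matches the two directions of topology preservation described in the abstract.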

    Discovery of Sanggenon G as a natural cell-permeable small-molecular weight inhibitor of X-linked inhibitor of apoptosis protein (XIAP)

    Defects in the regulation of apoptosis are one main cause of cancer development and may result from overexpression of anti-apoptotic proteins such as the X-linked inhibitor of apoptosis protein (XIAP). XIAP is frequently overexpressed in human leukemia and in prostate and breast tumors. Inhibition of apoptosis by XIAP is mainly coordinated through direct binding to the initiator caspase-9 via its baculovirus-IAP-repeat-3 (BIR3) domain. XIAP inhibits caspases directly, making it an attractive target for anti-cancer therapy. In the search for novel, non-peptidic XIAP inhibitors, this study focused on the chemical constituents of sāng bái pí (mulberry root bark). The most promising candidates from this plant were tested biochemically in vitro by a fluorescence polarization (FP) assay and in vivo via protein fragment complementation analysis (PCA). We identified the Diels-Alder adduct Sanggenon G (SG1) as a novel, small-molecular weight inhibitor of XIAP. As shown by FP and PCA analyses, SG1 binds specifically to the BIR3 domain of XIAP with a binding affinity of 34.26 μM. Treatment of the transgenic leukemia cell line Molt3/XIAP with SG1 enhances caspase-8, -3 and -9 cleavage, displaces caspase-9 from XIAP as determined by immunoprecipitation experiments, and sensitizes these cells to etoposide-induced apoptosis. SG1 sensitizes not only the XIAP-overexpressing leukemia cell line Molt3/XIAP to etoposide treatment but also different neuroblastoma cell lines endogenously expressing high XIAP levels. Taken together, Sanggenon G (SG1) is a novel, natural, non-peptidic, small-molecular inhibitor of XIAP that can serve as a starting point to develop a new class of improved XIAP inhibitors.

    Recursive self-organizing map as a contractive iterative function system

    Recently, there has been considerable research activity in extending topographic maps of vectorial data to more general data structures, such as sequences or trees. However, the representational capabilities and internal representations of the models are not well understood. We rigorously analyze a generalization of the Self-Organizing Map (SOM) for processing sequential data, Recursive SOM (RecSOM [1]), as a non-autonomous dynamical system consisting of a set of fixed input maps. We show that contractive fixed input maps are likely to produce Markovian organizations of receptive fields of the RecSOM map. We derive bounds on parameter β (weighting the importance of importing past information when processing sequences) under which contractiveness of the fixed input maps is guaranteed.
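
For orientation, the role of β can be written out using the RecSOM formulation as it is commonly stated in the literature. The abstract itself gives no formulas, so the notation below (input weights w_i, context weights c_i, input weighting α, N units, contraction coefficient ρ) is an assumption, and the specific bounds on β derived in the paper are not reproduced here.

```latex
% Unit i compares the current input s(t) with w_i and the previous map
% activation y(t-1) with c_i; beta weights the contribution of the past.
d_i(t) = \alpha \,\lVert s(t) - w_i \rVert^2 + \beta \,\lVert y(t-1) - c_i \rVert^2,
\qquad
y_i(t) = \exp\bigl(-d_i(t)\bigr)

% Fixing the input s gives a map on activation vectors (a "fixed input map"):
F_s(y) = \Bigl( \exp\bigl(-\alpha \lVert s - w_i \rVert^2 - \beta \lVert y - c_i \rVert^2\bigr) \Bigr)_{i=1}^{N}

% F_s is contractive if, for some 0 <= rho < 1 and all y, y':
\lVert F_s(y) - F_s(y') \rVert \le \rho \,\lVert y - y' \rVert
```

Intuitively, a small β makes each F_s depend only weakly on the previous activation, which is the regime in which contractiveness, and hence the Markovian organization mentioned above, is to be expected.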