96 research outputs found

    Squeezing the ensemble pruning: Faster and more accurate categorization for news portals

    Recent studies show that ensemble pruning works as effectively as a traditional ensemble of classifiers (EoC). In this study, we analyze how ensemble pruning can improve text categorization efficiency in time-critical real-life applications such as news portals. The two most crucial phases of text categorization are training classifiers and assigning labels to new documents; the latter matters more for the efficiency of such applications. We conduct experiments on ensemble pruning-based news article categorization to measure its accuracy and time cost. The results show that our heuristics reduce the time cost of the second phase. With appropriate pruning degrees, we can also balance accuracy against time cost and improve both. © 2012 Springer-Verlag Berlin Heidelberg

    Ensemble pruning for text categorization based on data partitioning

    Ensemble methods can improve effectiveness in text categorization. Because of the computational cost of ensemble approaches, ensembles need to be pruned. In this work we study ensemble pruning based on data partitioning. We use a rank-based pruning approach: base classifiers are ranked and pruned according to their accuracies on a separate validation set. We employ four data partitioning methods with four machine learning categorization algorithms. Our main aim is to examine ensemble pruning in text categorization. We conduct experiments on two text collections: Reuters-21578 and BilCat-TRT. We show that we can prune 90% of ensemble members with almost no decrease in accuracy. We also demonstrate that ensemble pruning can increase accuracy over traditional ensembling. © 2011 Springer-Verlag Berlin Heidelberg
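The rank-based pruning described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy keyword classifiers, the four-document validation set, the `keep_fraction` parameter, and the majority-vote combiner are all assumptions made for illustration.

```python
from collections import Counter


def accuracy(classifier, samples):
    """Fraction of (document, label) pairs the classifier labels correctly."""
    correct = sum(1 for doc, label in samples if classifier(doc) == label)
    return correct / len(samples)


def prune_by_rank(classifiers, validation, keep_fraction):
    """Rank classifiers by validation accuracy and keep the top fraction."""
    ranked = sorted(classifiers,
                    key=lambda clf: accuracy(clf, validation),
                    reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))  # always keep at least one
    return ranked[:keep]


def vote(ensemble, doc):
    """Combine the remaining members by simple majority vote (an assumption)."""
    return Counter(clf(doc) for clf in ensemble).most_common(1)[0][0]


# Hypothetical validation set and base classifiers, for illustration only.
validation = [
    ("stocks fall on markets", "economy"),
    ("team wins the match", "sports"),
    ("bank raises rates", "economy"),
    ("striker scores goal", "sports"),
]


def by_keyword(doc):
    return "sports" if any(w in doc for w in ("match", "goal")) else "economy"


def always_economy(doc):
    return "economy"


def always_sports(doc):
    return "sports"


def by_length(doc):
    return "sports" if len(doc.split()) == 3 else "economy"


ensemble = [always_economy, by_length, by_keyword, always_sports]
pruned = prune_by_rank(ensemble, validation, keep_fraction=0.25)
print([clf.__name__ for clf in pruned])
print(vote(pruned, "late goal decides the match"))
```

Only the kept members are evaluated when labeling a new document, which is where pruning saves time. Pruning 90% of the members, as reported above, corresponds to `keep_fraction=0.1` in this sketch.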

    Discovering story chains: A framework based on zigzagged search and news actors

    A story chain is a set of related news articles that reveal how different events are connected. This study presents a framework for discovering story chains, given an input document, in a text collection. The framework has 3 complementary parts that i) scan the collection, ii) measure the similarity between chain-member candidates and the chain, and iii) measure similarity among news articles. For scanning, we apply a novel text-mining method that uses a zigzagged search that reinvestigates past documents based on the updated chain. We also utilize social networks of news actors to reveal connections among news articles. We conduct 2 user studies in terms of 4 effectiveness measures—relevance, coverage, coherence, and ability to disclose relations. The first user study compares several versions of the framework, by varying parameters, to set a guideline for use. The second compares the framework with 3 baselines. The results show that our method provides statistically significant improvement in effectiveness in 61% of pairwise comparisons, with medium or large effect size; in the remainder, none of the baselines significantly outperforms our method. © 2017 ASIS&T

    A model study of measurement intellectual capital in Turkey and an application

    In this study, a model was created to measure the intellectual capital, in short the intangible assets, of firms in the national industry. In this model, intellectual capital is defined in terms of human capital, organizational capital, and relation capital. Using this model, the relationship between the market/book value ratios of businesses in Turkey and their intellectual capital was examined. A Likert-type survey was used as the research method. The results show a positive and strong relationship between firms' human capital and relation capital and their market values. They also show a positive, strong correlation between a firm's organizational capital and its human and relation capital. Keywords: Intellectual capital, intangible assets, assessing knowledge assets.
    The purpose of this study is to define the elements of intellectual capital of firms in Turkey and to empirically investigate the relationship between intellectual capital and the market value of firms listed on the Istanbul Stock Exchange. For the research, an intellectual capital measurement model is created and four hypotheses are defined. Hypothesis (1): There will be a positive relationship between human capital and the market/book value. Hypothesis (2): There will be a positive relationship between relation capital and the market/book value. Hypothesis (3a): There will be a positive relationship between human capital and structural capital. Hypothesis (3b): There will be a positive relationship between relation capital and structural capital. To test the hypotheses, two survey studies were carried out: a pre-survey with a questionnaire of about 71 items and a main survey with about 21 items. Both questionnaires used a 7-point Likert scale. To analyze the results, the following statistical tests were run: Cronbach's alpha test for reliability, principal components analysis, linear regression, and partial least squares. The main conclusions are that human capital and relation capital have a positive relationship with the market/book value of firms in Turkey, and that structural capital is correlated with human and relation capital. Keywords: Intellectual capital, intangible assets, assessing knowledge assets

    Developing a text categorization template for Turkish news portals

    In news portals, text category information is needed for news presentation. However, for many news stories the category information is unavailable, incorrectly assigned, or too generic. This makes text categorization a necessary tool for news portals. Automated text categorization (ATC) is a multifaceted and difficult process that involves decisions about parameter tuning, term weighting, word stemming, word stopping, and feature selection. In this study we aim to find a categorization setup that provides highly accurate results in ATC for Turkish news portals. We also examine other aspects, such as the effect of training dataset size and robustness issues. Two Turkish test collections with different characteristics are created using Bilkent News Portal. Experiments are conducted with four classification methods: C4.5, KNN, Naive Bayes, and SVM (using polynomial and RBF kernels). Our results recommend a text categorization template for Turkish news portals and provide some pointers for future research. © 2011 IEEE

    Leukocyte-Endothelium Interaction in the Sublingual Microcirculation of Coronary Artery Bypass Grafting Patients

    Objective: The aim of this study was to apply an innovative methodology to incident dark-field (IDF) imaging in coronary artery bypass grafting (CABG) patients for the identification and quantification of rolling leukocytes along the sublingual microcirculatory endothelium. Methods: This study was a post hoc analysis of a prospective study that evaluated the perioperative course of the sublingual microcirculation in CABG patients. Video images were captured using IDF imaging following the induction of anesthesia (T-0) and cardiopulmonary bypass (CPB) (T-1) in 10 patients. Rolling leukocytes were identified and quantified using frame averaging, which is a technique that was developed for correctly identifying leukocytes. Results: The number of rolling leukocytes increased significantly from T-0 (7.5 [6.4-9.1] leukocytes/capillary-postcapillary venule/4 s) to T-1 (14.8 [13.2-15.5] leukocytes/capillary-postcapillary venule/4 s) (p < 0.0001). A significant increase in systemic leukocyte count was also detected from 7.4 ± 0.9 × 10^9/L (preoperative) to 12.4 ± 4.4 × 10^9/L (postoperative) (p < 0.01). Conclusion: The ability to directly visualize leukocyte-endothelium interaction using IDF imaging facilitates the diagnosis of a systemic inflammatory response after CPB via the identification of rolling leukocytes. Integration of the frame averaging algorithm into the software of handheld vital microscopes may enable the use of microcirculatory leukocyte count as a real-time parameter at the bedside.

    MicroTools enables automated quantification of capillary density and red blood cell velocity in handheld vital microscopy

    Direct assessment of capillary perfusion has been prioritized in hemodynamic management of critically ill patients in addition to optimizing blood flow on the global scale. Sublingual handheld vital microscopy has enabled online acquisition of moving image sequences of the microcirculation, including the flow of individual red blood cells in the capillary network. However, because of the inherent complexity of the content, manual image sequence analysis has remained the gold standard, introducing inter-observer variability and precluding real-time image analysis for clinical therapy guidance. Here we introduce an advanced computer vision algorithm for instantaneous analysis and quantification of morphometric and kinetic information related to capillary blood flow in the sublingual microcirculation. We evaluated this technique in a porcine model of septic shock and resuscitation and in cardiac surgery patients. This development is of high clinical relevance because it enables implementation of point-of-care goal-directed resuscitation procedures based on correction of microcirculatory perfusion in critically ill and perioperative patients.

    Intensive Care Unit Admission Parameters Improve the Accuracy of Operative Mortality Predictive Models in Cardiac Surgery

    BACKGROUND: Operative mortality risk in cardiac surgery is usually assessed using preoperative risk models. However, intraoperative factors may change the risk profile of the patients, and parameters at admission to the intensive care unit may be relevant in determining the operative mortality. This study investigates the association between a number of parameters at admission to the intensive care unit and the operative mortality, and tests the hypothesis that including these parameters in the preoperative risk models may increase the accuracy of prediction of the operative mortality. METHODOLOGY: 929 adult patients who underwent cardiac surgery were enrolled in the study. The preoperative risk profile was assessed using the logistic EuroSCORE and the ACEF score. A number of parameters recorded at admission to the intensive care unit were explored for univariate and multivariable association with the operative mortality. PRINCIPAL FINDINGS: A heart rate higher than 120 beats per minute and a blood lactate value higher than 4 mmol/L at admission to the intensive care unit were independent predictors of operative mortality, with odds ratios of 6.7 and 13.4, respectively. Including these parameters in the logistic EuroSCORE and the ACEF score increased their accuracy (area under the curve 0.85 to 0.88 for the logistic EuroSCORE and 0.81 to 0.86 for the ACEF score). CONCLUSIONS: A double-stage assessment of operative mortality risk provides a higher accuracy of prediction. Elevated blood lactate and tachycardia reflect a condition of inadequate cardiac output. Their inclusion in the assessment of the severity of the clinical condition after cardiac surgery may offer a useful tool to introduce more sophisticated hemodynamic monitoring techniques. Comparison between the predicted operative mortality risk before and after the operation may offer an assessment of the operative performance.
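To make the reported odds ratios concrete, the sketch below shows the standard arithmetic for applying an odds ratio to a baseline risk: convert the probability to odds, multiply, and convert back. It is not the study's risk model. The 2% baseline risk is a hypothetical value chosen for illustration, and multiplying several odds ratios together assumes they come from the same multivariable logistic model.

```python
def apply_odds_ratios(baseline_risk, odds_ratios):
    """Return the risk after scaling the baseline odds by each odds ratio."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    odds = baseline_risk / (1.0 - baseline_risk)  # probability -> odds
    for ratio in odds_ratios:
        odds *= ratio                             # odds ratios act multiplicatively
    return odds / (1.0 + odds)                    # odds -> probability


# Hypothetical preoperative risk of 2%; the odds ratios are those in the abstract.
baseline = 0.02
risk_tachycardia = apply_odds_ratios(baseline, [6.7])
risk_both = apply_odds_ratios(baseline, [6.7, 13.4])
print(f"heart rate > 120 bpm only: {risk_tachycardia:.3f}")
print(f"plus lactate > 4 mmol/L:   {risk_both:.3f}")
```

An odds ratio scales odds, not probabilities, so an odds ratio of 6.7 does not mean the risk itself is multiplied by 6.7.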