205 research outputs found

    Enhancing the Performance of Text Mining

    The amount of text data produced in science, finance, social media, and medicine is growing at an unprecedented pace. Raw text data typically poses major computational and analytical obstacles (e.g., extremely high dimensionality) for data mining and machine learning algorithms. The growth in the size of text data also makes searching more difficult for information retrieval systems, which struggle to return results relevant to users’ search queries. Moreover, the availability of text data in different languages creates the need for new methods of analyzing multilingual topics, to help policymakers in governmental and health systems make risk decisions and create policies that respond to public health crises, natural disasters, and political or social movements. The goal of this thesis is to develop new methods that handle computational and analytical problems for complex high-dimensional text data, to develop a new query expansion approach that enhances the performance of information retrieval systems, and to present new techniques for analyzing multilingual topics using a translation service. First, in the field of dimensionality reduction, we develop a new method for detecting and eliminating domain-based words. We use three different datasets and five classifiers to test and evaluate the performance of our new approach before and after eliminating domain-based words, and we compare it with other feature selection methods. We find that the new approach improves the performance of the binary classifier and reduces the dimensionality of the feature space by 90%. It also reduces the execution time of the classifier and outperforms one of the feature selection methods.
Second, in the field of information retrieval, we design and implement a method that integrates words from a current stream with external data sources in order to predict the occurrence of relevant words that have not yet appeared in the primary source. This algorithm enables the construction of new queries that effectively capture emergent events that a user may not have anticipated when initiating the data collection stream. The added value of external data sources appears when we have a stream of data and want to predict something that has not yet happened, rather than relying only on the stream, which is limited to the information available at a specific time. We compare the performance of our approach with two alternative approaches. The first approach (static) expands user queries with words extracted from a probabilistic topic model of the stream. The second approach (emergent) reinforces user queries with emergent words extracted from the stream. We find that our method outperforms the alternative approaches, exhibiting particularly good results in identifying future emergent topics. Third, in the field of multilingual text, we present a strategy to analyze the similarity between multilingual topics in English and Arabic tweets surrounding the 2020 COVID-19 pandemic. We make a descriptive comparison between topics in Arabic and English tweets about COVID-19 using tweets collected in the same way and filtered using the same keywords. We analyze the Twitter discussion to understand the evolution of topics over time and reveal topic similarity among tweets across the datasets. We use probabilistic topic modeling to identify and extract the key topics of the discussion in Arabic and English tweets. We use two methods to analyze the similarity between multilingual topics. The first method (full-text topic modeling approach) translates all text to English and then runs topic modeling to find similar topics.
The second method (term-based topic modeling approach) runs topic modeling on the text before translation and then translates the top keywords in each topic to find similar topics. We find similar topics related to the COVID-19 pandemic covered in English and Arabic tweets for certain time intervals. Results indicate that the term-based topic modeling approach can reduce the cost compared to the full-text topic modeling approach and still yield comparable results in finding similar topics. The computational time needed to translate the terms is significantly lower than that needed to translate the full text.
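
The term-based approach described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say which similarity measure or translation service was used, so the Jaccard overlap, the `threshold` value, the toy lookup table `TOY_TRANSLATIONS`, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a translation service: a tiny Arabic-to-English lookup.
TOY_TRANSLATIONS: Dict[str, str] = {
    "لقاح": "vaccine",
    "فيروس": "virus",
    "حظر": "lockdown",
    "مدرسة": "school",
}


def translate_terms(terms: List[str]) -> List[str]:
    """Translate only the top keywords of a topic; unknown words pass through."""
    return [TOY_TRANSLATIONS.get(term, term) for term in terms]


def jaccard(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of two keyword lists (an assumed similarity measure)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def match_topics(
    english_topics: List[List[str]],
    arabic_topics: List[List[str]],
    threshold: float = 0.3,  # assumed cut-off, chosen only for illustration
) -> List[Tuple[int, int, float]]:
    """Return (english_index, arabic_index, score) for pairs at or above threshold."""
    matches = []
    for i, en_terms in enumerate(english_topics):
        for j, ar_terms in enumerate(arabic_topics):
            score = jaccard(en_terms, translate_terms(ar_terms))
            if score >= threshold:
                matches.append((i, j, score))
    return matches


# Toy topics: each topic is represented by its top keywords only.
english_topics = [["vaccine", "virus", "trial"], ["school", "lockdown", "online"]]
arabic_topics = [["لقاح", "فيروس"], ["مدرسة", "حظر"]]
matches = match_topics(english_topics, arabic_topics)
```

Only the handful of top keywords per topic goes through `translate_terms`, which is the source of the cost saving the abstract reports relative to translating every tweet before topic modeling.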

    Popliteal aneurysms: a 10-year experience

    Background: Popliteal aneurysms account for 70% of peripheral arterial aneurysms and, if untreated, pose a serious threat to the affected limb. Debate continues about the best form of treatment, especially for asymptomatic lesions. Methods: We reviewed the computer records and charts of patients seen at this department with a diagnosis of popliteal aneurysm over the last 10 years. Patients who had not been seen within the last year were followed up through their G.P. Results: Twenty-four patients (M 23/F 1) presented with 40 popliteal aneurysms. The mean age was 63.5±9 years. Symptoms were present in 23 of the affected limbs while 17 were asymptomatic. Thirty were treated surgically and 10 followed with regular ultrasound. The mean diameter of the repaired aneurysms was 3.3±1 cm. Aneurysms <2 cm were more likely to be asymptomatic. No limbs were lost in patients undergoing elective repair of popliteal aneurysms. The secondary patency and limb salvage rates at 3 years were 84% and 96% respectively. Conservative management of asymptomatic lesions <2 cm was not complicated by the development of symptoms. Conclusions: Elective repair of popliteal aneurysms by exclusion and bypass is a safe, effective and durable technique. Small asymptomatic lesions can be safely managed with close follow-up.

    Reply


    Pasireotide Long-Acting Release Treatment for Diabetic Cats with Underlying Hypersomatotropism

    BACKGROUND: Long‐term medical management of hypersomatotropism (HS) in cats has proved unrewarding. Pasireotide, a novel somatostatin analogue, decreases serum insulin‐like growth factor 1 (IGF‐1) and improves insulin sensitivity in cats with HS when administered as a short‐acting preparation. OBJECTIVES: Assess once‐monthly administration of long‐acting pasireotide (pasireotide LAR) for treatment of cats with HS. ANIMALS: Fourteen cats with HS, diagnosed based on diabetes mellitus, pituitary enlargement, and serum IGF‐1 > 1000 ng/mL. METHODS: Uncontrolled, prospective cohort study. Cats received pasireotide LAR (6–8 mg/kg SC) once monthly for 6 months. Fructosamine and IGF‐1 concentrations, and 12‐hour blood glucose curves (BGCs) were assessed at baseline and then monthly. Product of fructosamine concentration and insulin dose was calculated as an indicator of insulin resistance (Insulin Resistance Index). Linear mixed‐effects modeling assessed for significant change in fructosamine, IGF‐1, mean blood glucose (MBG) of BGCs, insulin dose (U/kg) and Insulin Resistance Index. RESULTS: Eight cats completed the trial. Three cats entered diabetic remission. Median IGF‐1 (baseline: 1962 ng/mL [range 1051–2000 ng/mL]; month 6: 1253 ng/mL [524–1987 ng/mL]; P < .001) and median Insulin Resistance Index (baseline: 812 μmolU/L kg [173–3565 μmolU/L kg]; month 6: 135 μmolU/L kg [0–443 μmolU/L kg]; P = .001) decreased significantly. No significant change was found in mean fructosamine (baseline: 494 ± 127 μmol/L; month 6: 319 ± 113.3 μmol/L; P = .07) or MBG (baseline: 347.7 ± 111.0 mg/dL; month 6: 319.5 ± 113.3 mg/dL; P = .11), despite a significant decrease in median insulin dose (baseline: 1.5 [0.4–5.2] U/kg; 6 months: 0.3 [0.0–1.4] U/kg; P < .001). Adverse events included diarrhea (n = 11), hypoglycemia (n = 5), and worsening polyphagia (n = 2).
CONCLUSIONS AND CLINICAL IMPORTANCE: Pasireotide LAR is the first drug to show potential as a long‐term management option for cats with HS.
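
The Insulin Resistance Index defined in this abstract is the product of fructosamine concentration and insulin dose. A minimal sketch of that arithmetic follows; the function name and the input values are hypothetical and chosen only to illustrate the calculation, not taken from the study's data.

```python
def insulin_resistance_index(
    fructosamine_umol_per_l: float, insulin_dose_u_per_kg: float
) -> float:
    """Product of fructosamine (umol/L) and insulin dose (U/kg), as the abstract defines it."""
    if fructosamine_umol_per_l < 0 or insulin_dose_u_per_kg < 0:
        raise ValueError("inputs must be non-negative")
    return fructosamine_umol_per_l * insulin_dose_u_per_kg


# Hypothetical cat: fructosamine 500 umol/L on 1.5 U/kg of insulin.
example_index = insulin_resistance_index(500.0, 1.5)

# A cat that needs no insulin has an index of zero whatever its fructosamine.
remission_index = insulin_resistance_index(300.0, 0.0)
```

Because the index is a product, it falls to zero when the insulin dose is zero, which is consistent with the lower bound of 0 reported for the month 6 range alongside a minimum insulin dose of 0.0 U/kg.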

    Defining Natural History: Assessment of the Ability of College Students to Aid in Characterizing Clinical Progression of Niemann-Pick Disease, Type C

    Niemann-Pick Disease, type C (NPC) is a fatal, neurodegenerative, lysosomal storage disorder. It is a rare disease with a broad phenotypic spectrum and variable age of onset. These issues make it difficult to develop a universally accepted clinical outcome measure to assess urgently needed therapies. To this end, clinical investigators have defined emerging disease severity scales. The average time from initial symptom to diagnosis is approximately 4 years. Further, some patients may not travel to specialized clinical centers even after diagnosis. We were therefore interested in investigating whether appropriately trained, community-based assessment of patient records could assist in defining disease progression using clinical severity scores. In this study we developed a secure, stepwise process to show that pre-existing medical records may be correctly assessed by non-clinical practitioners trained to quantify disease progression. Sixty-four undergraduate students at the University of Notre Dame were expertly trained in clinical disease assessment and recognition of major and minor symptoms of NPC. Seven clinical records, randomly selected from a total of thirty-seven used to establish a leading clinical severity scale, were correctly assessed to show expected characteristics of linear disease progression. Student assessment of two new records donated by NPC families to our study also revealed linear progression of disease, but both showed accelerated disease progression, relative to the current severity scale, especially at the later stages. Together, these data suggest that college students may be trained in assessment of patient records, and thus provide insight into the natural history of a disease.