29 research outputs found

    Short Text Classification with Tolerance Near Sets

    Text classification is a classical machine learning application in Natural Language Processing, which aims to assign labels to textual units such as documents, sentences, paragraphs, and queries. Applications of text classification include sentiment classification and news categorization. Sentiment classification identifies the polarity of text, such as positive, negative, or neutral, based on textual features. In this thesis, we implemented a modified form of a tolerance-based algorithm (TSC) to classify sentiment polarities of tweets as well as news categories from text. The TSC algorithm is a supervised algorithm that was designed to perform short text classification with tolerance near sets (TNS). The proposed TSC algorithm uses pre-trained SBERT vectors to create tolerance classes. The effectiveness of the TSC algorithm has been demonstrated by testing it on ten well-researched datasets. One of the datasets (Covid-Sentiment) was hand-crafted from tweets expressing opinions related to COVID. Experiments demonstrate that TSC outperforms five classical ML algorithms on one dataset and is comparable on all other datasets, using a weighted F1-score measure. Master of Science in Applied Computer Science
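The abstract above reports results with a weighted F1-score. As a minimal sketch of that metric (not the thesis's own evaluation code, which is not shown here), the per-class F1 scores can be averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 (illustrative helper, not from the thesis)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 when the class is never hit
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Toy sentiment labels, assumed purely for illustration
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
print(weighted_f1(y_true, y_pred))
```

Weighting by support keeps a frequent class from being drowned out by rare ones, which matters for imbalanced sentiment datasets.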

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box. Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the tex

    2014 Annual Research Symposium Abstract Book

    2014 annual volume of abstracts for science research projects conducted by students at Trinity College

    SPICA: revealing the hearts of galaxies and forming planetary systems: approach and US contributions

    How did the diversity of galaxies we see in the modern Universe come to be? When and where did stars within them forge the heavy elements that give rise to the complex chemistry of life? How do planetary systems, the Universe's home for life, emerge from interstellar material? Answering these questions requires techniques that penetrate dust to reveal the detailed contents and processes in obscured regions. The ESA-JAXA Space Infrared Telescope for Cosmology and Astrophysics (SPICA) mission is designed for this, with a focus on sensitive spectroscopy in the 12 to 230 micron range. SPICA offers massive sensitivity improvements with its 2.5-meter primary mirror actively cooled to below 8 K. SPICA is one of 3 candidates for the ESA's Cosmic Visions M5 mission, and JAXA is committed to its portion of the collaboration. ESA will provide the silicon-carbide telescope, science instrument assembly, satellite integration and testing, and the spacecraft bus. JAXA will provide the passive and active cooling system (supporting the

    The Apertif Surveys: The First Six Months

    Apertif is a new phased-array feed for the Westerbork Synthesis Radio Telescope (WSRT), greatly increasing its field of view and turning it into a natural survey instrument. In July 2019, the Apertif legacy surveys commenced; these are a time-domain survey and a two-tiered imaging survey, with a shallow and medium-deep component. The time-domain survey searches for new (millisecond) pulsars and fast radio bursts (FRBs). The imaging surveys provide neutral hydrogen (HI), radio continuum and polarization data products. With a bandwidth of 300 MHz, Apertif can detect HI out to a redshift of 0.26. The key science goals to be accomplished by Apertif include localization of FRBs (including real-time public alerts), the role of environment and interaction on galaxy properties and gas removal, finding the smallest galaxies, connecting cold gas to AGN, understanding the faint radio population, and studying magnetic fields in galaxies. After a proprietary period, survey data products will be publicly available through the Apertif Long Term Archive (ALTA, https://alta.astron.nl). I will review the progress of the surveys and present the first results from the Apertif surveys, including highlighting the currently available public data
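The redshift limit quoted above follows from the standard relation between the rest frequency of the HI 21 cm line (about 1420.406 MHz) and the frequency at which it is observed. The sketch below works the arithmetic in both directions; the function names are illustrative, and the observed frequency it prints is derived only from the quoted z = 0.26, not taken from Apertif documentation.

```python
HI_REST_MHZ = 1420.40575  # rest frequency of the neutral hydrogen 21 cm line


def hi_redshift(observed_mhz):
    """Redshift at which the HI line appears at the given observed frequency."""
    return HI_REST_MHZ / observed_mhz - 1.0


def hi_observed_mhz(z):
    """Observed frequency of the HI line emitted at redshift z."""
    return HI_REST_MHZ / (1.0 + z)


# Frequency the receiver must reach to see HI at the quoted limit z = 0.26
low_edge = hi_observed_mhz(0.26)
print(f"HI at z = 0.26 is observed near {low_edge:.1f} MHz")
print(f"Round trip: z = {hi_redshift(low_edge):.2f}")
```

HI at z = 0.26 therefore lands roughly 293 MHz below the rest frequency, in line with the quoted 300 MHz bandwidth if the band extends upward to the rest-frame line.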