1,293 research outputs found

    SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation

    Text summarization is the task of condensing a document while keeping its relevant information. Integrated into wider information systems, it can help users access key information without having to read everything, which improves efficiency. In this research work, we have developed and evaluated a single-document extractive summarization approach, named SemPCA-Summarizer, which reduces the dimensionality of a document using Principal Component Analysis (PCA) enriched with semantic information. A concept-sentence matrix is built from the input document, and PCA is then used to identify and rank the relevant concepts, which are used to select the most important sentences through different heuristics, leading to various types of summaries. The results show that the generated summaries are very competitive, from both a quantitative and a qualitative viewpoint, indicating that the proposed approach is appropriate for briefly providing key information and thus helps users cope with a huge amount of available information more quickly and efficiently. This research work has been partially funded by the Generalitat Valenciana and the Spanish Government through the projects PROMETEOII/2014/001, TIN2015-65100-R, and TIN2015-65136-C2-2-R.
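
    The pipeline described in this abstract (concept-sentence matrix, PCA to rank concepts, sentence selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain word tokens stand in for semantic concepts, only the first principal component is used (found by power iteration), and the sentence-scoring heuristic and all function names are hypothetical.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption, not the paper's method)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def concept_sentence_matrix(sentences):
    """Build a matrix with one row per 'concept' and one column per sentence.

    Assumption: lowercase words longer than three letters stand in for the
    semantic concepts used in the paper.
    """
    tokens = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    concepts = sorted({w for ws in tokens for w in ws if len(w) > 3})
    matrix = [[ws.count(c) for ws in tokens] for c in concepts]
    return concepts, matrix


def first_principal_component(matrix, iterations=200):
    """Return the loadings of the first principal component over concepts.

    Sentences are treated as observations and concepts as variables; the
    leading eigenvector of the concept covariance matrix is approximated
    by power iteration.
    """
    n_sent = len(matrix[0])
    centered = [[v - sum(row) / n_sent for v in row] for row in matrix]
    cov = [
        [sum(a * b for a, b in zip(ri, rj)) / max(n_sent - 1, 1) for rj in centered]
        for ri in centered
    ]
    vec = [1.0 + 0.01 * i for i in range(len(matrix))]  # deterministic start
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return vec


def summarize(text, n_sentences=1):
    """Pick the sentences that carry the most heavily loaded concepts."""
    sentences = split_sentences(text)
    concepts, matrix = concept_sentence_matrix(sentences)
    if not concepts:
        return sentences[:n_sentences]
    weights = [abs(w) for w in first_principal_component(matrix)]
    scores = [
        sum(weights[i] * matrix[i][j] for i in range(len(concepts)))
        for j in range(len(sentences))
    ]
    top = sorted(range(len(sentences)), key=lambda j: scores[j], reverse=True)
    # Keep the selected sentences in their original document order.
    return [sentences[j] for j in sorted(top[:n_sentences])]


if __name__ == "__main__":
    doc = "Cats chase mice. Dogs chase cats and mice daily. Birds sing."
    print(summarize(doc, n_sentences=2))
```

    Scoring each sentence by the absolute loadings of its concepts is only one possible heuristic; the abstract mentions several, which lead to different types of summaries.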

    A Corpus Driven Computational Intelligence Framework for Deception Detection in Financial Text

    Financial fraud rampages onwards, seemingly uncontained. The annual cost of fraud in the UK is estimated to be as high as £193bn [1]. From a data science perspective, and in a direction hitherto less explored, this thesis demonstrates how linguistic features can drive data mining algorithms to help unravel fraud. To this end, the spotlight is turned on Financial Statement Fraud (FSF), known to be the costliest type of fraud [2]. A new corpus of 6.3 million words is composed of 102 annual reports/10-K (narrative sections) from firms formally indicted for FSF, juxtaposed with 306 non-fraud firms of similar size and industrial grouping. Unlike other similar studies, this thesis takes a wide-angled view and extracts a range of features of different categories from the corpus. These linguistic correlates of deception are uncovered using a variety of techniques and tools. Corpus linguistics methodology is applied to extract keywords and to examine linguistic structure. N-grams are extracted to draw out collocations. Readability measurement in financial text is advanced through the extraction of new indices that probe the text at a deeper level. Cognitive and perceptual processes are also picked out. Tone, intention and liquidity are gauged using customised word lists. Linguistic ratios are derived from grammatical constructs and word categories. An attempt is also made to determine ‘what’ was said as opposed to ‘how’. Further, a new module is developed to condense synonyms into concepts. Lastly, frequency counts of keywords unearthed in a previous content analysis study on financial narrative are also used. These features are then used to drive machine learning based classification and clustering algorithms to determine whether they aid in discriminating a fraud firm from a non-fraud firm. The battery of models built typically exceeds a classification accuracy of 70%. The above process is amalgamated into a framework.
The process outlined, driven by empirical data, demonstrates in a practical way how linguistic analysis could aid in fraud detection, and also constitutes a unique contribution to deception detection studies.

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold evolves through time, joint spatio-temporal modelling is needed in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova.

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published in Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Sentiment Analysis for Social Media

    Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academia and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection.

    Management, Technology and Learning for Individuals, Organisations and Society in Turbulent Environments

    This book presents the collection of fifty papers presented at the Second International Conference on BUSINESS SUSTAINABILITY 2011 - Management, Technology and Learning for Individuals, Organisations and Society in Turbulent Environments, held in Póvoa de Varzim, Portugal, from the 22nd to the 24th of June, 2011. The main motive of the meeting was the growing awareness of the importance of the sustainability issue. This importance has emerged from the growing uncertainty of market behaviour, which leads to the characterization of the market, i.e. the environment, as turbulent. The characterization of the environment as uncertain and turbulent reflects the fact that the traditional technocratic and/or socio-technical approaches cannot deal effectively and efficiently with the present situation. In other words, the rise of the sustainability issue means a quest for new instruments to deal with uncertainty and/or turbulence. The sustainability issue has a complex nature, and solutions are sought in a wide range of domains and instruments to achieve and manage it. The domains range from environmental sustainability (referring to the natural environment) through organisational and business sustainability to social sustainability. The instruments for sustainability range from traditional engineering and management methodologies to “soft” instruments such as knowledge, learning, and creativity. The papers in this book address virtually the whole sustainability problem space to a greater or lesser extent. However, although the uncertainty and/or turbulence, or in other words the dynamic properties, come from the coupling of management, technology, learning, individuals, organisations and society, meaning that everything is at the same time effect and cause, we wanted to put the emphasis on business, with the intention of addressing primarily companies and their businesses.
For this reason, the main title of the book is “Business Sustainability 2.0”, but with the approach of coupling Management, Technology and Learning for Individuals, Organisations and Society in Turbulent Environments. The notation “2.0” also promotes the publication as a step further from our previous publication, “Business Sustainability I”, as it would for a new version of software. The particularity of the Second International Conference on BUSINESS SUSTAINABILITY was that it served primarily as a learning environment, in which the papers published in this book were the ground for further individual and collective growth in the understanding and perception of sustainability and in the capacity for building new instruments for business sustainability. In that respect, the methodology of the conference work was basically dialogical, meaning that it promoted dialogue on the papers while also including formal paper presentations. In this way, the conference presented a rich space for satisfying the needs of different authors and participants. Additionally, to promote the widest global learning environment and participation, in accordance with the Conference's assumed mission to promote Proactive Generative Collaborative Learning, the Conference Organisation makes openly available to the community the papers presented in this book, as well as the papers presented at the previous Conference(s). These papers can be accessed from the conference webpage (http://labve.dps.uminho.pt/bs11). In these terms, this book could also be understood as a complementary instrument for the Conference authors and participants, and also for the wider readership interested in sustainability issues. The book brought together 107 authors from 11 countries, namely Australia, Belgium, Brazil, Canada, France, Germany, Italy, Portugal, Serbia, Switzerland, and the United States of America.
The authors ranged from senior and renowned scientists to young researchers, providing a rich learning environment. Finally, the editors hope that this book will be useful, meeting the expectations of the authors and the wider readership, enhancing individual and collective learning, and encouraging further scientific development and the creation of new papers. The editors would also like to use this opportunity to announce their intention to continue with new editions of the conference and subsequent editions of accompanying books on the subject of BUSINESS SUSTAINABILITY, the third of which is planned for the year 2013.

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and a changing environment, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies, such as swarm intelligence as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the use of CI for optimally solving various applications, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies makes a strong and reliable prediction tool for handling real-life applications.

    From approximative to descriptive fuzzy models
