
    Schema-Driven Actionable Insight Generation and Smart Recommendation

    In natural language generation (NLG), insight mining is seen as a data-to-text task, where data is mined for interesting patterns and verbalised into 'insight' statements. An 'over-generate and rank' paradigm is intuitively used to generate such insights. The multidimensionality and subjectivity of this process make it challenging. This paper introduces a schema-driven method to generate actionable insights from data to drive growth and change. It also introduces a technique to rank the insights to align with user interests based on their feedback. We show preliminary qualitative results of the insights generated using our technique and demonstrate its ability to adapt to feedback.
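
    As a minimal sketch of the 'over-generate and rank' paradigm described above, the snippet below enumerates candidate insights from simple change patterns and ranks them with feedback-adjusted topic weights. The schema templates, thresholds, and weight-update rule are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of an "over-generate and rank" insight pipeline.
# Schema templates, thresholds, and feedback weights are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Insight:
    text: str
    topic: str
    strength: float  # magnitude of the detected pattern

def over_generate(metrics):
    """Generate candidate insights from per-metric value series."""
    candidates = []
    for name, values in metrics.items():
        if len(values) < 2:
            continue
        change = (values[-1] - values[0]) / (abs(values[0]) + 1e-9)
        if change > 0.1:
            candidates.append(Insight(f"{name} grew by {change:.0%}", name, abs(change)))
        elif change < -0.1:
            candidates.append(Insight(f"{name} fell by {abs(change):.0%}", name, abs(change)))
    return candidates

def rank(candidates, topic_weights):
    """Rank candidates by pattern strength weighted by learned user interest."""
    return sorted(candidates,
                  key=lambda i: i.strength * topic_weights.get(i.topic, 1.0),
                  reverse=True)

def update_weights(topic_weights, topic, liked):
    """Nudge a topic weight up or down based on explicit user feedback."""
    topic_weights[topic] = topic_weights.get(topic, 1.0) * (1.2 if liked else 0.8)

metrics = {"revenue": [100, 130], "churn": [5.0, 4.1], "signups": [50, 51]}
weights = {}
top = rank(over_generate(metrics), weights)
print([i.text for i in top])
update_weights(weights, "churn", liked=True)  # user marks the churn insight useful
```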

    Beyond FEV1 in COPD: a review of patient-reported outcomes and their measurement

    Patients with chronic obstructive pulmonary disease (COPD) present with a variety of symptoms and pathological consequences. Although primarily viewed as a respiratory disease, COPD has both pulmonary and extrapulmonary effects, which have an impact on many aspects of physical, emotional, and mental well-being. Traditional assessment of COPD relies heavily on measuring lung function, specifically forced expiratory volume in 1 second (FEV1). However, the evidence suggests that FEV1 is a relatively poor correlate of symptoms such as breathlessness and of the impact of COPD on daily life. Furthermore, many consequences of the disease, including anxiety, depression, and the ability to perform daily activities, can only be described and reported reliably by the patient. Thus, in order to provide a comprehensive view of the effects of interventions in clinical trials, it is essential that spirometry is accompanied by assessments using patient-reported outcome (PRO) instruments. We provide an overview of patient-reported outcome concepts in COPD, such as breathlessness, physical functioning, and health status, and evaluate the tools used for measuring these concepts. Particular attention is given to the newly developed instruments emerging in response to recent regulatory guidelines for the development and use of PROs in clinical trials. We conclude that although data from the development and validation of these new PRO instruments are emerging, it takes many years to build the body of evidence that supports the use of a new instrument. Furthermore, new instruments do not necessarily have better discriminative or evaluative properties than older instruments. The development of new PRO tools, however, is crucial, not only to ensure that key COPD concepts are being reliably measured but also that the relevant treatment effects are being captured in clinical trials. In turn, this will help us to better understand the patient's experience of the disease.

    Neural Scoring of Logical Inferences from Data using Feedback

    Insights derived from wearable sensors in smartwatches or sleep trackers can help users approach their healthy lifestyle goals. These insights should indicate significant inferences from user behaviour, and their generation should adapt automatically to the preferences and goals of the user. In this paper, we propose a neural network model that generates personalised lifestyle insights based on a model of their significance and feedback from the user. Simulated analysis of our model shows its ability to assign high scores to a) insights with statistically significant behaviour patterns and b) topics related to simple or complex user preferences at any given time. We believe that the proposed neural network model could be adapted for any application that needs user feedback to score logical inferences from data.
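
    The sketch below illustrates, under simplifying assumptions, the idea of scoring candidate insights with a learned model and updating it from explicit user feedback. The feature set, the logistic scorer, and the update rule are illustrative stand-ins, not the paper's actual network or significance model.

```python
# A minimal sketch: score candidate lifestyle insights with a small learned
# model and adjust it from thumbs-up/down feedback. Features and update rule
# are illustrative assumptions.
import numpy as np

class FeedbackScorer:
    """Logistic scorer over simple insight features (e.g. statistical
    significance, effect size, match with the user's stated goals)."""

    def __init__(self, n_features, lr=0.1):
        # Mild prior: every feature starts as weakly indicative of relevance.
        self.w = np.full(n_features, 0.1)
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        # Higher score = insight more likely to be surfaced to the user.
        return 1.0 / (1.0 + np.exp(-(self.w @ x + self.b)))

    def feedback(self, x, liked):
        # One SGD step on the log-loss for a binary feedback signal.
        err = self.score(x) - (1.0 if liked else 0.0)
        self.w -= self.lr * err * x
        self.b -= self.lr * err

# Features: [statistical significance, effect size, topic match with user goal]
scorer = FeedbackScorer(n_features=3)
insights = {
    "You slept 40 min less on weekdays": np.array([0.9, 0.7, 1.0]),
    "Step count unchanged this week":    np.array([0.1, 0.05, 0.3]),
}
ranked = sorted(insights, key=lambda k: scorer.score(insights[k]), reverse=True)
print(ranked)
scorer.feedback(insights[ranked[0]], liked=True)  # user confirms the top insight
```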

    Emotions in context: examining pervasive affective sensing systems, applications, and analyses

    Pervasive sensing has opened up new opportunities for measuring our feelings and understanding our behavior by monitoring our affective states while mobile. This review paper surveys pervasive affect sensing by examining three major elements of affective pervasive systems: “sensing”, “analysis”, and “application”. Sensing investigates the different sensing modalities used in existing real-time affective applications; Analysis explores different approaches to emotion recognition and visualization based on different types of collected data; and Application investigates the leading areas of affective applications. For each of these three aspects, the paper includes an extensive survey of the literature and finally outlines some of the challenges and future research opportunities of affective sensing in the context of pervasive computing.

    Scalable and Weakly Supervised Bank Transaction Classification

    This paper aims to categorize bank transactions using weak supervision, natural language processing, and deep neural network techniques. Our approach minimizes the reliance on expensive and difficult-to-obtain manual annotations by leveraging heuristics and domain knowledge to train accurate transaction classifiers. We present an effective and scalable end-to-end data pipeline, including data preprocessing, transaction text embedding, anchoring, label generation, discriminative neural network training, and an overview of the system architecture. We demonstrate the effectiveness of our method by showing it outperforms existing market-leading solutions, achieves accurate categorization, and can be quickly extended to novel and composite use cases. This can, in turn, unlock many financial applications such as financial health reporting and credit risk assessment.
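
    As a minimal sketch of the weak-supervision idea described above, the snippet below uses keyword heuristics as labeling functions, takes their noisy votes as training labels, and fits a text classifier that generalises beyond the rules. The heuristics and categories are invented for illustration, and TF-IDF plus logistic regression stands in for the paper's embedding and deep-network pipeline.

```python
# Weakly supervised transaction classification sketch: heuristic labeling
# functions -> noisy labels -> discriminative text classifier.
# Keywords, categories, and the TF-IDF/logistic-regression model are
# illustrative stand-ins, not the paper's actual pipeline.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

LABELING_FUNCTIONS = [
    lambda t: "groceries" if any(k in t.lower() for k in ("tesco", "aldi", "grocery")) else None,
    lambda t: "transport" if any(k in t.lower() for k in ("uber", "shell", "fuel")) else None,
    lambda t: "salary" if "payroll" in t.lower() else None,
]

def weak_label(text):
    """Majority vote over the labeling functions; None means all abstain."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS if lf(text)]
    return Counter(votes).most_common(1)[0][0] if votes else None

transactions = [
    "TESCO STORES 2311 LONDON", "UBER *TRIP HELP.UBER.COM",
    "ACME LTD PAYROLL MAR", "SHELL FUEL STATION 42", "ALDI 774 LEEDS",
]
labelled = [(t, weak_label(t)) for t in transactions if weak_label(t)]
texts, labels = zip(*labelled)

# Discriminative model trained on the heuristic labels; character n-grams
# cope with the noisy, abbreviated transaction strings.
clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["ALDI 991 MANCHESTER", "UBER TRIP 0423"]))
```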

    Data analytics and algorithms in policing in England and Wales: Towards a new policy framework

    RUSI was commissioned by the Centre for Data Ethics and Innovation (CDEI) to conduct an independent study into the use of data analytics by police forces in England and Wales, with a focus on algorithmic bias. The primary purpose of the project is to inform CDEI’s review of bias in algorithmic decision-making, which is focusing on four sectors, including policing, and working towards a draft framework for the ethical development and deployment of data analytics tools for policing. This paper focuses on advanced algorithms used by the police to derive insights, inform operational decision-making or make predictions. Biometric technology, including live facial recognition, DNA analysis and fingerprint matching, is outside the direct scope of this study, as are covert surveillance capabilities and digital forensics technology, such as mobile phone data extraction and computer forensics. However, because many of the policy issues discussed in this paper stem from general underlying data protection and human rights frameworks, these issues will also be relevant to other police technologies, and their use must be considered in parallel to the tools examined in this paper.

    The project involved engaging closely with senior police officers, government officials, academics, legal experts, regulatory and oversight bodies and civil society organisations. Sixty-nine participants took part in the research in the form of semi-structured interviews, focus groups and roundtable discussions. The project has revealed widespread concern across the UK law enforcement community regarding the lack of official national guidance for the use of algorithms in policing, with respondents suggesting that this gap should be addressed as a matter of urgency.

    Any future policy framework should be principles-based and complement existing police guidance in a ‘tech-agnostic’ way. Rather than establishing prescriptive rules and standards for different data technologies, the framework should establish standardised processes to ensure that data analytics projects follow recommended routes for the empirical evaluation of algorithms within their operational context and are evaluated against legal requirements and ethical standards. The new guidance should focus on ensuring multi-disciplinary legal, ethical and operational input from the outset of a police technology project; a standard process for model development, testing and evaluation; a clear focus on the human–machine interaction and the ultimate interventions a data-driven process may inform; and ongoing tracking and mitigation of discrimination risk.

    Text mining with sentiment analysis on seafarers' medical documents

    Digital health systems contain large amounts of patient records, doctor notes, and prescriptions in text format. Summarising this information from electronic clinical records can lead to improved quality of healthcare, fewer medical errors, and lower costs. Moreover, seafarers are more vulnerable to accidents and prone to health hazards because of work culture, climatic changes, and personal habits. Text mining of seafarers' medical documents can therefore generate better knowledge of the medical issues that often occur on board. Medical records were collected from the digital health systems of Centro Internazionale Radio Medico (C.I.R.M.), which is an Italian Telemedical Maritime Assistance System (TMAS). Three years (2018–2020) of patient data were used for the analysis. Both lexicon-based and Naive Bayes algorithms were adopted to perform sentiment analysis, and experiments were conducted with the R statistical tool. Symptomatic information was visualized through word clouds, and a 96% correlation between medical problems and diagnosis outcomes was achieved. We validate the sentiment analysis with more than 80% accuracy and precision.
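
    The sketch below shows the two sentiment approaches the abstract mentions, a lexicon score and a Naive Bayes classifier, on toy notes. The lexicon entries, example notes, and labels are invented for illustration, and Python with scikit-learn stands in for the study's R workflow on C.I.R.M. records.

```python
# Lexicon-based scoring and Naive Bayes sentiment classification on toy
# clinical-style notes. Lexicon, notes, and labels are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

LEXICON = {"improved": 1, "stable": 1, "recovered": 1,
           "pain": -1, "fever": -1, "severe": -1, "worsening": -1}

def lexicon_score(text):
    """Sum of word polarities; > 0 positive, < 0 negative."""
    return sum(LEXICON.get(w, 0) for w in text.lower().split())

# Tiny labelled sample standing in for annotated medical notes.
notes = ["patient recovered and is stable", "severe chest pain and fever",
         "condition improved after treatment", "worsening cough with fever"]
labels = ["positive", "negative", "positive", "negative"]

nb = make_pipeline(CountVectorizer(), MultinomialNB())
nb.fit(notes, labels)

new_note = "patient improved and recovered after treatment"
print(lexicon_score(new_note), nb.predict([new_note])[0])
```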

    Big data analytics: a predictive analysis applied to cybersecurity in a financial organization

    Project Work presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business Intelligence. With the generalization of internet access, cyber attacks have registered an alarming growth in the frequency and severity of the damage they cause, along with a growing awareness among organizations that invest heavily in cybersecurity, such as those in the financial sector. This work focuses on an organization’s financial service that operates on the international markets in the payment systems industry. The objective was to develop a predictive framework responsible for threat detection, supporting the security team in opening investigations on intrusive server requests over the exponentially growing log events collected by the SIEM from the Apache Web Servers for the financial service. A Big Data framework, using Hadoop and Spark, was developed to perform classification tasks over the financial service requests, using Neural Networks, Logistic Regression, SVM, and Random Forests algorithms, while handling the training of the imbalanced dataset through BEV. The analysis registered the best scoring performance for the Random Forests classifier using all the preprocessed features available. Using all the available worker nodes with a balanced configuration of the Spark executors, the best elapsed times for loading and preprocessing the data were achieved with the column-oriented ORC native format, while the row-oriented CSV format performed best for training the classifiers.
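
    As a minimal PySpark sketch of the classification step described above, the snippet below loads preprocessed request features from ORC, trains a Random Forest, and evaluates it. The path, column names, and parameters are hypothetical placeholders, not the project's actual schema, imbalance handling (BEV), or cluster configuration.

```python
# PySpark sketch: Random Forest classification over preprocessed SIEM/Apache
# request features stored in ORC. Path, columns, and parameters are
# illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

spark = SparkSession.builder.appName("threat-detection").getOrCreate()

# Hypothetical path; assumes numeric, already-preprocessed feature columns
# plus a "label" column marking intrusive requests.
df = spark.read.orc("hdfs:///siem/apache_requests_features.orc")
feature_cols = [c for c in df.columns if c != "label"]

assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=100)
pipeline = Pipeline(stages=[assembler, rf])

train, test = df.randomSplit([0.8, 0.2], seed=42)
model = pipeline.fit(train)
predictions = model.transform(test)

evaluator = MulticlassClassificationEvaluator(labelCol="label",
                                              predictionCol="prediction",
                                              metricName="f1")
print("F1:", evaluator.evaluate(predictions))
```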
    • 

    corecore