    A Deep Learning Architecture for Sentiment Analysis

    The impressive results of deep convolutional neural networks in computer vision and image analysis have recently attracted considerable attention from researchers in other application domains as well. In this paper we present NgramCNN, a neural network architecture we designed for sentiment analysis of long text documents. It uses pretrained word embeddings for dense feature representation and a very simple single-layer classifier. The complexity is encapsulated in the feature extraction and selection parts, which benefit from the effectiveness of convolution and pooling layers. For evaluation we used different kinds of emotional text datasets and achieved an accuracy of 91.2% on the popular IMDB movie reviews. NgramCNN is more accurate than the similar shallow convolutional networks and deeper recurrent networks that were used as baselines. In the future, we intend to generalize the architecture to achieve state-of-the-art results in sentiment analysis of variable-length texts.

    Towards intelligent diabetes knowledge management and knowledge discovery : a data mining approach

    Self-monitoring and self-management play an increasingly vital role in the management of prevalent diseases afflicting millions of people worldwide. Conditions such as cardiovascular disease or diabetes mellitus can be managed with pharmaceuticals. However, lifestyle factors and behaviour modifications also play a crucial part in controlling outcomes. If carried out effectively, self-monitoring and management techniques can benefit patients from a health point of view, empowering them to take control of their disease, and also support the health sector economically. A recent report showed that almost four fifths of the NHS diabetes budget was spent on managing preventable complications. Although the approach is applicable to any medical condition with lifestyle control elements, this thesis uses diabetes mellitus as a recurring theme to highlight the research conducted. Diabetes is a rapidly growing epidemic disease with global implications, impacting humans socially and economically. By 2025 it is estimated that five million people in the UK will be diagnosed with diabetes. This research will therefore be relevant for a large proportion of the population. Knowledge Management (KM) has proven to be a valuable approach to sharing knowledge and providing users with the information necessary to help self-manage their symptoms. However, KM has not yet been applied sufficiently to support the growing number of diabetics in the UK. In this thesis, KM is merged with Knowledge Discovery (KD) to close that gap and address the specific needs of the diabetic population. The integrated framework is implemented using data mining techniques within the proposed e-Toolkit to elicit useful knowledge encountered by patients regardless of their disease, such as adverse drug reactions. The knowledge is then disseminated through the proposed modified SECI Model for knowledge creation via the e-Toolkit. The second part of this research investigates which patient data it is necessary to disseminate to healthcare professionals. This helps to bridge any communication gap that may exist between patients and healthcare professionals. In theory, the e-Toolkit will provide patients with one place to record every important health factor, whilst simultaneously providing the medical team with real-time monitoring. This enables doctors and patients to work together to find effective ways to reduce the damaging effects of this disease, including determining the common side effects through medication reviews. Keywords: Diabetes, Knowledge Management, Knowledge Discovery, British National Health Service Web System, Doctor, Patient, Data Mining

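    Políticas de Copyright de Publicações Científicas em Repositórios Institucionais: O Caso do INESC TEC [Copyright Policies of Scientific Publications in Institutional Repositories: The Case of INESC TEC]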

    The progressive transformation of scientific practices, driven by the development of new Information and Communication Technologies (ICT), has made it possible to increase access to information, gradually moving towards an opening of the research cycle. In the long term, this will make it possible to resolve a difficulty that researchers have faced: the existence of barriers, whether geographical or financial, that limit access. Although scientific production is dominated mostly by large commercial publishers and is subject to the rules they impose, the Open Access Movement, whose first public declaration, the Budapest Declaration (BOAI), dates from 2002, proposes significant changes that benefit authors and readers. This Movement has been gaining importance in Portugal since 2003, with the creation of the first institutional repository at the national level. Institutional repositories emerged as a tool for disseminating the scientific production of an institution, with the aim of opening up research results, both before publication and peer review (preprint) and after (postprint), and consequently increasing the visibility of the work carried out by a researcher and the respective institution. The study presented, which analysed the copyright policies of the most relevant scientific publications of INESC TEC, showed not only that publishers are increasingly adopting policies that allow the self-archiving of publications in institutional repositories, but also that much awareness-raising work remains to be done, not only among researchers but also within the institution and society as a whole. The production of a set of recommendations, which includes the implementation of an institutional policy that encourages the self-archiving in the repository of publications developed within the institution, serves as a starting point for a greater appreciation of the scientific production of INESC TEC.

    Derivation of forest inventory parameters from high-resolution satellite imagery for the Thunkel area, Northern Mongolia. A comparative study on various satellite sensors and data analysis techniques.

    With the demise of the Soviet Union and the transition to a market economy starting in the 1990s, Mongolia has been experiencing dramatic changes resulting in social and economic disparities and an increasing strain on its natural resources. The situation is exacerbated by a changing climate, the erosion of forestry-related administrative structures, and a lack of law enforcement activities. Mongolia’s forests have been afflicted with a dramatic increase in degradation due to human and natural impacts such as overexploitation and wildfire occurrences. In addition, forest management practices are far from being sustainable. In order to provide useful information on how to viably and effectively utilise the forest resources in the future, the gathering and analysis of forest-related data is pivotal. Although a National Forest Inventory was conducted in 2016, very little reliable and scientifically substantiated information exists at a regional or even local level. This lack of detailed information warranted a study performed in the Thunkel taiga area in 2017 in cooperation with the GIZ. In this context, we hypothesise that (i) tree species and composition can be identified utilising aerial imagery, (ii) tree height can be extracted from the resulting canopy height model with accuracies commensurate with field survey measurements, and (iii) high-resolution satellite imagery is suitable for the extraction of tree species, the number of trees, and the upscaling of timber volume and basal area based on the spectral properties. The outcomes of this study illustrate quite clearly the potential of employing UAV imagery for tree height extraction (R² of 0.9) as well as for species and crown diameter determination. However, in a few instances, the visual interpretation of the aerial photographs was determined to be superior to the computer-aided automatic extraction of forest attributes. In addition, imagery from various satellite sensors (e.g. Sentinel-2, RapidEye, WorldView-2) proved to be excellently suited for the delineation of burned areas and the assessment of tree vigour. Furthermore, recently developed sophisticated classification approaches such as Support Vector Machines and Random Forest appear to be well suited to tree species discrimination (Overall Accuracy of 89%). Object-based classification approaches appear to be highly suitable for very high-resolution imagery; at medium scale, however, pixel-based classifiers outperformed them. It is also suggested that high radiometric resolution has the potential to easily compensate for the lack of spatial detectability in the imagery. Quite surprising was the occurrence of dark taiga species in the riparian areas, beyond their natural habitat range. The presented results matrix and the interpretation key have been devised as a decision tool and/or a vademecum for practitioners. In consideration of future projects and to facilitate the improvement of the forest inventory database, the establishment of permanent sampling plots in the Mongolian taigas is strongly advised.