    Term Clustering of Syntactic Phrases

    Term clustering and syntactic phrase formation are methods for transforming natural language text. Both have had only mixed success as strategies for improving the quality of text representations for document retrieval. Since the strengths of these methods are complementary, we have explored combining them to produce superior representations. In this paper we discuss our implementation of a syntactic phrase generator, as well as our preliminary experiments with producing phrase clusters. These experiments show small improvements in retrieval effectiveness resulting from the use of phrase clusters, but it is clear that corpora much larger than standard information retrieval test collections will be required to thoroughly evaluate this technique.

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
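
As a rough illustration of the first class mentioned in the abstract, the sketch below builds a small term-document matrix and compares documents by cosine similarity. It is not taken from the survey: the toy documents, the whitespace tokenizer, the raw-count weighting, and the function names are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] is the count of term i in document j."""
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def document_vector(matrix, j):
    """Extract column j of the matrix, i.e. the vector for document j."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: documents 0 and 1 share two terms, document 2 shares none.
docs = ["cats chase mice", "dogs chase cats", "stocks fell sharply"]
vocab, matrix = term_document_matrix(docs)
sim_01 = cosine(document_vector(matrix, 0), document_vector(matrix, 1))
sim_02 = cosine(document_vector(matrix, 0), document_vector(matrix, 2))
```

Documents that share terms end up with a higher cosine than documents that share none. Practical systems typically replace the raw counts with a weighting scheme and use sparse matrices, which this sketch omits.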

    Order-theoretical ranking

    A Rule-based Methodology and Feature-based Methodology for Effect Relation Extraction in Chinese Unstructured Text

    The Chinese language differs significantly from English in both lexical representation and grammatical structure. These differences give rise to problems in Chinese Natural Language Processing (NLP), such as word segmentation and flexible syntactic structure. Many conventional NLP methods developed for English text have been shown to be ineffective at handling these language-specific problems in Chinese NLP, which started later. Relation Extraction (RE) is an area of NLP that seeks to identify semantic relationships between entities in text. The term “Effect Relation” (ER) is introduced in this research to refer to a specific type of relationship between two entities, in which one entity has a certain “effect” on the other. In this research project, a case study on Chinese text from Traditional Chinese Medicine (TCM) journal publications is built to closely examine the forms of Effect Relation in this text domain. The case study targets the effect of a prescription or herb in the treatment of a disease, symptom or body part. A rule-based methodology is introduced in this thesis. It utilises predetermined rules and templates derived from the characteristics and patterns observed in the dataset. This methodology achieves an F-score of 0.85 in its Named Entity Recognition (NER) module, 0.79 in its Semantic Relationship Extraction (SRE) module, and an overall performance of 0.46. A second methodology, taking a feature-based approach, is also introduced in this thesis. It views the RE task as a classification problem and utilises a mathematical classification model with features consisting of contextual information and rules. It achieves F-scores of 0.73 (NER) and 0.88 (SRE), and an overall performance of 0.41. The role of functional words in the contemporary Chinese language, and in relation to the ERs in this research, is explored. Functional words have been found to be effective in detecting complex-structure ER entities when used as rules in the rule-based methodology.

    Automatic Classification of Diatoms: A Contour- and Geometry-Based Approach

    Development of a Chatbot for Clinical Support

    Integrated master’s dissertation in Biomedical Engineering (area of specialization in Medical Informatics). Recent advances in artificial intelligence and data processing technologies have radically changed the healthcare industry, giving rise to digital solutions that promise to make clinical processes more efficient, less expensive and of higher quality. Health professionals face limited resources every day and cannot personally monitor and support all of their patients, so it is becoming increasingly important to develop valid and reliable alternatives that can help patients in the shortest possible time. One of the most widely adopted solutions to this problem is the development of dialogue systems, commonly called Chatbots, which can answer clinical questions, take on the role of a virtual assistant, and so bridge the communication gap between patients and health professionals. This thesis focuses on proposing a Chatbot architecture for psychiatric counseling and elderly monitoring that combines methodologies for emotion recognition and intent understanding. The proposed system is called YEC, an acronym for “Your Everyday Companion”, and represents a hybrid approach that integrates natural language processing (NLP) techniques with an encoder-decoder Deep Learning (DL) model to generate an appropriate response. The system is designed to treat each user differently, which allows it to infer the user’s emotional state and to establish a high degree of trust and closeness. To show that the proposed approach is a viable practical solution, an initial phase of the system’s conversational engine was implemented. The implementation demonstrates how using different recurrent neural networks in the model leads to different performance results for the Chatbot, making it possible to infer which one is most suitable. This work showed the potential of combining DL and NLP for conversational modeling and their capability to help solve some real-life problems in healthcare, although more research is needed to substantiate this claim.

    Context-awareness for adaptive information retrieval systems

    Philosophiae Doctor - PhD. This research study investigates how information retrieval systems (IRS) can be optimized to rank results by relevance to individual information needs. The research addressed the development of algorithms that optimize the ranking of documents retrieved from IRS. In this thesis, we present two aspects of context-awareness in information retrieval (IR). The first is the design of the context of information: the context of a query determines the relevance of the retrieved information, so executing the same query in different contexts often leads to different result rankings. The second is that the relevant context aspects should be incorporated in a way that supports the knowledge domain representing users’ interests. Evolutionary algorithms are used to improve the effectiveness of IRS. A context-based information retrieval system is developed, and its retrieval effectiveness is evaluated using precision and recall metrics. The results demonstrate how attributes from user interaction behaviour can be used to improve IR effectiveness.
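
The abstract names precision and recall as its evaluation metrics but gives no further detail. The sketch below shows the standard set-based definitions only. The document identifiers and the function name are hypothetical, and nothing here reflects the thesis's actual evaluation setup.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets so the function never divides by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: four documents returned, three judged relevant, two in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved. These set-based measures ignore ranking order, so a study focused on result rankings would likely also need rank-aware variants.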

    Evolution of Relations in Temporal Partite Topic Graphs

    This work develops a model that represents relations among topics detected in different time windows as a topic graph. Varying and shifting the observation period depicts relationships between topics at different levels of complexity, taking the importance of each topic into account. The evolutionary life cycles of a topic, as well as changes in thematic relations, become visible. Detected topics can also be assigned to known events.