116 research outputs found
Ontologies and Information Extraction
This report argues that, even in the simplest cases, IE is an ontology-driven
process. It is not a mere text filtering method based on simple pattern
matching and keywords, because the extracted pieces of texts are interpreted
with respect to a predefined partial domain model. This report shows that
depending on the nature and the depth of the interpretation to be done for
extracting the information, more or less knowledge must be involved. The
report draws its illustrations mainly from biology, a domain with a critical
need for content-based exploration of the scientific literature and one that
is becoming a major application domain for IE.
Computational explorations of semantic cognition
Motivated by the widespread use of distributional models of semantics within the cognitive science community, we follow a computational modelling approach in order to better understand and expand the applicability of such models, as well as to test potential ways in which they can be improved and extended. We review evidence in favour of the assumption that distributional models capture important aspects of semantic cognition. We look at the models’ ability to account for behavioural data and fMRI patterns of brain activity, and investigate the structure of model-based, semantic networks. We test whether introducing affective information, obtained from a neural network model designed to predict emojis from co-occurring text, can improve the performance of linguistic and linguistic-visual models of semantics, in accounting for similarity/relatedness ratings. We find that adding visual and affective representations improves performance, especially for concrete and abstract words, respectively. We describe a processing model based on distributional semantics, in which activation spreads throughout a semantic network, as dictated by the patterns of semantic similarity between words. We show that the activation profile of the network, measured at various time points, can account for response time and accuracies in lexical and semantic decision tasks, as well as for concreteness/imageability and similarity/relatedness ratings. We evaluate the differences between concrete and abstract words, in terms of the structure of the semantic networks derived from distributional models of semantics. We examine how the structure is related to a number of factors that have been argued to differ between concrete and abstract words, namely imageability, age of acquisition, hedonic valence, contextual diversity, and semantic diversity. 
We use distributional models to explore factors that might be responsible for the poor linguistic performance of children suffering from Developmental Language Disorder. Based on the assumption that certain model parameters can be given a psychological interpretation, we start from “healthy” models, and generate “lesioned” models, by manipulating the parameters. This allows us to determine the importance of each factor, and their effects with respect to learning concrete vs abstract words.
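The processing model described in this abstract, in which activation spreads through a semantic network according to the similarity between words, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy word vectors, the `decay` parameter, and the function names `build_network` and `spread` are all assumptions made for illustration, and cosine similarity stands in for whatever similarity measure the thesis actually uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_network(vectors):
    """Build a weighted network from word vectors (hypothetical helper).

    Edge weights are cosine similarities with negatives clipped to zero,
    no self-loops, and each word's outgoing weights normalised to sum to 1.
    """
    words = list(vectors)
    net = {}
    for w in words:
        sims = {o: max(0.0, cosine(vectors[w], vectors[o]))
                for o in words if o != w}
        total = sum(sims.values())
        net[w] = {o: (s / total if total else 0.0) for o, s in sims.items()}
    return net

def spread(net, source, steps=3, decay=0.5):
    """Spread activation from `source`; return the profile at each time step.

    At every step a node keeps a (1 - decay) share of its activation and
    passes the remaining share to its neighbours in proportion to edge weight.
    """
    act = {w: 0.0 for w in net}
    act[source] = 1.0
    history = [dict(act)]
    for _ in range(steps):
        new = {w: 0.0 for w in net}
        for w, a in act.items():
            new[w] += (1 - decay) * a
            for o, weight in net[w].items():
                new[o] += decay * a * weight
        act = new
        history.append(dict(act))
    return history

# Toy distributional vectors (invented for this sketch, not real embeddings).
toy_vectors = {
    "dog": [1.0, 0.9, 0.1],
    "cat": [0.9, 1.0, 0.1],
    "idea": [0.1, 0.1, 1.0],
}
network = build_network(toy_vectors)
history = spread(network, "dog", steps=3)
for t, profile in enumerate(history):
    print(t, {w: round(a, 3) for w, a in profile.items()})
```

In this sketch, the activation profile at each time step is the quantity that a model of this kind would compare against behavioural measures such as response times; words more similar to the source receive activation sooner.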
A deep learning framework for contingent liabilities risk management : predicting Brazilian labor court decisions
Estimating the likely outcome of a litigation process is crucial for many organizations. A specific application is “Contingent Liabilities”, which are liabilities that may or may not occur depending on the result of a pending litigation process (lawsuit). The traditional methodology for estimating this likelihood relies on a lawyer's opinion, drawn from experience and based on a qualitative assessment. This dissertation presents a mathematical modeling framework based on a Deep Learning architecture that estimates the probability of the outcome of a litigation process (accepted or not accepted), with a particular application to Contingent Liabilities. The framework offers a degree of confidence by expressing how likely an event is to occur in terms of probability, and it provides results in seconds. Besides the primary outcome, it returns a sample of the cases most similar to the estimated lawsuit, which can support the design of litigation strategies. We tested our framework on two litigation process databases: (1) the European Court of Human Rights (ECHR) and (2) the Brazilian 4th regional labor court (4TRT). Our framework achieved, to our knowledge, the best published performance (precision = 0.906) on the ECHR database, a widely used collection of litigation processes, and it is the first to be applied to a Brazilian labor court. Results show that the framework is a suitable alternative to the traditional method, in which lawyers estimate the outcome of pending litigation. Finally, we validated our results with experts, who confirmed the promising possibilities of the framework.
We encourage academics to continue developing research on mathematical modeling in the legal area, as it is an emerging topic with a promising future, and we encourage practitioners to use tools such as the one proposed, as they provide substantial advantages in accuracy and speed over conventional methods.
Domain-Specific Knowledge Acquisition for Conceptual Sentence Analysis
The availability of on-line corpora is rapidly changing the field of natural language processing (NLP) from one dominated by theoretical models of often very specific linguistic phenomena to one guided by computational models that simultaneously account for a wide variety of phenomena that occur in real-world text. Thus far, among the best-performing and most robust systems for reading and summarizing large amounts of real-world text are knowledge-based natural language systems. These systems rely heavily on domain-specific, handcrafted knowledge to handle the myriad syntactic, semantic, and pragmatic ambiguities that pervade virtually all aspects of sentence analysis. Not surprisingly, however, generating this knowledge for new domains is time-consuming, difficult, and error-prone, and requires the expertise of computational linguists familiar with the underlying NLP system. This thesis presents Kenmore, a general framework for domain-specific knowledge acquisition for conceptual sentence analysis. To ease the acquisition of knowledge in new domains, Kenmore exploits an on-line corpus using symbolic machine learning techniques and robust sentence analysis while requiring only minimal human intervention. Unlike most approaches to knowledge acquisition for natural language systems, the framework uniformly addresses a range of subproblems in sentence analysis, each of which had traditionally required a separate computational mechanism. The thesis presents the results of using Kenmore with corpora from two real-world domains (1) to perform part-of-speech tagging, semantic feature tagging, and concept tagging of all open-class words in the corpus; (2) to acquire heuristics for part-of-speech disambiguation, semantic feature disambiguation, and concept activation; and (3) to find the antecedents of relative pronouns.
Towards Nootropia : a non-linear approach to adaptive document filtering
In recent years, it has become increasingly difficult for users to find relevant information within the accessible glut. Research in Information Filtering (IF) tackles this problem through a tailored representation of the user's interests, a user profile. Traditionally, IF inherits techniques from the related and better established domains of Information Retrieval and Text Categorisation. These include linear profile representations, which exclude term dependencies and may only effectively represent a single topic of interest, and linear learning algorithms, which achieve a steady pace of profile adaptation. We argue that these practices are not attuned to the dynamic nature of user interests. A user may be interested in more than one topic in parallel, and both frequent variations and occasional radical changes of interests are inevitable over time. With our experimental system "Nootropia", we achieve adaptive document filtering with a single, multi-topic user profile. A hierarchical term network, which takes into account topical and lexical correlations between terms and identifies topic-subtopic relations between them, is used to represent a user's multiple topics of interest and distinguish between them. A series of non-linear document evaluation functions is then established on the hierarchical network. Experiments using a variation of TREC's routing subtask, which test the ability of a single profile to represent two and three topics of interest, reveal the approach's superiority over a linear profile representation. Adaptation of this single, multi-topic profile to a variety of changes in the user's interests is achieved through a process of self-organisation that constantly readjusts the profile structurally in response to user feedback. We used virtual users and another variation of TREC's routing subtask to test the profile on two learning and two forgetting tasks.
The results clearly indicate the profile's ability to adapt to both frequent variations and radical changes in user interests.
Proceedings of LOAIT '07 : II Workshop on Legal Ontologies and Artificial Intelligence Techniques
Proceedings of the 2nd Workshop on Legal Ontologies and Artificial Intelligence Techniques, June 4th, 2007, Stanford University
Semantic adaptability for the systems interoperability
In the current global and competitive business context, it is essential that enterprises adapt their knowledge resources in order to interact and collaborate smoothly with others. However, because of the multiculturalism of people and enterprises, business processes and products are represented from different views, even within the same domain. Consequently, one of the main problems in the interoperability between enterprise systems and applications is related to semantics. The integration and sharing of enterprise knowledge to build a common lexicon plays an important role in the semantic adaptability of information systems. The author proposes a framework to support the development of systems that manage the dynamic resolution of semantic adaptability. It allows different organisations to participate in building a common knowledge base while maintaining their own views of the domain, without compromising the integration between them. Thus, systems are able to become aware of new knowledge, learn from it, and manage their semantic interoperability in a dynamic and adaptable way. The author endorses the vision that, in the near future, the semantic adaptability skills of enterprise systems will boost collaboration between enterprises and the emergence of new business opportunities.