Automatic extraction of knowledge from web documents
A large amount of the digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents, drawn from multiple sources, in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system, which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. This knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper.
Machine learning: techniques and foundations
The field of machine learning studies computational methods for acquiring new knowledge, new skills, and new ways to organize existing knowledge. In this paper we present some of the basic techniques and principles that underlie AI research on learning, including methods for learning from examples, learning in problem solving, learning by analogy, grammar acquisition, and machine discovery. In each case, we illustrate the techniques with paradigmatic examples.
Formally analysing the concepts of domestic violence.
The types of police inquiries performed these days are incredibly diverse. Data processing architectures are often not suited to cope with this diversity, since most of the case data is still stored as unstructured text. In this paper Formal Concept Analysis (FCA) is showcased for its exploratory data analysis capabilities in discovering domestic violence intelligence from a dataset of unstructured police reports filed with the regional police Amsterdam-Amstelland in the Netherlands. From this data analysis it is shown that FCA can be a powerful instrument to operationally improve policing practice. For one, it is shown that the definition of domestic violence employed by the police is not always as clear as it should be, making it hard to use effectively for classification purposes. In addition, this paper presents newly discovered knowledge for automatically classifying certain cases as either domestic or non-domestic violence. Moreover, it provides practical advice for detecting incorrect classifications performed by police officers. A final aspect to be discussed is the problems encountered because of the sometimes unstructured way of working of police officers. The added value of this paper resides both in using FCA for exploratory data analysis and in applying FCA to the detection of domestic violence.
Keywords: Formal concept analysis (FCA); Domestic violence; Knowledge discovery in databases; Text mining; Exploratory data analysis; Knowledge enrichment; Concept discovery
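As a rough illustration of what FCA computes, the sketch below derives every formal concept (a maximal set of objects paired with the attributes they all share) from a tiny formal context. The report identifiers, the attribute names, and the function `formal_concepts` are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context.

    `context` maps each object to the set of attributes it has.
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of object attribute sets; the full
    # attribute set is the intersection over no objects at all.
    intents = {all_attrs}
    for attrs in context.values():
        attrs = frozenset(attrs)
        intents |= {attrs & intent for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that carries all attributes of the intent.
        extent = frozenset(o for o, a in context.items() if intent <= a)
        concepts.append((extent, intent))
    return concepts


# Hypothetical toy context: four reports and three made-up attributes.
reports = {
    "r1": {"suspect_is_partner", "shared_address", "labelled_domestic"},
    "r2": {"suspect_is_partner", "labelled_domestic"},
    "r3": {"shared_address"},
    "r4": {"suspect_is_partner", "shared_address"},
}

concepts = formal_concepts(reports)
for extent, intent in sorted(concepts, key=lambda c: len(c[1])):
    print(sorted(extent), "share", sorted(intent))
```

In this toy context, one concept groups `r1` and `r4` under the shared attributes `suspect_is_partner` and `shared_address`, although only `r1` carries the label. Inspecting concepts of this kind is one way an analyst might surface reports whose classification deserves a second look.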
The domain of authority
If the commands of authority are peremptory and content-independent directives, it is a great puzzle why any rational autonomous agent should accept them as morally binding, as Robert Paul Wolff and others have argued. I analyse the peremptory and content-independent quality of authoritative directives and argue that all earthly authorities operate within a specified domain. I investigate three candidates for the role of universally applicable boundary conditions: morality, harm to self, and absurdity. I conclude that commands are authoritative only when intra vires, i.e. issued within the proper domain of the authority. Wolff's challenge is not met, but it is shown to be less forbidding.
Discovering Significant Topics from Legal Decisions with Selective Inference
We propose and evaluate an automated pipeline for discovering significant
topics from legal decision texts by passing features synthesized with topic
models through penalised regressions and post-selection significance tests. The
method identifies case topics significantly correlated with outcomes,
topic-word distributions which can be manually-interpreted to gain insights
about significant topics, and case-topic weights which can be used to identify
representative cases for each topic. We demonstrate the method on a new dataset
of domain name disputes and a canonical dataset of European Court of Human
Rights violation cases. Topic models based on latent semantic analysis as well
as language model embeddings are evaluated. We show that topics derived by the
pipeline are consistent with legal doctrines in both areas and can be useful in
other related legal analysis tasks.
Comment: This is an accepted manuscript of work forthcoming in PhilTrans A.
Please cite the publisher's version onl