Textual Stylistic Variation: Choices, Genres and Individuals
This chapter argues for more informed target metrics for the statistical processing of stylistic variation in text collections. Much as operationalized relevance proved a useful goal to strive for in information retrieval, research in textual stylistics, whether application oriented or philologically inclined, needs goals formulated in terms of pertinence, relevance, and utility, notions that agree with reader experience of text. Differences readers are aware of are mostly based on utility, not on textual characteristics per se. Mostly, readers report stylistic differences in terms of genres. Genres, while vague and undefined, are well-established and talked about: very early on, readers learn to distinguish genres. This chapter discusses variation given by genre, and contrasts it to variation occasioned by individual choice.
Beyond information extraction: The role of ontology in military report processing
Information extraction tools like SMES transform natural language into a formal representation, e.g. into feature structures. In doing so, these tools exploit and apply linguistic knowledge about the syntactic and morphological regularities of the language used. However, they apply semantic and pragmatic knowledge only partially at best. Automatic processing of military reports has to result in a visualization of the report's content on a map as well as in an update of the underlying database, so that the common operational picture can be updated. Normally, however, the information provided by the result of the information extraction is not explicit enough for visualization processes and database insertions. This is because the reports themselves are elliptical, ambiguous, and vague. In order to overcome this obstacle, the situational context, and thus semantic and pragmatic aspects, has to be taken into account.
In the paper at hand, we present a system that uses an ontological module to integrate semantic and pragmatic knowledge. The result of the completion contains all the specifications needed to visualize the report's content on a map and to update the database.
Introspective physicalism as an approach to the science of consciousness
Most theories of consciousness are based on vague speculations about the properties of conscious experience. We aim to provide a more solid basis for a science of consciousness. We argue that a theory of consciousness should provide an account of the very processes that allow us to acquire and use information about our own mental states, that is, the processes underlying introspection. This can be achieved through the construction of information processing models that can account for Type-C processes. Type-C processes can be specified experimentally by identifying paradigms in which awareness of the stimulus is necessary for an intentional action. The Shallice (1988b) framework is put forward as providing an initial account of Type-C processes, which can relate perceptual consciousness to consciously performed actions. Further, we suggest that this framework may be refined through the investigation of the functions of prefrontal cortex. The formulation of our approach requires us to consider fundamental conceptual and methodological issues associated with consciousness. The most significant of these issues concerns the scientific use of introspective evidence. We outline and justify a conservative methodological approach to the use of introspective evidence, with attention to the difficulties historically associated with its use in psychology.
Scalable DB+IR technology: processing Probabilistic Datalog with HySpirit
Probabilistic Datalog (PDatalog, proposed in 1995) is a probabilistic variant of Datalog and a nice conceptual idea to model Information Retrieval in a logical, rule-based programming paradigm. Making PDatalog work in real-world applications requires more than probabilistic facts and rules, and the semantics associated with the evaluation of the programs. We report in this paper some of the key features of the HySpirit system required to scale the execution of PDatalog programs.
Firstly, there is the requirement to express probability estimation in PDatalog. Secondly, fuzzy-like predicates are required to model vague predicates (e.g. vague match of attributes such as age or price). Thirdly, to handle large data sets there are scalability issues to be addressed, and therefore HySpirit provides probabilistic relational indexes as well as parallel and distributed processing. The main contribution of this paper is a consolidated view of the methods the HySpirit system uses to make PDatalog applicable in real-scale applications that involve a wide range of requirements typical of data (information) management and analysis.
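To make the idea of a vague match on an attribute such as price more concrete, the following is a minimal Python sketch, not HySpirit or PDatalog syntax. It assumes a triangular membership function, and the names `vague_match`, `rank_by_price`, and the `tolerance` parameter are hypothetical, chosen only for illustration.

```python
def vague_match(value, target, tolerance):
    """Degree in [0, 1] to which `value` roughly equals `target`.

    Assumption: a triangular membership function that is 1 at the target
    and falls linearly to 0 once the distance reaches `tolerance`.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    distance = abs(value - target)
    return max(0.0, 1.0 - distance / tolerance)


def rank_by_price(items, target_price, tolerance):
    """Score (name, price) pairs by vague price match, best first.

    Items whose degree of match is 0 are dropped from the result.
    """
    scored = [(vague_match(price, target_price, tolerance), name)
              for name, price in items]
    return sorted(((s, n) for s, n in scored if s > 0), reverse=True)


if __name__ == "__main__":
    offers = [("a", 100), ("b", 110), ("c", 150)]
    print(rank_by_price(offers, target_price=100, tolerance=20))
```

A graded score of this kind can be read as the weight attached to a fact, which is the role a fuzzy-like predicate would play alongside probabilistic facts and rules; how HySpirit actually defines such predicates is not stated in the abstract.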
Vagueness as Cost Reduction: An Empirical Test
This work was funded in part by an EPSRC Platform Grant awarded to the NLG group at Aberdeen.
Scheduling with Fuzzy Methods
Nowadays, manufacturing industries -- driven by fierce competition and rising
customer requirements -- are forced to produce a broader range of individual
products of rising quality at the same (or preferably lower) cost. Meeting
these demands implies an even more complex production process and thus also a
correspondingly harder scheduling task. To make matters worse, vagueness of
scheduling parameters -- such as times and conditions -- is often inherent in
the production process. In addition, the search for an optimal schedule
normally leads to very difficult problems (NP-hard problems in the complexity
theoretical sense), which cannot be solved efficiently. With the intent to
minimize these problems, the introduced heuristic method combines standard
scheduling methods with fuzzy methods to get a nearly optimal schedule within
an appropriate time while taking vagueness adequately into account.
Dynamic laser speckle and fuzzy mathematical morphology applied to studies of chemotaxis towards hydrocarbons
The movement of microorganisms towards a higher concentration of a chemical attractant is called positive chemotaxis and is involved in the efficiency of chemical degradation. Several studies in this field focus on genomics and on demonstrating chemotactic responses by bacteria, but there is little information on the activity and morphology of their response. In this work, we use a recently reported dynamic laser speckle method to process images and to distinguish motile surface patterns per area of colonisation by applying image processing techniques called fuzzy mathematical morphology (FMM). The images of bacterial colonies are usually surfaced, with vague edges and non-homogeneous grey levels. Hence, conventional image processing methods for shape analysis cannot be applied in these cases. In this paper, we propose the application of FMM to solve this problem. The approach was effective in segmenting, detecting, and describing colonisation patterns.
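As a rough illustration of what fuzzy mathematical morphology does to grey levels with vague edges, the sketch below applies fuzzy dilation and erosion to a one-dimensional signal with values in [0, 1]. The abstract does not say which fuzzy operators the authors use; this sketch assumes one common choice, the minimum t-norm for dilation and its dual (max with the complement) for erosion, and a symmetric structuring element. The function names are hypothetical.

```python
def fuzzy_dilate(signal, element):
    """Fuzzy dilation: out[i] = max over k of min(signal[j], element[k]).

    Assumptions: values lie in [0, 1], the structuring element is
    symmetric and centred, and positions outside the signal are ignored.
    """
    centre = len(element) // 2
    out = []
    for i in range(len(signal)):
        best = 0.0
        for k, weight in enumerate(element):
            j = i + k - centre
            if 0 <= j < len(signal):
                best = max(best, min(signal[j], weight))
        out.append(best)
    return out


def fuzzy_erode(signal, element):
    """Fuzzy erosion: out[i] = min over k of max(signal[j], 1 - element[k])."""
    centre = len(element) // 2
    out = []
    for i in range(len(signal)):
        worst = 1.0
        for k, weight in enumerate(element):
            j = i + k - centre
            if 0 <= j < len(signal):
                worst = min(worst, max(signal[j], 1.0 - weight))
        out.append(worst)
    return out


if __name__ == "__main__":
    profile = [0.0, 0.2, 1.0, 0.2, 0.0]   # a bright spot with soft edges
    element = [0.5, 1.0, 0.5]             # graded, symmetric neighbourhood
    print(fuzzy_dilate(profile, element))
    print(fuzzy_erode(profile, element))
```

Because the structuring element is itself graded, the operators expand or shrink a region by degrees rather than by a hard threshold, which is why this family of methods suits images without crisp boundaries.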
Automatic Detection of Vague Words and Sentences in Privacy Policies
Website privacy policies represent the single most important source of
information for users to gauge how their personal data are collected, used and
shared by companies. However, privacy policies are often vague and people
struggle to understand the content. Their opaqueness poses a significant
challenge to both users and policy regulators. In this paper, we seek to
identify vague content in privacy policies. We construct the first corpus of
human-annotated vague words and sentences and present empirical studies on
automatic vagueness detection. In particular, we investigate context-aware and
context-agnostic models for predicting vague words, and explore
auxiliary-classifier generative adversarial networks for characterizing
sentence vagueness. Our experimental results demonstrate the effectiveness of
the proposed approaches. Finally, we provide suggestions for resolving vagueness
and improving the usability of privacy policies. Comment: 10 pages