Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization
In Automatic Text Summarization, preprocessing is an important phase to
reduce the space of textual representation. Classically, stemming and
lemmatization have been widely used for normalizing words. However, even using
normalization on large texts, the curse of dimensionality can disturb the
performance of summarizers. This paper describes a new method for normalization
of words to further reduce the space of representation. We propose to reduce
each word to its initial letters, as a form of Ultra-stemming. The results show
that Ultra-stemming not only preserves the content of summaries produced by this
representation, but often the performances of the systems can be dramatically
improved. Summaries on trilingual corpora were evaluated automatically with
Fresa. Results confirm an increase in the performance, regardless of the summarizer
system used. Comment: 22 pages, 12 figures, 9 tables
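The normalization described in this abstract, reducing each word to its initial letters, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the prefix-length parameter `n`, the lowercasing step, and the regex tokenizer are all assumptions made for the example.

```python
import re


def ultra_stem(word: str, n: int = 1) -> str:
    """Reduce a word to its first n letters (lowercasing is an assumption)."""
    return word.lower()[:n]


def ultra_stem_text(text: str, n: int = 1) -> list[str]:
    """Tokenize on runs of letters (a simplifying assumption) and ultra-stem each token."""
    tokens = re.findall(r"[^\W\d_]+", text)
    return [ultra_stem(token, n) for token in tokens]


def vocabulary_size(stems: list[str]) -> int:
    """Count distinct terms, i.e. the dimension of the representation space."""
    return len(set(stems))


if __name__ == "__main__":
    sample = "Stemming and lemmatization normalize words; summarizers summarize."
    print(ultra_stem_text(sample, n=1))
    print(vocabulary_size(ultra_stem_text(sample, n=1)))
```

With `n = 1` every word collapses to a single letter, so the vocabulary can never exceed the size of the alphabet, which is the reduction in representation space the abstract refers to.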
Does Phenomenal Consciousness Overflow Attention? An Argument from Feature-Integration
In the past two decades a number of arguments have been given in favor of the possibility of phenomenal consciousness without attentional access, otherwise known as phenomenal overflow. This paper will show that the empirical data commonly cited in support of this thesis is, at best, ambiguous between two equally plausible interpretations, one of which does not posit phenomenology beyond attention. Next, after citing evidence for the feature-integration theory of attention, this paper will give an account of the relationship between consciousness and attention that explains both the empirical data and our phenomenological intuitions without positing phenomenal consciousness beyond attention. Having undercut the motivations for accepting phenomenal overflow along with having given reasons to think that phenomenal overflow does not occur, I end with the tentative conclusion that attention is a necessary condition for phenomenal consciousness.
Determination of the financial impact of machine downtime on the Australia Post large letters sorting process
Machine downtime, whether planned or unplanned, is intuitively costly to manufacturing organisations, but it is often very difficult to quantify. Costing processes are rarely undertaken within manufacturing organisations. It has previously been estimated that 80% of industrial facilities were unable to accurately cost downtime, with many facilities underestimating the total cost by a factor of 200-300% (Crumrine and Post 2006). It was also acknowledged that the lack of practical guides has hindered costing procedures of any nature being implemented more readily (Dale and Plunkett 1995). Models that did exist rarely considered more than a subset of the costs identified elsewhere, leading to overly conservative estimations. In addition, because cost definitions are not consistent, methodologies for evaluating and quantifying individual costs have not previously been adequately defined. The work outlined in this paper has aimed to develop the first comprehensive methodology for determining the cost of downtime, with particular application to Australia Post's automated mail processing machines. The method presented may be applied to any manufacturing environment which would benefit from a more complete understanding of the magnitude of the cost of machine or process downtime.
Handwriting styles: benchmarks and evaluation metrics
Evaluating the style of handwriting generation is a challenging problem,
since it is not well defined. It is a key component in developing
systems that offer more personalized experiences to humans. In this
paper, we propose baseline benchmarks, in order to set anchors to estimate the
relative quality of different handwriting style methods. This will be done
using deep learning techniques, which have shown remarkable results in
different machine learning tasks, learning classification, regression, and most
relevant to our work, generating temporal sequences. We discuss the challenges
associated with evaluating our methods, which are related to the evaluation of
generative models in general. We then propose evaluation metrics, which we find
relevant to this problem, and we discuss how we evaluate the evaluation
metrics. In this study, we use the IRON-OFF dataset. To the best of our knowledge,
no prior work has generated handwriting (either in terms of
methodology or the performance metrics), or explored styles, using this
dataset. Comment: Submitted to IEEE International Workshop on Deep and Transfer
Learning (DTL 2018)
What May Visualization Processes Optimize?
In this paper, we present an abstract model of visualization and inference
processes and describe an information-theoretic measure for optimizing such
processes. In order to obtain such an abstraction, we first examined six
classes of workflows in data analysis and visualization, and identified four
levels of typical visualization components, namely disseminative,
observational, analytical and model-developmental visualization. We noticed a
common phenomenon at different levels of visualization, that is, the
transformation of data spaces (referred to as alphabets) usually corresponds to
the reduction of maximal entropy along a workflow. Based on this observation,
we establish an information-theoretic measure of cost-benefit ratio that may be
used as a cost function for optimizing a data visualization process. To
demonstrate the validity of this measure, we examined a number of successful
visualization processes in the literature, and showed that the
information-theoretic measure can mathematically explain the advantages of such
processes over possible alternatives. Comment: 10 pages
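The observation that data-space transformations usually reduce maximal entropy along a workflow can be illustrated with a small calculation. The sketch below uses only the standard fact that a finite alphabet with N equally likely letters has maximal entropy log2(N) bits; it does not reproduce the paper's cost-benefit measure, whose definition is not given in this abstract. The function names and the stage sizes are hypothetical.

```python
import math


def max_entropy_bits(alphabet_size: int) -> float:
    """Maximal Shannon entropy (in bits) of an alphabet with the given number of letters."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    return math.log2(alphabet_size)


def entropy_reductions(alphabet_sizes: list[int]) -> list[float]:
    """Drop in maximal entropy between each pair of consecutive workflow stages."""
    entropies = [max_entropy_bits(size) for size in alphabet_sizes]
    return [before - after for before, after in zip(entropies, entropies[1:])]


if __name__ == "__main__":
    # Hypothetical workflow: raw data -> aggregated view -> final decision.
    stages = [2**20, 1024, 4]
    print([max_entropy_bits(size) for size in stages])
    print(entropy_reductions(stages))
```

Each positive entry in the result corresponds to a stage that compresses the space of possible data states, which is the pattern the abstract describes.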
Quantum cryptography: key distribution and beyond
Uniquely among the sciences, quantum cryptography has driven both
foundational research and practical real-life applications. We review
the progress of quantum cryptography in the last decade, covering quantum key
distribution and other applications. Comment: It's a review on quantum cryptography and it is not restricted to QKD
Spatial representations of numbers and letters in children
Different lines of evidence suggest that children's mental representations of numbers are spatially organized in the form of a mental number line. It is, however, still unclear whether a spatial organization is specific to the numerical domain or also applies to other ordinal sequences in children. In the present study, children (n = 129) aged 8–9 years were asked to indicate the midpoint of lines flanked by task-irrelevant digits or letters. We found that the localization of the midpoint was systematically biased toward the larger digit. A similar, but less pronounced, effect was detected for letters with spatial biases toward the letter succeeding in the alphabet. Instead of assuming domain-specific forms of spatial representations, we suggest that ordinal information expressing relations between different items of a sequence might be spatially coded in children, whereby numbers seem to convey this kind of information in the most salient way.