Can Automatic Abstracting Improve on Current Extracting Techniques in Aiding Users to Judge the Relevance of Pages in Search Engine Results?
Current search engines use sentence extraction techniques to produce snippet result summaries, which users may find less than ideal for determining the relevance of pages. Unlike extracting, abstracting programs analyse the context of documents and rewrite them into informative summaries. Our project aims to produce abstracting summaries which are coherent and easy to read, thereby reducing the time users spend judging the relevance of pages. However, automatic abstracting techniques are restricted to particular domains. To solve this problem, we propose to employ text classification techniques. We propose a new approach that first classifies whole web documents into sixteen top-level ODP categories using machine learning and a Bayesian classifier. We then manually create sixteen templates for each category. The summarisation techniques we use include natural language processing techniques to weight words and analyse lexical chains to identify salient phrases and place them into relevant template slots to produce summaries.
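The classification step mentioned in this abstract can be illustrated with a minimal sketch of a multinomial naive Bayes text classifier. This is not the authors' implementation: the `NaiveBayes` class, the `tokenize` helper, the add-one smoothing, and the toy training documents are all assumptions made for illustration, and only two category names are used rather than the sixteen ODP categories.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(docs, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for illustration; not taken from the project.
train_docs = [
    "football match league goal team",
    "tennis player tournament match",
    "stock market shares investor bank",
    "company profit market revenue",
]
train_labels = ["Sports", "Sports", "Business", "Business"]

model = NaiveBayes().fit(train_docs, train_labels)
predicted = model.predict("the team won the league match")
```

In a pipeline like the one the abstract describes, the predicted category would presumably select which manually written template to fill; that step is not shown here.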
Cognitive Deficit of Deep Learning in Numerosity
Subitizing, or the sense of small natural numbers, is an innate cognitive
function of humans and primates; it responds to visual stimuli prior to the
development of any symbolic skills, language or arithmetic. Given successes of
deep learning (DL) in tasks of visual intelligence and given the primitivity of
number sense, a tantalizing question is whether DL can comprehend numbers and
perform subitizing. But somewhat disappointingly, extensive experiments of the
type of cognitive psychology demonstrate that the examples-driven black box DL
cannot see through superficial variations in visual representations and distill
the abstract notion of natural number, a task that children perform with high
accuracy and confidence. The failure is apparently due to the learning method,
not the CNN computational machinery itself. A recurrent neural network capable
of subitizing does exist, which we construct by encoding a mechanism of
mathematical morphology into the CNN convolutional kernels. Also, we
investigate, using subitizing as a test bed, the ways to aid the black box DL
by cognitive priors derived from human insight. Our findings are mixed and
interesting, pointing to both cognitive deficit of pure DL, and some measured
successes of boosting DL by predetermined cognitive implements. This case study
of DL in cognitive computing is meaningful, for visual numerosity represents a
minimum level of human intelligence.
Comment: Accepted for presentation at the AAAI-1
Opportunity Creation in Innovation Networks: Interactive Revealing Practices
Innovating in networks with partners that have diverse knowledge is challenging. The challenges arise because commonly used knowledge protection mechanisms are often neither available nor suitable in early-stage exploratory collaborations. This article focuses on how company participants in heterogeneous industry networks share private knowledge while protecting firm-specific appropriation. We go beyond the prevailing strategic choice perspectives to discuss interactive revealing practices that sustain joint opportunity creation in the fragile phase of early network formation.
Center for Business, Technology and La
Abstracts and Abstracting in Knowledge Discovery
published or submitted for publicatio
Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis
The Market Blended Insight project aims to improve UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It handles both semi-structured data and unstructured text on the web, annotating the extracted data and then translating it according to the backend schema.