23,618 research outputs found
Thematic Annotation: extracting concepts out of documents
Contrarily to standard approaches to topic annotation, the technique used in
this work does not centrally rely on some sort of -- possibly statistical --
keyword extraction. In fact, the proposed annotation algorithm uses a large
scale semantic database -- the EDR Electronic Dictionary -- that provides a
concept hierarchy based on hyponym and hypernym relations. This concept
hierarchy is used to generate a synthetic representation of the document by
aggregating the words present in topically homogeneous document segments into a
set of concepts best preserving the document's content.
This new extraction technique uses an unexplored approach to topic selection.
Instead of using semantic similarity measures based on a semantic resource, the
later is processed to extract the part of the conceptual hierarchy relevant to
the document content. Then this conceptual hierarchy is searched to extract the
most relevant set of concepts to represent the topics discussed in the
document. Notice that this algorithm is able to extract generic concepts that
are not directly present in the document.Comment: Technical report EPFL/LIA. 81 pages, 16 figure
Enriching very large ontologies using the WWW
This paper explores the possibility to exploit text on the world wide web in
order to enrich the concepts in existing ontologies. First, a method to
retrieve documents from the WWW related to a concept is described. These
document collections are used 1) to construct topic signatures (lists of
topically related words) for each concept in WordNet, and 2) to build
hierarchical clusters of the concepts (the word senses) that lexicalize a given
word. The overall goal is to overcome two shortcomings of WordNet: the lack of
topical links among concepts, and the proliferation of senses. Topic signatures
are validated on a word sense disambiguation task with good results, which are
improved when the hierarchical clusters are used.Comment: 6 page
An evaluation of pedagogically informed parameterised questions for self assessment
Self-assessment is a crucial component of learning. Learners can learn by asking themselves questions and attempting to answer them. However, creating effective questions is time-consuming because it may require considerable resources and the skill of critical thinking. Questions need careful construction to accurately represent the intended learning outcome and the subject matter involved. There are very few systems currently available which generate questions automatically, and these are confined to specific domains. This paper presents a system for automatically generating questions from a competency framework, based on a sound pedagogical and technological approach. This makes it possible to guide learners in developing questions for themselves, and to provide authoring templates which speed the creation of new questions for self-assessment. This novel design and implementation involves an ontological database that represents the intended learning outcome to be assessed across a number of dimensions, including level of cognitive ability and subject matter. The system generates a list of all the questions that are possible from a given learning outcome, which may then be used to test for understanding, and so could determine the degree to which learners actually acquire the desired knowledge. The way in which the system has been designed and evaluated is discussed, along with its educational benefits
Exploiting Synergy Between Ontologies and Recommender Systems
Recommender systems learn about user preferences over time, automatically finding things of similar interest. This reduces the burden of creating explicit queries. Recommender systems do, however, suffer from cold-start problems where no initial information is available early on upon which to base recommendations. Semantic knowledge structures, such as ontologies, can provide valuable domain knowledge and user information. However, acquiring such knowledge and keeping it up to date is not a trivial task and user interests are particularly difficult to acquire and maintain. This paper investigates the synergy between a web-based research paper recommender system and an ontology containing information automatically extracted from departmental databases available on the web. The ontology is used to address the recommender systems cold-start problem. The recommender system addresses the ontology's interest-acquisition problem. An empirical evaluation of this approach is conducted and the performance of the integrated systems measured
Exploiting synergy between ontologies and recommender systems
Recommender systems learn about user preferences over time, automatically finding things of similar interest. This reduces the burden of creating explicit queries. Recommender systems do, however, suffer from cold-start problems where no initial information is available early on upon which to base recommendations.Semantic knowledge structures, such as ontologies, can provide valuable domain knowledge and user information. However, acquiring such knowledge and keeping it up to date is not a trivial task and user interests are particularly difficult to acquire and maintain.
This paper investigates the synergy between a web-based research paper recommender system and an ontology containing information automatically extracted from departmental databases available on the web. The ontology is used to address the recommender systems cold-start problem. The recommender system addresses the ontology's interest-acquisition problem. An empirical evaluation of this approach is conducted and the performance of the integrated systems measured
The devices, experimental scaffolds, and biomaterials ontology (DEB): a tool for mapping, annotation, and analysis of biomaterials' data
The size and complexity of the biomaterials literature makes systematic data analysis an excruciating manual task. A practical solution is creating databases and information resources. Implant design and biomaterials research can greatly benefit from an open database for systematic data retrieval. Ontologies are pivotal to knowledge base creation, serving to represent and organize domain knowledge. To name but two examples, GO, the gene ontology, and CheBI, Chemical Entities of Biological Interest ontology and their associated databases are central resources to their respective research communities. The creation of the devices, experimental scaffolds, and biomaterials ontology (DEB), an open resource for organizing information about biomaterials, their design, manufacture, and biological testing, is described. It is developed using text analysis for identifying ontology terms from a biomaterials gold standard corpus, systematically curated to represent the domain's lexicon. Topics covered are validated by members of the biomaterials research community. The ontology may be used for searching terms, performing annotations for machine learning applications, standardized meta-data indexing, and other cross-disciplinary data exploitation. The input of the biomaterials community to this effort to create data-driven open-access research tools is encouraged and welcomed.Preprin
Extending, trimming and fusing WordNet for technical documents
This paper describes a tool for the automatic
extension and trimming of a multilingual
WordNet database for cross-lingual retrieval
and multilingual ontology building in
intranets and domain-specific document
collections. Hierarchies, built from
automatically extracted terms and combined
with the WordNet relations, are trimmed
with a disambiguation method based on the
document salience of the words in the
glosses. The disambiguation is tested in a
cross-lingual retrieval task, showing
considerable improvement (7%-11%). The
condensed hierarchies can be used as
browse-interfaces to the documents
complementary to retrieval
- …