Visualizing the semantic content of large text databases using text maps
A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. A set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles is described. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.
SemGrAM - Integrating semantic graphs into association rule mining
To date, most association rule mining algorithms have assumed that the domains of items are either discrete or, in a limited number of cases, hierarchical, categorical or linear. This constrains the search for interesting rules to those that satisfy the specified quality metrics as independent values or as higher level concepts of those values. However, in many cases the determination of a single hierarchy is not practicable and, for many datasets, an item’s value may be taken from a domain that is more conveniently structured as a graph with weights indicating semantic (or conceptual) distance. Research in the development of algorithms that generate disjunctive association rules has allowed the production of rules such as Radios ∨ TVs -> Cables. In many cases there is little semantic relationship between the disjunctive terms and arguably less readable rules such as Radios ∨ Tuesday -> Cables can result. This paper describes two association rule mining algorithms, SemGrAMG and SemGrAMP, that accommodate conceptual distance information contained in a semantic graph. The SemGrAM algorithms permit the discovery of rules that include an association between sets of cognate groups of item values. The paper discusses the algorithms, the design decisions made during their development and some experimental results.
Sydney, NS
Learning ontology aware classifiers
Many applications of data-driven knowledge discovery processes call for the exploration of data from multiple points of view that reflect different ontological commitments on the part of the learner. Of particular interest in this context are algorithms for learning classifiers from ontologies and data. Against this background, my dissertation research is aimed at the design and analysis of algorithms for the construction of robust, compact, accurate and ontology-aware classifiers. We have precisely formulated the problem of learning pattern classifiers from attribute value taxonomies (AVT) and partially specified data. We have designed and implemented efficient and theoretically well-founded AVT-based classifier learners. Based on a general strategy of hypothesis refinement to search in a generalized hypothesis space, our AVT-guided learning algorithm adopts a general learning framework that takes into account the tradeoff between the complexity and the accuracy of the predictive models, which enables us to learn a classifier that is both compact and accurate. We have also extended our approach to learning compact and accurate classifiers from semantically heterogeneous data sources. We presented a principled way to reduce the problem of learning from semantically heterogeneous data to the problem of learning from distributed, partially specified data by reconciling semantic heterogeneity using AVT mappings, and we described a sufficient-statistics-based solution.