9 research outputs found
Efficient Axiomatization of OWL 2 EL Ontologies from Data by means of Formal Concept Analysis: (Extended Version)
We present an FCA-based axiomatization method that produces a complete EL TBox (the terminological part of an OWL 2 EL ontology) from a graph dataset in at most
exponential time. We describe technical details that allow for efficient implementation as well as variations that dispense with the computation of extremely large axioms, thereby
rendering the approach applicable, although some completeness is lost. Moreover, we evaluate the prototype on real-world datasets. This is an extended version of an article accepted at AAAI 2024.
Classification de symboles avec un treillis de Galois et une représentation par sac de mots
This paper presents a new approach to graphical symbol recognition that combines a concept lattice with a bag-of-words representation. Visual words define the properties of a graphical symbol that are modeled in the Galois lattice. The classification algorithm is based on a Galois lattice in which the intents of the concepts are visual words. Using visual words as primitives makes it possible to evaluate the classifier with a symbolic approach that no longer needs the signature discretization step to build the Galois lattice. Our approach is compared with classical approaches without a bag of words and with several standard classifiers, evaluated on different symbols. We show the relevance and robustness of our approach for graphics recognition.
Lattice-based and topological representations of binary relations with an application to music
Formal concept analysis associates a lattice of formal concepts to a binary relation. The structure of the relation can then be described in terms of lattice theory. On the other hand, Q-analysis associates a simplicial complex to a binary relation and studies its properties using topological methods. This paper investigates which mathematical invariants studied in one approach can be captured in the other. Our main result is that all homotopy invariant properties of the simplicial complex can be recovered from the structure of the concept lattice. This not only clarifies the relationships between two frameworks widely used in symbolic data analysis but also offers an effective new method to establish homotopy equivalence in the context of Q-analysis. As a musical application, we investigate Olivier Messiaen's modes of limited transposition. We use our theoretical result to show that the simplicial complex associated to a maximal mode with m transpositions is homotopy equivalent to the (m − 2)-dimensional sphere.
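The quantity m in the result above can be made concrete with a minimal sketch. It assumes a mode is encoded as a set of pitch classes modulo 12 and counts how many distinct sets its twelve transpositions produce. The function name and the encoding are illustrative assumptions, not taken from the paper, and the code does not check whether a mode is maximal in the paper's sense.

```python
def distinct_transpositions(mode, modulus=12):
    """Return the set of distinct transpositions of a pitch-class set.

    Assumption: a mode is a set of integers read modulo `modulus`.
    """
    base = frozenset(p % modulus for p in mode)
    return {
        frozenset((p + t) % modulus for p in base)
        for t in range(modulus)
    }


# Three of Messiaen's modes of limited transposition, as pitch-class sets.
WHOLE_TONE = {0, 2, 4, 6, 8, 10}          # mode 1
OCTATONIC = {0, 1, 3, 4, 6, 7, 9, 10}     # mode 2
MODE_3 = {0, 2, 3, 4, 6, 7, 8, 10, 11}    # mode 3

for name, mode in [("mode 1", WHOLE_TONE), ("mode 2", OCTATONIC), ("mode 3", MODE_3)]:
    m = len(distinct_transpositions(mode))
    # m - 2 is the sphere dimension the abstract states for a *maximal* mode;
    # maximality itself is not verified here.
    print(f"{name}: m = {m}, m - 2 = {m - 2}")
```

A set with no transpositional symmetry, such as the major scale, yields all 12 distinct transpositions, which is why these modes are called "limited".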
Most specific consequences in the description logic EL
The notion of a most specific consequence with respect to some terminological box is introduced, conditions for its existence in the description logic EL and its variants are provided, and means for its computation are developed. Algebraic properties of most specific consequences are explored. Furthermore, several applications that make use of this new notion are proposed and, in particular, it is shown how given terminological knowledge can be incorporated in existing approaches for the axiomatization of observations. For instance, a procedure for the incremental learning of concept inclusions from sequences of interpretations is developed.
Attribute Exploration of Gene Regulatory Processes
This thesis aims at the logical analysis of discrete processes, in particular
of those generated by gene regulatory networks. States, transitions and
operators from temporal logics are expressed in the language of Formal Concept
Analysis. By the attribute exploration algorithm, an expert or a computer
program is enabled to validate a minimal and complete set of implications, e.g.
by comparison of predictions derived from literature with observed data. Here,
these rules represent temporal dependencies within gene regulatory networks
including coexpression of genes, reachability of states, invariants or possible
causal relationships. This new approach is embedded into the theory of
universal coalgebras, particularly automata, Kripke structures and Labelled
Transition Systems. A comparison with the temporal expressivity of Description
Logics is made. The main theoretical results concern the integration of
background knowledge into the successive exploration of the defined data
structures (formal contexts). Applying the method, a Boolean network from the
literature modelling sporulation of Bacillus subtilis is examined. Finally, we
developed an asynchronous Boolean network for extracellular matrix formation
and destruction in the context of rheumatoid arthritis.
Comment: 111 pages, 9 figures, file size 2.1 MB, PhD thesis, University of
Jena, Germany, Faculty of Mathematics and Computer Science, 2011. Available
online at http://www.db-thueringen.de/servlets/DocumentServlet?id=1960
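The validation step that attribute exploration relies on can be sketched as follows: given observed data as a formal context, test whether a candidate implication holds and, if not, return a counterexample. This is a minimal illustration only. The gene names, state names, and the `holds` helper are hypothetical and not taken from the thesis.

```python
def holds(context, premise, conclusion):
    """Check whether the implication premise -> conclusion holds in a context.

    `context` maps each object to its set of attributes. Returns
    (True, None) if every object having all premise attributes also has
    all conclusion attributes, otherwise (False, first counterexample).
    """
    for obj, attrs in context.items():
        if premise <= attrs and not conclusion <= attrs:
            return False, obj
    return True, None


# Hypothetical observed states of a toy network: state -> active genes.
STATES = {
    "s0": {"geneA"},
    "s1": {"geneA", "geneB"},
    "s2": {"geneA", "geneB", "geneC"},
    "s3": {"geneC"},
}

print(holds(STATES, {"geneB"}, {"geneA"}))  # valid in the data
print(holds(STATES, {"geneA"}, {"geneB"}))  # refuted by a state
```

In an exploration dialogue, an implication that holds in the data would be shown to the expert for confirmation, and a refuted one would be dropped in favour of its counterexample.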
Constructing and Extending Description Logic Ontologies using Methods of Formal Concept Analysis
Description Logic (abbrv. DL) belongs to the field of knowledge representation and reasoning. DL researchers have developed a large family of logic-based languages, so-called description logics (abbrv. DLs). These logics allow their users to explicitly represent knowledge as ontologies, which are finite sets of (human- and machine-readable) axioms, and provide them with automated inference services to derive implicit knowledge. The landscape of decidability and computational complexity of common reasoning tasks for various description logics has been explored in large parts: there is always a trade-off between expressibility and reasoning costs. It is therefore not surprising that DLs are nowadays applied in a large variety of domains: agriculture, astronomy, biology, defense, education, energy management, geography, geoscience, medicine, oceanography, and oil and gas. Furthermore, the most notable success of DLs is that these constitute the logical underpinning of the Web Ontology Language (abbrv. OWL) in the Semantic Web.
Formal Concept Analysis (abbrv. FCA) is a subfield of lattice theory that allows the analysis of data-sets that can be represented as formal contexts. Put simply, such a formal context binds a set of objects to a set of attributes by specifying which objects have which attributes. There are two major techniques that can be applied in various ways for purposes of conceptual clustering, data mining, machine learning, knowledge management, knowledge visualization, etc. On the one hand, it is possible to describe the hierarchical structure of such a data-set in the form of a formal concept lattice. On the other hand, the theory of implications (dependencies between attributes) valid in a given formal context can be axiomatized in a sound and complete manner by the so-called canonical base, which furthermore contains a minimal number of implications w.r.t. the properties of soundness and completeness.
In spite of the different notions used in FCA and in DLs, there has been a very fruitful interaction between these two research areas. My thesis continues this line of research and, more specifically, I will describe how methods from FCA can be used to support the automatic construction and extension of DL ontologies from data
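To make the notions of formal context and formal concept concrete, here is a minimal sketch that enumerates all formal concepts of a tiny context by closing every attribute subset. The toy data and function names are assumptions for illustration. The brute-force enumeration is exponential in the number of attributes and is not the method used in the thesis.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in `objs`."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force."""
    all_attrs = frozenset().union(*context.values())
    found = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), r):
            e = extent(context, frozenset(subset))
            found.add((e, intent(context, e, all_attrs)))
    return found


# Hypothetical toy context: object -> set of attributes.
ANIMALS = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "carp": {"swims"},
}

for ext, itt in sorted(concepts(ANIMALS), key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice mentioned above.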
Ontology development from the encyclopaedic organization of knowledge
The global network, the Internet, is rapidly transforming into a semantic web, moving from
linking documents to linking data; that is, existing web portals with classic information and
knowledge bases are becoming linked data in the global cloud (cloud computing).
A comparison of the organizational structure of traditional paper encyclopedias with that
of encyclopedias in the web environment reveals certain differences that stem from the
different media, which enable new search functionalities. The changes that the encyclopedic
work is undergoing require a new way of modeling the organization of encyclopedic knowledge
in the web environment, grounded in an analysis of the specific structure of the encyclopedic
article and taking into account the basic tenets of the Semantic Web, the principles of
ontology design and the needs of users, so as to ensure its availability and better usability.
To preserve the usefulness of encyclopedias in today's online information explosion, their
content must be better represented in a meaningful (semantic) way in the web environment.
The aim of this doctoral dissertation is to investigate which elements of the encyclopedic
organization of knowledge can support ontology development, and to develop a method for
generating an ontology from encyclopedically organized knowledge.
Based on a study of the literature, an analysis of similar ontology models from selected
international projects, and the knowledge stored in the biographical articles of the Croatian
Encyclopedia (HE) in the field of Croatian literature, an ontology model is proposed that
effectively describes the encyclopedic knowledge of that field. Content analysis of 1170
selected HE articles and the METHONTOLOGY method were applied. Protégé software was used to
develop the ontology. The class hierarchy of the ontology is developed with the FCA approach,
and the LSA method is applied to determine a concept as a set of related terms and the
membership of individual documents (articles) in that concept, which enables automatic
classification of articles into individual ontology classes. The developed ontology will serve
for organizing, searching and browsing the knowledge of the online HE in the selected field of
Croatian literature, as well as for obtaining precise answers to complex questions. It covers
a large number of relations needed to describe the output of writers of individual national
literatures, their biographies, their mutual relations, the relations between individual
literary works, and all the knowledge found in biographical encyclopedia articles in the field
of literature, which makes it possible to describe the literature of any nation. The resulting
ontology enables interoperability and linking with other structured sources of encyclopedic
knowledge on the Semantic Web (e.g. DBpedia), which will allow the relevant and rich knowledge
of HE to be connected to the "global network of knowledge" that is emerging and developing
through Semantic Web projects.
The global network, the Internet, is rapidly transforming into a semantic network by
moving from linking documents to linking data; i.e., today's Web portals with classic
information databases are becoming linked data in the global cloud.
Comparing the organisational structures of traditional paper encyclopedias with those of
electronic encyclopedias on the Web reveals certain differences that stem from the different
types of media, which enable new search functionalities. These changes, which the
encyclopedic work is undergoing, demand a new way of modeling the organisation of
encyclopedic knowledge on the Web, founded on an analysis of the specific structure of an
encyclopedic article and respecting the basic tenets of the Semantic Web, the principles of
ontology design and the needs of users, so that availability and usability are ensured.
To preserve the usefulness of encyclopedias in today's Web information explosion, their
content needs to be presented in a meaningful (semantic) way on the Web. Semantic
interoperability means the existence of an infrastructure that enables machine interpretation
of, and inference about, content on the Web. The key concept of the Semantic Web is therefore
the ontology, the basic component in enabling semantic interoperability.
The aim of this doctoral thesis is to find out which elements of encyclopedic
knowledge organisation can support the development of an ontology, and to develop a
method by which an ontology is generated from encyclopedically organised knowledge.
The developed ontology will be used for organising, searching and browsing the data of
the Croatian Encyclopedia on the Web in the selected field of Croatian literature, as well
as for obtaining precise answers to the questions asked.
The introductory part of this work presents its starting points, goals and
methods, as well as the structure of the entire work.
The second chapter gives a theoretical presentation and clarification of the Semantic Web
with the aim of understanding it fully. It sets out the basic theoretical and technical
background of the Semantic Web: it explains the meaning of the term, the basic difference
between the Web we know today and its development toward the Semantic Web, the
disadvantages of today's Web and the advantages of the Semantic Web, the basic terms and
architecture of the Semantic Web, and the basic definitions of ontology together with their
main goal and role. The chapter gives a detailed review of the basic languages used on the
Semantic Web, with concrete examples (RDF, RDFS, OWL and SKOS).
The third chapter presents some of the more important Semantic Web projects. The
projects were selected for their significance to a better understanding of the Semantic Web
in the field of encyclopedics. One part of the chapter therefore presents ontologies that are
essential for understanding the Semantic Web, and the other part reviews ontology projects
built entirely on encyclopedic knowledge. This analysis of existing encyclopedic ontology
projects shows that no previous project has tried to relate the development of an ontology
and its constituent elements to the structural organisation of the encyclopedic article by
investigating the significance of the article's individual structural elements for ontology
development.
The fourth chapter is an introduction to the development of the ontological model of
literature and to the basic settings of the Protégé software. Elements of standards and
ontology languages (i.e. the vocabularies RDFS, OWL and SKOS) are presented and applied in
the development of the ontology of the Croatian Encyclopedia in the field of Croatian
literature. The chapter points out the possibility of achieving interoperability of
individual ontology resources, as well as of the ontology as a whole, with existing ontology
projects on the Semantic Web, which will allow the relevant and rich knowledge of the
Croatian Encyclopedia to be connected to the "global network of knowledge" that is emerging
and developing through Semantic Web projects.
The fifth chapter gives an insight into the historical development of the encyclopedia
worldwide, so that the reader can fully appreciate the context through which the encyclopedia
had to pass in order to gain the features familiar in a modern encyclopedic work. The
chapter gives basic information about the development of the central Croatian
lexicographical institution, the Miroslav Krleža Institute of Lexicography (LZMK), which
carries out lexicography and encyclopedics of particular interest to the Republic of Croatia.
It is shown that this doctoral thesis can contribute, in some particular respects, to
realising the mission and vision of LZMK. The publishing output of LZMK's general and
specialised encyclopedic editions is presented, covering editions on paper and those on the
Web. An analysis of the web editions found that they were mostly produced by adapting the
traditional organisation of encyclopedic knowledge from paper to the Web. This doctoral
thesis therefore investigates what benefits LZMK, as well as the users of these valuable
knowledge sources, would gain from applying the ontological principles of the Semantic Web.
The sixth chapter presents the structural organisation of encyclopedic articles,
reviewing the basic types of encyclopedic article and, in particular, the features of the
biographical encyclopedic article that is the basis of this research. The types of data
contained in biographical encyclopedic articles are analysed. This made it possible to
establish the basic ontological layers, i.e. facets, by which ontological relations are
classified.
The seventh chapter identifies the constituent ontological elements of the biographical
encyclopedic article and shows the methodology used in developing the ontology, as well as
the resulting conceptual taxonomy and ontological relations. The chapter shows the role of
the structural elements of the biographical encyclopedic article in ontology development and
connects them to the corresponding constituent ontological elements. The results of the
research are presented as ontology modules and their associated ontological relations, which
can be used to describe a given concept. The structure and sequence of the elements of the
biographical encyclopedic article are shown, together with the ontological properties
developed to store all types of information in the article; from these, the constant
elements were determined that can be used to develop article infoframes, which are suitable
for a quick overview of the most important information in individual articles. The final
results are shown through the application of the resulting ontology to the description of an
individual, through the possibility of posing complex semantic queries over the unstructured
data of biographical encyclopedic articles, and through the possibility of organised
browsing of encyclopedic data, which has not been possible until now.
The eighth chapter explains the FCA approach applied in building the ontology to
establish sets of conceptual features by which the terms in the ontology are defined, so
that the terms can be classified into a hierarchy. Important definitions are highlighted to
clarify the place of formal concept analysis in the methodology of ontology creation. A
concrete example of the FCA approach is shown on 37 articles on Croatian literature from the
Croatian Encyclopedia, as well as the transformation of the resulting grid (lattice) into a
formal language of first-order logic. The chapter shows the advantages of applying FCA: it
generates new and unfamiliar concepts that could hardly be established by manual ontology
construction alone, because domain-specific texts contain no noun phrase for labeling these
new concepts.
The ninth chapter addresses the problems of automated indexing methods and information
retrieval. It presents the LSA method theoretically and applies it to articles of the
Croatian Encyclopedia, with the goal of learning about its efficiency and goals in building
an ontology of a given area. Applying the LSA method to the chosen articles shows its utility
in determining a concept as a set of related terms and the membership of individual articles
(documents) in that concept, which, on the basis of the word forms that the selected articles
consist of, allows automatic classification of articles into individual ontology classes.
The tenth and final chapter is a conclusion that combines the theoretical and practical
parts of the work, giving a short review of the research results and showing the possibility
of establishing interoperability and connecting the Linked Data of Croatian encyclopedics
with other structured sources of encyclopedic knowledge on the Web (e.g. DBpedia, Freebase,
etc.).
Proceedings of the 5th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2016(co-located with ECAI 2016, The Hague, Netherlands, August 30th 2016)
These are the proceedings of the fifth edition of the FCA4AI workshop (http://www.fca4ai.hse.ru/). Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification that can be used for many purposes, especially for Artificial Intelligence (AI) needs. The objective of the FCA4AI workshop is to investigate two main issues: how FCA can support various AI activities (knowledge discovery, knowledge representation and reasoning, learning, data mining, NLP, information retrieval), and how FCA can be extended in order to help AI researchers solve new and complex problems in their domain. Accordingly, topics of interest are related to the following: (i) Extensions of FCA for AI: pattern structures, projections, abstractions. (ii) Knowledge discovery based on FCA: classification, data mining, pattern mining, functional dependencies, biclustering, stability, visualization. (iii) Knowledge processing based on concept lattices: modeling, representation, reasoning. (iv) Application domains: natural language processing, information retrieval, recommendation, mining of web of data and of social networks, etc.
Intelligent Support for Exploration of Data Graphs
This research investigates how to support a user’s exploration through data graphs generated from semantic databases in a way leading to expanding the user’s domain knowledge. To be effective, approaches to facilitate exploration of data graphs should take into account the utility from a user’s point of view. Our work focuses on knowledge utility – how useful exploration paths through a data graph are for expanding the user’s knowledge. The main goal of this research is to design an intelligent support mechanism to direct the user to ‘good’ exploration paths through big data graphs for knowledge expansion. We propose a new exploration support mechanism underpinned by the subsumption theory for meaningful learning, which postulates that new knowledge is grasped by starting from familiar concepts in the graph which serve as knowledge anchors from where links to new knowledge are made. A core algorithmic component for adapting the subsumption theory for generating exploration paths is the automatic identification of Knowledge Anchors in a Data Graph (KADG). Several metrics for identifying KADG and the corresponding algorithms for implementation have been developed and evaluated against human cognitive structures. A subsumption algorithm which utilises KADG for generating exploration paths for knowledge expansion is presented and evaluated in the context of a semantic data browser in a musical instrument domain. The resultant exploration paths are evaluated in a controlled user study to examine whether they increase the users’ knowledge as compared to free exploration. The findings show that exploration paths using knowledge anchors and subsumption lead to significantly higher increase in the users’ conceptual knowledge. The approach can be adopted in applications providing data graph exploration to facilitate learning and sensemaking of layman users who are not fully familiar with the domain presented in the data graph