Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system was originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
in our dataset is initially associated with a solubility degree and is
represented as a sequence of symbols denoting the 20 amino acid residues. The
computational results obtained here, which we stress were achieved with no
context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification.
Comment: 10 pages, 49 references
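The dissimilarity space representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Levenshtein edit distance as the dissimilarity measure and a hand-picked set of prototype sequences; both are illustrative choices, not details taken from the paper. The function names `edit_distance` and `embed` are hypothetical, and the optimization step that ODSE performs is not reproduced here.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program.

    Assumption: edit distance stands in for whatever sequence
    dissimilarity the actual system uses.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def embed(sequences: Sequence[str], prototypes: Sequence[str]) -> List[List[int]]:
    """Map each sequence to its vector of dissimilarities to the prototypes.

    Assumption: the prototypes are given; choosing them well is the part
    a real system would have to optimize.
    """
    return [[edit_distance(s, p) for p in prototypes] for s in sequences]


# Toy amino-acid strings (hypothetical data, for illustration only).
vectors = embed(["MKV", "MKVL"], ["MKV", "AAA"])
```

Each row of `vectors` is an ordinary numeric vector, so any standard vector-based classifier can be trained on it. This is why the approach carries over to any domain with a meaningful dissimilarity measure.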
Proceedings of the 15th Conference on Knowledge Organization WissOrg'17 of the German Chapter of the International Society for Knowledge Organization (ISKO), 30th November - 1st December 2017, Freie Universität Berlin
Wissensorganisation is the name of a series of biennial conferences /
workshops with a long tradition, organized by the German chapter of the
International Society for Knowledge Organization (ISKO). The 15th conference in
this series, held at Freie Universität Berlin, focused on knowledge
organization for the digital humanities. Structuring, and interacting with,
large data collections has become a major issue in the digital humanities. In
these proceedings, various aspects of knowledge organization in the digital
humanities are discussed, and the authors of the papers show how projects in
the digital humanities deal with knowledge organization.
Duration and Interval Hidden Markov Model for Sequential Data Analysis
Analysis of sequential event data has been recognized as one of the essential
tools in the field of data modeling and analysis. In this paper, after examining
the technical requirements and issues involved in modeling complex but practical
situations, we propose a new sequential data model, dubbed Duration and Interval
Hidden Markov Model (DI-HMM), that efficiently represents the "state duration"
and "state interval" of data events. This representation plays an important
role in modeling practical time-series sequential data and, in turn, enables
efficient and flexible sequential data retrieval. Numerical experiments on
synthetic and real data demonstrate the efficiency and accuracy of the proposed
DI-HMM.