Hierarchical growing neural gas
“The original publication is available at www.springerlink.com”. Copyright Springer. This paper describes TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG is an extension to the Growing Neural Gas algorithm that maintains a time history of the learned topological mapping. TreeGNG is able to correct poor decisions made during the early phases of the construction of the tree, and provides the novel ability to influence the general shape and form of the learned hierarchy.
CorDeep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents
Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the Sphaera corpus. Additionally, we train an image extraction model based on the YOLO architecture and publish it through a publicly available web service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.
A Multi-signal Variant for the GPU-based Parallelization of Growing Self-Organizing Networks
Among the many possible approaches for the parallelization of self-organizing
networks, and in particular of growing self-organizing networks, perhaps the
most common one is producing an optimized, parallel implementation of the
standard sequential algorithms reported in the literature. In this paper we
explore an alternative approach, based on a new algorithm variant specifically
designed to match the features of the large-scale, fine-grained parallelism of
GPUs, in which multiple input signals are processed at once. Comparative tests
have been performed, using both parallel and sequential implementations of the
new algorithm variant, in particular for a growing self-organizing network that
reconstructs surfaces from point clouds. The experimental results show that
this approach allows harnessing in a more effective way the intrinsic
parallelism that self-organizing network algorithms seem intuitively to
suggest, obtaining better performance even with networks of smaller size.
Comment: 17 pages
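As a rough illustration of what processing multiple input signals at once can mean, the sketch below finds the winning unit for every signal in a batch against the same, unchanged unit positions, and only then applies the updates. It is not the algorithm from the paper: the function names `nearest_unit` and `multi_signal_step`, the `rate` parameter, and the choice to move each winner toward the mean of the signals it won are all assumptions made for illustration, and plain sequential Python stands in for a GPU kernel.

```python
import math

def nearest_unit(units, signal):
    """Index of the unit closest to one signal (Euclidean distance)."""
    return min(range(len(units)), key=lambda i: math.dist(units[i], signal))

def multi_signal_step(units, signals, rate=0.5):
    """One batched step (hypothetical update rule, not taken from the paper).

    All winners are found against the same unit positions; afterwards each
    winning unit moves toward the mean of the signals it won.
    """
    winners = [nearest_unit(units, s) for s in signals]
    updated = [list(u) for u in units]
    for i, unit in enumerate(units):
        won = [s for s, w in zip(signals, winners) if w == i]
        if not won:
            continue  # this unit won no signal in the batch
        for d in range(len(unit)):
            mean_d = sum(s[d] for s in won) / len(won)
            updated[i][d] = unit[d] + rate * (mean_d - unit[d])
    return winners, updated

# Two units, three signals processed in a single step.
units = [[0.0, 0.0], [2.0, 0.0]]
signals = [[0.0, 1.0], [0.0, -1.0], [3.0, 0.0]]
winners, updated = multi_signal_step(units, signals)
```

Because the winner search for each signal is independent of the others, that part of the step is the natural candidate for fine-grained parallel execution.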
Correlation-maximizing surrogate gene space for visual mining of gene expression patterns in developing barley endosperm tissue
Background: Micro- and macroarray technologies help acquire thousands of gene expression patterns covering important biological processes during plant ontogeny. Particularly, faithful visualization methods are beneficial for revealing interesting gene expression patterns and functional relationships of coexpressed genes. Such screening helps to gain deeper insights into regulatory behavior and cellular responses, as will be discussed for expression data of developing barley endosperm tissue. For that purpose, high-throughput multidimensional scaling (HiT-MDS), a recent method for similarity-preserving data embedding, is substantially refined and used for (a) assessing the quality and reliability of centroid gene expression patterns, and for (b) derivation of functional relationships of coexpressed genes of endosperm tissue during barley grain development (0–26 days after flowering). Results: Temporal expression profiles of 4824 genes at 14 time points are faithfully embedded into two-dimensional displays. Thereby, similar shapes of coexpressed genes get closely grouped by a correlation-based similarity measure. As a main result, by using power transformation of correlation terms, a characteristic cloud of points with bipolar sandglass shape is obtained that is inherently connected to expression patterns of pre-storage, intermediate and storage phase of endosperm development. Conclusion: The new HiT-MDS-2 method helps to create global views of expression patterns and to validate centroids obtained from clustering programs. Furthermore, functional gene annotation for developing endosperm barley tissue is successfully mapped to the visualization, making easy localization of major centroids of enriched functional categories possible.
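A correlation-based dissimilarity with a power transformation can be sketched as follows. This is a minimal illustration, not the HiT-MDS-2 implementation: the form `(1 - r) ** power`, the function names, and the toy profiles are assumptions, since the abstract does not spell out the exact transformation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long profiles.

    Raises ZeroDivisionError for a constant profile (zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_dissimilarity(x, y, power=1.0):
    """Assumed form: (1 - r) ** power, with r clipped to [-1, 1]."""
    r = max(-1.0, min(1.0, pearson(x, y)))
    return (1.0 - r) ** power

# Toy temporal profiles (hypothetical values, not data from the study).
rising = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]
falling = [4.0, 3.0, 2.0, 1.0]

d_same_shape = correlation_dissimilarity(rising, scaled, power=2.0)
d_opposite = correlation_dissimilarity(rising, falling, power=2.0)
```

Profiles with the same shape land close together regardless of scale, while anti-correlated profiles are pushed far apart; under this assumed form, a larger `power` widens the relative gap between weakly and strongly dissimilar pairs.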
PCA Beyond The Concept of Manifolds: Principal Trees, Metro Maps, and Elastic Cubic Complexes
Multidimensional data distributions can have complex topologies and variable
local dimensions. To approximate complex data, we propose a new type of
low-dimensional "principal object": a principal cubic complex. This complex
is a generalization of linear and non-linear principal manifolds and includes
them as a particular case. To construct such an object, we combine a method of
topological grammars with the minimization of an elastic energy defined for its
embedment into multidimensional data space. The whole complex is presented as a
system of nodes and springs and as a product of one-dimensional continua
(represented by graphs), and the grammars describe how these continua transform
during the process of optimal complex construction. The simplest case of a
topological grammar ("add a node", "bisect an edge") is equivalent to the
construction of "principal trees", an object useful in many practical
applications. We demonstrate how it can be applied to the analysis of bacterial
genomes and for visualization of cDNA microarray data using the "metro map"
representation. The preprint is supplemented by animation: "How the
topological grammar constructs branching principal components
(AnimatedBranchingPCA.gif)".
Comment: 19 pages, 8 figures
BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection
Background: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc., also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence. Results: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated by an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches: ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated. Conclusion: BLProt achieves 80% accuracy from training (5-fold cross-validation) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. The BLProt software is available at http://www.inb.uni-luebeck.de/tools-demos/bioluminescent%20protein/BLProt
A Thalamocortical Neural Mass Model of the EEG during NREM Sleep and Its Response to Auditory Stimulation
Few models exist that accurately reproduce the complex rhythms of the thalamocortical system that are apparent in measured scalp EEG and, at the same time, are suitable for large-scale simulations of brain activity. Here, we present a neural mass model of the thalamocortical system during natural non-REM sleep, which is able to generate fast sleep spindles (12–15 Hz), slow oscillations (<1 Hz) and K-complexes, as well as their distinct temporal relations and response to auditory stimuli. We show that with the inclusion of detailed calcium currents, the thalamic neural mass model is able to generate different firing modes, and validate the model with EEG data from a recent sleep study in humans, where closed-loop auditory stimulation was applied. The model output relates directly to the EEG, which makes it a useful basis to develop new stimulation protocols.
Characterization of K-Complexes and Slow Wave Activity in a Neural Mass Model
NREM sleep is characterized by two hallmarks, namely K-complexes (KCs) during sleep stage N2 and cortical slow oscillations (SOs) during sleep stage N3. While the underlying dynamics on the neuronal level are well known and can be easily measured, the resulting behavior on the macroscopic population level remains unclear. On the basis of an extended neural mass model of the cortex, we suggest a new interpretation of the mechanisms responsible for the generation of KCs and SOs. As the cortex transitions from wake to deep sleep, in our model it approaches an oscillatory regime via a Hopf bifurcation. Importantly, there is a canard phenomenon arising from a homoclinic bifurcation, whose orbit determines the shape of large amplitude SOs. A KC corresponds to a single excursion along the homoclinic orbit, while SOs are noise-driven oscillations around a stable focus. The model generates both time series and spectra that strikingly resemble real electroencephalogram data and points out possible differences between the different stages of natural sleep.