MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction
Predicting protein properties such as solvent accessibility and secondary
structure from a protein's primary amino acid sequence is an important task in
bioinformatics. Recently, a few deep learning models have surpassed the
traditional window-based multilayer perceptron. Taking inspiration from the
image classification domain, we propose a deep convolutional neural network
architecture, MUST-CNN, to predict protein properties. This architecture uses a
novel multilayer shift-and-stitch (MUST) technique to generate fully dense
per-position predictions on protein sequences. Our model is significantly
simpler than the state-of-the-art, yet achieves better results. By combining
MUST and the efficient convolution operation, we can consider far more
parameters while retaining very fast prediction speeds. We beat the
state-of-the-art performance on two large protein property prediction datasets.
Comment: 8 pages; 3 figures; deep learning based sequence-sequence
prediction. in AAAI 201
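The abstract does not spell out the multilayer variant, but the basic shift-and-stitch idea it builds on can be sketched for a single pooling layer. In the minimal sketch below, `coarse_model` is a hypothetical stand-in for a network that downsamples by a factor `pool` (here just max pooling), and `shift_and_stitch` is an illustrative name, not the authors' code. The input is run once per shift and the coarse outputs are interleaved to give one prediction per position.

```python
def coarse_model(seq, pool):
    """Toy stand-in for a pooled network: one output per `pool` inputs."""
    return [max(seq[i:i + pool]) for i in range(0, len(seq) - pool + 1, pool)]


def shift_and_stitch(seq, pool):
    """Recover one prediction per position from a model that downsamples by `pool`."""
    n = len(seq)
    # Pad the tail so the last positions still see a full window.
    padded = list(seq) + [float("-inf")] * (pool - 1)
    dense = [None] * n
    for shift in range(pool):
        # Run the coarse model on the input shifted left by `shift` positions.
        outputs = coarse_model(padded[shift:], pool)
        # Output j of this run belongs to original position shift + j * pool.
        for j, value in enumerate(outputs):
            dense[shift + j * pool] = value
    return dense


seq = [1, 3, 2, 5, 4]
dense = shift_and_stitch(seq, pool=2)
```

With `pool=2` the coarse model alone yields only two outputs for this five-element input, while the stitched result has one value per position, at the cost of running the model `pool` times.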
Algorithmic information and incompressibility of families of multidimensional networks
This article presents a theoretical investigation of string-based generalized
representations of families of finite networks in a multidimensional space.
First, we study the recursive labeling of networks with (finite) arbitrary node
dimensions (or aspects), such as time instants or layers. In particular, we
study networks that are formalized as multiaspect graphs. We
show that, unlike classical graphs, the algorithmic information of a
multidimensional network is not in general dominated by the algorithmic
information of the binary sequence that determines the presence or absence of
edges. This universal algorithmic approach sets limitations and conditions for
irreducible information content analysis in comparing networks with a large
number of dimensions, such as multilayer networks. Nevertheless, we show that
there are particular cases of infinite nesting families of finite
multidimensional networks with a unified recursive labeling such that each
member of these families is incompressible. From these results, we study
network topological properties and equivalences in irreducible information
content of multidimensional networks in comparison to their isomorphic
classical graph.
Comment: Extended preprint version of the pape
CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network
Mobile phone data have recently become an attractive source of information
about mobility behavior. Since cell phone data can be captured in a passive way
for a large user population, they can be harnessed to collect well-sampled
mobility information. In this paper, we propose CT-Mapper, an unsupervised
algorithm that enables the mapping of mobile phone traces over a multimodal
transport network. One of the main strengths of CT-Mapper is its ability to
map noisy, sparse, multimodal cellular trajectories over a multilayer
transportation network whose layers have different physical properties, rather
than only trajectories associated with a single layer. Such a network is
modeled by a large multilayer graph in which the nodes correspond to
metro/train stations or road intersections and edges correspond to connections
between them. The mapping problem is modeled by an unsupervised HMM where the
observations correspond to sparse user mobile trajectories and the hidden
states to the multilayer graph nodes. The HMM is unsupervised because the
transition and emission probabilities are inferred from, respectively, the
physical transportation properties and the spatial coverage of
antenna base stations. To evaluate CT-Mapper, we collected cellular traces with
their corresponding GPS trajectories for a group of volunteer users in Paris
and vicinity (France). We show that CT-Mapper is able to accurately retrieve
the real cell phone user paths despite the sparsity of the observed trace
trajectories. Furthermore, our transition probability model is up to 20% more
accurate than naive models.
Comment: Under revision in Computer Communication Journa
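The HMM formulation above (hidden states are graph nodes, observations are cellular records) is typically decoded with the Viterbi algorithm. The sketch below is a generic Viterbi decoder on a made-up three-node line graph, not CT-Mapper itself. The node names, antenna labels, and all probabilities are assumptions for illustration only; in the paper they would be derived from transportation properties and antenna coverage.

```python
import math


def _log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    best = {s: _log(start_p[s]) + _log(emit_p[s].get(observations[0], 0))
            for s in states}
    back = []  # one back-pointer table per step after the first
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            # Best predecessor of state s at this step.
            r = max(states, key=lambda q: prev[q] + _log(trans_p[q].get(s, 0)))
            pointers[s] = r
            best[s] = (prev[r] + _log(trans_p[r].get(s, 0))
                       + _log(emit_p[s].get(obs, 0)))
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy network: nodes A - B - C on a line, two antennas "x" and "y".
states = ["A", "B", "C"]
start_p = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
trans_p = {"A": {"A": 0.5, "B": 0.5},
           "B": {"A": 0.25, "B": 0.5, "C": 0.25},
           "C": {"B": 0.5, "C": 0.5}}
emit_p = {"A": {"x": 0.9, "y": 0.1},
          "B": {"x": 0.5, "y": 0.5},
          "C": {"x": 0.1, "y": 0.9}}
path = viterbi(["x", "x", "y", "y"], states, start_p, trans_p, emit_p)
```

Missing entries in `trans_p` encode node pairs that are not connected, so the decoded path can only move along edges of the graph.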