9,751 research outputs found

MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction

Authors: Lanchantin Jack; Lin Zeming; Qi Yanjun
Publication date: 21/02/2016

Predicting protein properties such as solvent accessibility and secondary structure from a protein's primary amino acid sequence is an important task in bioinformatics. Recently, a few deep learning models have surpassed the traditional window-based multilayer perceptron. Taking inspiration from the image classification domain, we propose a deep convolutional neural network architecture, MUST-CNN, to predict protein properties. This architecture uses a novel multilayer shift-and-stitch (MUST) technique to generate fully dense per-position predictions on protein sequences. Our model is significantly simpler than the state of the art, yet achieves better results. By combining MUST and the efficient convolution operation, we can consider far more parameters while retaining very fast prediction speeds. We beat the state-of-the-art performance on two large protein property prediction datasets.

Comment: 8 pages; 3 figures; deep learning based sequence-sequence prediction. in AAAI 201
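
As a rough illustration of the shift-and-stitch idea mentioned in the abstract, the sketch below uses a toy stand-in for a network: a single non-overlapping max-pooling step with stride `stride`, which on its own yields only one output per `stride` input positions. Running it on each shifted copy of the input and interleaving the results recovers one output per position. The function names, the zero padding, and the single pooling layer are assumptions made for this example; it is not the authors' multilayer implementation.

```python
def pool(seq, stride):
    """Toy 'network' (assumption): non-overlapping max pooling.

    Produces one output per `stride` inputs, so the result is coarser
    than the input sequence.
    """
    return [max(seq[i:i + stride])
            for i in range(0, len(seq) - stride + 1, stride)]


def shift_and_stitch(seq, stride, pad=0):
    """Recover one output per input position from a strided model.

    The input is padded on the right so every position starts a full
    window, the model is run once per shift, and the coarse outputs are
    interleaved back into their original positions.
    """
    n = len(seq)
    padded = list(seq) + [pad] * (stride - 1)
    dense = [None] * n
    for shift in range(stride):
        coarse = pool(padded[shift:], stride)
        for j, value in enumerate(coarse):
            # Output j of this shifted run describes the window that
            # starts at original position shift + j * stride.
            dense[shift + j * stride] = value
    return dense


scores = [3, 1, 4, 1, 5, 9, 2, 6]
print(pool(scores, 2))              # [3, 4, 9, 6]
print(shift_and_stitch(scores, 2))  # [3, 4, 4, 5, 9, 9, 6, 6]
```

The stitched result has the same length as the input, at the cost of `stride` forward passes through the strided model.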

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Algorithmic information and incompressibility of families of multidimensional networks

Authors: Abrahão Felipe S.; Wehmuth Klaus; Zenil Hector; Ziviani Artur
Publication date: 01/07/2020

This article presents a theoretical investigation of string-based generalized representations of families of finite networks in a multidimensional space. First, we study the recursive labeling of networks with (finite) arbitrary node dimensions (or aspects), such as time instants or layers. In particular, we study such networks formalized as multiaspect graphs. We show that, unlike classical graphs, the algorithmic information of a multidimensional network is not in general dominated by the algorithmic information of the binary sequence that determines the presence or absence of edges. This universal algorithmic approach sets limitations and conditions for irreducible information content analysis in comparing networks with a large number of dimensions, such as multilayer networks. Nevertheless, we show that there are particular cases of infinite nesting families of finite multidimensional networks with a unified recursive labeling such that each member of these families is incompressible. From these results, we study network topological properties and equivalences in irreducible information content of multidimensional networks in comparison to their isomorphic classical graphs.

Comment: Extended preprint version of the pape

arXiv.org e-Print Archive

CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network

Authors: Asgari Fereshteh; El-Yacoubi Mounim; Gauthier Vincent; Sultan Alexis; Xiong Haoyi
Publication date: 22/04/2016

Mobile phone data have recently become an attractive source of information about mobility behavior. Since cell phone data can be captured in a passive way for a large user population, they can be harnessed to collect well-sampled mobility information. In this paper, we propose CT-Mapper, an unsupervised algorithm that enables the mapping of mobile phone traces over a multimodal transport network. One of the main strengths of CT-Mapper is its capability to map noisy, sparse, multimodal cellular trajectories over a multilayer transportation network whose layers have different physical properties, rather than only mapping trajectories associated with a single layer. Such a network is modeled by a large multilayer graph in which the nodes correspond to metro/train stations or road intersections and edges correspond to connections between them. The mapping problem is modeled by an unsupervised HMM where the observations correspond to sparse user mobile trajectories and the hidden states to the multilayer graph nodes. The HMM is unsupervised as the transition and emission probabilities are inferred using, respectively, the physical transportation properties and the information on the spatial coverage of antenna base stations. To evaluate CT-Mapper, we collected cellular traces with their corresponding GPS trajectories for a group of volunteer users in Paris and vicinity (France). We show that CT-Mapper is able to accurately retrieve the real cell phone user paths despite the sparsity of the observed trace trajectories. Furthermore, our transition probability model is up to 20% more accurate than other naive models.

Comment: Under revision in Computer Communication Journa
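
The HMM formulation described in the abstract can be illustrated with a small decoding sketch. It uses the standard Viterbi algorithm to recover the most likely sequence of hidden graph nodes from a sequence of observed cell IDs. The abstract does not state which decoding procedure CT-Mapper uses, and the node names, cell IDs, and probability values below are hypothetical, hand-set numbers; in CT-Mapper the transition and emission probabilities are derived from transportation properties and antenna coverage instead.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((r, best[-1][r] + log(trans_p[r].get(s, 0.0))) for r in states),
                key=lambda pair: pair[1],
            )
            scores[s] = score + log(emit_p[s].get(obs, 0.0))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Start from the best final state and follow the back-pointers.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy network: two metro stations and one road intersection.
states = ["metro_A", "metro_B", "road_X"]
start_p = {s: 1 / 3 for s in states}
trans_p = {
    "metro_A": {"metro_A": 0.5, "metro_B": 0.4, "road_X": 0.1},
    "metro_B": {"metro_A": 0.4, "metro_B": 0.5, "road_X": 0.1},
    "road_X": {"metro_A": 0.1, "metro_B": 0.1, "road_X": 0.8},
}
# Hypothetical antenna coverage: chance that each node is seen by each cell.
emit_p = {
    "metro_A": {"cell_1": 0.7, "cell_2": 0.2, "cell_3": 0.1},
    "metro_B": {"cell_1": 0.2, "cell_2": 0.7, "cell_3": 0.1},
    "road_X": {"cell_1": 0.1, "cell_2": 0.1, "cell_3": 0.8},
}

path = viterbi(["cell_1", "cell_2", "cell_3"], states, start_p, trans_p, emit_p)
print(path)  # ['metro_A', 'metro_B', 'road_X']
```

Working in log space avoids numerical underflow on long traces, which matters when the observation sequence covers many antenna handovers.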

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Hal-Diderot