13,453 research outputs found

Research on Pattern Matching with Wildcards and Length Constraints: Methods and Completeness

Author: Hu Xuegang
Wang Haiping
Xiang Taining
Publication venue: IntechOpen
Publication date: 28/11/2012

Identification of hot regions in protein-protein interactions by sequential pattern mining

Author: Chen Chien-Yu
Hsu Chen-Ming
Huang Chih-Chang
Laio Min-Hung
Lin Chien-Chieh
Liu Baw-Jhiune
Wu Tzung-Lin
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Identification of protein interacting sites is an important task in computational molecular biology. As more and more protein sequences are deposited without available structural information, it is strongly desirable to predict protein binding regions from their sequences alone. This paper presents a pattern mining approach to tackle this problem. It is observed that a functional region of a protein structure usually consists of several peptide segments linked by large wildcard regions. Thus, the proposed mining technique considers large irregular gaps when growing patterns, in order to find residues that are simultaneously conserved but widely separated on the sequences. A derived pattern is called a cluster-like pattern since the discovered conserved residues are always grouped into several blocks, each of which corresponds to a locally conserved region on the protein sequence. Results: The experiments conducted in this work demonstrate that the derived long patterns automatically discover the important residues that form one or several hot regions of protein-protein interactions. The methodology is evaluated by conducting experiments on the web server MAGIIC-PRO based on a well-known benchmark containing 220 protein chains from 72 distinct complexes. Among the tested 218 proteins, there are 900 sequential blocks discovered, 4.25 blocks per protein chain on average. About 92% of the derived blocks are observed to be clustered in space with at least one of the other blocks, and about 66% of the blocks are found to be near the interface of protein-protein interactions. In summary, for about 83% of the tested proteins, at least two interacting blocks can be discovered by this approach. Conclusion: This work aims to demonstrate that the important residues associated with the interface of protein-protein interactions may be automatically discovered by sequential pattern mining. The detected regions possess high conservation and thus are considered computational hot regions. This information would be useful for characterizing protein sequences, predicting protein function, finding potential partners, and facilitating protein docking for drug discovery.
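
The idea of conserved blocks separated by large, irregular wildcard gaps can be illustrated with a small matching sketch. This is not the mining algorithm behind MAGIIC-PRO, which the abstract does not specify; it only shows how a pattern that has already been found could be located in a sequence. The function names, the gap parameters, and the example sequence are all hypothetical.

```python
import re

def compile_gapped_pattern(blocks, min_gap=0, max_gap=None):
    """Build a regex that matches the blocks in order, separated by wildcard gaps.

    min_gap / max_gap bound the number of arbitrary residues between two
    consecutive blocks; max_gap=None means the gap is unbounded.
    """
    upper = "" if max_gap is None else str(max_gap)
    gap = ".{%d,%s}?" % (min_gap, upper)  # lazy quantifier: prefer the nearest next block
    return re.compile(gap.join("(" + re.escape(b) + ")" for b in blocks))

def find_blocks(sequence, blocks, min_gap=0, max_gap=None):
    """Return the (start, end) span of each block in the first match, or None."""
    match = compile_gapped_pattern(blocks, min_gap, max_gap).search(sequence)
    if match is None:
        return None
    return [match.span(i + 1) for i in range(len(blocks))]

# Hypothetical example: three conserved blocks on a made-up sequence.
sequence = "MKTAYIAKQRQISFVKSHFSRQ"
print(find_blocks(sequence, ["KTA", "QIS", "HFS"]))             # [(1, 4), (10, 13), (17, 20)]
print(find_blocks(sequence, ["KTA", "QIS", "HFS"], max_gap=3))  # None, the first gap is 6 residues
```

Bounding the gap length is what distinguishes wildcard matching with length constraints from plain subsequence matching: the same blocks either match or fail depending on how far apart they are allowed to be.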

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

National Taiwan University Repository

Large-Scale Pattern-Based Information Extraction from the World Wide Web

Author: Blohm Sebastian
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

Extracting information from text is the task of obtaining structured, machine-processable facts from information that is mentioned in an unstructured manner. It thus allows systems to automatically aggregate information for further analysis, efficient retrieval, automatic validation, or appropriate visualization. This work explores the potential of using textual patterns for Information Extraction from the World Wide Web.

Directory of Open Access Books (DOAB)

How product market reforms lubricate shock adjustment in the euro area

Author: Alessandro Maravalle
Jacques Pelkmans
Lourdes Acedo Montoya

The essay sets out what product market reforms are, as well as the main measurement issues, followed by an analysis of how such reforms lubricate adjustment processes in EMU, in particular via the “competitiveness channel”. Attention is paid to the short-run and longer-run aspects of adjustments to shocks and the scant empirical evidence on the role of product markets in adjustment is discussed. The essay investigates empirically the need for product market reforms in the euro area, based on the KLEMS data set. Two questions are addressed: how likely is it for euro area countries to experience an asymmetric shock, and what empirical evidence can we deduce about eurozone countries' capacities to adjust to asymmetric shocks? The approach is disaggregated and highlights (especially services) sectors with relatively greater adjustment problems. The record of product market reforms of the euro area countries is briefly summarized. The paper shows that substantial reforms have been undertaken, yet, there is considerable evidence that the eurozone, and in particular with respect to services, could significantly intensify product market reforms and thereby augment the net benefits of having a single currency. Subsequently, product market reforms are placed in the context of wider reforms efforts (complementarities e.g. with labour and financial markets) as well as in the two-tier institutional structure of the euro area and the EU at large (given cross-border spillovers and the case for coordination). Designing reforms in this euro area context is briefly discussed. A final section with five “policy messages” concludes the essay. Keywords: adjustment, product market reforms, asymmetric shocks, Pelkmans, Acedo Montoya, Maravalle

Research Papers in Economics

Interpretable Sequence Clustering

Author: Dong Junjie
He Zengyou
Hu Lianyu
Jiang Mudi
Yang Xinyi
Publication date: 03/09/2023

Categorical sequence clustering plays a crucial role in various fields, but the lack of interpretability in cluster assignments poses significant challenges. Sequences inherently lack explicit features, and existing sequence clustering algorithms rely heavily on complex representations, making it difficult to explain their results. To address this issue, we propose a method called Interpretable Sequence Clustering Tree (ISCT), which combines sequential patterns with a concise and interpretable tree structure. ISCT leverages k-1 patterns to generate k leaf nodes, corresponding to k clusters, which provides an intuitive explanation of how each cluster is formed. More precisely, ISCT first projects sequences into random subspaces and then utilizes the k-means algorithm to obtain high-quality initial cluster assignments. Subsequently, it constructs a pattern-based decision tree using a boosting-based construction strategy in which sequences are re-projected and re-clustered at each node before mining the top-1 discriminative splitting pattern. Experimental results on 14 real-world data sets demonstrate that our proposed method provides an interpretable tree structure while delivering fast and accurate cluster assignments. Comment: 11 pages, 6 figure
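
The statement that k-1 patterns yield k leaf nodes can be made concrete with a minimal sketch of how a pattern-based tree assigns a sequence to a cluster. It covers only the assignment step, not the random projection, k-means initialization, or boosting-based construction the abstract mentions. The split test used here (whether the sequence contains the pattern as a possibly non-contiguous subsequence) is an assumption, and the class and function names are hypothetical rather than taken from the ISCT paper.

```python
from dataclasses import dataclass
from typing import Sequence, Union

def contains_subsequence(seq: Sequence, pattern: Sequence) -> bool:
    """True if the items of `pattern` appear in `seq` in order, gaps allowed."""
    remaining = iter(seq)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)

@dataclass
class Leaf:
    cluster: int  # cluster label carried by this leaf

@dataclass
class Node:
    pattern: tuple                   # splitting pattern held at this internal node
    match: Union["Node", Leaf]       # branch taken when the sequence contains the pattern
    no_match: Union["Node", Leaf]    # branch taken otherwise

def assign(tree: Union[Node, Leaf], seq: Sequence) -> int:
    """Walk from the root to a leaf and return that leaf's cluster label."""
    while isinstance(tree, Node):
        tree = tree.match if contains_subsequence(seq, tree.pattern) else tree.no_match
    return tree.cluster

# Two splitting patterns (k - 1 = 2) give three leaves (k = 3).
tree = Node(("a", "b"), Leaf(0), Node(("c",), Leaf(1), Leaf(2)))
print(assign(tree, "xaybz"))  # 0: contains a ... b
print(assign(tree, "ccd"))    # 1: no a ... b, but contains c
print(assign(tree, "xyz"))    # 2: matches neither pattern
```

Each cluster is explained by the path of matched and unmatched patterns leading to its leaf, which is the sense in which such a tree is interpretable.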

arXiv.org e-Print Archive

Human Motion Trajectory Prediction: A Survey

Author: Arras Kai O.
Gavrila Dariu M.
Herman Michael
Kitani Kris M.
Palmieri Luigi
Rudenko Andrey
Publication venue: SAGE Publications
Publication date: 17/12/2019

With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research. Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 page

arXiv.org e-Print Archive