499 research outputs found

An introduction to spectral distances in networks (extended version)

Authors: Furlanello Cesare; Jurman Giuseppe; Visintainer Roberto
Publication date: 26/10/2010

Many functions have recently been defined to assess the similarity among networks as tools for quantitative comparison. They stem from very different frameworks and are tuned for different situations. Here we give an overview of the spectral distances, highlighting their behavior in some basic cases of static and dynamic, synthetic and real networks.
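
To make the idea of a spectral distance concrete, the sketch below compares two small undirected graphs by the Euclidean distance between their sorted Laplacian eigenvalues. This is one illustrative choice and an assumption: the abstract does not say which spectral distances the paper covers. The function names (`laplacian`, `symmetric_eigenvalues`, `spectral_distance`) are hypothetical, and a plain Jacobi iteration stands in for a library eigenvalue solver so that the example needs only the standard library.

```python
import math

def laplacian(n, edges):
    """Laplacian L = D - A of an undirected simple graph on nodes 0..n-1."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
        lap[u][u] += 1.0
        lap[v][v] += 1.0
    return lap

def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle in [-pi/4, pi/4] that zeroes a[p][q].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                # Apply the rotation to columns p and q, then to rows p and q.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_distance(n, edges_a, edges_b):
    """Euclidean distance between the sorted Laplacian spectra (assumed metric)."""
    spec_a = symmetric_eigenvalues(laplacian(n, edges_a))
    spec_b = symmetric_eigenvalues(laplacian(n, edges_b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(spec_a, spec_b)))

path = [(0, 1), (1, 2)]              # Laplacian spectrum 0, 1, 3
triangle = [(0, 1), (1, 2), (0, 2)]  # Laplacian spectrum 0, 3, 3
print(spectral_distance(3, path, triangle))  # close to 2.0
```

This form only compares graphs with the same number of nodes; distances that handle graphs of different sizes or weight the spectrum differently would need a different definition.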

arXiv.org e-Print Archive

CiteSeerX

Archivio della ricerca - Fondazione Bruno Kessler

Dyslexic children's reading pattern as input for ASR: Data, analysis, and pronunciation model

Authors: Husni Husniza; Jamaludin Zulikha
Publication venue: Universiti Utara Malaysia Press
Publication date: 01/01/2009

To realize an automatic speech recognition (ASR) model that can recognize the Bahasa Melayu reading difficulties of dyslexic children, the language corpus has to be generated beforehand. For this purpose, data were collected in two public schools from ten dyslexic children aged between seven and fourteen. A total of 114 Bahasa Melayu words, representing 23 consonant-vowel patterns in the spelling system of the language, served as the stimuli. The patterns range from simple to somewhat complex formations of consonant-vowel pairs in words listed in a level one primary school syllabus. An analysis was performed to identify the most frequent errors made by these dyslexic children when reading aloud and to describe the emerging reading pattern of dyslexic children in general. This paper hence provides an overview of the entire process, from data collection to analysis to modeling the pronunciations of words that will serve as the active lexicon for the ASR model. It also highlights the challenges of collecting data from dyslexic children reading aloud, and other factors that contribute to the complex nature of the data collected.

UUM Repository

Hunting for Pirated Software Using Metamorphic Analysis

Author: Rana Hardikkumar
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2014

In this paper, we consider the problem of detecting software that has been pirated and modified. We analyze a variety of detection techniques that have been previously studied in the context of malware detection. For each technique, we empirically determine the detection rate as a function of the degree of modification of the original code. We show that the code must be greatly modified before we fail to reliably distinguish it, and we show that our results offer a significant improvement over previous related work. Our approach can be applied retroactively to any existing software and is hence both practical and effective.

SJSU ScholarWorks

Gene Family Histories: Theory and Algorithms

Author: Schaller David
Publication date: 01/01/2021

Detailed gene family histories and reconciliations with species trees are a prerequisite for studying associations between genetic and phenotypic innovations. Even though the true evolutionary scenarios are usually unknown, they impose certain constraints on the mathematical structure of data obtained from simple yes/no questions in pairwise comparisons of gene sequences. Recent advances in this field have led to the development of methods for reconstructing (aspects of) the scenarios on the basis of such relation data, which can most naturally be represented by graphs on the set of considered genes. We provide here novel characterizations of best match graphs (BMGs), which capture the notion of (reciprocal) best hits based on sequence similarities. BMGs provide the basis for the detection of orthologous genes (genes that diverged after a speciation event). There are two main sources of error in pipelines for orthology inference based on BMGs. First, measurement errors in the estimation of best matches from sequence similarity in general lead to violations of the characteristic properties of BMGs. The second issue concerns the reconstruction of the orthology relation from a BMG. We show how to correct estimated BMGs to mathematically valid ones and how much information about orthologs is contained in BMGs. We then discuss implicit methods for horizontal gene transfer (HGT) inference that focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of an undirected graph, the later-divergence-time (LDT) graph. We explore the mathematical structure of LDT graphs and show how much information about all HGT events is contained in such LDT graphs.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

MPG.PuRe

Qucosa - Publikationsserver der Universität Leipzig

Scalable string reconciliation by recursive content-dependent shingling

Author: Song Bowen
Publication date: 04/06/2019

We consider the problem of reconciling similar strings in a distributed system. Specifically, we are interested in performing this reconciliation efficiently, minimizing the communication cost. Our problem applies to several types of large-scale distributed networks, file synchronization utilities, and any system that manages the consistency of string-encoded ordered data. We present the novel Recursive Content-Dependent Shingling (RCDS) protocol, which can handle large strings and has a communication complexity that scales with the edit distance between the reconciling strings. We also provide analysis, experimental results, and comparisons of an implementation of our protocol with existing synchronization software such as the Rsync utility.
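
The quantity that the communication cost is said to scale with can be illustrated with a short sketch. It assumes the standard Levenshtein definition of edit distance (insertions, deletions, and substitutions at unit cost); the abstract does not state which variant is meant, and this is not the RCDS protocol itself. The function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit-cost edits)."""
    # previous[j] holds the distance between the current prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]

print(edit_distance("kitten", "sitting"))  # 3
```

Two large files that differ by a handful of edits have a small edit distance, which is the regime where a protocol with this scaling would be expected to send little data.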

Boston University Institutional Repository (OpenBU)

Alternative Ranking-Based Clustering and Reliability Index-Based Consensus Reaching Process for Hesitant Fuzzy Large Scale Group Decision Making

Authors: Ding Ru-Xi; Herrera Triguero Francisco; Liu Xiu; Montes Soldado Rosa Ana; Xu Yejun
Publication venue: Elsevier
Publication date: 18/10/2018

The paper addresses the growing importance of Large Scale Group Decision Making (LSGDM) problems, focusing on hesitant fuzzy LSGDM. It introduces a Reliability Index-based Consensus Reaching Process (RI-CRP) to enhance efficiency. The proposed method assesses the ordinal consistency of decision makers' (DMs) information, measures deviation, and assigns a reliability index to DMs' opinions. A method for managing unreliable DMs is presented to filter out unreliable information. Additionally, an Alternative Ranking-based Clustering (ARC) method with hesitant fuzzy reciprocal preference relations is proposed to improve the efficiency of RI-CRP. The numerical example demonstrates the feasibility and effectiveness of the ARC method and RI-CRP for hesitant fuzzy LSGDM problems.

Instituto Interuniversitario de Investigación en Data Science and Computational Intelligence (DaSCI)

Repositorio Institucional Universidad de Granada

Heuristic Algorithms for Best Match Graph Editing

Authors: Geiß Manuela; Hellmuth Marc; Schaller David; Stadler Peter F.
Publication date: 01/01/2021

Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics and can be approximated with the help of similarity measures between gene sequences, albeit not without errors. The corresponding graph editing problem can be used as a means of error correction. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are needed if BMGs are to be used for the practical analysis of biological sequence data. Because BMGs have a characterization in terms of consistency of a certain set of rooted triples, we consider heuristics that operate on triple sets. As an alternative, we show that there is a close connection to a set partitioning problem that leads to a class of top-down recursive algorithms that are similar to Aho's supertree algorithm and give rise to BMG editing algorithms that are consistent in the sense that they leave BMGs invariant. Extensive benchmarking shows that community detection algorithms for the partitioning steps perform best for BMG editing.
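
For readers unfamiliar with the supertree algorithm mentioned here, the following is a minimal sketch of the classic BUILD procedure of Aho et al. for testing whether a set of rooted triples is consistent. It is not one of the paper's BMG editing heuristics; it only shows the top-down recursive pattern they are said to resemble. The representation is an assumption: a triple `(a, b, c)` stands for ab|c (a and b are closer to each other than either is to c), and the returned tree is a nested tuple.

```python
def build(leaves, triples):
    """Return a tree (nested tuples) displaying all triples, or None if inconsistent."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))

    # Union-find over the current leaves, used to form the auxiliary (Aho) graph.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Join a and b for every triple ab|c whose three leaves are all present.
    for a, b, c in triples:
        if a in leaves and b in leaves and c in leaves:
            parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)

    # A single component means no valid split exists: the triples conflict.
    if len(components) == 1:
        return None

    subtrees = []
    for component in components.values():
        subtree = build(component, triples)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)

consistent = build(["a", "b", "c", "d"], [("a", "b", "c"), ("c", "d", "a")])
conflicting = build(["a", "b", "c"], [("a", "b", "c"), ("b", "c", "a")])
print(consistent is not None, conflicting is None)  # True True
```

The partitioning step (here, connected components) is the part that, according to the abstract, can be replaced by other set partitioning strategies such as community detection.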

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Novel Method for Measuring Structure and Semantic Similarity of XML Documents Based on Extended Adjacency Matrix

Authors: Fan Bao-Quan; Wang Xu; Wei Jin-Mao; Yang Ting; Zhang Xue-Liang
Publication venue: Published by Elsevier B.V.
Publication date: 31/12/2012

Similarity measurement of XML documents is crucial to meet various needs of approximate searches and document classifications in XML-oriented applications. Some methods have been proposed for this purpose. Nevertheless, few methods can be elegantly exploited to depict structure and semantic information and hence to effectively measure the similarity of XML documents. In this paper, we present a new method of computing the structure and semantic similarity of XML documents based on the extended adjacency matrix (EAM). Unlike a general adjacency matrix, an EAM can store the structure information of not only the adjacent layers but also the ancestor-descendant layers. To measure the similarity of two XML documents, the proposed method first stores the structure and semantic information in two extended adjacency matrices (M1, M2). It then computes the similarity of the two documents through cos(M1, M2). Experimental results on benchmark data show that the method achieves high efficiency and accuracy.
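
The final step, cos(M1, M2), can be sketched as follows. The sketch assumes that the cosine is taken over the flattened matrices (the Frobenius inner product divided by the product of Frobenius norms) and that both matrices have the same shape; the abstract does not spell out either point, nor how the EAM entries are built. The function name `matrix_cosine` and the toy matrices are hypothetical.

```python
import math

def matrix_cosine(m1, m2):
    """Cosine of the angle between two equally shaped matrices, treated as vectors."""
    flat1 = [value for row in m1 for value in row]
    flat2 = [value for row in m2 for value in row]
    if len(flat1) != len(flat2):
        raise ValueError("matrices must have the same shape")
    dot = sum(x * y for x, y in zip(flat1, flat2))
    norm1 = math.sqrt(sum(x * x for x in flat1))
    norm2 = math.sqrt(sum(y * y for y in flat2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # convention chosen here for an all-zero matrix
    return dot / (norm1 * norm2)

# Toy matrices standing in for two extended adjacency matrices.
m1 = [[1, 0], [1, 1]]
m2 = [[1, 0], [0, 1]]
print(matrix_cosine(m1, m2))
```

A value near 1 indicates nearly proportional matrices; a value of 0 indicates no overlapping non-zero entries when all entries are non-negative.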

Elsevier - Publisher Connector

The Circular Variance as a Visual Summary of Synchronized Voltage Angle Measurements

Authors: Aksoy Sinan; Becejac Tamara; Betzsold Nick; Bhadra Sraddhanjoli; Buckheit John; Follum Jim; Yin Tianzhixi
Publication date: 03/01/2023

Phasor measurement units (PMUs) allow voltage angle differences across power grids to be monitored to identify sudden shifts associated with system disturbances. The Eastern Interconnection Situational Awareness and Monitoring System (ESAMS) was developed to identify such wide-area disturbances and summarize them in reports released the following day. Demonstration of ESAMS in North America's Eastern Interconnection revealed the need for an effective visual summary of a disturbance's impact on voltage angle pairs. This paper proposes the use of the circular variance, a measure of dispersion applicable to angular data, for this purpose. Results based on PMU data from North America's Eastern and Western interconnections indicate that the circular variance provides useful summaries of wide-area voltage angle measurements. They also show that the circular variance may have potential uses when applied to historical data to identify unusual grid conditions.
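
As a small illustration of the statistic named here, the sketch below computes the circular variance under the common textbook definition, one minus the length of the mean resultant vector of the angles. Whether the paper uses exactly this convention (some texts scale it differently) is an assumption, and the function name `circular_variance` and the sample angles are hypothetical.

```python
import math

def circular_variance(angles_rad):
    """Circular variance 1 - R, where R is the mean resultant length of the angles."""
    n = len(angles_rad)
    if n == 0:
        raise ValueError("at least one angle is required")
    mean_cos = sum(math.cos(a) for a in angles_rad) / n
    mean_sin = sum(math.sin(a) for a in angles_rad) / n
    return 1.0 - math.hypot(mean_cos, mean_sin)

# Tightly clustered angles give a value near 0; opposed angles give a value near 1.
clustered = [math.radians(d) for d in (29.0, 30.0, 31.0)]
opposed = [0.0, math.pi]
print(circular_variance(clustered), circular_variance(opposed))
```

Because the statistic works on unit vectors rather than raw angle values, it is unaffected by wrap-around at 360 degrees, which is what makes it suitable for angular data.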

ScholarSpace at University of Hawai'i at Manoa