
    Deviation detection in text using conceptual graph interchange format and error tolerance dissimilarity function

    The rapid increase in the amount of textual data has generated a growing research interest in mining text to detect deviations. Specialized methods for specific domains have emerged to satisfy various needs in discovering rare patterns in text. This paper focuses on a graph-based approach to text representation and presents a novel error-tolerance dissimilarity algorithm for deviation detection. We address two non-trivial problems: the semantic representation of text and the complexity of graph matching. We employ the Conceptual Graph Interchange Format (CGIF), a knowledge representation formalism, to capture the structure and semantics of sentences, and we propose a novel error-tolerance dissimilarity algorithm to detect deviations in the CGIFs. We evaluate our method in the context of analyzing real-world financial statements to identify deviating performance indicators, and show that it performs better than two related text-based graph similarity measures. The proposed method identifies deviating sentences and correlates strongly with expert judgments. Furthermore, it offers error-tolerant matching of CGIFs and retains linear complexity as the number of CGIFs increases.
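
    As an illustration of the idea only (not the authors' published algorithm), the sketch below computes an error-tolerant dissimilarity between two conceptual graphs reduced to sets of (concept, relation, concept) triples: exact triple matches cost nothing, partially matching triples incur a reduced penalty, and unmatched triples a full one. The triple encoding, the 0.5 tolerance penalty, and the example graphs are assumptions made for illustration; the published method is also reported to scale linearly with the number of CGIFs, which this toy version does not attempt to reproduce.

        def triple_distance(t1, t2):
            """0.0 for an exact match, 0.5 for a tolerated partial match
            (same relation, one shared concept), 1.0 otherwise."""
            if t1 == t2:
                return 0.0
            c1, r1, c2 = t1
            d1, r2, d2 = t2
            if r1 == r2 and (c1 == d1 or c2 == d2):
                return 0.5
            return 1.0

        def dissimilarity(graph_a, graph_b):
            """Average, over graph_a's triples, of the best distance to any
            triple in graph_b; 0 means identical, 1 means disjoint."""
            if not graph_a or not graph_b:
                return 0.0 if graph_a == graph_b else 1.0
            total = sum(min(triple_distance(t, u) for u in graph_b) for t in graph_a)
            return total / len(graph_a)

        # A sentence whose triples deviate from a reference scores higher.
        g1 = {("revenue", "attr", "increase"), ("profit", "attr", "growth")}
        g2 = {("revenue", "attr", "decline"), ("profit", "attr", "growth")}
        print(dissimilarity(g1, g2))  # 0.25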

    Dissimilarity algorithm on conceptual graphs to mine text outliers

    Graphical text representation methods such as Conceptual Graphs (CGs) attempt to capture the structure and semantics of documents. As such, they are a preferred text representation approach for a wide range of problems in natural language processing, information retrieval and text mining. In a number of these applications, it is necessary to measure the dissimilarity (or similarity) between the knowledge represented in the CGs. In this paper, we present a dissimilarity algorithm to detect outliers in a collection of texts represented with the Conceptual Graph Interchange Format (CGIF). To avoid the NP-complete problem of graph matching, we introduce the use of a standard CG in the dissimilarity computation. We evaluate our method in the context of analyzing real-world financial statements to identify outlying performance indicators. For evaluation purposes, we compare the proposed dissimilarity function with a dice-coefficient similarity function used in related previous work. Experimental results indicate that our method outperforms the existing method and correlates better with human judgements. Compared with other text outlier detection methods, this approach captures the semantics of documents through the use of CGs and detects outliers with a simple dissimilarity function. Furthermore, the proposed algorithm retains linear complexity as the number of CGs increases.
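
    For reference, a minimal sketch of the dice-coefficient similarity used as the baseline in this evaluation, computed here over sets of CG triples, together with an outlier score against a single standard CG; comparing every document graph only against the standard graph, rather than against all other graphs, is what keeps the cost linear in the number of graphs. The triple representation, function names, and example data are illustrative assumptions, not the paper's implementation.

        def dice_similarity(graph_a, graph_b):
            """Dice coefficient over two sets of CG triples:
            2 * |A intersect B| / (|A| + |B|)."""
            if not graph_a and not graph_b:
                return 1.0
            return 2 * len(graph_a & graph_b) / (len(graph_a) + len(graph_b))

        def outlier_scores(graphs, standard):
            """Dissimilarity of each document graph to the standard CG;
            the highest-scoring graphs are candidate text outliers."""
            return [1.0 - dice_similarity(g, standard) for g in graphs]

        standard = {("company", "attr", "profit"), ("profit", "attr", "growth")}
        docs = [
            {("company", "attr", "profit"), ("profit", "attr", "growth")},  # typical
            {("company", "attr", "loss"), ("loss", "attr", "lawsuit")},     # outlier
        ]
        print(outlier_scores(docs, standard))  # [0.0, 1.0]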

    Legal compliance by design (LCbD) and through design (LCtD) : preliminary survey

    1st Workshop on Technologies for Regulatory Compliance, co-located with the 30th International Conference on Legal Knowledge and Information Systems (JURIX 2017). The purpose of this paper is twofold: (i) to carry out a preliminary survey of the literature and research projects on Compliance by Design (CbD); and (ii) to clarify the double process of (a) extending business management techniques to other regulatory fields, and (b) converging trends in legal theory, legal technology and Artificial Intelligence. The paper highlights the connections and differences we found across different domains and proposals. We distinguish three different policy-driven types of CbD: (i) business, (ii) regulatory, and (iii) legal. The recent deployment of ethical views and the implementation of general principles of privacy and data protection lead to the conclusion that, in order to appropriately define legal compliance, Compliance through Design (CtD) should be differentiated from CbD.

    Integrating data warehouses with web data : a survey

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query, and retrieve Web data, and their application to DWs. The paper reviews different distributed DW architectures and the use of XML languages as an integration tool in these systems. It also introduces the problem of dealing with semistructured data in a DW. It studies Web data repositories, the design of multidimensional databases for XML data sources, and the XML extensions of On-Line Analytical Processing (OLAP) techniques. The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to identify the main limitations and opportunities offered by the combination of the DW and Web fields, as well as open research lines.
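
    As a small, hypothetical example of the kind of integration the survey reviews, the sketch below flattens a semistructured XML feed from a Web source into rows suitable for a relational fact table. The element names, columns, and sample feed are invented for illustration and do not come from any of the surveyed systems.

        import xml.etree.ElementTree as ET

        XML_FEED = """
        <sales>
          <sale date="2024-01-05" region="EMEA"><amount>1200.50</amount></sale>
          <sale date="2024-01-06" region="APAC"><amount>830.00</amount></sale>
        </sales>
        """

        def xml_to_fact_rows(xml_text):
            """Map each <sale> element onto a (date, region, amount) tuple,
            i.e. one row of a hypothetical sales fact table."""
            root = ET.fromstring(xml_text)
            return [
                (sale.get("date"), sale.get("region"), float(sale.findtext("amount")))
                for sale in root.iter("sale")
            ]

        print(xml_to_fact_rows(XML_FEED))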

    Continuous Process Auditing (CPA): an Audit Rule Ontology Approach to Compliance and Operational Audits

    Continuous Auditing (CA) has been investigated over time and is, to some extent, in practice within financial and transactional auditing as a part of continuous assurance and monitoring. Enterprise Information Systems (EIS) that run their activities in the form of processes require continuous auditing of a process that invokes the action(s) specified in the policies and rules, in a continuous manner and sometimes in real time. This leads to the question: how closely could continuous auditing mimic the actual auditing procedures performed by auditing professionals? We investigate some of these questions through Continuous Process Auditing (CPA), relying on the heterogeneous activities of processes in the EIS, as well as detecting exceptions and evidence in current and historical databases to provide audit assurance.
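
    As a toy illustration of a continuously evaluated audit rule (the rule, field names, and threshold below are assumptions made for the example, not the paper's audit rule ontology), the sketch flags payment records that exceed an approval threshold without a recorded approver; in a continuous setting the same check would run as new records arrive.

        APPROVAL_THRESHOLD = 10_000  # hypothetical policy limit

        def audit_exceptions(transactions):
            """Return the transactions that violate the approval rule."""
            return [
                tx for tx in transactions
                if tx["amount"] > APPROVAL_THRESHOLD and not tx.get("approved_by")
            ]

        transactions = [
            {"id": 1, "amount": 2_500, "approved_by": None},
            {"id": 2, "amount": 15_000, "approved_by": None},   # flagged as an exception
            {"id": 3, "amount": 20_000, "approved_by": "cfo"},
        ]
        print(audit_exceptions(transactions))  # only the record with id 2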

    Development of linguistic linked open data resources for collaborative data-intensive research in the language sciences

    Making diverse data in linguistics and the language sciences open, distributed, and accessible: perspectives from language/language acquisition researchers and technical LOD (linked open data) researchers. This volume examines the challenges inherent in making diverse data in linguistics and the language sciences open, distributed, integrated, and accessible, thus fostering wide data sharing and collaboration. It is unique in integrating the perspectives of language researchers and technical LOD (linked open data) researchers. Reporting on both active research needs in the field of language acquisition and technical advances in the development of data interoperability, the book demonstrates the advantages of an international infrastructure for scholarship in the field of language sciences. With contributions by researchers who produce complex data content and scholars involved in both the technology and the conceptual foundations of LLOD (linguistics linked open data), the book focuses on the area of language acquisition because it involves complex and diverse data sets, cross-linguistic analyses, and urgent collaborative research. The contributors discuss a variety of research methods, resources, and infrastructures. Contributors: Isabelle Barrière, Nan Bernstein Ratner, Steven Bird, Maria Blume, Ted Caldwell, Christian Chiarcos, Cristina Dye, Suzanne Flynn, Claire Foley, Nancy Ide, Carissa Kang, D. Terence Langendoen, Barbara Lust, Brian MacWhinney, Jonathan Masci, Steven Moran, Antonio Pareja-Lora, Jim Reidy, Oya Y. Rieger, Gary F. Simons, Thorsten Trippel, Kara Warburton, Sue Ellen Wright, Claus Zinn.

    Managing Information System Integration Technologies--A Study of Text Mined Industry White Papers

    Industry white papers are increasingly being used to explain the philosophy and operation of a product in a marketplace or technology context. This explanation is used by senior managers for strategic planning in an organization. This research explores the effectiveness of white papers and strategies for managers to learn about technologies using white papers. The research is conducted by collecting industry white papers in the area of Information System Integration and gleaning relevant information through the text-mining tool Vantage Point. The text-mined information is analyzed to provide solutions for practical problems in the systems integration market. The indirect findings of the research are New System Integration Business Models, Methods for Calculating the ROI of System Integration Projects, and Managing Implementation Failures.