Flexible queries in XML native databases
To date, most flexible querying systems for native XML databases (DB) are
based on exploiting the tree structure of their semi-structured data (SSD).
However, it has become important to test the efficiency of the Formal Concept
Analysis (FCA) formalism on this type of data, since FCA has shown great
performance in the field of information retrieval (IR). FCA-based IR in XML
databases mainly relies on the lattice structure: each concept of this lattice
can be interpreted as a (response, query) pair. In this work, we provide a new
flexible modeling of XML DB based on fuzzy FCA as a first step towards
flexible querying of SSD.
Comment: 5 pages, 1 figure
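To make the (response, query) reading of a concept lattice concrete, here is a minimal FCA sketch over a toy binary context of XML elements and their features. The data and names are invented for illustration, and concepts are enumerated by brute force rather than by a lattice-construction algorithm:

```python
from itertools import chain, combinations

# Toy formal context: XML elements (objects) x features (attributes).
context = {
    "book1":   {"title", "author", "year"},
    "book2":   {"title", "author"},
    "article": {"title", "year"},
}
all_attrs = set().union(*context.values())

def intent(objects):
    """Attributes shared by every object in the set (read as the 'query')."""
    if not objects:
        return set(all_attrs)
    return set.intersection(*(context[o] for o in objects))

def extent(attrs):
    """Objects possessing every attribute in the set (read as the 'response')."""
    return {o for o, feats in context.items() if attrs <= feats}

def concepts():
    """All formal concepts: pairs (extent, intent) closed under both maps."""
    found = set()
    for attrs in chain.from_iterable(
            combinations(sorted(all_attrs), r) for r in range(len(all_attrs) + 1)):
        e = extent(set(attrs))
        found.add((frozenset(e), frozenset(intent(e))))
    return found

for e, i in sorted(concepts(), key=lambda c: -len(c[0])):
    print(sorted(e), "<->", sorted(i))
```

Each printed pair is one lattice concept: the extent answers the query stated by the intent. A fuzzy extension, as proposed in the abstract, would replace these crisp sets with membership degrees.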
Mining Semi-structured Data
The need to discover knowledge from XML documents according to both
structure and content features has become challenging, due to the growing
number of application contexts in which handling both structure and content
information in XML data is essential. The challenge is thus to find a
hierarchical structure that combines data levels with their representative
structures. In this work, we rely on Formal Concept Analysis-based views to
index and query both content and structure. We evaluate the given structure in
a querying process that allows searching for answers to user queries.
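As a toy illustration of indexing content and structure together (this is not the authors' FCA-based views; the paths, documents, and terms are invented), an inverted index can be keyed by (path, term) pairs so a query constrains both the XML path and the contained term:

```python
# Toy corpus: document id -> {XML path: element text}.
docs = {
    "d1": {"/book/title": "xml mining", "/book/author": "smith"},
    "d2": {"/article/title": "fca views", "/article/abstract": "xml"},
}

# Inverted index keyed by (path, term) pairs.
index = {}  # (path, term) -> set of doc ids
for doc_id, fields in docs.items():
    for path, text in fields.items():
        for term in text.split():
            index.setdefault((path, term), set()).add(doc_id)

def query(path, term):
    """Documents whose element at `path` contains `term`:
    a structural constraint and a content constraint at once."""
    return index.get((path, term), set())

print(query("/book/title", "xml"))      # structure + content both match
print(query("/article/title", "fca"))
```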
Parallel architectures for fuzzy triadic similarity learning
In a context of document co-clustering, we define a new similarity measure
which iteratively computes similarity while combining fuzzy sets in a
three-partite graph. The fuzzy triadic similarity (FT-Sim) model can deal with
the uncertainty offered by fuzzy sets. Moreover, with the development of the
Web and the high availability of storage space, more and more documents are
becoming accessible. Documents can come from multiple sites, which makes
similarity computation expensive. This problem motivated us to use parallel
computing. In this paper, we introduce parallel architectures able to handle
large, multi-source data sets by a sequential, a merging-based, or a
splitting-based process. We then perform local and central (or global)
computing using the basic FT-Sim measure. The idea behind these architectures
is to reduce both time and space complexity through parallel computation.
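The exact FT-Sim recurrence is not reproduced here; the sketch below only illustrates the two ingredients the abstract mentions, a fuzzy-set similarity (min/max as fuzzy intersection/union) and a splitting-based scheme where shards are computed locally and merged centrally. All names and membership values are illustrative:

```python
def partial_sums(mu_a, mu_b, shard):
    """Local computing on one vocabulary shard: fuzzy intersection and
    union masses, using min as t-norm and max as t-conorm."""
    inter = sum(min(mu_a.get(x, 0.0), mu_b.get(x, 0.0)) for x in shard)
    union = sum(max(mu_a.get(x, 0.0), mu_b.get(x, 0.0)) for x in shard)
    return inter, union

def fuzzy_similarity(mu_a, mu_b, shards):
    """Central (global) computing: merge the local partial sums,
    then take a fuzzy-Jaccard ratio."""
    parts = [partial_sums(mu_a, mu_b, s) for s in shards]  # parallelizable step
    inter = sum(p[0] for p in parts)
    union = sum(p[1] for p in parts)
    return inter / union if union else 1.0

# Membership degrees of two documents over a shared term vocabulary.
doc1 = {"fuzzy": 0.9, "parallel": 0.4}
doc2 = {"fuzzy": 0.6, "graph": 0.7}
shards = [{"fuzzy"}, {"parallel", "graph"}]  # vocabulary split across two workers
print(fuzzy_similarity(doc1, doc2, shards))  # 0.6 / 2.0 = 0.3
```

Because the per-shard sums are independent, each worker touches only its shard, which is the space/time saving the architectures aim for.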
About Summarization in Large Fuzzy Databases
Motivated by the increased need for modeling fuzzy data and by the success
of systems for exact data summary generation, we propose in this paper a new
approach, called Fuzzy-SaintEtiQ, for generating summaries from fuzzy data.
This approach extends the SaintEtiQ model to support fuzzy data. It offers the
following optimizations: 1) minimization of the expert risk; 2) construction
of a more detailed and more precise summary hierarchy; and 3) cooperation with
the user by giving him fuzzy summaries at different hierarchical levels.
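The construction of the Fuzzy-SaintEtiQ hierarchy itself is not shown here, but a basic building block of linguistic summarization systems is the Yager-style truth degree of a summary "Q objects are S". The quantifier shape and the membership values below are invented for illustration:

```python
def most(p):
    """Fuzzy quantifier 'most' (hypothetical piecewise-linear shape)."""
    if p <= 0.3:
        return 0.0
    if p >= 0.8:
        return 1.0
    return (p - 0.3) / 0.5

def truth_of_summary(memberships, quantifier=most):
    """Yager-style truth of 'Q objects are S', computed from the
    membership degree of each object in the summarizer S."""
    p = sum(memberships) / len(memberships)
    return quantifier(p)

# Degrees to which each record's salary is 'high' (illustrative values).
high_salary = [1.0, 0.8, 0.7, 0.2, 0.9]
print(round(truth_of_summary(high_salary), 2))  # truth of 'most salaries are high'
```

A summary hierarchy would attach such linguistic descriptions, at increasing levels of detail, to the nodes of a classification tree over the data.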
Flexible SQLf query based on fuzzy linguistic summaries
In many real-world applications, data is often partially known, vague, or
ambiguous. To deal with such imprecise information, fuzziness is introduced
into the classical model. SQLf is one of the practical languages for flexible
fuzzy querying in Fuzzy DataBases (FDB). However, with a huge amount of fuzzy
data, the need to work with synthetic views has become a challenge for many DB
community researchers. The present work deals with flexible SQLf queries based
on fuzzy linguistic summaries. We use the fuzzy summaries produced by our
Fuzzy-SaintEtiq approach, which provides a description of objects depending on
the fuzzy linguistic labels specified as selection criteria.
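To give a flavor of the flexible selection involved (the label shape, threshold, and relation are hypothetical; SQLf and Fuzzy-SaintEtiQ define their own labels and semantics), a fuzzy predicate over a linguistic label can be evaluated like this:

```python
def young(age):
    """Trapezoidal membership for the linguistic label 'young'
    (hypothetical break points, for illustration only)."""
    if age <= 25:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 15.0

employees = [("Ali", 22), ("Bea", 30), ("Carl", 45)]

# SQLf-style query: SELECT name FROM employees WHERE age IS young,
# keeping rows whose satisfaction degree reaches the threshold.
threshold = 0.5
result = [(name, round(young(age), 2))
          for name, age in employees if young(age) >= threshold]
print(result)  # Ali satisfies fully, Bea partially; Carl is filtered out
```

Unlike a crisp WHERE clause, each answer carries its satisfaction degree, which is what makes ranking and partial matches possible.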
Towards a New Extracting and Querying Approach of Fuzzy Summaries
The diversification of DB applications has highlighted the limitations of
relational database management systems (RDBMS), particularly at the modeling
level. In fact, in the real world we are increasingly faced with situations
where applications need to handle imprecise data and to offer flexible
querying to their users. Several theoretical solutions have been proposed;
however, their practical impact has remained negligible, with the exception of
a few research prototypes based on the formal model GEFRED. In this chapter,
the authors propose a new approach for exploiting fuzzy relational databases
(FRDB) described by the GEFRED model. This approach consists of: 1) a new
technique for extracting fuzzy data summaries, Fuzzy SAINTETIQ, based on the
classification of fuzzy data and formal concept analysis; 2) an approach for
evaluating flexible queries in the FDB context based on the set of fuzzy
summaries generated by our fuzzy SAINTETIQ system; 3) an approach for
repairing and substituting unanswered queries.
Comment: 22 pages, 6 figures, 8 tables. Multidisciplinary Approaches to
Service-Oriented Engineering, 2018. arXiv admin note: text overlap with
arXiv:1401.049
Traitement approximatif des requêtes flexibles avec groupement d'attributs et jointure [Approximate processing of flexible queries with attribute grouping and join]
This paper addresses the problem of approximate processing for flexible
queries of the form SELECT-FROM-WHERE-GROUP BY with a join condition. It
offers a flexible framework for online aggregation that favors response time
at the expense of result accuracy.
Comment: in French. The 13ème Conférence Francophone sur l'Extraction et
la Gestion des Connaissances (EGC), pp. 29-30, 201
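As a generic illustration of online aggregation's time/accuracy trade (a simplified sketch, not the paper's method; it ignores the join and the subtleties of grouping after a join), one can scan a random sample of the rows and keep running per-group aggregates:

```python
import random

def approx_group_avg(rows, sample_rate=0.3, seed=0):
    """Approximate 'SELECT key, AVG(value) ... GROUP BY key' by scanning
    only a random fraction of the rows; a lower sample_rate gives faster
    answers at the cost of less accurate averages."""
    rng = random.Random(seed)
    agg = {}  # key -> [count, running_sum]
    for key, value in rows:
        if rng.random() > sample_rate:
            continue  # skipped row: this is where time is traded for accuracy
        cell = agg.setdefault(key, [0, 0.0])
        cell[0] += 1
        cell[1] += value
    return {k: s / n for k, (n, s) in agg.items()}

rows = [("a", 1.0), ("a", 3.0), ("b", 2.0)] * 100
print(approx_group_avg(rows, sample_rate=0.2))  # close to {'a': 2.0, 'b': 2.0}
```

With sample_rate=1.0 the function degenerates to the exact GROUP BY result, which is the anytime behavior online aggregation exploits.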
Modèle flou d'expression des préférences basé sur les CP-Nets [A fuzzy model for expressing preferences based on CP-Nets]
This article addresses the problem of expressing preferences in flexible
queries, based on a combination of fuzzy logic theory and Conditional
Preference Networks (CP-Nets).
Comment: 2 pages, EGC 201
Dimensionality reduction with missing values imputation
In this study, we propose a new statistical approach for high-dimensionality
reduction of heterogeneous data that limits the curse of dimensionality and
deals with missing values. To handle the latter, we propose to use the Random
Forest imputation method. The main purpose here is to extract useful
information and thus reduce the search space to facilitate the data
exploration process. Several illustrative numeric examples, using data from
publicly available machine learning repositories, are also included. The
experimental component of the study shows the efficiency of the proposed
analytical approach.
Comment: 6 pages, 2 figures, the first Computer Science PhD Symposium of the
University of Tunis El Manar (CUPS'17), Tunisia, May 22-25, 201
Classification non supervisée des données hétérogènes à large échelle [Unsupervised clustering of large-scale heterogeneous data]
When it comes to clustering massive data, response time, disk access, and
the quality of the formed classes become major issues for companies. It is in
this context that we define a clustering framework for large-scale
heterogeneous data that contributes to resolving these issues. The proposed
framework is based on, firstly, descriptive analysis using MCA (Multiple
Correspondence Analysis) and, secondly, the MapReduce paradigm in a
large-scale environment. The results are encouraging and demonstrate the
efficiency of the hybrid deployment in terms of response quality and time, on
both qualitative and quantitative data.
Comment: 6 pages, in French, 8 figures
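The MCA step is omitted here; as a minimal illustration of how one clustering iteration splits over MapReduce (1-D toy data and a k-means-style assignment, not the authors' actual pipeline), the map step assigns points to their nearest centroid and the reduce step recomputes centroids from the merged assignments:

```python
from collections import defaultdict

def map_assign(points, centroids):
    """Map step: emit (nearest-centroid-index, point) pairs for one partition."""
    out = []
    for p in points:
        i = min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
        out.append((i, p))
    return out

def reduce_centroids(pairs, k):
    """Reduce step: new centroid = mean of the points assigned to it."""
    groups = defaultdict(list)
    for i, p in pairs:
        groups[i].append(p)
    return [sum(groups[i]) / len(groups[i]) if groups[i] else None
            for i in range(k)]

# Two mapper partitions, merged before the reduce (1-D toy data).
part1, part2 = [1.0, 2.0, 9.0], [10.0, 11.0]
centroids = [0.0, 8.0]
pairs = map_assign(part1, centroids) + map_assign(part2, centroids)
print(reduce_centroids(pairs, k=2))  # [1.5, 10.0]
```

Each mapper reads only its own partition, so disk access and computation scale out with the number of workers, which is the point of the large-scale deployment.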