106,312 research outputs found
Constraint-wish and satisfied-dissatisfied: an overview of two approaches for dealing with bipolar querying
In recent years, there has been an increasing interest in dealing with user preferences in flexible database querying, expressing both positive and negative information in a heterogeneous way. This is what is usually referred to as bipolar database querying. Different frameworks have been introduced to deal with such bipolarity. In this chapter, an overview of two approaches is given. The first approach is based on mandatory and desired requirements. Hereby the complement of a mandatory requirement can be considered as a specification of what is not desired at all. So, mandatory requirements indirectly contribute to negative information (expressing what the user does not want to retrieve), whereas desired requirements can be seen as positive information (expressing what the user prefers to retrieve). The second approach is directly based on positive requirements (expressing what the user wants to retrieve), and negative requirements (expressing what the user does not want to retrieve). Both approaches use pairs of satisfaction degrees as the underlying framework but have different semantics, and thus also different operators for criteria evaluation, ranking, aggregation, etc
BigFCM: Fast, Precise and Scalable FCM on Hadoop
Clustering plays an important role in mining big data both as a modeling
technique and a preprocessing step in many data mining process implementations.
Fuzzy clustering provides more flexibility than non-fuzzy methods by allowing
each data record to belong to more than one cluster to some degree. However, a
serious challenge in fuzzy clustering is the lack of scalability. Massive
datasets in emerging fields such as geosciences, biology and networking do
require parallel and distributed computations with high performance to solve
real-world problems. Although some clustering methods are already improved to
execute on big data platforms, but their execution time is highly increased for
large datasets. In this paper, a scalable Fuzzy C-Means (FCM) clustering named
BigFCM is proposed and designed for the Hadoop distributed data platform. Based
on the map-reduce programming model, it exploits several mechanisms including
an efficient caching design to achieve several orders of magnitude reduction in
execution time. Extensive evaluation over multi-gigabyte datasets shows that
BigFCM is scalable while it preserves the quality of clustering
Ensuring patients privacy in a cryptographic-based-electronic health records using bio-cryptography
Several recent works have proposed and implemented cryptography as a means to
preserve privacy and security of patients health data. Nevertheless, the
weakest point of electronic health record (EHR) systems that relied on these
cryptographic schemes is key management. Thus, this paper presents the
development of privacy and security system for cryptography-based-EHR by taking
advantage of the uniqueness of fingerprint and iris characteristic features to
secure cryptographic keys in a bio-cryptography framework. The results of the
system evaluation showed significant improvements in terms of time efficiency
of this approach to cryptographic-based-EHR. Both the fuzzy vault and fuzzy
commitment demonstrated false acceptance rate (FAR) of 0%, which reduces the
likelihood of imposters gaining successful access to the keys protecting
patients protected health information. This result also justifies the
feasibility of implementing fuzzy key binding scheme in real applications,
especially fuzzy vault which demonstrated a better performance during key
reconstruction
Statistical and fuzzy approach for database security
A new type of database anomaly is described by
addressing the concept of Cumulated Anomaly in this
paper. Dubiety-Determining Model (DDM), which is a
detection model basing on statistical and fuzzy set
theories for Cumulated Anomaly, is proposed. DDM
can measure the dubiety degree of each database
transaction quantitatively. Software system
architecture to support the DDM for monitoring
database transactions is designed. We also
implemented the system and tested it. Our
experimental results show that the DDM method is
feasible and effective
Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses
A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, "whether a total sales amount 1,000 items indicates a good or bad sales performance" is still unclear. From the decision makers' point of view, the semantics rather than raw numbers which convey the meaning of the data is very important. In this paper, we explore the use of fuzzy technology to provide this semantics for the summarizations and aggregates developed in data warehousing systems. A three layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses
- …