A niching memetic algorithm for simultaneous clustering and feature selection
Clustering is inherently a difficult task, and is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve the population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data
A hybrid algorithm for k-medoid clustering of large data sets
In this paper, we propose a novel local search heuristic and then hybridize it with a genetic algorithm for k-medoid clustering of large data sets, which is an NP-hard optimization problem. The local search heuristic selects k medoids from the data set and tries to efficiently minimize the total dissimilarity within each cluster. To deal with local optimality, the local search heuristic is hybridized with a genetic algorithm, yielding the proposed Hybrid K-medoid Algorithm (HKA). Our experiments show that, compared with previous genetic-algorithm-based k-medoid clustering approaches (GCA and RAR_wGA), HKA can provide better clustering solutions and do so more efficiently. Experiments use two gene expression data sets, which may involve large noise components
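As a rough illustration of the objective this abstract describes, the sketch below computes the total dissimilarity of a set of medoids and improves it with a plain swap-based local search. This is a generic PAM-style heuristic assumed here for illustration only, not the HKA local search or its genetic component; the function names and the precomputed dissimilarity matrix `dist` are hypothetical.

```python
from typing import List, Sequence, Tuple


def total_dissimilarity(dist: Sequence[Sequence[float]], medoids: Sequence[int]) -> float:
    """Sum, over all points, of the dissimilarity to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def swap_local_search(
    dist: Sequence[Sequence[float]], medoids: Sequence[int]
) -> Tuple[List[int], float]:
    """Generic swap heuristic (an assumption, not the paper's method):
    replace one medoid with a non-medoid whenever that lowers the cost,
    and stop when no single swap improves the solution."""
    current = list(medoids)
    best_cost = total_dissimilarity(dist, current)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            for candidate in range(len(dist)):
                if candidate in current:
                    continue
                trial = current[:i] + [candidate] + current[i + 1:]
                cost = total_dissimilarity(dist, trial)
                if cost < best_cost:
                    current, best_cost = trial, cost
                    improved = True
    return sorted(current), best_cost


# Toy data: six points on a line forming two obvious groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
result = swap_local_search(dist, [0, 1])
```

The search stops at a local optimum, which is the limitation the abstract says the genetic algorithm is meant to address.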
Magnetothermoelectric transport properties in phosphorene
We numerically study the electrical and thermoelectric transport properties
in phosphorene in the presence of both a magnetic field and disorder. The
quantized Hall conductivity is similar to that of a conventional
two-dimensional electron gas, but the positions of all the Hall plateaus shift
to the left due to the spectral asymmetry, in agreement with the experimental
observations. The thermoelectric conductivity and Nernst signal exhibit
remarkable anisotropy, and the thermopower is nearly isotropic. When a bias
voltage is applied between top and bottom layers of phosphorene, both
thermopower and Nernst signal are enhanced and their peak values become large. Comment: 8 pages, 9 figure
Approximate truth discovery via problem scale reduction
Many real-world applications rely on multiple data sources to provide information on the items of interest. Due to noise and uncertainty in the data, the information that different sources provide about a specific item may conflict. To make reliable decisions based on these data, it is important to identify the trustworthy information by resolving these conflicts, i.e., the truth discovery problem. Current solutions to this problem detect the veracity of each value jointly with the reliability of each source for every data item. In this way, the efficiency of truth discovery is strictly confined by the problem scale, which in turn prevents truth discovery algorithms from being applied at large scale. To address this issue, we propose an approximate truth discovery approach, which divides sources and values into groups according to a user-specified approximation criterion. The groups are then used for efficient inter-value influence computation to improve the accuracy. Our approach is applicable to most existing truth discovery algorithms. Experiments on real-world datasets show that our approach improves the efficiency compared to existing algorithms while achieving similar or even better accuracy. The scalability is further demonstrated by experiments on large synthetic datasets. Xianzhi Wang, Quan Z. Sheng, Xiu Susie Fang, Xue Li, Xiaofei Xu, and Lina Ya
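To make the joint estimation mentioned in this abstract concrete, here is a minimal sketch of a baseline iterative truth discovery loop: values are chosen by reliability-weighted voting, and each source's reliability is then re-estimated from how often it agrees with the chosen values. This is a generic baseline assumed for illustration, not the approximate grouping approach the paper proposes; the function name, the initial reliability of 0.5, and the smoothing are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, Tuple

Claims = Dict[str, Dict[str, Hashable]]  # source -> {item: claimed value}


def discover_truths(
    claims: Claims, iterations: int = 10
) -> Tuple[Dict[str, Hashable], Dict[str, float]]:
    """Alternate between weighted voting and reliability re-estimation."""
    reliability = {source: 0.5 for source in claims}  # assumed uniform start
    truths: Dict[str, Hashable] = {}
    for _ in range(iterations):
        # Step 1: each source votes for its claimed value with its reliability.
        votes = defaultdict(lambda: defaultdict(float))
        for source, items in claims.items():
            for item, value in items.items():
                votes[item][value] += reliability[source]
        truths = {item: max(vals, key=vals.get) for item, vals in votes.items()}
        # Step 2: reliability is the smoothed share of claims matching the truths.
        for source, items in claims.items():
            correct = sum(1 for item, v in items.items() if truths[item] == v)
            reliability[source] = (correct + 1) / (len(items) + 2)
    return truths, reliability


claims = {
    "A": {"x": 1, "y": 2, "z": 3},
    "B": {"x": 1, "y": 2, "z": 4},
    "C": {"x": 9, "y": 2, "z": 4},
}
truths, reliability = discover_truths(claims)
```

Every iteration touches every claim of every source, which is why the cost of this kind of loop grows with the problem scale, as the abstract notes.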
Magnetic control of the pair creation in spatially localized supercritical fields
We examine the impact of a perpendicular magnetic field on the creation mechanism of electron-positron pairs in a supercritical static electric field, where both fields are localized along the direction of the electric field. In the case where the spatial extent of the magnetic field exceeds that of the electric field, quantum field theoretical simulations based on the Dirac equation predict a suppression of pair creation even if the electric field is supercritical. Furthermore, an arbitrarily small magnetic field outside the interaction zone can bring the creation process even to a complete halt, if it is sufficiently extended. The mechanism for this magnetically induced complete shutoff can be associated with a reopening of the mass gap and the emergence of electrically dressed Landau levels
Empowering truth discovery with multi-truth prediction
Truth discovery is the problem of detecting true values from the conflicting data provided by multiple sources on the same data items. Since sources' reliability is unknown a priori, a truth discovery method usually estimates sources' reliability along with the truth discovery process. A major limitation of existing truth discovery methods is that they commonly assume exactly one true value on each data item and therefore cannot deal with the more general case that a data item may have multiple true values (or multi-truth). Since the number of true values may vary from data item to data item, truth discovery methods must be able to detect varying numbers of true values from the multi-source data. In this paper, we propose a multi-truth discovery approach, which addresses the above challenges by providing a generic framework for enhancing existing truth discovery methods. In particular, we redeem the numbers of true values as an important clue for facilitating multi-truth discovery. We present the procedure and components of our approach, and propose three models, namely the byproduct model, the joint model, and the synthesis model, to implement our approach. We further propose two extensions to enhance our approach, by leveraging the implications of similar numerical values and values' co-occurrence information in sources' claims to improve the truth discovery accuracy. Experimental studies on real-world datasets demonstrate the effectiveness of our approach. Xianzhi Wang, Quan Z. Sheng, Lina Yao, Xue Li, Xiu Susie Fang, Xiaofei Xu, and Boualem Benatalla
