5,733 research outputs found

    Local Guarantees in Graph Cuts and Clustering

    Correlation Clustering is an elegant model that captures fundamental graph cut problems such as Min s-t Cut, Multiway Cut, and Multicut, extensively studied in combinatorial optimization. Here, we are given a graph with edges labeled + or − and the goal is to produce a clustering that agrees with the labels as much as possible: + edges within clusters and − edges across clusters. The classical approach towards Correlation Clustering (and other graph cut problems) is to optimize a global objective. We depart from this and study local objectives: minimizing the maximum number of disagreements for edges incident on a single node, and the analogous max min agreements objective. This naturally gives rise to a family of basic min-max graph cut problems. A prototypical representative is Min Max s-t Cut: find an s-t cut minimizing the largest number of cut edges incident on any node. We present the following results: (1) an O(√n)-approximation for the problem of minimizing the maximum total weight of disagreement edges incident on any node (thus providing the first known approximation for the above family of min-max graph cut problems), (2) a remarkably simple 7-approximation for minimizing local disagreements in complete graphs (improving upon the previous best known approximation of 48), and (3) a 1/(2+ε)-approximation for maximizing the minimum total weight of agreement edges incident on any node, hence improving upon the 1/(4+ε)-approximation that follows from the study of approximate pure Nash equilibria in cut and party affiliation games.
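The local objective described in this abstract can be illustrated with a small sketch. It only evaluates a given clustering (it is not one of the paper's approximation algorithms), and the function names, the edge-list format, and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def local_disagreements(edges, cluster_of):
    """Count, for each node, its incident edges that disagree with the clustering.

    edges: iterable of (u, v, label) with label '+' or '-' (assumed format).
    cluster_of: dict mapping every node to a cluster id.
    """
    bad = defaultdict(int)
    for u, v, label in edges:
        same_cluster = cluster_of[u] == cluster_of[v]
        # A '+' edge disagrees when it is cut; a '-' edge disagrees when it is not.
        if (label == "+") != same_cluster:
            bad[u] += 1
            bad[v] += 1
    return dict(bad)


def max_local_disagreement(edges, cluster_of):
    """Local (min-max) objective value: the worst per-node disagreement count."""
    counts = local_disagreements(edges, cluster_of)
    return max(counts.values(), default=0)


# Toy instance (hypothetical data).
edges = [("a", "b", "+"), ("b", "c", "+"), ("a", "c", "-"), ("c", "d", "-")]
clustering_1 = {"a": 0, "b": 0, "c": 0, "d": 1}
clustering_2 = {"a": 0, "b": 0, "c": 1, "d": 1}

print(max_local_disagreement(edges, clustering_1))
print(max_local_disagreement(edges, clustering_2))
```

In this toy instance the second clustering concentrates two disagreeing edges on node `c`, which is the kind of imbalance the local objective penalizes and a global count of disagreements does not distinguish.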

    Recommender Systems

    The ongoing rapid expansion of the Internet greatly increases the necessity of effective recommender systems for filtering the abundant information. Extensive research on recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in future developments. In addition to algorithms, physical aspects are described to illustrate the macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields, which makes it of interest to physicists as well as interdisciplinary researchers. Comment: 97 pages, 20 figures (to appear in Physics Reports)

    Incorporating peak grouping information for alignment of multiple liquid chromatography-mass spectrometry datasets

    Motivation: The combination of liquid chromatography and mass spectrometry (LC/MS) has been widely used for large-scale comparative studies in systems biology, including proteomics, glycomics and metabolomics. In almost all experimental designs, it is necessary to compare chromatograms across biological or technical replicates and across sample groups. Central to this is the peak alignment step, which is one of the most important but challenging preprocessing steps. Existing alignment tools do not take into account the structural dependencies between related peaks that co-elute and are derived from the same metabolite or peptide. We propose a direct matching peak alignment method for LC/MS data that incorporates related peaks information (within each LC/MS run) and investigate its effect on alignment performance (across runs). The groupings of related peaks necessary for our method can be obtained from any peak clustering method and are built into a pairwise peak similarity score function. The similarity score matrix produced is used by an approximation algorithm for the weighted matching problem to produce the actual alignment result. Results: We demonstrate that related peak information can improve alignment performance. The performance is evaluated on a set of benchmark datasets, where our method performs competitively compared to other popular alignment tools. Availability: The proposed alignment method has been implemented as a stand-alone application in Python, available for download at http://github.com/joewandy/peak-grouping-alignment.
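The abstract does not say which approximation algorithm is used for the weighted matching step. As a minimal sketch, assuming the similarity scores are already computed, the classic greedy heuristic (a 1/2-approximation for maximum weight matching) could look like the following; the function name, the score dictionary format, and the peak identifiers are hypothetical and are not taken from the linked repository.

```python
def greedy_matching(scores, threshold=0.0):
    """Greedily match peaks across two runs by descending similarity.

    scores: dict mapping (peak_in_run_1, peak_in_run_2) to a similarity score
            (assumed format; the real tool may store scores differently).
    threshold: pairs scoring at or below this value are never matched.
    Returns a list of matched (peak_in_run_1, peak_in_run_2) pairs.
    """
    used_1, used_2, matches = set(), set(), []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (p, q), score in ranked:
        if score <= threshold:
            break  # every remaining pair scores too low
        if p in used_1 or q in used_2:
            continue  # each peak is aligned at most once
        used_1.add(p)
        used_2.add(q)
        matches.append((p, q))
    return matches


# Hypothetical similarity scores between two runs.
scores = {
    ("p1", "q1"): 0.90,
    ("p1", "q2"): 0.80,
    ("p2", "q1"): 0.85,
    ("p2", "q2"): 0.10,
}
print(greedy_matching(scores))
```

On this example the greedy choice of `("p1", "q1")` forces the weak pair `("p2", "q2")`, for a total score of 1.0 against an optimum of 1.65, which shows why the method is only an approximation. Raising `threshold` would leave `p2` and `q2` unaligned instead.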

    Soft clustering analysis of galaxy morphologies: A worked example with SDSS

    Context: The huge and still rapidly growing number of galaxies in modern sky surveys raises the need for an automated and objective classification method. Unsupervised learning algorithms are of particular interest, since they discover classes automatically. Aims: We briefly discuss the pitfalls of oversimplified classification methods and outline an alternative approach called "clustering analysis". Methods: We categorise different classification methods according to their capabilities. Based on this categorisation, we present a probabilistic classification algorithm that automatically detects the optimal classes preferred by the data. We explore the reliability of this algorithm in systematic tests. Using a small sample of bright galaxies from the SDSS, we demonstrate the performance of this algorithm in practice. We are able to disentangle the problems of classification and parametrisation of galaxy morphologies in this case. Results: We give physical arguments that a probabilistic classification scheme is necessary. The algorithm we present produces reasonable morphological classes and object-to-class assignments without any prior assumptions. Conclusions: There are sophisticated automated classification algorithms that meet all necessary requirements, but a lot of work is still needed on the interpretation of the results. Comment: 18 pages, 19 figures, 2 tables, submitted to A

    Network analysis of online bidding activity

    With the advent of digital media, people are increasingly resorting to online channels for commercial transactions. The online auction is a prototypical example. In such online transactions, the pattern of bidding activity is more complex than traditional online transactions; this is because the number of bidders participating in a given transaction is not bounded and the bidders can also easily respond to the bidding instantaneously. Using recently developed network theory, we study the interaction patterns between bidders (items) who (that) are connected when they bid for the same item (if the item is bid on by the same bidder). The resulting network is analyzed using the hierarchical clustering algorithm, which is also used for clustering analysis of expression data from DNA microarrays. A dendrogram is constructed for the item subcategories; this dendrogram is compared with a traditional classification scheme. The implications of the difference between the two are discussed. Comment: 8 pages and 11 figures
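The network construction described in this abstract, where two bidders are linked when they bid for the same item, can be sketched as follows. The bid-list format, the function name, and the use of shared-item counts as edge weights are assumptions for illustration; the abstract only states that such bidders are connected.

```python
from collections import defaultdict
from itertools import combinations


def bidder_network(bids):
    """Link two bidders whenever they bid for the same item.

    bids: iterable of (bidder, item) pairs (assumed format).
    Returns a dict mapping a sorted bidder pair to the number of items
    both bid for; the weighting is an illustrative choice.
    """
    bidders_of = defaultdict(set)
    for bidder, item in bids:
        bidders_of[item].add(bidder)

    weight = defaultdict(int)
    for bidders in bidders_of.values():
        # Every pair of bidders on the same item gets (or strengthens) an edge.
        for u, v in combinations(sorted(bidders), 2):
            weight[(u, v)] += 1
    return dict(weight)


# Hypothetical bidding records.
bids = [
    ("alice", "lamp"),
    ("bob", "lamp"),
    ("bob", "vase"),
    ("carol", "vase"),
    ("alice", "vase"),
]
print(bidder_network(bids))
```

Swapping the roles of bidder and item in the input pairs yields the item network, in which two items are linked when the same bidder bid for both.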