835 research outputs found

    Mining Characteristic Patterns for Comparative Music Corpus Analysis

    A core issue of computational pattern mining is the identification of interesting patterns. When mining music corpora organized into classes of songs, patterns may be of interest because they are characteristic, describing prevalent properties of classes, or because they are discriminant, capturing distinctive properties of classes. Existing work in computational music corpus analysis has focused on discovering discriminant patterns. This paper studies characteristic patterns, investigating how different pattern interestingness measures balance coverage and discriminability of classes, both in top-k pattern mining and in individual top-ranked patterns. Characteristic pattern mining is applied to the collection of Native American music by Frances Densmore, and the discovered patterns are shown to be supported by Densmore’s own analyses.
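The trade-off between coverage and discriminability described above can be illustrated with a minimal sketch. It assumes songs are encoded as sequences of symbols and patterns are contiguous subsequences; the function names, the toy corpus, and the choice of class coverage and confidence as the two measures are illustrative assumptions, not the measures evaluated in the paper.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs as a contiguous subsequence of `sequence`."""
    n, m = len(sequence), len(pattern)
    return any(tuple(sequence[i:i + m]) == tuple(pattern) for i in range(n - m + 1))


def coverage_and_confidence(corpus, pattern, target):
    """Score a pattern for one class.

    corpus: list of (class_label, sequence) pairs (hypothetical encoding).
    Returns (coverage, confidence):
      coverage   = fraction of the target class's songs containing the pattern
      confidence = fraction of all songs containing the pattern that are in the target class
    """
    in_class = [seq for label, seq in corpus if label == target]
    matches_all = sum(contains(seq, pattern) for _, seq in corpus)
    matches_class = sum(contains(seq, pattern) for seq in in_class)
    coverage = matches_class / len(in_class) if in_class else 0.0
    confidence = matches_class / matches_all if matches_all else 0.0
    return coverage, confidence


# Toy corpus: two classes of "songs" encoded as integer sequences (made-up data).
corpus = [
    ("A", [1, 2, 3]),
    ("A", [2, 3, 4]),
    ("A", [5, 5]),
    ("B", [2, 3]),
    ("B", [7]),
]
print(coverage_and_confidence(corpus, (2, 3), "A"))
```

A characteristic pattern would score high on coverage, a discriminant one high on confidence; an interestingness measure that balances the two ranks patterns somewhere between these extremes.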

    SeqScout: Using a Bandit Model to Discover Interesting Subgroups in Labeled Sequences

    It is extremely useful to exploit labeled datasets not only to learn models but also to improve our understanding of a domain and its available targeted classes. The so-called subgroup discovery task has been considered for a long time. It concerns the discovery of patterns or descriptions whose sets of supporting objects have interesting properties, e.g., they characterize or discriminate a given target class. Though many subgroup discovery algorithms have been proposed for transactional data, discovering subgroups within labeled sequential data, and thus searching for descriptions as sequential patterns, has been much less studied. In that context, exhaustive exploration strategies cannot be used for real-life applications and we have to look for heuristic approaches. We propose the algorithm SeqScout to discover interesting subgroups (w.r.t. a chosen quality measure) from labeled sequences of itemsets. This is a new sampling algorithm that mines discriminant sequential patterns using a multi-armed bandit model. It is an anytime algorithm that, for a given budget, finds a collection of local optima in the search space of descriptions and thus subgroups. It requires little configuration and is independent of the quality measure used for pattern scoring. Furthermore, it is fairly simple to implement. We provide qualitative and quantitative experiments on several datasets to illustrate its added value.
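To make the bandit idea concrete, here is a minimal sketch of a generic UCB1 selection loop under a fixed budget. The abstract does not say which bandit policy SeqScout uses or how arms and rewards are defined, so everything here (UCB1 itself, `arm_quality`, the Bernoulli reward) is an illustrative assumption rather than the SeqScout algorithm.

```python
import math
import random


def ucb1_select(counts, rewards, c=math.sqrt(2)):
    """Return the index of the arm to play next under the UCB1 rule."""
    # Play every arm once before applying the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        rewards[a] / counts[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(arm_quality, budget, seed=0):
    """Spend `budget` pulls across arms; return how often each arm was played.

    arm_quality: hypothetical success probability per arm, standing in for
    the quality of patterns sampled from that arm.
    """
    rng = random.Random(seed)
    counts = [0] * len(arm_quality)
    rewards = [0.0] * len(arm_quality)
    for _ in range(budget):
        arm = ucb1_select(counts, rewards)
        reward = 1.0 if rng.random() < arm_quality[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += reward
    return counts


print(run_bandit([0.1, 0.9, 0.2], budget=500))
```

Because the loop can be stopped after any number of pulls and still returns the best arms found so far, it has the anytime flavor the abstract mentions; the reward function is pluggable, which mirrors the stated independence from the quality measure.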

    From local pattern mining to relevant bi-cluster characterization

    Clustering and bi-clustering techniques have proved quite useful in many application domains. A weakness of these techniques remains their poor support for characterizing the resulting groups. We consider possibly large Boolean data sets which record properties of objects, and we assume that a bi-partition is available. We introduce a generic cluster characterization technique which is based on collections of bi-sets (i.e., sets of objects associated to sets of properties) which satisfy some user-defined constraints, and a measure of the accuracy of a given bi-set as a bi-cluster characterization pattern. The method is illustrated on both formal concepts (i.e., "maximal rectangles of true values") and the new type of δ-bi-sets (i.e., "rectangles of true values with a bounded number of exceptions per column"). The added value is illustrated on benchmark data and two real data sets which are intrinsically noisy: a medical data set on meningitis and Plasmodium falciparum gene expression data.

    Function Based Design-by-Analogy: A Functional Vector Approach to Analogical Search

    Design-by-analogy is a powerful approach to augment traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. While the concept of design-by-analogy has been known for some time, few actual methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting functional analogies from data sources has been developed to provide this capability, here based on a functional basis rather than form or conflict descriptions. Building on past research, we utilize a functional vector space model (VSM) to quantify analogous similarity of an idea's functionality. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We also develop document parsing algorithms to reduce text descriptions of the data sources down to the key functions, for use in the functional similarity analysis and functional vector space modeling. To do this, we apply Zipf's law on word count order reduction to reduce the words within the documents down to the applicable functionally critical terms, thus providing a mapping process for function based search. The reduction of a document into functional analogous words enables the matching to novel ideas that are functionally similar, which can be customized in various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. As a verification of the approach, two original design problem case studies illustrate the distance range of analogical solutions that can be extracted. This range extends from very near-field, literal solutions to far-field cross-domain analogies.
    Funding: National Science Foundation (U.S.) (Grants CMMI-0855326, CMMI-0855510, CMMI-0855293); SUTD-MIT International Design Centre (IDC).
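As a rough illustration of similarity in a functional vector space, the sketch below represents a design problem and a patent as term-count vectors over a small vocabulary of function verbs and compares them with cosine similarity. The vocabulary, the example terms, and the use of raw counts with cosine similarity are assumptions for illustration; the abstract does not specify the weighting or similarity measure the authors use.

```python
import math
from collections import Counter


def function_vector(terms, vocabulary):
    """Count how often each vocabulary term occurs in a list of function terms."""
    counts = Counter(terms)
    return [counts[t] for t in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical function vocabulary and documents already reduced to function terms.
vocabulary = ["convert", "store", "transfer", "separate"]
design_problem = function_vector(["convert", "store", "store"], vocabulary)
patent = function_vector(["store", "transfer", "convert"], vocabulary)

print(cosine_similarity(design_problem, patent))
```

Scores near 1 would correspond to near-field analogies with almost the same functional profile, while lower but non-zero scores point to more distant, possibly cross-domain candidates.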

    Mending the Gaps: An Exercise in Identifying and Understanding Diverse and Multicultural Team Faultlines

    The Faultlines Exercise, an experiential activity, introduces students to concepts of diversity attributes (surface and deep levels), social identity, and team faultlines. Through individual reflection and team discussion, students apply these concepts to their own diverse multicultural class teams with the goals of (a) preventing negative outcomes that may develop from faultlines and (b) improving team performance. Plenary class discussions reinforce key learning points that can be applied to teamwork throughout the course. Students in both face-to-face and online classes report that the exercise helps improve team performance and helps to identify and resolve problems. Instructions for facilitating classroom discussion and student handouts are provided, as are suggestions for adapting the exercise to other constructs.

    Finding relational redescriptions

    We introduce relational redescription mining, that is, the task of finding two structurally different patterns that describe nearly the same set of object pairs in a relational dataset. By extending redescription mining beyond propositional and real-valued attributes, it provides a powerful tool to match different relational descriptions of the same concept. We propose an alternating scheme for solving this problem. Its core consists of a novel relational query miner that efficiently identifies discriminative connection patterns between pairs of objects. Compared to a baseline Inductive Logic Programming (ILP) approach, our query miner is able to mine more complex queries, and does so much faster. We performed extensive experiments on three real-world relational datasets, and present examples of redescriptions found, exhibiting the power of the method to expressively capture relations present in these networks.
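"Nearly the same set of object pairs" can be quantified in several ways; a common choice for comparing two supports is the Jaccard coefficient, sketched below. The abstract does not name the measure used, so Jaccard, the function name, and the example pairs are assumptions for illustration only.

```python
def jaccard(support_a, support_b):
    """Jaccard coefficient of two collections of object pairs.

    Returns 0.0 when both supports are empty (a convention chosen here).
    """
    a, b = set(support_a), set(support_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical supports: pairs of object identifiers matched by two different queries.
query_1_pairs = {("ann", "bob"), ("carl", "dana"), ("eve", "finn")}
query_2_pairs = {("ann", "bob"), ("carl", "dana"), ("gus", "hal")}

print(jaccard(query_1_pairs, query_2_pairs))
```

Two queries would count as a good redescription candidate when this score is close to 1 even though the queries themselves are structurally different.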

    Marketing authorization procedures for advanced cancer drugs: exploring the views of patients, oncologists, healthcare decision makers and citizens in France

    Background. The past decades have seen advances in cancer treatments in terms of toxicity and side effects, but progress in the treatment of advanced cancer has been modest. New drugs have emerged improving progression-free survival but with little impact on overall survival, raising questions about the criteria on which to base decisions to grant marketing authorizations and about the authorization procedure itself. For decisions to be fair, transparent and accountable, it is necessary to consider the views of those with relevant expertise and experience. Methods. We conducted a Q-study to explore the views of a range of stakeholders in France, involving: 54 patients (18 months after diagnosis); 50 members of the general population; 27 oncologists; 19 healthcare decision makers; and 2 individuals from the pharmaceutical industry. Results. Three viewpoints emerged, focussing on different dimensions entitled: 1) ‘Quality of life (QoL), opportunity cost and participative democracy’; 2) ‘QoL and patient-centeredness’; and 3) ‘Length of life’. Respondents from all groups were associated with each viewpoint, except for healthcare decision makers, who were only associated with the first one. Conclusion. Our results highlight plurality in the views of stakeholders, emphasize the need for transparency in decision making processes, and illustrate the importance of a re-evaluation of treatments for all 3 viewpoints. In the context of advanced cancer, our results suggest that QoL should be more prominent amongst authorization criteria, as it is a concern for 2 of the 3 viewpoints.