7,119 research outputs found

    Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

    Get PDF
    MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a spe-cific MHC allele and which will not, creating possibilities for controlling immune response and for the applications of immunotherapy. However a problem encountered in the computational binding prediction methods for MHC class I is the presence of bulges and loops in the peptides, changing the total length. Most machine learning methods in use to-day require the sequences to be of same length to success-fully mine the binding motifs. We propose the use of time-based data mining methods in motif mining to be able to mine motifs position-independently. Also, the information for both binding and non-binding peptides are used on the contrary to the other methods which only rely on binding peptides. The prediction results are between 70-80% for the tested alleles

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Get PDF

    A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams

    Get PDF
    Date stream mining techniques can be used in tracking user behaviors as they attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data mined models varies significantly from nearby models—as defined by quality metrics—then the user’s behavior is automatically flagged as a potentially significant behavioral change. Decision tree, sequence pattern and Hidden Markov modeling being used in this study. These three types of modeling can expose different aspect of user’s behavior. In case of decision tree modeling, the specific changes in user behavior can automatically characterized by differencing the data-mined decision-tree models. The sequence pattern modeling can shed light on how the user changes his sequence of actions and Hidden Markov modeling can identifies the learning transition points. This research describes how model-quality monitoring and these three types of modeling as a generic framework can aid recognition and diagnoses of behavioral changes in a case study of cognitive rehabilitation via emailing. The date stream mining techniques mentioned are used to monitor patient goals as part of a clinical plan to aid cognitive rehabilitation. In this context, real time data mining aids clinicians in tracking user behaviors as they attempt to achieve their goals. This generic framework can be widely applicable to other real-time data-intensive analysis problems. In order to illustrate this fact, the similar Hidden Markov modeling is being used for analyzing the transactional behavior of a telecommunication company for fraud detection. Fraud similarly can be considered as a potentially significant transaction behavioral change

    Mining Frequent Itemsets Using Genetic Algorithm

    Full text link
    In general frequent itemsets are generated from large data sets by applying association rule mining algorithms like Apriori, Partition, Pincer-Search, Incremental, Border algorithm etc., which take too much computer time to compute all the frequent itemsets. By using Genetic Algorithm (GA) we can improve the scenario. The major advantage of using GA in the discovery of frequent itemsets is that they perform global search and its time complexity is less compared to other algorithms as the genetic algorithm is based on the greedy approach. The main aim of this paper is to find all the frequent itemsets from given data sets using genetic algorithm

    Weak signal identification with semantic web mining

    Get PDF
    We investigate an automated identification of weak signals according to Ansoff to improve strategic planning and technological forecasting. Literature shows that weak signals can be found in the organization's environment and that they appear in different contexts. We use internet information to represent organization's environment and we select these websites that are related to a given hypothesis. In contrast to related research, a methodology is provided that uses latent semantic indexing (LSI) for the identification of weak signals. This improves existing knowledge based approaches because LSI considers the aspects of meaning and thus, it is able to identify similar textual patterns in different contexts. A new weak signal maximization approach is introduced that replaces the commonly used prediction modeling approach in LSI. It enables to calculate the largest number of relevant weak signals represented by singular value decomposition (SVD) dimensions. A case study identifies and analyses weak signals to predict trends in the field of on-site medical oxygen production. This supports the planning of research and development (R&D) for a medical oxygen supplier. As a result, it is shown that the proposed methodology enables organizations to identify weak signals from the internet for a given hypothesis. This helps strategic planners to react ahead of time
    • …
    corecore