
    Mining Temporal Association Rules with Temporal Soft Sets

    This work was partially supported by the National Natural Science Foundation of China (grant no. 11301415), the Shaanxi Provincial Key Research and Development Program (grant no. 2021SF-480), and the Natural Science Basic Research Plan in Shaanxi Province of China (grant no. 2018JM1054). Traditional association rule extraction may run into difficulties because it ignores the temporal aspect of the collected data. In particular, it often happens that some item sets are frequent during specific time periods although they are not frequent in the whole data set. In this study, we make an effort to enhance conventional rule mining by introducing temporal soft sets. We define temporal granulation mappings to induce granular structures for temporal transaction data. Using this notion, we define temporal soft sets and their Q-clip soft sets to establish a novel framework for mining temporal association rules. A number of useful characterizations and results are obtained, including a necessary and sufficient condition for fast identification of strong temporal association rules. By combining temporal soft sets with NegNodeset-based frequent item set mining techniques, we develop the negFIN-based soft temporal association rule mining (negFIN-STARM) method to extract strong temporal association rules. Numerical experiments are conducted on commonly used data sets to show the feasibility of our approach. Moreover, comparative analysis demonstrates that the newly proposed method achieves higher execution efficiency than three well-known approaches in the literature.
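    The observation that an item set can be frequent within one time period yet infrequent overall can be illustrated with a minimal sketch. It is not the negFIN-STARM method or the temporal soft set construction from the paper; the toy transactions, the month labels used as time granules, and the helper names `support` and `support_by_period` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (time granule, items) pairs; not taken from the paper.
transactions = [
    ("2023-01", {"bread", "milk"}),
    ("2023-01", {"bread", "butter"}),
    ("2023-06", {"ice cream", "cola"}),
    ("2023-06", {"ice cream", "cola", "bread"}),
    ("2023-06", {"ice cream", "milk"}),
    ("2023-12", {"bread", "milk"}),
]

def support(itemset, rows):
    """Fraction of transactions in rows that contain every item of itemset."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(itemset <= items for _, items in rows) / len(rows)

def support_by_period(itemset, rows):
    """Group transactions by their time granule and compute support per group."""
    groups = defaultdict(list)
    for period, items in rows:
        groups[period].append((period, items))
    return {period: support(itemset, group) for period, group in groups.items()}

target = {"ice cream", "cola"}
global_support = support(target, transactions)            # over all six transactions
period_support = support_by_period(target, transactions)  # one value per month
```

    With a minimum support of 0.5, the item set in this toy data fails the threshold over the whole data set but passes it within the "2023-06" granule, which is the situation the abstract describes.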

    Discovery of structural and functional features in RNA pseudoknots

    An RNA pseudoknot consists of nonnested double-stranded stems connected by single-stranded loops. There is increasing recognition that RNA pseudoknots are among the most prevalent RNA structures and fulfill a diverse set of biological roles within cells, and studies of RNA pseudoknotted structures are expanding, as is the assignment of function to them. These studies not only produce valuable structural data but also facilitate an understanding of structural and functional characteristics in RNA molecules. PseudoBase is a database providing structural, functional, and sequence data related to RNA pseudoknots. To capture the features of RNA pseudoknots, we present a novel framework using quantitative association rule mining to analyze the pseudoknot data. The derived rules are classified into specified association groups regarding structure, function, and category of RNA pseudoknots. The discovered association rules assist biologists in filtering out significant knowledge of structure-function and structure-category relationships. A brief biological interpretation of the relationships is presented, and their potential correlations with each other are highlighted.

    Being a binding site: Characterizing residue composition of binding sites on proteins

    The Protein Data Bank (PDB) today contains descriptions of more than 45,000 three-dimensional protein and nucleic-acid structures. Because the PDB began as a computer-readable depository of crystallographic data complementing printed articles, properly interpreting the content of individual PDB files still frequently requires detailed information found in the citing publication. This implies that fully automatic processing of the whole PDB is a very hard task. We first cleaned and restructured the PDB data, then analyzed the residue composition of the binding sites in the whole PDB for frequency and for hidden association rules. The main results of the paper are (i) the cleaning and repairing algorithm, (ii) redundancy elimination from the data, and (iii) the application of association rule mining to the cleaned, non-redundant data set. We have found numerous significant relations in the residue composition of the ligand binding sites on protein surfaces, summarized in two figures. Association rule mining, one of the classical data-mining methods for exploring implication rules, is capable of finding previously unknown residue-set preferences of bound ligands on protein surfaces. Since protein-ligand binding is a key step in enzymatic mechanisms and in drug discovery, the preferences uncovered in this study of more than 19,500 binding sites may help in identifying new binding protein-ligand pairs.

    Towards a biological modelling tool recommending proper subnetworks

    The aim of this thesis is to develop methods that suggest suitable subnetworks to users for integration during modelling. To this end, techniques from the field of recommender systems are used, which aim to predict the users’ interest in certain objects in order to filter and recommend the most suitable ones. Association rule mining is of particular relevance in this thesis. Its algorithms offer the opportunity to find patterns of joint appearance in a large set of items. For this purpose, biological networks are considered, which are represented as graphs and annotated with standardised ontology terms. Association rule mining is then applied with respect to both structural and semantic similarity. For a partly modelled biological network, elements are found that may extend it. The obtained results form a solid basis for the development of a recommender system that facilitates the efficient reuse of networks and decreases the manual effort needed to find and integrate relevant structures.
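    The idea of extending a partly modelled network from patterns of joint appearance can be sketched as a confidence-based lookup. This is only an illustration under stated assumptions, not the thesis's method: the networks are reduced to plain sets of annotation labels, the labels themselves are placeholders rather than real ontology identifiers, and `recommend` and `min_confidence` are hypothetical names.

```python
def recommend(partial, networks, min_confidence=0.5):
    """Rank terms that co-occur with `partial`, by rule confidence.

    For each candidate term t, the score is the confidence of the rule
    partial -> t: the share of networks containing `partial` that also
    contain t.
    """
    partial = frozenset(partial)
    matching = [terms for terms in networks if partial <= terms]
    if not matching:
        return []
    counts = {}
    for terms in matching:
        for term in terms - partial:
            counts[term] = counts.get(term, 0) + 1
    scored = [(term, n / len(matching)) for term, n in counts.items()]
    scored = [(term, conf) for term, conf in scored if conf >= min_confidence]
    # Highest confidence first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical annotated networks (placeholder labels, not real ontology IDs).
networks = [
    frozenset({"glycolysis", "ATP", "pyruvate"}),
    frozenset({"glycolysis", "ATP", "NADH"}),
    frozenset({"glycolysis", "pyruvate"}),
    frozenset({"TCA cycle", "NADH"}),
]

suggestions = recommend({"glycolysis"}, networks)
```

    Here the partial model contains only "glycolysis"; terms appearing in at least half of the matching networks are returned as candidate extensions.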

    IMPLEMENTATION OF DYNAMIC AND FAST MINING ALGORITHMS ON INCREMENTAL DATASETS TO DISCOVER QUALITATIVE RULES

    Association rule mining is an important field in knowledge mining that yields the association rules needed for decision making. Mining frequent items is difficult on huge datasets: as the dataset grows, uncovering the rules takes more time and overhead. In this paper, techniques for reducing this time and overhead using an IPOC (Incremental Pre-Ordered Code) tree structure are examined. To mine frequent items from a database, these techniques require highly efficient data structures. FIN (Frequent Itemset-Nodeset) employs a node-set, a novel data structure, to extract frequent items, and an IPOC tree to store frequent data incrementally. Different methods have been modified in order to analyze and assess time and memory use on different data sets. The strategies proposed and implemented show improved performance in rule generation in terms of time and efficiency.

    A Survey on Index Support for Item Set Mining

    It is very difficult to handle the huge amount of information stored in modern databases. Association rule mining is currently used to manage these databases, but it is a costly process that involves a significant amount of time and memory. It is therefore necessary to develop an approach that overcomes these difficulties. Suitable data structures and algorithms must be developed to perform item set mining effectively. An index includes all characteristics potentially needed during the mining task, so the extraction can be executed with the help of the index, without accessing the database. A database index is a data structure that speeds up information retrieval operations on a database table at very low cost apart from increased storage space. Using the index permits user interaction, in which the user can specify different attributes for item set extraction. The index also supports reuse, allowing item sets to be mined with any support threshold. This paper surveys the index-support approaches for item set mining proposed by various authors.
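    The claim that extraction can run against an index without re-reading the database, and be reused for any support threshold, can be sketched with a simple vertical index that maps each item to the ids of the transactions containing it. This is a generic illustration, not one of the surveyed index structures; the function names, the `max_size` limit, and the toy database are assumptions.

```python
from itertools import combinations

def build_index(database):
    """Map each item to the set of transaction ids containing it (one scan)."""
    index = {}
    for tid, items in enumerate(database):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index

def support_count(index, itemset):
    """Count transactions containing all items, using only the index."""
    tidsets = [index.get(item, set()) for item in itemset]
    if not tidsets:
        return 0
    return len(set.intersection(*tidsets))

def frequent_itemsets(index, min_count, max_size=2):
    """Enumerate item sets of up to max_size items with support >= min_count."""
    items = sorted(index)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            count = support_count(index, combo)
            if count >= min_count:
                result[combo] = count
    return result

# Hypothetical toy database; it is scanned once, when the index is built.
database = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
index = build_index(database)

# The same index answers queries for different support thresholds.
loose = frequent_itemsets(index, min_count=2)
strict = frequent_itemsets(index, min_count=3)
```

    The brute-force enumeration is kept deliberately simple; the point is only that both queries are answered from `index` alone, at the price of storing the transaction-id sets.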

    Iterated learning and grounding: from holistic to compositional languages

    This paper presents a new computational model for studying the origins and evolution of compositional languages grounded through the interaction between agents and their environment. The model is based on previous work on adaptive grounding of lexicons and the iterated learning model. Although the model is still in a developmental phase, the first results show that a compositional language can emerge in which the structure reflects regularities present in the population's environment.