6 research outputs found

    Updating Algorithm for Association Rules Based on Fully Mining Incremental Transactions

    Many fast algorithms for incrementally updating association rules have been proposed, but they often lose important rules when the result is sensitive to newly added transactions. This study proposes the Entirety Update Frequent Itemsets Algorithm (EUFIA) for mining frequent itemsets from a large transaction database after an incremental update. Rather than rescanning the original database for each newly generated frequent itemset, EUFIA logically partitions the incremental database by unit time interval, scans each partition quickly, accumulates the occurrence counts of newly generated frequent itemsets, and prunes clearly infrequent itemsets by a backward method. EUFIA thus discovers newly generated frequent itemsets comprehensively without losing important rules, and it needs to rescan the original database at most once to obtain the global frequent itemsets of the updated database. In the authors' simulation, EUFIA shows good scalability. Funding: National Natural Science Foundation of China (50474033); Natural Science Foundation of Fujian Province (A0310008); Key Project of the Fujian Province High-Tech Research Open Program (2003H043
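The abstract above does not give EUFIA's details, so the following is only a minimal sketch of the general idea it builds on, not the authors' algorithm. It relies on a standard fact: an itemset that is frequent in the merged database must be frequent in the original database or in at least one partition of the increment. All names (`count_itemsets`, `update_frequent`, `max_size`) are hypothetical, and itemsets are enumerated by brute force up to a small size, which is only practical for toy data.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force, toy data only)."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return counts


def update_frequent(original, old_frequent, partitions, min_sup, max_size=2):
    """Return (frequent itemsets of the merged database, scans of `original`).

    old_frequent maps each itemset that was frequent in `original` to its count.
    min_sup is a relative support threshold between 0 and 1.
    """
    inc_counts = Counter()
    candidates = set()
    for part in partitions:
        part_counts = count_itemsets(part, max_size)
        inc_counts.update(part_counts)
        # Keep only itemsets that are locally frequent in this partition.
        for itemset, c in part_counts.items():
            if c >= min_sup * len(part):
                candidates.add(itemset)

    total = len(original) + sum(len(p) for p in partitions)
    threshold = min_sup * total
    result = {}

    # Previously frequent itemsets: their old counts are known, no rescan needed.
    for itemset, c in old_frequent.items():
        merged = c + inc_counts.get(itemset, 0)
        if merged >= threshold:
            result[itemset] = merged

    # New candidates: a single pass over the original database counts them all.
    new_candidates = candidates - set(old_frequent)
    scans = 0
    if new_candidates:
        scans = 1
        orig_counts = Counter()
        for t in original:
            items = set(t)
            for itemset in new_candidates:
                if items.issuperset(itemset):
                    orig_counts[itemset] += 1
        for itemset in new_candidates:
            merged = orig_counts[itemset] + inc_counts[itemset]
            if merged >= threshold:
                result[itemset] = merged
    return result, scans
```

In this sketch the original database is scanned at most once, and only when the increment produces candidates that were not already frequent before the update.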

    Mining frequent item sets from several conditional FP_trees

    Discovering association rules is a basic problem in data mining, and finding frequent patterns is the most expensive step in association rule discovery. The FP-growth (frequent pattern growth) method mines both long and short frequent itemsets without generating candidate itemsets, which greatly improves mining efficiency. However, FP-growth generates a large number of conditional FP-trees during mining and therefore occupies a large amount of memory. This paper studies FP-growth and proposes an improved algorithm that inherits all the advantages of FP-growth while avoiding this bottleneck. By building only a limited number of conditional FP-trees (equal to the number of attributes in the transaction database) to mine long and short frequent patterns, the improved algorithm saves a significant amount of the memory that FP-growth requires. Experiments show that the improved method is effective. Funding: Natural Science Foundation of Fujian Province of China (Grant No. A0310008); Key Project of the Fujian Province High-Tech Research Open Program (2003H043
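For readers unfamiliar with the structures named in the abstract above, here is a minimal sketch of a standard FP-tree and of the conditional pattern base from which a conditional FP-tree is built. It illustrates plain FP-growth data structures only, not the improved algorithm the paper proposes. The names `Node`, `build_fp_tree` and `conditional_pattern_base` are hypothetical, and ties in item frequency are broken alphabetically as an arbitrary assumption.

```python
from collections import Counter, defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Build an FP-tree; return (root, header) where header maps item -> nodes."""
    freq = Counter(item for t in transactions for item in set(t))
    keep = {item for item, c in freq.items() if c >= min_count}
    root = Node(None, None)
    header = defaultdict(list)
    for t in transactions:
        # Order frequent items by descending frequency so prefixes are shared.
        ordered = sorted((i for i in set(t) if i in keep),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(header, item):
    """Collect the prefix paths that lead to `item`, each with the node's count."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base
```

Standard FP-growth recursively builds a new conditional FP-tree from each such pattern base, which is the source of the memory cost the abstract describes.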

    Stable vanadium isotope compositions of ordinary chondrites and Vesta meteorites

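    Vanadium isotopes have two important applications in cosmochemistry: (1) comparing the vanadium isotope compositions of meteorites and the Earth helps to clarify the sources of the Earth's material and its accretion and evolution history, constraining both the formation processes of different meteorite types and the sources of the material that makes up the Earth; (2) high-energy irradiation produces a large enrichment in ⁵⁰V, so high-precision vanadium isotope measurements of meteorites can reveal the early irradiation history of the Solar System and test the X-wind model [1]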

    Distribution characteristics of volatile organic compounds in the atmosphere of China

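    Atmospheric volatile organic compounds (VOCs) are among the key precursors of ozone and secondary organic aerosol. Studies show that alkanes, alkenes, and aromatic hydrocarbons are important components of atmospheric VOCs in China. Across regions, alkanes are the most abundant in urban areas, whereas aromatic hydrocarbons are the most abundant VOCs in remote areas. The diurnal variation of VOC concentrations is mostly bimodal, with peaks usually falling in the morning and evening rush hours. Studies to date of ozone pollution episodes in China all indicate that aromatic hydrocarbons and alkenes contribute most to ozone formation. The models widely used for VOC source apportionment include CMB, PMF, and PCA/APCS, each with its own strengths and limitations. A comparison of source apportionment results from different locations shows that traffic emissions and industrial emissions are the main anthropogenic sources of VOCs in China. Because VOCs are transported across regions, cooperation with surrounding areas will be the direction of future air quality management.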