Research and Improvement on String Similarity Search and Join based on Appgram

Abstract

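When querying a traditional database, selection and join are regarded as among the most important operations. In practice, however, data may be inconsistent because of errors or because different formats represent the same data differently. Exact selection and join operations may then return no results, since the data fields do not match exactly. To handle such cases, approximate selection and join, which adopt a degree of fault tolerance, have been introduced into query processing. String similarity search and join are not limited to databases; they are widely applied in many fields, such as DNA sequence analysis, time series processing, duplicate Web page detection, spell checking, data cleaning, data integration, and query suggestion in search engines. Because existing algorithms for string similarity search and join incur excessive time cost, this thesis builds on the Appgram algorithm framework, takes improving algorithm performance as its main goal, and briefly describes...

Degree: Master of Engineering
Department and major: School of Software, Software Engineering
Student ID: 2432013115244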
