Optimization of Inverted Files and Dynamic Signature Files for the Web
ABSTRACT
Web directories are taxonomies for the classification of Web documents. This kind of IR system supports a specific type of search, the restricted search, in which the document collection is limited to one area of the category graph. This paper introduces a data architecture for Web directories that improves the performance of restricted searches. The architecture is based on a hybrid data structure composed of an inverted file with multiple embedded signature files. Two variants of the proposed model are presented: a hybrid architecture with total information and a hybrid architecture with partial information. The architecture was validated by implementing both variants and comparing them with a basic model. The performance of restricted queries improved clearly, especially with the hybrid model with partial information, which responded well under any load on the search system.
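The idea of embedding signatures in an inverted file can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper, and it does not model the difference between the total-information and partial-information variants. It assumes that each posting carries a superimposed-coding signature of the document's categories and their ancestors, and that candidates passing the signature filter are verified exactly to discard false drops. The names `HybridIndex`, `PARENT`, `SIG_BITS`, and `BITS_PER_CAT`, as well as the sample categories and documents, are hypothetical.

```python
import hashlib

SIG_BITS = 64      # signature width in bits (assumed value)
BITS_PER_CAT = 3   # bits set per category (assumed value)

# Hypothetical category tree, stored as child -> parent.
PARENT = {"Science": None, "Physics": "Science",
          "Biology": "Science", "Sports": None}


def category_signature(cat):
    """Hash a category name to a few bit positions (superimposed coding)."""
    digest = hashlib.sha256(cat.encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_CAT):
        sig |= 1 << (digest[i] % SIG_BITS)
    return sig


def ancestors_and_self(cat):
    """Yield the category and every ancestor up to the root."""
    while cat is not None:
        yield cat
        cat = PARENT[cat]


class HybridIndex:
    """Inverted file whose postings embed a category signature."""

    def __init__(self):
        self.postings = {}   # term -> list of (doc_id, signature)
        self.doc_cats = {}   # doc_id -> exact set of categories and ancestors

    def add(self, doc_id, text, cats):
        covered = {a for c in cats for a in ancestors_and_self(c)}
        sig = 0
        for c in covered:
            sig |= category_signature(c)  # OR all category signatures together
        self.doc_cats[doc_id] = covered
        for term in set(text.lower().split()):
            self.postings.setdefault(term, []).append((doc_id, sig))

    def restricted_search(self, term, cat):
        """Return documents containing `term` inside the area rooted at `cat`."""
        want = category_signature(cat)
        hits = []
        for doc_id, sig in self.postings.get(term.lower(), []):
            # Cheap bitwise filter first, then an exact check for false drops.
            if (sig & want) == want and cat in self.doc_cats[doc_id]:
                hits.append(doc_id)
        return hits


idx = HybridIndex()
idx.add(1, "quantum field theory", ["Physics"])
idx.add(2, "cell theory", ["Biology"])
idx.add(3, "game theory in football", ["Sports"])
print(idx.restricted_search("theory", "Science"))
```

Because a document's signature includes the bits of all its ancestor categories, the bitwise filter never rejects a document that belongs to the requested area. It can only admit false drops, which the exact membership check removes.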