Citation: HUANG Haotian, ZHANG Jingwei, WU Zezheng, et al. Approximate membership query secondary index for immutable storage environment[J]. Journal of Guilin University of Electronic Technology, 2023, 43(5): 345-354. doi: 10.3969/1673-808X.202318

Approximate membership query secondary index for immutable storage environment

  • Abstract: Key-value storage systems are widely used in Web applications because of their excellent write performance. Mainstream key-value storage systems mostly rely on Bloom filters to optimize non-primary-key queries, but this approach has two shortcomings: query efficiency degrades as the number of data segments grows, and false positives can trigger unnecessary data segment accesses. To further improve the query capability of key-value systems for analytical processing, a new secondary index is proposed for write-once, read-many data access patterns. By building a global index, it avoids the increase in query latency caused by probing multiple filters. In addition, a pre-probe recursive eviction strategy is proposed to improve the efficiency of index construction. Compared with traditional methods, the index returns an exactly correct sequence of data segment numbers when a query targets a data item that exists. Furthermore, the index supports range queries through a range query scheme based on logical chains. Experimental results verify the effectiveness of the index structure. Compared with the baseline structure, the index improves query performance by about 10 to 50 times.
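To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares the baseline (one Bloom filter per immutable data segment, all of which must be probed) with a single global lookup that maps a secondary-key value to segment numbers. All names (`BloomFilter`, `baseline_lookup`, `build_global_index`, `global_lookup`) and the filter parameters are assumptions made for illustration. A plain Python dictionary stands in for the paper's compact approximate membership structure, so the pre-probe recursive eviction strategy and the logical-chain range queries are not modeled.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(key))


def build_per_segment_filters(segments):
    """Baseline: one filter per immutable data segment."""
    filters = []
    for segment in segments:
        bf = BloomFilter()
        for value in segment:
            bf.add(value)
        filters.append(bf)
    return filters


def baseline_lookup(filters, value):
    """Probe every filter: cost grows with the number of segments,
    and the result may include segments that do not hold the value."""
    return [i for i, bf in enumerate(filters) if bf.might_contain(value)]


def build_global_index(segments):
    """Hypothetical stand-in for a global index: value -> segment numbers."""
    index = defaultdict(list)
    for seg_id, segment in enumerate(segments):
        for value in set(segment):
            index[value].append(seg_id)
    return dict(index)


def global_lookup(index, value):
    """Single lookup, independent of the number of segments."""
    return index.get(value, [])
```

With segments such as `[["a", "b"], ["b", "c"], ["d"]]`, `baseline_lookup` touches all three filters to locate `"b"`, while `global_lookup` resolves the same query with one dictionary access. Because the segments are written once and never modified, such an index can be built without handling later updates or deletions.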

     
