Chapter 2: Data Structures for Search
Understand the internal data structures that make search engines fast, scalable, and efficient.
Chapter Overview
Search engines achieve their speed through specialized data structures designed for rapid lookup and efficient storage. This chapter explores the foundational structures that power modern retrieval systems, from classical inverted indexes to graph-based approaches used in vector search.
Understanding these data structures is essential for making informed decisions about indexing strategies, query performance, and storage trade-offs.
2.1 Classical Data Structures
2.1.1 Inverted indexes, term dictionaries, and postings lists
2.1.2 Tries and finite automata
2.1.3 k-d trees
2.1.4 Prefix, binary, and multiway trees
2.2 Compressed Structures
2.2.1 Roaring bitmaps and bitmap compression
2.2.2 Columnar formats for analytics (not search)
2.3 Graph-Based Retrieval
2.3.1 k-nearest neighbor (KNN)
2.3.2 Hierarchical graphs: HNSW, DiskANN
2.3.3 A* routing
Examples
Examples coming soon.
Code examples for this chapter will include implementations of inverted indexes, bitmap operations, and graph-based nearest neighbor search using Lucenia.