Class InitializedHnswGraphBuilder

java.lang.Object
org.apache.lucene.util.hnsw.HnswGraphBuilder
org.apache.lucene.util.hnsw.InitializedHnswGraphBuilder
All Implemented Interfaces:
HnswBuilder

public final class InitializedHnswGraphBuilder extends HnswGraphBuilder
A graph builder that is initialized from a provided HnswGraph. This is useful when merging HnswGraphs from multiple segments.

The builder performs the following operations:

  • Copies the graph structure from the initializer graph with ordinal remapping
  • Identifies and repairs disconnected nodes (nodes that lost too many of their neighbors due to deletions)
  • Rebalances the graph hierarchy to maintain proper level distribution according to the HNSW probabilistic model
  • Allows incremental addition of new nodes while preserving initialized nodes
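
The ordinal remapping in the first step can be pictured with a minimal, self-contained sketch. It is not the Lucene implementation: the class OrdinalRemapSketch and the method remapNeighbors are hypothetical names, and the sketch assumes only the convention documented for newOrdMap, namely that -1 marks a deleted document to be skipped.

```java
import java.util.Arrays;

/** Hypothetical illustration of ordinal remapping; not part of Lucene. */
final class OrdinalRemapSketch {

  /**
   * Rewrites one node's neighbor list into the merged ordinal space.
   * Neighbors whose old ordinal maps to -1 (deleted) are dropped.
   */
  static int[] remapNeighbors(int[] oldNeighbors, int[] newOrdMap) {
    int[] remapped = new int[oldNeighbors.length];
    int count = 0;
    for (int oldOrd : oldNeighbors) {
      int newOrd = newOrdMap[oldOrd];
      if (newOrd != -1) {
        remapped[count++] = newOrd;
      }
    }
    // Trim to the number of neighbors that survived the remapping.
    return Arrays.copyOf(remapped, count);
  }

  public static void main(String[] args) {
    // Old ordinals 1 and 4 belong to deleted documents.
    int[] newOrdMap = {0, -1, 1, 2, -1};
    int[] neighborsOfNode0 = {1, 2, 3, 4};
    // prints [1, 2]
    System.out.println(Arrays.toString(remapNeighbors(neighborsOfNode0, newOrdMap)));
  }
}
```

In this example the node keeps two of its four original neighbors, which is the kind of loss the repair step is meant to detect.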

Disconnected Node Detection: A node is considered disconnected if the number of neighbors it retains is less than DISCONNECTED_NODE_FACTOR times its original neighbor count in the source graph. This typically occurs when many of the node's neighbors were deleted documents, which cannot be remapped.
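
The threshold test described above can be sketched as follows. This is an illustration under stated assumptions, not the builder's code: the value 0.5 is a placeholder because this page does not state the actual value of DISCONNECTED_NODE_FACTOR, and the class and method names are hypothetical.

```java
/** Hypothetical illustration of the disconnected-node test; not part of Lucene. */
final class DisconnectedNodeSketch {

  // Placeholder value for illustration only; the real constant may differ.
  static final double ASSUMED_DISCONNECTED_NODE_FACTOR = 0.5;

  /**
   * Returns true if the node kept fewer than factor * originalNeighborCount
   * neighbors after deleted neighbors were dropped.
   */
  static boolean isDisconnected(int originalNeighborCount, int retainedNeighborCount, double factor) {
    return retainedNeighborCount < originalNeighborCount * factor;
  }

  public static void main(String[] args) {
    // A node with 16 original neighbors that kept only 7 falls below the assumed threshold of 8.
    System.out.println(isDisconnected(16, 7, ASSUMED_DISCONNECTED_NODE_FACTOR));
    // A node that kept exactly 8 of 16 neighbors does not.
    System.out.println(isDisconnected(16, 8, ASSUMED_DISCONNECTED_NODE_FACTOR));
  }
}
```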

WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Method Details

    • fromGraph

      public static InitializedHnswGraphBuilder fromGraph(RandomVectorScorerSupplier scorerSupplier, int beamWidth, long seed, HnswGraph initializerGraph, int[] newOrdMap, BitSet initializedNodes, int totalNumberOfVectors) throws IOException
      Creates an initialized HNSW graph builder from an existing graph.

      This factory method constructs a new graph builder, initializes it with the structure from the provided graph (applying ordinal remapping), and returns the builder ready for additional operations.

      Parameters:
      scorerSupplier - provides vector similarity scoring for graph operations
      beamWidth - the search beam width for finding neighbors during graph construction
      seed - random seed for level assignment and node promotion during rebalancing
      initializerGraph - the source graph to copy structure from
      newOrdMap - maps old ordinals in the initializer graph to new ordinals in the merged graph; -1 indicates a deleted document that should be skipped
      initializedNodes - bit set marking which nodes are already initialized (can be null if not tracking)
      totalNumberOfVectors - the total number of vectors in the merged graph (used for pre-allocation)
      Returns:
      a new builder initialized with the provided graph structure
      Throws:
      IOException - if an I/O error occurs during graph initialization
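
      The seed parameter makes level assignment reproducible. As a rough illustration of the HNSW probabilistic model, the sketch below draws a node's top level as floor(-ln(U) * mL) with mL = 1 / ln(M), which is the general scheme from the HNSW literature. It is an assumption that the builder follows this exact formula; maxConn (M) and all class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration of seeded HNSW level assignment; not part of Lucene. */
final class LevelAssignmentSketch {

  /** Draws a level from the geometric-like distribution used by HNSW. */
  static int randomLevel(Random random, double ml) {
    double u;
    do {
      u = random.nextDouble(); // uniform in [0, 1); retry on 0 to avoid log(0)
    } while (u == 0.0);
    return (int) (-Math.log(u) * ml);
  }

  /** Counts how many of numNodes nodes land on each top level for a given seed. */
  static int[] levelHistogram(long seed, int maxConn, int numNodes) {
    if (maxConn < 2) {
      throw new IllegalArgumentException("maxConn must be at least 2");
    }
    double ml = 1.0 / Math.log(maxConn);
    Random random = new Random(seed);
    int[] counts = new int[1];
    for (int i = 0; i < numNodes; i++) {
      int level = randomLevel(random, ml);
      if (level >= counts.length) {
        counts = Arrays.copyOf(counts, level + 1);
      }
      counts[level]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    // The same seed always yields the same histogram; most nodes stay on level 0.
    System.out.println(Arrays.toString(levelHistogram(42L, 16, 10_000)));
  }
}
```

      With M = 16, roughly one node in sixteen is expected to reach level 1 or higher, which is the kind of distribution the rebalancing step aims to maintain.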
    • initGraph

      public static OnHeapHnswGraph initGraph(HnswGraph initializerGraph, int[] newOrdMap, int totalNumberOfVectors, int beamWidth, RandomVectorScorerSupplier scorerSupplier) throws IOException
      Convenience method to create a fully initialized on-heap HNSW graph without tracking initialized nodes. This is useful when you need only the resulting graph structure and do not plan to add nodes incrementally.
      Parameters:
      initializerGraph - the source graph to copy structure from
      newOrdMap - maps old ordinals to new ordinals; -1 indicates deleted documents
      totalNumberOfVectors - the total number of vectors in the merged graph
      beamWidth - the search beam width for graph construction
      scorerSupplier - provides vector similarity scoring
      Returns:
      a fully initialized on-heap HNSW graph
      Throws:
      IOException - if an I/O error occurs during graph initialization
    • addGraphNode

      public void addGraphNode(int node) throws IOException
      Description copied from interface: HnswBuilder
      Inserts a document with a vector value into the graph.
      Specified by:
      addGraphNode in interface HnswBuilder
      Overrides:
      addGraphNode in class HnswGraphBuilder
      Throws:
      IOException