Expand description
This library provides an easy way to create custom indexers.
§Graceful shutdown
The shutdown sequence in the data ingestion system ensures clean termination of all components while preserving data integrity. It is initiated via a CancellationToken, which triggers a hierarchical and graceful shutdown process.
The shutdown process follows a top-down hierarchy:
Worker
: Individual workers within aWorkerPool
detect the cancellation signal, completes current checkpoint processing, sends final progress updates and signals completion to parentWorkerPool
viaWorkerStatus::Shutdown
message.WorkerPool
: Coordinates worker shutdowns, ensures all progress messages are processed, waits for all workers’ shutdown signals and notifiesIndexerExecutor
withWorkerPoolStatus::Shutdown
message when fully terminated.IndexerExecutor
: Orchestrates the shutdown of all worker pools and and finalizes system termination.
Modules§
Structs§
- Data
Ingestion Metrics - File
Progress Store - Manages persistent progress information stored in a JSON file.
- Indexer
Executor - The Executor of the main ingestion pipeline process.
- Reader
Options - Options for configuring how the checkpoint reader fetches new checkpoints.
- Shim
Progress Store - A simple, in-memory progress store primarily used for unit testing.
- Worker
Pool - A pool of
Worker
’s that process checkpoints concurrently.
Enums§
Constants§
Traits§
- Progress
Store - A trait defining the interface for persistent storage of checkpoint progress.
- Reducer
- Processes and commits batches of messages produced by workers.
- Worker
- Processes individual checkpoints and produces messages for optional batch processing.
Functions§
- create_
remote_ store_ client - Creates a remote store client without any retry mechanism.
- create_
remote_ store_ client_ with_ ops - Creates a remote store client with configurable retry behavior and options.
- setup_
single_ workflow - Sets up a single workflow for data ingestion.