# Vector Store Architecture

## Introduction
OpenContracts uses a flexible vector store architecture that provides compatibility with multiple agent frameworks (PydanticAI, etc.) while maintaining a clean separation between business logic and framework-specific adapters.
Our approach uses a two-layer architecture:

1. **Core Layer**: Framework-agnostic business logic (`CoreAnnotationVectorStore`)
2. **Adapter Layer**: Thin wrappers for specific frameworks
This design enables efficient vector search across granular, visually-locatable annotations from PDF pages while supporting multiple agent frameworks through a single, well-tested codebase.
## Architecture Overview

### Core Layer: CoreAnnotationVectorStore

The core layer contains all business logic for vector search operations, independent of any specific agent framework:
```python
from opencontractserver.llms.vector_stores.core_vector_stores import (
    CoreAnnotationVectorStore,
    VectorSearchQuery,
    VectorSearchResult,
)

# Initialize core store with filtering parameters
core_store = CoreAnnotationVectorStore(
    corpus_id=123,
    user_id=456,
    embedder_path="sentence-transformers/all-MiniLM-L6-v2",
    embed_dim=384,
)

# Create framework-agnostic query
query = VectorSearchQuery(
    query_text="What are the key findings?",
    similarity_top_k=10,
    filters={"label": "conclusion"},
)

# Execute search
results = core_store.search(query)

# Access results
for result in results:
    annotation = result.annotation  # Django Annotation model
    score = result.similarity_score  # Similarity score (0.0-1.0)
```
Key features:

- **Framework Independence**: No dependencies on specific AI frameworks
- **Django Integration**: Direct use of the Django ORM and `VectorSearchViaEmbeddingMixin`
- **Flexible Filtering**: Support for corpus, document, user, and metadata filters
- **Embedding Generation**: Automatic text-to-vector conversion using `generate_embeddings_from_text`
- **pgvector Integration**: Efficient vector similarity search using PostgreSQL's pgvector extension
### Adapter Layer: Framework-Specific Wrappers

Framework adapters are lightweight classes that translate between the core API and specific framework interfaces.

#### PydanticAI Adapter
```python
from opencontractserver.llms.vector_stores.core_vector_stores import (
    CoreAnnotationVectorStore,
    VectorSearchQuery,
)

class PydanticAIVectorStore:
    """PydanticAI adapter for Django Annotation Vector Store."""

    def __init__(self, corpus_id=None, user_id=None, **kwargs):
        self._core_store = CoreAnnotationVectorStore(
            corpus_id=corpus_id,
            user_id=user_id,
            **kwargs,
        )

    async def search(self, query: str, top_k: int = 10) -> list[Document]:
        """Execute search using the PydanticAI interface.

        `Document` is the adapter's framework-facing result type,
        produced by `_convert_to_documents`.
        """
        search_query = VectorSearchQuery(
            query_text=query,
            similarity_top_k=top_k,
        )
        results = await self._core_store.search_async(search_query)
        return self._convert_to_documents(results)
```
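A usage sketch for this adapter (the corpus/user IDs and query text are placeholders):

```python
import asyncio

async def main() -> None:
    # Instantiate the adapter; extra kwargs are forwarded to CoreAnnotationVectorStore
    store = PydanticAIVectorStore(corpus_id=123, user_id=456)

    # Search through the framework-facing async interface
    docs = await store.search("What are the key findings?", top_k=5)
    for doc in docs:
        print(doc)

asyncio.run(main())
```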
### Unified Factory Pattern

The `UnifiedVectorStoreFactory` automatically creates the appropriate vector store based on configuration:
```python
from django.conf import settings

from opencontractserver.llms.vector_stores import UnifiedVectorStoreFactory

# Automatically creates the right adapter based on settings
vector_store = UnifiedVectorStoreFactory.create(
    framework=settings.LLMS_DEFAULT_AGENT_FRAMEWORK,  # "pydantic_ai"
    corpus_id=corpus_id,
    user_id=user_id,
)
```
## Technical Deep Dive

### Vector Search Pipeline

The search process follows this pipeline:

1. **Query Reception**: The framework adapter receives a query in framework-specific format
2. **Query Translation**: The adapter converts it to a `VectorSearchQuery`
3. **Core Processing**:
    - Build the base Django queryset with instance filters (corpus, document, user)
    - Apply metadata filters (labels, etc.)
    - Generate embeddings from text if needed
    - Execute the vector similarity search via the `search_by_embedding` mixin method
4. **Result Translation**: The adapter converts each `VectorSearchResult` back to framework format
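Tracing one query through these steps, the control flow is roughly as follows (a condensed, illustrative sketch reusing `core_store` and `VectorSearchQuery` from the first example; `convert_to_framework_document` is a hypothetical helper):

```python
# 1. Query Reception: the adapter receives a framework-native query
raw_query = "What are the key findings?"

# 2. Query Translation: the adapter builds a framework-agnostic query object
core_query = VectorSearchQuery(query_text=raw_query, similarity_top_k=10)

# 3. Core Processing: filtering, embedding, and pgvector search happen
# inside the core store (detailed in the next section)
results = core_store.search(core_query)

# 4. Result Translation: map core results to framework objects
documents = [convert_to_framework_document(r) for r in results]  # hypothetical helper
```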
### Integration with Django ORM and pgvector

The core store leverages Django's powerful ORM features combined with pgvector:
```python
def search(self, query: VectorSearchQuery) -> list[VectorSearchResult]:
    """Execute vector search using the Django ORM and pgvector."""
    # Build the filtered queryset
    queryset = self._build_base_queryset()
    queryset = self._apply_metadata_filters(queryset, query.filters)

    # Perform the vector search using the mixin
    if query.query_embedding is not None:
        queryset = queryset.search_by_embedding(
            query_vector=query.query_embedding,
            embedder_path=self.embedder_path,
            top_k=query.similarity_top_k,
        )

    # Convert to framework-agnostic results
    return [
        VectorSearchResult(
            annotation=ann,
            similarity_score=getattr(ann, "similarity_score", 1.0),
        )
        for ann in queryset
    ]
```
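The two private helpers are not shown above; a plausible sketch, assuming standard Django filtering (the import path and field names here are illustrative, not the actual schema):

```python
from opencontractserver.annotations.models import Annotation  # assumed import path

# Methods of CoreAnnotationVectorStore (sketch only)

def _build_base_queryset(self):
    """Restrict the search space with instance-level filters (illustrative)."""
    queryset = Annotation.objects.all()
    if self.corpus_id is not None:
        queryset = queryset.filter(corpus_id=self.corpus_id)
    if self.document_id is not None:
        queryset = queryset.filter(document_id=self.document_id)
    if self.user_id is not None:
        # Exact visibility rules depend on the application's permission model
        queryset = queryset.filter(creator_id=self.user_id)
    return queryset

def _apply_metadata_filters(self, queryset, filters):
    """Apply metadata filters such as label names (field name is assumed)."""
    if filters and "label" in filters:
        queryset = queryset.filter(annotation_label__text=filters["label"])
    return queryset
```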
Under the hood, this uses pgvector's CosineDistance for efficient similarity computation:

```sql
-- Generated SQL uses pgvector's <=> cosine-distance operator;
-- lower values mean more similar, so the ascending ORDER BY
-- returns the closest annotations first
SELECT *, (embedding <=> %s) AS similarity_score
FROM annotations
WHERE corpus_id = %s
ORDER BY similarity_score
LIMIT %s
```
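At the ORM level, the same ordering can be expressed with pgvector's Django bindings; a sketch (`queryset`, `query_vector`, and `top_k` come from the surrounding search method, and the actual mixin may differ in detail):

```python
from pgvector.django import CosineDistance

# Annotate each row with its cosine distance to the query vector and
# take the closest matches; lower distance means more similar
closest = (
    queryset
    .annotate(similarity_score=CosineDistance("embedding", query_vector))
    .order_by("similarity_score")[:top_k]
)
```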
### Embedding Management

The system automatically handles embedding generation and retrieval:

- **Text Queries**: Automatically converted to embeddings using corpus-configured embedders
- **Embedding Queries**: Used directly for similarity search
- **Multi-dimensional Support**: Supports 384-, 768-, 1536-, and 3072-dimensional embeddings
- **Embedder Detection**: Automatic detection of corpus-specific embedder configurations
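A sketch of that dispatch inside the core store (assuming the `generate_embeddings_from_text` helper mentioned earlier; its exact signature and return shape are assumptions):

```python
def _resolve_query_embedding(self, query: VectorSearchQuery):
    """Return a vector for the query, embedding text on demand (sketch)."""
    if query.query_embedding is not None:
        # Embedding queries are used directly
        return query.query_embedding
    # Text queries go through the corpus-configured embedder; the exact
    # signature of generate_embeddings_from_text is an assumption here
    return generate_embeddings_from_text(
        query.query_text,
        embedder_path=self.embedder_path,
    )
```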
## Benefits of the Layered Architecture

### 1. Framework Flexibility

- Support multiple agent frameworks through simple adapters
- Business logic remains consistent across frameworks
- Easy switching between frameworks via configuration

### 2. Maintainability

- Single source of truth for search logic
- Framework-specific code is minimal and focused
- Bug fixes and improvements benefit all frameworks

### 3. Performance

- Direct Django ORM integration
- Efficient pgvector similarity search
- Optimized queryset construction with proper filtering

### 4. Extensibility

- Easy to add new metadata filters
- Simple to support additional frameworks
- Flexible configuration options

### 5. Testing

- Core logic can be tested independently
- Framework adapters have minimal, focused tests
- Clear separation of concerns
## Adding Support for New Frameworks

To add support for a new framework:

### 1. Create the Adapter Class
```python
class MyFrameworkVectorStore:
    def __init__(self, **kwargs):
        self._core_store = CoreAnnotationVectorStore(**kwargs)

    def search(self, framework_query):
        # Convert the framework query to a VectorSearchQuery
        core_query = self._convert_query(framework_query)
        # Use the core store
        results = self._core_store.search(core_query)
        # Convert results back to framework format
        return self._convert_results(results)
```
### 2. Register with Factory
```python
# In vector_store_factory.py
class UnifiedVectorStoreFactory:
    @classmethod
    def create(cls, framework: str, **kwargs):
        if framework == "my_framework":
            return MyFrameworkVectorStore(**kwargs)
        # ... other frameworks
```
### 3. Test the Adapter
```python
def test_my_framework_adapter():
    store = MyFrameworkVectorStore(corpus_id=1)
    results = store.search("test query")
    assert len(results) > 0
```
## Configuration

### Framework Selection

Set the default framework in settings:

```python
# settings.py
LLMS_DEFAULT_AGENT_FRAMEWORK = "pydantic_ai"
```
### Embedder Configuration

Configure embedders per corpus:

```python
# Corpus model
corpus.preferred_embedder = "sentence-transformers/all-MiniLM-L6-v2"
corpus.embed_dim = 384
```
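Because the store detects corpus-specific embedder configurations (see Embedding Management above), instantiation can omit the embedder and inherit the corpus settings; a brief illustration:

```python
# No embedder_path or embed_dim given: the store falls back to the
# embedder configuration detected from the corpus itself
core_store = CoreAnnotationVectorStore(corpus_id=corpus.id)
```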
### Search Parameters

Customize search behavior:

```python
# In your code
vector_store = CoreAnnotationVectorStore(
    similarity_threshold=0.7,  # Minimum similarity score
    max_results=100,           # Maximum results to return
    include_metadata=True,     # Include annotation metadata
)
```
## Performance Considerations

### Indexing

Ensure proper PostgreSQL indexes:
```sql
-- pgvector index for similarity search
CREATE INDEX ON annotations USING ivfflat (embedding vector_cosine_ops);

-- B-tree indexes for filtering
CREATE INDEX ON annotations (corpus_id, document_id);
CREATE INDEX ON annotations (annotation_label_id);
```
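IVFFlat is an approximate index whose recall/speed trade-off is governed by its `lists` parameter, which pgvector accepts at index creation time. The value below is a common starting point (pgvector's documentation suggests roughly rows / 1000 for tables up to about a million rows), not a project-specific recommendation:

```sql
-- Tune the number of inverted lists at creation time; more lists
-- speed up queries but can reduce recall
CREATE INDEX ON annotations
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
```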
### Batch Operations

For bulk searches, use batch processing:

```python
import asyncio

# Process multiple queries efficiently
queries = [VectorSearchQuery(query_text=text) for text in texts]
results = await asyncio.gather(*[
    core_store.search_async(q) for q in queries
])
```
### Caching

Embeddings are cached automatically:

- Document embeddings are stored in the database
- Query embeddings are cached in memory (15-minute TTL)
- Corpus-level embedding configuration is cached
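A minimal sketch of query-embedding caching using Django's cache framework (illustrative; the actual cache backend and key scheme are not specified here):

```python
import hashlib

from django.core.cache import cache

QUERY_EMBEDDING_TTL = 15 * 60  # 15 minutes, per the note above

def get_query_embedding(text: str, embedder_path: str):
    """Return a cached query embedding, computing it on a miss (sketch)."""
    key = "query_emb:" + hashlib.sha256(
        f"{embedder_path}:{text}".encode()
    ).hexdigest()
    embedding = cache.get(key)
    if embedding is None:
        # Hypothetical call; see the embedding utilities referenced earlier
        embedding = generate_embeddings_from_text(text, embedder_path=embedder_path)
        cache.set(key, embedding, timeout=QUERY_EMBEDDING_TTL)
    return embedding
```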
## Conclusion

This layered architecture provides a robust foundation for vector search capabilities while maintaining compatibility with multiple agent frameworks. By separating core business logic from framework-specific adapters, we achieve:

- **Consistency**: Same search behavior across all frameworks
- **Maintainability**: Single codebase for core functionality
- **Flexibility**: Easy addition of new framework support
- **Performance**: Direct integration with the Django ORM and pgvector

This design pattern is applied throughout OpenContracts to create a comprehensive, framework-agnostic foundation for AI-powered document analysis.