OpenContracts Metadata System - Overview¶
Introduction¶
OpenContracts provides a powerful and flexible metadata system that allows users to define custom metadata schemas at the corpus level. This system is built on a unified data model that handles both manual metadata entry and automated data extraction, providing consistency and eliminating code duplication.
Architecture¶
The metadata system uses three core models:
- Fieldsets: Container for metadata schemas, automatically created for each corpus
 - Columns: Define individual metadata fields with data types and validation rules
 - Datacells: Store actual metadata values for documents
 
Key Design Principles¶
- Unified Model: Same infrastructure handles both metadata and extracted data
 - Corpus-Scoped: Metadata schemas are defined at the corpus level
 - Type-Safe: Strong typing with comprehensive validation
 - Flexible: Support for 12 different data types including complex types
 
Documentation Structure¶
Core Documentation¶
- Metadata Fields - Technical reference for the metadata system architecture, models, and validation
 - Metadata API Examples - Comprehensive GraphQL API examples and integration patterns
 - Frontend Requirements - Frontend implementation guide with component specifications
 
Related Documentation¶
- Extraction Tutorial - Understanding the extraction system that shares the same infrastructure
 - Corpus Management - How to create and manage corpuses with metadata
 
Quick Start¶
1. Define Your Schema¶
Create metadata columns for your corpus using the GraphQL API:
mutation CreateMetadataColumn {
  createMetadataColumn(
    corpusId: "your-corpus-id",
    name: "Contract Type",
    dataType: "CHOICE",
    validationConfig: {
      required: true,
      choices: ["Service", "Purchase", "NDA"]
    }
  ) {
    ok
    obj { id }
  }
}
2. Set Metadata Values¶
Add metadata to documents:
mutation SetMetadataValue {
  setMetadataValue(
    documentId: "doc-id",
    corpusId: "corpus-id",
    columnId: "column-id",
    value: "Service"
  ) {
    ok
  }
}
3. Query Metadata¶
Retrieve metadata for analysis:
query GetDocumentMetadata {
  documentMetadataDatacells(documentId: "doc-id", corpusId: "corpus-id") {
    column { name }
    data
  }
}
Supported Data Types¶
| Type | Use Case | Example | 
|---|---|---|
| STRING | Short text (names, IDs) | "CONT-2024-001" | 
| TEXT | Long descriptions | "This agreement..." | 
| BOOLEAN | Yes/No fields | true/false | 
| INTEGER | Whole numbers | 42 | 
| FLOAT | Decimal values | 1234.56 | 
| DATE | Calendar dates | "2024-01-15" | 
| DATETIME | Timestamps | "2024-01-15T10:30:00Z" | 
| URL | Web links | "https://example.com" | 
| Email addresses | "user@example.com" | |
| CHOICE | Single selection | "Active" | 
| MULTI_CHOICE | Multiple selections | ["Legal", "Finance"] | 
| JSON | Complex data | {"key": "value"} | 
Key Features¶
Comprehensive Validation¶
- Type checking
 - Range constraints
 - Pattern matching
 - Required field enforcement
 - Custom validation rules
 
Flexible Schema Management¶
- Add/modify columns at any time
 - Set default values
 - Provide help text
 - Control display order
 
User Interface¶
- Excel-like grid editing
 - Inline validation
 - Bulk operations
 - Keyboard navigation
 
Integration¶
- Full GraphQL API
 - TypeScript support
 - React components
 - Apollo Client integration
 
Common Use Cases¶
Contract Management¶
Define metadata for contract lifecycle: - Contract type, status, dates - Vendor information - Financial values - Department assignments
Document Classification¶
Organize documents with: - Document types - Categories and tags - Processing status - Review states
Compliance Tracking¶
Track regulatory requirements: - Compliance status - Review dates - Approval workflows - Audit trails
Best Practices¶
- Plan Your Schema: Design metadata fields before adding documents
 - Use Appropriate Types: Choose the most specific data type
 - Set Sensible Defaults: Provide default values for common cases
 - Add Help Text: Guide users with clear descriptions
 - Validate Early: Use validation rules to catch errors at entry
 
Migration Notes¶
If migrating from the legacy annotation-based system: 1. Export existing metadata annotations 2. Create corresponding columns in the new system 3. Import values as datacells 4. Update any integrations to use the new API
Next Steps¶
- Review the technical documentation for detailed architecture information
 - Explore API examples for integration patterns
 - Check frontend requirements for UI implementation details