Data Models Reference¶

Complete documentation of all dataclasses and Pydantic models used throughout Common Secretary Services.

Overview¶

All data models follow a consistent structure:

- Type Safety: Strict type annotations using the Python typing module
- Validation: Field validation in __post_init__ methods
- Serialization: to_dict() and from_dict() methods for JSON conversion
- Immutability: Many models use frozen=True for immutability
- Slots: Performance optimization with slots=True where applicable

Model Categories¶

Base Models¶

Fundamental models used by all processors:

Base Models - BaseResponse, ErrorInfo, RequestInfo, ProcessInfo

Processor-Specific Models¶

Models for specific processors:

Audio Models - AudioResponse, TranscriptionResult, TranscriptionSegment
Video Models - VideoResponse, VideoProcessingResult, VideoSource
PDF Models - PDFResponse, PDFMetadata, PDFProcessingResult
ImageOCR Models - ImageOCRResponse, ImageOCRMetadata
Transformer Models - TransformerResponse, TemplateField, TemplateFields
Session Models - SessionResponse, SessionInput, SessionOutput, SessionData
Event Models - EventResponse, EventInput, EventOutput, EventData
Track Models - TrackResponse, TrackInput, TrackOutput, TrackData
Story Models - StoryResponse, StoryProcessorInput, StoryProcessorOutput
YouTube Models - YoutubeResponse, YoutubeMetadata, YoutubeProcessingResult

System Models¶

Models for system functionality:

Job Models - Job, Batch, JobStatus, JobProgress, JobResults
LLM Models - LLMInfo, LLMRequest, LLModel
Metadata Models - MetadataResponse, ContentMetadata, TechnicalMetadata
Translation Models - Translation
Notion Models - NotionBlock, NotionPage, NotionResponse

Enums and Types¶

Enums - ProcessorType, ProcessingStatus, OutputFormat, EventFormat, PublicationStatus, LanguageCode

Common Patterns¶

Response Format¶

All API responses follow this structure:

@dataclass(frozen=True)
class BaseResponse:
    status: ProcessingStatus
    request: RequestInfo
    process: Optional[ProcessInfo]
    error: Optional[ErrorInfo]
    data: Any  # Processor-specific data

Error Handling¶

Errors are structured as:

@dataclass
class ErrorInfo:
    code: str
    message: str
    details: Dict[str, Any]
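As an illustration of how the three fields might be populated, the sketch below maps a caught exception onto an error object. The `ErrorInfo` class here is a local stand-in that mirrors the fields shown above, the default for `details` is an assumption, and both the `error_from_exception` helper and the error codes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ErrorInfo:
    """Local stand-in mirroring the fields shown above."""

    code: str
    message: str
    # The empty-dict default is an assumption made for this sketch.
    details: Dict[str, Any] = field(default_factory=dict)


def error_from_exception(exc: Exception, code: str = "PROCESSING_ERROR") -> ErrorInfo:
    """Hypothetical helper: map an exception onto the ErrorInfo fields."""
    return ErrorInfo(
        code=code,
        message=str(exc),
        details={"exception_type": type(exc).__name__},
    )


try:
    raise FileNotFoundError("audio.mp3 not found")
except FileNotFoundError as exc:
    error = error_from_exception(exc, code="FILE_NOT_FOUND")
```

Keeping a short machine-readable `code` separate from the human-readable `message` lets clients branch on the code while still logging the message.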

Process Tracking¶

Process information includes LLM tracking:

@dataclass
class ProcessInfo:
    id: str
    main_processor: str
    started: str
    completed: Optional[str]
    duration: Optional[float]
    llm_info: Optional[LLMInfo]  # LLM usage tracking
    is_from_cache: bool
    cache_key: Optional[str]

Usage Examples¶

Creating a Response¶

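from src.core.models.base import BaseResponse, RequestInfo, ProcessInfo
from src.core.models.enums import ProcessingStatus

response = BaseResponse(
    status=ProcessingStatus.SUCCESS,
    request=RequestInfo(
        processor="audio",
        timestamp="2024-01-01T00:00:00Z"
    ),
    process=ProcessInfo(
        id="process-123",
        main_processor="audio",
        started="2024-01-01T00:00:00Z",
        # The remaining fields are passed explicitly because the definition
        # shown under Process Tracking declares no defaults for them.
        completed=None,
        duration=None,
        llm_info=None,
        is_from_cache=False,
        cache_key=None
    ),
    error=None,  # no error on a successful response
    data={"transcription": "..."}
)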

Serialization¶

# Convert to dictionary
response_dict = response.to_dict()

# Convert to JSON
import json
json_str = json.dumps(response_dict)

Deserialization¶

# From dictionary
response = BaseResponse.from_dict(response_dict)

# From JSON
response_dict = json.loads(json_str)
response = BaseResponse.from_dict(response_dict)
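A full round trip through JSON can be sketched with local stand-ins, which also shows one way an enum field might be handled during conversion. `Status` and `MiniResponse` are hypothetical simplifications of ProcessingStatus and BaseResponse, and the enum values are assumptions; the real to_dict() and from_dict() implementations may differ.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict


class Status(Enum):
    """Stand-in for ProcessingStatus; the values are assumptions."""

    SUCCESS = "success"
    ERROR = "error"


@dataclass(frozen=True)
class MiniResponse:
    """Stand-in, not the real BaseResponse."""

    status: Status
    data: Any

    def to_dict(self) -> Dict[str, Any]:
        # Enum members are not JSON-serializable, so store the raw value.
        return {"status": self.status.value, "data": self.data}

    @classmethod
    def from_dict(cls, payload: Dict[str, Any]) -> "MiniResponse":
        # Status("success") looks the member up by its value.
        return cls(status=Status(payload["status"]), data=payload["data"])


original = MiniResponse(status=Status.SUCCESS, data={"transcription": "..."})
json_str = json.dumps(original.to_dict())
restored = MiniResponse.from_dict(json.loads(json_str))
assert restored == original  # the round trip preserves equality
```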

API Reference - API endpoints using these models
Configuration Reference - Configuration options
Code Index - All documented modules