# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Commands
### Backend (Python/FastAPI)
- **Development server**: `cd backend && uv run python run.py` or `cd backend && uv run uvicorn app.main:app --reload`
- **Tests**: `cd backend && uv run pytest` (uses pytest with asyncio support)
- **Coverage**: `cd backend && uv run coverage run -m pytest && uv run coverage report`
- **Linting**: `cd backend && uv run ruff check` and `cd backend && uv run ruff format`
- **Type checking**: `cd backend && uv run mypy .`
- **Install dependencies**: `cd backend && uv sync`
### Frontend (React/TypeScript/Vite)
- **Development server**: `cd frontend && bun dev` (runs on port 8001)
- **Build**: `cd frontend && bun run build`
- **Linting**: `cd frontend && bun run lint`
- **Preview build**: `cd frontend && bun run preview`
- **Install dependencies**: `cd frontend && bun install`
## Architecture Overview
This is a soundboard application with a FastAPI backend and React frontend.
### Backend Architecture
- **Framework**: FastAPI with SQLModel for database ORM
- **Database**: SQLite with aiosqlite async driver
- **Authentication**: JWT tokens with OAuth2 support (Google, GitHub)
- **Dependencies**: FastAPI, SQLModel, aiosqlite, bcrypt, PyJWT, pydantic-settings, uvicorn, ffmpeg-python, yt-dlp, python-vlc
- **Structure**:
  - `app/api/v1/`: API endpoints for v1 (auth.py, main.py, sounds.py, extractions.py, socket.py, player.py, playlists.py)
  - `app/api/v1/admin/`: Admin-only API endpoints (sounds.py, extractions.py)
  - `app/models/`: Database models (User, Sound, Playlist, Extraction, Plan, UserOAuth, CreditTransaction, SoundPlayed, etc.)
  - `app/services/`: Business logic layer (auth.py, oauth.py, socket.py, player.py, sound_scanner.py, sound_normalizer.py, extraction.py, extraction_processor.py, credit.py)
  - `app/repositories/`: Data access layer (base.py, user.py, user_oauth.py, sound.py, extraction.py, credit_transaction.py, playlist.py)
  - `app/schemas/`: Pydantic schemas for API requests/responses (auth.py)
  - `app/core/`: Configuration, database setup, logging, dependencies, seeds
  - `app/middleware/`: Custom middleware (logging)
  - `app/utils/`: Utility functions (auth.py, cookies.py, audio.py)
  - `tests/`: Comprehensive test suite with pytest and asyncio support
### Frontend Architecture
- **Framework**: React 19 with TypeScript
- **Build Tool**: Vite with SWC for fast development and builds
- **UI Library**: Comprehensive Radix UI component system with shadcn/ui
- **Styling**: Tailwind CSS v4 with custom components
- **Theming**: next-themes with dark mode support via ThemeProvider and ModeToggle
- **Routing**: React Router v7
- **Package Manager**: Bun for fast package management
- **Key Dependencies**: @radix-ui components, lucide-react icons, recharts, sonner notifications
- **Structure**:
  - `src/components/ui/`: Complete UI component library (button, card, dialog, table, etc.)
  - `src/components/`: App-specific components (ThemeProvider, ModeToggle)
  - `src/hooks/`: Custom React hooks (use-mobile.ts)
  - `src/lib/`: Utility functions (utils.ts with cn helper)
  - `src/contexts/`: React contexts (empty, ready for state management)
  - `src/assets/`: Static assets
### Key Models
- **User**: Authentication, plans, credits, API tokens
- **UserOAuth**: OAuth provider connections (Google, GitHub) with unique constraints
- **Plan**: User subscription plans and limits
- **Sound**: Audio files with metadata, normalization fields, play counts, **unique hash constraint**
- **Playlist**: User-created sound collections
- **PlaylistSound**: Many-to-many relationship between playlists and sounds
- **Extraction**: Audio extraction jobs from external services with async processing and flexible service detection
- **SoundPlayed**: Play history tracking with user and sound associations
- **CreditTransaction**: Comprehensive credit system transaction logging with metadata
### Database
- SQLite database at `backend/data/soundboard.db`
- Models use SQLModel (Pydantic + SQLAlchemy)
- Async database operations with aiosqlite
- **Data Integrity**: Unique constraints on sound hash, OAuth provider+user combinations
- **Foreign Key Relationships**: Proper cascading and relationship management
### Configuration
- Backend settings in `backend/app/core/config.py` using pydantic-settings
- Environment variables loaded from `.env` files
- Configurable settings: database URL, JWT secrets, OAuth2 client credentials, logging, cookies, audio normalization, audio extraction, credits system
- Default ports: Backend (8000), Frontend (8001)
- OAuth redirect URL: `http://localhost:8001/auth/callback`
### Development Notes
- Backend runs on port 8000 by default (configurable via HOST/PORT env vars)
- Frontend dev server runs on port 8001
- Project uses Python 3.12+ with uv package manager for backend
- Frontend uses TypeScript 5.8+ with strict mode enabled
- Comprehensive linting: Ruff (backend), ESLint (frontend)
- Type checking: mypy (backend), TypeScript (frontend)
- Testing: pytest with asyncio support and coverage reporting (90+ comprehensive tests including repository, service, and integration tests)
- Logs stored in `backend/logs/app.log` with rotation
- Audio files stored in `backend/sounds/` directory structure (originals, normalized, extracted)
- Database file at `backend/data/soundboard.db`
- Extraction processing uses background workers with configurable concurrency limits
## Credit System
The application includes a comprehensive credit-based system for managing user actions and resource consumption.
### Credit Features
- **Action-based Deductions**: Credits are deducted for specific actions (VLC play, audio extraction, etc.)
- **Transaction Logging**: All credit changes are logged with detailed metadata
- **Plan Integration**: Credit limits and replenishment tied to user subscription plans
- **Real-time Updates**: WebSocket events notify users of credit changes
- **Admin Management**: Administrative controls for credit adjustments
### Credit Actions
- **VLC Play Sound**: Deducts credits when playing sounds through VLC
- **Audio Extraction**: Deducts credits for extracting audio from external URLs
- **Credit Addition**: Administrative credit bonuses and plan-based replenishment
### Database Schema (CreditTransaction Model)
- **Comprehensive Tracking**: User ID, action type, amount, balance before/after
- **Metadata Storage**: JSON metadata for action-specific details
- **Success Tracking**: Boolean flag for successful/failed transactions
- **Temporal Ordering**: Created/updated timestamps for audit trails
### API Integration
- **Automatic Deduction**: Services automatically deduct credits during operations
- **Balance Checking**: Credit validation before expensive operations
- **Transaction History**: API endpoints for viewing credit transaction history
- **Real-time Events**: WebSocket emission of `user_credits_changed` events
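
The balance check, deduction, and transaction logging described above can be sketched as follows. This is a minimal, in-memory illustration and not the code in `app/services/credit.py`: the `CreditTransactionSketch` class, the `deduct_credits` function, and the `"vlc_play_sound"` action name are all hypothetical, and WebSocket emission is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class CreditTransactionSketch:
    """Hypothetical stand-in for the CreditTransaction model."""

    user_id: int
    action_type: str
    amount: int  # negative for deductions, 0 when the deduction is refused
    balance_before: int
    balance_after: int
    success: bool
    metadata: dict[str, Any] = field(default_factory=dict)


def deduct_credits(
    user_id: int,
    balance: int,
    cost: int,
    action_type: str,
    metadata: Optional[dict[str, Any]] = None,
) -> tuple[int, CreditTransactionSketch]:
    """Check the balance first, then deduct; always return a transaction record."""
    details = dict(metadata or {})
    if balance < cost:
        # Insufficient credits: the balance is untouched and the attempt is logged as failed.
        failed = CreditTransactionSketch(
            user_id, action_type, 0, balance, balance, False, details
        )
        return balance, failed
    new_balance = balance - cost
    ok = CreditTransactionSketch(
        user_id, action_type, -cost, balance, new_balance, True, details
    )
    return new_balance, ok
```

Recording both `balance_before` and `balance_after` on every attempt, including failed ones, is what makes the history usable as an audit trail.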
### Technical Implementation
- **Service**: `app/services/credit.py` - Core credit management with WebSocket integration
- **Repository**: `app/repositories/credit_transaction.py` - Database operations for credit transactions
- **Models**: `CreditTransaction` model with comprehensive metadata tracking
- **Testing**: 14 comprehensive tests covering all credit scenarios
## Sound Management System
Enhanced sound management with comprehensive duplicate prevention and integrity features.
### Sound Features
- **Duplicate Prevention**: Unique hash constraint prevents duplicate audio files
- **Metadata Tracking**: Complete audio file metadata (duration, size, hash, type)
- **Play Count Tracking**: Usage statistics for popular sounds analysis
- **Type Classification**: SDB (soundboard), TTS (text-to-speech), EXT (extracted) categorization
- **Normalization Support**: Integration with audio normalization system
- **File Integrity**: SHA-256 hash verification for data integrity
### Database Constraints
- **Unique Hash**: `UniqueConstraint("hash", name="uq_sound_hash")` prevents duplicate files
- **Data Integrity**: Proper foreign key relationships and nullable field handling
- **Indexed Fields**: Optimized queries for common operations (filename, hash, type)
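
To illustrate how a unique hash constraint rejects duplicate files, here is a small sketch using the standard-library `sqlite3` module instead of SQLModel. The table layout and the `insert_sound` helper are assumptions made for the example; only the constraint name `uq_sound_hash` comes from this document.

```python
import sqlite3

# In-memory database so the sketch has no side effects.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sound (
        id INTEGER PRIMARY KEY,
        filename TEXT NOT NULL,
        hash TEXT NOT NULL,
        CONSTRAINT uq_sound_hash UNIQUE (hash)
    )
    """
)


def insert_sound(filename: str, file_hash: str) -> bool:
    """Insert a row; return False when the hash is already present."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sound (filename, hash) VALUES (?, ?)",
                (filename, file_hash),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the UNIQUE constraint is violated.
        return False
```

A second insert with the same hash returns `False` even when the filename differs, which is the behavior the constraint is meant to guarantee.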
### API Endpoints
- `GET /api/v1/sounds/`: Get all sounds with optional type filtering (authenticated users only)
  - Query parameters: `types` (can be specified multiple times for filtering by multiple types)
  - Examples:
    - `GET /api/v1/sounds/` - Returns all sounds
    - `GET /api/v1/sounds/?types=SDB` - Returns only SDB type sounds
    - `GET /api/v1/sounds/?types=SDB&types=EXT` - Returns SDB and EXT type sounds
- `POST /api/v1/sounds/play/{sound_id}`: Play sound with VLC (requires 1 credit)
- `POST /api/v1/sounds/stop`: Stop all VLC instances
### Sound Type Filtering Features
- **Authentication Required**: All sound endpoints require valid user authentication
- **Type-based Filtering**: Filter sounds by one or more types (SDB, TTS, EXT)
- **Flexible Query Parameters**: Multiple `types` parameters supported for complex filtering
- **Empty Results**: Invalid types return empty list without error
- **Performance Optimized**: Uses SQLAlchemy `IN` clause for efficient multi-type queries
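
The repeated `types` parameter semantics can be sketched with the standard library. This is only an illustration: the `SOUNDS` list and the `filter_sounds` function are made up for the example, and the in-memory membership test stands in for the SQL `IN` clause that the repository is described as using.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative data only; real sounds live in the database.
SOUNDS = [
    {"id": 1, "name": "airhorn", "type": "SDB"},
    {"id": 2, "name": "greeting", "type": "TTS"},
    {"id": 3, "name": "clip", "type": "EXT"},
]


def filter_sounds(url: str) -> list[dict]:
    """Return sounds matching every `types` value found in the query string."""
    query = parse_qs(urlsplit(url).query)
    types = query.get("types")  # repeated keys are collected into one list
    if not types:
        return list(SOUNDS)  # no filter: return everything
    wanted = set(types)
    # Unknown type names simply match nothing, so the result is an empty list.
    return [sound for sound in SOUNDS if sound["type"] in wanted]
```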
### Technical Implementation
- **Repository**: `app/repositories/sound.py` - Complete CRUD operations with specialized queries including `get_by_types()` for type filtering
- **Models**: Enhanced `Sound` model with unique constraints and relationship management
- **API Integration**: Sound creation, update, deletion with duplicate prevention, authenticated sound retrieval
- **Testing**: 15+ comprehensive tests covering all sound operations including constraint validation and API endpoint testing
## Player System
Comprehensive audio player service with VLC backend for playlist management and audio playback.
### Player Features
- **VLC Integration**: Uses VLC media player as the backend for reliable audio playback
- **Playlist Management**: Dynamic playlist loading and reloading with state persistence
- **Intelligent Track Handling**: Smart playlist reload logic with track position tracking
- **Multiple Playback Modes**: Continuous, loop, loop-one, random, and single play modes
- **Play Count Tracking**: Automatic play count updates with 20% threshold detection
- **Real-time Position Tracking**: Background thread for position updates and auto-advance
- **WebSocket Broadcasting**: Real-time state updates via WebSocket connections
- **Credit Integration**: Automatic credit deduction for VLC-based sound plays
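
A possible shape for the 20% play-count threshold check is shown below. The function name is hypothetical, and it assumes the threshold is measured as playback position relative to track duration; the actual service may measure accumulated play time instead.

```python
PLAY_COUNT_THRESHOLD_PERCENT = 20  # from the "20% threshold" described above


def should_count_play(position_ms: int, duration_ms: int) -> bool:
    """Return True once playback has reached 20% of the track duration."""
    if duration_ms <= 0:
        return False  # unknown or invalid duration: never count
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return position_ms * 100 >= duration_ms * PLAY_COUNT_THRESHOLD_PERCENT
```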
### Playlist Reload Logic
- **ID-based Comparison**: Compares playlist IDs to determine reload behavior
- **Playlist Change Handling**: When playlist ID changes, stops player and resets to first track
- **Track Position Tracking**: When the playlist ID is unchanged, detects whether the current track has moved to a different index and updates the index
- **Missing Track Handling**: When the current track has been removed, stops the player and selects the first available track
- **Empty Playlist Support**: Graceful handling of empty playlists with state clearing
- **State Consistency**: Ensures player state remains consistent across all reload scenarios
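
The reload rules above can be condensed into one decision function. This is a sketch under stated assumptions, not the logic in `app/services/player.py`: `ReloadResult` and `plan_reload` are hypothetical names, and a playlist is reduced to an ID plus an ordered list of sound IDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReloadResult:
    stop_player: bool
    current_index: Optional[int]
    current_sound_id: Optional[int]


def plan_reload(
    old_playlist_id: Optional[int],
    new_playlist_id: int,
    new_sound_ids: list[int],
    current_sound_id: Optional[int],
) -> ReloadResult:
    """Decide what the player should do after a playlist reload."""
    if not new_sound_ids:
        # Empty playlist: stop and clear the current track.
        return ReloadResult(True, None, None)
    if old_playlist_id != new_playlist_id:
        # Different playlist: stop and reset to the first track.
        return ReloadResult(True, 0, new_sound_ids[0])
    if current_sound_id in new_sound_ids:
        # Same playlist, track still present: keep playing, refresh its index.
        return ReloadResult(
            False, new_sound_ids.index(current_sound_id), current_sound_id
        )
    # Same playlist, but the current track was removed.
    return ReloadResult(True, 0, new_sound_ids[0])
```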
### Player State Management
- **Status Tracking**: Playing, paused, stopped states with proper transitions
- **Sound Information**: Current track ID, index, position, duration tracking
- **Playlist Metadata**: Playlist ID, name, length, total duration, and sound list
- **Volume Control**: Volume management with range validation (0-100)
- **Position Tracking**: Real-time playback position with seek functionality
### Database Integration
- **Play History**: Records `SoundPlayed` entries for player-based plays (no user association)
- **Sound Statistics**: Updates sound play counts automatically when 20% threshold reached
- **Playlist Synchronization**: Syncs with database playlist changes via reload mechanism
- **Session Management**: Proper async database session handling with connection cleanup
### Technical Implementation
- **Service**: `app/services/player.py` - Core player logic with VLC integration
- **State Management**: PlayerState class for comprehensive state tracking
- **Background Threading**: Position tracking thread for non-blocking operations
- **Async Operations**: Full async/await support for database operations
- **Error Handling**: Comprehensive error handling with graceful degradation
- **Memory Management**: Proper cleanup of resources and background tasks
### Player Modes
- **Continuous**: Plays through playlist once then stops
- **Loop**: Repeats entire playlist indefinitely
- **Loop One**: Repeats current track indefinitely
- **Random**: Plays tracks in random order
- **Single**: Plays current track once then stops
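
One way to express the five modes is a function that returns the next track index, or `None` when playback should stop. The mode strings and the `next_index` name are assumptions for this sketch; note that this simple random mode may pick the same track twice in a row.

```python
import random
from typing import Optional


def next_index(
    mode: str,
    current: int,
    length: int,
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Return the index to play after `current`, or None to stop."""
    if length <= 0:
        return None
    if mode == "single":
        return None  # play the current track once, then stop
    if mode == "loop_one":
        return current  # repeat the same track
    if mode == "loop":
        return (current + 1) % length  # wrap around at the end
    if mode == "random":
        return (rng or random).randrange(length)
    if mode == "continuous":
        following = current + 1
        return following if following < length else None  # stop after the last track
    raise ValueError(f"unknown mode: {mode}")
```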
### API Integration
- **REST Endpoints**: Player control via HTTP API (`app/api/v1/player.py`)
- **WebSocket Events**: Real-time state broadcasting to connected clients
- **Authentication**: Supports both authenticated and unauthenticated playback
- **Global Service**: Singleton player service accessible throughout the application
### Testing Coverage
- **49 comprehensive tests** covering all player functionality including:
  - State management and serialization
  - Playback control (play, pause, stop, seek)
  - Playlist reload scenarios with ID changes
  - Track position tracking and updates
  - Helper method validation
  - Mode switching and volume control
  - Play count tracking and credit integration
  - Error handling and edge cases
## Repository Pattern & Testing
Comprehensive repository pattern implementation with full test coverage for data access layer.
### Repository Architecture
- **Base Repository**: `app/repositories/base.py` - Generic CRUD operations with type safety
- **Specialized Repositories**: Domain-specific repositories extending base functionality
- **Async Operations**: Full async/await support for non-blocking database operations
- **Error Handling**: Comprehensive exception handling with logging
### Repository Coverage
- **User Repository**: User management, authentication, role-based operations
- **Sound Repository**: Audio file management with specialized queries including type-based filtering (`get_by_types()`, `get_by_type()`, `get_by_hash()`, `search_by_name()`, etc.)
- **Credit Transaction Repository**: Credit system transaction management
- **User OAuth Repository**: OAuth provider management and authentication
- **Playlist Repository**: Playlist management and sound associations
- **Extraction Repository**: Audio extraction job management
### Testing Infrastructure
- **80+ Repository Tests**: Comprehensive test coverage across all repositories
- **Async Test Support**: Proper async/await testing with pytest-asyncio
- **SQLAlchemy Integration**: Proper session management and lazy loading handling
- **Type Safety**: Complete mypy type checking compliance
- **Fixture Management**: Reusable test fixtures with proper dependency injection
### Test Categories
- **CRUD Operations**: Create, read, update, delete operations for all entities
- **Constraint Validation**: Unique constraint and foreign key relationship testing
- **Pagination Testing**: Limit/offset pagination with proper ordering
- **Error Scenarios**: Exception handling and error condition testing
- **Performance Tests**: Query optimization and efficient data access patterns
## Sound Normalization System
The application includes a comprehensive audio normalization system using FFmpeg's loudnorm filter for professional-quality audio processing.
### Normalization Features
- **Two-pass normalization**: Default high-quality mode with analysis and normalization phases
- **One-pass normalization**: Fast mode for quick processing or as fallback
- **Intelligent fallback**: Automatically switches to one-pass for problematic audio (infinite analysis values)
- **Batch processing**: Normalize all sounds or filter by type (SDB, TTS, EXT)
- **Admin-only access**: Normalization endpoints require administrator privileges
- **Comprehensive logging**: Detailed FFmpeg output and error handling
### Directory Structure
```
backend/sounds/
├── originals/
│   ├── soundboard/       # SDB type sounds
│   ├── text_to_speech/   # TTS type sounds
│   └── extracted/        # EXT type sounds
└── normalized/
    ├── soundboard/       # Normalized SDB sounds
    ├── text_to_speech/   # Normalized TTS sounds
    └── extracted/        # Normalized EXT sounds
```
### Configuration (Environment Variables)
- `NORMALIZED_AUDIO_FORMAT`: Output format (default: "mp3")
- `NORMALIZED_AUDIO_BITRATE`: Bitrate setting (default: "256k")
- `NORMALIZED_AUDIO_PASSES`: 1 for one-pass, 2 for two-pass (default: 2)
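
As an illustration of how these variables and defaults fit together, the sketch below reads them with the standard library. The project itself uses pydantic-settings in `backend/app/core/config.py`; the `NormalizationConfig` class and `load_normalization_config` function here are hypothetical stand-ins.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NormalizationConfig:
    audio_format: str
    bitrate: str
    passes: int


def load_normalization_config(
    env: Optional[Mapping[str, str]] = None,
) -> NormalizationConfig:
    """Read the three normalization variables, applying the documented defaults."""
    source = os.environ if env is None else env
    passes = int(source.get("NORMALIZED_AUDIO_PASSES", "2"))
    if passes not in (1, 2):
        raise ValueError("NORMALIZED_AUDIO_PASSES must be 1 or 2")
    return NormalizationConfig(
        audio_format=source.get("NORMALIZED_AUDIO_FORMAT", "mp3"),
        bitrate=source.get("NORMALIZED_AUDIO_BITRATE", "256k"),
        passes=passes,
    )
```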
### Database Fields (Sound Model)
- `is_normalized`: Boolean flag indicating normalization status
- `normalized_filename`: Filename of normalized audio file
- `normalized_duration`: Duration in milliseconds of normalized file
- `normalized_size`: File size in bytes of normalized file
- `normalized_hash`: SHA-256 hash of normalized file for integrity
### API Endpoints
- `POST /api/v1/sounds/normalize/all`: Normalize all unnormalized sounds
- `POST /api/v1/sounds/normalize/type/{sound_type}`: Normalize sounds by type
- `POST /api/v1/sounds/normalize/{sound_id}`: Normalize specific sound
- **Parameters**: `force` (re-normalize already processed), `one_pass` (override config)
### Technical Implementation
- **Service**: `app/services/sound_normalizer.py` - Core normalization logic
- **API**: `app/api/v1/sounds.py` - REST endpoints (consolidated with other sound endpoints)
- **Repository**: Enhanced `app/repositories/sound.py` with normalization queries
- **Dependencies**: Requires FFmpeg installed on system, uses ffmpeg-python library
- **Error Handling**: Graceful fallback for edge cases (silent audio, infinite values)
- **Session Management**: Handles SQLModel session detachment in batch operations
### Normalization Process
1. **Analysis Phase** (two-pass only): Analyze audio characteristics
2. **Validation**: Check for invalid analysis values (inf, -inf, nan)
3. **Fallback Logic**: Switch to one-pass if analysis contains invalid values
4. **Normalization**: Apply loudnorm filter with target levels (I=-23 LUFS, TP=-2 dBTP, LRA=7 LU)
5. **Database Update**: Store normalized file metadata and set is_normalized flag
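
Steps 2 to 4 can be sketched as a pure function that validates the first-pass measurements and builds the loudnorm filter string. The helper names are hypothetical and FFmpeg is not invoked here; the dictionary keys follow the JSON that loudnorm prints when run with `print_format=json`, and `linear=true` is an assumption about the second pass rather than something this document states.

```python
import math
from typing import Mapping, Optional

TARGET_I, TARGET_TP, TARGET_LRA = -23.0, -2.0, 7.0
REQUIRED_KEYS = ("input_i", "input_tp", "input_lra", "input_thresh", "target_offset")


def analysis_is_valid(analysis: Mapping[str, str]) -> bool:
    """Reject analysis results that are missing or contain inf, -inf, or nan."""
    try:
        return all(math.isfinite(float(analysis[key])) for key in REQUIRED_KEYS)
    except (KeyError, ValueError):
        return False


def build_loudnorm_filter(analysis: Optional[Mapping[str, str]] = None) -> str:
    """Return a two-pass filter when analysis is usable, else the one-pass filter."""
    base = f"loudnorm=I={TARGET_I:g}:TP={TARGET_TP:g}:LRA={TARGET_LRA:g}"
    if analysis is None or not analysis_is_valid(analysis):
        return base  # one-pass fallback
    return (
        f"{base}"
        f":measured_I={analysis['input_i']}"
        f":measured_TP={analysis['input_tp']}"
        f":measured_LRA={analysis['input_lra']}"
        f":measured_thresh={analysis['input_thresh']}"
        f":offset={analysis['target_offset']}"
        ":linear=true"
    )
```

Silent audio is a typical source of `-inf` measurements, which is why the fallback path returns the plain one-pass filter instead of raising.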
### Testing
- 17 comprehensive service tests covering all normalization scenarios
- 16 API endpoint tests with authentication and authorization checks
- Edge case handling for problematic audio files
- Mock FFmpeg operations for reliable testing
## Sound Scanner System
The application includes a sound scanner service for automatically discovering, importing, and managing audio files in the filesystem.
### Scanner Features
- **File Discovery**: Recursively scans sound directories for audio files
- **Format Support**: Handles multiple audio formats (.mp3, .wav, .flac, .ogg, .m4a, etc.)
- **Metadata Extraction**: Uses FFmpeg to extract duration and file information
- **Database Sync**: Automatically adds new files, updates existing ones, removes deleted files
- **Admin-only Access**: Scanning operations require administrator privileges
- **Comprehensive Reporting**: Detailed results showing added, updated, deleted, and skipped files
- **Duplicate Prevention**: Integration with unique hash constraint system
### Technical Implementation
- **Service**: `app/services/sound_scanner.py` - Core scanning and import logic
- **API**: `app/api/v1/sounds.py` - REST endpoint for scanning operations
- **Dependencies**: Requires FFmpeg for metadata extraction
- **Error Handling**: Graceful handling of corrupted or unreadable files
- **Hash-based Detection**: Uses SHA-256 hashing to detect file changes and prevent duplicates
### Scanning Process
1. **Directory Traversal**: Recursively scan configured sound directories
2. **File Validation**: Check file extensions and accessibility
3. **Metadata Extraction**: Extract duration using FFmpeg, and compute file size and SHA-256 hash
4. **Database Comparison**: Compare with existing database records
5. **Duplicate Detection**: Check unique hash constraint before insertion
6. **Sync Operations**: Add new files, update changed files, remove deleted files
7. **Results Reporting**: Return detailed scan results with statistics
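
The comparison and sync steps (4 to 7) can be sketched as a pure function over two filename-to-hash mappings. `plan_sync` is a hypothetical name, and treating a new file whose hash already exists as "skipped" is an assumption about how the scanner reports duplicates.

```python
def plan_sync(disk: dict[str, str], db: dict[str, str]) -> dict[str, list[str]]:
    """Compare files on disk with database records.

    Both arguments map filename -> SHA-256 hash.
    """
    result: dict[str, list[str]] = {
        "added": [],
        "updated": [],
        "deleted": [],
        "skipped": [],
    }
    known_hashes = set(db.values())
    for name in sorted(disk):
        file_hash = disk[name]
        if name not in db:
            if file_hash in known_hashes:
                result["skipped"].append(name)  # duplicate content, would violate the unique hash
            else:
                result["added"].append(name)
                known_hashes.add(file_hash)
        elif db[name] != file_hash:
            result["updated"].append(name)  # same name, content changed
        else:
            result["skipped"].append(name)  # unchanged
    # Records whose file no longer exists on disk.
    result["deleted"] = sorted(set(db) - set(disk))
    return result
```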
### API Endpoints
- `POST /api/v1/sounds/scan`: Scan and sync sound directories
## WebSocket/Socket.IO System
Real-time communication system using WebSocket connections for live updates and messaging.
### Socket Features
- **Real-time Communication**: WebSocket-based messaging between users
- **Connection Management**: Track connected users and connection status
- **User-to-User Messaging**: Send messages to specific users
- **Connection Status**: Get current connection status and user count
- **Authentication Integration**: Uses existing user authentication system
- **Credit Change Notifications**: Real-time credit balance updates via `user_credits_changed` events
### Technical Implementation
- **Service**: `app/services/socket.py` - Socket.IO manager and connection handling
- **API**: `app/api/v1/socket.py` - REST endpoints for socket operations
- **Manager**: Centralized socket connection management with user tracking
- **Authentication**: Integrated with existing JWT authentication system
- **Event System**: Structured event emission for various application events
### API Endpoints
- `GET /api/v1/socket/status`: Get current socket connection status
- `POST /api/v1/socket/send-message`: Send a message to a specific user via WebSocket
### Socket Events
- **Connection Management**: Connection and disconnection tracking
- **User Messages**: User-specific message routing
- **Credit Updates**: `user_credits_changed` events with detailed transaction data
- **Real-time Status**: Live application status updates
## Audio Utilities
Shared utility functions for audio file processing used across multiple services.
### Audio Utility Functions
- **File Hashing**: `get_file_hash()` - Calculate SHA-256 hash of audio files for integrity checking
- **File Size**: `get_file_size()` - Get file size in bytes for metadata storage
- **Duration Extraction**: `get_audio_duration()` - Extract audio duration in milliseconds using FFmpeg
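
For reference, the hashing and size helpers could look roughly like the sketch below. The function names match the ones listed above, but the signatures (a path argument and an optional `chunk_size`) are assumptions, and `get_audio_duration()` is omitted because it depends on FFmpeg.

```python
import hashlib
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def get_file_hash(path: PathLike, chunk_size: int = 8192) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def get_file_size(path: PathLike) -> int:
    """Return the file size in bytes."""
    return Path(path).stat().st_size
```

Reading in fixed-size chunks keeps memory usage flat even for long audio files.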
### Technical Implementation
- **Module**: `app/utils/audio.py` - Shared audio processing utilities
- **Dependencies**: Uses FFmpeg via ffmpeg-python for duration extraction
- **Error Handling**: Graceful fallback for corrupted or unreadable files
- **Consistent Interface**: Same function signatures across all audio services
### Usage
- **Sound Scanner**: Uses utilities for file discovery and metadata extraction
- **Sound Normalizer**: Uses utilities for normalized file verification and metadata
- **Audio Extraction**: Uses utilities for extracted audio file metadata and validation
- **Duplicate Prevention**: Hash calculation for unique constraint enforcement
- **Centralized Logic**: Eliminates code duplication between audio processing services
## Audio Extraction System
The application includes a comprehensive audio extraction system for downloading and processing audio content from external services using yt-dlp.
### Extraction Features
- **Immediate Response**: API endpoints return immediately without waiting for yt-dlp processing
- **Background Processing**: Actual extraction happens asynchronously in background worker threads
- **Multi-Service Support**: Supports YouTube, SoundCloud, Vimeo, DailyMotion, TikTok, Twitter, Instagram
- **Non-blocking Operations**: yt-dlp operations run in thread pools to prevent event loop blocking
- **Concurrent Processing**: Configurable maximum concurrent extractions with queue management
- **Automatic Normalization**: Extracted audio is automatically normalized using the sound normalization system
- **Error Handling**: Comprehensive error handling with detailed logging and status tracking
- **Credit Integration**: Automatic credit deduction for extraction operations
### Database Schema (Extraction Model)
- **Flexible Service Detection**: `service` and `service_id` are nullable during creation, populated during processing
- **Status Tracking**: `pending` → `processing` → `completed`/`failed`
- **Metadata Storage**: URL, title, user association, linked sound record
- **Error Logging**: Detailed error messages for failed extractions
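
The status lifecycle can be modeled as a small state machine. This is a sketch only: the `ExtractionStatus` enum and `advance` helper are hypothetical, and the transition table assumes a job can fail only while processing, which the real model may not enforce.

```python
from enum import Enum


class ExtractionStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions; completed and failed are terminal in this sketch.
ALLOWED_TRANSITIONS: dict[ExtractionStatus, set[ExtractionStatus]] = {
    ExtractionStatus.PENDING: {ExtractionStatus.PROCESSING},
    ExtractionStatus.PROCESSING: {ExtractionStatus.COMPLETED, ExtractionStatus.FAILED},
    ExtractionStatus.COMPLETED: set(),
    ExtractionStatus.FAILED: set(),
}


def advance(current: ExtractionStatus, new: ExtractionStatus) -> ExtractionStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```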
### Directory Structure
```
backend/sounds/temp/ # Temporary extraction workspace
backend/sounds/originals/extracted/ # Final extracted audio files
backend/sounds/originals/extracted/thumbnails/ # Extracted thumbnails
```
### Configuration (Environment Variables)
- `EXTRACTION_AUDIO_FORMAT`: Output audio format (default: "mp3")
- `EXTRACTION_AUDIO_BITRATE`: Audio bitrate setting (default: "256k")
- `EXTRACTION_TEMP_DIR`: Temporary extraction directory (default: "sounds/temp")
- `EXTRACTION_THUMBNAILS_DIR`: Thumbnail storage directory (default: "sounds/originals/extracted/thumbnails")
- `EXTRACTION_MAX_CONCURRENT`: Maximum concurrent extractions (default: 2)
### API Endpoints
- `POST /api/v1/extractions/`: Create extraction job (immediate response)
- `GET /api/v1/admin/extractions/status`: Get extraction processor status (admin only)
- `GET /api/v1/extractions/{extraction_id}`: Get specific extraction info
- `GET /api/v1/extractions/`: Get user's extraction history
### Technical Implementation
- **Service**: `app/services/extraction.py` - Core extraction logic with async yt-dlp operations
- **Processor**: `app/services/extraction_processor.py` - Background queue manager with concurrency control
- **Repository**: `app/repositories/extraction.py` - Database operations for extraction records
- **API**: `app/api/v1/extractions.py` - Dedicated extraction API endpoints, `app/api/v1/admin/extractions.py` - Admin extraction endpoints
- **Dependencies**: Requires yt-dlp for media extraction, FFmpeg for audio processing
- **Async Operations**: All blocking I/O operations wrapped in `asyncio.to_thread()` for non-blocking execution
### Extraction Process
1. **Creation**: Immediate API response with extraction record (service info null)
2. **Queue**: Background processor picks up pending extractions
3. **Service Detection**: yt-dlp identifies service and media metadata (non-blocking)
4. **Duplicate Check**: Verify no existing extraction for same service/media
5. **Media Download**: Extract audio and thumbnails using yt-dlp (non-blocking)
6. **File Processing**: Move files to final locations with sanitized names
7. **Sound Creation**: Create Sound database record with metadata and unique hash
8. **Normalization**: Automatically normalize extracted audio
9. **Status Update**: Mark extraction as completed with sound association
### Concurrency and Performance
- **Thread Pool Execution**: yt-dlp operations run in separate threads
- **Queue Management**: Background processor manages extraction queue
- **Concurrent Limits**: Configurable maximum concurrent extractions
- **Non-blocking API**: Other endpoints remain responsive during extraction
- **Resource Management**: Automatic cleanup of temporary files
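
The combination of thread offloading and a concurrency cap can be sketched with `asyncio.to_thread()` and an `asyncio.Semaphore`. The `blocking_download` function is a stand-in for a blocking yt-dlp call and `process_all` is a hypothetical name; the real processor also manages a queue, which is not shown.

```python
import asyncio
import time

MAX_CONCURRENT = 2  # mirrors the EXTRACTION_MAX_CONCURRENT default


def blocking_download(url: str) -> str:
    """Stand-in for a blocking yt-dlp operation."""
    time.sleep(0.05)
    return f"done:{url}"


async def process_all(urls: list[str], max_concurrent: int = MAX_CONCURRENT) -> list[str]:
    """Run blocking downloads in threads, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(url: str) -> str:
        async with semaphore:  # wait here while the limit is reached
            # to_thread keeps the event loop free while the blocking call runs.
            return await asyncio.to_thread(blocking_download, url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(url) for url in urls))
```

Called as `asyncio.run(process_all(["a", "b", "c"]))`, the third download starts only after one of the first two finishes, while the event loop stays free to serve other requests.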
### Error Handling
- **Service Detection Failures**: Invalid URLs handled gracefully during processing
- **Download Failures**: Network issues, geo-restrictions, or unavailable content
- **Processing Failures**: File system errors, FFmpeg issues, or corruption
- **Duplicate Prevention**: Service-level duplicate detection during processing
- **Comprehensive Logging**: Detailed error messages and extraction status tracking
### API Organization
- **Dedicated Extraction Endpoints**: Extraction functionality separated into `/api/v1/extractions/` for better organization
- **Admin Separation**: Admin-only endpoints moved to `/api/v1/admin/extractions/` for proper access control
- **Consistent URL Structure**: RESTful endpoint design following FastAPI best practices
- **Router Registration**: Proper router mounting and tag organization for API documentation
### Testing
- **16 comprehensive service tests** covering all extraction scenarios including async operations
- **API endpoint tests** with authentication and background processing validation
- **Error handling tests** for various failure scenarios
- **Mock yt-dlp operations** for reliable testing without network dependencies
- **Concurrency tests** validating non-blocking behavior and thread pool execution
- **Endpoint migration tests** ensuring proper URL routing and authentication
## Data Integrity & Performance
### Database Constraints
- **Sound Hash Uniqueness**: Prevents duplicate audio files via unique hash constraint
- **OAuth Provider Uniqueness**: Prevents duplicate OAuth connections per provider
- **Foreign Key Integrity**: Proper cascading relationships between all models
- **Index Optimization**: Strategic indexing for common query patterns
### Type Safety & Code Quality
- **Full mypy Compliance**: Complete type checking across all Python code
- **Async/Await Patterns**: Proper async programming throughout the stack
- **Error Handling**: Comprehensive exception handling with detailed logging
- **Test Coverage**: 95+ comprehensive tests with 100% critical path coverage including repository, service, and integration tests
### Performance Optimizations
- **Lazy Loading Management**: Proper SQLAlchemy relationship loading
- **Query Optimization**: Efficient database queries with pagination support
- **Background Processing**: Non-blocking operations for expensive tasks
- **Resource Management**: Proper cleanup of temporary files and connections
## Development Best Practices
### Code Organization
- **Repository Pattern**: Clean separation of data access logic
- **Service Layer**: Business logic encapsulation with dependency injection
- **Type Safety**: Comprehensive type annotations and mypy compliance
- **Error Handling**: Structured exception handling with proper logging
### Testing Strategy
- **Unit Tests**: Comprehensive repository and service layer testing
- **Integration Tests**: End-to-end API testing with authentication
- **Async Testing**: Proper async/await testing patterns with pytest-asyncio
- **Mock Strategies**: External service mocking for reliable testing
### Security & Authentication
- **JWT Token Management**: Secure token-based authentication
- **OAuth Integration**: Third-party authentication with proper scoping
- **Role-based Access**: Admin/user role separation for sensitive operations
- **Input Validation**: Comprehensive request validation with Pydantic schemas
### Monitoring & Logging
- **Structured Logging**: Consistent logging patterns across all services
- **Error Tracking**: Comprehensive exception logging with context
- **Performance Monitoring**: Request timing and resource usage tracking
- **Audit Trails**: Complete transaction history for credit and user operations