
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Backend (Python/FastAPI)
- **Development server**: `cd backend && uv run python run.py` or `cd backend && uv run uvicorn app.main:app --reload`
- **Tests**: `cd backend && uv run pytest` (uses pytest with asyncio support)
- **Coverage**: `cd backend && uv run coverage run -m pytest && uv run coverage report`
- **Linting and formatting**: `cd backend && uv run ruff check` (lint) and `cd backend && uv run ruff format` (format)
- **Type checking**: `cd backend && uv run mypy .`
- **Install dependencies**: `cd backend && uv sync`

### Frontend (React/TypeScript/Vite)
- **Development server**: `cd frontend && bun dev` (runs on port 8001)
- **Build**: `cd frontend && bun run build`
- **Linting**: `cd frontend && bun run lint`
- **Preview build**: `cd frontend && bun run preview`
- **Install dependencies**: `cd frontend && bun install`

## Architecture Overview

This is a soundboard application with a FastAPI backend and React frontend.

### Backend Architecture
- **Framework**: FastAPI with SQLModel for database ORM
- **Database**: SQLite with aiosqlite async driver
- **Authentication**: JWT tokens with OAuth2 support (Google, GitHub)
- **Dependencies**: FastAPI, SQLModel, aiosqlite, bcrypt, PyJWT, pydantic-settings, uvicorn, ffmpeg-python, yt-dlp, python-vlc
- **Structure**:
  - `app/api/v1/`: API endpoints for v1 (auth.py, main.py, sounds.py, extractions.py, socket.py, player.py, playlists.py)
  - `app/api/v1/admin/`: Admin-only API endpoints (sounds.py, extractions.py)
  - `app/models/`: Database models (User, Sound, Playlist, Extraction, Plan, UserOAuth, CreditTransaction, SoundPlayed, etc.)
  - `app/services/`: Business logic layer (auth.py, oauth.py, socket.py, player.py, sound_scanner.py, sound_normalizer.py, extraction.py, extraction_processor.py, credit.py)
  - `app/repositories/`: Data access layer (base.py, user.py, user_oauth.py, sound.py, extraction.py, credit_transaction.py, playlist.py)
  - `app/schemas/`: Pydantic schemas for API requests/responses (auth.py)
  - `app/core/`: Configuration, database setup, logging, dependencies, seeds
  - `app/middleware/`: Custom middleware (logging)
  - `app/utils/`: Utility functions (auth.py, cookies.py, audio.py)
  - `tests/`: Comprehensive test suite with pytest and asyncio support

### Frontend Architecture
- **Framework**: React 19 with TypeScript
- **Build Tool**: Vite with SWC for fast development and builds
- **UI Library**: Comprehensive Radix UI component system with shadcn/ui
- **Styling**: Tailwind CSS v4 with custom components
- **Theming**: next-themes with dark mode support via ThemeProvider and ModeToggle
- **Routing**: React Router v7
- **Package Manager**: Bun for fast package management
- **Key Dependencies**: @radix-ui components, lucide-react icons, recharts, sonner notifications
- **Structure**:
  - `src/components/ui/`: Complete UI component library (button, card, dialog, table, etc.)
  - `src/components/`: App-specific components (ThemeProvider, ModeToggle)
  - `src/hooks/`: Custom React hooks (use-mobile.ts)
  - `src/lib/`: Utility functions (utils.ts with cn helper)
  - `src/contexts/`: React contexts (empty, ready for state management)
  - `src/assets/`: Static assets

### Key Models
- **User**: Authentication, plans, credits, API tokens
- **UserOAuth**: OAuth provider connections (Google, GitHub) with unique constraints
- **Plan**: User subscription plans and limits
- **Sound**: Audio files with metadata, normalization fields, play counts, **unique hash constraint**
- **Playlist**: User-created sound collections
- **PlaylistSound**: Many-to-many relationship between playlists and sounds
- **Extraction**: Audio extraction jobs from external services with async processing and flexible service detection
- **SoundPlayed**: Play history tracking with user and sound associations
- **CreditTransaction**: Comprehensive credit system transaction logging with metadata

### Database
- SQLite database at `backend/data/soundboard.db`
- Models use SQLModel (Pydantic + SQLAlchemy)
- Async database operations with aiosqlite
- **Data Integrity**: Unique constraints on sound hash, OAuth provider+user combinations
- **Foreign Key Relationships**: Proper cascading and relationship management

### Configuration
- Backend settings in `backend/app/core/config.py` using pydantic-settings
- Environment variables loaded from `.env` files
- Configurable settings: database URL, JWT secrets, OAuth2 client credentials, logging, cookies, audio normalization, audio extraction, credits system
- Default ports: Backend (8000), Frontend (8001)
- OAuth redirect URL: `http://localhost:8001/auth/callback`

### Development Notes
- Backend runs on port 8000 by default (configurable via HOST/PORT env vars)
- Frontend dev server runs on port 8001
- Project uses Python 3.12+ with uv package manager for backend
- Frontend uses TypeScript 5.8+ with strict mode enabled
- Comprehensive linting: Ruff (backend), ESLint (frontend)
- Type checking: mypy (backend), TypeScript (frontend)
- Testing: pytest with asyncio support and coverage reporting (90+ comprehensive tests including repository, service, and integration tests)
- Logs stored in `backend/logs/app.log` with rotation
- Audio files stored under `backend/sounds/` (`originals/`, `normalized/`, and `temp/`; extracted audio is kept in `originals/extracted/`)
- Database file at `backend/data/soundboard.db`
- Extraction processing uses background workers with configurable concurrency limits

## Credit System

The application includes a comprehensive credit-based system for managing user actions and resource consumption.

### Credit Features
- **Action-based Deductions**: Credits are deducted for specific actions (VLC play, audio extraction, etc.)
- **Transaction Logging**: All credit changes are logged with detailed metadata
- **Plan Integration**: Credit limits and replenishment tied to user subscription plans
- **Real-time Updates**: WebSocket events notify users of credit changes
- **Admin Management**: Administrative controls for credit adjustments

### Credit Actions
- **VLC Play Sound**: Deducts credits when playing sounds through VLC
- **Audio Extraction**: Deducts credits for extracting audio from external URLs
- **Credit Addition**: Administrative credit bonuses and plan-based replenishment

### Database Schema (CreditTransaction Model)
- **Comprehensive Tracking**: User ID, action type, amount, balance before/after
- **Metadata Storage**: JSON metadata for action-specific details
- **Success Tracking**: Boolean flag for successful/failed transactions
- **Temporal Ordering**: Created/updated timestamps for audit trails

### API Integration
- **Automatic Deduction**: Services automatically deduct credits during operations
- **Balance Checking**: Credit validation before expensive operations
- **Transaction History**: API endpoints for viewing credit transaction history
- **Real-time Events**: WebSocket emission of `user_credits_changed` events

### Technical Implementation
- **Service**: `app/services/credit.py` - Core credit management with WebSocket integration
- **Repository**: `app/repositories/credit_transaction.py` - Database operations for credit transactions
- **Models**: `CreditTransaction` model with comprehensive metadata tracking
- **Testing**: 14 comprehensive tests covering all credit scenarios

## Sound Management System

Enhanced sound management with comprehensive duplicate prevention and integrity features.

### Sound Features
- **Duplicate Prevention**: Unique hash constraint prevents duplicate audio files
- **Metadata Tracking**: Complete audio file metadata (duration, size, hash, type)
- **Play Count Tracking**: Usage statistics for identifying popular sounds
- **Type Classification**: SDB (soundboard), TTS (text-to-speech), EXT (extracted) categorization
- **Normalization Support**: Integration with audio normalization system
- **File Integrity**: SHA-256 hash verification for data integrity

### Database Constraints
- **Unique Hash**: `UniqueConstraint("hash", name="uq_sound_hash")` prevents duplicate files
- **Data Integrity**: Proper foreign key relationships and nullable field handling
- **Indexed Fields**: Optimized queries for common operations (filename, hash, type)
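
The effect of the unique hash constraint can be shown with the standard library's `sqlite3` module. This is only a sketch: the real table is defined through SQLModel, and the column set and the `insert_sound` helper here are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sound (
        id INTEGER PRIMARY KEY,
        filename TEXT NOT NULL,
        type TEXT NOT NULL,
        hash TEXT NOT NULL,
        CONSTRAINT uq_sound_hash UNIQUE (hash)
    )
    """
)
# Indexes on the other commonly queried fields (the unique constraint
# already creates an index on hash).
conn.execute("CREATE INDEX ix_sound_filename ON sound (filename)")
conn.execute("CREATE INDEX ix_sound_type ON sound (type)")


def insert_sound(filename: str, sound_type: str, file_hash: str) -> bool:
    """Insert a sound; return False if a file with the same hash exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sound (filename, type, hash) VALUES (?, ?, ?)",
                (filename, sound_type, file_hash),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Inserting two files with the same hash succeeds the first time and is rejected the second time, even when the filenames differ.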

### API Endpoints
- `GET /api/v1/sounds/`: Get all sounds with optional type filtering (authenticated users only)
  - Query parameters: `types` (can be specified multiple times for filtering by multiple types)
  - Examples:
    - `GET /api/v1/sounds/` - Returns all sounds
    - `GET /api/v1/sounds/?types=SDB` - Returns only SDB type sounds
    - `GET /api/v1/sounds/?types=SDB&types=EXT` - Returns SDB and EXT type sounds
- `POST /api/v1/sounds/play/{sound_id}`: Play sound with VLC (requires 1 credit)
- `POST /api/v1/sounds/stop`: Stop all VLC instances

### Sound Type Filtering Features
- **Authentication Required**: All sound endpoints require valid user authentication
- **Type-based Filtering**: Filter sounds by one or more types (SDB, TTS, EXT)
- **Flexible Query Parameters**: Multiple `types` parameters supported for complex filtering
- **Empty Results**: Unknown type values return an empty list rather than an error
- **Performance Optimized**: Uses SQLAlchemy `IN` clause for efficient multi-type queries

### Technical Implementation
- **Repository**: `app/repositories/sound.py` - Complete CRUD operations with specialized queries including `get_by_types()` for type filtering
- **Models**: Enhanced `Sound` model with unique constraints and relationship management
- **API Integration**: Sound creation, update, deletion with duplicate prevention, authenticated sound retrieval
- **Testing**: 15+ comprehensive tests covering all sound operations including constraint validation and API endpoint testing

## Player System

Comprehensive audio player service with VLC backend for playlist management and audio playback.

### Player Features
- **VLC Integration**: Uses VLC media player as the backend for reliable audio playback
- **Playlist Management**: Dynamic playlist loading and reloading with state persistence
- **Intelligent Track Handling**: Smart playlist reload logic with track position tracking
- **Multiple Playback Modes**: Continuous, loop, loop-one, random, and single play modes
- **Play Count Tracking**: Automatic play count updates with 20% threshold detection
- **Real-time Position Tracking**: Background thread for position updates and auto-advance
- **WebSocket Broadcasting**: Real-time state updates via WebSocket connections
- **Credit Integration**: Automatic credit deduction for VLC-based sound plays

### Playlist Reload Logic
- **ID-based Comparison**: Compares playlist IDs to determine reload behavior
- **Playlist Change Handling**: When the playlist ID changes, stops the player and resets to the first track
- **Track Position Tracking**: When the playlist ID is unchanged, checks whether the current track moved to a different index and updates it
- **Missing Track Handling**: When the current track has been removed, stops the player and selects the first available track
- **Empty Playlist Support**: Graceful handling of empty playlists with state clearing
- **State Consistency**: Ensures player state remains consistent across all reload scenarios

### Player State Management
- **Status Tracking**: Playing, paused, stopped states with proper transitions
- **Sound Information**: Current track ID, index, position, duration tracking
- **Playlist Metadata**: Playlist ID, name, length, total duration, and sound list
- **Volume Control**: Volume management with range validation (0-100)
- **Position Tracking**: Real-time playback position with seek functionality

### Database Integration
- **Play History**: Records `SoundPlayed` entries for player-based plays (no user association)
- **Sound Statistics**: Updates sound play counts automatically when 20% threshold reached
- **Playlist Synchronization**: Syncs with database playlist changes via reload mechanism
- **Session Management**: Proper async database session handling with connection cleanup
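
One way to read the 20% rule is that a play is counted once per playback, as soon as the position reaches 20% of the track's duration. The sketch below assumes exactly that interpretation; `PlayCountTracker` and its method names are hypothetical, and an in-memory dict stands in for the database update.

```python
PLAY_COUNT_THRESHOLD = 0.20  # assumed to be a fraction of the track duration


class PlayCountTracker:
    """Counts a track as played once, after the threshold is crossed."""

    def __init__(self) -> None:
        self.play_counts: dict[int, int] = {}
        self._counted: set[int] = set()

    def on_track_start(self, sound_id: int) -> None:
        """Allow the track to be counted again on its next playback."""
        self._counted.discard(sound_id)

    def on_position(self, sound_id: int, position_ms: int, duration_ms: int) -> bool:
        """Feed a position update; return True if this update counted a play."""
        if duration_ms <= 0 or sound_id in self._counted:
            return False
        if position_ms / duration_ms >= PLAY_COUNT_THRESHOLD:
            self._counted.add(sound_id)
            self.play_counts[sound_id] = self.play_counts.get(sound_id, 0) + 1
            return True
        return False
```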

### Technical Implementation
- **Service**: `app/services/player.py` - Core player logic with VLC integration
- **State Management**: PlayerState class for comprehensive state tracking
- **Background Threading**: Position tracking thread for non-blocking operations
- **Async Operations**: Full async/await support for database operations
- **Error Handling**: Comprehensive error handling with graceful degradation
- **Memory Management**: Proper cleanup of resources and background tasks

### Player Modes
- **Continuous**: Plays through playlist once then stops
- **Loop**: Repeats entire playlist indefinitely
- **Loop One**: Repeats current track indefinitely
- **Random**: Plays tracks in random order
- **Single**: Plays current track once then stops

### API Integration
- **REST Endpoints**: Player control via HTTP API (`app/api/v1/player.py`)
- **WebSocket Events**: Real-time state broadcasting to connected clients
- **Authentication**: Supports both authenticated and unauthenticated playback
- **Global Service**: Singleton player service accessible throughout the application

### Testing Coverage
- **49 comprehensive tests** covering all player functionality including:
  - State management and serialization
  - Playback control (play, pause, stop, seek)
  - Playlist reload scenarios with ID changes
  - Track position tracking and updates
  - Helper method validation
  - Mode switching and volume control
  - Play count tracking and credit integration
  - Error handling and edge cases

## Repository Pattern & Testing

Comprehensive repository pattern implementation with full test coverage for data access layer.

### Repository Architecture
- **Base Repository**: `app/repositories/base.py` - Generic CRUD operations with type safety
- **Specialized Repositories**: Domain-specific repositories extending base functionality
- **Async Operations**: Full async/await support for non-blocking database operations
- **Error Handling**: Comprehensive exception handling with logging

### Repository Coverage
- **User Repository**: User management, authentication, role-based operations
- **Sound Repository**: Audio file management with specialized queries including type-based filtering (`get_by_types()`, `get_by_type()`, `get_by_hash()`, `search_by_name()`, etc.)
- **Credit Transaction Repository**: Credit system transaction management
- **User OAuth Repository**: OAuth provider management and authentication
- **Playlist Repository**: Playlist management and sound associations
- **Extraction Repository**: Audio extraction job management

### Testing Infrastructure
- **80+ Repository Tests**: Comprehensive test coverage across all repositories
- **Async Test Support**: Proper async/await testing with pytest-asyncio
- **SQLAlchemy Integration**: Proper session management and lazy loading handling
- **Type Safety**: Complete mypy type checking compliance
- **Fixture Management**: Reusable test fixtures with proper dependency injection

### Test Categories
- **CRUD Operations**: Create, read, update, delete operations for all entities
- **Constraint Validation**: Unique constraint and foreign key relationship testing
- **Pagination Testing**: Limit/offset pagination with proper ordering
- **Error Scenarios**: Exception handling and error condition testing
- **Performance Tests**: Query optimization and efficient data access patterns

## Sound Normalization System

The application includes a comprehensive audio normalization system using FFmpeg's loudnorm filter for professional-quality audio processing.

### Normalization Features
- **Two-pass normalization**: Default high-quality mode with analysis and normalization phases
- **One-pass normalization**: Fast mode for quick processing or as fallback
- **Intelligent fallback**: Automatically switches to one-pass for problematic audio (infinite analysis values)
- **Batch processing**: Normalize all sounds or filter by type (SDB, TTS, EXT)
- **Admin-only access**: Normalization endpoints require administrator privileges
- **Comprehensive logging**: Detailed FFmpeg output and error handling

### Directory Structure
```
backend/sounds/
├── originals/
│   ├── soundboard/     # SDB type sounds
│   ├── text_to_speech/ # TTS type sounds
│   └── extracted/      # EXT type sounds
└── normalized/
    ├── soundboard/     # Normalized SDB sounds
    ├── text_to_speech/ # Normalized TTS sounds
    └── extracted/      # Normalized EXT sounds
```

### Configuration (Environment Variables)
- `NORMALIZED_AUDIO_FORMAT`: Output format (default: "mp3")
- `NORMALIZED_AUDIO_BITRATE`: Bitrate setting (default: "256k")
- `NORMALIZED_AUDIO_PASSES`: 1 for one-pass, 2 for two-pass (default: 2)

### Database Fields (Sound Model)
- `is_normalized`: Boolean flag indicating normalization status
- `normalized_filename`: Filename of normalized audio file
- `normalized_duration`: Duration in milliseconds of normalized file
- `normalized_size`: File size in bytes of normalized file
- `normalized_hash`: SHA-256 hash of normalized file for integrity

### API Endpoints
- `POST /api/v1/sounds/normalize/all`: Normalize all unnormalized sounds
- `POST /api/v1/sounds/normalize/type/{sound_type}`: Normalize sounds by type
- `POST /api/v1/sounds/normalize/{sound_id}`: Normalize specific sound
- **Parameters**: `force` (re-normalize sounds that were already processed), `one_pass` (use one-pass mode regardless of `NORMALIZED_AUDIO_PASSES`)

### Technical Implementation
- **Service**: `app/services/sound_normalizer.py` - Core normalization logic
- **API**: `app/api/v1/sounds.py` - REST endpoints (consolidated with other sound endpoints)
- **Repository**: Enhanced `app/repositories/sound.py` with normalization queries
- **Dependencies**: Requires FFmpeg installed on system, uses ffmpeg-python library
- **Error Handling**: Graceful fallback for edge cases (silent audio, infinite values)
- **Session Management**: Handles SQLModel session detachment in batch operations

### Normalization Process
1. **Analysis Phase** (two-pass only): Analyze audio characteristics
2. **Validation**: Check for invalid analysis values (inf, -inf, nan)
3. **Fallback Logic**: Switch to one-pass if analysis contains invalid values
4. **Normalization**: Apply loudnorm filter with target levels (I=-23, TP=-2, LRA=7)
5. **Database Update**: Store normalized file metadata and set is_normalized flag

### Testing
- 17 comprehensive service tests covering all normalization scenarios
- 16 API endpoint tests with authentication and authorization checks
- Edge case handling for problematic audio files
- Mock FFmpeg operations for reliable testing

## Sound Scanner System

The application includes a sound scanner service for automatically discovering, importing, and managing audio files in the filesystem.

### Scanner Features
- **File Discovery**: Recursively scans sound directories for audio files
- **Format Support**: Handles multiple audio formats (.mp3, .wav, .flac, .ogg, .m4a, etc.)
- **Metadata Extraction**: Uses FFmpeg to extract duration and file information
- **Database Sync**: Automatically adds new files, updates existing ones, removes deleted files
- **Admin-only Access**: Scanning operations require administrator privileges
- **Comprehensive Reporting**: Detailed results showing added, updated, deleted, and skipped files
- **Duplicate Prevention**: Integration with unique hash constraint system

### Technical Implementation
- **Service**: `app/services/sound_scanner.py` - Core scanning and import logic
- **API**: `app/api/v1/sounds.py` - REST endpoint for scanning operations
- **Dependencies**: Requires FFmpeg for metadata extraction
- **Error Handling**: Graceful handling of corrupted or unreadable files
- **Hash-based Detection**: Uses SHA-256 hashing to detect file changes and prevent duplicates

### Scanning Process
1. **Directory Traversal**: Recursively scan configured sound directories
2. **File Validation**: Check file extensions and accessibility
3. **Metadata Extraction**: Extract duration using FFmpeg, and compute file size and SHA-256 hash
4. **Database Comparison**: Compare with existing database records
5. **Duplicate Detection**: Check unique hash constraint before insertion
6. **Sync Operations**: Add new files, update changed files, remove deleted files
7. **Results Reporting**: Return detailed scan results with statistics
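
The comparison and sync steps can be sketched with a dict standing in for the database. `ScanResult` and `scan_directory` are hypothetical names, files are hashed in one read for brevity, and FFmpeg-based duration extraction is omitted.

```python
import hashlib
from dataclasses import dataclass, field
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}


@dataclass
class ScanResult:
    added: list[str] = field(default_factory=list)
    updated: list[str] = field(default_factory=list)
    deleted: list[str] = field(default_factory=list)
    skipped: list[str] = field(default_factory=list)


def scan_directory(root: Path, known: dict[str, str]) -> ScanResult:
    """Compare audio files under `root` with `known` (filename -> hash).

    `known` stands in for the database and is updated in place.
    """
    result = ScanResult()
    seen: set[str] = set()
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        name = str(path.relative_to(root))
        seen.add(name)
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if known.get(name) == digest:
            result.skipped.append(name)  # unchanged file
        elif name in known:
            known[name] = digest
            result.updated.append(name)  # content changed
        elif digest in known.values():
            result.skipped.append(name)  # duplicate content under a new name
        else:
            known[name] = digest
            result.added.append(name)
    # Anything recorded but no longer on disk is removed.
    for name in sorted(set(known) - seen):
        del known[name]
        result.deleted.append(name)
    return result
```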

### API Endpoints
- `POST /api/v1/sounds/scan`: Scan and sync sound directories

## WebSocket/Socket.IO System

Real-time communication system built on Socket.IO, used for live updates and messaging.

### Socket Features
- **Real-time Communication**: WebSocket-based messaging between users
- **Connection Management**: Track connected users and connection status
- **User-to-User Messaging**: Send messages to specific users
- **Connection Status**: Get current connection status and user count
- **Authentication Integration**: Uses existing user authentication system
- **Credit Change Notifications**: Real-time credit balance updates via `user_credits_changed` events

### Technical Implementation
- **Service**: `app/services/socket.py` - Socket.IO manager and connection handling
- **API**: `app/api/v1/socket.py` - REST endpoints for socket operations
- **Manager**: Centralized socket connection management with user tracking
- **Authentication**: Integrated with existing JWT authentication system
- **Event System**: Structured event emission for various application events

### API Endpoints
- `GET /api/v1/socket/status`: Get current socket connection status
- `POST /api/v1/socket/send-message`: Send a message to a specific user via WebSocket

### Socket Events
- **Connection Management**: Connection and disconnection tracking
- **User Messages**: User-specific message routing
- **Credit Updates**: `user_credits_changed` events with detailed transaction data
- **Real-time Status**: Live application status updates

## Audio Utilities

Shared utility functions for audio file processing used across multiple services.

### Audio Utility Functions
- **File Hashing**: `get_file_hash()` - Calculate SHA-256 hash of audio files for integrity checking
- **File Size**: `get_file_size()` - Get file size in bytes for metadata storage
- **Duration Extraction**: `get_audio_duration()` - Extract audio duration in milliseconds using FFmpeg
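
The two filesystem helpers could be written along these lines. The function names come from the list above, but the signatures (a `Path` argument, an optional `chunk_size`) are assumptions; `get_audio_duration()` is not shown because it depends on FFmpeg being installed.

```python
import hashlib
from pathlib import Path


def get_file_hash(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def get_file_size(path: Path) -> int:
    """Return the file size in bytes."""
    return path.stat().st_size
```

Reading in chunks keeps memory use flat for large audio files, and the digest does not depend on the chunk size.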

### Technical Implementation
- **Module**: `app/utils/audio.py` - Shared audio processing utilities
- **Dependencies**: Uses FFmpeg via ffmpeg-python for duration extraction
- **Error Handling**: Graceful fallback for corrupted or unreadable files
- **Consistent Interface**: Same function signatures across all audio services

### Usage
- **Sound Scanner**: Uses utilities for file discovery and metadata extraction
- **Sound Normalizer**: Uses utilities for normalized file verification and metadata
- **Audio Extraction**: Uses utilities for extracted audio file metadata and validation
- **Duplicate Prevention**: Hash calculation for unique constraint enforcement
- **Centralized Logic**: Eliminates code duplication between audio processing services

## Audio Extraction System

The application includes a comprehensive audio extraction system for downloading and processing audio content from external services using yt-dlp.

### Extraction Features
- **Immediate Response**: API endpoints return immediately without waiting for yt-dlp processing
- **Background Processing**: Actual extraction happens asynchronously in background worker threads
- **Multi-Service Support**: Supports YouTube, SoundCloud, Vimeo, DailyMotion, TikTok, Twitter, Instagram
- **Non-blocking Operations**: yt-dlp operations run in thread pools to prevent event loop blocking
- **Concurrent Processing**: Configurable maximum concurrent extractions with queue management
- **Automatic Normalization**: Extracted audio is automatically normalized using the sound normalization system
- **Error Handling**: Comprehensive error handling with detailed logging and status tracking
- **Credit Integration**: Automatic credit deduction for extraction operations

### Database Schema (Extraction Model)
- **Flexible Service Detection**: `service` and `service_id` are nullable during creation, populated during processing
- **Status Tracking**: `pending` → `processing` → `completed`/`failed`
- **Metadata Storage**: URL, title, user association, linked sound record
- **Error Logging**: Detailed error messages for failed extractions
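
The status flow can be enforced with a small transition table. This is a sketch: the four status values come from the list above, while the enum, the table, and the `advance` helper are hypothetical.

```python
from enum import Enum


class ExtractionStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves, following pending -> processing -> completed/failed.
ALLOWED_TRANSITIONS: dict[ExtractionStatus, set[ExtractionStatus]] = {
    ExtractionStatus.PENDING: {ExtractionStatus.PROCESSING},
    ExtractionStatus.PROCESSING: {ExtractionStatus.COMPLETED, ExtractionStatus.FAILED},
    ExtractionStatus.COMPLETED: set(),
    ExtractionStatus.FAILED: set(),
}


def advance(current: ExtractionStatus, target: ExtractionStatus) -> ExtractionStatus:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```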

### Directory Structure
```
backend/sounds/temp/          # Temporary extraction workspace
backend/sounds/originals/extracted/  # Final extracted audio files
backend/sounds/originals/extracted/thumbnails/  # Extracted thumbnails
```

### Configuration (Environment Variables)
- `EXTRACTION_AUDIO_FORMAT`: Output audio format (default: "mp3")
- `EXTRACTION_AUDIO_BITRATE`: Audio bitrate setting (default: "256k")
- `EXTRACTION_TEMP_DIR`: Temporary extraction directory (default: "sounds/temp")
- `EXTRACTION_THUMBNAILS_DIR`: Thumbnail storage directory (default: "sounds/originals/extracted/thumbnails")
- `EXTRACTION_MAX_CONCURRENT`: Maximum concurrent extractions (default: 2)

### API Endpoints
- `POST /api/v1/extractions/`: Create extraction job (immediate response)
- `GET /api/v1/admin/extractions/status`: Get extraction processor status (admin only)
- `GET /api/v1/extractions/{extraction_id}`: Get specific extraction info
- `GET /api/v1/extractions/`: Get user's extraction history

### Technical Implementation
- **Service**: `app/services/extraction.py` - Core extraction logic with async yt-dlp operations
- **Processor**: `app/services/extraction_processor.py` - Background queue manager with concurrency control
- **Repository**: `app/repositories/extraction.py` - Database operations for extraction records
- **API**: `app/api/v1/extractions.py` - Dedicated extraction API endpoints, `app/api/v1/admin/extractions.py` - Admin extraction endpoints
- **Dependencies**: Requires yt-dlp for media extraction, FFmpeg for audio processing
- **Async Operations**: All blocking I/O operations wrapped in `asyncio.to_thread()` for non-blocking execution

### Extraction Process
1. **Creation**: Immediate API response with extraction record (service info null)
2. **Queue**: Background processor picks up pending extractions
3. **Service Detection**: yt-dlp identifies service and media metadata (non-blocking)
4. **Duplicate Check**: Verify no existing extraction for same service/media
5. **Media Download**: Extract audio and thumbnails using yt-dlp (non-blocking)
6. **File Processing**: Move files to final locations with sanitized names
7. **Sound Creation**: Create Sound database record with metadata and unique hash
8. **Normalization**: Automatically normalize extracted audio
9. **Status Update**: Mark extraction as completed with sound association

### Concurrency and Performance
- **Thread Pool Execution**: yt-dlp operations run in separate threads
- **Queue Management**: Background processor manages extraction queue
- **Concurrent Limits**: Configurable maximum concurrent extractions
- **Non-blocking API**: Other endpoints remain responsive during extraction
- **Resource Management**: Automatic cleanup of temporary files
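
The combination of a concurrency limit and thread offloading can be sketched with `asyncio.Semaphore` and `asyncio.to_thread()`. `blocking_download` is a placeholder that only sleeps, standing in for a blocking yt-dlp call, and `process_all` is a hypothetical name; the real processor also manages a queue and database records.

```python
import asyncio
import time

EXTRACTION_MAX_CONCURRENT = 2  # documented default


def blocking_download(url: str) -> str:
    """Placeholder for a blocking yt-dlp call; it only sleeps here."""
    time.sleep(0.05)
    return f"done:{url}"


async def process_all(urls: list[str], limit: int = EXTRACTION_MAX_CONCURRENT) -> list[str]:
    """Run blocking jobs in worker threads, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def process_one(url: str) -> str:
        async with semaphore:
            # to_thread keeps the event loop free while the job blocks.
            return await asyncio.to_thread(blocking_download, url)

    return await asyncio.gather(*(process_one(url) for url in urls))
```

Calling `asyncio.run(process_all(["a", "b", "c"]))` runs two jobs at once and starts the third when a slot frees up; results come back in input order.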

### Error Handling
- **Service Detection Failures**: Invalid URLs handled gracefully during processing
- **Download Failures**: Network issues, geo-restrictions, or unavailable content
- **Processing Failures**: File system errors, FFmpeg issues, or corruption
- **Duplicate Prevention**: Service-level duplicate detection during processing
- **Comprehensive Logging**: Detailed error messages and extraction status tracking

### API Organization
- **Dedicated Extraction Endpoints**: Extraction functionality separated into `/api/v1/extractions/` for better organization
- **Admin Separation**: Admin-only endpoints moved to `/api/v1/admin/extractions/` for proper access control
- **Consistent URL Structure**: RESTful endpoint design following FastAPI best practices
- **Router Registration**: Proper router mounting and tag organization for API documentation

### Testing
- **16 comprehensive service tests** covering all extraction scenarios including async operations
- **API endpoint tests** with authentication and background processing validation
- **Error handling tests** for various failure scenarios
- **Mock yt-dlp operations** for reliable testing without network dependencies
- **Concurrency tests** validating non-blocking behavior and thread pool execution
- **Endpoint migration tests** ensuring proper URL routing and authentication

## Data Integrity & Performance

### Database Constraints
- **Sound Hash Uniqueness**: Prevents duplicate audio files via unique hash constraint
- **OAuth Provider Uniqueness**: Prevents duplicate OAuth connections per provider
- **Foreign Key Integrity**: Proper cascading relationships between all models
- **Index Optimization**: Strategic indexing for common query patterns

### Type Safety & Code Quality
- **Full mypy Compliance**: Complete type checking across all Python code
- **Async/Await Patterns**: Proper async programming throughout the stack
- **Error Handling**: Comprehensive exception handling with detailed logging
- **Test Coverage**: 95+ comprehensive tests with 100% critical path coverage including repository, service, and integration tests

### Performance Optimizations
- **Lazy Loading Management**: Proper SQLAlchemy relationship loading
- **Query Optimization**: Efficient database queries with pagination support
- **Background Processing**: Non-blocking operations for expensive tasks
- **Resource Management**: Proper cleanup of temporary files and connections

## Development Best Practices

### Code Organization
- **Repository Pattern**: Clean separation of data access logic
- **Service Layer**: Business logic encapsulation with dependency injection
- **Type Safety**: Comprehensive type annotations and mypy compliance
- **Error Handling**: Structured exception handling with proper logging

### Testing Strategy
- **Unit Tests**: Comprehensive repository and service layer testing
- **Integration Tests**: End-to-end API testing with authentication
- **Async Testing**: Proper async/await testing patterns with pytest-asyncio
- **Mock Strategies**: External service mocking for reliable testing

### Security & Authentication
- **JWT Token Management**: Secure token-based authentication
- **OAuth Integration**: Third-party authentication with proper scoping
- **Role-based Access**: Admin/user role separation for sensitive operations
- **Input Validation**: Comprehensive request validation with Pydantic schemas

### Monitoring & Logging
- **Structured Logging**: Consistent logging patterns across all services
- **Error Tracking**: Comprehensive exception logging with context
- **Performance Monitoring**: Request timing and resource usage tracking
- **Audit Trails**: Complete transaction history for credit and user operations