Configuration

Configure the Text Classification API for different environments and use cases.

Environment Variables

The API supports configuration through environment variables. Create a .env file in the api directory:

# Model Configuration
MODEL_PATH=../final_best_model.pkl
VECTORIZER_PATH=../tfidf_vectorizer.pkl

# Performance Settings
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=10000
ENABLE_METRICS=true

# Server Configuration
PORT=8000
HOST=0.0.0.0

# Development Settings
DEBUG=false
LOG_LEVEL=INFO
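As a rough sketch of how these variables are typically consumed, the snippet below reads each setting from the environment with the defaults listed in this section. The function name and parsing details are illustrative, not the API's actual loading code:

```python
import os

def load_config(env=os.environ):
    """Read the settings above from environment variables, applying the
    documented defaults. Boolean flags accept 'true'/'false' strings."""
    return {
        "model_path": env.get("MODEL_PATH", "../final_best_model.pkl"),
        "vectorizer_path": env.get("VECTORIZER_PATH", "../tfidf_vectorizer.pkl"),
        "max_batch_size": int(env.get("MAX_BATCH_SIZE", "50")),
        "max_text_length": int(env.get("MAX_TEXT_LENGTH", "10000")),
        "enable_metrics": env.get("ENABLE_METRICS", "true").lower() == "true",
        "port": int(env.get("PORT", "8000")),
        "host": env.get("HOST", "0.0.0.0"),
        "debug": env.get("DEBUG", "false").lower() == "true",
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }

# Unset variables fall back to their defaults:
config = load_config({"MAX_BATCH_SIZE": "25", "DEBUG": "true"})
```

Here `config["max_batch_size"]` is 25 and `config["debug"]` is True, while `config["port"]` keeps its default of 8000.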

Configuration Options

Model Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| MODEL_PATH | ../final_best_model.pkl | Path to the trained model file |
| VECTORIZER_PATH | ../tfidf_vectorizer.pkl | Path to the TF-IDF vectorizer file |

Performance Settings

| Variable | Default | Description |
|----------|---------|-------------|
| MAX_BATCH_SIZE | 50 | Maximum number of texts per batch request |
| MAX_TEXT_LENGTH | 10000 | Maximum characters allowed per text input |
| ENABLE_METRICS | true | Enable performance metrics endpoint |
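Because MAX_BATCH_SIZE caps the number of texts per request, clients classifying a large collection should split it into batches. A minimal client-side sketch (the helper name is hypothetical):

```python
def chunk_texts(texts, max_batch_size=50):
    """Split a list of texts into batches no larger than the server's
    MAX_BATCH_SIZE, so each request stays within the limit."""
    return [texts[i:i + max_batch_size] for i in range(0, len(texts), max_batch_size)]

batches = chunk_texts([f"text {i}" for i in range(120)], max_batch_size=50)
# 120 texts -> three batches of 50, 50, and 20
```

Each batch can then be sent as a separate request to the classification endpoint.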

Server Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 8000 | Server port number |
| HOST | 0.0.0.0 | Server host address |

Development Settings

| Variable | Default | Description |
|----------|---------|-------------|
| DEBUG | false | Enable debug mode |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |

Free Tier Optimization

For deployment on free tier platforms, use these optimized settings:

# Memory-optimized settings
MAX_BATCH_SIZE=25
MAX_TEXT_LENGTH=5000
ENABLE_METRICS=false

# Server and logging settings
PORT=8000
LOG_LEVEL=WARNING

Docker Configuration

Environment File

Create api/.env for Docker deployments:

MODEL_PATH=/app/final_best_model.pkl
VECTORIZER_PATH=/app/tfidf_vectorizer.pkl
MAX_BATCH_SIZE=25
ENABLE_METRICS=false
PORT=8000

Docker Compose Override

Create api/docker-compose.override.yml for local development:

version: '3.8'
services:
  text-classifier-api:
    environment:
      - DEBUG=true
      - LOG_LEVEL=DEBUG
      - MAX_BATCH_SIZE=10
    volumes:
      - ../final_best_model.pkl:/app/final_best_model.pkl
      - ../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl

Production Configuration

Security Settings

# Disable debug mode
DEBUG=false

# Set appropriate log level
LOG_LEVEL=WARNING

# Limit batch size for stability
MAX_BATCH_SIZE=25

# Enable metrics for monitoring
ENABLE_METRICS=true

Performance Tuning

# Optimize for your server capacity
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=5000

# Enable metrics for monitoring
ENABLE_METRICS=true

Configuration Validation

The API validates its configuration on startup. Invalid values cause the application to fail fast with a clear error message rather than starting in a broken state.
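The fail-fast behavior can be sketched as below. These particular checks are illustrative; the API's actual startup validation may differ in detail:

```python
import os

def validate_config(model_path, vectorizer_path, max_batch_size, port):
    """Collect all configuration problems and fail fast with one clear message."""
    errors = []
    if not os.path.isfile(model_path):
        errors.append(f"Model file not found: {model_path}")
    if not os.path.isfile(vectorizer_path):
        errors.append(f"Vectorizer file not found: {vectorizer_path}")
    if max_batch_size < 1:
        errors.append("MAX_BATCH_SIZE must be a positive integer")
    if not 1 <= port <= 65535:
        errors.append(f"PORT out of range: {port}")
    if errors:
        raise RuntimeError("Invalid configuration:\n" + "\n".join(errors))

# Example: a zero batch size and an out-of-range port are both reported at once.
try:
    validate_config("missing.pkl", "missing.pkl", 0, 99999)
except RuntimeError as exc:
    message = str(exc)
```

Reporting every problem in one pass saves a restart cycle per misconfigured variable.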

Common Configuration Issues

  1. Model files not found
     Solution: Ensure MODEL_PATH and VECTORIZER_PATH point to valid, readable files

  2. Port already in use
     Solution: Change PORT to an available port

  3. Memory limits exceeded
     Solution: Reduce MAX_BATCH_SIZE and MAX_TEXT_LENGTH

Runtime Configuration

Changing settings at runtime through API calls is planned but not yet available. Currently, all configuration must be set before starting the server; changes take effect only after a restart.

Configuration for Different Environments

Development

DEBUG=true
LOG_LEVEL=DEBUG
MAX_BATCH_SIZE=10
ENABLE_METRICS=true

Staging

DEBUG=false
LOG_LEVEL=INFO
MAX_BATCH_SIZE=25
ENABLE_METRICS=true

Production

DEBUG=false
LOG_LEVEL=WARNING
MAX_BATCH_SIZE=50
ENABLE_METRICS=true

Free Tier

DEBUG=false
LOG_LEVEL=WARNING
MAX_BATCH_SIZE=25
MAX_TEXT_LENGTH=5000
ENABLE_METRICS=false
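One convenient way to manage these per-environment profiles is to keep them in a single mapping and select one by name at deploy time. The helper below is a hypothetical sketch; the values simply mirror the four profiles listed above:

```python
# Per-environment profiles, copied from the settings listed in this section.
PROFILES = {
    "development": {"DEBUG": "true", "LOG_LEVEL": "DEBUG",
                    "MAX_BATCH_SIZE": "10", "ENABLE_METRICS": "true"},
    "staging":     {"DEBUG": "false", "LOG_LEVEL": "INFO",
                    "MAX_BATCH_SIZE": "25", "ENABLE_METRICS": "true"},
    "production":  {"DEBUG": "false", "LOG_LEVEL": "WARNING",
                    "MAX_BATCH_SIZE": "50", "ENABLE_METRICS": "true"},
    "free_tier":   {"DEBUG": "false", "LOG_LEVEL": "WARNING",
                    "MAX_BATCH_SIZE": "25", "MAX_TEXT_LENGTH": "5000",
                    "ENABLE_METRICS": "false"},
}

def profile_for(env_name):
    """Return the environment-variable overrides for a named environment."""
    return PROFILES[env_name.lower()]
```

For example, `profile_for("production")["MAX_BATCH_SIZE"]` yields "50", ready to be written into a .env file or injected into a container.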

Monitoring Configuration

When ENABLE_METRICS=true, additional monitoring features become available:

  • GET /metrics - Performance metrics
  • Enhanced health checks with memory monitoring
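A monitoring script might poll /metrics periodically and flag high memory usage. The sketch below assumes a JSON response containing a memory field; the "memory_mb" field name and the 450 MB threshold are assumptions, not part of the documented API:

```python
import json
from urllib.request import urlopen

def fetch_metrics(base_url="http://localhost:8000"):
    """Fetch and decode the /metrics payload (requires ENABLE_METRICS=true)."""
    with urlopen(f"{base_url}/metrics") as resp:
        return json.loads(resp.read())

def memory_warning(metrics, limit_mb=450):
    """Return True when reported memory nears a limit. The 'memory_mb'
    field name is an assumption about the response shape."""
    return metrics.get("memory_mb", 0) > limit_mb
```

Against a running server this would be used as `memory_warning(fetch_metrics())`, for example to alert before hitting a 512 MB free-tier memory cap.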

Troubleshooting

Configuration Loading Issues

  1. Check that .env file exists in the correct location
  2. Verify environment variable names match exactly
  3. Ensure file paths are correct and accessible
  4. Check file permissions for model files

Performance Issues

  1. Monitor memory usage with /health endpoint
  2. Adjust MAX_BATCH_SIZE based on available RAM
  3. Check /metrics for performance bottlenecks
  4. Consider enabling/disabling metrics based on needs