Configuration

Configure the Text Classification API for different environments and use cases.

Environment Variables

The API supports configuration through environment variables. Create a .env file in the api directory:

# Model Configuration
MODEL_PATH=../final_best_model.pkl
VECTORIZER_PATH=../tfidf_vectorizer.pkl

# Performance Settings
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=10000
ENABLE_METRICS=true

# Server Configuration
PORT=8000
HOST=0.0.0.0

# Development Settings
DEBUG=false
LOG_LEVEL=INFO
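As a rough sketch of how these variables are typically consumed, the snippet below reads each setting from the environment with the defaults listed in this section. The function name and parsing details are illustrative, not the API's actual loading code:

```python
import os

def load_config(env=os.environ):
    """Read the settings above from environment variables, applying the
    documented defaults. Boolean flags accept 'true'/'false' strings."""
    return {
        "model_path": env.get("MODEL_PATH", "../final_best_model.pkl"),
        "vectorizer_path": env.get("VECTORIZER_PATH", "../tfidf_vectorizer.pkl"),
        "max_batch_size": int(env.get("MAX_BATCH_SIZE", "50")),
        "max_text_length": int(env.get("MAX_TEXT_LENGTH", "10000")),
        "enable_metrics": env.get("ENABLE_METRICS", "true").lower() == "true",
        "port": int(env.get("PORT", "8000")),
        "host": env.get("HOST", "0.0.0.0"),
        "debug": env.get("DEBUG", "false").lower() == "true",
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }

# Unset variables fall back to their defaults:
config = load_config({"MAX_BATCH_SIZE": "25", "DEBUG": "true"})
```

Here `config["max_batch_size"]` is 25 and `config["debug"]` is True, while `config["port"]` keeps its default of 8000.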

Configuration Options

Model Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| MODEL_PATH | ../final_best_model.pkl | Path to the trained model file |
| VECTORIZER_PATH | ../tfidf_vectorizer.pkl | Path to the TF-IDF vectorizer file |

Performance Settings

| Variable | Default | Description |
|----------|---------|-------------|
| MAX_BATCH_SIZE | 50 | Maximum number of texts per batch request |
| MAX_TEXT_LENGTH | 10000 | Maximum characters allowed per text input |
| ENABLE_METRICS | true | Enable performance metrics endpoint |
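Because MAX_BATCH_SIZE caps the number of texts per request, clients classifying a large collection should split it into batches. A minimal client-side sketch (the helper name is hypothetical):

```python
def chunk_texts(texts, max_batch_size=50):
    """Split a list of texts into batches no larger than the server's
    MAX_BATCH_SIZE, so each request stays within the limit."""
    return [texts[i:i + max_batch_size] for i in range(0, len(texts), max_batch_size)]

batches = chunk_texts([f"text {i}" for i in range(120)], max_batch_size=50)
# 120 texts -> three batches of 50, 50, and 20
```

Each batch can then be sent as a separate request to the classification endpoint.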

Server Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 8000 | Server port number |
| HOST | 0.0.0.0 | Server host address |

Development Settings

| Variable | Default | Description |
|----------|---------|-------------|
| DEBUG | false | Enable debug mode |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |

Free Tier Optimization

For deployment on free tier platforms, use these optimized settings:

# Memory-optimized settings
MAX_BATCH_SIZE=25
MAX_TEXT_LENGTH=5000
ENABLE_METRICS=false

# Server and logging settings
PORT=8000
LOG_LEVEL=WARNING

Docker Configuration

Environment File

Create api/.env for Docker deployments:

MODEL_PATH=/app/final_best_model.pkl
VECTORIZER_PATH=/app/tfidf_vectorizer.pkl
MAX_BATCH_SIZE=25
ENABLE_METRICS=false
PORT=8000

Docker Compose Override

Create api/docker-compose.override.yml for local development:

version: '3.8'
services:
  text-classifier-api:
    environment:
      - DEBUG=true
      - LOG_LEVEL=DEBUG
      - MAX_BATCH_SIZE=10
    volumes:
      - ../final_best_model.pkl:/app/final_best_model.pkl
      - ../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl

Production Configuration

Security Settings

# Disable debug mode
DEBUG=false

# Set appropriate log level
LOG_LEVEL=WARNING

# Limit batch size for stability
MAX_BATCH_SIZE=25

# Enable metrics for monitoring
ENABLE_METRICS=true

Performance Tuning

# Optimize for your server capacity
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=5000

# Enable metrics for monitoring
ENABLE_METRICS=true

Configuration Validation

The API validates its configuration on startup. Invalid values cause the application to fail fast with a clear error message rather than starting in a broken state.
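The fail-fast behavior can be sketched as below. These particular checks are illustrative; the API's actual startup validation may differ in detail:

```python
import os

def validate_config(model_path, vectorizer_path, max_batch_size, port):
    """Collect all configuration problems and fail fast with one clear message."""
    errors = []
    if not os.path.isfile(model_path):
        errors.append(f"Model file not found: {model_path}")
    if not os.path.isfile(vectorizer_path):
        errors.append(f"Vectorizer file not found: {vectorizer_path}")
    if max_batch_size < 1:
        errors.append("MAX_BATCH_SIZE must be a positive integer")
    if not 1 <= port <= 65535:
        errors.append(f"PORT out of range: {port}")
    if errors:
        raise RuntimeError("Invalid configuration:\n" + "\n".join(errors))

# Example: a zero batch size and an out-of-range port are both reported at once.
try:
    validate_config("missing.pkl", "missing.pkl", 0, 99999)
except RuntimeError as exc:
    message = str(exc)
```

Reporting every problem in one pass saves a restart cycle per misconfigured variable.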

Common Configuration Issues

  1. Model files not found
     Solution: Ensure MODEL_PATH and VECTORIZER_PATH point to valid, readable files

  2. Port already in use
     Solution: Change PORT to an available port

  3. Memory limits exceeded
     Solution: Reduce MAX_BATCH_SIZE and MAX_TEXT_LENGTH

Runtime Configuration

Changing settings at runtime through API calls is planned but not yet available. Currently, all configuration must be set before starting the server; changes take effect only after a restart.

Configuration for Different Environments

Development

DEBUG=true
LOG_LEVEL=DEBUG
MAX_BATCH_SIZE=10
ENABLE_METRICS=true

Staging

DEBUG=false
LOG_LEVEL=INFO
MAX_BATCH_SIZE=25
ENABLE_METRICS=true

Production

DEBUG=false
LOG_LEVEL=WARNING
MAX_BATCH_SIZE=50
ENABLE_METRICS=true

Free Tier

DEBUG=false
LOG_LEVEL=WARNING
MAX_BATCH_SIZE=25
MAX_TEXT_LENGTH=5000
ENABLE_METRICS=false
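One convenient way to manage these per-environment profiles is to keep them in a single mapping and select one by name at deploy time. The helper below is a hypothetical sketch; the values simply mirror the four profiles listed above:

```python
# Per-environment profiles, copied from the settings listed in this section.
PROFILES = {
    "development": {"DEBUG": "true", "LOG_LEVEL": "DEBUG",
                    "MAX_BATCH_SIZE": "10", "ENABLE_METRICS": "true"},
    "staging":     {"DEBUG": "false", "LOG_LEVEL": "INFO",
                    "MAX_BATCH_SIZE": "25", "ENABLE_METRICS": "true"},
    "production":  {"DEBUG": "false", "LOG_LEVEL": "WARNING",
                    "MAX_BATCH_SIZE": "50", "ENABLE_METRICS": "true"},
    "free_tier":   {"DEBUG": "false", "LOG_LEVEL": "WARNING",
                    "MAX_BATCH_SIZE": "25", "MAX_TEXT_LENGTH": "5000",
                    "ENABLE_METRICS": "false"},
}

def profile_for(env_name):
    """Return the environment-variable overrides for a named environment."""
    return PROFILES[env_name.lower()]
```

For example, `profile_for("production")["MAX_BATCH_SIZE"]` yields "50", ready to be written into a .env file or injected into a container.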

Monitoring Configuration

When ENABLE_METRICS=true, additional monitoring features become available:

  • GET /metrics - Performance metrics
  • Enhanced health checks with memory monitoring
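A monitoring script might poll /metrics periodically and flag high memory usage. The sketch below assumes a JSON response containing a memory field; the "memory_mb" field name and the 450 MB threshold are assumptions, not part of the documented API:

```python
import json
from urllib.request import urlopen

def fetch_metrics(base_url="http://localhost:8000"):
    """Fetch and decode the /metrics payload (requires ENABLE_METRICS=true)."""
    with urlopen(f"{base_url}/metrics") as resp:
        return json.loads(resp.read())

def memory_warning(metrics, limit_mb=450):
    """Return True when reported memory nears a limit. The 'memory_mb'
    field name is an assumption about the response shape."""
    return metrics.get("memory_mb", 0) > limit_mb
```

Against a running server this would be used as `memory_warning(fetch_metrics())`, for example to alert before hitting a 512 MB free-tier memory cap.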

Troubleshooting

Configuration Loading Issues

  1. Check that .env file exists in the correct location
  2. Verify environment variable names match exactly
  3. Ensure file paths are correct and accessible
  4. Check file permissions for model files

Performance Issues

  1. Monitor memory usage with /health endpoint
  2. Adjust MAX_BATCH_SIZE based on available RAM
  3. Check /metrics for performance bottlenecks
  4. Consider enabling/disabling metrics based on needs