Configuration¶
Configure the Text Classification API for different environments and use cases.
Environment Variables¶
The API supports configuration through environment variables. Create a .env file in the api directory:
# Model Configuration
MODEL_PATH=../final_best_model.pkl
VECTORIZER_PATH=../tfidf_vectorizer.pkl
# Performance Settings
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=10000
ENABLE_METRICS=true
# Server Configuration
PORT=8000
HOST=0.0.0.0
# Development Settings
DEBUG=false
LOG_LEVEL=INFO
Configuration Options¶
Model Configuration¶
| Variable | Default | Description |
|---|---|---|
| MODEL_PATH | ../final_best_model.pkl | Path to the trained model file |
| VECTORIZER_PATH | ../tfidf_vectorizer.pkl | Path to the TF-IDF vectorizer file |
Performance Settings¶
| Variable | Default | Description |
|---|---|---|
| MAX_BATCH_SIZE | 50 | Maximum number of texts per batch request |
| MAX_TEXT_LENGTH | 10000 | Maximum characters allowed per text input |
| ENABLE_METRICS | true | Enable performance metrics endpoint |
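To make the two size limits concrete, here is a minimal sketch of how a batch could be checked against them. The helper name `check_batch` is hypothetical and the real API may enforce these limits differently; only the variable names and defaults come from the table above.

```python
import os

# Defaults mirror the Performance Settings table.
MAX_BATCH_SIZE = int(os.environ.get("MAX_BATCH_SIZE", "50"))
MAX_TEXT_LENGTH = int(os.environ.get("MAX_TEXT_LENGTH", "10000"))


def check_batch(texts, max_batch_size=MAX_BATCH_SIZE, max_text_length=MAX_TEXT_LENGTH):
    """Return a list of limit violations (empty if the batch is acceptable).

    Hypothetical helper for illustration, not the API's own validation code.
    """
    problems = []
    if len(texts) > max_batch_size:
        problems.append(f"batch has {len(texts)} texts, limit is {max_batch_size}")
    for i, text in enumerate(texts):
        if len(text) > max_text_length:
            problems.append(f"text {i} has {len(text)} characters, limit is {max_text_length}")
    return problems
```

A client could run a check like this before sending a batch request and split the batch when it reports a violation.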
Server Configuration¶
| Variable | Default | Description |
|---|---|---|
| PORT | 8000 | Server port number |
| HOST | 0.0.0.0 | Server host address |
Development Settings¶
| Variable | Default | Description |
|---|---|---|
| DEBUG | false | Enable debug mode |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
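As an illustration of how the variables and defaults listed above fit together, the following sketch reads them into a single settings object. The `Settings` class, `load_settings` function, and the accepted boolean spellings are assumptions made for this example; the API's actual loading code may look different. The sketch reads from the process environment only, so loading the .env file into the environment is left to whatever mechanism the project uses.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy spellings; anything else is False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    model_path: str
    vectorizer_path: str
    max_batch_size: int
    max_text_length: int
    enable_metrics: bool
    port: int
    host: str
    debug: bool
    log_level: str


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default) using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        model_path=env.get("MODEL_PATH", "../final_best_model.pkl"),
        vectorizer_path=env.get("VECTORIZER_PATH", "../tfidf_vectorizer.pkl"),
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "50")),
        max_text_length=int(env.get("MAX_TEXT_LENGTH", "10000")),
        enable_metrics=_as_bool(env.get("ENABLE_METRICS", "true")),
        port=int(env.get("PORT", "8000")),
        host=env.get("HOST", "0.0.0.0"),
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to try different combinations without touching the real environment.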
Free Tier Optimization¶
For deployment on free tier platforms, use these optimized settings:
# Memory-optimized settings
MAX_BATCH_SIZE=25
MAX_TEXT_LENGTH=5000
ENABLE_METRICS=false
# Performance settings
PORT=8000
LOG_LEVEL=WARNING
Docker Configuration¶
Environment File¶
Create api/.env for Docker deployments:
MODEL_PATH=/app/final_best_model.pkl
VECTORIZER_PATH=/app/tfidf_vectorizer.pkl
MAX_BATCH_SIZE=25
ENABLE_METRICS=false
PORT=8000
Docker Compose Override¶
Create api/docker-compose.override.yml for local development:
version: '3.8'
services:
  text-classifier-api:
    environment:
      - DEBUG=true
      - LOG_LEVEL=DEBUG
      - MAX_BATCH_SIZE=10
    volumes:
      - ../final_best_model.pkl:/app/final_best_model.pkl
      - ../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl
Production Configuration¶
Security Settings¶
# Disable debug mode
DEBUG=false
# Set appropriate log level
LOG_LEVEL=WARNING
# Limit batch size for stability
MAX_BATCH_SIZE=25
# Enable metrics for monitoring
ENABLE_METRICS=true
Performance Tuning¶
# Optimize for your server capacity
MAX_BATCH_SIZE=50
MAX_TEXT_LENGTH=5000
# Enable metrics for monitoring
ENABLE_METRICS=true
Configuration Validation¶
The API validates configuration on startup. Invalid values will cause the application to fail fast with clear error messages.
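The exact checks the API performs are not listed here, so the following is only a sketch of what fail-fast validation could look like. The function names `validate_config` and `fail_fast`, the specific rules (positive integers, known log levels), and the message wording are all assumptions.

```python
import sys

# Log levels taken from the LOG_LEVEL description in this page.
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def validate_config(env):
    """Return a list of error messages for invalid values; an empty list means valid."""
    errors = []
    for name in ("MAX_BATCH_SIZE", "MAX_TEXT_LENGTH", "PORT"):
        raw = env.get(name)
        if raw is None:
            continue  # unset variables fall back to their defaults
        try:
            value = int(raw)
        except ValueError:
            errors.append(f"{name} must be an integer, got {raw!r}")
            continue
        if value <= 0:
            errors.append(f"{name} must be positive, got {value}")
    level = env.get("LOG_LEVEL")
    if level is not None and level not in VALID_LOG_LEVELS:
        errors.append(f"LOG_LEVEL must be one of {sorted(VALID_LOG_LEVELS)}, got {level!r}")
    return errors


def fail_fast(env):
    """Print every problem to stderr and exit with status 1 if the configuration is invalid."""
    errors = validate_config(env)
    if errors:
        for message in errors:
            print(f"Configuration error: {message}", file=sys.stderr)
        sys.exit(1)
```

Collecting all errors before exiting means a single startup attempt reports every invalid value, rather than one per restart.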
Common Configuration Issues¶
- Model files not found
- Port already in use
- Memory limits exceeded
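The first two issues can be checked before starting the server. The sketch below uses only the Python standard library; the helper names `missing_files` and `port_is_free` are hypothetical and not part of the API. It does not cover memory limits, which depend on the hosting platform.

```python
import os
import socket


def missing_files(paths):
    """Return the paths that do not point to an existing regular file."""
    return [p for p in paths if not os.path.isfile(p)]


def port_is_free(host, port):
    """Try to bind to (host, port); a failed bind usually means the port is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # Paths and port follow the documented defaults; adjust to your .env values.
    missing = missing_files(["../final_best_model.pkl", "../tfidf_vectorizer.pkl"])
    if missing:
        print("Missing model files:", ", ".join(missing))
    if not port_is_free("0.0.0.0", 8000):
        print("Port 8000 appears to be in use")
```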
Runtime Configuration¶
Changing settings at runtime through API calls is a planned feature and is not available yet. Currently, all configuration must be set before starting the server.
Configuration for Different Environments¶
Development¶
Staging¶
Production¶
Free Tier¶
Monitoring Configuration¶
When ENABLE_METRICS=true, additional endpoints become available:
- GET /metrics - Performance metrics
- Enhanced health checks with memory monitoring
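One way to confirm whether metrics are enabled on a running instance is to probe the endpoint from a client. The sketch below uses the standard library and assumes that a disabled metrics endpoint answers with a non-200 status (for example 404); that behavior is an assumption, not something this page confirms. The function name `metrics_available` is hypothetical.

```python
import urllib.error
import urllib.request


def metrics_available(base_url, timeout=5.0):
    """Return True if GET {base_url}/metrics answers with HTTP 200."""
    url = base_url.rstrip("/") + "/metrics"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP error statuses (HTTPError is a subclass of URLError)
        # as well as servers that cannot be reached.
        return False
```

For a local server started with the default port, the call would be `metrics_available("http://localhost:8000")`.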
Troubleshooting¶
Configuration Loading Issues¶
- Check that .env file exists in the correct location
- Verify environment variable names match exactly
- Ensure file paths are correct and accessible
- Check file permissions for model files
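Misspelled variable names are easy to miss because an unrecognized name is typically just ignored. The following sketch parses .env-style text and reports names that are not among the variables documented on this page, suggesting the closest documented name where one exists. The helpers `parse_env_text` and `unknown_variables` are hypothetical, and the parser handles only simple KEY=VALUE lines (no quoting or multi-line values).

```python
import difflib

# Variable names documented on this page.
KNOWN_VARIABLES = [
    "MODEL_PATH", "VECTORIZER_PATH", "MAX_BATCH_SIZE", "MAX_TEXT_LENGTH",
    "ENABLE_METRICS", "PORT", "HOST", "DEBUG", "LOG_LEVEL",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unknown_variables(values):
    """Map each unrecognized name to its closest documented name, or None."""
    report = {}
    for name in values:
        if name not in KNOWN_VARIABLES:
            close = difflib.get_close_matches(name, KNOWN_VARIABLES, n=1)
            report[name] = close[0] if close else None
    return report
```

To check a real file, read it first, for example `unknown_variables(parse_env_text(open("api/.env").read()))`.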
Performance Issues¶
- Monitor memory usage with /health endpoint
- Adjust MAX_BATCH_SIZE based on available RAM
- Check /metrics for performance bottlenecks
- Consider enabling/disabling metrics based on needs