Docker Deployment¶
Deploy the Text Classification API using Docker containers.
Prerequisites¶
- Docker 20.10+
- Docker Compose (optional)
Quick Start with Docker¶
1. Build the Image¶
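A minimal sketch of the build step, assuming the Dockerfile sits in the build context directory and that the image is tagged text-classifier-api to match the run examples on this page. The build_image helper name is hypothetical.

```shell
# Tag matches the image name used by the `docker run` examples on this page.
IMAGE_NAME="text-classifier-api"

# build_image is a hypothetical helper; the first argument is the build
# context and defaults to the current directory (assumed to hold the Dockerfile).
build_image() {
    docker build -t "$IMAGE_NAME" "${1:-.}"
}
```

Call build_image from the directory that holds the Dockerfile, or pass the path explicitly, for example build_image ./api.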
2. Run the Container¶
docker run -p 8000:8000 \
  -v "$(pwd)/../final_best_model.pkl:/app/final_best_model.pkl" \
  -v "$(pwd)/../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl" \
  text-classifier-api
3. Test the API¶
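One way to confirm the container is serving requests is to call the /health endpoint that the health check on this page also uses. This is a sketch that assumes the 8000:8000 port mapping from the run step; check_health and API_URL are hypothetical names.

```shell
# Base URL assumes the 8000:8000 port mapping from the run step.
API_URL="${API_URL:-http://localhost:8000}"

# check_health is a hypothetical helper. -f makes curl exit non-zero on
# HTTP errors (4xx/5xx); -sS hides the progress bar but keeps error messages.
check_health() {
    curl -fsS "$API_URL/health"
}
```

A zero exit status from check_health means the endpoint answered with a successful HTTP status.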
Docker Compose¶
Create docker-compose.yml for easier deployment:
version: '3.8'
services:
  text-classifier-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ../final_best_model.pkl:/app/final_best_model.pkl
      - ../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl
    environment:
      - MODEL_PATH=/app/final_best_model.pkl
      - VECTORIZER_PATH=/app/tfidf_vectorizer.pkl
      - MAX_BATCH_SIZE=25
      - ENABLE_METRICS=false
    restart: unless-stopped
Run with Docker Compose:
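A sketch of starting and stopping the stack, assuming the Compose v2 plugin (docker compose); with the older standalone binary the command is docker-compose instead. The helper names are hypothetical.

```shell
# compose_up is a hypothetical helper: build the image if needed and start
# the service in the background (-d). Run it from the directory that
# contains docker-compose.yml.
compose_up() {
    docker compose up -d --build
}

# compose_down stops and removes the containers created by compose_up.
compose_down() {
    docker compose down
}
```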
Environment Variables¶
Configure the container with environment variables:
docker run -p 8000:8000 \
  -e MODEL_PATH=/app/final_best_model.pkl \
  -e VECTORIZER_PATH=/app/tfidf_vectorizer.pkl \
  -e MAX_BATCH_SIZE=25 \
  -e ENABLE_METRICS=false \
  -v "$(pwd)/../final_best_model.pkl:/app/final_best_model.pkl" \
  -v "$(pwd)/../tfidf_vectorizer.pkl:/app/tfidf_vectorizer.pkl" \
  text-classifier-api
Multi-Stage Build¶
The Dockerfile uses multi-stage builds for optimization:
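# Build stage: install dependencies into /root/.local
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Runtime stage: copy only the installed packages and the application code
FROM python:3.11-slim AS runtime
WORKDIR /app
COPY --from=builder /root/.local /root/.local
# Make console scripts installed with --user available on PATH
ENV PATH=/root/.local/bin:$PATH
COPY . .
EXPOSE 8000
CMD ["python", "main.py"]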
Benefits:
- Smaller final image size (~150MB vs ~500MB)
- Faster deployments
- Better security (no build tools in final image)
Volume Mounting¶
Mount model files as volumes for development:
docker run -p 8000:8000 \
  -v /path/to/model.pkl:/app/final_best_model.pkl \
  -v /path/to/vectorizer.pkl:/app/tfidf_vectorizer.pkl \
  text-classifier-api
Docker Best Practices¶
Security¶
- Run as non-root user
- Use specific image tags
- Scan images for vulnerabilities
- Keep base images updated
Performance¶
- Use multi-stage builds
- Optimize layer caching
- Minimize image layers
- Use appropriate base images
Development¶
- Mount source code as volumes
- Use development-specific Dockerfiles
- Enable hot reload when possible
Troubleshooting Docker Issues¶
Container Won't Start¶
Check logs:
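A sketch for locating the container and reading its recent output. The run examples on this page do not name the container, so this looks it up by image; the helper names are hypothetical.

```shell
IMAGE_NAME="text-classifier-api"

# List containers (running or exited) that were created from the image.
# The `ancestor` filter matches on the image a container was started from.
api_containers() {
    docker ps -aq --filter "ancestor=$IMAGE_NAME"
}

# Print the last 100 log lines of the given container ID or name.
show_logs() {
    docker logs --tail 100 "$1"
}
```

Pass one of the IDs printed by api_containers to show_logs.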
Common issues:
- Model files not found
- Port already in use
- Insufficient memory
Performance Issues¶
Monitor resource usage:
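A sketch of a one-shot resource snapshot; show_stats is a hypothetical helper name and the container ID or name is supplied by the caller.

```shell
# One-shot snapshot of CPU, memory, network and block I/O for the given
# container. Omit --no-stream to keep the view updating live.
show_stats() {
    docker stats --no-stream "$1"
}
```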
Optimize:
- Increase memory limits
- Adjust CPU shares
- Use faster storage
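Memory limits and CPU shares can be set when the container is started. In the sketch below, run_with_limits is a hypothetical helper and the default values are placeholders, not recommendations. The model volume mounts from the earlier run examples are left out for brevity and would need to be added.

```shell
# run_with_limits is a hypothetical helper. --memory caps the container's RAM
# and --cpu-shares sets its relative CPU weight (Docker's default is 1024).
# $1 = memory limit (placeholder default 1g), $2 = CPU shares.
run_with_limits() {
    docker run -d -p 8000:8000 \
        --memory "${1:-1g}" \
        --cpu-shares "${2:-1024}" \
        text-classifier-api
}
```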
Networking Issues¶
Check port mapping:
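A sketch for listing the host ports bound to a container; show_ports is a hypothetical helper name.

```shell
# Show which host address and port each container port is published on.
show_ports() {
    docker port "$1"
}
```

With the 8000:8000 mapping used on this page, the output should include a line for 8000/tcp; no such line means the port was not published.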
Test connectivity:
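A sketch that reduces a connectivity test to the HTTP status code, which helps separate "nothing is listening" from "the API answered with an error". http_status is a hypothetical helper, and the 5-second timeout is an arbitrary choice.

```shell
# http_status is a hypothetical helper: print only the HTTP status code
# returned by the URL, giving up after 5 seconds.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

For example, http_status http://localhost:8000/health targets the same endpoint as the health check on this page.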
Production Deployment¶
Docker Swarm¶
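A sketch of running the API as a Swarm service, assuming the node has already joined a swarm (for example via docker swarm init) and that the image is available to every node. No volumes are mounted here, so the model files would have to be baked into the image or provided some other way. swarm_deploy is a hypothetical helper; the default of 3 replicas mirrors the Kubernetes example on this page.

```shell
# swarm_deploy is a hypothetical helper: create a replicated service and
# publish port 8000 through the swarm routing mesh.
# $1 = number of replicas (defaults to 3).
swarm_deploy() {
    docker service create \
        --name text-classifier-api \
        --replicas "${1:-3}" \
        --publish 8000:8000 \
        text-classifier-api
}
```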
Kubernetes¶
Create deployment YAML:
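apiVersion: apps/v1
kind: Deployment
metadata:
  name: text-classifier-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: text-classifier-api
  template:
    metadata:
      labels:
        app: text-classifier-api
    spec:
      containers:
        - name: text-classifier-api
          image: text-classifier-api:latest
          ports:
            - containerPort: 8000
          env:
            - name: MODEL_PATH
              value: "/app/final_best_model.pkl"
          volumeMounts:
            - name: model-volume
              mountPath: /app/final_best_model.pkl
              # subPath mounts a single key as a file; without it the path
              # becomes a directory. Assumes the ConfigMap key is named
              # final_best_model.pkl.
              subPath: final_best_model.pkl
      volumes:
        - name: model-volume
          # ConfigMaps are limited to 1 MiB; a larger model needs a
          # PersistentVolumeClaim or must be baked into the image.
          configMap:
            name: model-config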
Health Checks¶
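Add a health check to the Dockerfile. The check relies on curl, which python:3.11-slim does not include, so install it in the runtime stage:
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1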
Monitoring¶
Container Logs¶
Resource Monitoring¶
Application Metrics¶
Access metrics endpoint:
Scaling¶
Horizontal Scaling¶
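A sketch of scaling the Compose service to several containers, assuming the Compose v2 plugin. scale_api is a hypothetical helper. The fixed "8000:8000" host mapping from the Compose file has to be removed, or replaced by a proxy in front of the service, because only one container can bind host port 8000.

```shell
# scale_api is a hypothetical helper: run N copies of the Compose service.
# $1 = number of containers (defaults to 3).
scale_api() {
    docker compose up -d --scale "text-classifier-api=${1:-3}"
}
```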
Load Balancing¶
Use nginx or traefik for load balancing:
version: '3.8'
services:
  text-classifier-api:
    # ... existing config
    deploy:
      replicas: 3
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - text-classifier-api
Backup and Recovery¶
Data Persistence¶
Use named volumes for persistent data:
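A sketch using the model-data volume name that the backup command on this page refers to. The /app/models mount point is an assumption, as are the helper names; MODEL_PATH and VECTORIZER_PATH are pointed at files inside the volume, which must be copied there beforehand.

```shell
# Create the named volume referenced by the backup command on this page.
create_model_volume() {
    docker volume create model-data
}

# Start the API with the volume mounted. /app/models is an assumed mount
# point; the two environment variables tell the API where to find the files.
run_with_volume() {
    docker run -d -p 8000:8000 \
        -v model-data:/app/models \
        -e MODEL_PATH=/app/models/final_best_model.pkl \
        -e VECTORIZER_PATH=/app/models/tfidf_vectorizer.pkl \
        text-classifier-api
}
```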
Backup¶
docker run --rm -v model-data:/data -v "$(pwd):/backup" alpine tar czf /backup/models.tar.gz -C /data .
Recovery¶
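Recovery can mirror the backup command: a throwaway container unpacks the archive back into the volume. This sketch assumes models.tar.gz is in the current directory and that the model-data volume exists; restore_models is a hypothetical helper.

```shell
# restore_models is a hypothetical helper: extract models.tar.gz from the
# current directory into the model-data volume, overwriting files with the
# same names.
restore_models() {
    docker run --rm \
        -v model-data:/data \
        -v "$(pwd):/backup" \
        alpine tar xzf /backup/models.tar.gz -C /data
}
```

Stop containers that use the volume before restoring so they do not read half-written files.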
CI/CD Integration¶
GitHub Actions¶
name: Build and Push Docker Image
on:
  push:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build Docker image
        run: docker build -t text-classifier-api ./api
      - name: Push to registry
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker tag text-classifier-api "${{ secrets.DOCKER_USERNAME }}/text-classifier-api:latest"
          docker push "${{ secrets.DOCKER_USERNAME }}/text-classifier-api:latest"
Together, these steps cover building, running, scaling, and monitoring the Text Classification API with Docker in both development and production.