Contributing¶
Guidelines for contributing to the Text Classification API project.
Development Setup¶
Prerequisites¶
- Python 3.11+
- Docker 20.10+
- Git
Local Development¶
- Clone the repository
- Set up a virtual environment
- Install dependencies
- Set up environment variables
- Run the API
- Test the API
Docker Development¶
- Build the image
- Run the container
Development Workflow¶
1. Choose an Issue¶
- Check existing issues for tasks
- Create a new issue for bugs or features
- Discuss large changes before starting work
2. Create a Branch¶
3. Make Changes¶
- Follow the coding standards
- Write tests for new features
- Update documentation
- Test your changes
4. Commit Changes¶
5. Push and Create PR¶
Create a pull request with:
- Clear description of changes
- Screenshots for UI changes
- Test results
- Documentation updates
Coding Standards¶
Python Style¶
- Follow PEP 8 style guide
- Use Black for code formatting
- Use isort for import sorting
- Maximum line length: 88 characters
Code Quality¶
- Write descriptive variable and function names
- Add docstrings to all functions and classes
- Use type hints for function parameters and return values
- Keep functions small and focused
Example Code Style¶
from typing import List, Optional

from pydantic import BaseModel

# ModelCache and PredictionResponse are assumed to be imported from
# elsewhere in the project.


class PredictionRequest(BaseModel):
    """Request model for text classification."""

    texts: List[str]
    batch_size: Optional[int] = 10


async def predict_texts(
    request: PredictionRequest,
    model_cache: ModelCache,
) -> PredictionResponse:
    """
    Classify multiple texts using the trained model.

    Args:
        request: Prediction request with texts and batch size
        model_cache: Cached model instance

    Returns:
        Prediction response with classifications
    """
    # Implementation here
    pass
Testing¶
Running Tests¶
# Run all tests
pytest
# Run with coverage
pytest --cov=api --cov-report=html
# Run specific test file
pytest tests/test_api.py
# Run with verbose output
pytest -v
Writing Tests¶
- Use the pytest framework
- Place tests in the tests/ directory
- Name test files test_*.py
- Use descriptive test names
import pytest
from fastapi.testclient import TestClient

# `client` is assumed to be a pytest fixture (for example in conftest.py)
# that returns a TestClient wrapping the application.


def test_predict_endpoint(client: TestClient):
    """Test the predict endpoint with valid input."""
    response = client.post(
        "/predict",
        json={"texts": ["Great product!"]},
    )
    assert response.status_code == 200
    data = response.json()
    assert "predictions" in data
    assert len(data["predictions"]) == 1


def test_predict_invalid_input(client: TestClient):
    """Test the predict endpoint with invalid input."""
    response = client.post(
        "/predict",
        json={"invalid": "data"},
    )
    assert response.status_code == 422
Test Coverage¶
- Aim for >80% code coverage
- Cover happy path and error cases
- Test edge cases and boundary conditions
Documentation¶
API Documentation¶
- Update API reference for new endpoints
- Add examples for new features
- Document breaking changes
Code Documentation¶
- Add docstrings to new functions
- Update README for new features
- Update configuration documentation
Building Docs¶
Performance Guidelines¶
Memory Usage¶
- Keep memory usage under 512 MB for free-tier compatibility
- Use streaming for large responses
- Implement proper cleanup for resources
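One way to keep memory bounded is to process texts in fixed-size batches rather than loading everything at once. The sketch below is only an illustration: the helper name `iter_batches` is hypothetical and not part of the project, and the default of 10 simply mirrors the `batch_size` default shown in the example code style.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_batches(texts: Iterable[str], batch_size: int = 10) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` texts without materializing the input."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(texts)
    while True:
        # islice pulls only the next `batch_size` items from the iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # A generator input is consumed lazily, one batch at a time.
    lines = (f"text {i}" for i in range(25))
    for batch in iter_batches(lines, batch_size=10):
        print(len(batch))
```

Because only one batch is held in memory at a time, peak usage depends on `batch_size` rather than on the total number of texts.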
Response Times¶
- API responses should complete in under 2 seconds
- Implement timeouts for long operations
- Use async processing for I/O operations
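The timeout guideline above can be sketched with the standard library's `asyncio.wait_for`. The names `run_with_timeout` and `slow_operation` are hypothetical, and the 2-second default is taken from the response-time target; treat this as one possible approach rather than the project's implementation.

```python
import asyncio
from typing import Awaitable, Optional, TypeVar

T = TypeVar("T")


async def run_with_timeout(
    operation: Awaitable[T],
    timeout_seconds: float = 2.0,
) -> Optional[T]:
    """Await `operation`, returning None if it does not finish in time."""
    try:
        return await asyncio.wait_for(operation, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the pending operation before raising.
        return None


async def slow_operation(delay: float) -> str:
    """Stand-in for a slow I/O call (hypothetical)."""
    await asyncio.sleep(delay)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(slow_operation(0.01))))
```

Returning `None` keeps the sketch short; a real endpoint would more likely translate the timeout into an explicit error response.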
Scalability¶
- Design for horizontal scaling
- Use stateless operations
- Implement proper caching strategies
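As a minimal illustration of caching an expensive, read-only resource, the sketch below uses `functools.lru_cache`. The function `load_model` and the dictionary it returns are placeholders for whatever the project actually loads; nothing here reflects the project's real `ModelCache`.

```python
from functools import lru_cache


@lru_cache(maxsize=4)
def load_model(name: str) -> dict:
    """Pretend to load a model; the dict stands in for a real model object."""
    # In a real service this is where the slow disk or network load would happen.
    return {"name": name}


if __name__ == "__main__":
    load_model("sentiment")
    load_model("sentiment")  # served from the cache, no second load
    print(load_model.cache_info())
```

A per-process cache like this holds only data that can be rebuilt on demand, so any replica can still serve any request, which is compatible with the stateless design suggested above.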
Security Considerations¶
Input Validation¶
- Validate all user inputs
- Sanitize text inputs
- Implement rate limiting (future)
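The validation and sanitization points above could look roughly like the following. This is a sketch under stated assumptions: `sanitize_text`, `validate_texts`, and the `MAX_TEXT_LENGTH` limit are hypothetical and chosen for illustration, not taken from the project.

```python
import unicodedata
from typing import List

MAX_TEXT_LENGTH = 5000  # hypothetical limit, not a project setting


def sanitize_text(text: str) -> str:
    """Normalize Unicode, drop control characters, and collapse whitespace."""
    normalized = unicodedata.normalize("NFKC", text)
    # Keep whitespace, but remove other control and format characters.
    cleaned = "".join(
        ch
        for ch in normalized
        if ch.isspace() or unicodedata.category(ch)[0] != "C"
    )
    return " ".join(cleaned.split())


def validate_texts(texts: List[str]) -> List[str]:
    """Return sanitized texts or raise ValueError describing the first problem."""
    if not texts:
        raise ValueError("texts must not be empty")
    sanitized = []
    for index, text in enumerate(texts):
        if not isinstance(text, str):
            raise ValueError(f"texts[{index}] is not a string")
        clean = sanitize_text(text)
        if not clean:
            raise ValueError(f"texts[{index}] is empty after sanitization")
        if len(clean) > MAX_TEXT_LENGTH:
            raise ValueError(f"texts[{index}] exceeds {MAX_TEXT_LENGTH} characters")
        sanitized.append(clean)
    return sanitized
```

Raising `ValueError` with the offending index makes it straightforward to turn a failure into a descriptive client error at the API layer.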
Dependencies¶
- Keep dependencies updated
- Use trusted packages only
- Scan for vulnerabilities regularly
Secrets Management¶
- Never commit secrets to code
- Use environment variables for configuration
- Document required environment variables
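A small loader that fails fast on missing configuration is one way to follow these points. The variable names `MODEL_PATH` and `API_KEY` below are invented for the example, as is the `load_settings` helper; substitute whatever the project actually requires.

```python
import os
from typing import Dict, Iterable, Mapping, Optional

REQUIRED_VARS = ("MODEL_PATH", "API_KEY")  # hypothetical variable names


def load_settings(
    required: Iterable[str] = REQUIRED_VARS,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Read required settings from the environment, failing fast if any are missing."""
    env = os.environ if environ is None else environ
    names = tuple(required)
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.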
Git Guidelines¶
Commit Messages¶
Follow the Conventional Commits format: a "type: description" subject line, followed by an optional body and footer.
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- style: Code style changes
- refactor: Code refactoring
- test: Testing
- chore: Maintenance
Examples:
feat: add batch prediction endpoint

- Support processing multiple texts in single request
- Improve throughput by 3x
- Add comprehensive input validation

Closes #123

fix: handle empty text input

- Return appropriate error for empty strings
- Add test case for edge case

Fixes #456
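The commit convention above can be checked mechanically, for example in a local hook. The following is a rough sketch: `check_commit_message` is a hypothetical helper, the type list is copied from this guide, and the optional `(scope)` comes from the Conventional Commits specification rather than from this project.

```python
import re
from typing import List

COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# "<type>: <description>", with an optional "(scope)" after the type.
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[\w-]+)\))?"
    r": (?P<description>\S.*)$"
)


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    if not lines or not SUBJECT_PATTERN.match(lines[0]):
        problems.append("subject must look like '<type>: <description>'")
    # A body, if present, is separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("leave a blank line between the subject and the body")
    return problems
```

The check covers only the subject line and the blank line before the body; footers such as "Closes #123" are not validated.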
Branch Naming¶
- feature/description-of-feature
- fix/issue-number-description
- docs/update-documentation
- refactor/component-name
Pull Request Process¶
Before Submitting¶
- [ ] Code follows style guidelines
- [ ] Tests pass and coverage >80%
- [ ] Documentation updated
- [ ] No breaking changes without discussion
- [ ] Commit messages follow conventions
PR Description¶
Include:
- What changes were made
- Why changes were needed
- How to test the changes
- Screenshots/videos for UI changes
- Breaking changes and migration notes
Review Process¶
- Automated checks run (tests, linting)
- Code review by maintainer
- Address review comments
- Merge when approved
Issue Reporting¶
Bug Reports¶
Include:
- Steps to reproduce
- Expected behavior
- Actual behavior
- Environment details
- Error messages/logs
Feature Requests¶
Include:
- Use case description
- Proposed solution
- Alternative solutions considered
- Additional context
Community¶
Code of Conduct¶
- Be respectful and inclusive
- Focus on constructive feedback
- Help newcomers learn
- Report issues appropriately
Getting Help¶
- Check existing issues and documentation
- Ask questions in discussions
- Be patient with responses
License¶
By contributing, you agree that your contributions will be licensed under the same license as the project.
Recognition¶
Contributors will be acknowledged in:
- CHANGELOG.md for significant contributions
- GitHub contributors list
- Release notes
Thank you for contributing to the Text Classification API! 🚀