roa2web-service-auto/.gitignore
Marius Mutu 495790411f feat(ocr): Add docTR OCR engine with metrics infrastructure
Add docTR as primary OCR engine with 2-tier sequential processing,
OCR metrics tracking, and simplified engine selection.

Features:
- docTR OCR engine with light+medium preprocessing tiers
- doctr_plus mode with early exit optimization (~65% fast path)
- OCR metrics dashboard with per-engine statistics
- User OCR preference persistence
- Parallel worker pool for OCR processing
- Cross-validation for extraction quality

Engine options: tesseract, doctr, doctr_plus (recommended), paddleocr
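
The 2-tier sequential processing with early exit described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `OcrResult`, `run_tiered_ocr`, the stub tiers, and the 0.85 confidence threshold are hypothetical names and values, not taken from this commit, and no real docTR calls are shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class OcrResult:
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    tier: str


# Hypothetical tier signature: image bytes in, (text, confidence) out.
Tier = Callable[[bytes], Tuple[str, float]]


def run_tiered_ocr(
    image: bytes,
    tiers: Sequence[Tuple[str, Tier]],
    min_confidence: float = 0.85,  # assumed threshold, not from the commit
) -> OcrResult:
    """Run tiers in order and stop at the first one that is confident enough.

    If no tier reaches the threshold, the highest-confidence result is returned.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    best: Optional[OcrResult] = None
    for name, tier in tiers:
        text, confidence = tier(image)
        candidate = OcrResult(text, confidence, name)
        if best is None or candidate.confidence > best.confidence:
            best = candidate
        if confidence >= min_confidence:
            break  # early exit: skip the more expensive tiers
    return best


# Stub tiers standing in for light and medium preprocessing.
def light_tier(image: bytes) -> Tuple[str, float]:
    return ("TOTAL 42.00", 0.91)


def medium_tier(image: bytes) -> Tuple[str, float]:
    return ("TOTAL 42.00", 0.97)


result = run_tiered_ocr(b"raw-image", [("light", light_tier), ("medium", medium_tier)])
print(result.tier, result.confidence)  # prints: light 0.91
```

With these stubs the light tier clears the threshold, so the medium tier is never called; that is the fast path the feature list refers to.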

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 05:37:16 +02:00


!.env.example
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.

# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control

# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control

# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.

# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.

# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to
# exclude from AI features like autocomplete and code analysis. Recommended for sensitive data;
# refer to https://docs.cursor.com/context/ignore-files
# ============================================================================
# Backup Directories (generated by security scripts)
# Backup and Temporary Files
# Backup files that might contain secrets
# Byte-compiled / optimized / DLL files
# C extensions
# Celery stuff
# Claude IDE Settings (local configurations)
# Cloud and Infrastructure
# Configuration with Secrets
# Cursor
# Cython debug symbols
# Database
# Database Files
# Database and Connection Strings
# Deployment temporary files
# Development Logs
# Development Logs and Debug Files
# Development Tools Cache (may contain sensitive data)
# Development and Build Artifacts
# Distribution / packaging
# Django stuff:
# Docker
# Docker Development Files
# Docker secrets
# Environment variables
# Environments
# FastAPI
# Flask stuff:
# IDE
# IDE and Editor Files
# IPython
# Installer logs
# Jupyter Notebook
# Logs
# Nginx logs
# Node.js & Vue.js
# OS
# Oracle
# Oracle and Database
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
# Package Manager Cache and Locks
# Playwright
# Production Credentials (generated by setup scripts)
# Production Setup Files (generated by scripts)
# PyBuilder
# PyCharm
# PyInstaller
# PyPI configuration file
# Pyre type checker
# Python
# ROA2WEB .gitignore
# Rope project settings
# Ruff stuff:
# SSH Keys and Certificates
# SSH Keys and Secrets (IMPORTANT!)
# SSH Test Files
# SageMath parsed files
# Scan Reports and Security Artifacts
# Scrapy stuff:
# Secrets and Credentials
# Security Scan Reports
# Serena Cache and Memories (Local Development)
# Sphinx documentation
# Spyder project settings
# Telegram Bot SQLite Database (STANDALONE)
# Temporary Test Scripts
# Test Results and Reports
# Test Scripts and Temporary Files
# Translations
# UV
# Unit test / coverage reports
# Virtual environments
# Windows deployment package (generated by Build-Frontend.ps1)
# mkdocs documentation
# mypy
# pdm
# pipenv
# poetry
# pyenv
# pytype static type analyzer
# 📦 DEPLOYMENT ARTIFACTS - DO NOT COMMIT
# 🔒 SECURITY CRITICAL PATTERNS - DO NOT COMMIT THESE FILES
# 🧹 ROA2WEB SPECIFIC TEMPORARY FILES - DO NOT COMMIT
# 🧹 TEMPORARY FILES AND DEVELOPMENT ARTIFACTS - DO NOT COMMIT
#.idea/
#Pipfile.lock
#pdm.lock
#poetry.lock
#uv.lock
*$py.class
*.backup
*.bak
*.cover
*.crt
*.csr
*.db
*.debug
*.egg
*.egg-info/
*.jks
*.key
*.keystore
*.log
*.manifest
*.mo
*.old
*.ora
*.orig
*.p12
*.pem
*.pfx
*.pot
*.pub
*.py,cover
*.py[cod]
*.rsa
*.sage.py
*.so
*.spec
*.sqlite
*.sqlite3
*.sublime-*
*.swo
*.swp
*.temp
*.tmp
*.wallet
*_backup_*
*_rsa
*_rsa.pub
*_temp_*
*_tmp_*
# Sensitive configuration files (specific patterns only)
*passwd*
*password*.txt
*password*.json
*password*.yml
*password*.yaml
*.env.prod
*.env.production
*.env.staging
*prod.env*
*production.env*
*staging.env*
*credentials*.json
*credentials*.txt
*credentials*.yml
*credentials*.yaml
*secret*.txt
*secret*.json
*secret*.yml
*secret*.yaml
*secrets*.txt
*secrets*.json
*secrets*.yml
*secrets*.yaml
*token*.txt
*token*.json
*dsn*.txt
*dsn*.json
# Security scan and cleanup reports
*cleanup*.json
*report*.json
*scan*.json
*security*.json
# Temporary test scripts (but allow proper test files)
*ssh_test*.sh
*tunnel_test*.sh
temp_test.*
quick_test.*
*test_*.bat
*~
.DS_Store
.DS_Store?
.Python
.Spotlight-V100
.Trashes
._*
.aws/
.azure/
.cache
.claude/settings.local.json
.coverage
.coverage.*
.credentials/
.cursorignore
.cursorindexingignore
.dmypy.json
.docker/
.dockerignore
.eggs/
# Environment files - ignore ALL .env files (security best practice)
.env
.env.*
.env.*.local
.env.local
# Allow only .env.example templates (no credentials)
!.env.example
!**/.env.example
!.env.*.example
!**/.env.*.example
# Allow .dockerignore files (Docker build configuration)
!.dockerignore
!**/.dockerignore
.gcp/
.hypothesis/
.idea/
.installed.cfg
.ipynb_checkpoints
.mypy_cache/
.nox/
.nuxt/
.output/
.pdm-build/
.pdm-python
.pdm.toml
.pnpm-debug.log*
.pybuilder/
.pypirc
.pyre/
.pytest_cache/
.pytype/
.ropeproject
.ruff_cache/
.scrapy
.secrets/
.serena/cache/
.serena/memories/
.spyderproject
.spyproject
.terraform/
.tox/
.venv
.vite/
.vscode/
.vscode/launch.json
.vscode/settings.json
.vscode/tasks.json
.webassets-cache
.yarn/build-state.yml
.yarn/cache/
.yarn/install-state.gz
.yarn/unplugged/
/blob-report/
/playwright-report/
/playwright/.cache/
/site
ENV/
MANIFEST
PRODUCTION_CREDENTIALS.md
Thumbs.db
__pycache__/
__pypackages__/
access.log
ansible-vault*
authorized_keys
build/
celerybeat-schedule
celerybeat.pid
config.prod.*
config.production.*
cover/
coverage.xml
coverage/
credentials/
cython_debug/
db.sqlite3
db.sqlite3-journal
debug.log
# ==============================================================================
# 📦 WINDOWS DEPLOYMENT ARTIFACTS - Generated by Build-ROA2WEB.ps1
# ==============================================================================
# These files are build artifacts and should NOT be committed to git
# They are generated fresh by running: .\Build-ROA2WEB.ps1
# Deployment packages (all variations)
deployment/windows/deploy-package/
deploy-package/
**/deploy-package/
# Build cache (npm node_modules cache for faster builds)
deployment/windows/.build-cache/
deployment/windows/.build-cache-*/
.build-cache/
.build-cache-*/
**/.build-cache/
**/.build-cache-*/
# Shared folder copied during build (temporary)
deployment/shared/
# Temp frontend build directories
**/.temp-frontend-build/
# Deployment logs and temporary files
deployment/windows/scripts/*.log
deployment/windows/temp/
deployment/windows/*.log
# Cache and database files in deployment artifacts
**/cache_data.db
**/roa2web_cache.db
# Deployment temporary files
deploy_production.sh
dev.log
develop-eggs/
dist/
dmypy.json
docker-compose.override.yml
docs/_build/
downloads/
eggs/
ehthumbs.db
env.bak/
env/
error.log
htmlcov/
id_rsa*
instance/
ipython_config.py
known_hosts
ldap.ora
lib/
lib64/
local_settings.py
logs/
nginx/logs/*.log
node_modules/
nosetests.xml
npm-debug.log*
package-lock.json
parts/
pip-delete-this-directory.txt
pip-log.txt
playwright-report/
profile_default/
quick_test.*
run_tests.*
scan_*.json
sdist/
secrets/
# Allow documentation in secrets directories
!**/secrets/README.md
security_*.json
share/python-wheels/
sqlnet.ora
ssh-tunnel/*_rsa
ssh-tunnel/*_rsa.pub
ssh-tunnel/roa_oracle_server
ssh_host_*
target/
temp_test.*
terraform.tfstate*
test-results/
test_*.bat
test_*.py
test_*.sh
test_results/
tnsnames.ora
var/
venv.bak/
venv/
venv-win/
ocr_benchmark_*.json
wallet/
wheels/
yarn-debug.log*
yarn-error.log*
# Allow proper test files (pytest, unittest) but exclude temporary test scripts
!**/tests/test_*.py
!**/test_*.py
# ============================================================================
# 🧹 TEMPORARY FILES FROM DEBUGGING SESSION - DO NOT COMMIT
# ============================================================================
# PID files from bot processes
*.pid
bot.pid
# Requirements checksums (generated by start-dev.sh)
*.checksum
# Temporary debugging/testing scripts in telegram-bot
reports-app/telegram-bot/check_db.py
reports-app/telegram-bot/test_email.py
# Temporary planning documents
TELEGRAM_EMAIL_AUTH_PLAN*.md
# Weird pip artifacts
=*
# ============================================================================
# 🔒 ENCRYPTED SECRETS BACKUP (Optional)
# ============================================================================
# Encrypted backups created by scripts/backup-secrets.sh
# Option 1: Commit encrypted .gpg files (safe, password-protected)
# Option 2: Ignore all backups (keep only local)
# Uncomment to ignore all backups (Option 2):
# secrets-backup/
# Keep only encrypted files, ignore decrypted ones (Option 1):
secrets-backup/**/*.env
secrets-backup/**/.env.*
!secrets-backup/**/*.gpg
.playwright-mcp/*
# Auto-Build local data (worktrees, cache)
.auto-build-data/
# ============================================================================
# 🏗️ ULTRATHIN MONOLITH BACKEND DATA - DO NOT COMMIT
# ============================================================================
# Backend unified data directories (cache, receipts, telegram)
backend/data/cache/*.db
backend/data/receipts/*.db
backend/data/telegram/*.db
backend/data/receipts/uploads/*
backend/data/ocr_queue/
!backend/data/*/.gitkeep