refactor(docs): consolidate and cleanup documentation

- Delete 9 deprecated/obsolete docs (~6,300 lines removed)
- Move test PDFs to tests/fixtures/ocr-samples/
- Create docs/DEPLOYMENT.md as principal guide
- Create tests/ocr-validation/README.md
- Update all refs for ultrathin monolith architecture
- Update OCR tests to use relative paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Claude Agent
2026-01-22 09:14:51 +00:00
parent 1b9ebf1d8f
commit 62f86250cc
55 changed files with 604 additions and 6334 deletions

247
docs/telegram/DEPLOYMENT.md Normal file
View File

@@ -0,0 +1,247 @@
# Telegram Bot Deployment Notes
## Overview
The ROA2WEB application includes an integrated Telegram bot module that runs as part of the unified FastAPI backend. This requires special configuration when deploying as a Windows service.
## Critical Configuration: Single Worker Only
### The Problem
**Symptom**: Telegram bot errors in logs:
```
telegram.error.Conflict: Conflict: terminated by other getUpdates request;
make sure that only one bot instance is running
```
**Root Cause**:
- Uvicorn is configured to run with multiple workers (e.g., `--workers 4`)
- Each worker process spawns its own Telegram bot instance
- Telegram API allows **only ONE instance** per bot token to poll for updates
- Multiple instances conflict with each other → continuous errors
### The Solution
**✅ ALWAYS use `--workers 1` for the backend service**
The NSSM service MUST be configured with:
```powershell
--workers 1
```
**Why this works**:
- Single uvicorn worker = Single bot instance
- No polling conflicts
- Telegram bot runs cleanly
**Performance Impact**:
- ✅ Minimal - the application is I/O bound (Oracle database queries), not CPU bound
- ✅ Single worker can handle hundreds of concurrent requests with async/await
- ✅ Connection pooling (Oracle + SQLite) ensures efficient resource usage
## Deployment Instructions
### New Installation
When installing with `Install-ROA2WEB.ps1`, the script now **automatically** uses `--workers 1` (as of 2025-12-29).
No manual configuration needed.
### Existing Installation (Upgrade)
If you have an existing installation with `--workers 4`, run the fix script:
```powershell
cd C:\TEMP\ROA2WEB-Scripts
.\Fix-TelegramWorkers.ps1
```
This script will:
1. Stop the ROA2WEB-Backend service
2. Update NSSM parameters to `--workers 1`
3. Restart the service
4. Verify health and check logs for errors
### Manual Fix (Alternative)
If you prefer to fix manually:
```powershell
# Stop service
Stop-Service ROA2WEB-Backend
# Update NSSM parameters
nssm set ROA2WEB-Backend AppParameters "-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 1"
# Verify
nssm get ROA2WEB-Backend AppParameters
# Start service
Start-Service ROA2WEB-Backend
# Monitor logs
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50 -Wait
```
## Verification
### 1. Check Service Parameters
```powershell
nssm get ROA2WEB-Backend AppParameters
```
Expected output:
```
-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 1
```
### 2. Check Logs for Bot Startup
```powershell
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50
```
Expected output (should appear **ONCE**, not 4 times):
```
22:42:39 - main - INFO - [TELEGRAM] Starting bot...
22:42:40 - main - INFO - [TELEGRAM] ✅ Bot running: @ROA2WEBBot
```
### 3. Verify No Conflict Errors
After waiting 1-2 minutes, check logs again:
```powershell
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 100 | Select-String "Conflict"
```
Expected output: **No results** (no conflict errors)
### 4. Check Process Count
```powershell
Get-Process -Name python | Where-Object { $_.MainWindowTitle -eq "" }
```
You should see:
- **1 parent process** (uvicorn master)
- **1 child process** (the single worker)
- **Total: 2 python processes**
With `--workers 4`, you would see 5 processes (1 parent + 4 workers) ❌
## Troubleshooting
### Still seeing conflict errors after fix?
1. **Verify service parameters**:
```powershell
nssm get ROA2WEB-Backend AppParameters
```
Should show `--workers 1`
2. **Fully restart the service**:
```powershell
Stop-Service ROA2WEB-Backend -Force
Start-Sleep -Seconds 5
Start-Service ROA2WEB-Backend
```
3. **Check for old processes**:
```powershell
Get-Process python | Stop-Process -Force
Start-Service ROA2WEB-Backend
```
### Service won't start after change?
1. **Check logs** for the actual error:
```powershell
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50
```
2. **Common issues**:
- Oracle connection timeout → Check SSH tunnel or Oracle listener
- Module import errors → Verify PYTHONPATH in NSSM config
- Port already in use → Kill process using port 8000
### Cache stats endpoint (502 error)
The `/api/reports/cache/stats` endpoint may return 502 Bad Gateway with multiple workers because:
- Multiple workers try to access the same SQLite cache database file
- SQLite locking conflicts cause worker crashes
- **Resolved by using `--workers 1`** ✅
## Related Issues
### Pydantic v2 Warnings
**Warning**:
```
UserWarning: Valid config keys have changed in V2:
* 'schema_extra' has been renamed to 'json_schema_extra'
```
**Fix**: Updated in `backend/modules/reports/models/trial_balance.py` (2025-12-29)
**Not critical** - just warnings, doesn't affect functionality.
### PTB Handler Warning
**Warning**:
```
PTBUserWarning: If 'per_message=False', 'CallbackQueryHandler' will not be tracked for every message.
```
**Location**: `backend/modules/telegram/bot/email_handlers.py:742`
**Impact**: Informational warning, bot works correctly
**Fix**: Add `per_message=True` to ConversationHandler (optional enhancement)
## Architecture Notes
### Why Not Multiple Workers?
**Question**: Why not run the Telegram bot in a separate process/service?
**Answer**: Possible, but current architecture is simpler:
**Current (Recommended)**:
- ✅ Single service manages everything
- ✅ Unified logging and monitoring
- ✅ Simpler deployment
- ✅ Single process to debug
- ⚠️ Must use `--workers 1`
**Alternative (Separate Bot Service)**:
- ✅ Could use `--workers 4` for web backend
- ❌ Requires two Windows services
- ❌ More complex deployment
- ❌ Two processes to monitor
- ❌ Coordination required
**Decision**: Keep integrated bot with single worker. Performance is excellent for our use case.
### Performance Characteristics
With `--workers 1`:
- **Concurrent requests**: Handled via async/await (100+ concurrent)
- **Database pooling**: 5-10 connections in Oracle pool (shared)
- **Memory usage**: ~300-500 MB per worker
- **CPU usage**: Low (I/O bound, not CPU bound)
- **Response times**: 50-200ms (mostly database query time)
Performance is **more than adequate** for:
- 50-100 concurrent users
- 1000+ requests per minute
- Multiple modules (Reports, Data Entry, Telegram)
## Conclusion
**Always deploy ROA2WEB backend with `--workers 1` to avoid Telegram bot conflicts.**
The fix script (`Fix-TelegramWorkers.ps1`) makes this change automatically for existing installations.
New installations (Install-ROA2WEB.ps1) are pre-configured correctly.

View File

@@ -1,11 +1,7 @@
# ROA2WEB Telegram Bot
> ⚠️ **ARCHITECTURE NOTE**
>
> This documentation partially references the old standalone bot architecture.
> The current architecture is an **ultrathin monolith** where the Telegram bot runs
> as a background task within the main backend (`backend/main.py`) on port 8000/8001.
> The `INTERNAL_API_PORT` variable is no longer used - internal API is at `/api/telegram/internal/*`.
> **Architecture:** Ultrathin Monolith - Telegram bot runs as a background task within the unified backend on port 8000.
> See `docs/telegram/DEPLOYMENT.md` for single-worker requirement.
> **Telegram Frontend for ROA2WEB ERP System** with Direct Command Interface
@@ -71,78 +67,45 @@ The bot uses a standalone SQLite database for:
- **telegram_auth_codes**: Temporary 8-character linking codes (15 min expiry)
- **telegram_sessions**: Active company selection and session state
## Installation & Setup
## Setup & Configuration
### Prerequisites
- Python 3.11+
- Telegram account
- ROA2WEB backend API running (default: http://localhost:8001)
- ROA2WEB backend running (the bot runs as part of the unified backend)
- Telegram Bot Token (from @BotFather)
### Step 1: Create Telegram Bot
### Step 1: Create Telegram Bot (One-time)
1. Open Telegram and search for `@BotFather`
2. Send `/newbot` command
3. Follow prompts to create bot
4. Save the bot token provided
### Step 2: Install Dependencies
### Step 2: Configure Environment
Add to `backend/.env`:
```bash
# Navigate to telegram-bot directory
cd backend/modules/telegram
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
```
### Step 3: Configure Environment
```bash
# Copy environment template
cp .env.example .env
# Edit .env file with your configuration
nano .env
```
Required configuration in `.env`:
```bash
# Required
# Telegram Bot Configuration
MODULE_TELEGRAM_ENABLED=true
TELEGRAM_BOT_TOKEN=your_bot_token_from_botfather
BACKEND_URL=http://localhost:8001
# Database
SQLITE_DB_PATH=./data/telegram_bot.db
# Internal API - DEPRECATED (now served via main backend at /api/telegram/internal/*)
# INTERNAL_API_PORT=8002
# Optional
LOG_LEVEL=INFO
SENTRY_DSN=https://your-sentry-dsn
ENVIRONMENT=production
```
### Step 4: Run the Bot
### Step 3: Start the Application
```bash
# Make sure backend is running first
# Backend should be at http://localhost:8001 (or configured BACKEND_URL)
# From project root - starts backend with Telegram bot integrated
./start-prod.sh
# Run the bot
python -m app.main
# Or for testing:
./start-test.sh
```
The bot starts automatically as a background task when `MODULE_TELEGRAM_ENABLED=true`.
> **Important:** Backend must run with `--workers 1` for Telegram bot to work correctly.
> See `docs/telegram/DEPLOYMENT.md` for details.
## Usage
### Available Commands

File diff suppressed because it is too large Load Diff

View File

@@ -1,502 +0,0 @@
# 🐳 Docker Testing Guide - ROA2WEB Telegram Bot
> ⚠️ **DEPRECATED ARCHITECTURE**
>
> This guide references the OLD microservices architecture (separate Telegram container on port 8002).
> The current architecture is an **ultrathin monolith** where the Telegram bot runs as a background
> task within the main backend (port 8000/8001). No separate Docker container needed for Telegram.
This guide provides instructions for testing the Telegram bot Docker deployment.
## 📋 Prerequisites
Before testing Docker deployment:
- [ ] Docker Engine installed (version 20.10+)
- [ ] Docker Compose installed (version 2.0+)
- [ ] At least 2GB RAM available for containers
- [ ] At least 10GB disk space available
## 🔍 Pre-Build Verification
### 1. Check Dockerfile Syntax
```bash
cd /path/to/roa2web/backend/modules/telegram
# Verify Dockerfile exists and is valid
cat Dockerfile
# Check for common issues
grep -n "COPY\|RUN\|ENV" Dockerfile
```
**Expected**:
- Multi-stage build with `builder` and `production` stages
- Non-root user `telegrambot` created
- All required dependencies installed
- Tini init system configured
- Health check defined
### 2. Verify Required Files
```bash
# Check all required files exist
ls -la app/
ls -la requirements.txt
ls -la .env.example
ls -la Dockerfile
```
**Required files**:
-`Dockerfile`
-`requirements.txt`
-`.env.example`
-`app/` directory with all modules
-`.dockerignore` (optional but recommended)
### 3. Check .dockerignore
```bash
cat .dockerignore
```
**Should exclude**:
- `venv/`
- `__pycache__/`
- `*.pyc`
- `.env`
- `data/*.db`
- `.git/`
- `tests/`
---
## 🏗️ Build Tests
### Test 1: Build Telegram Bot Image
```bash
cd /path/to/roa2web/backend/modules/telegram
# Build the image
docker build -t roa2web/telegram-bot:test --target production .
```
**Expected**:
- ✅ Build completes without errors
- ✅ Both stages (builder + production) execute
- ✅ Final image size < 500MB (typically ~300-400MB)
- No security warnings in output
**Troubleshooting**:
- If build fails at requirements install check `requirements.txt` syntax
- If permission errors ensure Dockerfile uses correct user
- If large image size verify multi-stage build is working
### Test 2: Inspect Built Image
```bash
# Check image size
docker images roa2web/telegram-bot:test
# Inspect image details
docker inspect roa2web/telegram-bot:test
# Check layers
docker history roa2web/telegram-bot:test
```
**Expected**:
- Image size: 300-450 MB
- Base: `python:3.11-slim`
- User: `telegrambot` (not root)
- Working dir: `/app`
- Health check configured
### Test 3: Build with Docker Compose
```bash
cd /path/to/roa2web
# Build telegram-bot service
docker-compose build roa-telegram-bot
```
**Expected**:
- Service builds successfully
- Image tagged as `roa2web/telegram-bot:latest`
- No errors or warnings
---
## 🚀 Runtime Tests
### Test 4: Run Standalone Container (Without Backend)
```bash
# Create test .env file
cat > .env.test <<EOF
TELEGRAM_BOT_TOKEN=test_token_here
CLAUDE_API_KEY=test_api_key_here
BACKEND_URL=http://localhost:8000
SQLITE_DB_PATH=/app/data/telegram_bot.db
INTERNAL_API_PORT=8002
LOG_LEVEL=DEBUG
EOF
# Run container in test mode
docker run --rm \
--name telegram-bot-test \
--env-file .env.test \
-p 8002:8002 \
-v $(pwd)/data:/app/data \
roa2web/telegram-bot:test
```
**Expected**:
- Container starts without errors
- Logs show "Telegram bot starting..."
- Database initialized at `/app/data/telegram_bot.db`
- Internal API listening on port 8002
- May fail to connect to Telegram API (expected without valid token)
**Verify**:
```bash
# In another terminal:
# Check container is running
docker ps | grep telegram-bot-test
# Check logs
docker logs telegram-bot-test
# Test internal API health endpoint
curl http://localhost:8002/internal/health
```
### Test 5: Run with Docker Compose (Full Stack)
```bash
cd /path/to/roa2web
# Ensure .env file exists with all required variables
cp .env.example .env
# Edit .env and add:
# - TELEGRAM_BOT_TOKEN
# - CLAUDE_API_KEY
# - ORACLE credentials
# - JWT secret
# Start just the telegram-bot service (depends on backend)
docker-compose up roa-telegram-bot
```
**Expected**:
- Dependencies start first: `roa-redis`, `roa-ssh-tunnel`, `roa-backend`
- Backend health check passes
- Telegram bot starts and connects to backend
- SQLite database persists in `telegram-bot-data` volume
**Troubleshooting**:
```bash
# Check service status
docker-compose ps
# View logs
docker-compose logs roa-telegram-bot
docker-compose logs roa-backend
# Check networks
docker network ls | grep roa-network
docker network inspect roa-network
```
### Test 6: Health Check
```bash
# Wait for service to be healthy
docker-compose ps roa-telegram-bot
# Should show: (healthy)
```
**Expected**:
- Health check passes after ~40 seconds (start_period)
- Health check endpoint returns 200 OK
- Service status shows `(healthy)`
**Manual health check**:
```bash
# Access container
docker exec -it roa-telegram-bot /bin/bash
# Inside container:
python -c "import httpx; import asyncio; asyncio.run(httpx.AsyncClient().get('http://localhost:8002/internal/health'))"
# Should output: 200 OK
```
### Test 7: Database Persistence
```bash
# Start service
docker-compose up -d roa-telegram-bot
# Check database file exists
docker exec roa-telegram-bot ls -la /app/data/
# Expected: telegram_bot.db file
# Stop and remove container
docker-compose down
# Start again
docker-compose up -d roa-telegram-bot
# Verify data persisted
docker exec roa-telegram-bot sqlite3 /app/data/telegram_bot.db "SELECT COUNT(*) FROM telegram_users;"
```
**Expected**:
- Database file persists across container restarts
- Data remains intact in `telegram-bot-data` volume
- No data loss
### Test 8: Environment Variables
```bash
# Check environment variables are set
docker exec roa-telegram-bot env | grep -E "TELEGRAM|CLAUDE|BACKEND|SQLITE"
```
**Expected**:
```
TELEGRAM_BOT_TOKEN=<your_token>
CLAUDE_API_KEY=<your_key>
BACKEND_URL=http://roa-backend:8000
SQLITE_DB_PATH=/app/data/telegram_bot.db
INTERNAL_API_PORT=8002
```
### Test 9: Network Connectivity
```bash
# Test bot can reach backend
docker exec roa-telegram-bot curl -v http://roa-backend:8000/health
# Test backend can reach bot internal API
docker exec roa-backend curl -v http://roa-telegram-bot:8002/internal/health
```
**Expected**:
- Both services can communicate via `roa-network`
- DNS resolution works (service names resolve)
- Health endpoints return 200 OK
### Test 10: Logs and Monitoring
```bash
# View real-time logs
docker-compose logs -f roa-telegram-bot
# View last 100 lines
docker-compose logs --tail=100 roa-telegram-bot
# Search for errors
docker-compose logs roa-telegram-bot | grep -i error
```
**Expected**:
- Logs are readable and structured
- No critical errors
- Log level respects `LOG_LEVEL` env var
---
## 🔒 Security Tests
### Test 11: User Permissions
```bash
# Check container is not running as root
docker exec roa-telegram-bot whoami
# Expected: telegrambot
# Check file permissions
docker exec roa-telegram-bot ls -la /app/
# Expected: All files owned by telegrambot:telegrambot
```
### Test 12: Port Exposure
```bash
# Check exposed ports
docker port roa-telegram-bot
# Should only show:
# 8002/tcp -> 0.0.0.0:8002
```
**Expected**:
- Only internal API port (8002) exposed
- No unnecessary ports open
### Test 13: Volume Mounts
```bash
# Check volumes
docker volume inspect roa2web_telegram-bot-data
# Check mount point
docker inspect roa-telegram-bot | grep -A 10 Mounts
```
**Expected**:
- Only `/app/data` is mounted
- Volume is named `telegram-bot-data`
- No sensitive files mounted
---
## 🧪 Integration Tests
### Test 14: Full Stack Integration
```bash
# Start all services
cd /path/to/roa2web
docker-compose up -d
# Wait for all services to be healthy
docker-compose ps
# Test complete flow:
# 1. Backend generates auth code
# 2. Bot verifies code
# 3. User links account
# 4. Bot queries backend API
```
**Test Steps**:
1. **Generate Auth Code via Backend**:
```bash
curl -X POST http://localhost:8000/api/telegram/auth/generate-code \
-H "Authorization: Bearer <jwt_token>" \
-H "Content-Type: application/json" \
-d '{"telegram_user_id": 123456}'
```
2. **Verify Code in Bot Database**:
```bash
docker exec roa-telegram-bot sqlite3 /app/data/telegram_bot.db \
"SELECT * FROM telegram_auth_codes WHERE telegram_user_id = 123456;"
```
3. **Link via Telegram Bot**:
- Send code to bot via Telegram app
- Verify linking succeeds
4. **Query Dashboard**:
- Ask bot: "Show dashboard for company 1"
- Verify data is retrieved from backend
---
## 🛑 Cleanup
### Remove Test Containers
```bash
# Stop and remove containers
docker-compose down
# Remove volumes (WARNING: deletes data)
docker-compose down -v
# Remove images
docker rmi roa2web/telegram-bot:test
docker rmi roa2web/telegram-bot:latest
```
### Clean Build Cache
```bash
# Remove build cache
docker builder prune -a
# Remove unused images
docker image prune -a
```
---
## 📊 Test Results Checklist
| Test ID | Description | Status | Notes |
|---------|-------------|--------|-------|
| 1 | Build telegram-bot image | | |
| 2 | Inspect image | | |
| 3 | Build with docker-compose | | |
| 4 | Run standalone container | | |
| 5 | Run with docker-compose | | |
| 6 | Health check | | |
| 7 | Database persistence | | |
| 8 | Environment variables | | |
| 9 | Network connectivity | | |
| 10 | Logs and monitoring | | |
| 11 | User permissions | | |
| 12 | Port exposure | | |
| 13 | Volume mounts | | |
| 14 | Full stack integration | | |
**Overall Result**: PASS FAIL
---
## 🐛 Common Issues & Solutions
### Issue 1: Build Fails - "requirements.txt not found"
**Solution**: Ensure you're in the correct directory (`telegram-bot/`) when building
### Issue 2: Permission Denied Errors
**Solution**: Check Dockerfile uses correct user and permissions are set with `chown`
### Issue 3: Health Check Fails
**Solution**:
- Check internal API is starting on port 8002
- Verify httpx is installed in requirements.txt
- Check logs: `docker logs roa-telegram-bot`
### Issue 4: Can't Connect to Backend
**Solution**:
- Ensure both containers are on `roa-network`
- Check backend is healthy before starting bot
- Use service name `roa-backend` not `localhost`
### Issue 5: Database Not Persisting
**Solution**:
- Verify volume is mounted: `docker inspect roa-telegram-bot`
- Check volume exists: `docker volume ls | grep telegram-bot-data`
- Ensure `/app/data` has write permissions
---
## ✅ Success Criteria
For Docker deployment to be considered successful:
- Image builds without errors
- Container starts and runs stably
- Health checks pass
- Bot connects to Telegram API
- Bot connects to backend API
- Database persists across restarts
- No security warnings or vulnerabilities
- Logs are clean (no critical errors)
- All network connectivity works
- Full stack integration succeeds
---
**Last Updated**: 2025-10-21

View File

@@ -6,29 +6,32 @@ This checklist guides you through manual testing of the Telegram bot functionali
Before starting manual tests:
- [ ] Backend API is running (`http://localhost:8001`)
- [ ] SSH tunnel to Oracle DB is active
- [ ] Telegram bot is running (`python -m app.main`)
- [ ] TELEGRAM_BOT_TOKEN is configured in `.env`
- [ ] CLAUDE_API_KEY is configured in `.env` (if using real Claude SDK)
- [ ] SQLite database is initialized (`data/telegram_bot.db` exists)
- [ ] Backend API is running (`http://localhost:8000` or `8001`)
- [ ] SSH tunnel to Oracle DB is active (Linux dev only)
- [ ] `MODULE_TELEGRAM_ENABLED=true` in `.env`
- [ ] `TELEGRAM_BOT_TOKEN` is configured in `.env`
- [ ] SQLite database is initialized (auto-created on first run)
> **Note:** In the **ultrathin monolith** architecture, the Telegram bot runs as a background task
> within the unified backend. There's no separate bot process to start.
## 📱 Test Environment Setup
### Start Services
```bash
# Terminal 1: Start backend API (from roa2web/)
cd backend
source venv/bin/activate
uvicorn app.main:app --reload --port 8001
# From project root - starts everything (SSH tunnel + backend + frontend)
./start-prod.sh
# Terminal 2: Start Telegram bot
cd backend/modules/telegram
source venv/bin/activate
python -m app.main
# Or for testing mode:
./start-test.sh
# Check status
./status.sh
```
The bot starts automatically with the backend when `MODULE_TELEGRAM_ENABLED=true`.
### Test User Setup
- [ ] Create test Oracle user account in Oracle database (if needed)