refactor(docs): consolidate and cleanup documentation
- Delete 9 deprecated/obsolete docs (~6,300 lines removed)
- Move test PDFs to tests/fixtures/ocr-samples/
- Create docs/DEPLOYMENT.md as principal guide
- Create tests/ocr-validation/README.md
- Update all refs for ultrathin monolith architecture
- Update OCR tests to use relative paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
247
docs/telegram/DEPLOYMENT.md
Normal file
@@ -0,0 +1,247 @@
# Telegram Bot Deployment Notes

## Overview

The ROA2WEB application includes an integrated Telegram bot module that runs as part of the unified FastAPI backend. Because the bot lives inside the backend process, the backend needs one specific setting when deployed as a Windows service: a single Uvicorn worker.

## Critical Configuration: Single Worker Only

### The Problem

**Symptom**: Telegram bot errors in logs:
```
telegram.error.Conflict: Conflict: terminated by other getUpdates request;
make sure that only one bot instance is running
```

**Root Cause**:
- Uvicorn is configured to run with multiple workers (e.g., `--workers 4`)
- Each worker process spawns its own Telegram bot instance
- Telegram API allows **only ONE instance** per bot token to poll for updates
- Multiple instances conflict with each other → continuous errors

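
The root cause above can be illustrated with a toy model. This is only a sketch: `FakeTelegramServer` and `count_conflicts` are hypothetical names, the class is not the Telegram API, and which poller receives the error is simplified. It shows that one poller per token never conflicts, while several pollers sharing a token conflict on almost every request.

```python
class Conflict(Exception):
    """Stand-in for telegram.error.Conflict in this toy model."""


class FakeTelegramServer:
    """Hypothetical server that remembers only the latest poller per token."""

    def __init__(self):
        self.active_poller = {}  # token -> id of the most recent poller

    def get_updates(self, token, poller_id):
        previous = self.active_poller.get(token)
        self.active_poller[token] = poller_id
        # A different poller was active on this token: report a conflict.
        if previous is not None and previous != poller_id:
            raise Conflict("terminated by other getUpdates request")
        return []


def count_conflicts(workers, rounds):
    """Count conflicts when `workers` bot instances poll with one shared token."""
    server = FakeTelegramServer()
    conflicts = 0
    for _ in range(rounds):
        for worker_id in range(workers):
            try:
                server.get_updates("TOKEN", worker_id)
            except Conflict:
                conflicts += 1
    return conflicts


if __name__ == "__main__":
    print("1 worker :", count_conflicts(workers=1, rounds=5))
    print("4 workers:", count_conflicts(workers=4, rounds=5))
```
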
### The Solution

**✅ ALWAYS use `--workers 1` for the backend service**

The NSSM service MUST be configured with:
```powershell
--workers 1
```

**Why this works**:
- Single uvicorn worker = Single bot instance
- No polling conflicts
- Telegram bot runs cleanly

**Performance Impact**:
- ✅ Minimal - the application is I/O bound (Oracle database queries), not CPU bound
- ✅ Single worker can handle hundreds of concurrent requests with async/await
- ✅ Connection pooling (Oracle + SQLite) ensures efficient resource usage

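
The claim that one worker can serve many I/O-bound requests can be checked with a small experiment. This is a minimal sketch, assuming each request spends its time awaiting I/O; `fake_query` is a hypothetical stand-in that sleeps instead of querying Oracle. If a handler blocked the event loop instead of awaiting, the requests would run one after another and the result would not hold.

```python
import asyncio
import time


async def fake_query(delay):
    """Stand-in for an awaited database call; sleeps instead of querying."""
    await asyncio.sleep(delay)
    return "ok"


async def run_batch(n, delay):
    """Run n simulated requests concurrently on a single event loop."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_query(delay) for _ in range(n)))
    return list(results), time.perf_counter() - start


def measure(n=100, delay=0.1):
    """Return (results, elapsed seconds) for n concurrent simulated requests."""
    return asyncio.run(run_batch(n, delay))


if __name__ == "__main__":
    results, elapsed = measure()
    print(f"{len(results)} requests finished in {elapsed:.2f}s")
```

Run sequentially, 100 requests of 0.1 s each would take about 10 s; on one event loop they overlap, so the total stays close to the duration of a single request.
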
## Deployment Instructions

### New Installation

When installing with `Install-ROA2WEB.ps1`, the script now **automatically** uses `--workers 1` (as of 2025-12-29).

No manual configuration is needed.

### Existing Installation (Upgrade)

If you have an existing installation with more than one worker (for example `--workers 4`), run the fix script:

```powershell
cd C:\TEMP\ROA2WEB-Scripts
.\Fix-TelegramWorkers.ps1
```

This script will:
1. Stop the ROA2WEB-Backend service
2. Update NSSM parameters to `--workers 1`
3. Restart the service
4. Verify health and check logs for errors

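
The contents of `Fix-TelegramWorkers.ps1` are not shown here, so the following is not the script's code. It is a hedged Python sketch of the string transformation that step 2 implies: take an existing parameter string and make sure it ends up with exactly one `--workers 1` flag. The function name `force_single_worker` is hypothetical.

```python
import re


def force_single_worker(app_parameters):
    """Return the parameter string with exactly one `--workers 1` flag."""
    # Drop any existing "--workers N" or "--workers=N" occurrence.
    stripped = re.sub(r"\s*--workers(?:=|\s+)\d+", "", app_parameters).strip()
    return f"{stripped} --workers 1"


if __name__ == "__main__":
    old = "-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 4"
    print(force_single_worker(old))
```
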
### Manual Fix (Alternative)

If you prefer to fix manually:

```powershell
# Stop service
Stop-Service ROA2WEB-Backend

# Update NSSM parameters
nssm set ROA2WEB-Backend AppParameters "-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 1"

# Verify
nssm get ROA2WEB-Backend AppParameters

# Start service
Start-Service ROA2WEB-Backend

# Monitor logs
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50 -Wait
```

## Verification

### 1. Check Service Parameters

```powershell
nssm get ROA2WEB-Backend AppParameters
```

Expected output:
```
-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 1
```

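
If this check needs to be automated, the parameter string can be parsed instead of compared by eye. The sketch below is one possible approach using the standard `argparse` module; `worker_count` is a hypothetical helper, and it assumes the string contains no quoted arguments with spaces. A missing flag is treated as 1, which matches Uvicorn's default only when the `WEB_CONCURRENCY` environment variable is not set.

```python
import argparse


def worker_count(app_parameters):
    """Return the --workers value from an AppParameters string (1 if absent)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--workers", type=int, default=1)
    # parse_known_args skips the arguments not modelled here (-m, --host, --port).
    known, _unknown = parser.parse_known_args(app_parameters.split())
    return known.workers


if __name__ == "__main__":
    params = "-m uvicorn main:app --host 127.0.0.1 --port 8000 --workers 1"
    print("workers =", worker_count(params))
```
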
### 2. Check Logs for Bot Startup

```powershell
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50
```

Expected output (should appear **ONCE** per service start, not once per worker, i.e. not 4 times with `--workers 4`):
```
22:42:39 - main - INFO - [TELEGRAM] Starting bot...
22:42:40 - main - INFO - [TELEGRAM] ✅ Bot running: @ROA2WEBBot
```

### 3. Verify No Conflict Errors

After waiting 1-2 minutes, check logs again:

```powershell
Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 100 | Select-String "Conflict"
```

Expected output: **No results** (no conflict errors)

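
Checks 2 and 3 can be combined into one pass over the log text. The sketch below assumes the log lines look like the samples above; `summarize_bot_log` and `looks_healthy` are hypothetical helpers. Feed it only the lines written since the last service start, because earlier restarts in the same file would also count as startups.

```python
def summarize_bot_log(log_text):
    """Count bot startups and polling conflicts in backend log text."""
    lines = log_text.splitlines()
    return {
        "startups": sum("[TELEGRAM] Starting bot" in line for line in lines),
        "conflicts": sum("telegram.error.Conflict" in line for line in lines),
    }


def looks_healthy(log_text):
    """True when the bot started exactly once and no conflict was logged."""
    summary = summarize_bot_log(log_text)
    return summary["startups"] == 1 and summary["conflicts"] == 0


if __name__ == "__main__":
    sample = (
        "22:42:39 - main - INFO - [TELEGRAM] Starting bot...\n"
        "22:42:40 - main - INFO - [TELEGRAM] Bot running: @ROA2WEBBot\n"
    )
    print(summarize_bot_log(sample), looks_healthy(sample))
```
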
### 4. Check Process Count

```powershell
Get-Process -Name python | Where-Object { $_.MainWindowTitle -eq "" }
```

You should see:
- **1 parent process** (uvicorn master)
- **1 child process** (the single worker)
- **Total: 2 python processes**

With `--workers 4`, you would see 5 processes (1 parent + 4 workers) ❌

The exact count can depend on how Python is launched on the machine, so compare against a known-good baseline rather than relying on the absolute number alone.

## Troubleshooting

### Still seeing conflict errors after fix?

1. **Verify service parameters**:
   ```powershell
   nssm get ROA2WEB-Backend AppParameters
   ```
   Should show `--workers 1`

2. **Fully restart the service**:
   ```powershell
   Stop-Service ROA2WEB-Backend -Force
   Start-Sleep -Seconds 5
   Start-Service ROA2WEB-Backend
   ```

3. **Check for old processes**:
   ```powershell
   # Caution: this stops EVERY python process on the machine, not only ROA2WEB's
   Stop-Service ROA2WEB-Backend -Force
   Get-Process python -ErrorAction SilentlyContinue | Stop-Process -Force
   Start-Service ROA2WEB-Backend
   ```

4. **Check for other instances using the same token**: a development machine or a second server polling with the same bot token produces the same `Conflict` error, regardless of the worker count on this host.

### Service won't start after change?

1. **Check logs** for the actual error:
   ```powershell
   Get-Content C:\inetpub\wwwroot\roa2web\logs\backend-stderr.log -Tail 50
   ```

2. **Common issues**:
   - Oracle connection timeout → Check SSH tunnel or Oracle listener
   - Module import errors → Verify PYTHONPATH in NSSM config
   - Port already in use → Kill process using port 8000

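
For the "port already in use" case, a quick probe can confirm whether something is still listening before the service is restarted. This is a minimal sketch using only the standard `socket` module; `port_in_use` is a hypothetical helper, and it reports only whether a TCP connection succeeds, not which process owns the port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```
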
### Cache stats endpoint (502 error)

The `/api/reports/cache/stats` endpoint may return 502 Bad Gateway with multiple workers because:
- Multiple workers try to access the same SQLite cache database file
- SQLite locking conflicts cause worker crashes

**Resolved by using `--workers 1`** ✅

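
The locking behaviour behind this can be reproduced with the standard `sqlite3` module. The sketch below uses two connections to the same file to stand in for two workers; the `cache` table and the function name are hypothetical, and it only demonstrates the lock error itself, not how the backend turns it into a crash or a 502.

```python
import os
import sqlite3
import tempfile


def second_writer_error(db_path):
    """Return the error a second writer gets while the first holds the write lock."""
    first = sqlite3.connect(db_path, isolation_level=None)
    second = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS cache (k TEXT PRIMARY KEY, v TEXT)")
        first.execute("BEGIN IMMEDIATE")  # first "worker" takes the write lock
        first.execute("INSERT OR REPLACE INTO cache VALUES ('a', '1')")
        try:
            second.execute("BEGIN IMMEDIATE")  # second "worker" tries to write
        except sqlite3.OperationalError as exc:
            return str(exc)
        second.execute("ROLLBACK")
        return ""
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_error(os.path.join(tmp, "cache.db")))
```

With a single worker there is only one writer process, so this contention between workers cannot occur.
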
## Related Issues

### Pydantic v2 Warnings

**Warning**:
```
UserWarning: Valid config keys have changed in V2:
* 'schema_extra' has been renamed to 'json_schema_extra'
```

**Fix**: Updated in `backend/modules/reports/models/trial_balance.py` (2025-12-29)

**Not critical**: this is only a warning and does not affect functionality.

### PTB Handler Warning

**Warning**:
```
PTBUserWarning: If 'per_message=False', 'CallbackQueryHandler' will not be tracked for every message.
```

**Location**: `backend/modules/telegram/bot/email_handlers.py:742`

**Impact**: Informational warning; the bot works correctly.

**Fix** (optional): Setting `per_message=True` on the `ConversationHandler` is only appropriate when every handler in the conversation is a `CallbackQueryHandler`. If the conversation mixes handler types, leave the setting as is and treat the warning as informational.

## Architecture Notes

### Why Not Multiple Workers?

**Question**: Why not run the Telegram bot as a separate process/service, so the web backend could use multiple workers?

**Answer**: Possible, but the current architecture is simpler:

**Current (Recommended)**:
- ✅ Single service manages everything
- ✅ Unified logging and monitoring
- ✅ Simpler deployment
- ✅ Single process to debug
- ⚠️ Must use `--workers 1`

**Alternative (Separate Bot Service)**:
- ✅ Could use `--workers 4` for web backend
- ❌ Requires two Windows services
- ❌ More complex deployment
- ❌ Two processes to monitor
- ❌ Coordination required

**Decision**: Keep the integrated bot with a single worker. Performance is excellent for our use case.

### Performance Characteristics

With `--workers 1`:
- **Concurrent requests**: Handled via async/await (100+ concurrent)
- **Database pooling**: 5-10 connections in Oracle pool (shared)
- **Memory usage**: ~300-500 MB per worker
- **CPU usage**: Low (I/O bound, not CPU bound)
- **Response times**: 50-200 ms (mostly database query time)

Performance is **more than adequate** for:
- 50-100 concurrent users
- 1000+ requests per minute
- Multiple modules (Reports, Data Entry, Telegram)

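
The figures above can be sanity-checked with Little's law (average requests in flight = arrival rate × response time). This is a back-of-the-envelope sketch using the numbers quoted in this section, not a measurement; `in_flight_requests` is a hypothetical helper. At 1000 requests per minute and a 200 ms response time, about 3.3 requests are in flight on average, which is below the 5-10 pooled Oracle connections. Bursts can exceed the average.

```python
def in_flight_requests(requests_per_minute, response_time_ms):
    """Little's law: average concurrency = arrival rate x time in system."""
    arrival_rate_per_second = requests_per_minute / 60.0
    response_time_seconds = response_time_ms / 1000.0
    return arrival_rate_per_second * response_time_seconds


if __name__ == "__main__":
    for ms in (50, 200):
        print(f"{ms} ms -> {in_flight_requests(1000, ms):.2f} requests in flight")
```
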
## Conclusion

**Always deploy the ROA2WEB backend with `--workers 1` to avoid Telegram bot conflicts.**

The fix script (`Fix-TelegramWorkers.ps1`) makes this change automatically for existing installations.

New installations (`Install-ROA2WEB.ps1`) are pre-configured correctly.