# CLAUDE.md
This file provides guidance to Claude Code when working with code in this repository.
## Project Overview
BTGO Scraper - Playwright automation that extracts account balances and transaction CSVs from Banca Transilvania's BT Go portal (go.bancatransilvania.ro).
**Security Context**: Authorized personal banking automation tool for educational purposes.
**CRITICAL**: Docker is **BLOCKED by WAF** (datacenter IP). Run locally only.
## Lessons Learned (deployment)
- **Headless vs WAF**: Akamai blocks the *old* headless mode (plain `headless=True`),
but the *new* headless mode (`--headless=new`) passes. Launch with `headless=False`
+ arg `--headless=new` + a clean Chrome UA (the headless UA contains `HeadlessChrome`,
which Akamai detects).
- **Session 0 has no GPU**: Running as a Windows Service (no user logged in) means
Chromium does software rendering -> CPU 100%, slow. Mitigation, not a cure: smaller
viewport (1366x768, ~50% fewer pixels than 1920x1080; must stay >~1100px or BT shows
a "redirect to store" splash) + overhead flags (`--disable-gpu` etc.). The real fix
for speed would be running outside Session 0, with a GPU.
- **No fixed `time.sleep()` for page state**: on the slow prod machine, hardcoded
sleeps expire before the page is ready. Always wait on a selector/condition (`wait_for_selector`) with a generous timeout.
- **`networkidle` never fires**: BT pages poll trackers continuously. Use
`domcontentloaded` + explicit selector waits.
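A minimal sketch of how the headless and viewport lessons above could be captured as launch settings. The function names are hypothetical and not taken from the repository; only `--headless=new`, `--disable-gpu`, the 1366x768 viewport, and the `HeadlessChrome` marker come from the notes above.

```python
from typing import Any

# Viewport from the notes above: smaller than 1920x1080, wider than ~1100px.
VIEWPORT = {"width": 1366, "height": 768}


def clean_user_agent(user_agent: str) -> str:
    """Replace the HeadlessChrome marker that the WAF detects."""
    return user_agent.replace("HeadlessChrome", "Chrome")


def build_launch_options() -> dict[str, Any]:
    """Keyword arguments intended for chromium.launch()."""
    return {
        # Per the notes above: headless=False plus the new-headless flag,
        # instead of plain headless=True.
        "headless": False,
        "args": ["--headless=new", "--disable-gpu"],
    }


def build_context_options(raw_user_agent: str) -> dict[str, Any]:
    """Keyword arguments intended for browser.new_context()."""
    return {
        "viewport": VIEWPORT,
        "user_agent": clean_user_agent(raw_user_agent),
    }
```

With Playwright installed, these would be passed as `p.chromium.launch(**build_launch_options())` and `browser.new_context(**build_context_options(ua))`. Other overhead flags the project uses are not listed here.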
## Coding Guidelines
**NO EMOJIS**: Do not use emojis in user-facing messages (Telegram, email, notifications). Use plain text only.
## Running the Scraper
```powershell
# Windows setup
deployment\windows\scripts\setup_dev.ps1
# Run scraper (browser visible)
deployment\windows\scripts\run_scraper.ps1
# Run Telegram bot (manual)
deployment\windows\scripts\run_telegram_bot_manual.ps1
# Interactive menu (all options)
deployment\windows\scripts\menu.ps1
```
## Architecture
### Core Flow
```
login() → handle_2fa_wait() → read_accounts() → download_transactions() → save_results()
```
### Critical Implementation Details
#### 1. Login with Popup (lines ~140-200)
- Main page → Click "LOGIN" → **Opens popup window**
- Use `page.expect_popup()` to capture popup reference
- All operations happen in `self.login_page` (the popup)
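The popup capture described above can be sketched with the Playwright sync API as follows. The function name `open_login_popup` and the 60-second timeout are assumptions for illustration; the actual code in the scraper may differ.

```python
LOGIN_URL = "https://go.bancatransilvania.ro/"


def open_login_popup(page, timeout_ms: int = 60_000):
    """Click the Login link and return the popup page it opens.

    `page` is expected to be a Playwright sync-API Page.
    """
    # networkidle never fires on BT pages, so stop at domcontentloaded.
    page.goto(LOGIN_URL, wait_until="domcontentloaded")
    # expect_popup() must wrap the click that triggers the new window.
    with page.expect_popup(timeout=timeout_ms) as popup_info:
        page.get_by_role("link", name="Login").click()
    popup = popup_info.value
    popup.wait_for_load_state("domcontentloaded")
    return popup
```

The returned page is what the scraper would store as `self.login_page`.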
#### 2. 2FA Auto-Detection (lines ~202-240)
- URL monitoring: `login.bancatransilvania.ro` → `goapp.bancatransilvania.ro`
- Poll for `#accountsBtn` element every 2 seconds
- After 2FA: `self.page = self.login_page` (switch page reference)
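A rough sketch of the detection loop described above, assuming a Playwright sync-API page object. The name `wait_for_2fa` and its parameters are hypothetical; `timeout_seconds` would presumably come from `TIMEOUT_2FA_SECONDS`.

```python
import time


def wait_for_2fa(login_page, timeout_seconds: float, poll_seconds: float = 2.0) -> bool:
    """Return True once the post-2FA dashboard is reachable, False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    accounts_btn = login_page.locator("#accountsBtn")
    while time.monotonic() < deadline:
        on_app_domain = "goapp.bancatransilvania.ro" in login_page.url
        # is_visible() returns immediately, so it is checked before is_enabled().
        if on_app_domain and accounts_btn.is_visible() and accounts_btn.is_enabled():
            return True
        # Pause between polls; this paces the loop, it does not assume page state.
        login_page.wait_for_timeout(poll_seconds * 1000)
    return False
```

On a `True` result the caller would switch references with `self.page = self.login_page`.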
#### 3. Account Reading (lines ~242-310)
- Elements: `fba-account-details-card`, `#accountsBtn`
- Each card: `h4` (name), `span.text-grayscale-label` (IBAN), `strong.sold` (balance)
- Balance format: `"7,223.26 RON"` → parse to float + currency
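The balance parsing step could look roughly like this. `Balance` and `parse_balance` are illustrative names, and the regular expression assumes the comma-thousands, dot-decimal format shown above followed by a three-letter currency code.

```python
import re
from typing import NamedTuple


class Balance(NamedTuple):
    amount: float
    currency: str


# Optional minus sign, digits with comma separators, optional decimals,
# whitespace, then a three-letter currency code.
_BALANCE_RE = re.compile(r"^\s*(-?[\d,]+(?:\.\d+)?)\s+([A-Z]{3})\s*$")


def parse_balance(text: str) -> Balance:
    """Parse a string such as "7,223.26 RON" into (amount, currency)."""
    match = _BALANCE_RE.match(text)
    if match is None:
        raise ValueError(f"Unrecognized balance format: {text!r}")
    amount = float(match.group(1).replace(",", ""))
    return Balance(amount, match.group(2))
```

For example, `parse_balance("7,223.26 RON")` yields an amount of `7223.26` and the currency `"RON"`. Anything that does not match raises `ValueError` rather than returning a guess.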
#### 4. Transaction Download (lines ~529-732)
**Flow (2024+ version):**
```
Account 1: Expand card -> Click tranzactii icon -> Select "CSV" -> "Genereaza" -> Download from fba-document-item
Account 2+: Click #selectAccountBtn -> Select by heading name -> "Genereaza" -> Download
```
**Key methods:**
- `_download_first_account()`: Handles first account (expand + select CSV format)
- `_download_subsequent_account()`: Handles accounts 2+ (dropdown selection)
- `_wait_and_download()`: Waits for fba-document-item and downloads
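As an illustration of the wait-then-download step, here is a sketch using Playwright's `expect_download()`. It is written as a free function with assumed parameters (`dest_dir`, a 120-second timeout) and is not the repository's `_wait_and_download()` itself.

```python
from pathlib import Path


def wait_and_download(page, dest_dir: Path, timeout_ms: int = 120_000) -> Path:
    """Wait for the generated document, click it, and save the file."""
    item = page.locator("fba-document-item").first
    # Document generation can be slow: wait on the element, not a fixed sleep.
    item.wait_for(state="visible", timeout=timeout_ms)
    # expect_download() must wrap the click that starts the download.
    with page.expect_download(timeout=timeout_ms) as download_info:
        item.locator("svg").first.click()
    download = download_info.value
    target = dest_dir / download.suggested_filename
    download.save_as(target)
    return target
```

Saving under `suggested_filename` keeps the name the bank assigns. A caller that needs per-account file names would build `target` differently.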
**Account selection strategies (in order):**
1. `get_by_role("heading", name=account_name)`
2. `locator("fba-account-details").filter(has_text=account_name)`
3. `get_by_text(account_name, exact=True)`
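The ordered fallback above might be expressed like this. `select_account` is a hypothetical name, and the sketch assumes the dropdown is already open and rendered, since `count()` does not wait for elements to appear.

```python
def select_account(page, account_name: str) -> bool:
    """Click the first matching dropdown entry; return False if none match."""
    # Each strategy is built lazily so later ones run only when needed.
    strategies = [
        lambda: page.get_by_role("heading", name=account_name),
        lambda: page.locator("fba-account-details").filter(has_text=account_name),
        lambda: page.get_by_text(account_name, exact=True),
    ]
    for build_locator in strategies:
        locator = build_locator()
        if locator.count() > 0:
            locator.first.click()
            return True
    return False
```

Returning `False` instead of raising lets the caller decide whether to capture a debug screenshot or retry after reopening `#selectAccountBtn`.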
### Key Selectors
**Cookie consent:**
- New (2024+): `get_by_role("button", name="Accept toate")`
- One-time consent: `get_by_text("Am inteles")`
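A small sketch of dismissing whichever consent prompts happen to be present, using the two selectors above. The helper name `dismiss_consent` is an assumption, and the check is non-blocking: a prompt that has not rendered yet is simply skipped.

```python
def dismiss_consent(page) -> list[str]:
    """Click any visible consent prompt and return the labels that were clicked."""
    clicked = []
    candidates = [
        ("Accept toate", page.get_by_role("button", name="Accept toate")),
        ("Am inteles", page.get_by_text("Am inteles")),
    ]
    for label, locator in candidates:
        # count() and is_visible() both return immediately.
        if locator.count() > 0 and locator.first.is_visible():
            locator.first.click()
            clicked.append(label)
    return clicked
```

Because the check does not wait, it is cheap to call again before login and again after 2FA, where the "Am inteles" dialog can block progress.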
**Login page:**
- URL: `https://go.bancatransilvania.ro/`
- Login link: `get_by_role("link", name="Login")`
- Username: `get_by_placeholder("ID logare")` (intelligent fallback in `_find_username_field`)
- Password: `get_by_placeholder("Parola")` or `input[type='password']`
- Submit: `get_by_role("button", name="Autentifica-te")` (intelligent fallback in `_find_submit_button`)
**Post-login:**
- 2FA success indicator: `#accountsBtn` visible and enabled
- Domain: `goapp.bancatransilvania.ro`
**Accounts:**
- Cards: `fba-account-details-card`
- Expand icon: `.mx-auto .mat-icon svg`, `.collapse-account-btn`
- Transactions button: `fba-account-buttons svg`, `.account-transactions-btn`
**Transaction download:**
- Account selector: `#selectAccountBtn svg`
- Account in dropdown: `get_by_role("heading", name=account_name)`
- CSV format: `get_by_text("CSV", exact=True)`
- Generate button: `get_by_role("button", name="Genereaza")`
- Download item: `fba-document-item svg`, `fba-document-item path`
**Update selectors:** `playwright codegen https://go.bancatransilvania.ro --target python`
## Docker Limitation
**Docker DOES NOT WORK** - Akamai WAF blocks with "Access Denied".
**Symptoms:**
- "Access Denied" page
- 0 links detected
- Screenshot: `data/debug_before_login_*.png` shows error
**Solution:** Use Windows local mode with `HEADLESS=false`.
## Common Issues
### Access Denied (Docker)
- **Cause**: WAF blocking
- **Fix**: Run locally with `HEADLESS=false`
### Transaction Download Timeout
- Check `fba-document-item` selector (wait for document generation)
- Verify `#selectAccountBtn` for account dropdown
- Account selection: verify heading name matches exactly
### 2FA Timeout
- Increase `TIMEOUT_2FA_SECONDS` in `.env`
- Verify URL redirect to `goapp.bancatransilvania.ro`
- Check for "Am inteles" consent dialog blocking
### Account Selection Failed
- Account name might have changed - verify exact match
- Try running `playwright codegen` to see current UI structure
- Check if dropdown opened (`#selectAccountBtn`)
## Exit Codes
- `0`: Success
- `1`: General error
- `4`: Config error (.env invalid)
- `99`: Unexpected error
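One way to keep these codes consistent is an enum plus a thin wrapper around the entry point. The exception classes `ConfigError` and `ScraperError`, and the choice of which failures map to `1` versus `99`, are assumptions for illustration only.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    GENERAL_ERROR = 1
    CONFIG_ERROR = 4
    UNEXPECTED_ERROR = 99


class ConfigError(Exception):
    """Hypothetical: raised when .env is missing or invalid."""


class ScraperError(Exception):
    """Hypothetical: raised for anticipated scraper failures."""


def run_with_exit_code(main) -> ExitCode:
    """Run main() and translate its outcome into a documented exit code."""
    try:
        main()
    except ConfigError:
        return ExitCode.CONFIG_ERROR
    except ScraperError:
        return ExitCode.GENERAL_ERROR
    except Exception:
        # Anything not anticipated above is reported as unexpected.
        return ExitCode.UNEXPECTED_ERROR
    return ExitCode.SUCCESS
```

A script entry point would then end with `sys.exit(run_with_exit_code(main))`, so wrapper scripts such as the PowerShell launchers can branch on the code.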