Update dashboard, memory, root +2 more (+3 ~5)

This commit is contained in:
Echo
2026-02-02 16:21:41 +00:00
parent 2e8d47353b
commit 84701a062e
2212 changed files with 2938184 additions and 37 deletions

1
.gitignore vendored
View File

@@ -2,6 +2,7 @@
.env .env
*.secret *.secret
*_secret* *_secret*
credentials/
# Temporary files # Temporary files
*.tmp *.tmp

View File

@@ -1,5 +1,30 @@
# HEARTBEAT.md # HEARTBEAT.md
## Calendar Alert (<2h) - PRIORITATE!
La fiecare heartbeat, verifică dacă are eveniment în următoarele 2 ore:
```bash
cd ~/clawd && source venv/bin/activate && python3 -c "
from tools.calendar_check import get_service, TZ
from datetime import datetime, timedelta
service = get_service()
now = datetime.now(TZ)
soon = now + timedelta(hours=2)
events = service.events().list(
calendarId='primary',
timeMin=now.isoformat(),
timeMax=soon.isoformat(),
singleEvents=True
).execute().get('items', [])
for e in events:
start = e['start'].get('dateTime', e['start'].get('date'))
print(f'{start}: {e.get(\"summary\", \"(fără titlu)\")}')
"
```
Dacă găsești ceva → trimite IMEDIAT pe Discord #echo (canalul curent):
> ⚠️ **În [X] ai [EVENIMENT]!**
## Verificări periodice ## Verificări periodice
### 🔄 Mentenanță echipă (1x pe zi, dimineața) ### 🔄 Mentenanță echipă (1x pe zi, dimineața)

View File

@@ -97,6 +97,52 @@ memory_get path="memory/file.md" from=1 lines=50
- **Script:** `python3 tools/anaf-monitor/monitor_v2.py` - **Script:** `python3 tools/anaf-monitor/monitor_v2.py`
- **Monitorizează:** D100, D101, D200, D390, D406, situații financiare, E-Factura - **Monitorizează:** D100, D101, D200, D390, D406, situații financiare, E-Factura
### Google Calendar
- **Credentials:** `credentials/google-calendar.json` (OAuth client)
- **Token:** `credentials/google-calendar-token.json` (auto-refresh)
- **Acces:** Read + Write (`calendar.events`)
- **Venv:** `~/clawd/venv/` (activate cu `source venv/bin/activate`)
**Script:** `python3 tools/calendar_check.py [mode]`
- `today` - evenimente azi + mâine
- `week` - evenimente săptămâna curentă
- `travel` - reminder-uri pentru NLP/București (bilete + cazare)
- `busy` - verifică dacă e ocupat ACUM (pentru "nu deranja")
- `soon [ore]` - evenimente în următoarele N ore (default 2)
- `all` - toate cele de mai sus
**Travel detection:**
- Keywords: `nlp`, `bucuresti`, `bucurești`, `bucharest`
- 7-11 zile înainte → "Cumpără bilete + asigură cazare"
- 3-6 zile înainte → "⚠️ URGENT: Verifică dacă ai bilete!"
**Creare evenimente (Python):**
```python
from tools.calendar_check import create_event
create_event(
summary='Titlu',
start_datetime='2026-02-05T15:00:00',
duration_minutes=60,
description='Opțional',
reminders=[60, 15], # minute înainte (default: [30])
is_travel=False # True = reminders seara înainte + 2h
)
```
**Reminders automate:**
- Evenimente normale: 30 min înainte
- Evenimente travel (`is_travel=True`): seara înainte (18:00) + 2h înainte
- Custom: `reminders=[120, 30]` = 2h + 30min
**Integrat în:**
- `morning-report` - azi/mâine/peste 2 zile + travel reminders
- `evening-report` - mâine (reminder seară)
- `weekly-planning-sun` - săptămâna următoare + travel
- `respiratie-orar` - skip dacă busy în calendar
- `heartbeat` - alertă dacă eveniment în <2h
**Revocă accesul:** <https://myaccount.google.com/permissions>
### Pauze Respirație (Pattern Interrupt) ### Pauze Respirație (Pattern Interrupt)
- **Bancă tehnici:** `memory/kb/tehnici-pauza.md` - **Bancă tehnici:** `memory/kb/tehnici-pauza.md`
- **Script:** `python3 tools/pauza_random.py` - alege random, afișează cu context - **Script:** `python3 tools/pauza_random.py` - alege random, afișează cu context
@@ -134,10 +180,8 @@ memory_get path="memory/file.md" from=1 lines=50
| 06:00,17:00 | 08:00,19:00 | insights-extract | - | Extrage insights din memory/kb/ + actualizează tehnici-pauza.md | | 06:00,17:00 | 08:00,19:00 | insights-extract | - | Extrage insights din memory/kb/ + actualizează tehnici-pauza.md |
| 06:30 | 08:30 | morning-report | 📧 EMAIL | Raport dimineață - vezi [FLUX-JOBURI.md](memory/kb/projects/FLUX-JOBURI.md) | | 06:30 | 08:30 | morning-report | 📧 EMAIL | Raport dimineață - vezi [FLUX-JOBURI.md](memory/kb/projects/FLUX-JOBURI.md) |
| 07:00 | 09:00 | morning-coaching | #echo-self + 📧 | Gând + provocare → memory/kb/coaching/ | | 07:00 | 09:00 | morning-coaching | #echo-self + 📧 | Gând + provocare → memory/kb/coaching/ |
| 07-17 | 09-19 | respiratie-orar | #echo-self | Pauze orare: tehnică + rezultat + sursă (din tehnici-pauza.md) | | 07-17 | 09-19 | respiratie-orar | #echo-self | Pauze orare (skip dacă busy în calendar) |
| 15:00 mar,joi | 17:00 | project-checkin | #echo-work | Check-in Vending Master | | 15:00 mar,joi | 17:00 | project-checkin | #echo-work | Check-in Vending Master |
| 15:00 3/feb | 17:00 | grup-sprijin-pregatire | #echo-sprijin | Pregătire fișă grup joi |
| 15:00 5/feb | 17:00 | grup-sprijin-5feb | #echo-sprijin | Reminder grup sprijin |
| 18:00 | 20:00 | evening-report | 📧 EMAIL | Raport seară - vezi [FLUX-JOBURI.md](memory/kb/projects/FLUX-JOBURI.md) | | 18:00 | 20:00 | evening-report | 📧 EMAIL | Raport seară - vezi [FLUX-JOBURI.md](memory/kb/projects/FLUX-JOBURI.md) |
| 19:00 | 21:00 | evening-coaching | #echo-self + 📧 | Reflecție seară → memory/kb/coaching/ | | 19:00 | 21:00 | evening-coaching | #echo-self + 📧 | Reflecție seară → memory/kb/coaching/ |
| 20:00 | 22:00 | seara-merit-reminder | #echo-self | Reminder lista "10 lucruri pentru care merit respect" | | 20:00 | 22:00 | seara-merit-reminder | #echo-self | Reminder lista "10 lucruri pentru care merit respect" |

View File

@@ -197,15 +197,15 @@
} }
.status-section-subtitle { .status-section-subtitle {
font-size: var(--text-xs); font-size: 13px;
color: var(--text-muted); color: var(--text-secondary);
margin-top: 2px; margin-top: 2px;
} }
.status-badge { .status-badge {
padding: 2px 8px; padding: 3px 10px;
border-radius: var(--radius-sm); border-radius: var(--radius-sm);
font-size: var(--text-xs); font-size: 12px;
font-weight: 600; font-weight: 600;
} }
@@ -248,15 +248,15 @@
display: flex; display: flex;
align-items: center; align-items: center;
gap: var(--space-2); gap: var(--space-2);
font-size: var(--text-xs); font-size: 13px;
color: var(--text-secondary); color: var(--text-primary);
padding: var(--space-1) 0; padding: 2px 0;
} }
.status-detail-item svg { .status-detail-item svg {
width: 12px; width: 14px;
height: 12px; height: 14px;
color: var(--text-muted); color: var(--text-secondary);
} }
.status-detail-item.uncommitted { .status-detail-item.uncommitted {
@@ -266,46 +266,71 @@
.status-detail-item code { .status-detail-item code {
font-family: monospace; font-family: monospace;
background: var(--bg-elevated); background: var(--bg-elevated);
padding: 1px 4px; padding: 2px 6px;
border-radius: 2px; border-radius: 3px;
font-size: 11px; font-size: 12px;
}
/* Cron items - desktop compact grid */
.cron-list {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 2px var(--space-4);
}
@media (max-width: 900px) {
.cron-list {
grid-template-columns: 1fr;
}
} }
/* Cron items */
.cron-item { .cron-item {
display: flex; display: flex;
align-items: center; align-items: center;
gap: var(--space-2); gap: var(--space-2);
font-size: var(--text-xs); font-size: var(--text-sm);
padding: var(--space-1) 0; padding: 2px 0;
} }
.cron-item.done { .cron-item.done {
color: var(--text-muted); opacity: 0.5;
}
.cron-item.done .cron-name {
text-decoration: line-through;
} }
.cron-item.pending { .cron-item.pending {
color: var(--text-secondary); color: var(--text-primary);
} }
.cron-time { .cron-time {
font-family: monospace; font-family: monospace;
min-width: 45px; min-width: 50px;
font-size: 13px;
color: var(--text-primary);
}
.cron-date {
color: #9ca3af;
font-size: 12px;
margin-left: 4px;
} }
.cron-icon { .cron-icon {
width: 14px; width: 14px;
height: 14px; height: 14px;
flex-shrink: 0;
} }
.cron-icon.done { color: #22c55e; } .cron-icon.done { color: #22c55e; }
.cron-icon.pending { color: var(--text-muted); } .cron-icon.pending { color: #9ca3af; }
.cron-icon.failed { color: #ef4444; } .cron-icon.failed { color: #ef4444; }
.cron-name {
flex: 1;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
color: var(--text-primary);
}
/* Two-column dashboard */ /* Two-column dashboard */
.dashboard-grid { .dashboard-grid {
display: grid; display: grid;
@@ -1480,7 +1505,7 @@
let html = ` let html = `
<div class="status-detail-item"> <div class="status-detail-item">
<i data-lucide="git-commit"></i> <i data-lucide="git-commit"></i>
<span><a href="${GITEA_URL}/commit/${git.lastCommit.hash}" target="_blank" style="color:var(--accent)">${git.lastCommit.hash}</a> ${git.lastCommit.message.substring(0, 40)}${git.lastCommit.message.length > 40 ? '...' : ''} <small>(${git.lastCommit.time})</small></span> <span><a href="${GITEA_URL}/commit/${git.lastCommit.hash}" target="_blank" style="color:var(--accent)">${git.lastCommit.hash}</a> ${git.lastCommit.message.substring(0, 40)}${git.lastCommit.message.length > 40 ? '...' : ''} (${git.lastCommit.time})</span>
</div> </div>
`; `;
@@ -1490,13 +1515,13 @@
const more = git.uncommittedCount > 3 ? ` +${git.uncommittedCount - 3}` : ''; const more = git.uncommittedCount > 3 ? ` +${git.uncommittedCount - 3}` : '';
html += `<div class="status-detail-item uncommitted"> html += `<div class="status-detail-item uncommitted">
<i data-lucide="alert-circle"></i> <i data-lucide="alert-circle"></i>
<span><a href="files.html?git=1" style="color:var(--warning)"><strong>${git.uncommittedCount}</strong> necomise</a>: <small>${files}${more}</small></span> <span><a href="files.html?git=1" style="color:var(--warning)"><strong>${git.uncommittedCount}</strong> necomise</a>: ${files}${more}</span>
</div>`; </div>`;
} }
html += `<div class="status-detail-item"> html += `<div class="status-detail-item">
<i data-lucide="external-link"></i> <i data-lucide="external-link"></i>
<a href="${GITEA_URL}" target="_blank" style="color:var(--accent);font-size:var(--text-xs)">gitea.romfast.ro/romfast/clawd</a> <a href="${GITEA_URL}" target="_blank" style="color:var(--accent)">gitea.romfast.ro/romfast/clawd</a>
</div>`; </div>`;
details.innerHTML = html; details.innerHTML = html;
@@ -1577,9 +1602,9 @@
return (a.nextRunAtMs || 0) - (b.nextRunAtMs || 0); return (a.nextRunAtMs || 0) - (b.nextRunAtMs || 0);
}); });
// Update details // Update details - compact grid layout
const details = document.getElementById('cronDetails'); const details = document.getElementById('cronDetails');
details.innerHTML = jobs.map(job => { const jobsHtml = jobs.map(job => {
const done = job.ranToday; const done = job.ranToday;
const failed = job.lastStatus === 'error'; const failed = job.lastStatus === 'error';
const statusClass = failed ? 'failed' : (done ? 'done' : 'pending'); const statusClass = failed ? 'failed' : (done ? 'done' : 'pending');
@@ -1588,17 +1613,17 @@
// Check if job is today // Check if job is today
const jobDate = new Date(job.nextRunAtMs || 0); const jobDate = new Date(job.nextRunAtMs || 0);
const isToday = jobDate.toDateString() === today; const isToday = jobDate.toDateString() === today;
const dateStr = isToday ? '' : jobDate.toLocaleDateString('ro-RO', { day: 'numeric', month: 'short' }); const dateStr = isToday ? '' : `<span class="cron-date">${jobDate.getDate()} ${jobDate.toLocaleDateString('ro-RO', { month: 'short' })}</span>`;
return ` return `
<div class="cron-item ${statusClass}"> <div class="cron-item ${statusClass}">
<i data-lucide="${icon}" class="cron-icon ${statusClass}"></i> <i data-lucide="${icon}" class="cron-icon ${statusClass}"></i>
<span class="cron-time">${job.time}${dateStr ? ` <span style="color:#6b7280;font-size:11px">(${dateStr})</span>` : ''}</span> <span class="cron-time">${job.time}${dateStr}</span>
<span class="cron-name">${job.name}</span> <span class="cron-name">${job.name}</span>
<span class="cron-agent" style="color: var(--text-muted); font-size: 0.75rem; margin-left: auto;">${job.agentId || ''}</span>
</div> </div>
`; `;
}).join(''); }).join('');
details.innerHTML = `<div class="cron-list">${jobsHtml}</div>`;
lucide.createIcons(); lucide.createIcons();
} catch (e) { } catch (e) {

View File

@@ -1,12 +1,12 @@
{ {
"lastChecks": { "lastChecks": {
"agents_sync": "2026-02-02", "agents_sync": "2026-02-02",
"email": 1770022820, "email": 1770044415,
"calendar": null, "calendar": null,
"git": 1769965200, "git": 1769965200,
"kb_index": 1770022820 "kb_index": 1770022820
}, },
"notes": { "notes": {
"2026-02-02": "09:00 UTC - Email OK (nimic nou), KB index OK. Cron jobs funcționale." "2026-02-02": "15:00 UTC - Email OK (nimic nou). Cron jobs funcționale toată ziua."
} }
} }

66
tools/calendar_auth.py Normal file
View File

@@ -0,0 +1,66 @@
#!/usr/bin/env python3
"""
Google Calendar OAuth2 Authorization.
Run this once to generate token.json for calendar access.
"""
import os
from pathlib import Path
from google_auth_oauthlib.flow import Flow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
# Scopes needed for calendar access (read + write events)
SCOPES = ['https://www.googleapis.com/auth/calendar.events']
CREDENTIALS_FILE = Path(__file__).parent.parent / 'credentials' / 'google-calendar.json'
TOKEN_FILE = Path(__file__).parent.parent / 'credentials' / 'google-calendar-token.json'
def main():
creds = None
# Check if token already exists
if TOKEN_FILE.exists():
creds = Credentials.from_authorized_user_file(str(TOKEN_FILE), SCOPES)
# If no valid credentials, do the OAuth flow
if not creds or not creds.valid:
if creds and creds.expired and creds.refresh_token:
print("Refreshing expired token...")
creds.refresh(Request())
else:
print(f"Starting OAuth flow...")
print(f"Using credentials: {CREDENTIALS_FILE}\n")
flow = Flow.from_client_secrets_file(
str(CREDENTIALS_FILE),
scopes=SCOPES,
redirect_uri='urn:ietf:wg:oauth:2.0:oob'
)
auth_url, _ = flow.authorization_url(prompt='consent')
print("="*60)
print("AUTHORIZATION REQUIRED")
print("="*60)
print("\n1. Open this URL in your browser:\n")
print(auth_url)
print("\n2. Sign in and authorize access")
print("3. Copy the authorization code and paste it below\n")
code = input("Enter authorization code: ").strip()
flow.fetch_token(code=code)
creds = flow.credentials
# Save the credentials for next run
TOKEN_FILE.parent.mkdir(parents=True, exist_ok=True)
with open(TOKEN_FILE, 'w') as token:
token.write(creds.to_json())
print(f"\nToken saved to: {TOKEN_FILE}")
print("\n✅ Authorization successful!")
print("You can now use the calendar tools.")
if __name__ == '__main__':
main()

304
tools/calendar_check.py Normal file
View File

@@ -0,0 +1,304 @@
#!/usr/bin/env python3
"""
Google Calendar checker for Echo.
Returns events for today, tomorrow, this week, or upcoming travel needs.
"""
import sys
import json
from pathlib import Path
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
TOKEN_FILE = Path(__file__).parent.parent / 'credentials' / 'google-calendar-token.json'
TZ = ZoneInfo('Europe/Bucharest')
# Keywords that indicate travel to București (needs train + accommodation)
TRAVEL_KEYWORDS = ['nlp', 'bucuresti', 'bucurești', 'bucharest']
def get_service():
"""Get authenticated Calendar service."""
creds = Credentials.from_authorized_user_file(str(TOKEN_FILE))
return build('calendar', 'v3', credentials=creds)
def get_events(service, time_min, time_max, max_results=20):
"""Get events between time_min and time_max."""
events_result = service.events().list(
calendarId='primary',
timeMin=time_min.isoformat(),
timeMax=time_max.isoformat(),
maxResults=max_results,
singleEvents=True,
orderBy='startTime'
).execute()
return events_result.get('items', [])
def format_event(event):
"""Format event for display."""
start = event['start'].get('dateTime', event['start'].get('date'))
summary = event.get('summary', '(fără titlu)')
# Parse start time
if 'T' in start:
dt = datetime.fromisoformat(start)
time_str = dt.strftime('%H:%M')
date_str = dt.strftime('%d %b')
else:
time_str = 'toată ziua'
date_str = datetime.fromisoformat(start).strftime('%d %b')
return {
'summary': summary,
'date': date_str,
'time': time_str,
'start': start,
'is_travel': any(kw in summary.lower() for kw in TRAVEL_KEYWORDS)
}
def check_today_tomorrow():
"""Get events for today and tomorrow."""
service = get_service()
now = datetime.now(TZ)
# Today
today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
today_end = today_start + timedelta(days=1)
# Tomorrow
tomorrow_end = today_end + timedelta(days=1)
today_events = get_events(service, today_start, today_end)
tomorrow_events = get_events(service, today_end, tomorrow_end)
result = {
'today': [format_event(e) for e in today_events],
'tomorrow': [format_event(e) for e in tomorrow_events]
}
return result
def check_week():
"""Get events for this week (Mon-Sun)."""
service = get_service()
now = datetime.now(TZ)
# Start of this week (Monday)
days_since_monday = now.weekday()
week_start = now.replace(hour=0, minute=0, second=0, microsecond=0) - timedelta(days=days_since_monday)
week_end = week_start + timedelta(days=7)
events = get_events(service, week_start, week_end)
return {
'week_start': week_start.strftime('%d %b'),
'week_end': (week_end - timedelta(days=1)).strftime('%d %b'),
'events': [format_event(e) for e in events]
}
def check_travel_upcoming():
"""Check for travel events in next 14 days that need booking (7-11 days out)."""
service = get_service()
now = datetime.now(TZ)
# Look 14 days ahead
future = now + timedelta(days=14)
events = get_events(service, now, future)
reminders = []
for event in events:
formatted = format_event(event)
if formatted['is_travel']:
# Calculate days until event
start_str = event['start'].get('dateTime', event['start'].get('date'))
if 'T' in start_str:
event_date = datetime.fromisoformat(start_str).date()
else:
event_date = datetime.fromisoformat(start_str).date()
days_until = (event_date - now.date()).days
# Remind if 7-11 days away (booking window)
if 7 <= days_until <= 11:
reminders.append({
**formatted,
'days_until': days_until,
'action': 'Cumpără bilete tren + asigură cazare București'
})
# Urgent if 3-6 days away and might have missed window
elif 3 <= days_until <= 6:
reminders.append({
**formatted,
'days_until': days_until,
'action': '⚠️ URGENT: Verifică dacă ai bilete și cazare!'
})
return {'travel_reminders': reminders}
def is_busy_now():
"""Check if there's an event happening RIGHT NOW."""
service = get_service()
now = datetime.now(TZ)
# Check events that started before now and end after now
events_result = service.events().list(
calendarId='primary',
timeMin=(now - timedelta(hours=4)).isoformat(),
timeMax=(now + timedelta(minutes=30)).isoformat(),
singleEvents=True,
orderBy='startTime'
).execute()
for event in events_result.get('items', []):
start_str = event['start'].get('dateTime', event['start'].get('date'))
end_str = event['end'].get('dateTime', event['end'].get('date'))
# Skip all-day events for "busy now" check
if 'T' not in start_str:
continue
start = datetime.fromisoformat(start_str)
end = datetime.fromisoformat(end_str)
if start <= now <= end:
return {
'busy': True,
'event': event.get('summary', '(fără titlu)'),
'ends': end.strftime('%H:%M')
}
return {'busy': False}
def check_upcoming_hours(hours=2):
"""Check for events in the next N hours."""
service = get_service()
now = datetime.now(TZ)
future = now + timedelta(hours=hours)
events = get_events(service, now, future)
alerts = []
for event in events:
start_str = event['start'].get('dateTime', event['start'].get('date'))
summary = event.get('summary', '(fără titlu)')
if 'T' in start_str:
start = datetime.fromisoformat(start_str)
minutes_until = int((start - now).total_seconds() / 60)
if minutes_until > 0:
alerts.append({
'summary': summary,
'minutes_until': minutes_until,
'time': start.strftime('%H:%M')
})
return {'upcoming': alerts}
def main():
if len(sys.argv) < 2:
print("Usage: calendar_check.py [today|week|travel|busy|soon|all]")
sys.exit(1)
mode = sys.argv[1].lower()
if mode == 'today':
result = check_today_tomorrow()
elif mode == 'week':
result = check_week()
elif mode == 'travel':
result = check_travel_upcoming()
elif mode == 'busy':
result = is_busy_now()
elif mode == 'soon':
hours = int(sys.argv[2]) if len(sys.argv) > 2 else 2
result = check_upcoming_hours(hours)
elif mode == 'all':
result = {
**check_today_tomorrow(),
**check_week(),
**check_travel_upcoming()
}
else:
print(f"Unknown mode: {mode}")
sys.exit(1)
print(json.dumps(result, ensure_ascii=False, indent=2))
if __name__ == '__main__':
main()
def create_event(summary, start_datetime, duration_minutes=60, description=None,
reminders=None, is_travel=False):
"""
Create a calendar event with reminders.
Args:
summary: Event title
start_datetime: datetime object or ISO string (e.g., "2026-02-05T15:00:00")
duration_minutes: Duration in minutes (default 60)
description: Optional description
reminders: List of minutes before event for reminders, or None for defaults
e.g., [120, 30] = 2 hours and 30 min before
is_travel: If True, uses travel reminders (evening before + 2h before)
Returns:
Created event details
"""
service = get_service()
# Parse start time if string
if isinstance(start_datetime, str):
start = datetime.fromisoformat(start_datetime)
else:
start = start_datetime
# Ensure timezone
if start.tzinfo is None:
start = start.replace(tzinfo=TZ)
end = start + timedelta(minutes=duration_minutes)
# Set up reminders
if reminders is None:
if is_travel:
# Travel: evening before (calculate minutes to 18:00 day before) + 2h before
evening_before = start.replace(hour=18, minute=0, second=0) - timedelta(days=1)
minutes_to_evening = int((start - evening_before).total_seconds() / 60)
reminders = [minutes_to_evening, 120] # Evening before + 2 hours
else:
# Default: 30 min before
reminders = [30]
event = {
'summary': summary,
'start': {
'dateTime': start.isoformat(),
'timeZone': 'Europe/Bucharest',
},
'end': {
'dateTime': end.isoformat(),
'timeZone': 'Europe/Bucharest',
},
'reminders': {
'useDefault': False,
'overrides': [
{'method': 'popup', 'minutes': m} for m in reminders
],
},
}
if description:
event['description'] = description
created = service.events().insert(calendarId='primary', body=event).execute()
return {
'id': created['id'],
'summary': created['summary'],
'start': created['start'].get('dateTime'),
'link': created.get('htmlLink'),
'reminders': reminders
}

247
venv/bin/Activate.ps1 Normal file
View File

@@ -0,0 +1,247 @@
<#
.Synopsis
Activate a Python virtual environment for the current PowerShell session.
.Description
Pushes the python executable for a virtual environment to the front of the
$Env:PATH environment variable and sets the prompt to signify that you are
in a Python virtual environment. Makes use of the command line switches as
well as the `pyvenv.cfg` file values present in the virtual environment.
.Parameter VenvDir
Path to the directory that contains the virtual environment to activate. The
default value for this is the parent of the directory that the Activate.ps1
script is located within.
.Parameter Prompt
The prompt prefix to display when this virtual environment is activated. By
default, this prompt is the name of the virtual environment folder (VenvDir)
surrounded by parentheses and followed by a single space (ie. '(.venv) ').
.Example
Activate.ps1
Activates the Python virtual environment that contains the Activate.ps1 script.
.Example
Activate.ps1 -Verbose
Activates the Python virtual environment that contains the Activate.ps1 script,
and shows extra information about the activation as it executes.
.Example
Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
Activates the Python virtual environment located in the specified location.
.Example
Activate.ps1 -Prompt "MyPython"
Activates the Python virtual environment that contains the Activate.ps1 script,
and prefixes the current prompt with the specified string (surrounded in
parentheses) while the virtual environment is active.
.Notes
On Windows, it may be required to enable this Activate.ps1 script by setting the
execution policy for the user. You can do this by issuing the following PowerShell
command:
PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
For more information on Execution Policies:
https://go.microsoft.com/fwlink/?LinkID=135170
#>
Param(
[Parameter(Mandatory = $false)]
[String]
$VenvDir,
[Parameter(Mandatory = $false)]
[String]
$Prompt
)
<# Function declarations --------------------------------------------------- #>
<#
.Synopsis
Remove all shell session elements added by the Activate script, including the
addition of the virtual environment's Python executable from the beginning of
the PATH variable.
.Parameter NonDestructive
If present, do not remove this function from the global namespace for the
session.
#>
function global:deactivate ([switch]$NonDestructive) {
# Revert to original values
# The prior prompt:
if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
}
# The prior PYTHONHOME:
if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
}
# The prior PATH:
if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
Remove-Item -Path Env:_OLD_VIRTUAL_PATH
}
# Just remove the VIRTUAL_ENV altogether:
if (Test-Path -Path Env:VIRTUAL_ENV) {
Remove-Item -Path env:VIRTUAL_ENV
}
# Just remove VIRTUAL_ENV_PROMPT altogether.
if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
Remove-Item -Path env:VIRTUAL_ENV_PROMPT
}
# Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
}
# Leave deactivate function in the global namespace if requested:
if (-not $NonDestructive) {
Remove-Item -Path function:deactivate
}
}
<#
.Description
Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
given folder, and returns them in a map.
For each line in the pyvenv.cfg file, if that line can be parsed into exactly
two strings separated by `=` (with any amount of whitespace surrounding the =)
then it is considered a `key = value` line. The left hand string is the key,
the right hand is the value.
If the value starts with a `'` or a `"` then the first and last character is
stripped from the value before being captured.
.Parameter ConfigDir
Path to the directory that contains the `pyvenv.cfg` file.
#>
function Get-PyVenvConfig(
[String]
$ConfigDir
) {
Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"
# Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
$pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue
# An empty map will be returned if no config file is found.
$pyvenvConfig = @{ }
if ($pyvenvConfigPath) {
Write-Verbose "File exists, parse `key = value` lines"
$pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath
$pyvenvConfigContent | ForEach-Object {
$keyval = $PSItem -split "\s*=\s*", 2
if ($keyval[0] -and $keyval[1]) {
$val = $keyval[1]
# Remove extraneous quotations around a string value.
if ("'""".Contains($val.Substring(0, 1))) {
$val = $val.Substring(1, $val.Length - 2)
}
$pyvenvConfig[$keyval[0]] = $val
Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
}
}
}
return $pyvenvConfig
}
<# Begin Activate script --------------------------------------------------- #>
# Determine the containing directory of this script
$VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
$VenvExecDir = Get-Item -Path $VenvExecPath
Write-Verbose "Activation script is located in path: '$VenvExecPath'"
Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"
# Set values required in priority: CmdLine, ConfigFile, Default
# First, get the location of the virtual environment, it might not be
# VenvExecDir if specified on the command line.
if ($VenvDir) {
Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
}
else {
Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
$VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
Write-Verbose "VenvDir=$VenvDir"
}
# Next, read the `pyvenv.cfg` file to determine any required value such
# as `prompt`.
$pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir
# Next, set the prompt from the command line, or the config file, or
# just use the name of the virtual environment folder.
if ($Prompt) {
Write-Verbose "Prompt specified as argument, using '$Prompt'"
}
else {
Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
Write-Verbose " Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
$Prompt = $pyvenvCfg['prompt'];
}
else {
Write-Verbose " Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
Write-Verbose " Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
$Prompt = Split-Path -Path $venvDir -Leaf
}
}
Write-Verbose "Prompt = '$Prompt'"
Write-Verbose "VenvDir='$VenvDir'"
# Deactivate any currently active virtual environment, but leave the
# deactivate function in place.
deactivate -nondestructive
# Now set the environment variable VIRTUAL_ENV, used by many tools to determine
# that there is an activated venv.
$env:VIRTUAL_ENV = $VenvDir
if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {
Write-Verbose "Setting prompt to '$Prompt'"
# Set the prompt to include the env name
# Make sure _OLD_VIRTUAL_PROMPT is global
function global:_OLD_VIRTUAL_PROMPT { "" }
Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt
function global:prompt {
Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
_OLD_VIRTUAL_PROMPT
}
$env:VIRTUAL_ENV_PROMPT = $Prompt
}
# Clear PYTHONHOME
if (Test-Path -Path Env:PYTHONHOME) {
Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
Remove-Item -Path Env:PYTHONHOME
}
# Add the venv to the PATH
Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
$Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"

70
venv/bin/activate Normal file
View File

@@ -0,0 +1,70 @@
# This file must be used with "source bin/activate" *from bash*
# You cannot run it directly
deactivate () {
# reset old environment variables
if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
PATH="${_OLD_VIRTUAL_PATH:-}"
export PATH
unset _OLD_VIRTUAL_PATH
fi
if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
export PYTHONHOME
unset _OLD_VIRTUAL_PYTHONHOME
fi
# Call hash to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
hash -r 2> /dev/null
if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
PS1="${_OLD_VIRTUAL_PS1:-}"
export PS1
unset _OLD_VIRTUAL_PS1
fi
unset VIRTUAL_ENV
unset VIRTUAL_ENV_PROMPT
if [ ! "${1:-}" = "nondestructive" ] ; then
# Self destruct!
unset -f deactivate
fi
}
# unset irrelevant variables
deactivate nondestructive
# on Windows, a path can contain colons and backslashes and has to be converted:
if [ "${OSTYPE:-}" = "cygwin" ] || [ "${OSTYPE:-}" = "msys" ] ; then
# transform D:\path\to\venv to /d/path/to/venv on MSYS
# and to /cygdrive/d/path/to/venv on Cygwin
export VIRTUAL_ENV=$(cygpath /home/moltbot/clawd/venv)
else
# use the path as-is
export VIRTUAL_ENV=/home/moltbot/clawd/venv
fi
_OLD_VIRTUAL_PATH="$PATH"
PATH="$VIRTUAL_ENV/"bin":$PATH"
export PATH
# unset PYTHONHOME if set
# this will fail if PYTHONHOME is set to the empty string (which is bad anyway)
# could use `if (set -u; : $PYTHONHOME) ;` in bash
if [ -n "${PYTHONHOME:-}" ] ; then
_OLD_VIRTUAL_PYTHONHOME="${PYTHONHOME:-}"
unset PYTHONHOME
fi
if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
_OLD_VIRTUAL_PS1="${PS1:-}"
PS1='(venv) '"${PS1:-}"
export PS1
VIRTUAL_ENV_PROMPT='(venv) '
export VIRTUAL_ENV_PROMPT
fi
# Call hash to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
hash -r 2> /dev/null

27
venv/bin/activate.csh Normal file
View File

@@ -0,0 +1,27 @@
# This file must be used with "source bin/activate.csh" *from csh*.
# You cannot run it directly.
# Created by Davide Di Blasi <davidedb@gmail.com>.
# Ported to Python 3.3 venv by Andrew Svetlov <andrew.svetlov@gmail.com>
alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; unsetenv VIRTUAL_ENV_PROMPT; test "\!:*" != "nondestructive" && unalias deactivate'
# Unset irrelevant variables.
deactivate nondestructive
setenv VIRTUAL_ENV /home/moltbot/clawd/venv
set _OLD_VIRTUAL_PATH="$PATH"
setenv PATH "$VIRTUAL_ENV/"bin":$PATH"
set _OLD_VIRTUAL_PROMPT="$prompt"
if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
set prompt = '(venv) '"$prompt"
setenv VIRTUAL_ENV_PROMPT '(venv) '
endif
alias pydoc python -m pydoc
rehash

69
venv/bin/activate.fish Normal file
View File

@@ -0,0 +1,69 @@
# This file must be used with "source <venv>/bin/activate.fish" *from fish*
# (https://fishshell.com/). You cannot run it directly.
function deactivate -d "Exit virtual environment and return to normal shell environment"
# reset old environment variables
if test -n "$_OLD_VIRTUAL_PATH"
set -gx PATH $_OLD_VIRTUAL_PATH
set -e _OLD_VIRTUAL_PATH
end
if test -n "$_OLD_VIRTUAL_PYTHONHOME"
set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME
set -e _OLD_VIRTUAL_PYTHONHOME
end
if test -n "$_OLD_FISH_PROMPT_OVERRIDE"
set -e _OLD_FISH_PROMPT_OVERRIDE
# prevents error when using nested fish instances (Issue #93858)
if functions -q _old_fish_prompt
functions -e fish_prompt
functions -c _old_fish_prompt fish_prompt
functions -e _old_fish_prompt
end
end
set -e VIRTUAL_ENV
set -e VIRTUAL_ENV_PROMPT
if test "$argv[1]" != "nondestructive"
# Self-destruct!
functions -e deactivate
end
end
# Unset irrelevant variables.
deactivate nondestructive
set -gx VIRTUAL_ENV /home/moltbot/clawd/venv
set -gx _OLD_VIRTUAL_PATH $PATH
set -gx PATH "$VIRTUAL_ENV/"bin $PATH
# Unset PYTHONHOME if set.
if set -q PYTHONHOME
set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME
set -e PYTHONHOME
end
if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
# fish uses a function instead of an env var to generate the prompt.
# Save the current fish_prompt function as the function _old_fish_prompt.
functions -c fish_prompt _old_fish_prompt
# With the original prompt function renamed, we can override with our own.
function fish_prompt
# Save the return status of the last command.
set -l old_status $status
# Output the venv prompt; color taken from the blue of the Python logo.
printf "%s%s%s" (set_color 4B8BBE) '(venv) ' (set_color normal)
# Restore the return status of the previous command.
echo "exit $old_status" | .
# Output the original/"old" prompt.
_old_fish_prompt
end
set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV"
set -gx VIRTUAL_ENV_PROMPT '(venv) '
end

8
venv/bin/google-oauthlib-tool Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from google_auth_oauthlib.tool.__main__ import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
venv/bin/normalizer Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from charset_normalizer.cli import cli_detect
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(cli_detect())

8
venv/bin/pip Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
venv/bin/pip3 Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
venv/bin/pip3.12 Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
venv/bin/pyrsa-decrypt Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.cli import decrypt
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(decrypt())

8
venv/bin/pyrsa-encrypt Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.cli import encrypt
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(encrypt())

8
venv/bin/pyrsa-keygen Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.cli import keygen
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(keygen())

8
venv/bin/pyrsa-priv2pub Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.util import private_to_public
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(private_to_public())

8
venv/bin/pyrsa-sign Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.cli import sign
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(sign())

8
venv/bin/pyrsa-verify Executable file
View File

@@ -0,0 +1,8 @@
#!/home/moltbot/clawd/venv/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from rsa.cli import verify
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(verify())

1
venv/bin/python Symbolic link
View File

@@ -0,0 +1 @@
python3

1
venv/bin/python3 Symbolic link
View File

@@ -0,0 +1 @@
/usr/bin/python3

1
venv/bin/python3.12 Symbolic link
View File

@@ -0,0 +1 @@
python3

View File

@@ -0,0 +1,27 @@
"""Retain apiclient as an alias for googleapiclient."""
from googleapiclient import channel, discovery, errors, http, mimeparse, model
try:
from googleapiclient import sample_tools
except ImportError:
# Silently ignore, because the vast majority of consumers won't use it and
# it has deep dependence on oauth2client, an optional dependency.
sample_tools = None
from googleapiclient import schema
_SUBMODULES = {
"channel": channel,
"discovery": discovery,
"errors": errors,
"http": http,
"mimeparse": mimeparse,
"model": model,
"sample_tools": sample_tools,
"schema": schema,
}
import sys
for module_name, module in _SUBMODULES.items():
sys.modules["apiclient.%s" % module_name] = module

View File

@@ -0,0 +1,78 @@
Metadata-Version: 2.4
Name: certifi
Version: 2026.1.4
Summary: Python package for providing Mozilla's CA Bundle.
Home-page: https://github.com/certifi/python-certifi
Author: Kenneth Reitz
Author-email: me@kennethreitz.com
License: MPL-2.0
Project-URL: Source, https://github.com/certifi/python-certifi
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.7
License-File: LICENSE
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-python
Dynamic: summary
Certifi: Python SSL Certificates
================================
Certifi provides Mozilla's carefully curated collection of Root Certificates for
validating the trustworthiness of SSL certificates while verifying the identity
of TLS hosts. It has been extracted from the `Requests`_ project.
Installation
------------
``certifi`` is available on PyPI. Simply install it with ``pip``::
$ pip install certifi
Usage
-----
To reference the installed certificate authority (CA) bundle, you can use the
built-in function::
>>> import certifi
>>> certifi.where()
'/usr/local/lib/python3.7/site-packages/certifi/cacert.pem'
Or from the command line::
$ python -m certifi
/usr/local/lib/python3.7/site-packages/certifi/cacert.pem
Enjoy!
.. _`Requests`: https://requests.readthedocs.io/en/master/
Addition/Removal of Certificates
--------------------------------
Certifi does not support any addition/removal or other modification of the
CA trust store content. This project is intended to provide a reliable and
highly portable root of trust to python deployments. Look to upstream projects
for methods to use alternate trust.

View File

@@ -0,0 +1,14 @@
certifi-2026.1.4.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
certifi-2026.1.4.dist-info/METADATA,sha256=FSfJEfKuMo6bJlofUrtRpn4PFTYtbYyXpHN_A3ZFpIY,2473
certifi-2026.1.4.dist-info/RECORD,,
certifi-2026.1.4.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
certifi-2026.1.4.dist-info/licenses/LICENSE,sha256=6TcW2mucDVpKHfYP5pWzcPBpVgPSH2-D8FPkLPwQyvc,989
certifi-2026.1.4.dist-info/top_level.txt,sha256=KMu4vUCfsjLrkPbSNdgdekS-pVJzBAJFO__nI8NF6-U,8
certifi/__init__.py,sha256=969deMMS7Uchipr0oO4dbRBUvRi0uNYCn07VmG1aTrg,94
certifi/__main__.py,sha256=xBBoj905TUWBLRGANOcf7oi6e-3dMP4cEoG9OyMs11g,243
certifi/__pycache__/__init__.cpython-312.pyc,,
certifi/__pycache__/__main__.cpython-312.pyc,,
certifi/__pycache__/core.cpython-312.pyc,,
certifi/cacert.pem,sha256=Tzl1_zCrvzVEO0hgZK6Ly0Hf9wf_31dsdtKS-0WKoKk,270954
certifi/core.py,sha256=XFXycndG5pf37ayeF8N32HUuDafsyhkVMbO4BAPWHa0,3394
certifi/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0

View File

@@ -0,0 +1,5 @@
Wheel-Version: 1.0
Generator: setuptools (80.9.0)
Root-Is-Purelib: true
Tag: py3-none-any

View File

@@ -0,0 +1,20 @@
This package contains a modified version of ca-bundle.crt:
ca-bundle.crt -- Bundle of CA Root Certificates
This is a bundle of X.509 certificates of public Certificate Authorities
(CA). These were automatically extracted from Mozilla's root certificates
file (certdata.txt). This file can be found in the mozilla source tree:
https://hg.mozilla.org/mozilla-central/file/tip/security/nss/lib/ckfw/builtins/certdata.txt
It contains the certificates in PEM format and therefore
can be directly used with curl / libcurl / php_curl, or with
an Apache+mod_ssl webserver for SSL client authentication.
Just configure this file as the SSLCACertificateFile.#
***** BEGIN LICENSE BLOCK *****
This Source Code Form is subject to the terms of the Mozilla Public License,
v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain
one at http://mozilla.org/MPL/2.0/.
***** END LICENSE BLOCK *****
@(#) $RCSfile: certdata.txt,v $ $Revision: 1.80 $ $Date: 2011/11/03 15:11:58 $

View File

@@ -0,0 +1,4 @@
from .core import contents, where
__all__ = ["contents", "where"]
__version__ = "2026.01.04"

View File

@@ -0,0 +1,12 @@
import argparse
from certifi import contents, where
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--contents", action="store_true")
args = parser.parse_args()
if args.contents:
print(contents())
else:
print(where())

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,83 @@
"""
certifi.py
~~~~~~~~~~
This module returns the installation location of cacert.pem or its contents.
"""
import sys
import atexit
def exit_cacert_ctx() -> None:
_CACERT_CTX.__exit__(None, None, None) # type: ignore[union-attr]
if sys.version_info >= (3, 11):
from importlib.resources import as_file, files
_CACERT_CTX = None
_CACERT_PATH = None
def where() -> str:
# This is slightly terrible, but we want to delay extracting the file
# in cases where we're inside of a zipimport situation until someone
# actually calls where(), but we don't want to re-extract the file
# on every call of where(), so we'll do it once then store it in a
# global variable.
global _CACERT_CTX
global _CACERT_PATH
if _CACERT_PATH is None:
# This is slightly janky, the importlib.resources API wants you to
# manage the cleanup of this file, so it doesn't actually return a
# path, it returns a context manager that will give you the path
# when you enter it and will do any cleanup when you leave it. In
# the common case of not needing a temporary file, it will just
# return the file system location and the __exit__() is a no-op.
#
# We also have to hold onto the actual context manager, because
# it will do the cleanup whenever it gets garbage collected, so
# we will also store that at the global level as well.
_CACERT_CTX = as_file(files("certifi").joinpath("cacert.pem"))
_CACERT_PATH = str(_CACERT_CTX.__enter__())
atexit.register(exit_cacert_ctx)
return _CACERT_PATH
def contents() -> str:
return files("certifi").joinpath("cacert.pem").read_text(encoding="ascii")
else:
from importlib.resources import path as get_path, read_text
_CACERT_CTX = None
_CACERT_PATH = None
def where() -> str:
# This is slightly terrible, but we want to delay extracting the
# file in cases where we're inside of a zipimport situation until
# someone actually calls where(), but we don't want to re-extract
# the file on every call of where(), so we'll do it once then store
# it in a global variable.
global _CACERT_CTX
global _CACERT_PATH
if _CACERT_PATH is None:
# This is slightly janky, the importlib.resources API wants you
# to manage the cleanup of this file, so it doesn't actually
# return a path, it returns a context manager that will give
# you the path when you enter it and will do any cleanup when
# you leave it. In the common case of not needing a temporary
# file, it will just return the file system location and the
# __exit__() is a no-op.
#
# We also have to hold onto the actual context manager, because
# it will do the cleanup whenever it gets garbage collected, so
# we will also store that at the global level as well.
_CACERT_CTX = get_path("certifi", "cacert.pem")
_CACERT_PATH = str(_CACERT_CTX.__enter__())
atexit.register(exit_cacert_ctx)
return _CACERT_PATH
def contents() -> str:
return read_text("certifi", "cacert.pem", encoding="ascii")

View File

@@ -0,0 +1 @@
pip

View File

@@ -0,0 +1,68 @@
Metadata-Version: 2.4
Name: cffi
Version: 2.0.0
Summary: Foreign Function Interface for Python calling C code.
Author: Armin Rigo, Maciej Fijalkowski
Maintainer: Matt Davis, Matt Clay, Matti Picus
License-Expression: MIT
Project-URL: Documentation, https://cffi.readthedocs.io/
Project-URL: Changelog, https://cffi.readthedocs.io/en/latest/whatsnew.html
Project-URL: Downloads, https://github.com/python-cffi/cffi/releases
Project-URL: Contact, https://groups.google.com/forum/#!forum/python-cffi
Project-URL: Source Code, https://github.com/python-cffi/cffi
Project-URL: Issue Tracker, https://github.com/python-cffi/cffi/issues
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Free Threading :: 2 - Beta
Classifier: Programming Language :: Python :: Implementation :: CPython
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS
Requires-Dist: pycparser; implementation_name != "PyPy"
Dynamic: license-file
[![GitHub Actions Status](https://github.com/python-cffi/cffi/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/python-cffi/cffi/actions/workflows/ci.yaml?query=branch%3Amain++)
[![PyPI version](https://img.shields.io/pypi/v/cffi.svg)](https://pypi.org/project/cffi)
[![Read the Docs](https://img.shields.io/badge/docs-latest-blue.svg)][Documentation]
CFFI
====
Foreign Function Interface for Python calling C code.
Please see the [Documentation] or uncompiled in the `doc/` subdirectory.
Download
--------
[Download page](https://github.com/python-cffi/cffi/releases)
Source Code
-----------
Source code is publicly available on
[GitHub](https://github.com/python-cffi/cffi).
Contact
-------
[Mailing list](https://groups.google.com/forum/#!forum/python-cffi)
Testing/development tips
------------------------
After `git clone` or `wget && tar`, we will get a directory called `cffi` or `cffi-x.x.x`. we call it `repo-directory`. To run tests under CPython, run the following in the `repo-directory`:
pip install pytest
pip install -e . # editable install of CFFI for local development
pytest src/c/ testing/
[Documentation]: http://cffi.readthedocs.org/

View File

@@ -0,0 +1,49 @@
_cffi_backend.cpython-312-x86_64-linux-gnu.so,sha256=AGLtw5fn9u4Cmwk3BbGlsXG7VZEvQekABMyEGuRZmcE,348808
cffi-2.0.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
cffi-2.0.0.dist-info/METADATA,sha256=uYzn40F68Im8EtXHNBLZs7FoPM-OxzyYbDWsjJvhujk,2559
cffi-2.0.0.dist-info/RECORD,,
cffi-2.0.0.dist-info/WHEEL,sha256=aSgG0F4rGPZtV0iTEIfy6dtHq6g67Lze3uLfk0vWn88,151
cffi-2.0.0.dist-info/entry_points.txt,sha256=y6jTxnyeuLnL-XJcDv8uML3n6wyYiGRg8MTp_QGJ9Ho,75
cffi-2.0.0.dist-info/licenses/AUTHORS,sha256=KmemC7-zN1nWfWRf8TG45ta8TK_CMtdR_Kw-2k0xTMg,208
cffi-2.0.0.dist-info/licenses/LICENSE,sha256=W6JN3FcGf5JJrdZEw6_EGl1tw34jQz73Wdld83Cwr2M,1123
cffi-2.0.0.dist-info/top_level.txt,sha256=rE7WR3rZfNKxWI9-jn6hsHCAl7MDkB-FmuQbxWjFehQ,19
cffi/__init__.py,sha256=-ksBQ7MfDzVvbBlV_ftYBWAmEqfA86ljIzMxzaZeAlI,511
cffi/__pycache__/__init__.cpython-312.pyc,,
cffi/__pycache__/_imp_emulation.cpython-312.pyc,,
cffi/__pycache__/_shimmed_dist_utils.cpython-312.pyc,,
cffi/__pycache__/api.cpython-312.pyc,,
cffi/__pycache__/backend_ctypes.cpython-312.pyc,,
cffi/__pycache__/cffi_opcode.cpython-312.pyc,,
cffi/__pycache__/commontypes.cpython-312.pyc,,
cffi/__pycache__/cparser.cpython-312.pyc,,
cffi/__pycache__/error.cpython-312.pyc,,
cffi/__pycache__/ffiplatform.cpython-312.pyc,,
cffi/__pycache__/lock.cpython-312.pyc,,
cffi/__pycache__/model.cpython-312.pyc,,
cffi/__pycache__/pkgconfig.cpython-312.pyc,,
cffi/__pycache__/recompiler.cpython-312.pyc,,
cffi/__pycache__/setuptools_ext.cpython-312.pyc,,
cffi/__pycache__/vengine_cpy.cpython-312.pyc,,
cffi/__pycache__/vengine_gen.cpython-312.pyc,,
cffi/__pycache__/verifier.cpython-312.pyc,,
cffi/_cffi_errors.h,sha256=zQXt7uR_m8gUW-fI2hJg0KoSkJFwXv8RGUkEDZ177dQ,3908
cffi/_cffi_include.h,sha256=Exhmgm9qzHWzWivjfTe0D7Xp4rPUkVxdNuwGhMTMzbw,15055
cffi/_embedding.h,sha256=Ai33FHblE7XSpHOCp8kPcWwN5_9BV14OvN0JVa6ITpw,18786
cffi/_imp_emulation.py,sha256=RxREG8zAbI2RPGBww90u_5fi8sWdahpdipOoPzkp7C0,2960
cffi/_shimmed_dist_utils.py,sha256=Bjj2wm8yZbvFvWEx5AEfmqaqZyZFhYfoyLLQHkXZuao,2230
cffi/api.py,sha256=alBv6hZQkjpmZplBphdaRn2lPO9-CORs_M7ixabvZWI,42169
cffi/backend_ctypes.py,sha256=h5ZIzLc6BFVXnGyc9xPqZWUS7qGy7yFSDqXe68Sa8z4,42454
cffi/cffi_opcode.py,sha256=JDV5l0R0_OadBX_uE7xPPTYtMdmpp8I9UYd6av7aiDU,5731
cffi/commontypes.py,sha256=7N6zPtCFlvxXMWhHV08psUjdYIK2XgsN3yo5dgua_v4,2805
cffi/cparser.py,sha256=QUTfmlL-aO-MYR8bFGlvAUHc36OQr7XYLe0WLkGFjRo,44790
cffi/error.py,sha256=v6xTiS4U0kvDcy4h_BDRo5v39ZQuj-IMRYLv5ETddZs,877
cffi/ffiplatform.py,sha256=avxFjdikYGJoEtmJO7ewVmwG_VEVl6EZ_WaNhZYCqv4,3584
cffi/lock.py,sha256=l9TTdwMIMpi6jDkJGnQgE9cvTIR7CAntIJr8EGHt3pY,747
cffi/model.py,sha256=W30UFQZE73jL5Mx5N81YT77us2W2iJjTm0XYfnwz1cg,21797
cffi/parse_c_type.h,sha256=OdwQfwM9ktq6vlCB43exFQmxDBtj2MBNdK8LYl15tjw,5976
cffi/pkgconfig.py,sha256=LP1w7vmWvmKwyqLaU1Z243FOWGNQMrgMUZrvgFuOlco,4374
cffi/recompiler.py,sha256=78J6lMEEOygXNmjN9-fOFFO3j7eW-iFxSrxfvQb54bY,65509
cffi/setuptools_ext.py,sha256=0rCwBJ1W7FHWtiMKfNXsSST88V8UXrui5oeXFlDNLG8,9411
cffi/vengine_cpy.py,sha256=oyQKD23kpE0aChUKA8Jg0e723foPiYzLYEdb-J0MiNs,43881
cffi/vengine_gen.py,sha256=DUlEIrDiVin1Pnhn1sfoamnS5NLqfJcOdhRoeSNeJRg,26939
cffi/verifier.py,sha256=oX8jpaohg2Qm3aHcznidAdvrVm5N4sQYG0a3Eo5mIl4,11182

View File

@@ -0,0 +1,6 @@
Wheel-Version: 1.0
Generator: setuptools (80.9.0)
Root-Is-Purelib: false
Tag: cp312-cp312-manylinux_2_17_x86_64
Tag: cp312-cp312-manylinux2014_x86_64

View File

@@ -0,0 +1,2 @@
[distutils.setup_keywords]
cffi_modules = cffi.setuptools_ext:cffi_modules

View File

@@ -0,0 +1,8 @@
This package has been mostly done by Armin Rigo with help from
Maciej Fijałkowski. The idea is heavily based (although not directly
copied) from LuaJIT ffi by Mike Pall.
Other contributors:
Google Inc.

View File

@@ -0,0 +1,23 @@
Except when otherwise stated (look for LICENSE files in directories or
information at the beginning of each file) all software and
documentation is licensed as follows:
MIT No Attribution
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the
Software is furnished to do so.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

View File

@@ -0,0 +1,2 @@
_cffi_backend
cffi

View File

@@ -0,0 +1,14 @@
__all__ = ['FFI', 'VerificationError', 'VerificationMissing', 'CDefError',
'FFIError']
from .api import FFI
from .error import CDefError, FFIError, VerificationError, VerificationMissing
from .error import PkgConfigError
__version__ = "2.0.0"
__version_info__ = (2, 0, 0)
# The verifier module file names are based on the CRC32 of a string that
# contains the following version number. It may be older than __version__
# if nothing is clearly incompatible.
__version_verifier_modules__ = "0.8.6"

View File

@@ -0,0 +1,149 @@
#ifndef CFFI_MESSAGEBOX
# ifdef _MSC_VER
# define CFFI_MESSAGEBOX 1
# else
# define CFFI_MESSAGEBOX 0
# endif
#endif
#if CFFI_MESSAGEBOX
/* Windows only: logic to take the Python-CFFI embedding logic
initialization errors and display them in a background thread
with MessageBox. The idea is that if the whole program closes
as a result of this problem, then likely it is already a console
program and you can read the stderr output in the console too.
If it is not a console program, then it will likely show its own
dialog to complain, or generally not abruptly close, and for this
case the background thread should stay alive.
*/
static void *volatile _cffi_bootstrap_text;
static PyObject *_cffi_start_error_capture(void)
{
PyObject *result = NULL;
PyObject *x, *m, *bi;
if (InterlockedCompareExchangePointer(&_cffi_bootstrap_text,
(void *)1, NULL) != NULL)
return (PyObject *)1;
m = PyImport_AddModule("_cffi_error_capture");
if (m == NULL)
goto error;
result = PyModule_GetDict(m);
if (result == NULL)
goto error;
#if PY_MAJOR_VERSION >= 3
bi = PyImport_ImportModule("builtins");
#else
bi = PyImport_ImportModule("__builtin__");
#endif
if (bi == NULL)
goto error;
PyDict_SetItemString(result, "__builtins__", bi);
Py_DECREF(bi);
x = PyRun_String(
"import sys\n"
"class FileLike:\n"
" def write(self, x):\n"
" try:\n"
" of.write(x)\n"
" except: pass\n"
" self.buf += x\n"
" def flush(self):\n"
" pass\n"
"fl = FileLike()\n"
"fl.buf = ''\n"
"of = sys.stderr\n"
"sys.stderr = fl\n"
"def done():\n"
" sys.stderr = of\n"
" return fl.buf\n", /* make sure the returned value stays alive */
Py_file_input,
result, result);
Py_XDECREF(x);
error:
if (PyErr_Occurred())
{
PyErr_WriteUnraisable(Py_None);
PyErr_Clear();
}
return result;
}
#pragma comment(lib, "user32.lib")
static DWORD WINAPI _cffi_bootstrap_dialog(LPVOID ignored)
{
Sleep(666); /* may be interrupted if the whole process is closing */
#if PY_MAJOR_VERSION >= 3
MessageBoxW(NULL, (wchar_t *)_cffi_bootstrap_text,
L"Python-CFFI error",
MB_OK | MB_ICONERROR);
#else
MessageBoxA(NULL, (char *)_cffi_bootstrap_text,
"Python-CFFI error",
MB_OK | MB_ICONERROR);
#endif
_cffi_bootstrap_text = NULL;
return 0;
}
static void _cffi_stop_error_capture(PyObject *ecap)
{
PyObject *s;
void *text;
if (ecap == (PyObject *)1)
return;
if (ecap == NULL)
goto error;
s = PyRun_String("done()", Py_eval_input, ecap, ecap);
if (s == NULL)
goto error;
/* Show a dialog box, but in a background thread, and
never show multiple dialog boxes at once. */
#if PY_MAJOR_VERSION >= 3
text = PyUnicode_AsWideCharString(s, NULL);
#else
text = PyString_AsString(s);
#endif
_cffi_bootstrap_text = text;
if (text != NULL)
{
HANDLE h;
h = CreateThread(NULL, 0, _cffi_bootstrap_dialog,
NULL, 0, NULL);
if (h != NULL)
CloseHandle(h);
}
/* decref the string, but it should stay alive as 'fl.buf'
in the small module above. It will really be freed only if
we later get another similar error. So it's a leak of at
most one copy of the small module. That's fine for this
situation which is usually a "fatal error" anyway. */
Py_DECREF(s);
PyErr_Clear();
return;
error:
_cffi_bootstrap_text = NULL;
PyErr_Clear();
}
#else
static PyObject *_cffi_start_error_capture(void) { return NULL; }
static void _cffi_stop_error_capture(PyObject *ecap) { }
#endif

View File

@@ -0,0 +1,389 @@
#define _CFFI_
/* We try to define Py_LIMITED_API before including Python.h.
Mess: we can only define it if Py_DEBUG, Py_TRACE_REFS and
Py_REF_DEBUG are not defined. This is a best-effort approximation:
we can learn about Py_DEBUG from pyconfig.h, but it is unclear if
the same works for the other two macros. Py_DEBUG implies them,
but not the other way around.
The implementation is messy (issue #350): on Windows, with _MSC_VER,
we have to define Py_LIMITED_API even before including pyconfig.h.
In that case, we guess what pyconfig.h will do to the macros above,
and check our guess after the #include.
Note that on Windows, with CPython 3.x, you need >= 3.5 and virtualenv
version >= 16.0.0. With older versions of either, you don't get a
copy of PYTHON3.DLL in the virtualenv. We can't check the version of
CPython *before* we even include pyconfig.h. ffi.set_source() puts
a ``#define _CFFI_NO_LIMITED_API'' at the start of this file if it is
running on Windows < 3.5, as an attempt at fixing it, but that's
arguably wrong because it may not be the target version of Python.
Still better than nothing I guess. As another workaround, you can
remove the definition of Py_LIMITED_API here.
See also 'py_limited_api' in cffi/setuptools_ext.py.
*/
#if !defined(_CFFI_USE_EMBEDDING) && !defined(Py_LIMITED_API)
# ifdef _MSC_VER
# if !defined(_DEBUG) && !defined(Py_DEBUG) && !defined(Py_TRACE_REFS) && !defined(Py_REF_DEBUG) && !defined(_CFFI_NO_LIMITED_API)
# define Py_LIMITED_API
# endif
# include <pyconfig.h>
/* sanity-check: Py_LIMITED_API will cause crashes if any of these
are also defined. Normally, the Python file PC/pyconfig.h does not
cause any of these to be defined, with the exception that _DEBUG
causes Py_DEBUG. Double-check that. */
# ifdef Py_LIMITED_API
# if defined(Py_DEBUG)
# error "pyconfig.h unexpectedly defines Py_DEBUG, but Py_LIMITED_API is set"
# endif
# if defined(Py_TRACE_REFS)
# error "pyconfig.h unexpectedly defines Py_TRACE_REFS, but Py_LIMITED_API is set"
# endif
# if defined(Py_REF_DEBUG)
# error "pyconfig.h unexpectedly defines Py_REF_DEBUG, but Py_LIMITED_API is set"
# endif
# endif
# else
# include <pyconfig.h>
# if !defined(Py_DEBUG) && !defined(Py_TRACE_REFS) && !defined(Py_REF_DEBUG) && !defined(_CFFI_NO_LIMITED_API)
# define Py_LIMITED_API
# endif
# endif
#endif
#include <Python.h>
#ifdef __cplusplus
extern "C" {
#endif
#include <stddef.h>
#include "parse_c_type.h"
/* this block of #ifs should be kept exactly identical between
c/_cffi_backend.c, cffi/vengine_cpy.py, cffi/vengine_gen.py
and cffi/_cffi_include.h */
#if defined(_MSC_VER)
# include <malloc.h> /* for alloca() */
# if _MSC_VER < 1600 /* MSVC < 2010 */
typedef __int8 int8_t;
typedef __int16 int16_t;
typedef __int32 int32_t;
typedef __int64 int64_t;
typedef unsigned __int8 uint8_t;
typedef unsigned __int16 uint16_t;
typedef unsigned __int32 uint32_t;
typedef unsigned __int64 uint64_t;
typedef __int8 int_least8_t;
typedef __int16 int_least16_t;
typedef __int32 int_least32_t;
typedef __int64 int_least64_t;
typedef unsigned __int8 uint_least8_t;
typedef unsigned __int16 uint_least16_t;
typedef unsigned __int32 uint_least32_t;
typedef unsigned __int64 uint_least64_t;
typedef __int8 int_fast8_t;
typedef __int16 int_fast16_t;
typedef __int32 int_fast32_t;
typedef __int64 int_fast64_t;
typedef unsigned __int8 uint_fast8_t;
typedef unsigned __int16 uint_fast16_t;
typedef unsigned __int32 uint_fast32_t;
typedef unsigned __int64 uint_fast64_t;
typedef __int64 intmax_t;
typedef unsigned __int64 uintmax_t;
# else
# include <stdint.h>
# endif
# if _MSC_VER < 1800 /* MSVC < 2013 */
# ifndef __cplusplus
typedef unsigned char _Bool;
# endif
# endif
# define _cffi_float_complex_t _Fcomplex /* include <complex.h> for it */
# define _cffi_double_complex_t _Dcomplex /* include <complex.h> for it */
#else
# include <stdint.h>
# if (defined (__SVR4) && defined (__sun)) || defined(_AIX) || defined(__hpux)
# include <alloca.h>
# endif
# define _cffi_float_complex_t float _Complex
# define _cffi_double_complex_t double _Complex
#endif
#ifdef __GNUC__
# define _CFFI_UNUSED_FN __attribute__((unused))
#else
# define _CFFI_UNUSED_FN /* nothing */
#endif
#ifdef __cplusplus
# ifndef _Bool
typedef bool _Bool; /* semi-hackish: C++ has no _Bool; bool is builtin */
# endif
#endif
/********** CPython-specific section **********/
#ifndef PYPY_VERSION
#if PY_MAJOR_VERSION >= 3
# define PyInt_FromLong PyLong_FromLong
#endif
#define _cffi_from_c_double PyFloat_FromDouble
#define _cffi_from_c_float PyFloat_FromDouble
#define _cffi_from_c_long PyInt_FromLong
#define _cffi_from_c_ulong PyLong_FromUnsignedLong
#define _cffi_from_c_longlong PyLong_FromLongLong
#define _cffi_from_c_ulonglong PyLong_FromUnsignedLongLong
#define _cffi_from_c__Bool PyBool_FromLong
#define _cffi_to_c_double PyFloat_AsDouble
#define _cffi_to_c_float PyFloat_AsDouble
#define _cffi_from_c_int(x, type) \
(((type)-1) > 0 ? /* unsigned */ \
(sizeof(type) < sizeof(long) ? \
PyInt_FromLong((long)x) : \
sizeof(type) == sizeof(long) ? \
PyLong_FromUnsignedLong((unsigned long)x) : \
PyLong_FromUnsignedLongLong((unsigned long long)x)) : \
(sizeof(type) <= sizeof(long) ? \
PyInt_FromLong((long)x) : \
PyLong_FromLongLong((long long)x)))
#define _cffi_to_c_int(o, type) \
((type)( \
sizeof(type) == 1 ? (((type)-1) > 0 ? (type)_cffi_to_c_u8(o) \
: (type)_cffi_to_c_i8(o)) : \
sizeof(type) == 2 ? (((type)-1) > 0 ? (type)_cffi_to_c_u16(o) \
: (type)_cffi_to_c_i16(o)) : \
sizeof(type) == 4 ? (((type)-1) > 0 ? (type)_cffi_to_c_u32(o) \
: (type)_cffi_to_c_i32(o)) : \
sizeof(type) == 8 ? (((type)-1) > 0 ? (type)_cffi_to_c_u64(o) \
: (type)_cffi_to_c_i64(o)) : \
(Py_FatalError("unsupported size for type " #type), (type)0)))
#define _cffi_to_c_i8 \
((int(*)(PyObject *))_cffi_exports[1])
#define _cffi_to_c_u8 \
((int(*)(PyObject *))_cffi_exports[2])
#define _cffi_to_c_i16 \
((int(*)(PyObject *))_cffi_exports[3])
#define _cffi_to_c_u16 \
((int(*)(PyObject *))_cffi_exports[4])
#define _cffi_to_c_i32 \
((int(*)(PyObject *))_cffi_exports[5])
#define _cffi_to_c_u32 \
((unsigned int(*)(PyObject *))_cffi_exports[6])
#define _cffi_to_c_i64 \
((long long(*)(PyObject *))_cffi_exports[7])
#define _cffi_to_c_u64 \
((unsigned long long(*)(PyObject *))_cffi_exports[8])
#define _cffi_to_c_char \
((int(*)(PyObject *))_cffi_exports[9])
#define _cffi_from_c_pointer \
((PyObject *(*)(char *, struct _cffi_ctypedescr *))_cffi_exports[10])
#define _cffi_to_c_pointer \
((char *(*)(PyObject *, struct _cffi_ctypedescr *))_cffi_exports[11])
#define _cffi_get_struct_layout \
not used any more
#define _cffi_restore_errno \
((void(*)(void))_cffi_exports[13])
#define _cffi_save_errno \
((void(*)(void))_cffi_exports[14])
#define _cffi_from_c_char \
((PyObject *(*)(char))_cffi_exports[15])
#define _cffi_from_c_deref \
((PyObject *(*)(char *, struct _cffi_ctypedescr *))_cffi_exports[16])
#define _cffi_to_c \
((int(*)(char *, struct _cffi_ctypedescr *, PyObject *))_cffi_exports[17])
#define _cffi_from_c_struct \
((PyObject *(*)(char *, struct _cffi_ctypedescr *))_cffi_exports[18])
#define _cffi_to_c_wchar_t \
((_cffi_wchar_t(*)(PyObject *))_cffi_exports[19])
#define _cffi_from_c_wchar_t \
((PyObject *(*)(_cffi_wchar_t))_cffi_exports[20])
#define _cffi_to_c_long_double \
((long double(*)(PyObject *))_cffi_exports[21])
#define _cffi_to_c__Bool \
((_Bool(*)(PyObject *))_cffi_exports[22])
#define _cffi_prepare_pointer_call_argument \
((Py_ssize_t(*)(struct _cffi_ctypedescr *, \
PyObject *, char **))_cffi_exports[23])
#define _cffi_convert_array_from_object \
((int(*)(char *, struct _cffi_ctypedescr *, PyObject *))_cffi_exports[24])
#define _CFFI_CPIDX 25
#define _cffi_call_python \
((void(*)(struct _cffi_externpy_s *, char *))_cffi_exports[_CFFI_CPIDX])
#define _cffi_to_c_wchar3216_t \
((int(*)(PyObject *))_cffi_exports[26])
#define _cffi_from_c_wchar3216_t \
((PyObject *(*)(int))_cffi_exports[27])
#define _CFFI_NUM_EXPORTS 28
struct _cffi_ctypedescr;
static void *_cffi_exports[_CFFI_NUM_EXPORTS];
#define _cffi_type(index) ( \
assert((((uintptr_t)_cffi_types[index]) & 1) == 0), \
(struct _cffi_ctypedescr *)_cffi_types[index])
static PyObject *_cffi_init(const char *module_name, Py_ssize_t version,
const struct _cffi_type_context_s *ctx)
{
PyObject *module, *o_arg, *new_module;
void *raw[] = {
(void *)module_name,
(void *)version,
(void *)_cffi_exports,
(void *)ctx,
};
module = PyImport_ImportModule("_cffi_backend");
if (module == NULL)
goto failure;
o_arg = PyLong_FromVoidPtr((void *)raw);
if (o_arg == NULL)
goto failure;
new_module = PyObject_CallMethod(
module, (char *)"_init_cffi_1_0_external_module", (char *)"O", o_arg);
Py_DECREF(o_arg);
Py_DECREF(module);
return new_module;
failure:
Py_XDECREF(module);
return NULL;
}
#ifdef HAVE_WCHAR_H
typedef wchar_t _cffi_wchar_t;
#else
typedef uint16_t _cffi_wchar_t; /* same random pick as _cffi_backend.c */
#endif
_CFFI_UNUSED_FN static uint16_t _cffi_to_c_char16_t(PyObject *o)
{
if (sizeof(_cffi_wchar_t) == 2)
return (uint16_t)_cffi_to_c_wchar_t(o);
else
return (uint16_t)_cffi_to_c_wchar3216_t(o);
}
_CFFI_UNUSED_FN static PyObject *_cffi_from_c_char16_t(uint16_t x)
{
if (sizeof(_cffi_wchar_t) == 2)
return _cffi_from_c_wchar_t((_cffi_wchar_t)x);
else
return _cffi_from_c_wchar3216_t((int)x);
}
_CFFI_UNUSED_FN static int _cffi_to_c_char32_t(PyObject *o)
{
if (sizeof(_cffi_wchar_t) == 4)
return (int)_cffi_to_c_wchar_t(o);
else
return (int)_cffi_to_c_wchar3216_t(o);
}
_CFFI_UNUSED_FN static PyObject *_cffi_from_c_char32_t(unsigned int x)
{
if (sizeof(_cffi_wchar_t) == 4)
return _cffi_from_c_wchar_t((_cffi_wchar_t)x);
else
return _cffi_from_c_wchar3216_t((int)x);
}
union _cffi_union_alignment_u {
unsigned char m_char;
unsigned short m_short;
unsigned int m_int;
unsigned long m_long;
unsigned long long m_longlong;
float m_float;
double m_double;
long double m_longdouble;
};
struct _cffi_freeme_s {
struct _cffi_freeme_s *next;
union _cffi_union_alignment_u alignment;
};
_CFFI_UNUSED_FN static int
_cffi_convert_array_argument(struct _cffi_ctypedescr *ctptr, PyObject *arg,
char **output_data, Py_ssize_t datasize,
struct _cffi_freeme_s **freeme)
{
char *p;
if (datasize < 0)
return -1;
p = *output_data;
if (p == NULL) {
struct _cffi_freeme_s *fp = (struct _cffi_freeme_s *)PyObject_Malloc(
offsetof(struct _cffi_freeme_s, alignment) + (size_t)datasize);
if (fp == NULL)
return -1;
fp->next = *freeme;
*freeme = fp;
p = *output_data = (char *)&fp->alignment;
}
memset((void *)p, 0, (size_t)datasize);
return _cffi_convert_array_from_object(p, ctptr, arg);
}
_CFFI_UNUSED_FN static void
_cffi_free_array_arguments(struct _cffi_freeme_s *freeme)
{
do {
void *p = (void *)freeme;
freeme = freeme->next;
PyObject_Free(p);
} while (freeme != NULL);
}
/********** end CPython-specific section **********/
#else
_CFFI_UNUSED_FN
static void (*_cffi_call_python_org)(struct _cffi_externpy_s *, char *);
# define _cffi_call_python _cffi_call_python_org
#endif
#define _cffi_array_len(array) (sizeof(array) / sizeof((array)[0]))
#define _cffi_prim_int(size, sign) \
((size) == 1 ? ((sign) ? _CFFI_PRIM_INT8 : _CFFI_PRIM_UINT8) : \
(size) == 2 ? ((sign) ? _CFFI_PRIM_INT16 : _CFFI_PRIM_UINT16) : \
(size) == 4 ? ((sign) ? _CFFI_PRIM_INT32 : _CFFI_PRIM_UINT32) : \
(size) == 8 ? ((sign) ? _CFFI_PRIM_INT64 : _CFFI_PRIM_UINT64) : \
_CFFI__UNKNOWN_PRIM)
#define _cffi_prim_float(size) \
((size) == sizeof(float) ? _CFFI_PRIM_FLOAT : \
(size) == sizeof(double) ? _CFFI_PRIM_DOUBLE : \
(size) == sizeof(long double) ? _CFFI__UNKNOWN_LONG_DOUBLE : \
_CFFI__UNKNOWN_FLOAT_PRIM)
#define _cffi_check_int(got, got_nonpos, expected) \
((got_nonpos) == (expected <= 0) && \
(got) == (unsigned long long)expected)
#ifdef MS_WIN32
# define _cffi_stdcall __stdcall
#else
# define _cffi_stdcall /* nothing */
#endif
#ifdef __cplusplus
}
#endif

View File

@@ -0,0 +1,550 @@
/***** Support code for embedding *****/
#ifdef __cplusplus
extern "C" {
#endif
#if defined(_WIN32)
# define CFFI_DLLEXPORT __declspec(dllexport)
#elif defined(__GNUC__)
# define CFFI_DLLEXPORT __attribute__((visibility("default")))
#else
# define CFFI_DLLEXPORT /* nothing */
#endif
/* There are two global variables of type _cffi_call_python_fnptr:
* _cffi_call_python, which we declare just below, is the one called
by ``extern "Python"`` implementations.
* _cffi_call_python_org, which on CPython is actually part of the
_cffi_exports[] array, is the function pointer copied from
_cffi_backend. If _cffi_start_python() fails, then this is set
to NULL; otherwise, it should never be NULL.
After initialization is complete, both are equal. However, the
first one remains equal to &_cffi_start_and_call_python until the
very end of initialization, when we are (or should be) sure that
concurrent threads also see a completely initialized world, and
only then is it changed.
*/
#undef _cffi_call_python
typedef void (*_cffi_call_python_fnptr)(struct _cffi_externpy_s *, char *);
static void _cffi_start_and_call_python(struct _cffi_externpy_s *, char *);
static _cffi_call_python_fnptr _cffi_call_python = &_cffi_start_and_call_python;
#ifndef _MSC_VER
/* --- Assuming a GCC not infinitely old --- */
# define cffi_compare_and_swap(l,o,n) __sync_bool_compare_and_swap(l,o,n)
# define cffi_write_barrier() __sync_synchronize()
# if !defined(__amd64__) && !defined(__x86_64__) && \
!defined(__i386__) && !defined(__i386)
# define cffi_read_barrier() __sync_synchronize()
# else
# define cffi_read_barrier() (void)0
# endif
#else
/* --- Windows threads version --- */
# include <Windows.h>
# define cffi_compare_and_swap(l,o,n) \
(InterlockedCompareExchangePointer(l,n,o) == (o))
# define cffi_write_barrier() InterlockedCompareExchange(&_cffi_dummy,0,0)
# define cffi_read_barrier() (void)0
static volatile LONG _cffi_dummy;
#endif
#ifdef WITH_THREAD
# ifndef _MSC_VER
# include <pthread.h>
static pthread_mutex_t _cffi_embed_startup_lock;
# else
static CRITICAL_SECTION _cffi_embed_startup_lock;
# endif
static char _cffi_embed_startup_lock_ready = 0;
#endif
static void _cffi_acquire_reentrant_mutex(void)
{
static void *volatile lock = NULL;
while (!cffi_compare_and_swap(&lock, NULL, (void *)1)) {
/* should ideally do a spin loop instruction here, but
hard to do it portably and doesn't really matter I
think: pthread_mutex_init() should be very fast, and
this is only run at start-up anyway. */
}
#ifdef WITH_THREAD
if (!_cffi_embed_startup_lock_ready) {
# ifndef _MSC_VER
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(&_cffi_embed_startup_lock, &attr);
# else
InitializeCriticalSection(&_cffi_embed_startup_lock);
# endif
_cffi_embed_startup_lock_ready = 1;
}
#endif
while (!cffi_compare_and_swap(&lock, (void *)1, NULL))
;
#ifndef _MSC_VER
pthread_mutex_lock(&_cffi_embed_startup_lock);
#else
EnterCriticalSection(&_cffi_embed_startup_lock);
#endif
}
static void _cffi_release_reentrant_mutex(void)
{
#ifndef _MSC_VER
pthread_mutex_unlock(&_cffi_embed_startup_lock);
#else
LeaveCriticalSection(&_cffi_embed_startup_lock);
#endif
}
/********** CPython-specific section **********/
#ifndef PYPY_VERSION
#include "_cffi_errors.h"
#define _cffi_call_python_org _cffi_exports[_CFFI_CPIDX]
PyMODINIT_FUNC _CFFI_PYTHON_STARTUP_FUNC(void); /* forward */
static void _cffi_py_initialize(void)
{
/* XXX use initsigs=0, which "skips initialization registration of
signal handlers, which might be useful when Python is
embedded" according to the Python docs. But review and think
if it should be a user-controllable setting.
XXX we should also give a way to write errors to a buffer
instead of to stderr.
XXX if importing 'site' fails, CPython (any version) calls
exit(). Should we try to work around this behavior here?
*/
Py_InitializeEx(0);
}
static int _cffi_initialize_python(void)
{
/* This initializes Python, imports _cffi_backend, and then the
present .dll/.so is set up as a CPython C extension module.
*/
int result;
PyGILState_STATE state;
PyObject *pycode=NULL, *global_dict=NULL, *x;
PyObject *builtins;
state = PyGILState_Ensure();
/* Call the initxxx() function from the present module. It will
create and initialize us as a CPython extension module, instead
of letting the startup Python code do it---it might reimport
the same .dll/.so and get maybe confused on some platforms.
It might also have troubles locating the .dll/.so again for all
I know.
*/
(void)_CFFI_PYTHON_STARTUP_FUNC();
if (PyErr_Occurred())
goto error;
/* Now run the Python code provided to ffi.embedding_init_code().
*/
pycode = Py_CompileString(_CFFI_PYTHON_STARTUP_CODE,
"<init code for '" _CFFI_MODULE_NAME "'>",
Py_file_input);
if (pycode == NULL)
goto error;
global_dict = PyDict_New();
if (global_dict == NULL)
goto error;
builtins = PyEval_GetBuiltins();
if (builtins == NULL)
goto error;
if (PyDict_SetItemString(global_dict, "__builtins__", builtins) < 0)
goto error;
x = PyEval_EvalCode(
#if PY_MAJOR_VERSION < 3
(PyCodeObject *)
#endif
pycode, global_dict, global_dict);
if (x == NULL)
goto error;
Py_DECREF(x);
/* Done! Now if we've been called from
_cffi_start_and_call_python() in an ``extern "Python"``, we can
only hope that the Python code did correctly set up the
corresponding @ffi.def_extern() function. Otherwise, the
general logic of ``extern "Python"`` functions (inside the
_cffi_backend module) will find that the reference is still
missing and print an error.
*/
result = 0;
done:
Py_XDECREF(pycode);
Py_XDECREF(global_dict);
PyGILState_Release(state);
return result;
error:;
{
/* Print as much information as potentially useful.
Debugging load-time failures with embedding is not fun
*/
PyObject *ecap;
PyObject *exception, *v, *tb, *f, *modules, *mod;
PyErr_Fetch(&exception, &v, &tb);
ecap = _cffi_start_error_capture();
f = PySys_GetObject((char *)"stderr");
if (f != NULL && f != Py_None) {
PyFile_WriteString(
"Failed to initialize the Python-CFFI embedding logic:\n\n", f);
}
if (exception != NULL) {
PyErr_NormalizeException(&exception, &v, &tb);
PyErr_Display(exception, v, tb);
}
Py_XDECREF(exception);
Py_XDECREF(v);
Py_XDECREF(tb);
if (f != NULL && f != Py_None) {
PyFile_WriteString("\nFrom: " _CFFI_MODULE_NAME
"\ncompiled with cffi version: 2.0.0"
"\n_cffi_backend module: ", f);
modules = PyImport_GetModuleDict();
mod = PyDict_GetItemString(modules, "_cffi_backend");
if (mod == NULL) {
PyFile_WriteString("not loaded", f);
}
else {
v = PyObject_GetAttrString(mod, "__file__");
PyFile_WriteObject(v, f, 0);
Py_XDECREF(v);
}
PyFile_WriteString("\nsys.path: ", f);
PyFile_WriteObject(PySys_GetObject((char *)"path"), f, 0);
PyFile_WriteString("\n\n", f);
}
_cffi_stop_error_capture(ecap);
}
result = -1;
goto done;
}
#if PY_VERSION_HEX < 0x03080000
PyAPI_DATA(char *) _PyParser_TokenNames[]; /* from CPython */
#endif
static int _cffi_carefully_make_gil(void)
{
/* This does the basic initialization of Python. It can be called
completely concurrently from unrelated threads. It assumes
that we don't hold the GIL before (if it exists), and we don't
hold it afterwards.
(What it really does used to be completely different in Python 2
and Python 3, with the Python 2 solution avoiding the spin-lock
around the Py_InitializeEx() call. However, after recent changes
to CPython 2.7 (issue #358) it no longer works. So we use the
Python 3 solution everywhere.)
This initializes Python by calling Py_InitializeEx().
Important: this must not be called concurrently at all.
So we use a global variable as a simple spin lock. This global
variable must be from 'libpythonX.Y.so', not from this
cffi-based extension module, because it must be shared from
different cffi-based extension modules.
In Python < 3.8, we choose
_PyParser_TokenNames[0] as a completely arbitrary pointer value
that is never written to. The default is to point to the
string "ENDMARKER". We change it temporarily to point to the
next character in that string. (Yes, I know it's REALLY
obscure.)
In Python >= 3.8, this string array is no longer writable, so
instead we pick PyCapsuleType.tp_version_tag. We can't change
Python < 3.8 because someone might use a mixture of cffi
embedded modules, some of which were compiled before this file
changed.
In Python >= 3.12, this stopped working because that particular
tp_version_tag gets modified during interpreter startup. It's
arguably a bad idea before 3.12 too, but again we can't change
that because someone might use a mixture of cffi embedded
modules, and no-one reported a bug so far. In Python >= 3.12
we go instead for PyCapsuleType.tp_as_buffer, which is supposed
to always be NULL. We write to it temporarily a pointer to
a struct full of NULLs, which is semantically the same.
*/
#ifdef WITH_THREAD
# if PY_VERSION_HEX < 0x03080000
char *volatile *lock = (char *volatile *)_PyParser_TokenNames;
char *old_value, *locked_value;
while (1) { /* spin loop */
old_value = *lock;
locked_value = old_value + 1;
if (old_value[0] == 'E') {
assert(old_value[1] == 'N');
if (cffi_compare_and_swap(lock, old_value, locked_value))
break;
}
else {
assert(old_value[0] == 'N');
/* should ideally do a spin loop instruction here, but
hard to do it portably and doesn't really matter I
think: PyEval_InitThreads() should be very fast, and
this is only run at start-up anyway. */
}
}
# else
# if PY_VERSION_HEX < 0x030C0000
int volatile *lock = (int volatile *)&PyCapsule_Type.tp_version_tag;
int old_value, locked_value = -42;
assert(!(PyCapsule_Type.tp_flags & Py_TPFLAGS_HAVE_VERSION_TAG));
# else
static struct ebp_s { PyBufferProcs buf; int mark; } empty_buffer_procs;
empty_buffer_procs.mark = -42;
PyBufferProcs *volatile *lock = (PyBufferProcs *volatile *)
&PyCapsule_Type.tp_as_buffer;
PyBufferProcs *old_value, *locked_value = &empty_buffer_procs.buf;
# endif
while (1) { /* spin loop */
old_value = *lock;
if (old_value == 0) {
if (cffi_compare_and_swap(lock, old_value, locked_value))
break;
}
else {
# if PY_VERSION_HEX < 0x030C0000
assert(old_value == locked_value);
# else
/* The pointer should point to a possibly different
empty_buffer_procs from another C extension module */
assert(((struct ebp_s *)old_value)->mark == -42);
# endif
/* should ideally do a spin loop instruction here, but
hard to do it portably and doesn't really matter I
think: PyEval_InitThreads() should be very fast, and
this is only run at start-up anyway. */
}
}
# endif
#endif
/* call Py_InitializeEx() */
if (!Py_IsInitialized()) {
_cffi_py_initialize();
#if PY_VERSION_HEX < 0x03070000
PyEval_InitThreads();
#endif
PyEval_SaveThread(); /* release the GIL */
/* the returned tstate must be the one that has been stored into the
autoTLSkey by _PyGILState_Init() called from Py_Initialize(). */
}
else {
#if PY_VERSION_HEX < 0x03070000
/* PyEval_InitThreads() is always a no-op from CPython 3.7 */
PyGILState_STATE state = PyGILState_Ensure();
PyEval_InitThreads();
PyGILState_Release(state);
#endif
}
#ifdef WITH_THREAD
/* release the lock */
while (!cffi_compare_and_swap(lock, locked_value, old_value))
;
#endif
return 0;
}
/********** end CPython-specific section **********/
#else
/********** PyPy-specific section **********/
PyMODINIT_FUNC _CFFI_PYTHON_STARTUP_FUNC(const void *[]); /* forward */
static struct _cffi_pypy_init_s {
const char *name;
void *func; /* function pointer */
const char *code;
} _cffi_pypy_init = {
_CFFI_MODULE_NAME,
_CFFI_PYTHON_STARTUP_FUNC,
_CFFI_PYTHON_STARTUP_CODE,
};
extern int pypy_carefully_make_gil(const char *);
extern int pypy_init_embedded_cffi_module(int, struct _cffi_pypy_init_s *);
static int _cffi_carefully_make_gil(void)
{
return pypy_carefully_make_gil(_CFFI_MODULE_NAME);
}
static int _cffi_initialize_python(void)
{
return pypy_init_embedded_cffi_module(0xB011, &_cffi_pypy_init);
}
/********** end PyPy-specific section **********/
#endif
#ifdef __GNUC__
__attribute__((noinline))
#endif
static _cffi_call_python_fnptr _cffi_start_python(void)
{
/* Delicate logic to initialize Python. This function can be
called multiple times concurrently, e.g. when the process calls
its first ``extern "Python"`` functions in multiple threads at
once. It can also be called recursively, in which case we must
ignore it. We also have to consider what occurs if several
different cffi-based extensions reach this code in parallel
threads---it is a different copy of the code, then, and we
can't have any shared global variable unless it comes from
'libpythonX.Y.so'.
Idea:
* _cffi_carefully_make_gil(): "carefully" call
PyEval_InitThreads() (possibly with Py_InitializeEx() first).
* then we use a (local) custom lock to make sure that a call to this
cffi-based extension will wait if another call to the *same*
extension is running the initialization in another thread.
It is reentrant, so that a recursive call will not block, but
only one from a different thread.
* then we grab the GIL and (Python 2) we call Py_InitializeEx().
At this point, concurrent calls to Py_InitializeEx() are not
possible: we have the GIL.
* do the rest of the specific initialization, which may
temporarily release the GIL but not the custom lock.
Only release the custom lock when we are done.
*/
static char called = 0;
if (_cffi_carefully_make_gil() != 0)
return NULL;
_cffi_acquire_reentrant_mutex();
/* Here the GIL exists, but we don't have it. We're only protected
from concurrency by the reentrant mutex. */
/* This file only initializes the embedded module once, the first
time this is called, even if there are subinterpreters. */
if (!called) {
called = 1; /* invoke _cffi_initialize_python() only once,
but don't set '_cffi_call_python' right now,
otherwise concurrent threads won't call
this function at all (we need them to wait) */
if (_cffi_initialize_python() == 0) {
/* now initialization is finished. Switch to the fast-path. */
/* We would like nobody to see the new value of
'_cffi_call_python' without also seeing the rest of the
data initialized. However, this is not possible. But
the new value of '_cffi_call_python' is the function
'cffi_call_python()' from _cffi_backend. So: */
cffi_write_barrier();
/* ^^^ we put a write barrier here, and a corresponding
read barrier at the start of cffi_call_python(). This
ensures that after that read barrier, we see everything
done here before the write barrier.
*/
assert(_cffi_call_python_org != NULL);
_cffi_call_python = (_cffi_call_python_fnptr)_cffi_call_python_org;
}
else {
/* initialization failed. Reset this to NULL, even if it was
already set to some other value. Future calls to
_cffi_start_python() are still forced to occur, and will
always return NULL from now on. */
_cffi_call_python_org = NULL;
}
}
_cffi_release_reentrant_mutex();
return (_cffi_call_python_fnptr)_cffi_call_python_org;
}
static
void _cffi_start_and_call_python(struct _cffi_externpy_s *externpy, char *args)
{
_cffi_call_python_fnptr fnptr;
int current_err = errno;
#ifdef _MSC_VER
int current_lasterr = GetLastError();
#endif
fnptr = _cffi_start_python();
if (fnptr == NULL) {
fprintf(stderr, "function %s() called, but initialization code "
"failed. Returning 0.\n", externpy->name);
memset(args, 0, externpy->size_of_result);
}
#ifdef _MSC_VER
SetLastError(current_lasterr);
#endif
errno = current_err;
if (fnptr != NULL)
fnptr(externpy, args);
}
/* The cffi_start_python() function makes sure Python is initialized
and our cffi module is set up. It can be called manually from the
user C code. The same effect is obtained automatically from any
dll-exported ``extern "Python"`` function. This function returns
-1 if initialization failed, 0 if all is OK. */
_CFFI_UNUSED_FN
static int cffi_start_python(void)
{
if (_cffi_call_python == &_cffi_start_and_call_python) {
if (_cffi_start_python() == NULL)
return -1;
}
cffi_read_barrier();
return 0;
}
#undef cffi_compare_and_swap
#undef cffi_write_barrier
#undef cffi_read_barrier
#ifdef __cplusplus
}
#endif

View File

@@ -0,0 +1,83 @@
try:
# this works on Python < 3.12
from imp import *
except ImportError:
# this is a limited emulation for Python >= 3.12.
# Note that this is used only for tests or for the old ffi.verify().
# This is copied from the source code of Python 3.11.
from _imp import (acquire_lock, release_lock,
is_builtin, is_frozen)
from importlib._bootstrap import _load
from importlib import machinery
import os
import sys
import tokenize
SEARCH_ERROR = 0
PY_SOURCE = 1
PY_COMPILED = 2
C_EXTENSION = 3
PY_RESOURCE = 4
PKG_DIRECTORY = 5
C_BUILTIN = 6
PY_FROZEN = 7
PY_CODERESOURCE = 8
IMP_HOOK = 9
def get_suffixes():
extensions = [(s, 'rb', C_EXTENSION)
for s in machinery.EXTENSION_SUFFIXES]
source = [(s, 'r', PY_SOURCE) for s in machinery.SOURCE_SUFFIXES]
bytecode = [(s, 'rb', PY_COMPILED) for s in machinery.BYTECODE_SUFFIXES]
return extensions + source + bytecode
def find_module(name, path=None):
if not isinstance(name, str):
raise TypeError("'name' must be a str, not {}".format(type(name)))
elif not isinstance(path, (type(None), list)):
# Backwards-compatibility
raise RuntimeError("'path' must be None or a list, "
"not {}".format(type(path)))
if path is None:
if is_builtin(name):
return None, None, ('', '', C_BUILTIN)
elif is_frozen(name):
return None, None, ('', '', PY_FROZEN)
else:
path = sys.path
for entry in path:
package_directory = os.path.join(entry, name)
for suffix in ['.py', machinery.BYTECODE_SUFFIXES[0]]:
package_file_name = '__init__' + suffix
file_path = os.path.join(package_directory, package_file_name)
if os.path.isfile(file_path):
return None, package_directory, ('', '', PKG_DIRECTORY)
for suffix, mode, type_ in get_suffixes():
file_name = name + suffix
file_path = os.path.join(entry, file_name)
if os.path.isfile(file_path):
break
else:
continue
break # Break out of outer loop when breaking out of inner loop.
else:
raise ImportError(name, name=name)
encoding = None
if 'b' not in mode:
with open(file_path, 'rb') as file:
encoding = tokenize.detect_encoding(file.readline)[0]
file = open(file_path, mode, encoding=encoding)
return file, file_path, (suffix, mode, type_)
def load_dynamic(name, path, file=None):
loader = machinery.ExtensionFileLoader(name, path)
spec = machinery.ModuleSpec(name=name, loader=loader, origin=path)
return _load(spec)

View File

@@ -0,0 +1,45 @@
"""
Temporary shim module to indirect the bits of distutils we need from setuptools/distutils while providing useful
error messages beyond `No module named 'distutils' on Python >= 3.12, or when setuptools' vendored distutils is broken.
This is a compromise to avoid a hard-dep on setuptools for Python >= 3.12, since many users don't need runtime compilation support from CFFI.
"""
import sys
try:
# import setuptools first; this is the most robust way to ensure its embedded distutils is available
# (the .pth shim should usually work, but this is even more robust)
import setuptools
except Exception as ex:
if sys.version_info >= (3, 12):
# Python 3.12 has no built-in distutils to fall back on, so any import problem is fatal
raise Exception("This CFFI feature requires setuptools on Python >= 3.12. The setuptools module is missing or non-functional.") from ex
# silently ignore on older Pythons (support fallback to stdlib distutils where available)
else:
del setuptools
try:
# bring in just the bits of distutils we need, whether they really came from setuptools or stdlib-embedded distutils
from distutils import log, sysconfig
from distutils.ccompiler import CCompiler
from distutils.command.build_ext import build_ext
from distutils.core import Distribution, Extension
from distutils.dir_util import mkpath
from distutils.errors import DistutilsSetupError, CompileError, LinkError
from distutils.log import set_threshold, set_verbosity
if sys.platform == 'win32':
try:
# FUTURE: msvc9compiler module was removed in setuptools 74; consider removing, as it's only used by an ancient patch in `recompiler`
from distutils.msvc9compiler import MSVCCompiler
except ImportError:
MSVCCompiler = None
except Exception as ex:
if sys.version_info >= (3, 12):
raise Exception("This CFFI feature requires setuptools on Python >= 3.12. Please install the setuptools package.") from ex
# anything older, just let the underlying distutils import error fly
raise Exception("This CFFI feature requires distutils. Please install the distutils or setuptools package.") from ex
del sys

View File

@@ -0,0 +1,967 @@
import sys, types
from .lock import allocate_lock
from .error import CDefError
from . import model
try:
callable
except NameError:
# Python 3.1
from collections import Callable
callable = lambda x: isinstance(x, Callable)
try:
basestring
except NameError:
# Python 3.x
basestring = str
_unspecified = object()
class FFI(object):
r'''
The main top-level class that you instantiate once, or once per module.
Example usage:
ffi = FFI()
ffi.cdef("""
int printf(const char *, ...);
""")
C = ffi.dlopen(None) # standard library
-or-
C = ffi.verify() # use a C compiler: verify the decl above is right
C.printf("hello, %s!\n", ffi.new("char[]", "world"))
'''
def __init__(self, backend=None):
"""Create an FFI instance. The 'backend' argument is used to
select a non-default backend, mostly for tests.
"""
if backend is None:
# You need PyPy (>= 2.0 beta), or a CPython (>= 2.6) with
# _cffi_backend.so compiled.
import _cffi_backend as backend
from . import __version__
if backend.__version__ != __version__:
# bad version! Try to be as explicit as possible.
if hasattr(backend, '__file__'):
# CPython
raise Exception("Version mismatch: this is the 'cffi' package version %s, located in %r. When we import the top-level '_cffi_backend' extension module, we get version %s, located in %r. The two versions should be equal; check your installation." % (
__version__, __file__,
backend.__version__, backend.__file__))
else:
# PyPy
raise Exception("Version mismatch: this is the 'cffi' package version %s, located in %r. This interpreter comes with a built-in '_cffi_backend' module, which is version %s. The two versions should be equal; check your installation." % (
__version__, __file__, backend.__version__))
# (If you insist you can also try to pass the option
# 'backend=backend_ctypes.CTypesBackend()', but don't
# rely on it! It's probably not going to work well.)
from . import cparser
self._backend = backend
self._lock = allocate_lock()
self._parser = cparser.Parser()
self._cached_btypes = {}
self._parsed_types = types.ModuleType('parsed_types').__dict__
self._new_types = types.ModuleType('new_types').__dict__
self._function_caches = []
self._libraries = []
self._cdefsources = []
self._included_ffis = []
self._windows_unicode = None
self._init_once_cache = {}
self._cdef_version = None
self._embedding = None
self._typecache = model.get_typecache(backend)
if hasattr(backend, 'set_ffi'):
backend.set_ffi(self)
for name in list(backend.__dict__):
if name.startswith('RTLD_'):
setattr(self, name, getattr(backend, name))
#
with self._lock:
self.BVoidP = self._get_cached_btype(model.voidp_type)
self.BCharA = self._get_cached_btype(model.char_array_type)
if isinstance(backend, types.ModuleType):
# _cffi_backend: attach these constants to the class
if not hasattr(FFI, 'NULL'):
FFI.NULL = self.cast(self.BVoidP, 0)
FFI.CData, FFI.CType = backend._get_types()
else:
# ctypes backend: attach these constants to the instance
self.NULL = self.cast(self.BVoidP, 0)
self.CData, self.CType = backend._get_types()
self.buffer = backend.buffer
def cdef(self, csource, override=False, packed=False, pack=None):
"""Parse the given C source. This registers all declared functions,
types, and global variables. The functions and global variables can
then be accessed via either 'ffi.dlopen()' or 'ffi.verify()'.
The types can be used in 'ffi.new()' and other functions.
If 'packed' is specified as True, all structs declared inside this
cdef are packed, i.e. laid out without any field alignment at all.
Alternatively, 'pack' can be a small integer, and requests for
alignment greater than that are ignored (pack=1 is equivalent to
packed=True).
"""
self._cdef(csource, override=override, packed=packed, pack=pack)
def embedding_api(self, csource, packed=False, pack=None):
self._cdef(csource, packed=packed, pack=pack, dllexport=True)
if self._embedding is None:
self._embedding = ''
def _cdef(self, csource, override=False, **options):
if not isinstance(csource, str): # unicode, on Python 2
if not isinstance(csource, basestring):
raise TypeError("cdef() argument must be a string")
csource = csource.encode('ascii')
with self._lock:
self._cdef_version = object()
self._parser.parse(csource, override=override, **options)
self._cdefsources.append(csource)
if override:
for cache in self._function_caches:
cache.clear()
finishlist = self._parser._recomplete
if finishlist:
self._parser._recomplete = []
for tp in finishlist:
tp.finish_backend_type(self, finishlist)
def dlopen(self, name, flags=0):
"""Load and return a dynamic library identified by 'name'.
The standard C library can be loaded by passing None.
Note that functions and types declared by 'ffi.cdef()' are not
linked to a particular library, just like C headers; in the
library we only look for the actual (untyped) symbols.
"""
if not (isinstance(name, basestring) or
name is None or
isinstance(name, self.CData)):
raise TypeError("dlopen(name): name must be a file name, None, "
"or an already-opened 'void *' handle")
with self._lock:
lib, function_cache = _make_ffi_library(self, name, flags)
self._function_caches.append(function_cache)
self._libraries.append(lib)
return lib
def dlclose(self, lib):
"""Close a library obtained with ffi.dlopen(). After this call,
access to functions or variables from the library will fail
(possibly with a segmentation fault).
"""
type(lib).__cffi_close__(lib)
def _typeof_locked(self, cdecl):
# call me with the lock!
key = cdecl
if key in self._parsed_types:
return self._parsed_types[key]
#
if not isinstance(cdecl, str): # unicode, on Python 2
cdecl = cdecl.encode('ascii')
#
type = self._parser.parse_type(cdecl)
really_a_function_type = type.is_raw_function
if really_a_function_type:
type = type.as_function_pointer()
btype = self._get_cached_btype(type)
result = btype, really_a_function_type
self._parsed_types[key] = result
return result
def _typeof(self, cdecl, consider_function_as_funcptr=False):
# string -> ctype object
try:
result = self._parsed_types[cdecl]
except KeyError:
with self._lock:
result = self._typeof_locked(cdecl)
#
btype, really_a_function_type = result
if really_a_function_type and not consider_function_as_funcptr:
raise CDefError("the type %r is a function type, not a "
"pointer-to-function type" % (cdecl,))
return btype
def typeof(self, cdecl):
"""Parse the C type given as a string and return the
corresponding <ctype> object.
It can also be used on 'cdata' instance to get its C type.
"""
if isinstance(cdecl, basestring):
return self._typeof(cdecl)
if isinstance(cdecl, self.CData):
return self._backend.typeof(cdecl)
if isinstance(cdecl, types.BuiltinFunctionType):
res = _builtin_function_type(cdecl)
if res is not None:
return res
if (isinstance(cdecl, types.FunctionType)
and hasattr(cdecl, '_cffi_base_type')):
with self._lock:
return self._get_cached_btype(cdecl._cffi_base_type)
raise TypeError(type(cdecl))
def sizeof(self, cdecl):
"""Return the size in bytes of the argument. It can be a
string naming a C type, or a 'cdata' instance.
"""
if isinstance(cdecl, basestring):
BType = self._typeof(cdecl)
return self._backend.sizeof(BType)
else:
return self._backend.sizeof(cdecl)
def alignof(self, cdecl):
"""Return the natural alignment size in bytes of the C type
given as a string.
"""
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return self._backend.alignof(cdecl)
def offsetof(self, cdecl, *fields_or_indexes):
"""Return the offset of the named field inside the given
structure or array, which must be given as a C type name.
You can give several field names in case of nested structures.
You can also give numeric values which correspond to array
items, in case of an array type.
"""
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return self._typeoffsetof(cdecl, *fields_or_indexes)[1]
def new(self, cdecl, init=None):
"""Allocate an instance according to the specified C type and
return a pointer to it. The specified C type must be either a
pointer or an array: ``new('X *')`` allocates an X and returns
a pointer to it, whereas ``new('X[n]')`` allocates an array of
n X'es and returns an array referencing it (which works
mostly like a pointer, like in C). You can also use
``new('X[]', n)`` to allocate an array of a non-constant
length n.
The memory is initialized following the rules of declaring a
global variable in C: by default it is zero-initialized, but
an explicit initializer can be given which can be used to
fill all or part of the memory.
When the returned <cdata> object goes out of scope, the memory
is freed. In other words the returned <cdata> object has
ownership of the value of type 'cdecl' that it points to. This
means that the raw data can be used as long as this object is
kept alive, but must not be used for a longer time. Be careful
about that when copying the pointer to the memory somewhere
else, e.g. into another structure.
"""
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return self._backend.newp(cdecl, init)
def new_allocator(self, alloc=None, free=None,
should_clear_after_alloc=True):
"""Return a new allocator, i.e. a function that behaves like ffi.new()
but uses the provided low-level 'alloc' and 'free' functions.
'alloc' is called with the size as argument. If it returns NULL, a
MemoryError is raised. 'free' is called with the result of 'alloc'
as argument. Both can be either Python function or directly C
functions. If 'free' is None, then no free function is called.
If both 'alloc' and 'free' are None, the default is used.
If 'should_clear_after_alloc' is set to False, then the memory
returned by 'alloc' is assumed to be already cleared (or you are
fine with garbage); otherwise CFFI will clear it.
"""
compiled_ffi = self._backend.FFI()
allocator = compiled_ffi.new_allocator(alloc, free,
should_clear_after_alloc)
def allocate(cdecl, init=None):
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return allocator(cdecl, init)
return allocate
def cast(self, cdecl, source):
"""Similar to a C cast: returns an instance of the named C
type initialized with the given 'source'. The source is
casted between integers or pointers of any type.
"""
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return self._backend.cast(cdecl, source)
def string(self, cdata, maxlen=-1):
"""Return a Python string (or unicode string) from the 'cdata'.
If 'cdata' is a pointer or array of characters or bytes, returns
the null-terminated string. The returned string extends until
the first null character, or at most 'maxlen' characters. If
'cdata' is an array then 'maxlen' defaults to its length.
If 'cdata' is a pointer or array of wchar_t, returns a unicode
string following the same rules.
If 'cdata' is a single character or byte or a wchar_t, returns
it as a string or unicode string.
If 'cdata' is an enum, returns the value of the enumerator as a
string, or 'NUMBER' if the value is out of range.
"""
return self._backend.string(cdata, maxlen)
def unpack(self, cdata, length):
"""Unpack an array of C data of the given length,
returning a Python string/unicode/list.
If 'cdata' is a pointer to 'char', returns a byte string.
It does not stop at the first null. This is equivalent to:
ffi.buffer(cdata, length)[:]
If 'cdata' is a pointer to 'wchar_t', returns a unicode string.
'length' is measured in wchar_t's; it is not the size in bytes.
If 'cdata' is a pointer to anything else, returns a list of
'length' items. This is a faster equivalent to:
[cdata[i] for i in range(length)]
"""
return self._backend.unpack(cdata, length)
#def buffer(self, cdata, size=-1):
# """Return a read-write buffer object that references the raw C data
# pointed to by the given 'cdata'. The 'cdata' must be a pointer or
# an array. Can be passed to functions expecting a buffer, or directly
# manipulated with:
#
# buf[:] get a copy of it in a regular string, or
# buf[idx] as a single character
# buf[:] = ...
# buf[idx] = ... change the content
# """
# note that 'buffer' is a type, set on this instance by __init__
def from_buffer(self, cdecl, python_buffer=_unspecified,
require_writable=False):
"""Return a cdata of the given type pointing to the data of the
given Python object, which must support the buffer interface.
Note that this is not meant to be used on the built-in types
str or unicode (you can build 'char[]' arrays explicitly)
but only on objects containing large quantities of raw data
in some other format, like 'array.array' or numpy arrays.
The first argument is optional and default to 'char[]'.
"""
if python_buffer is _unspecified:
cdecl, python_buffer = self.BCharA, cdecl
elif isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
return self._backend.from_buffer(cdecl, python_buffer,
require_writable)
def memmove(self, dest, src, n):
"""ffi.memmove(dest, src, n) copies n bytes of memory from src to dest.
Like the C function memmove(), the memory areas may overlap;
apart from that it behaves like the C function memcpy().
'src' can be any cdata ptr or array, or any Python buffer object.
'dest' can be any cdata ptr or array, or a writable Python buffer
object. The size to copy, 'n', is always measured in bytes.
Unlike other methods, this one supports all Python buffer including
byte strings and bytearrays---but it still does not support
non-contiguous buffers.
"""
return self._backend.memmove(dest, src, n)
def callback(self, cdecl, python_callable=None, error=None, onerror=None):
"""Return a callback object or a decorator making such a
callback object. 'cdecl' must name a C function pointer type.
The callback invokes the specified 'python_callable' (which may
be provided either directly or via a decorator). Important: the
callback object must be manually kept alive for as long as the
callback may be invoked from the C level.
"""
def callback_decorator_wrap(python_callable):
if not callable(python_callable):
raise TypeError("the 'python_callable' argument "
"is not callable")
return self._backend.callback(cdecl, python_callable,
error, onerror)
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl, consider_function_as_funcptr=True)
if python_callable is None:
return callback_decorator_wrap # decorator mode
else:
return callback_decorator_wrap(python_callable) # direct mode
def getctype(self, cdecl, replace_with=''):
"""Return a string giving the C type 'cdecl', which may be itself
a string or a <ctype> object. If 'replace_with' is given, it gives
extra text to append (or insert for more complicated C types), like
a variable name, or '*' to get actually the C type 'pointer-to-cdecl'.
"""
if isinstance(cdecl, basestring):
cdecl = self._typeof(cdecl)
replace_with = replace_with.strip()
if (replace_with.startswith('*')
and '&[' in self._backend.getcname(cdecl, '&')):
replace_with = '(%s)' % replace_with
elif replace_with and not replace_with[0] in '[(':
replace_with = ' ' + replace_with
return self._backend.getcname(cdecl, replace_with)
def gc(self, cdata, destructor, size=0):
"""Return a new cdata object that points to the same
data. Later, when this new cdata object is garbage-collected,
'destructor(old_cdata_object)' will be called.
The optional 'size' gives an estimate of the size, used to
trigger the garbage collection more eagerly. So far only used
on PyPy. It tells the GC that the returned object keeps alive
roughly 'size' bytes of external memory.
"""
return self._backend.gcp(cdata, destructor, size)
def _get_cached_btype(self, type):
assert self._lock.acquire(False) is False
# call me with the lock!
try:
BType = self._cached_btypes[type]
except KeyError:
finishlist = []
BType = type.get_cached_btype(self, finishlist)
for type in finishlist:
type.finish_backend_type(self, finishlist)
return BType
def verify(self, source='', tmpdir=None, **kwargs):
"""Verify that the current ffi signatures compile on this
machine, and return a dynamic library object. The dynamic
library can be used to call functions and access global
variables declared in this 'ffi'. The library is compiled
by the C compiler: it gives you C-level API compatibility
(including calling macros). This is unlike 'ffi.dlopen()',
which requires binary compatibility in the signatures.
"""
from .verifier import Verifier, _caller_dir_pycache
#
# If set_unicode(True) was called, insert the UNICODE and
# _UNICODE macro declarations
if self._windows_unicode:
self._apply_windows_unicode(kwargs)
#
# Set the tmpdir here, and not in Verifier.__init__: it picks
# up the caller's directory, which we want to be the caller of
# ffi.verify(), as opposed to the caller of Veritier().
tmpdir = tmpdir or _caller_dir_pycache()
#
# Make a Verifier() and use it to load the library.
self.verifier = Verifier(self, source, tmpdir, **kwargs)
lib = self.verifier.load_library()
#
# Save the loaded library for keep-alive purposes, even
# if the caller doesn't keep it alive itself (it should).
self._libraries.append(lib)
return lib
def _get_errno(self):
return self._backend.get_errno()
def _set_errno(self, errno):
self._backend.set_errno(errno)
errno = property(_get_errno, _set_errno, None,
"the value of 'errno' from/to the C calls")
def getwinerror(self, code=-1):
return self._backend.getwinerror(code)
def _pointer_to(self, ctype):
with self._lock:
return model.pointer_cache(self, ctype)
def addressof(self, cdata, *fields_or_indexes):
"""Return the address of a <cdata 'struct-or-union'>.
If 'fields_or_indexes' are given, returns the address of that
field or array item in the structure or array, recursively in
case of nested structures.
"""
try:
ctype = self._backend.typeof(cdata)
except TypeError:
if '__addressof__' in type(cdata).__dict__:
return type(cdata).__addressof__(cdata, *fields_or_indexes)
raise
if fields_or_indexes:
ctype, offset = self._typeoffsetof(ctype, *fields_or_indexes)
else:
if ctype.kind == "pointer":
raise TypeError("addressof(pointer)")
offset = 0
ctypeptr = self._pointer_to(ctype)
return self._backend.rawaddressof(ctypeptr, cdata, offset)
def _typeoffsetof(self, ctype, field_or_index, *fields_or_indexes):
ctype, offset = self._backend.typeoffsetof(ctype, field_or_index)
for field1 in fields_or_indexes:
ctype, offset1 = self._backend.typeoffsetof(ctype, field1, 1)
offset += offset1
return ctype, offset
def include(self, ffi_to_include):
"""Includes the typedefs, structs, unions and enums defined
in another FFI instance. Usage is similar to a #include in C,
where a part of the program might include types defined in
another part for its own usage. Note that the include()
method has no effect on functions, constants and global
variables, which must anyway be accessed directly from the
lib object returned by the original FFI instance.
"""
if not isinstance(ffi_to_include, FFI):
raise TypeError("ffi.include() expects an argument that is also of"
" type cffi.FFI, not %r" % (
type(ffi_to_include).__name__,))
if ffi_to_include is self:
raise ValueError("self.include(self)")
with ffi_to_include._lock:
with self._lock:
self._parser.include(ffi_to_include._parser)
self._cdefsources.append('[')
self._cdefsources.extend(ffi_to_include._cdefsources)
self._cdefsources.append(']')
self._included_ffis.append(ffi_to_include)
def new_handle(self, x):
return self._backend.newp_handle(self.BVoidP, x)
def from_handle(self, x):
return self._backend.from_handle(x)
def release(self, x):
self._backend.release(x)
def set_unicode(self, enabled_flag):
"""Windows: if 'enabled_flag' is True, enable the UNICODE and
_UNICODE defines in C, and declare the types like TCHAR and LPTCSTR
to be (pointers to) wchar_t. If 'enabled_flag' is False,
declare these types to be (pointers to) plain 8-bit characters.
This is mostly for backward compatibility; you usually want True.
"""
if self._windows_unicode is not None:
raise ValueError("set_unicode() can only be called once")
enabled_flag = bool(enabled_flag)
if enabled_flag:
self.cdef("typedef wchar_t TBYTE;"
"typedef wchar_t TCHAR;"
"typedef const wchar_t *LPCTSTR;"
"typedef const wchar_t *PCTSTR;"
"typedef wchar_t *LPTSTR;"
"typedef wchar_t *PTSTR;"
"typedef TBYTE *PTBYTE;"
"typedef TCHAR *PTCHAR;")
else:
self.cdef("typedef char TBYTE;"
"typedef char TCHAR;"
"typedef const char *LPCTSTR;"
"typedef const char *PCTSTR;"
"typedef char *LPTSTR;"
"typedef char *PTSTR;"
"typedef TBYTE *PTBYTE;"
"typedef TCHAR *PTCHAR;")
self._windows_unicode = enabled_flag
def _apply_windows_unicode(self, kwds):
defmacros = kwds.get('define_macros', ())
if not isinstance(defmacros, (list, tuple)):
raise TypeError("'define_macros' must be a list or tuple")
defmacros = list(defmacros) + [('UNICODE', '1'),
('_UNICODE', '1')]
kwds['define_macros'] = defmacros
def _apply_embedding_fix(self, kwds):
# must include an argument like "-lpython2.7" for the compiler
def ensure(key, value):
lst = kwds.setdefault(key, [])
if value not in lst:
lst.append(value)
#
if '__pypy__' in sys.builtin_module_names:
import os
if sys.platform == "win32":
# we need 'libpypy-c.lib'. Current distributions of
# pypy (>= 4.1) contain it as 'libs/python27.lib'.
pythonlib = "python{0[0]}{0[1]}".format(sys.version_info)
if hasattr(sys, 'prefix'):
ensure('library_dirs', os.path.join(sys.prefix, 'libs'))
else:
# we need 'libpypy-c.{so,dylib}', which should be by
# default located in 'sys.prefix/bin' for installed
# systems.
if sys.version_info < (3,):
pythonlib = "pypy-c"
else:
pythonlib = "pypy3-c"
if hasattr(sys, 'prefix'):
ensure('library_dirs', os.path.join(sys.prefix, 'bin'))
# On uninstalled pypy's, the libpypy-c is typically found in
# .../pypy/goal/.
if hasattr(sys, 'prefix'):
ensure('library_dirs', os.path.join(sys.prefix, 'pypy', 'goal'))
else:
if sys.platform == "win32":
template = "python%d%d"
if hasattr(sys, 'gettotalrefcount'):
template += '_d'
else:
try:
import sysconfig
except ImportError: # 2.6
from cffi._shimmed_dist_utils import sysconfig
template = "python%d.%d"
if sysconfig.get_config_var('DEBUG_EXT'):
template += sysconfig.get_config_var('DEBUG_EXT')
pythonlib = (template %
(sys.hexversion >> 24, (sys.hexversion >> 16) & 0xff))
if hasattr(sys, 'abiflags'):
pythonlib += sys.abiflags
ensure('libraries', pythonlib)
if sys.platform == "win32":
ensure('extra_link_args', '/MANIFEST')
def set_source(self, module_name, source, source_extension='.c', **kwds):
import os
if hasattr(self, '_assigned_source'):
raise ValueError("set_source() cannot be called several times "
"per ffi object")
if not isinstance(module_name, basestring):
raise TypeError("'module_name' must be a string")
if os.sep in module_name or (os.altsep and os.altsep in module_name):
raise ValueError("'module_name' must not contain '/': use a dotted "
"name to make a 'package.module' location")
self._assigned_source = (str(module_name), source,
source_extension, kwds)
def set_source_pkgconfig(self, module_name, pkgconfig_libs, source,
source_extension='.c', **kwds):
from . import pkgconfig
if not isinstance(pkgconfig_libs, list):
raise TypeError("the pkgconfig_libs argument must be a list "
"of package names")
kwds2 = pkgconfig.flags_from_pkgconfig(pkgconfig_libs)
pkgconfig.merge_flags(kwds, kwds2)
self.set_source(module_name, source, source_extension, **kwds)
def distutils_extension(self, tmpdir='build', verbose=True):
from cffi._shimmed_dist_utils import mkpath
from .recompiler import recompile
#
if not hasattr(self, '_assigned_source'):
if hasattr(self, 'verifier'): # fallback, 'tmpdir' ignored
return self.verifier.get_extension()
raise ValueError("set_source() must be called before"
" distutils_extension()")
module_name, source, source_extension, kwds = self._assigned_source
if source is None:
raise TypeError("distutils_extension() is only for C extension "
"modules, not for dlopen()-style pure Python "
"modules")
mkpath(tmpdir)
ext, updated = recompile(self, module_name,
source, tmpdir=tmpdir, extradir=tmpdir,
source_extension=source_extension,
call_c_compiler=False, **kwds)
if verbose:
if updated:
sys.stderr.write("regenerated: %r\n" % (ext.sources[0],))
else:
sys.stderr.write("not modified: %r\n" % (ext.sources[0],))
return ext
def emit_c_code(self, filename):
from .recompiler import recompile
#
if not hasattr(self, '_assigned_source'):
raise ValueError("set_source() must be called before emit_c_code()")
module_name, source, source_extension, kwds = self._assigned_source
if source is None:
raise TypeError("emit_c_code() is only for C extension modules, "
"not for dlopen()-style pure Python modules")
recompile(self, module_name, source,
c_file=filename, call_c_compiler=False,
uses_ffiplatform=False, **kwds)
def emit_python_code(self, filename):
from .recompiler import recompile
#
if not hasattr(self, '_assigned_source'):
raise ValueError("set_source() must be called before emit_c_code()")
module_name, source, source_extension, kwds = self._assigned_source
if source is not None:
raise TypeError("emit_python_code() is only for dlopen()-style "
"pure Python modules, not for C extension modules")
recompile(self, module_name, source,
c_file=filename, call_c_compiler=False,
uses_ffiplatform=False, **kwds)
def compile(self, tmpdir='.', verbose=0, target=None, debug=None):
"""The 'target' argument gives the final file name of the
compiled DLL. Use '*' to force distutils' choice, suitable for
regular CPython C API modules. Use a file name ending in '.*'
to ask for the system's default extension for dynamic libraries
(.so/.dll/.dylib).
The default is '*' when building a non-embedded C API extension,
and (module_name + '.*') when building an embedded library.
"""
from .recompiler import recompile
#
if not hasattr(self, '_assigned_source'):
raise ValueError("set_source() must be called before compile()")
module_name, source, source_extension, kwds = self._assigned_source
return recompile(self, module_name, source, tmpdir=tmpdir,
target=target, source_extension=source_extension,
compiler_verbose=verbose, debug=debug, **kwds)
def init_once(self, func, tag):
# Read _init_once_cache[tag], which is either (False, lock) if
# we're calling the function now in some thread, or (True, result).
# Don't call setdefault() in most cases, to avoid allocating and
# immediately freeing a lock; but still use setdefaut() to avoid
# races.
try:
x = self._init_once_cache[tag]
except KeyError:
x = self._init_once_cache.setdefault(tag, (False, allocate_lock()))
# Common case: we got (True, result), so we return the result.
if x[0]:
return x[1]
# Else, it's a lock. Acquire it to serialize the following tests.
with x[1]:
# Read again from _init_once_cache the current status.
x = self._init_once_cache[tag]
if x[0]:
return x[1]
# Call the function and store the result back.
result = func()
self._init_once_cache[tag] = (True, result)
return result
def embedding_init_code(self, pysource):
if self._embedding:
raise ValueError("embedding_init_code() can only be called once")
# fix 'pysource' before it gets dumped into the C file:
# - remove empty lines at the beginning, so it starts at "line 1"
# - dedent, if all non-empty lines are indented
# - check for SyntaxErrors
import re
match = re.match(r'\s*\n', pysource)
if match:
pysource = pysource[match.end():]
lines = pysource.splitlines() or ['']
prefix = re.match(r'\s*', lines[0]).group()
for i in range(1, len(lines)):
line = lines[i]
if line.rstrip():
while not line.startswith(prefix):
prefix = prefix[:-1]
i = len(prefix)
lines = [line[i:]+'\n' for line in lines]
pysource = ''.join(lines)
#
compile(pysource, "cffi_init", "exec")
#
self._embedding = pysource
def def_extern(self, *args, **kwds):
raise ValueError("ffi.def_extern() is only available on API-mode FFI "
"objects")
def list_types(self):
"""Returns the user type names known to this FFI instance.
This returns a tuple containing three lists of names:
(typedef_names, names_of_structs, names_of_unions)
"""
typedefs = []
structs = []
unions = []
for key in self._parser._declarations:
if key.startswith('typedef '):
typedefs.append(key[8:])
elif key.startswith('struct '):
structs.append(key[7:])
elif key.startswith('union '):
unions.append(key[6:])
typedefs.sort()
structs.sort()
unions.sort()
return (typedefs, structs, unions)
def _load_backend_lib(backend, name, flags):
import os
if not isinstance(name, basestring):
if sys.platform != "win32" or name is not None:
return backend.load_library(name, flags)
name = "c" # Windows: load_library(None) fails, but this works
# on Python 2 (backward compatibility hack only)
first_error = None
if '.' in name or '/' in name or os.sep in name:
try:
return backend.load_library(name, flags)
except OSError as e:
first_error = e
import ctypes.util
path = ctypes.util.find_library(name)
if path is None:
if name == "c" and sys.platform == "win32" and sys.version_info >= (3,):
raise OSError("dlopen(None) cannot work on Windows for Python 3 "
"(see http://bugs.python.org/issue23606)")
msg = ("ctypes.util.find_library() did not manage "
"to locate a library called %r" % (name,))
if first_error is not None:
msg = "%s. Additionally, %s" % (first_error, msg)
raise OSError(msg)
return backend.load_library(path, flags)
def _make_ffi_library(ffi, libname, flags):
backend = ffi._backend
backendlib = _load_backend_lib(backend, libname, flags)
#
def accessor_function(name):
key = 'function ' + name
tp, _ = ffi._parser._declarations[key]
BType = ffi._get_cached_btype(tp)
value = backendlib.load_function(BType, name)
library.__dict__[name] = value
#
def accessor_variable(name):
key = 'variable ' + name
tp, _ = ffi._parser._declarations[key]
BType = ffi._get_cached_btype(tp)
read_variable = backendlib.read_variable
write_variable = backendlib.write_variable
setattr(FFILibrary, name, property(
lambda self: read_variable(BType, name),
lambda self, value: write_variable(BType, name, value)))
#
def addressof_var(name):
try:
return addr_variables[name]
except KeyError:
with ffi._lock:
if name not in addr_variables:
key = 'variable ' + name
tp, _ = ffi._parser._declarations[key]
BType = ffi._get_cached_btype(tp)
if BType.kind != 'array':
BType = model.pointer_cache(ffi, BType)
p = backendlib.load_function(BType, name)
addr_variables[name] = p
return addr_variables[name]
#
def accessor_constant(name):
raise NotImplementedError("non-integer constant '%s' cannot be "
"accessed from a dlopen() library" % (name,))
#
def accessor_int_constant(name):
library.__dict__[name] = ffi._parser._int_constants[name]
#
accessors = {}
accessors_version = [False]
addr_variables = {}
#
def update_accessors():
if accessors_version[0] is ffi._cdef_version:
return
#
for key, (tp, _) in ffi._parser._declarations.items():
if not isinstance(tp, model.EnumType):
tag, name = key.split(' ', 1)
if tag == 'function':
accessors[name] = accessor_function
elif tag == 'variable':
accessors[name] = accessor_variable
elif tag == 'constant':
accessors[name] = accessor_constant
else:
for i, enumname in enumerate(tp.enumerators):
def accessor_enum(name, tp=tp, i=i):
tp.check_not_partial()
library.__dict__[name] = tp.enumvalues[i]
accessors[enumname] = accessor_enum
for name in ffi._parser._int_constants:
accessors.setdefault(name, accessor_int_constant)
accessors_version[0] = ffi._cdef_version
#
def make_accessor(name):
with ffi._lock:
if name in library.__dict__ or name in FFILibrary.__dict__:
return # added by another thread while waiting for the lock
if name not in accessors:
update_accessors()
if name not in accessors:
raise AttributeError(name)
accessors[name](name)
#
class FFILibrary(object):
def __getattr__(self, name):
make_accessor(name)
return getattr(self, name)
def __setattr__(self, name, value):
try:
property = getattr(self.__class__, name)
except AttributeError:
make_accessor(name)
setattr(self, name, value)
else:
property.__set__(self, value)
def __dir__(self):
with ffi._lock:
update_accessors()
return accessors.keys()
def __addressof__(self, name):
if name in library.__dict__:
return library.__dict__[name]
if name in FFILibrary.__dict__:
return addressof_var(name)
make_accessor(name)
if name in library.__dict__:
return library.__dict__[name]
if name in FFILibrary.__dict__:
return addressof_var(name)
raise AttributeError("cffi library has no function or "
"global variable named '%s'" % (name,))
def __cffi_close__(self):
backendlib.close_lib()
self.__dict__.clear()
#
if isinstance(libname, basestring):
try:
if not isinstance(libname, str): # unicode, on Python 2
libname = libname.encode('utf-8')
FFILibrary.__name__ = 'FFILibrary_%s' % libname
except UnicodeError:
pass
library = FFILibrary()
return library, library.__dict__
def _builtin_function_type(func):
# a hack to make at least ffi.typeof(builtin_function) work,
# if the builtin function was obtained by 'vengine_cpy'.
import sys
try:
module = sys.modules[func.__module__]
ffi = module._cffi_original_ffi
types_of_builtin_funcs = module._cffi_types_of_builtin_funcs
tp = types_of_builtin_funcs[func]
except (KeyError, AttributeError, TypeError):
return None
else:
with ffi._lock:
return ffi._get_cached_btype(tp)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,187 @@
from .error import VerificationError
class CffiOp(object):
def __init__(self, op, arg):
self.op = op
self.arg = arg
def as_c_expr(self):
if self.op is None:
assert isinstance(self.arg, str)
return '(_cffi_opcode_t)(%s)' % (self.arg,)
classname = CLASS_NAME[self.op]
return '_CFFI_OP(_CFFI_OP_%s, %s)' % (classname, self.arg)
def as_python_bytes(self):
if self.op is None and self.arg.isdigit():
value = int(self.arg) # non-negative: '-' not in self.arg
if value >= 2**31:
raise OverflowError("cannot emit %r: limited to 2**31-1"
% (self.arg,))
return format_four_bytes(value)
if isinstance(self.arg, str):
raise VerificationError("cannot emit to Python: %r" % (self.arg,))
return format_four_bytes((self.arg << 8) | self.op)
def __str__(self):
classname = CLASS_NAME.get(self.op, self.op)
return '(%s %s)' % (classname, self.arg)
def format_four_bytes(num):
return '\\x%02X\\x%02X\\x%02X\\x%02X' % (
(num >> 24) & 0xFF,
(num >> 16) & 0xFF,
(num >> 8) & 0xFF,
(num ) & 0xFF)
OP_PRIMITIVE = 1
OP_POINTER = 3
OP_ARRAY = 5
OP_OPEN_ARRAY = 7
OP_STRUCT_UNION = 9
OP_ENUM = 11
OP_FUNCTION = 13
OP_FUNCTION_END = 15
OP_NOOP = 17
OP_BITFIELD = 19
OP_TYPENAME = 21
OP_CPYTHON_BLTN_V = 23 # varargs
OP_CPYTHON_BLTN_N = 25 # noargs
OP_CPYTHON_BLTN_O = 27 # O (i.e. a single arg)
OP_CONSTANT = 29
OP_CONSTANT_INT = 31
OP_GLOBAL_VAR = 33
OP_DLOPEN_FUNC = 35
OP_DLOPEN_CONST = 37
OP_GLOBAL_VAR_F = 39
OP_EXTERN_PYTHON = 41
PRIM_VOID = 0
PRIM_BOOL = 1
PRIM_CHAR = 2
PRIM_SCHAR = 3
PRIM_UCHAR = 4
PRIM_SHORT = 5
PRIM_USHORT = 6
PRIM_INT = 7
PRIM_UINT = 8
PRIM_LONG = 9
PRIM_ULONG = 10
PRIM_LONGLONG = 11
PRIM_ULONGLONG = 12
PRIM_FLOAT = 13
PRIM_DOUBLE = 14
PRIM_LONGDOUBLE = 15
PRIM_WCHAR = 16
PRIM_INT8 = 17
PRIM_UINT8 = 18
PRIM_INT16 = 19
PRIM_UINT16 = 20
PRIM_INT32 = 21
PRIM_UINT32 = 22
PRIM_INT64 = 23
PRIM_UINT64 = 24
PRIM_INTPTR = 25
PRIM_UINTPTR = 26
PRIM_PTRDIFF = 27
PRIM_SIZE = 28
PRIM_SSIZE = 29
PRIM_INT_LEAST8 = 30
PRIM_UINT_LEAST8 = 31
PRIM_INT_LEAST16 = 32
PRIM_UINT_LEAST16 = 33
PRIM_INT_LEAST32 = 34
PRIM_UINT_LEAST32 = 35
PRIM_INT_LEAST64 = 36
PRIM_UINT_LEAST64 = 37
PRIM_INT_FAST8 = 38
PRIM_UINT_FAST8 = 39
PRIM_INT_FAST16 = 40
PRIM_UINT_FAST16 = 41
PRIM_INT_FAST32 = 42
PRIM_UINT_FAST32 = 43
PRIM_INT_FAST64 = 44
PRIM_UINT_FAST64 = 45
PRIM_INTMAX = 46
PRIM_UINTMAX = 47
PRIM_FLOATCOMPLEX = 48
PRIM_DOUBLECOMPLEX = 49
PRIM_CHAR16 = 50
PRIM_CHAR32 = 51
_NUM_PRIM = 52
_UNKNOWN_PRIM = -1
_UNKNOWN_FLOAT_PRIM = -2
_UNKNOWN_LONG_DOUBLE = -3
_IO_FILE_STRUCT = -1
PRIMITIVE_TO_INDEX = {
'char': PRIM_CHAR,
'short': PRIM_SHORT,
'int': PRIM_INT,
'long': PRIM_LONG,
'long long': PRIM_LONGLONG,
'signed char': PRIM_SCHAR,
'unsigned char': PRIM_UCHAR,
'unsigned short': PRIM_USHORT,
'unsigned int': PRIM_UINT,
'unsigned long': PRIM_ULONG,
'unsigned long long': PRIM_ULONGLONG,
'float': PRIM_FLOAT,
'double': PRIM_DOUBLE,
'long double': PRIM_LONGDOUBLE,
'_cffi_float_complex_t': PRIM_FLOATCOMPLEX,
'_cffi_double_complex_t': PRIM_DOUBLECOMPLEX,
'_Bool': PRIM_BOOL,
'wchar_t': PRIM_WCHAR,
'char16_t': PRIM_CHAR16,
'char32_t': PRIM_CHAR32,
'int8_t': PRIM_INT8,
'uint8_t': PRIM_UINT8,
'int16_t': PRIM_INT16,
'uint16_t': PRIM_UINT16,
'int32_t': PRIM_INT32,
'uint32_t': PRIM_UINT32,
'int64_t': PRIM_INT64,
'uint64_t': PRIM_UINT64,
'intptr_t': PRIM_INTPTR,
'uintptr_t': PRIM_UINTPTR,
'ptrdiff_t': PRIM_PTRDIFF,
'size_t': PRIM_SIZE,
'ssize_t': PRIM_SSIZE,
'int_least8_t': PRIM_INT_LEAST8,
'uint_least8_t': PRIM_UINT_LEAST8,
'int_least16_t': PRIM_INT_LEAST16,
'uint_least16_t': PRIM_UINT_LEAST16,
'int_least32_t': PRIM_INT_LEAST32,
'uint_least32_t': PRIM_UINT_LEAST32,
'int_least64_t': PRIM_INT_LEAST64,
'uint_least64_t': PRIM_UINT_LEAST64,
'int_fast8_t': PRIM_INT_FAST8,
'uint_fast8_t': PRIM_UINT_FAST8,
'int_fast16_t': PRIM_INT_FAST16,
'uint_fast16_t': PRIM_UINT_FAST16,
'int_fast32_t': PRIM_INT_FAST32,
'uint_fast32_t': PRIM_UINT_FAST32,
'int_fast64_t': PRIM_INT_FAST64,
'uint_fast64_t': PRIM_UINT_FAST64,
'intmax_t': PRIM_INTMAX,
'uintmax_t': PRIM_UINTMAX,
}
F_UNION = 0x01
F_CHECK_FIELDS = 0x02
F_PACKED = 0x04
F_EXTERNAL = 0x08
F_OPAQUE = 0x10
G_FLAGS = dict([('_CFFI_' + _key, globals()[_key])
for _key in ['F_UNION', 'F_CHECK_FIELDS', 'F_PACKED',
'F_EXTERNAL', 'F_OPAQUE']])
CLASS_NAME = {}
for _name, _value in list(globals().items()):
if _name.startswith('OP_') and isinstance(_value, int):
CLASS_NAME[_value] = _name[3:]

View File

@@ -0,0 +1,82 @@
import sys
from . import model
from .error import FFIError
COMMON_TYPES = {}
try:
# fetch "bool" and all simple Windows types
from _cffi_backend import _get_common_types
_get_common_types(COMMON_TYPES)
except ImportError:
pass
COMMON_TYPES['FILE'] = model.unknown_type('FILE', '_IO_FILE')
COMMON_TYPES['bool'] = '_Bool' # in case we got ImportError above
COMMON_TYPES['float _Complex'] = '_cffi_float_complex_t'
COMMON_TYPES['double _Complex'] = '_cffi_double_complex_t'
for _type in model.PrimitiveType.ALL_PRIMITIVE_TYPES:
if _type.endswith('_t'):
COMMON_TYPES[_type] = _type
del _type
_CACHE = {}
def resolve_common_type(parser, commontype):
try:
return _CACHE[commontype]
except KeyError:
cdecl = COMMON_TYPES.get(commontype, commontype)
if not isinstance(cdecl, str):
result, quals = cdecl, 0 # cdecl is already a BaseType
elif cdecl in model.PrimitiveType.ALL_PRIMITIVE_TYPES:
result, quals = model.PrimitiveType(cdecl), 0
elif cdecl == 'set-unicode-needed':
raise FFIError("The Windows type %r is only available after "
"you call ffi.set_unicode()" % (commontype,))
else:
if commontype == cdecl:
raise FFIError(
"Unsupported type: %r. Please look at "
"http://cffi.readthedocs.io/en/latest/cdef.html#ffi-cdef-limitations "
"and file an issue if you think this type should really "
"be supported." % (commontype,))
result, quals = parser.parse_type_and_quals(cdecl) # recursive
assert isinstance(result, model.BaseTypeByIdentity)
_CACHE[commontype] = result, quals
return result, quals
# ____________________________________________________________
# extra types for Windows (most of them are in commontypes.c)
def win_common_types():
return {
"UNICODE_STRING": model.StructType(
"_UNICODE_STRING",
["Length",
"MaximumLength",
"Buffer"],
[model.PrimitiveType("unsigned short"),
model.PrimitiveType("unsigned short"),
model.PointerType(model.PrimitiveType("wchar_t"))],
[-1, -1, -1]),
"PUNICODE_STRING": "UNICODE_STRING *",
"PCUNICODE_STRING": "const UNICODE_STRING *",
"TBYTE": "set-unicode-needed",
"TCHAR": "set-unicode-needed",
"LPCTSTR": "set-unicode-needed",
"PCTSTR": "set-unicode-needed",
"LPTSTR": "set-unicode-needed",
"PTSTR": "set-unicode-needed",
"PTBYTE": "set-unicode-needed",
"PTCHAR": "set-unicode-needed",
}
if sys.platform == 'win32':
COMMON_TYPES.update(win_common_types())

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,31 @@
class FFIError(Exception):
__module__ = 'cffi'
class CDefError(Exception):
__module__ = 'cffi'
def __str__(self):
try:
current_decl = self.args[1]
filename = current_decl.coord.file
linenum = current_decl.coord.line
prefix = '%s:%d: ' % (filename, linenum)
except (AttributeError, TypeError, IndexError):
prefix = ''
return '%s%s' % (prefix, self.args[0])
class VerificationError(Exception):
""" An error raised when verification fails
"""
__module__ = 'cffi'
class VerificationMissing(Exception):
""" An error raised when incomplete structures are passed into
cdef, but no verification has been done
"""
__module__ = 'cffi'
class PkgConfigError(Exception):
""" An error raised for missing modules in pkg-config
"""
__module__ = 'cffi'

View File

@@ -0,0 +1,113 @@
import sys, os
from .error import VerificationError
LIST_OF_FILE_NAMES = ['sources', 'include_dirs', 'library_dirs',
'extra_objects', 'depends']
def get_extension(srcfilename, modname, sources=(), **kwds):
from cffi._shimmed_dist_utils import Extension
allsources = [srcfilename]
for src in sources:
allsources.append(os.path.normpath(src))
return Extension(name=modname, sources=allsources, **kwds)
def compile(tmpdir, ext, compiler_verbose=0, debug=None):
"""Compile a C extension module using distutils."""
saved_environ = os.environ.copy()
try:
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
outputfilename = os.path.abspath(outputfilename)
finally:
# workaround for a distutils bugs where some env vars can
# become longer and longer every time it is used
for key, value in saved_environ.items():
if os.environ.get(key) != value:
os.environ[key] = value
return outputfilename
def _build(tmpdir, ext, compiler_verbose=0, debug=None):
# XXX compact but horrible :-(
from cffi._shimmed_dist_utils import Distribution, CompileError, LinkError, set_threshold, set_verbosity
dist = Distribution({'ext_modules': [ext]})
dist.parse_config_files()
options = dist.get_option_dict('build_ext')
if debug is None:
debug = sys.flags.debug
options['debug'] = ('ffiplatform', debug)
options['force'] = ('ffiplatform', True)
options['build_lib'] = ('ffiplatform', tmpdir)
options['build_temp'] = ('ffiplatform', tmpdir)
#
try:
old_level = set_threshold(0) or 0
try:
set_verbosity(compiler_verbose)
dist.run_command('build_ext')
cmd_obj = dist.get_command_obj('build_ext')
[soname] = cmd_obj.get_outputs()
finally:
set_threshold(old_level)
except (CompileError, LinkError) as e:
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
#
return soname
try:
from os.path import samefile
except ImportError:
def samefile(f1, f2):
return os.path.abspath(f1) == os.path.abspath(f2)
def maybe_relative_path(path):
if not os.path.isabs(path):
return path # already relative
dir = path
names = []
while True:
prevdir = dir
dir, name = os.path.split(prevdir)
if dir == prevdir or not dir:
return path # failed to make it relative
names.append(name)
try:
if samefile(dir, os.curdir):
names.reverse()
return os.path.join(*names)
except OSError:
pass
# ____________________________________________________________
try:
int_or_long = (int, long)
import cStringIO
except NameError:
int_or_long = int # Python 3
import io as cStringIO
def _flatten(x, f):
if isinstance(x, str):
f.write('%ds%s' % (len(x), x))
elif isinstance(x, dict):
keys = sorted(x.keys())
f.write('%dd' % len(keys))
for key in keys:
_flatten(key, f)
_flatten(x[key], f)
elif isinstance(x, (list, tuple)):
f.write('%dl' % len(x))
for value in x:
_flatten(value, f)
elif isinstance(x, int_or_long):
f.write('%di' % (x,))
else:
raise TypeError(
"the keywords to verify() contains unsupported object %r" % (x,))
def flatten(x):
f = cStringIO.StringIO()
_flatten(x, f)
return f.getvalue()

View File

@@ -0,0 +1,30 @@
import sys
if sys.version_info < (3,):
try:
from thread import allocate_lock
except ImportError:
from dummy_thread import allocate_lock
else:
try:
from _thread import allocate_lock
except ImportError:
from _dummy_thread import allocate_lock
##import sys
##l1 = allocate_lock
##class allocate_lock(object):
## def __init__(self):
## self._real = l1()
## def __enter__(self):
## for i in range(4, 0, -1):
## print sys._getframe(i).f_code
## print
## return self._real.__enter__()
## def __exit__(self, *args):
## return self._real.__exit__(*args)
## def acquire(self, f):
## assert f is False
## return self._real.acquire(f)

View File

@@ -0,0 +1,618 @@
import types
import weakref
from .lock import allocate_lock
from .error import CDefError, VerificationError, VerificationMissing
# type qualifiers
Q_CONST = 0x01
Q_RESTRICT = 0x02
Q_VOLATILE = 0x04
def qualify(quals, replace_with):
if quals & Q_CONST:
replace_with = ' const ' + replace_with.lstrip()
if quals & Q_VOLATILE:
replace_with = ' volatile ' + replace_with.lstrip()
if quals & Q_RESTRICT:
# It seems that __restrict is supported by gcc and msvc.
# If you hit some different compiler, add a #define in
# _cffi_include.h for it (and in its copies, documented there)
replace_with = ' __restrict ' + replace_with.lstrip()
return replace_with
class BaseTypeByIdentity(object):
is_array_type = False
is_raw_function = False
def get_c_name(self, replace_with='', context='a C file', quals=0):
result = self.c_name_with_marker
assert result.count('&') == 1
# some logic duplication with ffi.getctype()... :-(
replace_with = replace_with.strip()
if replace_with:
if replace_with.startswith('*') and '&[' in result:
replace_with = '(%s)' % replace_with
elif not replace_with[0] in '[(':
replace_with = ' ' + replace_with
replace_with = qualify(quals, replace_with)
result = result.replace('&', replace_with)
if '$' in result:
raise VerificationError(
"cannot generate '%s' in %s: unknown type name"
% (self._get_c_name(), context))
return result
def _get_c_name(self):
return self.c_name_with_marker.replace('&', '')
def has_c_name(self):
return '$' not in self._get_c_name()
def is_integer_type(self):
return False
def get_cached_btype(self, ffi, finishlist, can_delay=False):
try:
BType = ffi._cached_btypes[self]
except KeyError:
BType = self.build_backend_type(ffi, finishlist)
BType2 = ffi._cached_btypes.setdefault(self, BType)
assert BType2 is BType
return BType
def __repr__(self):
return '<%s>' % (self._get_c_name(),)
def _get_items(self):
return [(name, getattr(self, name)) for name in self._attrs_]
class BaseType(BaseTypeByIdentity):
def __eq__(self, other):
return (self.__class__ == other.__class__ and
self._get_items() == other._get_items())
def __ne__(self, other):
return not self == other
def __hash__(self):
return hash((self.__class__, tuple(self._get_items())))
class VoidType(BaseType):
_attrs_ = ()
def __init__(self):
self.c_name_with_marker = 'void&'
def build_backend_type(self, ffi, finishlist):
return global_cache(self, ffi, 'new_void_type')
void_type = VoidType()
class BasePrimitiveType(BaseType):
def is_complex_type(self):
return False
class PrimitiveType(BasePrimitiveType):
_attrs_ = ('name',)
ALL_PRIMITIVE_TYPES = {
'char': 'c',
'short': 'i',
'int': 'i',
'long': 'i',
'long long': 'i',
'signed char': 'i',
'unsigned char': 'i',
'unsigned short': 'i',
'unsigned int': 'i',
'unsigned long': 'i',
'unsigned long long': 'i',
'float': 'f',
'double': 'f',
'long double': 'f',
'_cffi_float_complex_t': 'j',
'_cffi_double_complex_t': 'j',
'_Bool': 'i',
# the following types are not primitive in the C sense
'wchar_t': 'c',
'char16_t': 'c',
'char32_t': 'c',
'int8_t': 'i',
'uint8_t': 'i',
'int16_t': 'i',
'uint16_t': 'i',
'int32_t': 'i',
'uint32_t': 'i',
'int64_t': 'i',
'uint64_t': 'i',
'int_least8_t': 'i',
'uint_least8_t': 'i',
'int_least16_t': 'i',
'uint_least16_t': 'i',
'int_least32_t': 'i',
'uint_least32_t': 'i',
'int_least64_t': 'i',
'uint_least64_t': 'i',
'int_fast8_t': 'i',
'uint_fast8_t': 'i',
'int_fast16_t': 'i',
'uint_fast16_t': 'i',
'int_fast32_t': 'i',
'uint_fast32_t': 'i',
'int_fast64_t': 'i',
'uint_fast64_t': 'i',
'intptr_t': 'i',
'uintptr_t': 'i',
'intmax_t': 'i',
'uintmax_t': 'i',
'ptrdiff_t': 'i',
'size_t': 'i',
'ssize_t': 'i',
}
def __init__(self, name):
assert name in self.ALL_PRIMITIVE_TYPES
self.name = name
self.c_name_with_marker = name + '&'
def is_char_type(self):
return self.ALL_PRIMITIVE_TYPES[self.name] == 'c'
def is_integer_type(self):
return self.ALL_PRIMITIVE_TYPES[self.name] == 'i'
def is_float_type(self):
return self.ALL_PRIMITIVE_TYPES[self.name] == 'f'
def is_complex_type(self):
return self.ALL_PRIMITIVE_TYPES[self.name] == 'j'
def build_backend_type(self, ffi, finishlist):
return global_cache(self, ffi, 'new_primitive_type', self.name)
class UnknownIntegerType(BasePrimitiveType):
_attrs_ = ('name',)
def __init__(self, name):
self.name = name
self.c_name_with_marker = name + '&'
def is_integer_type(self):
return True
def build_backend_type(self, ffi, finishlist):
raise NotImplementedError("integer type '%s' can only be used after "
"compilation" % self.name)
class UnknownFloatType(BasePrimitiveType):
_attrs_ = ('name', )
def __init__(self, name):
self.name = name
self.c_name_with_marker = name + '&'
def build_backend_type(self, ffi, finishlist):
raise NotImplementedError("float type '%s' can only be used after "
"compilation" % self.name)
class BaseFunctionType(BaseType):
_attrs_ = ('args', 'result', 'ellipsis', 'abi')
def __init__(self, args, result, ellipsis, abi=None):
self.args = args
self.result = result
self.ellipsis = ellipsis
self.abi = abi
#
reprargs = [arg._get_c_name() for arg in self.args]
if self.ellipsis:
reprargs.append('...')
reprargs = reprargs or ['void']
replace_with = self._base_pattern % (', '.join(reprargs),)
if abi is not None:
replace_with = replace_with[:1] + abi + ' ' + replace_with[1:]
self.c_name_with_marker = (
self.result.c_name_with_marker.replace('&', replace_with))
class RawFunctionType(BaseFunctionType):
# Corresponds to a C type like 'int(int)', which is the C type of
# a function, but not a pointer-to-function. The backend has no
# notion of such a type; it's used temporarily by parsing.
_base_pattern = '(&)(%s)'
is_raw_function = True
def build_backend_type(self, ffi, finishlist):
raise CDefError("cannot render the type %r: it is a function "
"type, not a pointer-to-function type" % (self,))
def as_function_pointer(self):
return FunctionPtrType(self.args, self.result, self.ellipsis, self.abi)
class FunctionPtrType(BaseFunctionType):
_base_pattern = '(*&)(%s)'
def build_backend_type(self, ffi, finishlist):
result = self.result.get_cached_btype(ffi, finishlist)
args = []
for tp in self.args:
args.append(tp.get_cached_btype(ffi, finishlist))
abi_args = ()
if self.abi == "__stdcall":
if not self.ellipsis: # __stdcall ignored for variadic funcs
try:
abi_args = (ffi._backend.FFI_STDCALL,)
except AttributeError:
pass
return global_cache(self, ffi, 'new_function_type',
tuple(args), result, self.ellipsis, *abi_args)
def as_raw_function(self):
return RawFunctionType(self.args, self.result, self.ellipsis, self.abi)
class PointerType(BaseType):
_attrs_ = ('totype', 'quals')
def __init__(self, totype, quals=0):
self.totype = totype
self.quals = quals
extra = " *&"
if totype.is_array_type:
extra = "(%s)" % (extra.lstrip(),)
extra = qualify(quals, extra)
self.c_name_with_marker = totype.c_name_with_marker.replace('&', extra)
def build_backend_type(self, ffi, finishlist):
BItem = self.totype.get_cached_btype(ffi, finishlist, can_delay=True)
return global_cache(self, ffi, 'new_pointer_type', BItem)
voidp_type = PointerType(void_type)
def ConstPointerType(totype):
return PointerType(totype, Q_CONST)
const_voidp_type = ConstPointerType(void_type)
class NamedPointerType(PointerType):
_attrs_ = ('totype', 'name')
def __init__(self, totype, name, quals=0):
PointerType.__init__(self, totype, quals)
self.name = name
self.c_name_with_marker = name + '&'
class ArrayType(BaseType):
_attrs_ = ('item', 'length')
is_array_type = True
def __init__(self, item, length):
self.item = item
self.length = length
#
if length is None:
brackets = '&[]'
elif length == '...':
brackets = '&[/*...*/]'
else:
brackets = '&[%s]' % length
self.c_name_with_marker = (
self.item.c_name_with_marker.replace('&', brackets))
def length_is_unknown(self):
return isinstance(self.length, str)
def resolve_length(self, newlength):
return ArrayType(self.item, newlength)
def build_backend_type(self, ffi, finishlist):
if self.length_is_unknown():
raise CDefError("cannot render the type %r: unknown length" %
(self,))
self.item.get_cached_btype(ffi, finishlist) # force the item BType
BPtrItem = PointerType(self.item).get_cached_btype(ffi, finishlist)
return global_cache(self, ffi, 'new_array_type', BPtrItem, self.length)
char_array_type = ArrayType(PrimitiveType('char'), None)
class StructOrUnionOrEnum(BaseTypeByIdentity):
_attrs_ = ('name',)
forcename = None
def build_c_name_with_marker(self):
name = self.forcename or '%s %s' % (self.kind, self.name)
self.c_name_with_marker = name + '&'
def force_the_name(self, forcename):
self.forcename = forcename
self.build_c_name_with_marker()
def get_official_name(self):
assert self.c_name_with_marker.endswith('&')
return self.c_name_with_marker[:-1]
class StructOrUnion(StructOrUnionOrEnum):
fixedlayout = None
completed = 0
partial = False
packed = 0
def __init__(self, name, fldnames, fldtypes, fldbitsize, fldquals=None):
self.name = name
self.fldnames = fldnames
self.fldtypes = fldtypes
self.fldbitsize = fldbitsize
self.fldquals = fldquals
self.build_c_name_with_marker()
def anonymous_struct_fields(self):
if self.fldtypes is not None:
for name, type in zip(self.fldnames, self.fldtypes):
if name == '' and isinstance(type, StructOrUnion):
yield type
def enumfields(self, expand_anonymous_struct_union=True):
fldquals = self.fldquals
if fldquals is None:
fldquals = (0,) * len(self.fldnames)
for name, type, bitsize, quals in zip(self.fldnames, self.fldtypes,
self.fldbitsize, fldquals):
if (name == '' and isinstance(type, StructOrUnion)
and expand_anonymous_struct_union):
# nested anonymous struct/union
for result in type.enumfields():
yield result
else:
yield (name, type, bitsize, quals)
def force_flatten(self):
# force the struct or union to have a declaration that lists
# directly all fields returned by enumfields(), flattening
# nested anonymous structs/unions.
names = []
types = []
bitsizes = []
fldquals = []
for name, type, bitsize, quals in self.enumfields():
names.append(name)
types.append(type)
bitsizes.append(bitsize)
fldquals.append(quals)
self.fldnames = tuple(names)
self.fldtypes = tuple(types)
self.fldbitsize = tuple(bitsizes)
self.fldquals = tuple(fldquals)
def get_cached_btype(self, ffi, finishlist, can_delay=False):
BType = StructOrUnionOrEnum.get_cached_btype(self, ffi, finishlist,
can_delay)
if not can_delay:
self.finish_backend_type(ffi, finishlist)
return BType
def finish_backend_type(self, ffi, finishlist):
if self.completed:
if self.completed != 2:
raise NotImplementedError("recursive structure declaration "
"for '%s'" % (self.name,))
return
BType = ffi._cached_btypes[self]
#
self.completed = 1
#
if self.fldtypes is None:
pass # not completing it: it's an opaque struct
#
elif self.fixedlayout is None:
fldtypes = [tp.get_cached_btype(ffi, finishlist)
for tp in self.fldtypes]
lst = list(zip(self.fldnames, fldtypes, self.fldbitsize))
extra_flags = ()
if self.packed:
if self.packed == 1:
extra_flags = (8,) # SF_PACKED
else:
extra_flags = (0, self.packed)
ffi._backend.complete_struct_or_union(BType, lst, self,
-1, -1, *extra_flags)
#
else:
fldtypes = []
fieldofs, fieldsize, totalsize, totalalignment = self.fixedlayout
for i in range(len(self.fldnames)):
fsize = fieldsize[i]
ftype = self.fldtypes[i]
#
if isinstance(ftype, ArrayType) and ftype.length_is_unknown():
# fix the length to match the total size
BItemType = ftype.item.get_cached_btype(ffi, finishlist)
nlen, nrest = divmod(fsize, ffi.sizeof(BItemType))
if nrest != 0:
self._verification_error(
"field '%s.%s' has a bogus size?" % (
self.name, self.fldnames[i] or '{}'))
ftype = ftype.resolve_length(nlen)
self.fldtypes = (self.fldtypes[:i] + (ftype,) +
self.fldtypes[i+1:])
#
BFieldType = ftype.get_cached_btype(ffi, finishlist)
if isinstance(ftype, ArrayType) and ftype.length is None:
assert fsize == 0
else:
bitemsize = ffi.sizeof(BFieldType)
if bitemsize != fsize:
self._verification_error(
"field '%s.%s' is declared as %d bytes, but is "
"really %d bytes" % (self.name,
self.fldnames[i] or '{}',
bitemsize, fsize))
fldtypes.append(BFieldType)
#
lst = list(zip(self.fldnames, fldtypes, self.fldbitsize, fieldofs))
ffi._backend.complete_struct_or_union(BType, lst, self,
totalsize, totalalignment)
self.completed = 2
def _verification_error(self, msg):
raise VerificationError(msg)
def check_not_partial(self):
if self.partial and self.fixedlayout is None:
raise VerificationMissing(self._get_c_name())
def build_backend_type(self, ffi, finishlist):
self.check_not_partial()
finishlist.append(self)
#
return global_cache(self, ffi, 'new_%s_type' % self.kind,
self.get_official_name(), key=self)
class StructType(StructOrUnion):
kind = 'struct'
class UnionType(StructOrUnion):
kind = 'union'
class EnumType(StructOrUnionOrEnum):
kind = 'enum'
partial = False
partial_resolved = False
def __init__(self, name, enumerators, enumvalues, baseinttype=None):
self.name = name
self.enumerators = enumerators
self.enumvalues = enumvalues
self.baseinttype = baseinttype
self.build_c_name_with_marker()
def force_the_name(self, forcename):
StructOrUnionOrEnum.force_the_name(self, forcename)
if self.forcename is None:
name = self.get_official_name()
self.forcename = '$' + name.replace(' ', '_')
def check_not_partial(self):
if self.partial and not self.partial_resolved:
raise VerificationMissing(self._get_c_name())
def build_backend_type(self, ffi, finishlist):
self.check_not_partial()
base_btype = self.build_baseinttype(ffi, finishlist)
return global_cache(self, ffi, 'new_enum_type',
self.get_official_name(),
self.enumerators, self.enumvalues,
base_btype, key=self)
def build_baseinttype(self, ffi, finishlist):
if self.baseinttype is not None:
return self.baseinttype.get_cached_btype(ffi, finishlist)
#
if self.enumvalues:
smallest_value = min(self.enumvalues)
largest_value = max(self.enumvalues)
else:
import warnings
try:
# XXX! The goal is to ensure that the warnings.warn()
# will not suppress the warning. We want to get it
# several times if we reach this point several times.
__warningregistry__.clear()
except NameError:
pass
warnings.warn("%r has no values explicitly defined; "
"guessing that it is equivalent to 'unsigned int'"
% self._get_c_name())
smallest_value = largest_value = 0
if smallest_value < 0: # needs a signed type
sign = 1
candidate1 = PrimitiveType("int")
candidate2 = PrimitiveType("long")
else:
sign = 0
candidate1 = PrimitiveType("unsigned int")
candidate2 = PrimitiveType("unsigned long")
btype1 = candidate1.get_cached_btype(ffi, finishlist)
btype2 = candidate2.get_cached_btype(ffi, finishlist)
size1 = ffi.sizeof(btype1)
size2 = ffi.sizeof(btype2)
if (smallest_value >= ((-1) << (8*size1-1)) and
largest_value < (1 << (8*size1-sign))):
return btype1
if (smallest_value >= ((-1) << (8*size2-1)) and
largest_value < (1 << (8*size2-sign))):
return btype2
raise CDefError("%s values don't all fit into either 'long' "
"or 'unsigned long'" % self._get_c_name())
def unknown_type(name, structname=None):
if structname is None:
structname = '$%s' % name
tp = StructType(structname, None, None, None)
tp.force_the_name(name)
tp.origin = "unknown_type"
return tp
def unknown_ptr_type(name, structname=None):
if structname is None:
structname = '$$%s' % name
tp = StructType(structname, None, None, None)
return NamedPointerType(tp, name)
global_lock = allocate_lock()
_typecache_cffi_backend = weakref.WeakValueDictionary()
def get_typecache(backend):
# returns _typecache_cffi_backend if backend is the _cffi_backend
# module, or type(backend).__typecache if backend is an instance of
# CTypesBackend (or some FakeBackend class during tests)
if isinstance(backend, types.ModuleType):
return _typecache_cffi_backend
with global_lock:
if not hasattr(type(backend), '__typecache'):
type(backend).__typecache = weakref.WeakValueDictionary()
return type(backend).__typecache
def global_cache(srctype, ffi, funcname, *args, **kwds):
key = kwds.pop('key', (funcname, args))
assert not kwds
try:
return ffi._typecache[key]
except KeyError:
pass
try:
res = getattr(ffi._backend, funcname)(*args)
except NotImplementedError as e:
raise NotImplementedError("%s: %r: %s" % (funcname, srctype, e))
# note that setdefault() on WeakValueDictionary is not atomic
# and contains a rare bug (http://bugs.python.org/issue19542);
# we have to use a lock and do it ourselves
cache = ffi._typecache
with global_lock:
res1 = cache.get(key)
if res1 is None:
cache[key] = res
return res
else:
return res1
def pointer_cache(ffi, BType):
return global_cache('?', ffi, 'new_pointer_type', BType)
def attach_exception_info(e, name):
if e.args and type(e.args[0]) is str:
e.args = ('%s: %s' % (name, e.args[0]),) + e.args[1:]

View File

@@ -0,0 +1,181 @@
/* This part is from file 'cffi/parse_c_type.h'. It is copied at the
beginning of C sources generated by CFFI's ffi.set_source(). */
typedef void *_cffi_opcode_t;
#define _CFFI_OP(opcode, arg) (_cffi_opcode_t)(opcode | (((uintptr_t)(arg)) << 8))
#define _CFFI_GETOP(cffi_opcode) ((unsigned char)(uintptr_t)cffi_opcode)
#define _CFFI_GETARG(cffi_opcode) (((intptr_t)cffi_opcode) >> 8)
#define _CFFI_OP_PRIMITIVE 1
#define _CFFI_OP_POINTER 3
#define _CFFI_OP_ARRAY 5
#define _CFFI_OP_OPEN_ARRAY 7
#define _CFFI_OP_STRUCT_UNION 9
#define _CFFI_OP_ENUM 11
#define _CFFI_OP_FUNCTION 13
#define _CFFI_OP_FUNCTION_END 15
#define _CFFI_OP_NOOP 17
#define _CFFI_OP_BITFIELD 19
#define _CFFI_OP_TYPENAME 21
#define _CFFI_OP_CPYTHON_BLTN_V 23 // varargs
#define _CFFI_OP_CPYTHON_BLTN_N 25 // noargs
#define _CFFI_OP_CPYTHON_BLTN_O 27 // O (i.e. a single arg)
#define _CFFI_OP_CONSTANT 29
#define _CFFI_OP_CONSTANT_INT 31
#define _CFFI_OP_GLOBAL_VAR 33
#define _CFFI_OP_DLOPEN_FUNC 35
#define _CFFI_OP_DLOPEN_CONST 37
#define _CFFI_OP_GLOBAL_VAR_F 39
#define _CFFI_OP_EXTERN_PYTHON 41
#define _CFFI_PRIM_VOID 0
#define _CFFI_PRIM_BOOL 1
#define _CFFI_PRIM_CHAR 2
#define _CFFI_PRIM_SCHAR 3
#define _CFFI_PRIM_UCHAR 4
#define _CFFI_PRIM_SHORT 5
#define _CFFI_PRIM_USHORT 6
#define _CFFI_PRIM_INT 7
#define _CFFI_PRIM_UINT 8
#define _CFFI_PRIM_LONG 9
#define _CFFI_PRIM_ULONG 10
#define _CFFI_PRIM_LONGLONG 11
#define _CFFI_PRIM_ULONGLONG 12
#define _CFFI_PRIM_FLOAT 13
#define _CFFI_PRIM_DOUBLE 14
#define _CFFI_PRIM_LONGDOUBLE 15
#define _CFFI_PRIM_WCHAR 16
#define _CFFI_PRIM_INT8 17
#define _CFFI_PRIM_UINT8 18
#define _CFFI_PRIM_INT16 19
#define _CFFI_PRIM_UINT16 20
#define _CFFI_PRIM_INT32 21
#define _CFFI_PRIM_UINT32 22
#define _CFFI_PRIM_INT64 23
#define _CFFI_PRIM_UINT64 24
#define _CFFI_PRIM_INTPTR 25
#define _CFFI_PRIM_UINTPTR 26
#define _CFFI_PRIM_PTRDIFF 27
#define _CFFI_PRIM_SIZE 28
#define _CFFI_PRIM_SSIZE 29
#define _CFFI_PRIM_INT_LEAST8 30
#define _CFFI_PRIM_UINT_LEAST8 31
#define _CFFI_PRIM_INT_LEAST16 32
#define _CFFI_PRIM_UINT_LEAST16 33
#define _CFFI_PRIM_INT_LEAST32 34
#define _CFFI_PRIM_UINT_LEAST32 35
#define _CFFI_PRIM_INT_LEAST64 36
#define _CFFI_PRIM_UINT_LEAST64 37
#define _CFFI_PRIM_INT_FAST8 38
#define _CFFI_PRIM_UINT_FAST8 39
#define _CFFI_PRIM_INT_FAST16 40
#define _CFFI_PRIM_UINT_FAST16 41
#define _CFFI_PRIM_INT_FAST32 42
#define _CFFI_PRIM_UINT_FAST32 43
#define _CFFI_PRIM_INT_FAST64 44
#define _CFFI_PRIM_UINT_FAST64 45
#define _CFFI_PRIM_INTMAX 46
#define _CFFI_PRIM_UINTMAX 47
#define _CFFI_PRIM_FLOATCOMPLEX 48
#define _CFFI_PRIM_DOUBLECOMPLEX 49
#define _CFFI_PRIM_CHAR16 50
#define _CFFI_PRIM_CHAR32 51
#define _CFFI__NUM_PRIM 52
#define _CFFI__UNKNOWN_PRIM (-1)
#define _CFFI__UNKNOWN_FLOAT_PRIM (-2)
#define _CFFI__UNKNOWN_LONG_DOUBLE (-3)
#define _CFFI__IO_FILE_STRUCT (-1)
struct _cffi_global_s {
const char *name;
void *address;
_cffi_opcode_t type_op;
void *size_or_direct_fn; // OP_GLOBAL_VAR: size, or 0 if unknown
// OP_CPYTHON_BLTN_*: addr of direct function
};
struct _cffi_getconst_s {
unsigned long long value;
const struct _cffi_type_context_s *ctx;
int gindex;
};
struct _cffi_struct_union_s {
const char *name;
int type_index; // -> _cffi_types, on a OP_STRUCT_UNION
int flags; // _CFFI_F_* flags below
size_t size;
int alignment;
int first_field_index; // -> _cffi_fields array
int num_fields;
};
#define _CFFI_F_UNION 0x01 // is a union, not a struct
#define _CFFI_F_CHECK_FIELDS 0x02 // complain if fields are not in the
// "standard layout" or if some are missing
#define _CFFI_F_PACKED 0x04 // for CHECK_FIELDS, assume a packed struct
#define _CFFI_F_EXTERNAL 0x08 // in some other ffi.include()
#define _CFFI_F_OPAQUE 0x10 // opaque
struct _cffi_field_s {
const char *name;
size_t field_offset;
size_t field_size;
_cffi_opcode_t field_type_op;
};
struct _cffi_enum_s {
const char *name;
int type_index; // -> _cffi_types, on a OP_ENUM
int type_prim; // _CFFI_PRIM_xxx
const char *enumerators; // comma-delimited string
};
struct _cffi_typename_s {
const char *name;
int type_index; /* if opaque, points to a possibly artificial
OP_STRUCT which is itself opaque */
};
struct _cffi_type_context_s {
_cffi_opcode_t *types;
const struct _cffi_global_s *globals;
const struct _cffi_field_s *fields;
const struct _cffi_struct_union_s *struct_unions;
const struct _cffi_enum_s *enums;
const struct _cffi_typename_s *typenames;
int num_globals;
int num_struct_unions;
int num_enums;
int num_typenames;
const char *const *includes;
int num_types;
int flags; /* future extension */
};
struct _cffi_parse_info_s {
const struct _cffi_type_context_s *ctx;
_cffi_opcode_t *output;
unsigned int output_size;
size_t error_location;
const char *error_message;
};
struct _cffi_externpy_s {
const char *name;
size_t size_of_result;
void *reserved1, *reserved2;
};
#ifdef _CFFI_INTERNAL
static int parse_c_type(struct _cffi_parse_info_s *info, const char *input);
static int search_in_globals(const struct _cffi_type_context_s *ctx,
const char *search, size_t search_len);
static int search_in_struct_unions(const struct _cffi_type_context_s *ctx,
const char *search, size_t search_len);
#endif

View File

@@ -0,0 +1,121 @@
# pkg-config, https://www.freedesktop.org/wiki/Software/pkg-config/ integration for cffi
import sys, os, subprocess
from .error import PkgConfigError
def merge_flags(cfg1, cfg2):
"""Merge values from cffi config flags cfg2 to cf1
Example:
merge_flags({"libraries": ["one"]}, {"libraries": ["two"]})
{"libraries": ["one", "two"]}
"""
for key, value in cfg2.items():
if key not in cfg1:
cfg1[key] = value
else:
if not isinstance(cfg1[key], list):
raise TypeError("cfg1[%r] should be a list of strings" % (key,))
if not isinstance(value, list):
raise TypeError("cfg2[%r] should be a list of strings" % (key,))
cfg1[key].extend(value)
return cfg1
def call(libname, flag, encoding=sys.getfilesystemencoding()):
"""Calls pkg-config and returns the output if found
"""
a = ["pkg-config", "--print-errors"]
a.append(flag)
a.append(libname)
try:
pc = subprocess.Popen(a, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
except EnvironmentError as e:
raise PkgConfigError("cannot run pkg-config: %s" % (str(e).strip(),))
bout, berr = pc.communicate()
if pc.returncode != 0:
try:
berr = berr.decode(encoding)
except Exception:
pass
raise PkgConfigError(berr.strip())
if sys.version_info >= (3,) and not isinstance(bout, str): # Python 3.x
try:
bout = bout.decode(encoding)
except UnicodeDecodeError:
raise PkgConfigError("pkg-config %s %s returned bytes that cannot "
"be decoded with encoding %r:\n%r" %
(flag, libname, encoding, bout))
if os.altsep != '\\' and '\\' in bout:
raise PkgConfigError("pkg-config %s %s returned an unsupported "
"backslash-escaped output:\n%r" %
(flag, libname, bout))
return bout
def flags_from_pkgconfig(libs):
r"""Return compiler line flags for FFI.set_source based on pkg-config output
Usage
...
ffibuilder.set_source("_foo", pkgconfig = ["libfoo", "libbar >= 1.8.3"])
If pkg-config is installed on build machine, then arguments include_dirs,
library_dirs, libraries, define_macros, extra_compile_args and
extra_link_args are extended with an output of pkg-config for libfoo and
libbar.
Raises PkgConfigError in case the pkg-config call fails.
"""
def get_include_dirs(string):
return [x[2:] for x in string.split() if x.startswith("-I")]
def get_library_dirs(string):
return [x[2:] for x in string.split() if x.startswith("-L")]
def get_libraries(string):
return [x[2:] for x in string.split() if x.startswith("-l")]
# convert -Dfoo=bar to list of tuples [("foo", "bar")] expected by distutils
def get_macros(string):
def _macro(x):
x = x[2:] # drop "-D"
if '=' in x:
return tuple(x.split("=", 1)) # "-Dfoo=bar" => ("foo", "bar")
else:
return (x, None) # "-Dfoo" => ("foo", None)
return [_macro(x) for x in string.split() if x.startswith("-D")]
def get_other_cflags(string):
return [x for x in string.split() if not x.startswith("-I") and
not x.startswith("-D")]
def get_other_libs(string):
return [x for x in string.split() if not x.startswith("-L") and
not x.startswith("-l")]
# return kwargs for given libname
def kwargs(libname):
fse = sys.getfilesystemencoding()
all_cflags = call(libname, "--cflags")
all_libs = call(libname, "--libs")
return {
"include_dirs": get_include_dirs(all_cflags),
"library_dirs": get_library_dirs(all_libs),
"libraries": get_libraries(all_libs),
"define_macros": get_macros(all_cflags),
"extra_compile_args": get_other_cflags(all_cflags),
"extra_link_args": get_other_libs(all_libs),
}
# merge all arguments together
ret = {}
for libname in libs:
lib_flags = kwargs(libname)
merge_flags(ret, lib_flags)
return ret

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,229 @@
import os
import sys
import sysconfig
try:
basestring
except NameError:
# Python 3.x
basestring = str
def error(msg):
from cffi._shimmed_dist_utils import DistutilsSetupError
raise DistutilsSetupError(msg)
def execfile(filename, glob):
# We use execfile() (here rewritten for Python 3) instead of
# __import__() to load the build script. The problem with
# a normal import is that in some packages, the intermediate
# __init__.py files may already try to import the file that
# we are generating.
with open(filename) as f:
src = f.read()
src += '\n' # Python 2.6 compatibility
code = compile(src, filename, 'exec')
exec(code, glob, glob)
def add_cffi_module(dist, mod_spec):
from cffi.api import FFI
if not isinstance(mod_spec, basestring):
error("argument to 'cffi_modules=...' must be a str or a list of str,"
" not %r" % (type(mod_spec).__name__,))
mod_spec = str(mod_spec)
try:
build_file_name, ffi_var_name = mod_spec.split(':')
except ValueError:
error("%r must be of the form 'path/build.py:ffi_variable'" %
(mod_spec,))
if not os.path.exists(build_file_name):
ext = ''
rewritten = build_file_name.replace('.', '/') + '.py'
if os.path.exists(rewritten):
ext = ' (rewrite cffi_modules to [%r])' % (
rewritten + ':' + ffi_var_name,)
error("%r does not name an existing file%s" % (build_file_name, ext))
mod_vars = {'__name__': '__cffi__', '__file__': build_file_name}
execfile(build_file_name, mod_vars)
try:
ffi = mod_vars[ffi_var_name]
except KeyError:
error("%r: object %r not found in module" % (mod_spec,
ffi_var_name))
if not isinstance(ffi, FFI):
ffi = ffi() # maybe it's a function instead of directly an ffi
if not isinstance(ffi, FFI):
error("%r is not an FFI instance (got %r)" % (mod_spec,
type(ffi).__name__))
if not hasattr(ffi, '_assigned_source'):
error("%r: the set_source() method was not called" % (mod_spec,))
module_name, source, source_extension, kwds = ffi._assigned_source
if ffi._windows_unicode:
kwds = kwds.copy()
ffi._apply_windows_unicode(kwds)
if source is None:
_add_py_module(dist, ffi, module_name)
else:
_add_c_module(dist, ffi, module_name, source, source_extension, kwds)
def _set_py_limited_api(Extension, kwds):
"""
Add py_limited_api to kwds if setuptools >= 26 is in use.
Do not alter the setting if it already exists.
Setuptools takes care of ignoring the flag on Python 2 and PyPy.
CPython itself should ignore the flag in a debugging version
(by not listing .abi3.so in the extensions it supports), but
it doesn't so far, creating troubles. That's why we check
for "not hasattr(sys, 'gettotalrefcount')" (the 2.7 compatible equivalent
of 'd' not in sys.abiflags). (http://bugs.python.org/issue28401)
On Windows, with CPython <= 3.4, it's better not to use py_limited_api
because virtualenv *still* doesn't copy PYTHON3.DLL on these versions.
Recently (2020) we started shipping only >= 3.5 wheels, though. So
we'll give it another try and set py_limited_api on Windows >= 3.5.
"""
from cffi._shimmed_dist_utils import log
from cffi import recompiler
if ('py_limited_api' not in kwds and not hasattr(sys, 'gettotalrefcount')
and recompiler.USE_LIMITED_API):
import setuptools
try:
setuptools_major_version = int(setuptools.__version__.partition('.')[0])
if setuptools_major_version >= 26:
kwds['py_limited_api'] = True
except ValueError: # certain development versions of setuptools
# If we don't know the version number of setuptools, we
# try to set 'py_limited_api' anyway. At worst, we get a
# warning.
kwds['py_limited_api'] = True
if sysconfig.get_config_var("Py_GIL_DISABLED"):
if kwds.get('py_limited_api'):
log.info("Ignoring py_limited_api=True for free-threaded build.")
kwds['py_limited_api'] = False
if kwds.get('py_limited_api') is False:
# avoid setting Py_LIMITED_API if py_limited_api=False
# which _cffi_include.h does unless _CFFI_NO_LIMITED_API is defined
kwds.setdefault("define_macros", []).append(("_CFFI_NO_LIMITED_API", None))
return kwds
def _add_c_module(dist, ffi, module_name, source, source_extension, kwds):
# We are a setuptools extension. Need this build_ext for py_limited_api.
from setuptools.command.build_ext import build_ext
from cffi._shimmed_dist_utils import Extension, log, mkpath
from cffi import recompiler
allsources = ['$PLACEHOLDER']
allsources.extend(kwds.pop('sources', []))
kwds = _set_py_limited_api(Extension, kwds)
ext = Extension(name=module_name, sources=allsources, **kwds)
def make_mod(tmpdir, pre_run=None):
c_file = os.path.join(tmpdir, module_name + source_extension)
log.info("generating cffi module %r" % c_file)
mkpath(tmpdir)
# a setuptools-only, API-only hook: called with the "ext" and "ffi"
# arguments just before we turn the ffi into C code. To use it,
# subclass the 'distutils.command.build_ext.build_ext' class and
# add a method 'def pre_run(self, ext, ffi)'.
if pre_run is not None:
pre_run(ext, ffi)
updated = recompiler.make_c_source(ffi, module_name, source, c_file)
if not updated:
log.info("already up-to-date")
return c_file
if dist.ext_modules is None:
dist.ext_modules = []
dist.ext_modules.append(ext)
base_class = dist.cmdclass.get('build_ext', build_ext)
class build_ext_make_mod(base_class):
def run(self):
if ext.sources[0] == '$PLACEHOLDER':
pre_run = getattr(self, 'pre_run', None)
ext.sources[0] = make_mod(self.build_temp, pre_run)
base_class.run(self)
dist.cmdclass['build_ext'] = build_ext_make_mod
# NB. multiple runs here will create multiple 'build_ext_make_mod'
# classes. Even in this case the 'build_ext' command should be
# run once; but just in case, the logic above does nothing if
# called again.
def _add_py_module(dist, ffi, module_name):
from setuptools.command.build_py import build_py
from setuptools.command.build_ext import build_ext
from cffi._shimmed_dist_utils import log, mkpath
from cffi import recompiler
def generate_mod(py_file):
log.info("generating cffi module %r" % py_file)
mkpath(os.path.dirname(py_file))
updated = recompiler.make_py_source(ffi, module_name, py_file)
if not updated:
log.info("already up-to-date")
base_class = dist.cmdclass.get('build_py', build_py)
class build_py_make_mod(base_class):
def run(self):
base_class.run(self)
module_path = module_name.split('.')
module_path[-1] += '.py'
generate_mod(os.path.join(self.build_lib, *module_path))
def get_source_files(self):
# This is called from 'setup.py sdist' only. Exclude
# the generate .py module in this case.
saved_py_modules = self.py_modules
try:
if saved_py_modules:
self.py_modules = [m for m in saved_py_modules
if m != module_name]
return base_class.get_source_files(self)
finally:
self.py_modules = saved_py_modules
dist.cmdclass['build_py'] = build_py_make_mod
# distutils and setuptools have no notion I could find of a
# generated python module. If we don't add module_name to
# dist.py_modules, then things mostly work but there are some
# combination of options (--root and --record) that will miss
# the module. So we add it here, which gives a few apparently
# harmless warnings about not finding the file outside the
# build directory.
# Then we need to hack more in get_source_files(); see above.
if dist.py_modules is None:
dist.py_modules = []
dist.py_modules.append(module_name)
# the following is only for "build_ext -i"
base_class_2 = dist.cmdclass.get('build_ext', build_ext)
class build_ext_make_mod(base_class_2):
def run(self):
base_class_2.run(self)
if self.inplace:
# from get_ext_fullpath() in distutils/command/build_ext.py
module_path = module_name.split('.')
package = '.'.join(module_path[:-1])
build_py = self.get_finalized_command('build_py')
package_dir = build_py.get_package_dir(package)
file_name = module_path[-1] + '.py'
generate_mod(os.path.join(package_dir, file_name))
dist.cmdclass['build_ext'] = build_ext_make_mod
def cffi_modules(dist, attr, value):
assert attr == 'cffi_modules'
if isinstance(value, basestring):
value = [value]
for cffi_module in value:
add_cffi_module(dist, cffi_module)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,679 @@
#
# DEPRECATED: implementation for ffi.verify()
#
import sys, os
import types
from . import model
from .error import VerificationError
class VGenericEngine(object):
_class_key = 'g'
_gen_python_module = False
def __init__(self, verifier):
self.verifier = verifier
self.ffi = verifier.ffi
self.export_symbols = []
self._struct_pending_verification = {}
def patch_extension_kwds(self, kwds):
# add 'export_symbols' to the dictionary. Note that we add the
# list before filling it. When we fill it, it will thus also show
# up in kwds['export_symbols'].
kwds.setdefault('export_symbols', self.export_symbols)
def find_module(self, module_name, path, so_suffixes):
for so_suffix in so_suffixes:
basename = module_name + so_suffix
if path is None:
path = sys.path
for dirname in path:
filename = os.path.join(dirname, basename)
if os.path.isfile(filename):
return filename
def collect_types(self):
pass # not needed in the generic engine
def _prnt(self, what=''):
self._f.write(what + '\n')
def write_source_to_f(self):
prnt = self._prnt
# first paste some standard set of lines that are mostly '#include'
prnt(cffimod_header)
# then paste the C source given by the user, verbatim.
prnt(self.verifier.preamble)
#
# call generate_gen_xxx_decl(), for every xxx found from
# ffi._parser._declarations. This generates all the functions.
self._generate('decl')
#
# on Windows, distutils insists on putting init_cffi_xyz in
# 'export_symbols', so instead of fighting it, just give up and
# give it one
if sys.platform == 'win32':
if sys.version_info >= (3,):
prefix = 'PyInit_'
else:
prefix = 'init'
modname = self.verifier.get_module_name()
prnt("void %s%s(void) { }\n" % (prefix, modname))
def load_library(self, flags=0):
# import it with the CFFI backend
backend = self.ffi._backend
# needs to make a path that contains '/', on Posix
filename = os.path.join(os.curdir, self.verifier.modulefilename)
module = backend.load_library(filename, flags)
#
# call loading_gen_struct() to get the struct layout inferred by
# the C compiler
self._load(module, 'loading')
# build the FFILibrary class and instance, this is a module subclass
# because modules are expected to have usually-constant-attributes and
# in PyPy this means the JIT is able to treat attributes as constant,
# which we want.
class FFILibrary(types.ModuleType):
_cffi_generic_module = module
_cffi_ffi = self.ffi
_cffi_dir = []
def __dir__(self):
return FFILibrary._cffi_dir
library = FFILibrary("")
#
# finally, call the loaded_gen_xxx() functions. This will set
# up the 'library' object.
self._load(module, 'loaded', library=library)
return library
def _get_declarations(self):
lst = [(key, tp) for (key, (tp, qual)) in
self.ffi._parser._declarations.items()]
lst.sort()
return lst
def _generate(self, step_name):
for name, tp in self._get_declarations():
kind, realname = name.split(' ', 1)
try:
method = getattr(self, '_generate_gen_%s_%s' % (kind,
step_name))
except AttributeError:
raise VerificationError(
"not implemented in verify(): %r" % name)
try:
method(tp, realname)
except Exception as e:
model.attach_exception_info(e, name)
raise
def _load(self, module, step_name, **kwds):
for name, tp in self._get_declarations():
kind, realname = name.split(' ', 1)
method = getattr(self, '_%s_gen_%s' % (step_name, kind))
try:
method(tp, realname, module, **kwds)
except Exception as e:
model.attach_exception_info(e, name)
raise
def _generate_nothing(self, tp, name):
pass
def _loaded_noop(self, tp, name, module, **kwds):
pass
# ----------
# typedefs: generates no code so far
_generate_gen_typedef_decl = _generate_nothing
_loading_gen_typedef = _loaded_noop
_loaded_gen_typedef = _loaded_noop
# ----------
# function declarations
def _generate_gen_function_decl(self, tp, name):
assert isinstance(tp, model.FunctionPtrType)
if tp.ellipsis:
# cannot support vararg functions better than this: check for its
# exact type (including the fixed arguments), and build it as a
# constant function pointer (no _cffi_f_%s wrapper)
self._generate_gen_const(False, name, tp)
return
prnt = self._prnt
numargs = len(tp.args)
argnames = []
for i, type in enumerate(tp.args):
indirection = ''
if isinstance(type, model.StructOrUnion):
indirection = '*'
argnames.append('%sx%d' % (indirection, i))
context = 'argument of %s' % name
arglist = [type.get_c_name(' %s' % arg, context)
for type, arg in zip(tp.args, argnames)]
tpresult = tp.result
if isinstance(tpresult, model.StructOrUnion):
arglist.insert(0, tpresult.get_c_name(' *r', context))
tpresult = model.void_type
arglist = ', '.join(arglist) or 'void'
wrappername = '_cffi_f_%s' % name
self.export_symbols.append(wrappername)
if tp.abi:
abi = tp.abi + ' '
else:
abi = ''
funcdecl = ' %s%s(%s)' % (abi, wrappername, arglist)
context = 'result of %s' % name
prnt(tpresult.get_c_name(funcdecl, context))
prnt('{')
#
if isinstance(tp.result, model.StructOrUnion):
result_code = '*r = '
elif not isinstance(tp.result, model.VoidType):
result_code = 'return '
else:
result_code = ''
prnt(' %s%s(%s);' % (result_code, name, ', '.join(argnames)))
prnt('}')
prnt()
_loading_gen_function = _loaded_noop
def _loaded_gen_function(self, tp, name, module, library):
assert isinstance(tp, model.FunctionPtrType)
if tp.ellipsis:
newfunction = self._load_constant(False, tp, name, module)
else:
indirections = []
base_tp = tp
if (any(isinstance(typ, model.StructOrUnion) for typ in tp.args)
or isinstance(tp.result, model.StructOrUnion)):
indirect_args = []
for i, typ in enumerate(tp.args):
if isinstance(typ, model.StructOrUnion):
typ = model.PointerType(typ)
indirections.append((i, typ))
indirect_args.append(typ)
indirect_result = tp.result
if isinstance(indirect_result, model.StructOrUnion):
if indirect_result.fldtypes is None:
raise TypeError("'%s' is used as result type, "
"but is opaque" % (
indirect_result._get_c_name(),))
indirect_result = model.PointerType(indirect_result)
indirect_args.insert(0, indirect_result)
indirections.insert(0, ("result", indirect_result))
indirect_result = model.void_type
tp = model.FunctionPtrType(tuple(indirect_args),
indirect_result, tp.ellipsis)
BFunc = self.ffi._get_cached_btype(tp)
wrappername = '_cffi_f_%s' % name
newfunction = module.load_function(BFunc, wrappername)
for i, typ in indirections:
newfunction = self._make_struct_wrapper(newfunction, i, typ,
base_tp)
setattr(library, name, newfunction)
type(library)._cffi_dir.append(name)
def _make_struct_wrapper(self, oldfunc, i, tp, base_tp):
backend = self.ffi._backend
BType = self.ffi._get_cached_btype(tp)
if i == "result":
ffi = self.ffi
def newfunc(*args):
res = ffi.new(BType)
oldfunc(res, *args)
return res[0]
else:
def newfunc(*args):
args = args[:i] + (backend.newp(BType, args[i]),) + args[i+1:]
return oldfunc(*args)
newfunc._cffi_base_type = base_tp
return newfunc
# ----------
# named structs
def _generate_gen_struct_decl(self, tp, name):
assert name == tp.name
self._generate_struct_or_union_decl(tp, 'struct', name)
def _loading_gen_struct(self, tp, name, module):
self._loading_struct_or_union(tp, 'struct', name, module)
def _loaded_gen_struct(self, tp, name, module, **kwds):
self._loaded_struct_or_union(tp)
def _generate_gen_union_decl(self, tp, name):
assert name == tp.name
self._generate_struct_or_union_decl(tp, 'union', name)
def _loading_gen_union(self, tp, name, module):
self._loading_struct_or_union(tp, 'union', name, module)
def _loaded_gen_union(self, tp, name, module, **kwds):
self._loaded_struct_or_union(tp)
def _generate_struct_or_union_decl(self, tp, prefix, name):
if tp.fldnames is None:
return # nothing to do with opaque structs
checkfuncname = '_cffi_check_%s_%s' % (prefix, name)
layoutfuncname = '_cffi_layout_%s_%s' % (prefix, name)
cname = ('%s %s' % (prefix, name)).strip()
#
prnt = self._prnt
prnt('static void %s(%s *p)' % (checkfuncname, cname))
prnt('{')
prnt(' /* only to generate compile-time warnings or errors */')
prnt(' (void)p;')
for fname, ftype, fbitsize, fqual in tp.enumfields():
if (isinstance(ftype, model.PrimitiveType)
and ftype.is_integer_type()) or fbitsize >= 0:
# accept all integers, but complain on float or double
prnt(' (void)((p->%s) << 1);' % fname)
else:
# only accept exactly the type declared.
try:
prnt(' { %s = &p->%s; (void)tmp; }' % (
ftype.get_c_name('*tmp', 'field %r'%fname, quals=fqual),
fname))
except VerificationError as e:
prnt(' /* %s */' % str(e)) # cannot verify it, ignore
prnt('}')
self.export_symbols.append(layoutfuncname)
prnt('intptr_t %s(intptr_t i)' % (layoutfuncname,))
prnt('{')
prnt(' struct _cffi_aligncheck { char x; %s y; };' % cname)
prnt(' static intptr_t nums[] = {')
prnt(' sizeof(%s),' % cname)
prnt(' offsetof(struct _cffi_aligncheck, y),')
for fname, ftype, fbitsize, fqual in tp.enumfields():
if fbitsize >= 0:
continue # xxx ignore fbitsize for now
prnt(' offsetof(%s, %s),' % (cname, fname))
if isinstance(ftype, model.ArrayType) and ftype.length is None:
prnt(' 0, /* %s */' % ftype._get_c_name())
else:
prnt(' sizeof(((%s *)0)->%s),' % (cname, fname))
prnt(' -1')
prnt(' };')
prnt(' return nums[i];')
prnt(' /* the next line is not executed, but compiled */')
prnt(' %s(0);' % (checkfuncname,))
prnt('}')
prnt()
def _loading_struct_or_union(self, tp, prefix, name, module):
if tp.fldnames is None:
return # nothing to do with opaque structs
layoutfuncname = '_cffi_layout_%s_%s' % (prefix, name)
#
BFunc = self.ffi._typeof_locked("intptr_t(*)(intptr_t)")[0]
function = module.load_function(BFunc, layoutfuncname)
layout = []
num = 0
while True:
x = function(num)
if x < 0: break
layout.append(x)
num += 1
if isinstance(tp, model.StructOrUnion) and tp.partial:
# use the function()'s sizes and offsets to guide the
# layout of the struct
totalsize = layout[0]
totalalignment = layout[1]
fieldofs = layout[2::2]
fieldsize = layout[3::2]
tp.force_flatten()
assert len(fieldofs) == len(fieldsize) == len(tp.fldnames)
tp.fixedlayout = fieldofs, fieldsize, totalsize, totalalignment
else:
cname = ('%s %s' % (prefix, name)).strip()
self._struct_pending_verification[tp] = layout, cname
def _loaded_struct_or_union(self, tp):
if tp.fldnames is None:
return # nothing to do with opaque structs
self.ffi._get_cached_btype(tp) # force 'fixedlayout' to be considered
if tp in self._struct_pending_verification:
# check that the layout sizes and offsets match the real ones
def check(realvalue, expectedvalue, msg):
if realvalue != expectedvalue:
raise VerificationError(
"%s (we have %d, but C compiler says %d)"
% (msg, expectedvalue, realvalue))
ffi = self.ffi
BStruct = ffi._get_cached_btype(tp)
layout, cname = self._struct_pending_verification.pop(tp)
check(layout[0], ffi.sizeof(BStruct), "wrong total size")
check(layout[1], ffi.alignof(BStruct), "wrong total alignment")
i = 2
for fname, ftype, fbitsize, fqual in tp.enumfields():
if fbitsize >= 0:
continue # xxx ignore fbitsize for now
check(layout[i], ffi.offsetof(BStruct, fname),
"wrong offset for field %r" % (fname,))
if layout[i+1] != 0:
BField = ffi._get_cached_btype(ftype)
check(layout[i+1], ffi.sizeof(BField),
"wrong size for field %r" % (fname,))
i += 2
assert i == len(layout)
# ----------
# 'anonymous' declarations. These are produced for anonymous structs
# or unions; the 'name' is obtained by a typedef.
def _generate_gen_anonymous_decl(self, tp, name):
if isinstance(tp, model.EnumType):
self._generate_gen_enum_decl(tp, name, '')
else:
self._generate_struct_or_union_decl(tp, '', name)
def _loading_gen_anonymous(self, tp, name, module):
if isinstance(tp, model.EnumType):
self._loading_gen_enum(tp, name, module, '')
else:
self._loading_struct_or_union(tp, '', name, module)
def _loaded_gen_anonymous(self, tp, name, module, **kwds):
if isinstance(tp, model.EnumType):
self._loaded_gen_enum(tp, name, module, **kwds)
else:
self._loaded_struct_or_union(tp)
# ----------
# constants, likely declared with '#define'
def _generate_gen_const(self, is_int, name, tp=None, category='const',
check_value=None):
prnt = self._prnt
funcname = '_cffi_%s_%s' % (category, name)
self.export_symbols.append(funcname)
if check_value is not None:
assert is_int
assert category == 'const'
prnt('int %s(char *out_error)' % funcname)
prnt('{')
self._check_int_constant_value(name, check_value)
prnt(' return 0;')
prnt('}')
elif is_int:
assert category == 'const'
prnt('int %s(long long *out_value)' % funcname)
prnt('{')
prnt(' *out_value = (long long)(%s);' % (name,))
prnt(' return (%s) <= 0;' % (name,))
prnt('}')
else:
assert tp is not None
assert check_value is None
if category == 'var':
ampersand = '&'
else:
ampersand = ''
extra = ''
if category == 'const' and isinstance(tp, model.StructOrUnion):
extra = 'const *'
ampersand = '&'
prnt(tp.get_c_name(' %s%s(void)' % (extra, funcname), name))
prnt('{')
prnt(' return (%s%s);' % (ampersand, name))
prnt('}')
prnt()
def _generate_gen_constant_decl(self, tp, name):
is_int = isinstance(tp, model.PrimitiveType) and tp.is_integer_type()
self._generate_gen_const(is_int, name, tp)
_loading_gen_constant = _loaded_noop
def _load_constant(self, is_int, tp, name, module, check_value=None):
funcname = '_cffi_const_%s' % name
if check_value is not None:
assert is_int
self._load_known_int_constant(module, funcname)
value = check_value
elif is_int:
BType = self.ffi._typeof_locked("long long*")[0]
BFunc = self.ffi._typeof_locked("int(*)(long long*)")[0]
function = module.load_function(BFunc, funcname)
p = self.ffi.new(BType)
negative = function(p)
value = int(p[0])
if value < 0 and not negative:
BLongLong = self.ffi._typeof_locked("long long")[0]
value += (1 << (8*self.ffi.sizeof(BLongLong)))
else:
assert check_value is None
fntypeextra = '(*)(void)'
if isinstance(tp, model.StructOrUnion):
fntypeextra = '*' + fntypeextra
BFunc = self.ffi._typeof_locked(tp.get_c_name(fntypeextra, name))[0]
function = module.load_function(BFunc, funcname)
value = function()
if isinstance(tp, model.StructOrUnion):
value = value[0]
return value
def _loaded_gen_constant(self, tp, name, module, library):
is_int = isinstance(tp, model.PrimitiveType) and tp.is_integer_type()
value = self._load_constant(is_int, tp, name, module)
setattr(library, name, value)
type(library)._cffi_dir.append(name)
# ----------
# enums
def _check_int_constant_value(self, name, value):
prnt = self._prnt
if value <= 0:
prnt(' if ((%s) > 0 || (long)(%s) != %dL) {' % (
name, name, value))
else:
prnt(' if ((%s) <= 0 || (unsigned long)(%s) != %dUL) {' % (
name, name, value))
prnt(' char buf[64];')
prnt(' if ((%s) <= 0)' % name)
prnt(' sprintf(buf, "%%ld", (long)(%s));' % name)
prnt(' else')
prnt(' sprintf(buf, "%%lu", (unsigned long)(%s));' %
name)
prnt(' sprintf(out_error, "%s has the real value %s, not %s",')
prnt(' "%s", buf, "%d");' % (name[:100], value))
prnt(' return -1;')
prnt(' }')
def _load_known_int_constant(self, module, funcname):
BType = self.ffi._typeof_locked("char[]")[0]
BFunc = self.ffi._typeof_locked("int(*)(char*)")[0]
function = module.load_function(BFunc, funcname)
p = self.ffi.new(BType, 256)
if function(p) < 0:
error = self.ffi.string(p)
if sys.version_info >= (3,):
error = str(error, 'utf-8')
raise VerificationError(error)
def _enum_funcname(self, prefix, name):
# "$enum_$1" => "___D_enum____D_1"
name = name.replace('$', '___D_')
return '_cffi_e_%s_%s' % (prefix, name)
def _generate_gen_enum_decl(self, tp, name, prefix='enum'):
if tp.partial:
for enumerator in tp.enumerators:
self._generate_gen_const(True, enumerator)
return
#
funcname = self._enum_funcname(prefix, name)
self.export_symbols.append(funcname)
prnt = self._prnt
prnt('int %s(char *out_error)' % funcname)
prnt('{')
for enumerator, enumvalue in zip(tp.enumerators, tp.enumvalues):
self._check_int_constant_value(enumerator, enumvalue)
prnt(' return 0;')
prnt('}')
prnt()
def _loading_gen_enum(self, tp, name, module, prefix='enum'):
if tp.partial:
enumvalues = [self._load_constant(True, tp, enumerator, module)
for enumerator in tp.enumerators]
tp.enumvalues = tuple(enumvalues)
tp.partial_resolved = True
else:
funcname = self._enum_funcname(prefix, name)
self._load_known_int_constant(module, funcname)
def _loaded_gen_enum(self, tp, name, module, library):
for enumerator, enumvalue in zip(tp.enumerators, tp.enumvalues):
setattr(library, enumerator, enumvalue)
type(library)._cffi_dir.append(enumerator)
# ----------
# macros: for now only for integers
def _generate_gen_macro_decl(self, tp, name):
if tp == '...':
check_value = None
else:
check_value = tp # an integer
self._generate_gen_const(True, name, check_value=check_value)
_loading_gen_macro = _loaded_noop
def _loaded_gen_macro(self, tp, name, module, library):
if tp == '...':
check_value = None
else:
check_value = tp # an integer
value = self._load_constant(True, tp, name, module,
check_value=check_value)
setattr(library, name, value)
type(library)._cffi_dir.append(name)
# ----------
# global variables
def _generate_gen_variable_decl(self, tp, name):
if isinstance(tp, model.ArrayType):
if tp.length_is_unknown():
prnt = self._prnt
funcname = '_cffi_sizeof_%s' % (name,)
self.export_symbols.append(funcname)
prnt("size_t %s(void)" % funcname)
prnt("{")
prnt(" return sizeof(%s);" % (name,))
prnt("}")
tp_ptr = model.PointerType(tp.item)
self._generate_gen_const(False, name, tp_ptr)
else:
tp_ptr = model.PointerType(tp)
self._generate_gen_const(False, name, tp_ptr, category='var')
_loading_gen_variable = _loaded_noop
def _loaded_gen_variable(self, tp, name, module, library):
if isinstance(tp, model.ArrayType): # int a[5] is "constant" in the
# sense that "a=..." is forbidden
if tp.length_is_unknown():
funcname = '_cffi_sizeof_%s' % (name,)
BFunc = self.ffi._typeof_locked('size_t(*)(void)')[0]
function = module.load_function(BFunc, funcname)
size = function()
BItemType = self.ffi._get_cached_btype(tp.item)
length, rest = divmod(size, self.ffi.sizeof(BItemType))
if rest != 0:
raise VerificationError(
"bad size: %r does not seem to be an array of %s" %
(name, tp.item))
tp = tp.resolve_length(length)
tp_ptr = model.PointerType(tp.item)
value = self._load_constant(False, tp_ptr, name, module)
# 'value' is a <cdata 'type *'> which we have to replace with
# a <cdata 'type[N]'> if the N is actually known
if tp.length is not None:
BArray = self.ffi._get_cached_btype(tp)
value = self.ffi.cast(BArray, value)
setattr(library, name, value)
type(library)._cffi_dir.append(name)
return
# remove ptr=<cdata 'int *'> from the library instance, and replace
# it by a property on the class, which reads/writes into ptr[0].
funcname = '_cffi_var_%s' % name
BFunc = self.ffi._typeof_locked(tp.get_c_name('*(*)(void)', name))[0]
function = module.load_function(BFunc, funcname)
ptr = function()
def getter(library):
return ptr[0]
def setter(library, value):
ptr[0] = value
setattr(type(library), name, property(getter, setter))
type(library)._cffi_dir.append(name)
cffimod_header = r'''
#include <stdio.h>
#include <stddef.h>
#include <stdarg.h>
#include <errno.h>
#include <sys/types.h> /* XXX for ssize_t on some platforms */
/* this block of #ifs should be kept exactly identical between
c/_cffi_backend.c, cffi/vengine_cpy.py, cffi/vengine_gen.py
and cffi/_cffi_include.h */
#if defined(_MSC_VER)
# include <malloc.h> /* for alloca() */
# if _MSC_VER < 1600 /* MSVC < 2010 */
typedef __int8 int8_t;
typedef __int16 int16_t;
typedef __int32 int32_t;
typedef __int64 int64_t;
typedef unsigned __int8 uint8_t;
typedef unsigned __int16 uint16_t;
typedef unsigned __int32 uint32_t;
typedef unsigned __int64 uint64_t;
typedef __int8 int_least8_t;
typedef __int16 int_least16_t;
typedef __int32 int_least32_t;
typedef __int64 int_least64_t;
typedef unsigned __int8 uint_least8_t;
typedef unsigned __int16 uint_least16_t;
typedef unsigned __int32 uint_least32_t;
typedef unsigned __int64 uint_least64_t;
typedef __int8 int_fast8_t;
typedef __int16 int_fast16_t;
typedef __int32 int_fast32_t;
typedef __int64 int_fast64_t;
typedef unsigned __int8 uint_fast8_t;
typedef unsigned __int16 uint_fast16_t;
typedef unsigned __int32 uint_fast32_t;
typedef unsigned __int64 uint_fast64_t;
typedef __int64 intmax_t;
typedef unsigned __int64 uintmax_t;
# else
# include <stdint.h>
# endif
# if _MSC_VER < 1800 /* MSVC < 2013 */
# ifndef __cplusplus
typedef unsigned char _Bool;
# endif
# endif
# define _cffi_float_complex_t _Fcomplex /* include <complex.h> for it */
# define _cffi_double_complex_t _Dcomplex /* include <complex.h> for it */
#else
# include <stdint.h>
# if (defined (__SVR4) && defined (__sun)) || defined(_AIX) || defined(__hpux)
# include <alloca.h>
# endif
# define _cffi_float_complex_t float _Complex
# define _cffi_double_complex_t double _Complex
#endif
'''

View File

@@ -0,0 +1,306 @@
#
# DEPRECATED: implementation for ffi.verify()
#
import sys, os, binascii, shutil, io
from . import __version_verifier_modules__
from . import ffiplatform
from .error import VerificationError
if sys.version_info >= (3, 3):
import importlib.machinery
def _extension_suffixes():
return importlib.machinery.EXTENSION_SUFFIXES[:]
else:
import imp
def _extension_suffixes():
return [suffix for suffix, _, type in imp.get_suffixes()
if type == imp.C_EXTENSION]
if sys.version_info >= (3,):
NativeIO = io.StringIO
else:
class NativeIO(io.BytesIO):
def write(self, s):
if isinstance(s, unicode):
s = s.encode('ascii')
super(NativeIO, self).write(s)
class Verifier(object):
def __init__(self, ffi, preamble, tmpdir=None, modulename=None,
ext_package=None, tag='', force_generic_engine=False,
source_extension='.c', flags=None, relative_to=None, **kwds):
if ffi._parser._uses_new_feature:
raise VerificationError(
"feature not supported with ffi.verify(), but only "
"with ffi.set_source(): %s" % (ffi._parser._uses_new_feature,))
self.ffi = ffi
self.preamble = preamble
if not modulename:
flattened_kwds = ffiplatform.flatten(kwds)
vengine_class = _locate_engine_class(ffi, force_generic_engine)
self._vengine = vengine_class(self)
self._vengine.patch_extension_kwds(kwds)
self.flags = flags
self.kwds = self.make_relative_to(kwds, relative_to)
#
if modulename:
if tag:
raise TypeError("can't specify both 'modulename' and 'tag'")
else:
key = '\x00'.join(['%d.%d' % sys.version_info[:2],
__version_verifier_modules__,
preamble, flattened_kwds] +
ffi._cdefsources)
if sys.version_info >= (3,):
key = key.encode('utf-8')
k1 = hex(binascii.crc32(key[0::2]) & 0xffffffff)
k1 = k1.lstrip('0x').rstrip('L')
k2 = hex(binascii.crc32(key[1::2]) & 0xffffffff)
k2 = k2.lstrip('0').rstrip('L')
modulename = '_cffi_%s_%s%s%s' % (tag, self._vengine._class_key,
k1, k2)
suffix = _get_so_suffixes()[0]
self.tmpdir = tmpdir or _caller_dir_pycache()
self.sourcefilename = os.path.join(self.tmpdir, modulename + source_extension)
self.modulefilename = os.path.join(self.tmpdir, modulename + suffix)
self.ext_package = ext_package
self._has_source = False
self._has_module = False
def write_source(self, file=None):
"""Write the C source code. It is produced in 'self.sourcefilename',
which can be tweaked beforehand."""
with self.ffi._lock:
if self._has_source and file is None:
raise VerificationError(
"source code already written")
self._write_source(file)
def compile_module(self):
"""Write the C source code (if not done already) and compile it.
This produces a dynamic link library in 'self.modulefilename'."""
with self.ffi._lock:
if self._has_module:
raise VerificationError("module already compiled")
if not self._has_source:
self._write_source()
self._compile_module()
def load_library(self):
"""Get a C module from this Verifier instance.
Returns an instance of a FFILibrary class that behaves like the
objects returned by ffi.dlopen(), but that delegates all
operations to the C module. If necessary, the C code is written
and compiled first.
"""
with self.ffi._lock:
if not self._has_module:
self._locate_module()
if not self._has_module:
if not self._has_source:
self._write_source()
self._compile_module()
return self._load_library()
def get_module_name(self):
basename = os.path.basename(self.modulefilename)
# kill both the .so extension and the other .'s, as introduced
# by Python 3: 'basename.cpython-33m.so'
basename = basename.split('.', 1)[0]
# and the _d added in Python 2 debug builds --- but try to be
# conservative and not kill a legitimate _d
if basename.endswith('_d') and hasattr(sys, 'gettotalrefcount'):
basename = basename[:-2]
return basename
def get_extension(self):
if not self._has_source:
with self.ffi._lock:
if not self._has_source:
self._write_source()
sourcename = ffiplatform.maybe_relative_path(self.sourcefilename)
modname = self.get_module_name()
return ffiplatform.get_extension(sourcename, modname, **self.kwds)
def generates_python_module(self):
return self._vengine._gen_python_module
def make_relative_to(self, kwds, relative_to):
if relative_to and os.path.dirname(relative_to):
dirname = os.path.dirname(relative_to)
kwds = kwds.copy()
for key in ffiplatform.LIST_OF_FILE_NAMES:
if key in kwds:
lst = kwds[key]
if not isinstance(lst, (list, tuple)):
raise TypeError("keyword '%s' should be a list or tuple"
% (key,))
lst = [os.path.join(dirname, fn) for fn in lst]
kwds[key] = lst
return kwds
# ----------
def _locate_module(self):
if not os.path.isfile(self.modulefilename):
if self.ext_package:
try:
pkg = __import__(self.ext_package, None, None, ['__doc__'])
except ImportError:
return # cannot import the package itself, give up
# (e.g. it might be called differently before installation)
path = pkg.__path__
else:
path = None
filename = self._vengine.find_module(self.get_module_name(), path,
_get_so_suffixes())
if filename is None:
return
self.modulefilename = filename
self._vengine.collect_types()
self._has_module = True
def _write_source_to(self, file):
self._vengine._f = file
try:
self._vengine.write_source_to_f()
finally:
del self._vengine._f
def _write_source(self, file=None):
if file is not None:
self._write_source_to(file)
else:
# Write our source file to an in memory file.
f = NativeIO()
self._write_source_to(f)
source_data = f.getvalue()
# Determine if this matches the current file
if os.path.exists(self.sourcefilename):
with open(self.sourcefilename, "r") as fp:
needs_written = not (fp.read() == source_data)
else:
needs_written = True
# Actually write the file out if it doesn't match
if needs_written:
_ensure_dir(self.sourcefilename)
with open(self.sourcefilename, "w") as fp:
fp.write(source_data)
# Set this flag
self._has_source = True
def _compile_module(self):
# compile this C source
tmpdir = os.path.dirname(self.sourcefilename)
outputfilename = ffiplatform.compile(tmpdir, self.get_extension())
try:
same = ffiplatform.samefile(outputfilename, self.modulefilename)
except OSError:
same = False
if not same:
_ensure_dir(self.modulefilename)
shutil.move(outputfilename, self.modulefilename)
self._has_module = True
def _load_library(self):
assert self._has_module
if self.flags is not None:
return self._vengine.load_library(self.flags)
else:
return self._vengine.load_library()
# ____________________________________________________________
_FORCE_GENERIC_ENGINE = False # for tests
def _locate_engine_class(ffi, force_generic_engine):
if _FORCE_GENERIC_ENGINE:
force_generic_engine = True
if not force_generic_engine:
if '__pypy__' in sys.builtin_module_names:
force_generic_engine = True
else:
try:
import _cffi_backend
except ImportError:
_cffi_backend = '?'
if ffi._backend is not _cffi_backend:
force_generic_engine = True
if force_generic_engine:
from . import vengine_gen
return vengine_gen.VGenericEngine
else:
from . import vengine_cpy
return vengine_cpy.VCPythonEngine
# ____________________________________________________________
_TMPDIR = None
def _caller_dir_pycache():
if _TMPDIR:
return _TMPDIR
result = os.environ.get('CFFI_TMPDIR')
if result:
return result
filename = sys._getframe(2).f_code.co_filename
return os.path.abspath(os.path.join(os.path.dirname(filename),
'__pycache__'))
def set_tmpdir(dirname):
"""Set the temporary directory to use instead of __pycache__."""
global _TMPDIR
_TMPDIR = dirname
def cleanup_tmpdir(tmpdir=None, keep_so=False):
"""Clean up the temporary directory by removing all files in it
called `_cffi_*.{c,so}` as well as the `build` subdirectory."""
tmpdir = tmpdir or _caller_dir_pycache()
try:
filelist = os.listdir(tmpdir)
except OSError:
return
if keep_so:
suffix = '.c' # only remove .c files
else:
suffix = _get_so_suffixes()[0].lower()
for fn in filelist:
if fn.lower().startswith('_cffi_') and (
fn.lower().endswith(suffix) or fn.lower().endswith('.c')):
try:
os.unlink(os.path.join(tmpdir, fn))
except OSError:
pass
clean_dir = [os.path.join(tmpdir, 'build')]
for dir in clean_dir:
try:
for fn in os.listdir(dir):
fn = os.path.join(dir, fn)
if os.path.isdir(fn):
clean_dir.append(fn)
else:
os.unlink(fn)
except OSError:
pass
def _get_so_suffixes():
suffixes = _extension_suffixes()
if not suffixes:
# bah, no C_EXTENSION available. Occurs on pypy without cpyext
if sys.platform == 'win32':
suffixes = [".pyd"]
else:
suffixes = [".so"]
return suffixes
def _ensure_dir(filename):
dirname = os.path.dirname(filename)
if dirname and not os.path.isdir(dirname):
os.makedirs(dirname)

View File

@@ -0,0 +1,764 @@
Metadata-Version: 2.4
Name: charset-normalizer
Version: 3.4.4
Summary: The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
Author-email: "Ahmed R. TAHRI" <tahri.ahmed@proton.me>
Maintainer-email: "Ahmed R. TAHRI" <tahri.ahmed@proton.me>
License: MIT
Project-URL: Changelog, https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md
Project-URL: Documentation, https://charset-normalizer.readthedocs.io/
Project-URL: Code, https://github.com/jawah/charset_normalizer
Project-URL: Issue tracker, https://github.com/jawah/charset_normalizer/issues
Keywords: encoding,charset,charset-detector,detector,normalization,unicode,chardet,detect
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: unicode-backport
Dynamic: license-file
<h1 align="center">Charset Detection, for Everyone 👋</h1>
<p align="center">
<sup>The Real First Universal Charset Detector</sup><br>
<a href="https://pypi.org/project/charset-normalizer">
<img src="https://img.shields.io/pypi/pyversions/charset_normalizer.svg?orange=blue" />
</a>
<a href="https://pepy.tech/project/charset-normalizer/">
<img alt="Download Count Total" src="https://static.pepy.tech/badge/charset-normalizer/month" />
</a>
<a href="https://bestpractices.coreinfrastructure.org/projects/7297">
<img src="https://bestpractices.coreinfrastructure.org/projects/7297/badge">
</a>
</p>
<p align="center">
<sup><i>Featured Packages</i></sup><br>
<a href="https://github.com/jawah/niquests">
<img alt="Static Badge" src="https://img.shields.io/badge/Niquests-Most_Advanced_HTTP_Client-cyan">
</a>
<a href="https://github.com/jawah/wassima">
<img alt="Static Badge" src="https://img.shields.io/badge/Wassima-Certifi_Replacement-cyan">
</a>
</p>
<p align="center">
<sup><i>In other language (unofficial port - by the community)</i></sup><br>
<a href="https://github.com/nickspring/charset-normalizer-rs">
<img alt="Static Badge" src="https://img.shields.io/badge/Rust-red">
</a>
</p>
> A library that helps you read text from an unknown charset encoding.<br /> Motivated by `chardet`,
> I'm trying to resolve the issue by taking a new approach.
> All IANA character set names for which the Python core library provides codecs are supported.
<p align="center">
>>>>> <a href="https://charsetnormalizerweb.ousret.now.sh" target="_blank">👉 Try Me Online Now, Then Adopt Me 👈 </a> <<<<<
</p>
This project offers you an alternative to **Universal Charset Encoding Detector**, also known as **Chardet**.
| Feature | [Chardet](https://github.com/chardet/chardet) | Charset Normalizer | [cChardet](https://github.com/PyYoshi/cChardet) |
|--------------------------------------------------|:---------------------------------------------:|:--------------------------------------------------------------------------------------------------:|:-----------------------------------------------:|
| `Fast` | ❌ | ✅ | ✅ |
| `Universal**` | ❌ | ✅ | ❌ |
| `Reliable` **without** distinguishable standards | ❌ | ✅ | ✅ |
| `Reliable` **with** distinguishable standards | ✅ | ✅ | ✅ |
| `License` | LGPL-2.1<br>_restrictive_ | MIT | MPL-1.1<br>_restrictive_ |
| `Native Python` | ✅ | ✅ | ❌ |
| `Detect spoken language` | ❌ | ✅ | N/A |
| `UnicodeDecodeError Safety` | ❌ | ✅ | ❌ |
| `Whl Size (min)` | 193.6 kB | 42 kB | ~200 kB |
| `Supported Encoding` | 33 | 🎉 [99](https://charset-normalizer.readthedocs.io/en/latest/user/support.html#supported-encodings) | 40 |
<p align="center">
<img src="https://i.imgflip.com/373iay.gif" alt="Reading Normalized Text" width="226"/><img src="https://media.tenor.com/images/c0180f70732a18b4965448d33adba3d0/tenor.gif" alt="Cat Reading Text" width="200"/>
</p>
*\*\* : They are clearly using specific code for a specific encoding even if covering most of used one*<br>
## ⚡ Performance
This package offer better performance than its counterpart Chardet. Here are some numbers.
| Package | Accuracy | Mean per file (ms) | File per sec (est) |
|-----------------------------------------------|:--------:|:------------------:|:------------------:|
| [chardet](https://github.com/chardet/chardet) | 86 % | 63 ms | 16 file/sec |
| charset-normalizer | **98 %** | **10 ms** | 100 file/sec |
| Package | 99th percentile | 95th percentile | 50th percentile |
|-----------------------------------------------|:---------------:|:---------------:|:---------------:|
| [chardet](https://github.com/chardet/chardet) | 265 ms | 71 ms | 7 ms |
| charset-normalizer | 100 ms | 50 ms | 5 ms |
_updated as of december 2024 using CPython 3.12_
Chardet's performance on larger file (1MB+) are very poor. Expect huge difference on large payload.
> Stats are generated using 400+ files using default parameters. More details on used files, see GHA workflows.
> And yes, these results might change at any time. The dataset can be updated to include more files.
> The actual delays heavily depends on your CPU capabilities. The factors should remain the same.
> Keep in mind that the stats are generous and that Chardet accuracy vs our is measured using Chardet initial capability
> (e.g. Supported Encoding) Challenge-them if you want.
## ✨ Installation
Using pip:
```sh
pip install charset-normalizer -U
```
## 🚀 Basic Usage
### CLI
This package comes with a CLI.
```
usage: normalizer [-h] [-v] [-a] [-n] [-m] [-r] [-f] [-t THRESHOLD]
file [file ...]
The Real First Universal Charset Detector. Discover originating encoding used
on text file. Normalize text to unicode.
positional arguments:
files File(s) to be analysed
optional arguments:
-h, --help show this help message and exit
-v, --verbose Display complementary information about file if any.
Stdout will contain logs about the detection process.
-a, --with-alternative
Output complementary possibilities if any. Top-level
JSON WILL be a list.
-n, --normalize Permit to normalize input file. If not set, program
does not write anything.
-m, --minimal Only output the charset detected to STDOUT. Disabling
JSON output.
-r, --replace Replace file when trying to normalize it instead of
creating a new one.
-f, --force Replace file without asking if you are sure, use this
flag with caution.
-t THRESHOLD, --threshold THRESHOLD
Define a custom maximum amount of chaos allowed in
decoded content. 0. <= chaos <= 1.
--version Show version information and exit.
```
```bash
normalizer ./data/sample.1.fr.srt
```
or
```bash
python -m charset_normalizer ./data/sample.1.fr.srt
```
🎉 Since version 1.4.0 the CLI produce easily usable stdout result in JSON format.
```json
{
"path": "/home/default/projects/charset_normalizer/data/sample.1.fr.srt",
"encoding": "cp1252",
"encoding_aliases": [
"1252",
"windows_1252"
],
"alternative_encodings": [
"cp1254",
"cp1256",
"cp1258",
"iso8859_14",
"iso8859_15",
"iso8859_16",
"iso8859_3",
"iso8859_9",
"latin_1",
"mbcs"
],
"language": "French",
"alphabets": [
"Basic Latin",
"Latin-1 Supplement"
],
"has_sig_or_bom": false,
"chaos": 0.149,
"coherence": 97.152,
"unicode_path": null,
"is_preferred": true
}
```
### Python
*Just print out normalized text*
```python
from charset_normalizer import from_path
results = from_path('./my_subtitle.srt')
print(str(results.best()))
```
*Upgrade your code without effort*
```python
from charset_normalizer import detect
```
The above code will behave the same as **chardet**. We ensure that we offer the best (reasonable) BC result possible.
See the docs for advanced usage : [readthedocs.io](https://charset-normalizer.readthedocs.io/en/latest/)
## 😇 Why
When I started using Chardet, I noticed that it was not suited to my expectations, and I wanted to propose a
reliable alternative using a completely different method. Also! I never back down on a good challenge!
I **don't care** about the **originating charset** encoding, because **two different tables** can
produce **two identical rendered string.**
What I want is to get readable text, the best I can.
In a way, **I'm brute forcing text decoding.** How cool is that ? 😎
Don't confuse package **ftfy** with charset-normalizer or chardet. ftfy goal is to repair Unicode string whereas charset-normalizer to convert raw file in unknown encoding to unicode.
## 🍰 How
- Discard all charset encoding table that could not fit the binary content.
- Measure noise, or the mess once opened (by chunks) with a corresponding charset encoding.
- Extract matches with the lowest mess detected.
- Additionally, we measure coherence / probe for a language.
**Wait a minute**, what is noise/mess and coherence according to **YOU ?**
*Noise :* I opened hundred of text files, **written by humans**, with the wrong encoding table. **I observed**, then
**I established** some ground rules about **what is obvious** when **it seems like** a mess (aka. defining noise in rendered text).
I know that my interpretation of what is noise is probably incomplete, feel free to contribute in order to
improve or rewrite it.
*Coherence :* For each language there is on earth, we have computed ranked letter appearance occurrences (the best we can). So I thought
that intel is worth something here. So I use those records against decoded text to check if I can detect intelligent design.
## ⚡ Known limitations
- Language detection is unreliable when text contains two or more languages sharing identical letters. (eg. HTML (english tags) + Turkish content (Sharing Latin characters))
- Every charset detector heavily depends on sufficient content. In common cases, do not bother run detection on very tiny content.
## ⚠️ About Python EOLs
**If you are running:**
- Python >=2.7,<3.5: Unsupported
- Python 3.5: charset-normalizer < 2.1
- Python 3.6: charset-normalizer < 3.1
- Python 3.7: charset-normalizer < 4.0
Upgrade your Python interpreter as soon as possible.
## 👤 Contributing
Contributions, issues and feature requests are very much welcome.<br />
Feel free to check [issues page](https://github.com/ousret/charset_normalizer/issues) if you want to contribute.
## 📝 License
Copyright © [Ahmed TAHRI @Ousret](https://github.com/Ousret).<br />
This project is [MIT](https://github.com/Ousret/charset_normalizer/blob/master/LICENSE) licensed.
Characters frequencies used in this project © 2012 [Denny Vrandečić](http://simia.net/letters/)
## 💼 For Enterprise
Professional support for charset-normalizer is available as part of the [Tidelift
Subscription][1]. Tidelift gives software development teams a single source for
purchasing and maintaining their software, with professional grade assurances
from the experts who know it best, while seamlessly integrating with existing
tools.
[1]: https://tidelift.com/subscription/pkg/pypi-charset-normalizer?utm_source=pypi-charset-normalizer&utm_medium=readme
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/7297/badge)](https://www.bestpractices.dev/projects/7297)
# Changelog
All notable changes to charset-normalizer will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [3.4.4](https://github.com/Ousret/charset_normalizer/compare/3.4.2...3.4.4) (2025-10-13)
### Changed
- Bound `setuptools` to a specific constraint `setuptools>=68,<=81`.
- Raised upper bound of mypyc for the optional pre-built extension to v1.18.2
### Removed
- `setuptools-scm` as a build dependency.
### Misc
- Enforced hashes in `dev-requirements.txt` and created `ci-requirements.txt` for security purposes.
- Additional pre-built wheels for riscv64, s390x, and armv7l architectures.
- Restore ` multiple.intoto.jsonl` in GitHub releases in addition to individual attestation file per wheel.
## [3.4.3](https://github.com/Ousret/charset_normalizer/compare/3.4.2...3.4.3) (2025-08-09)
### Changed
- mypy(c) is no longer a required dependency at build time if `CHARSET_NORMALIZER_USE_MYPYC` isn't set to `1`. (#595) (#583)
- automatically lower confidence on small bytes samples that are not Unicode in `detect` output legacy function. (#391)
### Added
- Custom build backend to overcome inability to mark mypy as an optional dependency in the build phase.
- Support for Python 3.14
### Fixed
- sdist archive contained useless directories.
- automatically fallback on valid UTF-16 or UTF-32 even if the md says it's noisy. (#633)
### Misc
- SBOM are automatically published to the relevant GitHub release to comply with regulatory changes.
Each published wheel comes with its SBOM. We choose CycloneDX as the format.
- Prebuilt optimized wheel are no longer distributed by default for CPython 3.7 due to a change in cibuildwheel.
## [3.4.2](https://github.com/Ousret/charset_normalizer/compare/3.4.1...3.4.2) (2025-05-02)
### Fixed
- Addressed the DeprecationWarning in our CLI regarding `argparse.FileType` by backporting the target class into the package. (#591)
- Improved the overall reliability of the detector with CJK Ideographs. (#605) (#587)
### Changed
- Optional mypyc compilation upgraded to version 1.15 for Python >= 3.8
## [3.4.1](https://github.com/Ousret/charset_normalizer/compare/3.4.0...3.4.1) (2024-12-24)
### Changed
- Project metadata are now stored using `pyproject.toml` instead of `setup.cfg` using setuptools as the build backend.
- Enforce annotation delayed loading for a simpler and consistent types in the project.
- Optional mypyc compilation upgraded to version 1.14 for Python >= 3.8
### Added
- pre-commit configuration.
- noxfile.
### Removed
- `build-requirements.txt` as per using `pyproject.toml` native build configuration.
- `bin/integration.py` and `bin/serve.py` in favor of downstream integration test (see noxfile).
- `setup.cfg` in favor of `pyproject.toml` metadata configuration.
- Unused `utils.range_scan` function.
### Fixed
- Converting content to Unicode bytes may insert `utf_8` instead of preferred `utf-8`. (#572)
- Deprecation warning "'count' is passed as positional argument" when converting to Unicode bytes on Python 3.13+
## [3.4.0](https://github.com/Ousret/charset_normalizer/compare/3.3.2...3.4.0) (2024-10-08)
### Added
- Argument `--no-preemptive` in the CLI to prevent the detector to search for hints.
- Support for Python 3.13 (#512)
### Fixed
- Relax the TypeError exception thrown when trying to compare a CharsetMatch with anything else than a CharsetMatch.
- Improved the general reliability of the detector based on user feedbacks. (#520) (#509) (#498) (#407) (#537)
- Declared charset in content (preemptive detection) not changed when converting to utf-8 bytes. (#381)
## [3.3.2](https://github.com/Ousret/charset_normalizer/compare/3.3.1...3.3.2) (2023-10-31)
### Fixed
- Unintentional memory usage regression when using large payload that match several encoding (#376)
- Regression on some detection case showcased in the documentation (#371)
### Added
- Noise (md) probe that identify malformed arabic representation due to the presence of letters in isolated form (credit to my wife)
## [3.3.1](https://github.com/Ousret/charset_normalizer/compare/3.3.0...3.3.1) (2023-10-22)
### Changed
- Optional mypyc compilation upgraded to version 1.6.1 for Python >= 3.8
- Improved the general detection reliability based on reports from the community
## [3.3.0](https://github.com/Ousret/charset_normalizer/compare/3.2.0...3.3.0) (2023-09-30)
### Added
- Allow to execute the CLI (e.g. normalizer) through `python -m charset_normalizer.cli` or `python -m charset_normalizer`
- Support for 9 forgotten encoding that are supported by Python but unlisted in `encoding.aliases` as they have no alias (#323)
### Removed
- (internal) Redundant utils.is_ascii function and unused function is_private_use_only
- (internal) charset_normalizer.assets is moved inside charset_normalizer.constant
### Changed
- (internal) Unicode code blocks in constants are updated using the latest v15.0.0 definition to improve detection
- Optional mypyc compilation upgraded to version 1.5.1 for Python >= 3.8
### Fixed
- Unable to properly sort CharsetMatch when both chaos/noise and coherence were close due to an unreachable condition in \_\_lt\_\_ (#350)
## [3.2.0](https://github.com/Ousret/charset_normalizer/compare/3.1.0...3.2.0) (2023-06-07)
### Changed
- Typehint for function `from_path` no longer enforce `PathLike` as its first argument
- Minor improvement over the global detection reliability
### Added
- Introduce function `is_binary` that relies on main capabilities, and optimized to detect binaries
- Propagate `enable_fallback` argument throughout `from_bytes`, `from_path`, and `from_fp` that allow a deeper control over the detection (default True)
- Explicit support for Python 3.12
### Fixed
- Edge case detection failure where a file would contain 'very-long' camel cased word (Issue #289)
## [3.1.0](https://github.com/Ousret/charset_normalizer/compare/3.0.1...3.1.0) (2023-03-06)
### Added
- Argument `should_rename_legacy` for legacy function `detect` and disregard any new arguments without errors (PR #262)
### Removed
- Support for Python 3.6 (PR #260)
### Changed
- Optional speedup provided by mypy/c 1.0.1
## [3.0.1](https://github.com/Ousret/charset_normalizer/compare/3.0.0...3.0.1) (2022-11-18)
### Fixed
- Multi-bytes cutter/chunk generator did not always cut correctly (PR #233)
### Changed
- Speedup provided by mypy/c 0.990 on Python >= 3.7
## [3.0.0](https://github.com/Ousret/charset_normalizer/compare/2.1.1...3.0.0) (2022-10-20)
### Added
- Extend the capability of explain=True when cp_isolation contains at most two entries (min one), will log in details of the Mess-detector results
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter `language_threshold` in `from_bytes`, `from_path` and `from_fp` to adjust the minimum expected coherence ratio
- `normalizer --version` now specify if current version provide extra speedup (meaning mypyc compilation whl)
### Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter
- Optional: Module `md.py` can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1
### Fixed
- CLI with opt --normalize fail when using full path for files
- TooManyAccentuatedPlugin induce false positive on the mess detection when too few alpha character have been fed to it
- Sphinx warnings when generating the documentation
### Removed
- Coherence detector no longer return 'Simple English' instead return 'English'
- Coherence detector no longer return 'Classical Chinese' instead return 'Chinese'
- Breaking: Method `first()` and `best()` from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflict with ASCII)
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function `normalize`
- Breaking: Properties `chaos_secondary_pass`, `coherence_non_latin` and `w_counter` from CharsetMatch
- Support for the backport `unicodedata2`
## [3.0.0rc1](https://github.com/Ousret/charset_normalizer/compare/3.0.0b2...3.0.0rc1) (2022-10-18)
### Added
- Extend the capability of explain=True when cp_isolation contains at most two entries (min one), will log in details of the Mess-detector results
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter `language_threshold` in `from_bytes`, `from_path` and `from_fp` to adjust the minimum expected coherence ratio
### Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter
### Fixed
- CLI with opt --normalize fail when using full path for files
- TooManyAccentuatedPlugin induce false positive on the mess detection when too few alpha character have been fed to it
### Removed
- Coherence detector no longer return 'Simple English' instead return 'English'
- Coherence detector no longer return 'Classical Chinese' instead return 'Chinese'
## [3.0.0b2](https://github.com/Ousret/charset_normalizer/compare/3.0.0b1...3.0.0b2) (2022-08-21)
### Added
- `normalizer --version` now specify if current version provide extra speedup (meaning mypyc compilation whl)
### Removed
- Breaking: Method `first()` and `best()` from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflict with ASCII)
### Fixed
- Sphinx warnings when generating the documentation
## [3.0.0b1](https://github.com/Ousret/charset_normalizer/compare/2.1.0...3.0.0b1) (2022-08-15)
### Changed
- Optional: Module `md.py` can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1
### Removed
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function `normalize`
- Breaking: Properties `chaos_secondary_pass`, `coherence_non_latin` and `w_counter` from CharsetMatch
- Support for the backport `unicodedata2`
## [2.1.1](https://github.com/Ousret/charset_normalizer/compare/2.1.0...2.1.1) (2022-08-19)
### Deprecated
- Function `normalize` scheduled for removal in 3.0
### Changed
- Removed useless call to decode in fn is_unprintable (#206)
### Fixed
- Third-party library (i18n xgettext) crashing not recognizing utf_8 (PEP 263) with underscore from [@aleksandernovikov](https://github.com/aleksandernovikov) (#204)
## [2.1.0](https://github.com/Ousret/charset_normalizer/compare/2.0.12...2.1.0) (2022-06-19)
### Added
- Output the Unicode table version when running the CLI with `--version` (PR #194)
### Changed
- Re-use decoded buffer for single byte character sets from [@nijel](https://github.com/nijel) (PR #175)
- Fixing some performance bottlenecks from [@deedy5](https://github.com/deedy5) (PR #183)
### Fixed
- Workaround potential bug in cpython with Zero Width No-Break Space located in Arabic Presentation Forms-B, Unicode 1.1 not acknowledged as space (PR #175)
- CLI default threshold aligned with the API threshold from [@oleksandr-kuzmenko](https://github.com/oleksandr-kuzmenko) (PR #181)
### Removed
- Support for Python 3.5 (PR #192)
### Deprecated
- Use of backport unicodedata from `unicodedata2` as Python is quickly catching up, scheduled for removal in 3.0 (PR #194)
## [2.0.12](https://github.com/Ousret/charset_normalizer/compare/2.0.11...2.0.12) (2022-02-12)
### Fixed
- ASCII miss-detection on rare cases (PR #170)
## [2.0.11](https://github.com/Ousret/charset_normalizer/compare/2.0.10...2.0.11) (2022-01-30)
### Added
- Explicit support for Python 3.11 (PR #164)
### Changed
- The logging behavior have been completely reviewed, now using only TRACE and DEBUG levels (PR #163 #165)
## [2.0.10](https://github.com/Ousret/charset_normalizer/compare/2.0.9...2.0.10) (2022-01-04)
### Fixed
- Fallback match entries might lead to UnicodeDecodeError for large bytes sequence (PR #154)
### Changed
- Skipping the language-detection (CD) on ASCII (PR #155)
## [2.0.9](https://github.com/Ousret/charset_normalizer/compare/2.0.8...2.0.9) (2021-12-03)
### Changed
- Moderating the logging impact (since 2.0.8) for specific environments (PR #147)
### Fixed
- Wrong logging level applied when setting kwarg `explain` to True (PR #146)
## [2.0.8](https://github.com/Ousret/charset_normalizer/compare/2.0.7...2.0.8) (2021-11-24)
### Changed
- Improvement over Vietnamese detection (PR #126)
- MD improvement on trailing data and long foreign (non-pure latin) data (PR #124)
- Efficiency improvements in cd/alphabet_languages from [@adbar](https://github.com/adbar) (PR #122)
- call sum() without an intermediary list following PEP 289 recommendations from [@adbar](https://github.com/adbar) (PR #129)
- Code style as refactored by Sourcery-AI (PR #131)
- Minor adjustment on the MD around european words (PR #133)
- Remove and replace SRTs from assets / tests (PR #139)
- Initialize the library logger with a `NullHandler` by default from [@nmaynes](https://github.com/nmaynes) (PR #135)
- Setting kwarg `explain` to True will add provisionally (bounded to function lifespan) a specific stream handler (PR #135)
### Fixed
- Fix large (misleading) sequence giving UnicodeDecodeError (PR #137)
- Avoid using too insignificant chunk (PR #137)
### Added
- Add and expose function `set_logging_handler` to configure a specific StreamHandler from [@nmaynes](https://github.com/nmaynes) (PR #135)
- Add `CHANGELOG.md` entries, format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) (PR #141)
## [2.0.7](https://github.com/Ousret/charset_normalizer/compare/2.0.6...2.0.7) (2021-10-11)
### Added
- Add support for Kazakh (Cyrillic) language detection (PR #109)
### Changed
- Further, improve inferring the language from a given single-byte code page (PR #112)
- Vainly trying to leverage PEP263 when PEP3120 is not supported (PR #116)
- Refactoring for potential performance improvements in loops from [@adbar](https://github.com/adbar) (PR #113)
- Various detection improvement (MD+CD) (PR #117)
### Removed
- Remove redundant logging entry about detected language(s) (PR #115)
### Fixed
- Fix a minor inconsistency between Python 3.5 and other versions regarding language detection (PR #117 #102)
## [2.0.6](https://github.com/Ousret/charset_normalizer/compare/2.0.5...2.0.6) (2021-09-18)
### Fixed
- Unforeseen regression with the loss of the backward-compatibility with some older minor of Python 3.5.x (PR #100)
- Fix CLI crash when using --minimal output in certain cases (PR #103)
### Changed
- Minor improvement to the detection efficiency (less than 1%) (PR #106 #101)
## [2.0.5](https://github.com/Ousret/charset_normalizer/compare/2.0.4...2.0.5) (2021-09-14)
### Changed
- The project now comply with: flake8, mypy, isort and black to ensure a better overall quality (PR #81)
- The BC-support with v1.x was improved, the old staticmethods are restored (PR #82)
- The Unicode detection is slightly improved (PR #93)
- Add syntax sugar \_\_bool\_\_ for results CharsetMatches list-container (PR #91)
### Removed
- The project no longer raise warning on tiny content given for detection, will be simply logged as warning instead (PR #92)
### Fixed
- In some rare case, the chunks extractor could cut in the middle of a multi-byte character and could mislead the mess detection (PR #95)
- Some rare 'space' characters could trip up the UnprintablePlugin/Mess detection (PR #96)
- The MANIFEST.in was not exhaustive (PR #78)
## [2.0.4](https://github.com/Ousret/charset_normalizer/compare/2.0.3...2.0.4) (2021-07-30)
### Fixed
- The CLI no longer raise an unexpected exception when no encoding has been found (PR #70)
- Fix accessing the 'alphabets' property when the payload contains surrogate characters (PR #68)
- The logger could mislead (explain=True) on detected languages and the impact of one MBCS match (PR #72)
- Submatch factoring could be wrong in rare edge cases (PR #72)
- Multiple files given to the CLI were ignored when publishing results to STDOUT. (After the first path) (PR #72)
- Fix line endings from CRLF to LF for certain project files (PR #67)
### Changed
- Adjust the MD to lower the sensitivity, thus improving the global detection reliability (PR #69 #76)
- Allow fallback on specified encoding if any (PR #71)
## [2.0.3](https://github.com/Ousret/charset_normalizer/compare/2.0.2...2.0.3) (2021-07-16)
### Changed
- Part of the detection mechanism has been improved to be less sensitive, resulting in more accurate detection results. Especially ASCII. (PR #63)
- According to the community wishes, the detection will fall back on ASCII or UTF-8 in a last-resort case. (PR #64)
## [2.0.2](https://github.com/Ousret/charset_normalizer/compare/2.0.1...2.0.2) (2021-07-15)
### Fixed
- Empty/Too small JSON payload miss-detection fixed. Report from [@tseaver](https://github.com/tseaver) (PR #59)
### Changed
- Don't inject unicodedata2 into sys.modules from [@akx](https://github.com/akx) (PR #57)
## [2.0.1](https://github.com/Ousret/charset_normalizer/compare/2.0.0...2.0.1) (2021-07-13)
### Fixed
- Make it work where there isn't a filesystem available, dropping assets frequencies.json. Report from [@sethmlarson](https://github.com/sethmlarson). (PR #55)
- Using explain=False permanently disable the verbose output in the current runtime (PR #47)
- One log entry (language target preemptive) was not show in logs when using explain=True (PR #47)
- Fix undesired exception (ValueError) on getitem of instance CharsetMatches (PR #52)
### Changed
- Public function normalize default args values were not aligned with from_bytes (PR #53)
### Added
- You may now use charset aliases in cp_isolation and cp_exclusion arguments (PR #47)
## [2.0.0](https://github.com/Ousret/charset_normalizer/compare/1.4.1...2.0.0) (2021-07-02)
### Changed
- 4x to 5 times faster than the previous 1.4.0 release. At least 2x faster than Chardet.
- Accent has been made on UTF-8 detection, should perform rather instantaneous.
- The backward compatibility with Chardet has been greatly improved. The legacy detect function returns an identical charset name whenever possible.
- The detection mechanism has been slightly improved, now Turkish content is detected correctly (most of the time)
- The program has been rewritten to ease the readability and maintainability. (+Using static typing)+
- utf_7 detection has been reinstated.
### Removed
- This package no longer require anything when used with Python 3.5 (Dropped cached_property)
- Removed support for these languages: Catalan, Esperanto, Kazakh, Baque, Volapük, Azeri, Galician, Nynorsk, Macedonian, and Serbocroatian.
- The exception hook on UnicodeDecodeError has been removed.
### Deprecated
- Methods coherence_non_latin, w_counter, chaos_secondary_pass of the class CharsetMatch are now deprecated and scheduled for removal in v3.0
### Fixed
- The CLI output used the relative path of the file(s). Should be absolute.
## [1.4.1](https://github.com/Ousret/charset_normalizer/compare/1.4.0...1.4.1) (2021-05-28)
### Fixed
- Logger configuration/usage no longer conflict with others (PR #44)
## [1.4.0](https://github.com/Ousret/charset_normalizer/compare/1.3.9...1.4.0) (2021-05-21)
### Removed
- Using standard logging instead of using the package loguru.
- Dropping nose test framework in favor of the maintained pytest.
- Choose to not use dragonmapper package to help with gibberish Chinese/CJK text.
- Require cached_property only for Python 3.5 due to constraint. Dropping for every other interpreter version.
- Stop support for UTF-7 that does not contain a SIG.
- Dropping PrettyTable, replaced with pure JSON output in CLI.
### Fixed
- BOM marker in a CharsetNormalizerMatch instance could be False in rare cases even if obviously present. Due to the sub-match factoring process.
- Not searching properly for the BOM when trying utf32/16 parent codec.
### Changed
- Improving the package final size by compressing frequencies.json.
- Huge improvement over the larges payload.
### Added
- CLI now produces JSON consumable output.
- Return ASCII if given sequences fit. Given reasonable confidence.
## [1.3.9](https://github.com/Ousret/charset_normalizer/compare/1.3.8...1.3.9) (2021-05-13)
### Fixed
- In some very rare cases, you may end up getting encode/decode errors due to a bad bytes payload (PR #40)
## [1.3.8](https://github.com/Ousret/charset_normalizer/compare/1.3.7...1.3.8) (2021-05-12)
### Fixed
- Empty given payload for detection may cause an exception if trying to access the `alphabets` property. (PR #39)
## [1.3.7](https://github.com/Ousret/charset_normalizer/compare/1.3.6...1.3.7) (2021-05-12)
### Fixed
- The legacy detect function should return UTF-8-SIG if sig is present in the payload. (PR #38)
## [1.3.6](https://github.com/Ousret/charset_normalizer/compare/1.3.5...1.3.6) (2021-02-09)
### Changed
- Amend the previous release to allow prettytable 2.0 (PR #35)
## [1.3.5](https://github.com/Ousret/charset_normalizer/compare/1.3.4...1.3.5) (2021-02-08)
### Fixed
- Fix error while using the package with a python pre-release interpreter (PR #33)
### Changed
- Dependencies refactoring, constraints revised.
### Added
- Add python 3.9 and 3.10 to the supported interpreters
MIT License
Copyright (c) 2025 TAHRI Ahmed R.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,35 @@
../../../bin/normalizer,sha256=ybwi78W5hXBxrZF_P5GbryhPq3r8oPLo3R9i96xkfFI,253
charset_normalizer-3.4.4.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
charset_normalizer-3.4.4.dist-info/METADATA,sha256=jVuUFBti8dav19YLvWissTihVdF2ozUY4KKMw7jdkBQ,37303
charset_normalizer-3.4.4.dist-info/RECORD,,
charset_normalizer-3.4.4.dist-info/WHEEL,sha256=DxRnWQz-Kp9-4a4hdDHsSv0KUC3H7sN9Nbef3-8RjXU,190
charset_normalizer-3.4.4.dist-info/entry_points.txt,sha256=ADSTKrkXZ3hhdOVFi6DcUEHQRS0xfxDIE_pEz4wLIXA,65
charset_normalizer-3.4.4.dist-info/licenses/LICENSE,sha256=bQ1Bv-FwrGx9wkjJpj4lTQ-0WmDVCoJX0K-SxuJJuIc,1071
charset_normalizer-3.4.4.dist-info/top_level.txt,sha256=7ASyzePr8_xuZWJsnqJjIBtyV8vhEo0wBCv1MPRRi3Q,19
charset_normalizer/__init__.py,sha256=OKRxRv2Zhnqk00tqkN0c1BtJjm165fWXLydE52IKuHc,1590
charset_normalizer/__main__.py,sha256=yzYxMR-IhKRHYwcSlavEv8oGdwxsR89mr2X09qXGdps,109
charset_normalizer/__pycache__/__init__.cpython-312.pyc,,
charset_normalizer/__pycache__/__main__.cpython-312.pyc,,
charset_normalizer/__pycache__/api.cpython-312.pyc,,
charset_normalizer/__pycache__/cd.cpython-312.pyc,,
charset_normalizer/__pycache__/constant.cpython-312.pyc,,
charset_normalizer/__pycache__/legacy.cpython-312.pyc,,
charset_normalizer/__pycache__/md.cpython-312.pyc,,
charset_normalizer/__pycache__/models.cpython-312.pyc,,
charset_normalizer/__pycache__/utils.cpython-312.pyc,,
charset_normalizer/__pycache__/version.cpython-312.pyc,,
charset_normalizer/api.py,sha256=V07i8aVeCD8T2fSia3C-fn0i9t8qQguEBhsqszg32Ns,22668
charset_normalizer/cd.py,sha256=WKTo1HDb-H9HfCDc3Bfwq5jzS25Ziy9SE2a74SgTq88,12522
charset_normalizer/cli/__init__.py,sha256=D8I86lFk2-py45JvqxniTirSj_sFyE6sjaY_0-G1shc,136
charset_normalizer/cli/__main__.py,sha256=dMaXG6IJXRvqq8z2tig7Qb83-BpWTln55ooiku5_uvg,12646
charset_normalizer/cli/__pycache__/__init__.cpython-312.pyc,,
charset_normalizer/cli/__pycache__/__main__.cpython-312.pyc,,
charset_normalizer/constant.py,sha256=7UVY4ldYhmQMHUdgQ_sgZmzcQ0xxYxpBunqSZ-XJZ8U,42713
charset_normalizer/legacy.py,sha256=sYBzSpzsRrg_wF4LP536pG64BItw7Tqtc3SMQAHvFLM,2731
charset_normalizer/md.cpython-312-x86_64-linux-gnu.so,sha256=sZ7umtJLjKfA83NFJ7npkiDyr06zDT8cWtl6uIx2MsM,15912
charset_normalizer/md.py,sha256=-_oN3h3_X99nkFfqamD3yu45DC_wfk5odH0Tr_CQiXs,20145
charset_normalizer/md__mypyc.cpython-312-x86_64-linux-gnu.so,sha256=J2WWgLBQiO8sqdFsENp9u5V9uEH0tTwvTLszPdqhsv0,290584
charset_normalizer/models.py,sha256=lKXhOnIPtiakbK3i__J9wpOfzx3JDTKj7Dn3Rg0VaRI,12394
charset_normalizer/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
charset_normalizer/utils.py,sha256=sTejPgrdlNsKNucZfJCxJ95lMTLA0ShHLLE3n5wpT9Q,12170
charset_normalizer/version.py,sha256=nKE4qBNk5WA4LIJ_yIH_aSDfvtsyizkWMg-PUG-UZVk,115

View File

@@ -0,0 +1,7 @@
Wheel-Version: 1.0
Generator: setuptools (80.9.0)
Root-Is-Purelib: false
Tag: cp312-cp312-manylinux_2_17_x86_64
Tag: cp312-cp312-manylinux2014_x86_64
Tag: cp312-cp312-manylinux_2_28_x86_64

View File

@@ -0,0 +1,2 @@
[console_scripts]
normalizer = charset_normalizer.cli:cli_detect

View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 TAHRI Ahmed R.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1 @@
charset_normalizer

View File

@@ -0,0 +1,48 @@
"""
Charset-Normalizer
~~~~~~~~~~~~~~
The Real First Universal Charset Detector.
A library that helps you read text from an unknown charset encoding.
Motivated by chardet, This package is trying to resolve the issue by taking a new approach.
All IANA character set names for which the Python core library provides codecs are supported.
Basic usage:
>>> from charset_normalizer import from_bytes
>>> results = from_bytes('Bсеки човек има право на образование. Oбразованието!'.encode('utf_8'))
>>> best_guess = results.best()
>>> str(best_guess)
'Bсеки човек има право на образование. Oбразованието!'
Others methods and usages are available - see the full documentation
at <https://github.com/Ousret/charset_normalizer>.
:copyright: (c) 2021 by Ahmed TAHRI
:license: MIT, see LICENSE for more details.
"""
from __future__ import annotations
import logging
from .api import from_bytes, from_fp, from_path, is_binary
from .legacy import detect
from .models import CharsetMatch, CharsetMatches
from .utils import set_logging_handler
from .version import VERSION, __version__
__all__ = (
"from_fp",
"from_path",
"from_bytes",
"is_binary",
"detect",
"CharsetMatch",
"CharsetMatches",
"__version__",
"VERSION",
"set_logging_handler",
)
# Attach a NullHandler to the top level logger by default
# https://docs.python.org/3.3/howto/logging.html#configuring-logging-for-a-library
logging.getLogger("charset_normalizer").addHandler(logging.NullHandler())

View File

@@ -0,0 +1,6 @@
from __future__ import annotations
from .cli import cli_detect
if __name__ == "__main__":
cli_detect()

View File

@@ -0,0 +1,669 @@
from __future__ import annotations
import logging
from os import PathLike
from typing import BinaryIO
from .cd import (
coherence_ratio,
encoding_languages,
mb_encoding_languages,
merge_coherence_ratios,
)
from .constant import IANA_SUPPORTED, TOO_BIG_SEQUENCE, TOO_SMALL_SEQUENCE, TRACE
from .md import mess_ratio
from .models import CharsetMatch, CharsetMatches
from .utils import (
any_specified_encoding,
cut_sequence_chunks,
iana_name,
identify_sig_or_bom,
is_cp_similar,
is_multi_byte_encoding,
should_strip_sig_or_bom,
)
logger = logging.getLogger("charset_normalizer")
explain_handler = logging.StreamHandler()
explain_handler.setFormatter(
logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
)
def from_bytes(
sequences: bytes | bytearray,
steps: int = 5,
chunk_size: int = 512,
threshold: float = 0.2,
cp_isolation: list[str] | None = None,
cp_exclusion: list[str] | None = None,
preemptive_behaviour: bool = True,
explain: bool = False,
language_threshold: float = 0.1,
enable_fallback: bool = True,
) -> CharsetMatches:
"""
Given a raw bytes sequence, return the best possibles charset usable to render str objects.
If there is no results, it is a strong indicator that the source is binary/not text.
By default, the process will extract 5 blocks of 512o each to assess the mess and coherence of a given sequence.
And will give up a particular code page after 20% of measured mess. Those criteria are customizable at will.
The preemptive behavior DOES NOT replace the traditional detection workflow, it prioritize a particular code page
but never take it for granted. Can improve the performance.
You may want to focus your attention to some code page or/and not others, use cp_isolation and cp_exclusion for that
purpose.
This function will strip the SIG in the payload/sequence every time except on UTF-16, UTF-32.
By default the library does not setup any handler other than the NullHandler, if you choose to set the 'explain'
toggle to True it will alter the logger configuration to add a StreamHandler that is suitable for debugging.
Custom logging format and handler can be set manually.
"""
if not isinstance(sequences, (bytearray, bytes)):
raise TypeError(
"Expected object of type bytes or bytearray, got: {}".format(
type(sequences)
)
)
if explain:
previous_logger_level: int = logger.level
logger.addHandler(explain_handler)
logger.setLevel(TRACE)
length: int = len(sequences)
if length == 0:
logger.debug("Encoding detection on empty bytes, assuming utf_8 intention.")
if explain: # Defensive: ensure exit path clean handler
logger.removeHandler(explain_handler)
logger.setLevel(previous_logger_level or logging.WARNING)
return CharsetMatches([CharsetMatch(sequences, "utf_8", 0.0, False, [], "")])
if cp_isolation is not None:
logger.log(
TRACE,
"cp_isolation is set. use this flag for debugging purpose. "
"limited list of encoding allowed : %s.",
", ".join(cp_isolation),
)
cp_isolation = [iana_name(cp, False) for cp in cp_isolation]
else:
cp_isolation = []
if cp_exclusion is not None:
logger.log(
TRACE,
"cp_exclusion is set. use this flag for debugging purpose. "
"limited list of encoding excluded : %s.",
", ".join(cp_exclusion),
)
cp_exclusion = [iana_name(cp, False) for cp in cp_exclusion]
else:
cp_exclusion = []
if length <= (chunk_size * steps):
logger.log(
TRACE,
"override steps (%i) and chunk_size (%i) as content does not fit (%i byte(s) given) parameters.",
steps,
chunk_size,
length,
)
steps = 1
chunk_size = length
if steps > 1 and length / steps < chunk_size:
chunk_size = int(length / steps)
is_too_small_sequence: bool = len(sequences) < TOO_SMALL_SEQUENCE
is_too_large_sequence: bool = len(sequences) >= TOO_BIG_SEQUENCE
if is_too_small_sequence:
logger.log(
TRACE,
"Trying to detect encoding from a tiny portion of ({}) byte(s).".format(
length
),
)
elif is_too_large_sequence:
logger.log(
TRACE,
"Using lazy str decoding because the payload is quite large, ({}) byte(s).".format(
length
),
)
prioritized_encodings: list[str] = []
specified_encoding: str | None = (
any_specified_encoding(sequences) if preemptive_behaviour else None
)
if specified_encoding is not None:
prioritized_encodings.append(specified_encoding)
logger.log(
TRACE,
"Detected declarative mark in sequence. Priority +1 given for %s.",
specified_encoding,
)
tested: set[str] = set()
tested_but_hard_failure: list[str] = []
tested_but_soft_failure: list[str] = []
fallback_ascii: CharsetMatch | None = None
fallback_u8: CharsetMatch | None = None
fallback_specified: CharsetMatch | None = None
results: CharsetMatches = CharsetMatches()
early_stop_results: CharsetMatches = CharsetMatches()
sig_encoding, sig_payload = identify_sig_or_bom(sequences)
if sig_encoding is not None:
prioritized_encodings.append(sig_encoding)
logger.log(
TRACE,
"Detected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.",
len(sig_payload),
sig_encoding,
)
prioritized_encodings.append("ascii")
if "utf_8" not in prioritized_encodings:
prioritized_encodings.append("utf_8")
for encoding_iana in prioritized_encodings + IANA_SUPPORTED:
if cp_isolation and encoding_iana not in cp_isolation:
continue
if cp_exclusion and encoding_iana in cp_exclusion:
continue
if encoding_iana in tested:
continue
tested.add(encoding_iana)
decoded_payload: str | None = None
bom_or_sig_available: bool = sig_encoding == encoding_iana
strip_sig_or_bom: bool = bom_or_sig_available and should_strip_sig_or_bom(
encoding_iana
)
if encoding_iana in {"utf_16", "utf_32"} and not bom_or_sig_available:
logger.log(
TRACE,
"Encoding %s won't be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.",
encoding_iana,
)
continue
if encoding_iana in {"utf_7"} and not bom_or_sig_available:
logger.log(
TRACE,
"Encoding %s won't be tested as-is because detection is unreliable without BOM/SIG.",
encoding_iana,
)
continue
try:
is_multi_byte_decoder: bool = is_multi_byte_encoding(encoding_iana)
except (ModuleNotFoundError, ImportError):
logger.log(
TRACE,
"Encoding %s does not provide an IncrementalDecoder",
encoding_iana,
)
continue
try:
if is_too_large_sequence and is_multi_byte_decoder is False:
str(
(
sequences[: int(50e4)]
if strip_sig_or_bom is False
else sequences[len(sig_payload) : int(50e4)]
),
encoding=encoding_iana,
)
else:
decoded_payload = str(
(
sequences
if strip_sig_or_bom is False
else sequences[len(sig_payload) :]
),
encoding=encoding_iana,
)
except (UnicodeDecodeError, LookupError) as e:
if not isinstance(e, LookupError):
logger.log(
TRACE,
"Code page %s does not fit given bytes sequence at ALL. %s",
encoding_iana,
str(e),
)
tested_but_hard_failure.append(encoding_iana)
continue
similar_soft_failure_test: bool = False
for encoding_soft_failed in tested_but_soft_failure:
if is_cp_similar(encoding_iana, encoding_soft_failed):
similar_soft_failure_test = True
break
if similar_soft_failure_test:
logger.log(
TRACE,
"%s is deemed too similar to code page %s and was consider unsuited already. Continuing!",
encoding_iana,
encoding_soft_failed,
)
continue
r_ = range(
0 if not bom_or_sig_available else len(sig_payload),
length,
int(length / steps),
)
multi_byte_bonus: bool = (
is_multi_byte_decoder
and decoded_payload is not None
and len(decoded_payload) < length
)
if multi_byte_bonus:
logger.log(
TRACE,
"Code page %s is a multi byte encoding table and it appear that at least one character "
"was encoded using n-bytes.",
encoding_iana,
)
max_chunk_gave_up: int = int(len(r_) / 4)
max_chunk_gave_up = max(max_chunk_gave_up, 2)
early_stop_count: int = 0
lazy_str_hard_failure = False
md_chunks: list[str] = []
md_ratios = []
try:
for chunk in cut_sequence_chunks(
sequences,
encoding_iana,
r_,
chunk_size,
bom_or_sig_available,
strip_sig_or_bom,
sig_payload,
is_multi_byte_decoder,
decoded_payload,
):
md_chunks.append(chunk)
md_ratios.append(
mess_ratio(
chunk,
threshold,
explain is True and 1 <= len(cp_isolation) <= 2,
)
)
if md_ratios[-1] >= threshold:
early_stop_count += 1
if (early_stop_count >= max_chunk_gave_up) or (
bom_or_sig_available and strip_sig_or_bom is False
):
break
except (
UnicodeDecodeError
) as e: # Lazy str loading may have missed something there
logger.log(
TRACE,
"LazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. %s",
encoding_iana,
str(e),
)
early_stop_count = max_chunk_gave_up
lazy_str_hard_failure = True
# We might want to check the sequence again with the whole content
# Only if initial MD tests passes
if (
not lazy_str_hard_failure
and is_too_large_sequence
and not is_multi_byte_decoder
):
try:
sequences[int(50e3) :].decode(encoding_iana, errors="strict")
except UnicodeDecodeError as e:
logger.log(
TRACE,
"LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %s",
encoding_iana,
str(e),
)
tested_but_hard_failure.append(encoding_iana)
continue
mean_mess_ratio: float = sum(md_ratios) / len(md_ratios) if md_ratios else 0.0
if mean_mess_ratio >= threshold or early_stop_count >= max_chunk_gave_up:
tested_but_soft_failure.append(encoding_iana)
logger.log(
TRACE,
"%s was excluded because of initial chaos probing. Gave up %i time(s). "
"Computed mean chaos is %f %%.",
encoding_iana,
early_stop_count,
round(mean_mess_ratio * 100, ndigits=3),
)
# Preparing those fallbacks in case we got nothing.
if (
enable_fallback
and encoding_iana
in ["ascii", "utf_8", specified_encoding, "utf_16", "utf_32"]
and not lazy_str_hard_failure
):
fallback_entry = CharsetMatch(
sequences,
encoding_iana,
threshold,
bom_or_sig_available,
[],
decoded_payload,
preemptive_declaration=specified_encoding,
)
if encoding_iana == specified_encoding:
fallback_specified = fallback_entry
elif encoding_iana == "ascii":
fallback_ascii = fallback_entry
else:
fallback_u8 = fallback_entry
continue
logger.log(
TRACE,
"%s passed initial chaos probing. Mean measured chaos is %f %%",
encoding_iana,
round(mean_mess_ratio * 100, ndigits=3),
)
if not is_multi_byte_decoder:
target_languages: list[str] = encoding_languages(encoding_iana)
else:
target_languages = mb_encoding_languages(encoding_iana)
if target_languages:
logger.log(
TRACE,
"{} should target any language(s) of {}".format(
encoding_iana, str(target_languages)
),
)
cd_ratios = []
# We shall skip the CD when its about ASCII
# Most of the time its not relevant to run "language-detection" on it.
if encoding_iana != "ascii":
for chunk in md_chunks:
chunk_languages = coherence_ratio(
chunk,
language_threshold,
",".join(target_languages) if target_languages else None,
)
cd_ratios.append(chunk_languages)
cd_ratios_merged = merge_coherence_ratios(cd_ratios)
if cd_ratios_merged:
logger.log(
TRACE,
"We detected language {} using {}".format(
cd_ratios_merged, encoding_iana
),
)
current_match = CharsetMatch(
sequences,
encoding_iana,
mean_mess_ratio,
bom_or_sig_available,
cd_ratios_merged,
(
decoded_payload
if (
is_too_large_sequence is False
or encoding_iana in [specified_encoding, "ascii", "utf_8"]
)
else None
),
preemptive_declaration=specified_encoding,
)
results.append(current_match)
if (
encoding_iana in [specified_encoding, "ascii", "utf_8"]
and mean_mess_ratio < 0.1
):
# If md says nothing to worry about, then... stop immediately!
if mean_mess_ratio == 0.0:
logger.debug(
"Encoding detection: %s is most likely the one.",
current_match.encoding,
)
if explain: # Defensive: ensure exit path clean handler
logger.removeHandler(explain_handler)
logger.setLevel(previous_logger_level)
return CharsetMatches([current_match])
early_stop_results.append(current_match)
if (
len(early_stop_results)
and (specified_encoding is None or specified_encoding in tested)
and "ascii" in tested
and "utf_8" in tested
):
probable_result: CharsetMatch = early_stop_results.best() # type: ignore[assignment]
logger.debug(
"Encoding detection: %s is most likely the one.",
probable_result.encoding,
)
if explain: # Defensive: ensure exit path clean handler
logger.removeHandler(explain_handler)
logger.setLevel(previous_logger_level)
return CharsetMatches([probable_result])
if encoding_iana == sig_encoding:
logger.debug(
"Encoding detection: %s is most likely the one as we detected a BOM or SIG within "
"the beginning of the sequence.",
encoding_iana,
)
if explain: # Defensive: ensure exit path clean handler
logger.removeHandler(explain_handler)
logger.setLevel(previous_logger_level)
return CharsetMatches([results[encoding_iana]])
if len(results) == 0:
if fallback_u8 or fallback_ascii or fallback_specified:
logger.log(
TRACE,
"Nothing got out of the detection process. Using ASCII/UTF-8/Specified fallback.",
)
if fallback_specified:
logger.debug(
"Encoding detection: %s will be used as a fallback match",
fallback_specified.encoding,
)
results.append(fallback_specified)
elif (
(fallback_u8 and fallback_ascii is None)
or (
fallback_u8
and fallback_ascii
and fallback_u8.fingerprint != fallback_ascii.fingerprint
)
or (fallback_u8 is not None)
):
logger.debug("Encoding detection: utf_8 will be used as a fallback match")
results.append(fallback_u8)
elif fallback_ascii:
logger.debug("Encoding detection: ascii will be used as a fallback match")
results.append(fallback_ascii)
if results:
logger.debug(
"Encoding detection: Found %s as plausible (best-candidate) for content. With %i alternatives.",
results.best().encoding, # type: ignore
len(results) - 1,
)
else:
logger.debug("Encoding detection: Unable to determine any suitable charset.")
if explain:
logger.removeHandler(explain_handler)
logger.setLevel(previous_logger_level)
return results
def from_fp(
fp: BinaryIO,
steps: int = 5,
chunk_size: int = 512,
threshold: float = 0.20,
cp_isolation: list[str] | None = None,
cp_exclusion: list[str] | None = None,
preemptive_behaviour: bool = True,
explain: bool = False,
language_threshold: float = 0.1,
enable_fallback: bool = True,
) -> CharsetMatches:
"""
Same thing than the function from_bytes but using a file pointer that is already ready.
Will not close the file pointer.
"""
return from_bytes(
fp.read(),
steps,
chunk_size,
threshold,
cp_isolation,
cp_exclusion,
preemptive_behaviour,
explain,
language_threshold,
enable_fallback,
)
def from_path(
path: str | bytes | PathLike, # type: ignore[type-arg]
steps: int = 5,
chunk_size: int = 512,
threshold: float = 0.20,
cp_isolation: list[str] | None = None,
cp_exclusion: list[str] | None = None,
preemptive_behaviour: bool = True,
explain: bool = False,
language_threshold: float = 0.1,
enable_fallback: bool = True,
) -> CharsetMatches:
"""
Same thing than the function from_bytes but with one extra step. Opening and reading given file path in binary mode.
Can raise IOError.
"""
with open(path, "rb") as fp:
return from_fp(
fp,
steps,
chunk_size,
threshold,
cp_isolation,
cp_exclusion,
preemptive_behaviour,
explain,
language_threshold,
enable_fallback,
)
def is_binary(
fp_or_path_or_payload: PathLike | str | BinaryIO | bytes, # type: ignore[type-arg]
steps: int = 5,
chunk_size: int = 512,
threshold: float = 0.20,
cp_isolation: list[str] | None = None,
cp_exclusion: list[str] | None = None,
preemptive_behaviour: bool = True,
explain: bool = False,
language_threshold: float = 0.1,
enable_fallback: bool = False,
) -> bool:
"""
Detect if the given input (file, bytes, or path) points to a binary file. aka. not a string.
Based on the same main heuristic algorithms and default kwargs at the sole exception that fallbacks match
are disabled to be stricter around ASCII-compatible but unlikely to be a string.
"""
if isinstance(fp_or_path_or_payload, (str, PathLike)):
guesses = from_path(
fp_or_path_or_payload,
steps=steps,
chunk_size=chunk_size,
threshold=threshold,
cp_isolation=cp_isolation,
cp_exclusion=cp_exclusion,
preemptive_behaviour=preemptive_behaviour,
explain=explain,
language_threshold=language_threshold,
enable_fallback=enable_fallback,
)
elif isinstance(
fp_or_path_or_payload,
(
bytes,
bytearray,
),
):
guesses = from_bytes(
fp_or_path_or_payload,
steps=steps,
chunk_size=chunk_size,
threshold=threshold,
cp_isolation=cp_isolation,
cp_exclusion=cp_exclusion,
preemptive_behaviour=preemptive_behaviour,
explain=explain,
language_threshold=language_threshold,
enable_fallback=enable_fallback,
)
else:
guesses = from_fp(
fp_or_path_or_payload,
steps=steps,
chunk_size=chunk_size,
threshold=threshold,
cp_isolation=cp_isolation,
cp_exclusion=cp_exclusion,
preemptive_behaviour=preemptive_behaviour,
explain=explain,
language_threshold=language_threshold,
enable_fallback=enable_fallback,
)
return not guesses

View File

@@ -0,0 +1,395 @@
from __future__ import annotations
import importlib
from codecs import IncrementalDecoder
from collections import Counter
from functools import lru_cache
from typing import Counter as TypeCounter
from .constant import (
FREQUENCIES,
KO_NAMES,
LANGUAGE_SUPPORTED_COUNT,
TOO_SMALL_SEQUENCE,
ZH_NAMES,
)
from .md import is_suspiciously_successive_range
from .models import CoherenceMatches
from .utils import (
is_accentuated,
is_latin,
is_multi_byte_encoding,
is_unicode_range_secondary,
unicode_range,
)
def encoding_unicode_range(iana_name: str) -> list[str]:
"""
Return associated unicode ranges in a single byte code page.
"""
if is_multi_byte_encoding(iana_name):
raise OSError("Function not supported on multi-byte code page")
decoder = importlib.import_module(f"encodings.{iana_name}").IncrementalDecoder
p: IncrementalDecoder = decoder(errors="ignore")
seen_ranges: dict[str, int] = {}
character_count: int = 0
for i in range(0x40, 0xFF):
chunk: str = p.decode(bytes([i]))
if chunk:
character_range: str | None = unicode_range(chunk)
if character_range is None:
continue
if is_unicode_range_secondary(character_range) is False:
if character_range not in seen_ranges:
seen_ranges[character_range] = 0
seen_ranges[character_range] += 1
character_count += 1
return sorted(
[
character_range
for character_range in seen_ranges
if seen_ranges[character_range] / character_count >= 0.15
]
)
def unicode_range_languages(primary_range: str) -> list[str]:
"""
Return inferred languages used with a unicode range.
"""
languages: list[str] = []
for language, characters in FREQUENCIES.items():
for character in characters:
if unicode_range(character) == primary_range:
languages.append(language)
break
return languages
@lru_cache()
def encoding_languages(iana_name: str) -> list[str]:
"""
Single-byte encoding language association. Some code page are heavily linked to particular language(s).
This function does the correspondence.
"""
unicode_ranges: list[str] = encoding_unicode_range(iana_name)
primary_range: str | None = None
for specified_range in unicode_ranges:
if "Latin" not in specified_range:
primary_range = specified_range
break
if primary_range is None:
return ["Latin Based"]
return unicode_range_languages(primary_range)
@lru_cache()
def mb_encoding_languages(iana_name: str) -> list[str]:
"""
Multi-byte encoding language association. Some code page are heavily linked to particular language(s).
This function does the correspondence.
"""
if (
iana_name.startswith("shift_")
or iana_name.startswith("iso2022_jp")
or iana_name.startswith("euc_j")
or iana_name == "cp932"
):
return ["Japanese"]
if iana_name.startswith("gb") or iana_name in ZH_NAMES:
return ["Chinese"]
if iana_name.startswith("iso2022_kr") or iana_name in KO_NAMES:
return ["Korean"]
return []
@lru_cache(maxsize=LANGUAGE_SUPPORTED_COUNT)
def get_target_features(language: str) -> tuple[bool, bool]:
"""
Determine main aspects from a supported language if it contains accents and if is pure Latin.
"""
target_have_accents: bool = False
target_pure_latin: bool = True
for character in FREQUENCIES[language]:
if not target_have_accents and is_accentuated(character):
target_have_accents = True
if target_pure_latin and is_latin(character) is False:
target_pure_latin = False
return target_have_accents, target_pure_latin
def alphabet_languages(
characters: list[str], ignore_non_latin: bool = False
) -> list[str]:
"""
Return associated languages associated to given characters.
"""
languages: list[tuple[str, float]] = []
source_have_accents = any(is_accentuated(character) for character in characters)
for language, language_characters in FREQUENCIES.items():
target_have_accents, target_pure_latin = get_target_features(language)
if ignore_non_latin and target_pure_latin is False:
continue
if target_have_accents is False and source_have_accents:
continue
character_count: int = len(language_characters)
character_match_count: int = len(
[c for c in language_characters if c in characters]
)
ratio: float = character_match_count / character_count
if ratio >= 0.2:
languages.append((language, ratio))
languages = sorted(languages, key=lambda x: x[1], reverse=True)
return [compatible_language[0] for compatible_language in languages]
def characters_popularity_compare(
language: str, ordered_characters: list[str]
) -> float:
"""
Determine if a ordered characters list (by occurrence from most appearance to rarest) match a particular language.
The result is a ratio between 0. (absolutely no correspondence) and 1. (near perfect fit).
Beware that is function is not strict on the match in order to ease the detection. (Meaning close match is 1.)
"""
if language not in FREQUENCIES:
raise ValueError(f"{language} not available")
character_approved_count: int = 0
FREQUENCIES_language_set = set(FREQUENCIES[language])
ordered_characters_count: int = len(ordered_characters)
target_language_characters_count: int = len(FREQUENCIES[language])
large_alphabet: bool = target_language_characters_count > 26
for character, character_rank in zip(
ordered_characters, range(0, ordered_characters_count)
):
if character not in FREQUENCIES_language_set:
continue
character_rank_in_language: int = FREQUENCIES[language].index(character)
expected_projection_ratio: float = (
target_language_characters_count / ordered_characters_count
)
character_rank_projection: int = int(character_rank * expected_projection_ratio)
if (
large_alphabet is False
and abs(character_rank_projection - character_rank_in_language) > 4
):
continue
if (
large_alphabet is True
and abs(character_rank_projection - character_rank_in_language)
< target_language_characters_count / 3
):
character_approved_count += 1
continue
characters_before_source: list[str] = FREQUENCIES[language][
0:character_rank_in_language
]
characters_after_source: list[str] = FREQUENCIES[language][
character_rank_in_language:
]
characters_before: list[str] = ordered_characters[0:character_rank]
characters_after: list[str] = ordered_characters[character_rank:]
before_match_count: int = len(
set(characters_before) & set(characters_before_source)
)
after_match_count: int = len(
set(characters_after) & set(characters_after_source)
)
if len(characters_before_source) == 0 and before_match_count <= 4:
character_approved_count += 1
continue
if len(characters_after_source) == 0 and after_match_count <= 4:
character_approved_count += 1
continue
if (
before_match_count / len(characters_before_source) >= 0.4
or after_match_count / len(characters_after_source) >= 0.4
):
character_approved_count += 1
continue
return character_approved_count / len(ordered_characters)
def alpha_unicode_split(decoded_sequence: str) -> list[str]:
"""
Given a decoded text sequence, return a list of str. Unicode range / alphabet separation.
Ex. a text containing English/Latin with a bit a Hebrew will return two items in the resulting list;
One containing the latin letters and the other hebrew.
"""
layers: dict[str, str] = {}
for character in decoded_sequence:
if character.isalpha() is False:
continue
character_range: str | None = unicode_range(character)
if character_range is None:
continue
layer_target_range: str | None = None
for discovered_range in layers:
if (
is_suspiciously_successive_range(discovered_range, character_range)
is False
):
layer_target_range = discovered_range
break
if layer_target_range is None:
layer_target_range = character_range
if layer_target_range not in layers:
layers[layer_target_range] = character.lower()
continue
layers[layer_target_range] += character.lower()
return list(layers.values())
def merge_coherence_ratios(results: list[CoherenceMatches]) -> CoherenceMatches:
"""
This function merge results previously given by the function coherence_ratio.
The return type is the same as coherence_ratio.
"""
per_language_ratios: dict[str, list[float]] = {}
for result in results:
for sub_result in result:
language, ratio = sub_result
if language not in per_language_ratios:
per_language_ratios[language] = [ratio]
continue
per_language_ratios[language].append(ratio)
merge = [
(
language,
round(
sum(per_language_ratios[language]) / len(per_language_ratios[language]),
4,
),
)
for language in per_language_ratios
]
return sorted(merge, key=lambda x: x[1], reverse=True)
def filter_alt_coherence_matches(results: CoherenceMatches) -> CoherenceMatches:
"""
We shall NOT return "English—" in CoherenceMatches because it is an alternative
of "English". This function only keeps the best match and remove the em-dash in it.
"""
index_results: dict[str, list[float]] = dict()
for result in results:
language, ratio = result
no_em_name: str = language.replace("", "")
if no_em_name not in index_results:
index_results[no_em_name] = []
index_results[no_em_name].append(ratio)
if any(len(index_results[e]) > 1 for e in index_results):
filtered_results: CoherenceMatches = []
for language in index_results:
filtered_results.append((language, max(index_results[language])))
return filtered_results
return results
@lru_cache(maxsize=2048)
def coherence_ratio(
decoded_sequence: str, threshold: float = 0.1, lg_inclusion: str | None = None
) -> CoherenceMatches:
"""
Detect ANY language that can be identified in given sequence. The sequence will be analysed by layers.
A layer = Character extraction by alphabets/ranges.
"""
results: list[tuple[str, float]] = []
ignore_non_latin: bool = False
sufficient_match_count: int = 0
lg_inclusion_list = lg_inclusion.split(",") if lg_inclusion is not None else []
if "Latin Based" in lg_inclusion_list:
ignore_non_latin = True
lg_inclusion_list.remove("Latin Based")
for layer in alpha_unicode_split(decoded_sequence):
sequence_frequencies: TypeCounter[str] = Counter(layer)
most_common = sequence_frequencies.most_common()
character_count: int = sum(o for c, o in most_common)
if character_count <= TOO_SMALL_SEQUENCE:
continue
popular_character_ordered: list[str] = [c for c, o in most_common]
for language in lg_inclusion_list or alphabet_languages(
popular_character_ordered, ignore_non_latin
):
ratio: float = characters_popularity_compare(
language, popular_character_ordered
)
if ratio < threshold:
continue
elif ratio >= 0.8:
sufficient_match_count += 1
results.append((language, round(ratio, 4)))
if sufficient_match_count >= 3:
break
return sorted(
filter_alt_coherence_matches(results), key=lambda x: x[1], reverse=True
)

View File

@@ -0,0 +1,8 @@
from __future__ import annotations
from .__main__ import cli_detect, query_yes_no
__all__ = (
"cli_detect",
"query_yes_no",
)

View File

@@ -0,0 +1,381 @@
from __future__ import annotations
import argparse
import sys
import typing
from json import dumps
from os.path import abspath, basename, dirname, join, realpath
from platform import python_version
from unicodedata import unidata_version
import charset_normalizer.md as md_module
from charset_normalizer import from_fp
from charset_normalizer.models import CliDetectionResult
from charset_normalizer.version import __version__
def query_yes_no(question: str, default: str = "yes") -> bool:
"""Ask a yes/no question via input() and return their answer.
"question" is a string that is presented to the user.
"default" is the presumed answer if the user just hits <Enter>.
It must be "yes" (the default), "no" or None (meaning
an answer is required of the user).
The "answer" return value is True for "yes" or False for "no".
Credit goes to (c) https://stackoverflow.com/questions/3041986/apt-command-line-interface-like-yes-no-input
"""
valid = {"yes": True, "y": True, "ye": True, "no": False, "n": False}
if default is None:
prompt = " [y/n] "
elif default == "yes":
prompt = " [Y/n] "
elif default == "no":
prompt = " [y/N] "
else:
raise ValueError("invalid default answer: '%s'" % default)
while True:
sys.stdout.write(question + prompt)
choice = input().lower()
if default is not None and choice == "":
return valid[default]
elif choice in valid:
return valid[choice]
else:
sys.stdout.write("Please respond with 'yes' or 'no' (or 'y' or 'n').\n")
class FileType:
"""Factory for creating file object types
Instances of FileType are typically passed as type= arguments to the
ArgumentParser add_argument() method.
Keyword Arguments:
- mode -- A string indicating how the file is to be opened. Accepts the
same values as the builtin open() function.
- bufsize -- The file's desired buffer size. Accepts the same values as
the builtin open() function.
- encoding -- The file's encoding. Accepts the same values as the
builtin open() function.
- errors -- A string indicating how encoding and decoding errors are to
be handled. Accepts the same value as the builtin open() function.
Backported from CPython 3.12
"""
def __init__(
self,
mode: str = "r",
bufsize: int = -1,
encoding: str | None = None,
errors: str | None = None,
):
self._mode = mode
self._bufsize = bufsize
self._encoding = encoding
self._errors = errors
def __call__(self, string: str) -> typing.IO: # type: ignore[type-arg]
# the special argument "-" means sys.std{in,out}
if string == "-":
if "r" in self._mode:
return sys.stdin.buffer if "b" in self._mode else sys.stdin
elif any(c in self._mode for c in "wax"):
return sys.stdout.buffer if "b" in self._mode else sys.stdout
else:
msg = f'argument "-" with mode {self._mode}'
raise ValueError(msg)
# all other arguments are used as file names
try:
return open(string, self._mode, self._bufsize, self._encoding, self._errors)
except OSError as e:
message = f"can't open '{string}': {e}"
raise argparse.ArgumentTypeError(message)
def __repr__(self) -> str:
args = self._mode, self._bufsize
kwargs = [("encoding", self._encoding), ("errors", self._errors)]
args_str = ", ".join(
[repr(arg) for arg in args if arg != -1]
+ [f"{kw}={arg!r}" for kw, arg in kwargs if arg is not None]
)
return f"{type(self).__name__}({args_str})"
def cli_detect(argv: list[str] | None = None) -> int:
"""
CLI assistant using ARGV and ArgumentParser
:param argv:
:return: 0 if everything is fine, anything else equal trouble
"""
parser = argparse.ArgumentParser(
description="The Real First Universal Charset Detector. "
"Discover originating encoding used on text file. "
"Normalize text to unicode."
)
parser.add_argument(
"files", type=FileType("rb"), nargs="+", help="File(s) to be analysed"
)
parser.add_argument(
"-v",
"--verbose",
action="store_true",
default=False,
dest="verbose",
help="Display complementary information about file if any. "
"Stdout will contain logs about the detection process.",
)
parser.add_argument(
"-a",
"--with-alternative",
action="store_true",
default=False,
dest="alternatives",
help="Output complementary possibilities if any. Top-level JSON WILL be a list.",
)
parser.add_argument(
"-n",
"--normalize",
action="store_true",
default=False,
dest="normalize",
help="Permit to normalize input file. If not set, program does not write anything.",
)
parser.add_argument(
"-m",
"--minimal",
action="store_true",
default=False,
dest="minimal",
help="Only output the charset detected to STDOUT. Disabling JSON output.",
)
parser.add_argument(
"-r",
"--replace",
action="store_true",
default=False,
dest="replace",
help="Replace file when trying to normalize it instead of creating a new one.",
)
parser.add_argument(
"-f",
"--force",
action="store_true",
default=False,
dest="force",
help="Replace file without asking if you are sure, use this flag with caution.",
)
parser.add_argument(
"-i",
"--no-preemptive",
action="store_true",
default=False,
dest="no_preemptive",
help="Disable looking at a charset declaration to hint the detector.",
)
parser.add_argument(
"-t",
"--threshold",
action="store",
default=0.2,
type=float,
dest="threshold",
help="Define a custom maximum amount of noise allowed in decoded content. 0. <= noise <= 1.",
)
parser.add_argument(
"--version",
action="version",
version="Charset-Normalizer {} - Python {} - Unicode {} - SpeedUp {}".format(
__version__,
python_version(),
unidata_version,
"OFF" if md_module.__file__.lower().endswith(".py") else "ON",
),
help="Show version information and exit.",
)
args = parser.parse_args(argv)
if args.replace is True and args.normalize is False:
if args.files:
for my_file in args.files:
my_file.close()
print("Use --replace in addition of --normalize only.", file=sys.stderr)
return 1
if args.force is True and args.replace is False:
if args.files:
for my_file in args.files:
my_file.close()
print("Use --force in addition of --replace only.", file=sys.stderr)
return 1
if args.threshold < 0.0 or args.threshold > 1.0:
if args.files:
for my_file in args.files:
my_file.close()
print("--threshold VALUE should be between 0. AND 1.", file=sys.stderr)
return 1
x_ = []
for my_file in args.files:
matches = from_fp(
my_file,
threshold=args.threshold,
explain=args.verbose,
preemptive_behaviour=args.no_preemptive is False,
)
best_guess = matches.best()
if best_guess is None:
print(
'Unable to identify originating encoding for "{}". {}'.format(
my_file.name,
(
"Maybe try increasing maximum amount of chaos."
if args.threshold < 1.0
else ""
),
),
file=sys.stderr,
)
x_.append(
CliDetectionResult(
abspath(my_file.name),
None,
[],
[],
"Unknown",
[],
False,
1.0,
0.0,
None,
True,
)
)
else:
x_.append(
CliDetectionResult(
abspath(my_file.name),
best_guess.encoding,
best_guess.encoding_aliases,
[
cp
for cp in best_guess.could_be_from_charset
if cp != best_guess.encoding
],
best_guess.language,
best_guess.alphabets,
best_guess.bom,
best_guess.percent_chaos,
best_guess.percent_coherence,
None,
True,
)
)
if len(matches) > 1 and args.alternatives:
for el in matches:
if el != best_guess:
x_.append(
CliDetectionResult(
abspath(my_file.name),
el.encoding,
el.encoding_aliases,
[
cp
for cp in el.could_be_from_charset
if cp != el.encoding
],
el.language,
el.alphabets,
el.bom,
el.percent_chaos,
el.percent_coherence,
None,
False,
)
)
if args.normalize is True:
if best_guess.encoding.startswith("utf") is True:
print(
'"{}" file does not need to be normalized, as it already came from unicode.'.format(
my_file.name
),
file=sys.stderr,
)
if my_file.closed is False:
my_file.close()
continue
dir_path = dirname(realpath(my_file.name))
file_name = basename(realpath(my_file.name))
o_: list[str] = file_name.split(".")
if args.replace is False:
o_.insert(-1, best_guess.encoding)
if my_file.closed is False:
my_file.close()
elif (
args.force is False
and query_yes_no(
'Are you sure to normalize "{}" by replacing it ?'.format(
my_file.name
),
"no",
)
is False
):
if my_file.closed is False:
my_file.close()
continue
try:
x_[0].unicode_path = join(dir_path, ".".join(o_))
with open(x_[0].unicode_path, "wb") as fp:
fp.write(best_guess.output())
except OSError as e:
print(str(e), file=sys.stderr)
if my_file.closed is False:
my_file.close()
return 2
if my_file.closed is False:
my_file.close()
if args.minimal is False:
print(
dumps(
[el.__dict__ for el in x_] if len(x_) > 1 else x_[0].__dict__,
ensure_ascii=True,
indent=4,
)
)
else:
for my_file in args.files:
print(
", ".join(
[
el.encoding or "undefined"
for el in x_
if el.path == abspath(my_file.name)
]
)
)
return 0
if __name__ == "__main__":
cli_detect()

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,80 @@
from __future__ import annotations
from typing import TYPE_CHECKING, Any
from warnings import warn
from .api import from_bytes
from .constant import CHARDET_CORRESPONDENCE, TOO_SMALL_SEQUENCE
# TODO: remove this check when dropping Python 3.7 support
if TYPE_CHECKING:
from typing_extensions import TypedDict
class ResultDict(TypedDict):
encoding: str | None
language: str
confidence: float | None
def detect(
byte_str: bytes, should_rename_legacy: bool = False, **kwargs: Any
) -> ResultDict:
"""
chardet legacy method
Detect the encoding of the given byte string. It should be mostly backward-compatible.
Encoding name will match Chardet own writing whenever possible. (Not on encoding name unsupported by it)
This function is deprecated and should be used to migrate your project easily, consult the documentation for
further information. Not planned for removal.
:param byte_str: The byte sequence to examine.
:param should_rename_legacy: Should we rename legacy encodings
to their more modern equivalents?
"""
if len(kwargs):
warn(
f"charset-normalizer disregard arguments '{','.join(list(kwargs.keys()))}' in legacy function detect()"
)
if not isinstance(byte_str, (bytearray, bytes)):
raise TypeError( # pragma: nocover
f"Expected object of type bytes or bytearray, got: {type(byte_str)}"
)
if isinstance(byte_str, bytearray):
byte_str = bytes(byte_str)
r = from_bytes(byte_str).best()
encoding = r.encoding if r is not None else None
language = r.language if r is not None and r.language != "Unknown" else ""
confidence = 1.0 - r.chaos if r is not None else None
# automatically lower confidence
# on small bytes samples.
# https://github.com/jawah/charset_normalizer/issues/391
if (
confidence is not None
and confidence >= 0.9
and encoding
not in {
"utf_8",
"ascii",
}
and r.bom is False # type: ignore[union-attr]
and len(byte_str) < TOO_SMALL_SEQUENCE
):
confidence -= 0.2
# Note: CharsetNormalizer does not return 'UTF-8-SIG' as the sig get stripped in the detection/normalization process
# but chardet does return 'utf-8-sig' and it is a valid codec name.
if r is not None and encoding == "utf_8" and r.bom:
encoding += "_sig"
if should_rename_legacy is False and encoding in CHARDET_CORRESPONDENCE:
encoding = CHARDET_CORRESPONDENCE[encoding]
return {
"encoding": encoding,
"language": language,
"confidence": confidence,
}

View File

@@ -0,0 +1,635 @@
from __future__ import annotations
from functools import lru_cache
from logging import getLogger
from .constant import (
COMMON_SAFE_ASCII_CHARACTERS,
TRACE,
UNICODE_SECONDARY_RANGE_KEYWORD,
)
from .utils import (
is_accentuated,
is_arabic,
is_arabic_isolated_form,
is_case_variable,
is_cjk,
is_emoticon,
is_hangul,
is_hiragana,
is_katakana,
is_latin,
is_punctuation,
is_separator,
is_symbol,
is_thai,
is_unprintable,
remove_accent,
unicode_range,
is_cjk_uncommon,
)
class MessDetectorPlugin:
"""
Base abstract class used for mess detection plugins.
All detectors MUST extend and implement given methods.
"""
def eligible(self, character: str) -> bool:
"""
Determine if given character should be fed in.
"""
raise NotImplementedError # pragma: nocover
def feed(self, character: str) -> None:
"""
The main routine to be executed upon character.
Insert the logic in witch the text would be considered chaotic.
"""
raise NotImplementedError # pragma: nocover
def reset(self) -> None: # pragma: no cover
"""
Permit to reset the plugin to the initial state.
"""
raise NotImplementedError
@property
def ratio(self) -> float:
"""
Compute the chaos ratio based on what your feed() has seen.
Must NOT be lower than 0.; No restriction gt 0.
"""
raise NotImplementedError # pragma: nocover
class TooManySymbolOrPunctuationPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._punctuation_count: int = 0
self._symbol_count: int = 0
self._character_count: int = 0
self._last_printable_char: str | None = None
self._frenzy_symbol_in_word: bool = False
def eligible(self, character: str) -> bool:
return character.isprintable()
def feed(self, character: str) -> None:
self._character_count += 1
if (
character != self._last_printable_char
and character not in COMMON_SAFE_ASCII_CHARACTERS
):
if is_punctuation(character):
self._punctuation_count += 1
elif (
character.isdigit() is False
and is_symbol(character)
and is_emoticon(character) is False
):
self._symbol_count += 2
self._last_printable_char = character
def reset(self) -> None: # Abstract
self._punctuation_count = 0
self._character_count = 0
self._symbol_count = 0
@property
def ratio(self) -> float:
if self._character_count == 0:
return 0.0
ratio_of_punctuation: float = (
self._punctuation_count + self._symbol_count
) / self._character_count
return ratio_of_punctuation if ratio_of_punctuation >= 0.3 else 0.0
class TooManyAccentuatedPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._character_count: int = 0
self._accentuated_count: int = 0
def eligible(self, character: str) -> bool:
return character.isalpha()
def feed(self, character: str) -> None:
self._character_count += 1
if is_accentuated(character):
self._accentuated_count += 1
def reset(self) -> None: # Abstract
self._character_count = 0
self._accentuated_count = 0
@property
def ratio(self) -> float:
if self._character_count < 8:
return 0.0
ratio_of_accentuation: float = self._accentuated_count / self._character_count
return ratio_of_accentuation if ratio_of_accentuation >= 0.35 else 0.0
class UnprintablePlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._unprintable_count: int = 0
self._character_count: int = 0
def eligible(self, character: str) -> bool:
return True
def feed(self, character: str) -> None:
if is_unprintable(character):
self._unprintable_count += 1
self._character_count += 1
def reset(self) -> None: # Abstract
self._unprintable_count = 0
@property
def ratio(self) -> float:
if self._character_count == 0:
return 0.0
return (self._unprintable_count * 8) / self._character_count
class SuspiciousDuplicateAccentPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._successive_count: int = 0
self._character_count: int = 0
self._last_latin_character: str | None = None
def eligible(self, character: str) -> bool:
return character.isalpha() and is_latin(character)
def feed(self, character: str) -> None:
self._character_count += 1
if (
self._last_latin_character is not None
and is_accentuated(character)
and is_accentuated(self._last_latin_character)
):
if character.isupper() and self._last_latin_character.isupper():
self._successive_count += 1
# Worse if its the same char duplicated with different accent.
if remove_accent(character) == remove_accent(self._last_latin_character):
self._successive_count += 1
self._last_latin_character = character
def reset(self) -> None: # Abstract
self._successive_count = 0
self._character_count = 0
self._last_latin_character = None
@property
def ratio(self) -> float:
if self._character_count == 0:
return 0.0
return (self._successive_count * 2) / self._character_count
class SuspiciousRange(MessDetectorPlugin):
def __init__(self) -> None:
self._suspicious_successive_range_count: int = 0
self._character_count: int = 0
self._last_printable_seen: str | None = None
def eligible(self, character: str) -> bool:
return character.isprintable()
def feed(self, character: str) -> None:
self._character_count += 1
if (
character.isspace()
or is_punctuation(character)
or character in COMMON_SAFE_ASCII_CHARACTERS
):
self._last_printable_seen = None
return
if self._last_printable_seen is None:
self._last_printable_seen = character
return
unicode_range_a: str | None = unicode_range(self._last_printable_seen)
unicode_range_b: str | None = unicode_range(character)
if is_suspiciously_successive_range(unicode_range_a, unicode_range_b):
self._suspicious_successive_range_count += 1
self._last_printable_seen = character
def reset(self) -> None: # Abstract
self._character_count = 0
self._suspicious_successive_range_count = 0
self._last_printable_seen = None
@property
def ratio(self) -> float:
if self._character_count <= 13:
return 0.0
ratio_of_suspicious_range_usage: float = (
self._suspicious_successive_range_count * 2
) / self._character_count
return ratio_of_suspicious_range_usage
class SuperWeirdWordPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._word_count: int = 0
self._bad_word_count: int = 0
self._foreign_long_count: int = 0
self._is_current_word_bad: bool = False
self._foreign_long_watch: bool = False
self._character_count: int = 0
self._bad_character_count: int = 0
self._buffer: str = ""
self._buffer_accent_count: int = 0
self._buffer_glyph_count: int = 0
def eligible(self, character: str) -> bool:
return True
def feed(self, character: str) -> None:
if character.isalpha():
self._buffer += character
if is_accentuated(character):
self._buffer_accent_count += 1
if (
self._foreign_long_watch is False
and (is_latin(character) is False or is_accentuated(character))
and is_cjk(character) is False
and is_hangul(character) is False
and is_katakana(character) is False
and is_hiragana(character) is False
and is_thai(character) is False
):
self._foreign_long_watch = True
if (
is_cjk(character)
or is_hangul(character)
or is_katakana(character)
or is_hiragana(character)
or is_thai(character)
):
self._buffer_glyph_count += 1
return
if not self._buffer:
return
if (
character.isspace() or is_punctuation(character) or is_separator(character)
) and self._buffer:
self._word_count += 1
buffer_length: int = len(self._buffer)
self._character_count += buffer_length
if buffer_length >= 4:
if self._buffer_accent_count / buffer_length >= 0.5:
self._is_current_word_bad = True
# Word/Buffer ending with an upper case accentuated letter are so rare,
# that we will consider them all as suspicious. Same weight as foreign_long suspicious.
elif (
is_accentuated(self._buffer[-1])
and self._buffer[-1].isupper()
and all(_.isupper() for _ in self._buffer) is False
):
self._foreign_long_count += 1
self._is_current_word_bad = True
elif self._buffer_glyph_count == 1:
self._is_current_word_bad = True
self._foreign_long_count += 1
if buffer_length >= 24 and self._foreign_long_watch:
camel_case_dst = [
i
for c, i in zip(self._buffer, range(0, buffer_length))
if c.isupper()
]
probable_camel_cased: bool = False
if camel_case_dst and (len(camel_case_dst) / buffer_length <= 0.3):
probable_camel_cased = True
if not probable_camel_cased:
self._foreign_long_count += 1
self._is_current_word_bad = True
if self._is_current_word_bad:
self._bad_word_count += 1
self._bad_character_count += len(self._buffer)
self._is_current_word_bad = False
self._foreign_long_watch = False
self._buffer = ""
self._buffer_accent_count = 0
self._buffer_glyph_count = 0
elif (
character not in {"<", ">", "-", "=", "~", "|", "_"}
and character.isdigit() is False
and is_symbol(character)
):
self._is_current_word_bad = True
self._buffer += character
def reset(self) -> None: # Abstract
self._buffer = ""
self._is_current_word_bad = False
self._foreign_long_watch = False
self._bad_word_count = 0
self._word_count = 0
self._character_count = 0
self._bad_character_count = 0
self._foreign_long_count = 0
@property
def ratio(self) -> float:
if self._word_count <= 10 and self._foreign_long_count == 0:
return 0.0
return self._bad_character_count / self._character_count
class CjkUncommonPlugin(MessDetectorPlugin):
"""
Detect messy CJK text that probably means nothing.
"""
def __init__(self) -> None:
self._character_count: int = 0
self._uncommon_count: int = 0
def eligible(self, character: str) -> bool:
return is_cjk(character)
def feed(self, character: str) -> None:
self._character_count += 1
if is_cjk_uncommon(character):
self._uncommon_count += 1
return
def reset(self) -> None: # Abstract
self._character_count = 0
self._uncommon_count = 0
@property
def ratio(self) -> float:
if self._character_count < 8:
return 0.0
uncommon_form_usage: float = self._uncommon_count / self._character_count
# we can be pretty sure it's garbage when uncommon characters are widely
# used. otherwise it could just be traditional chinese for example.
return uncommon_form_usage / 10 if uncommon_form_usage > 0.5 else 0.0
class ArchaicUpperLowerPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._buf: bool = False
self._character_count_since_last_sep: int = 0
self._successive_upper_lower_count: int = 0
self._successive_upper_lower_count_final: int = 0
self._character_count: int = 0
self._last_alpha_seen: str | None = None
self._current_ascii_only: bool = True
def eligible(self, character: str) -> bool:
return True
def feed(self, character: str) -> None:
is_concerned = character.isalpha() and is_case_variable(character)
chunk_sep = is_concerned is False
if chunk_sep and self._character_count_since_last_sep > 0:
if (
self._character_count_since_last_sep <= 64
and character.isdigit() is False
and self._current_ascii_only is False
):
self._successive_upper_lower_count_final += (
self._successive_upper_lower_count
)
self._successive_upper_lower_count = 0
self._character_count_since_last_sep = 0
self._last_alpha_seen = None
self._buf = False
self._character_count += 1
self._current_ascii_only = True
return
if self._current_ascii_only is True and character.isascii() is False:
self._current_ascii_only = False
if self._last_alpha_seen is not None:
if (character.isupper() and self._last_alpha_seen.islower()) or (
character.islower() and self._last_alpha_seen.isupper()
):
if self._buf is True:
self._successive_upper_lower_count += 2
self._buf = False
else:
self._buf = True
else:
self._buf = False
self._character_count += 1
self._character_count_since_last_sep += 1
self._last_alpha_seen = character
def reset(self) -> None: # Abstract
self._character_count = 0
self._character_count_since_last_sep = 0
self._successive_upper_lower_count = 0
self._successive_upper_lower_count_final = 0
self._last_alpha_seen = None
self._buf = False
self._current_ascii_only = True
@property
def ratio(self) -> float:
if self._character_count == 0:
return 0.0
return self._successive_upper_lower_count_final / self._character_count
class ArabicIsolatedFormPlugin(MessDetectorPlugin):
def __init__(self) -> None:
self._character_count: int = 0
self._isolated_form_count: int = 0
def reset(self) -> None: # Abstract
self._character_count = 0
self._isolated_form_count = 0
def eligible(self, character: str) -> bool:
return is_arabic(character)
def feed(self, character: str) -> None:
self._character_count += 1
if is_arabic_isolated_form(character):
self._isolated_form_count += 1
@property
def ratio(self) -> float:
if self._character_count < 8:
return 0.0
isolated_form_usage: float = self._isolated_form_count / self._character_count
return isolated_form_usage
@lru_cache(maxsize=1024)
def is_suspiciously_successive_range(
unicode_range_a: str | None, unicode_range_b: str | None
) -> bool:
"""
Determine if two Unicode range seen next to each other can be considered as suspicious.
"""
if unicode_range_a is None or unicode_range_b is None:
return True
if unicode_range_a == unicode_range_b:
return False
if "Latin" in unicode_range_a and "Latin" in unicode_range_b:
return False
if "Emoticons" in unicode_range_a or "Emoticons" in unicode_range_b:
return False
# Latin characters can be accompanied with a combining diacritical mark
# eg. Vietnamese.
if ("Latin" in unicode_range_a or "Latin" in unicode_range_b) and (
"Combining" in unicode_range_a or "Combining" in unicode_range_b
):
return False
keywords_range_a, keywords_range_b = (
unicode_range_a.split(" "),
unicode_range_b.split(" "),
)
for el in keywords_range_a:
if el in UNICODE_SECONDARY_RANGE_KEYWORD:
continue
if el in keywords_range_b:
return False
# Japanese Exception
range_a_jp_chars, range_b_jp_chars = (
unicode_range_a
in (
"Hiragana",
"Katakana",
),
unicode_range_b in ("Hiragana", "Katakana"),
)
if (range_a_jp_chars or range_b_jp_chars) and (
"CJK" in unicode_range_a or "CJK" in unicode_range_b
):
return False
if range_a_jp_chars and range_b_jp_chars:
return False
if "Hangul" in unicode_range_a or "Hangul" in unicode_range_b:
if "CJK" in unicode_range_a or "CJK" in unicode_range_b:
return False
if unicode_range_a == "Basic Latin" or unicode_range_b == "Basic Latin":
return False
# Chinese/Japanese use dedicated range for punctuation and/or separators.
if ("CJK" in unicode_range_a or "CJK" in unicode_range_b) or (
unicode_range_a in ["Katakana", "Hiragana"]
and unicode_range_b in ["Katakana", "Hiragana"]
):
if "Punctuation" in unicode_range_a or "Punctuation" in unicode_range_b:
return False
if "Forms" in unicode_range_a or "Forms" in unicode_range_b:
return False
if unicode_range_a == "Basic Latin" or unicode_range_b == "Basic Latin":
return False
return True
@lru_cache(maxsize=2048)
def mess_ratio(
decoded_sequence: str, maximum_threshold: float = 0.2, debug: bool = False
) -> float:
"""
Compute a mess ratio given a decoded bytes sequence. The maximum threshold does stop the computation earlier.
"""
detectors: list[MessDetectorPlugin] = [
md_class() for md_class in MessDetectorPlugin.__subclasses__()
]
length: int = len(decoded_sequence) + 1
mean_mess_ratio: float = 0.0
if length < 512:
intermediary_mean_mess_ratio_calc: int = 32
elif length <= 1024:
intermediary_mean_mess_ratio_calc = 64
else:
intermediary_mean_mess_ratio_calc = 128
for character, index in zip(decoded_sequence + "\n", range(length)):
for detector in detectors:
if detector.eligible(character):
detector.feed(character)
if (
index > 0 and index % intermediary_mean_mess_ratio_calc == 0
) or index == length - 1:
mean_mess_ratio = sum(dt.ratio for dt in detectors)
if mean_mess_ratio >= maximum_threshold:
break
if debug:
logger = getLogger("charset_normalizer")
logger.log(
TRACE,
"Mess-detector extended-analysis start. "
f"intermediary_mean_mess_ratio_calc={intermediary_mean_mess_ratio_calc} mean_mess_ratio={mean_mess_ratio} "
f"maximum_threshold={maximum_threshold}",
)
if len(decoded_sequence) > 16:
logger.log(TRACE, f"Starting with: {decoded_sequence[:16]}")
logger.log(TRACE, f"Ending with: {decoded_sequence[-16::]}")
for dt in detectors:
logger.log(TRACE, f"{dt.__class__}: {dt.ratio}")
return round(mean_mess_ratio, 3)

View File

@@ -0,0 +1,360 @@
from __future__ import annotations
from encodings.aliases import aliases
from hashlib import sha256
from json import dumps
from re import sub
from typing import Any, Iterator, List, Tuple
from .constant import RE_POSSIBLE_ENCODING_INDICATION, TOO_BIG_SEQUENCE
from .utils import iana_name, is_multi_byte_encoding, unicode_range
class CharsetMatch:
def __init__(
self,
payload: bytes,
guessed_encoding: str,
mean_mess_ratio: float,
has_sig_or_bom: bool,
languages: CoherenceMatches,
decoded_payload: str | None = None,
preemptive_declaration: str | None = None,
):
self._payload: bytes = payload
self._encoding: str = guessed_encoding
self._mean_mess_ratio: float = mean_mess_ratio
self._languages: CoherenceMatches = languages
self._has_sig_or_bom: bool = has_sig_or_bom
self._unicode_ranges: list[str] | None = None
self._leaves: list[CharsetMatch] = []
self._mean_coherence_ratio: float = 0.0
self._output_payload: bytes | None = None
self._output_encoding: str | None = None
self._string: str | None = decoded_payload
self._preemptive_declaration: str | None = preemptive_declaration
def __eq__(self, other: object) -> bool:
if not isinstance(other, CharsetMatch):
if isinstance(other, str):
return iana_name(other) == self.encoding
return False
return self.encoding == other.encoding and self.fingerprint == other.fingerprint
def __lt__(self, other: object) -> bool:
"""
Implemented to make sorted available upon CharsetMatches items.
"""
if not isinstance(other, CharsetMatch):
raise ValueError
chaos_difference: float = abs(self.chaos - other.chaos)
coherence_difference: float = abs(self.coherence - other.coherence)
# Below 1% difference --> Use Coherence
if chaos_difference < 0.01 and coherence_difference > 0.02:
return self.coherence > other.coherence
elif chaos_difference < 0.01 and coherence_difference <= 0.02:
# When having a difficult decision, use the result that decoded as many multi-byte as possible.
# preserve RAM usage!
if len(self._payload) >= TOO_BIG_SEQUENCE:
return self.chaos < other.chaos
return self.multi_byte_usage > other.multi_byte_usage
return self.chaos < other.chaos
@property
def multi_byte_usage(self) -> float:
return 1.0 - (len(str(self)) / len(self.raw))
def __str__(self) -> str:
# Lazy Str Loading
if self._string is None:
self._string = str(self._payload, self._encoding, "strict")
return self._string
def __repr__(self) -> str:
return f"<CharsetMatch '{self.encoding}' bytes({self.fingerprint})>"
def add_submatch(self, other: CharsetMatch) -> None:
if not isinstance(other, CharsetMatch) or other == self:
raise ValueError(
"Unable to add instance <{}> as a submatch of a CharsetMatch".format(
other.__class__
)
)
other._string = None # Unload RAM usage; dirty trick.
self._leaves.append(other)
@property
def encoding(self) -> str:
return self._encoding
@property
def encoding_aliases(self) -> list[str]:
"""
Encoding name are known by many name, using this could help when searching for IBM855 when it's listed as CP855.
"""
also_known_as: list[str] = []
for u, p in aliases.items():
if self.encoding == u:
also_known_as.append(p)
elif self.encoding == p:
also_known_as.append(u)
return also_known_as
@property
def bom(self) -> bool:
return self._has_sig_or_bom
@property
def byte_order_mark(self) -> bool:
return self._has_sig_or_bom
@property
def languages(self) -> list[str]:
"""
Return the complete list of possible languages found in decoded sequence.
Usually not really useful. Returned list may be empty even if 'language' property return something != 'Unknown'.
"""
return [e[0] for e in self._languages]
@property
def language(self) -> str:
"""
Most probable language found in decoded sequence. If none were detected or inferred, the property will return
"Unknown".
"""
if not self._languages:
# Trying to infer the language based on the given encoding
# Its either English or we should not pronounce ourselves in certain cases.
if "ascii" in self.could_be_from_charset:
return "English"
# doing it there to avoid circular import
from charset_normalizer.cd import encoding_languages, mb_encoding_languages
languages = (
mb_encoding_languages(self.encoding)
if is_multi_byte_encoding(self.encoding)
else encoding_languages(self.encoding)
)
if len(languages) == 0 or "Latin Based" in languages:
return "Unknown"
return languages[0]
return self._languages[0][0]
@property
def chaos(self) -> float:
return self._mean_mess_ratio
@property
def coherence(self) -> float:
if not self._languages:
return 0.0
return self._languages[0][1]
@property
def percent_chaos(self) -> float:
return round(self.chaos * 100, ndigits=3)
@property
def percent_coherence(self) -> float:
return round(self.coherence * 100, ndigits=3)
@property
def raw(self) -> bytes:
"""
Original untouched bytes.
"""
return self._payload
@property
def submatch(self) -> list[CharsetMatch]:
return self._leaves
@property
def has_submatch(self) -> bool:
return len(self._leaves) > 0
@property
def alphabets(self) -> list[str]:
if self._unicode_ranges is not None:
return self._unicode_ranges
# list detected ranges
detected_ranges: list[str | None] = [unicode_range(char) for char in str(self)]
# filter and sort
self._unicode_ranges = sorted(list({r for r in detected_ranges if r}))
return self._unicode_ranges
@property
def could_be_from_charset(self) -> list[str]:
"""
The complete list of encoding that output the exact SAME str result and therefore could be the originating
encoding.
This list does include the encoding available in property 'encoding'.
"""
return [self._encoding] + [m.encoding for m in self._leaves]
def output(self, encoding: str = "utf_8") -> bytes:
"""
Method to get re-encoded bytes payload using given target encoding. Default to UTF-8.
Any errors will be simply ignored by the encoder NOT replaced.
"""
if self._output_encoding is None or self._output_encoding != encoding:
self._output_encoding = encoding
decoded_string = str(self)
if (
self._preemptive_declaration is not None
and self._preemptive_declaration.lower()
not in ["utf-8", "utf8", "utf_8"]
):
patched_header = sub(
RE_POSSIBLE_ENCODING_INDICATION,
lambda m: m.string[m.span()[0] : m.span()[1]].replace(
m.groups()[0],
iana_name(self._output_encoding).replace("_", "-"), # type: ignore[arg-type]
),
decoded_string[:8192],
count=1,
)
decoded_string = patched_header + decoded_string[8192:]
self._output_payload = decoded_string.encode(encoding, "replace")
return self._output_payload # type: ignore
@property
def fingerprint(self) -> str:
"""
Retrieve the unique SHA256 computed using the transformed (re-encoded) payload. Not the original one.
"""
return sha256(self.output()).hexdigest()
class CharsetMatches:
"""
Container with every CharsetMatch items ordered by default from most probable to the less one.
Act like a list(iterable) but does not implements all related methods.
"""
def __init__(self, results: list[CharsetMatch] | None = None):
self._results: list[CharsetMatch] = sorted(results) if results else []
def __iter__(self) -> Iterator[CharsetMatch]:
yield from self._results
def __getitem__(self, item: int | str) -> CharsetMatch:
"""
Retrieve a single item either by its position or encoding name (alias may be used here).
Raise KeyError upon invalid index or encoding not present in results.
"""
if isinstance(item, int):
return self._results[item]
if isinstance(item, str):
item = iana_name(item, False)
for result in self._results:
if item in result.could_be_from_charset:
return result
raise KeyError
def __len__(self) -> int:
return len(self._results)
def __bool__(self) -> bool:
return len(self._results) > 0
def append(self, item: CharsetMatch) -> None:
"""
Insert a single match. Will be inserted accordingly to preserve sort.
Can be inserted as a submatch.
"""
if not isinstance(item, CharsetMatch):
raise ValueError(
"Cannot append instance '{}' to CharsetMatches".format(
str(item.__class__)
)
)
# We should disable the submatch factoring when the input file is too heavy (conserve RAM usage)
if len(item.raw) < TOO_BIG_SEQUENCE:
for match in self._results:
if match.fingerprint == item.fingerprint and match.chaos == item.chaos:
match.add_submatch(item)
return
self._results.append(item)
self._results = sorted(self._results)
def best(self) -> CharsetMatch | None:
"""
Simply return the first match. Strict equivalent to matches[0].
"""
if not self._results:
return None
return self._results[0]
def first(self) -> CharsetMatch | None:
"""
Redundant method, call the method best(). Kept for BC reasons.
"""
return self.best()
CoherenceMatch = Tuple[str, float]
CoherenceMatches = List[CoherenceMatch]
class CliDetectionResult:
def __init__(
self,
path: str,
encoding: str | None,
encoding_aliases: list[str],
alternative_encodings: list[str],
language: str,
alphabets: list[str],
has_sig_or_bom: bool,
chaos: float,
coherence: float,
unicode_path: str | None,
is_preferred: bool,
):
self.path: str = path
self.unicode_path: str | None = unicode_path
self.encoding: str | None = encoding
self.encoding_aliases: list[str] = encoding_aliases
self.alternative_encodings: list[str] = alternative_encodings
self.language: str = language
self.alphabets: list[str] = alphabets
self.has_sig_or_bom: bool = has_sig_or_bom
self.chaos: float = chaos
self.coherence: float = coherence
self.is_preferred: bool = is_preferred
@property
def __dict__(self) -> dict[str, Any]: # type: ignore
return {
"path": self.path,
"encoding": self.encoding,
"encoding_aliases": self.encoding_aliases,
"alternative_encodings": self.alternative_encodings,
"language": self.language,
"alphabets": self.alphabets,
"has_sig_or_bom": self.has_sig_or_bom,
"chaos": self.chaos,
"coherence": self.coherence,
"unicode_path": self.unicode_path,
"is_preferred": self.is_preferred,
}
def to_json(self) -> str:
return dumps(self.__dict__, ensure_ascii=True, indent=4)

View File

@@ -0,0 +1,414 @@
from __future__ import annotations
import importlib
import logging
import unicodedata
from codecs import IncrementalDecoder
from encodings.aliases import aliases
from functools import lru_cache
from re import findall
from typing import Generator
from _multibytecodec import ( # type: ignore[import-not-found,import]
MultibyteIncrementalDecoder,
)
from .constant import (
ENCODING_MARKS,
IANA_SUPPORTED_SIMILAR,
RE_POSSIBLE_ENCODING_INDICATION,
UNICODE_RANGES_COMBINED,
UNICODE_SECONDARY_RANGE_KEYWORD,
UTF8_MAXIMAL_ALLOCATION,
COMMON_CJK_CHARACTERS,
)
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_accentuated(character: str) -> bool:
try:
description: str = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return (
"WITH GRAVE" in description
or "WITH ACUTE" in description
or "WITH CEDILLA" in description
or "WITH DIAERESIS" in description
or "WITH CIRCUMFLEX" in description
or "WITH TILDE" in description
or "WITH MACRON" in description
or "WITH RING ABOVE" in description
)
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def remove_accent(character: str) -> str:
decomposed: str = unicodedata.decomposition(character)
if not decomposed:
return character
codes: list[str] = decomposed.split(" ")
return chr(int(codes[0], 16))
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def unicode_range(character: str) -> str | None:
"""
Retrieve the Unicode range official name from a single character.
"""
character_ord: int = ord(character)
for range_name, ord_range in UNICODE_RANGES_COMBINED.items():
if character_ord in ord_range:
return range_name
return None
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_latin(character: str) -> bool:
try:
description: str = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "LATIN" in description
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_punctuation(character: str) -> bool:
character_category: str = unicodedata.category(character)
if "P" in character_category:
return True
character_range: str | None = unicode_range(character)
if character_range is None:
return False
return "Punctuation" in character_range
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_symbol(character: str) -> bool:
character_category: str = unicodedata.category(character)
if "S" in character_category or "N" in character_category:
return True
character_range: str | None = unicode_range(character)
if character_range is None:
return False
return "Forms" in character_range and character_category != "Lo"
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_emoticon(character: str) -> bool:
character_range: str | None = unicode_range(character)
if character_range is None:
return False
return "Emoticons" in character_range or "Pictographs" in character_range
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_separator(character: str) -> bool:
if character.isspace() or character in {"", "+", "<", ">"}:
return True
character_category: str = unicodedata.category(character)
return "Z" in character_category or character_category in {"Po", "Pd", "Pc"}
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_case_variable(character: str) -> bool:
return character.islower() != character.isupper()
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_cjk(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "CJK" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_hiragana(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "HIRAGANA" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_katakana(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "KATAKANA" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_hangul(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "HANGUL" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_thai(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "THAI" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_arabic(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "ARABIC" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_arabic_isolated_form(character: str) -> bool:
try:
character_name = unicodedata.name(character)
except ValueError: # Defensive: unicode database outdated?
return False
return "ARABIC" in character_name and "ISOLATED FORM" in character_name
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_cjk_uncommon(character: str) -> bool:
return character not in COMMON_CJK_CHARACTERS
@lru_cache(maxsize=len(UNICODE_RANGES_COMBINED))
def is_unicode_range_secondary(range_name: str) -> bool:
return any(keyword in range_name for keyword in UNICODE_SECONDARY_RANGE_KEYWORD)
@lru_cache(maxsize=UTF8_MAXIMAL_ALLOCATION)
def is_unprintable(character: str) -> bool:
return (
character.isspace() is False # includes \n \t \r \v
and character.isprintable() is False
and character != "\x1a" # Why? Its the ASCII substitute character.
and character != "\ufeff" # bug discovered in Python,
# Zero Width No-Break Space located in Arabic Presentation Forms-B, Unicode 1.1 not acknowledged as space.
)
def any_specified_encoding(sequence: bytes, search_zone: int = 8192) -> str | None:
"""
Extract using ASCII-only decoder any specified encoding in the first n-bytes.
"""
if not isinstance(sequence, bytes):
raise TypeError
seq_len: int = len(sequence)
results: list[str] = findall(
RE_POSSIBLE_ENCODING_INDICATION,
sequence[: min(seq_len, search_zone)].decode("ascii", errors="ignore"),
)
if len(results) == 0:
return None
for specified_encoding in results:
specified_encoding = specified_encoding.lower().replace("-", "_")
encoding_alias: str
encoding_iana: str
for encoding_alias, encoding_iana in aliases.items():
if encoding_alias == specified_encoding:
return encoding_iana
if encoding_iana == specified_encoding:
return encoding_iana
return None
@lru_cache(maxsize=128)
def is_multi_byte_encoding(name: str) -> bool:
"""
Verify is a specific encoding is a multi byte one based on it IANA name
"""
return name in {
"utf_8",
"utf_8_sig",
"utf_16",
"utf_16_be",
"utf_16_le",
"utf_32",
"utf_32_le",
"utf_32_be",
"utf_7",
} or issubclass(
importlib.import_module(f"encodings.{name}").IncrementalDecoder,
MultibyteIncrementalDecoder,
)
def identify_sig_or_bom(sequence: bytes) -> tuple[str | None, bytes]:
"""
Identify and extract SIG/BOM in given sequence.
"""
for iana_encoding in ENCODING_MARKS:
marks: bytes | list[bytes] = ENCODING_MARKS[iana_encoding]
if isinstance(marks, bytes):
marks = [marks]
for mark in marks:
if sequence.startswith(mark):
return iana_encoding, mark
return None, b""
def should_strip_sig_or_bom(iana_encoding: str) -> bool:
return iana_encoding not in {"utf_16", "utf_32"}
def iana_name(cp_name: str, strict: bool = True) -> str:
"""Returns the Python normalized encoding name (Not the IANA official name)."""
cp_name = cp_name.lower().replace("-", "_")
encoding_alias: str
encoding_iana: str
for encoding_alias, encoding_iana in aliases.items():
if cp_name in [encoding_alias, encoding_iana]:
return encoding_iana
if strict:
raise ValueError(f"Unable to retrieve IANA for '{cp_name}'")
return cp_name
def cp_similarity(iana_name_a: str, iana_name_b: str) -> float:
if is_multi_byte_encoding(iana_name_a) or is_multi_byte_encoding(iana_name_b):
return 0.0
decoder_a = importlib.import_module(f"encodings.{iana_name_a}").IncrementalDecoder
decoder_b = importlib.import_module(f"encodings.{iana_name_b}").IncrementalDecoder
id_a: IncrementalDecoder = decoder_a(errors="ignore")
id_b: IncrementalDecoder = decoder_b(errors="ignore")
character_match_count: int = 0
for i in range(255):
to_be_decoded: bytes = bytes([i])
if id_a.decode(to_be_decoded) == id_b.decode(to_be_decoded):
character_match_count += 1
return character_match_count / 254
def is_cp_similar(iana_name_a: str, iana_name_b: str) -> bool:
"""
Determine if two code page are at least 80% similar. IANA_SUPPORTED_SIMILAR dict was generated using
the function cp_similarity.
"""
return (
iana_name_a in IANA_SUPPORTED_SIMILAR
and iana_name_b in IANA_SUPPORTED_SIMILAR[iana_name_a]
)
def set_logging_handler(
name: str = "charset_normalizer",
level: int = logging.INFO,
format_string: str = "%(asctime)s | %(levelname)s | %(message)s",
) -> None:
logger = logging.getLogger(name)
logger.setLevel(level)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(format_string))
logger.addHandler(handler)
def cut_sequence_chunks(
sequences: bytes,
encoding_iana: str,
offsets: range,
chunk_size: int,
bom_or_sig_available: bool,
strip_sig_or_bom: bool,
sig_payload: bytes,
is_multi_byte_decoder: bool,
decoded_payload: str | None = None,
) -> Generator[str, None, None]:
if decoded_payload and is_multi_byte_decoder is False:
for i in offsets:
chunk = decoded_payload[i : i + chunk_size]
if not chunk:
break
yield chunk
else:
for i in offsets:
chunk_end = i + chunk_size
if chunk_end > len(sequences) + 8:
continue
cut_sequence = sequences[i : i + chunk_size]
if bom_or_sig_available and strip_sig_or_bom is False:
cut_sequence = sig_payload + cut_sequence
chunk = cut_sequence.decode(
encoding_iana,
errors="ignore" if is_multi_byte_decoder else "strict",
)
# multi-byte bad cutting detector and adjustment
# not the cleanest way to perform that fix but clever enough for now.
if is_multi_byte_decoder and i > 0:
chunk_partial_size_chk: int = min(chunk_size, 16)
if (
decoded_payload
and chunk[:chunk_partial_size_chk] not in decoded_payload
):
for j in range(i, i - 4, -1):
cut_sequence = sequences[j:chunk_end]
if bom_or_sig_available and strip_sig_or_bom is False:
cut_sequence = sig_payload + cut_sequence
chunk = cut_sequence.decode(encoding_iana, errors="ignore")
if chunk[:chunk_partial_size_chk] in decoded_payload:
break
yield chunk

View File

@@ -0,0 +1,8 @@
"""
Expose version
"""
from __future__ import annotations
__version__ = "3.4.4"
VERSION = __version__.split(".")

View File

@@ -0,0 +1,139 @@
Metadata-Version: 2.4
Name: cryptography
Version: 46.0.4
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: BSD
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Programming Language :: Python :: Free Threading :: 3 - Stable
Classifier: Topic :: Security :: Cryptography
Requires-Dist: cffi>=1.14 ; python_full_version == '3.8.*' and platform_python_implementation != 'PyPy'
Requires-Dist: cffi>=2.0.0 ; python_full_version >= '3.9' and platform_python_implementation != 'PyPy'
Requires-Dist: typing-extensions>=4.13.2 ; python_full_version < '3.11'
Requires-Dist: bcrypt>=3.1.5 ; extra == 'ssh'
Requires-Dist: nox[uv]>=2024.4.15 ; extra == 'nox'
Requires-Dist: cryptography-vectors==46.0.4 ; extra == 'test'
Requires-Dist: pytest>=7.4.0 ; extra == 'test'
Requires-Dist: pytest-benchmark>=4.0 ; extra == 'test'
Requires-Dist: pytest-cov>=2.10.1 ; extra == 'test'
Requires-Dist: pytest-xdist>=3.5.0 ; extra == 'test'
Requires-Dist: pretend>=0.7 ; extra == 'test'
Requires-Dist: certifi>=2024 ; extra == 'test'
Requires-Dist: pytest-randomly ; extra == 'test-randomorder'
Requires-Dist: sphinx>=5.3.0 ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme>=3.0.0 ; extra == 'docs'
Requires-Dist: sphinx-inline-tabs ; extra == 'docs'
Requires-Dist: pyenchant>=3 ; extra == 'docstest'
Requires-Dist: readme-renderer>=30.0 ; extra == 'docstest'
Requires-Dist: sphinxcontrib-spelling>=7.3.1 ; extra == 'docstest'
Requires-Dist: build>=1.0.0 ; extra == 'sdist'
Requires-Dist: ruff>=0.11.11 ; extra == 'pep8test'
Requires-Dist: mypy>=1.14 ; extra == 'pep8test'
Requires-Dist: check-sdist ; extra == 'pep8test'
Requires-Dist: click>=8.0.1 ; extra == 'pep8test'
Provides-Extra: ssh
Provides-Extra: nox
Provides-Extra: test
Provides-Extra: test-randomorder
Provides-Extra: docs
Provides-Extra: docstest
Provides-Extra: sdist
Provides-Extra: pep8test
License-File: LICENSE
License-File: LICENSE.APACHE
License-File: LICENSE.BSD
Summary: cryptography is a package which provides cryptographic recipes and primitives to Python developers.
Author-email: The Python Cryptographic Authority and individual contributors <cryptography-dev@python.org>
License-Expression: Apache-2.0 OR BSD-3-Clause
Requires-Python: >=3.8, !=3.9.0, !=3.9.1
Description-Content-Type: text/x-rst; charset=UTF-8
Project-URL: homepage, https://github.com/pyca/cryptography
Project-URL: documentation, https://cryptography.io/
Project-URL: source, https://github.com/pyca/cryptography/
Project-URL: issues, https://github.com/pyca/cryptography/issues
Project-URL: changelog, https://cryptography.io/en/latest/changelog/
pyca/cryptography
=================
.. image:: https://img.shields.io/pypi/v/cryptography.svg
:target: https://pypi.org/project/cryptography/
:alt: Latest Version
.. image:: https://readthedocs.org/projects/cryptography/badge/?version=latest
:target: https://cryptography.io
:alt: Latest Docs
.. image:: https://github.com/pyca/cryptography/actions/workflows/ci.yml/badge.svg
:target: https://github.com/pyca/cryptography/actions/workflows/ci.yml?query=branch%3Amain
``cryptography`` is a package which provides cryptographic recipes and
primitives to Python developers. Our goal is for it to be your "cryptographic
standard library". It supports Python 3.8+ and PyPy3 7.3.11+.
``cryptography`` includes both high level recipes and low level interfaces to
common cryptographic algorithms such as symmetric ciphers, message digests, and
key derivation functions. For example, to encrypt something with
``cryptography``'s high level symmetric encryption recipe:
.. code-block:: pycon
>>> from cryptography.fernet import Fernet
>>> # Put this somewhere safe!
>>> key = Fernet.generate_key()
>>> f = Fernet(key)
>>> token = f.encrypt(b"A really secret message. Not for prying eyes.")
>>> token
b'...'
>>> f.decrypt(token)
b'A really secret message. Not for prying eyes.'
You can find more information in the `documentation`_.
You can install ``cryptography`` with:
.. code-block:: console
$ pip install cryptography
For full details see `the installation documentation`_.
Discussion
~~~~~~~~~~
If you run into bugs, you can file them in our `issue tracker`_.
We maintain a `cryptography-dev`_ mailing list for development discussion.
You can also join ``#pyca`` on ``irc.libera.chat`` to ask questions or get
involved.
Security
~~~~~~~~
Need to report a security issue? Please consult our `security reporting`_
documentation.
.. _`documentation`: https://cryptography.io/
.. _`the installation documentation`: https://cryptography.io/en/latest/installation/
.. _`issue tracker`: https://github.com/pyca/cryptography/issues
.. _`cryptography-dev`: https://mail.python.org/mailman/listinfo/cryptography-dev
.. _`security reporting`: https://cryptography.io/en/latest/security/

View File

@@ -0,0 +1,180 @@
cryptography-46.0.4.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
cryptography-46.0.4.dist-info/METADATA,sha256=I3ciY9ZZ8NhroRqN5H5-Q33zINxMUxBUrsGAl_Nc_GI,5748
cryptography-46.0.4.dist-info/RECORD,,
cryptography-46.0.4.dist-info/WHEEL,sha256=jkxrJemT4jZpYSr-u9xPalWqoow8benNmiXfjKXLlJw,108
cryptography-46.0.4.dist-info/licenses/LICENSE,sha256=Pgx8CRqUi4JTO6mP18u0BDLW8amsv4X1ki0vmak65rs,197
cryptography-46.0.4.dist-info/licenses/LICENSE.APACHE,sha256=qsc7MUj20dcRHbyjIJn2jSbGRMaBOuHk8F9leaomY_4,11360
cryptography-46.0.4.dist-info/licenses/LICENSE.BSD,sha256=YCxMdILeZHndLpeTzaJ15eY9dz2s0eymiSMqtwCPtPs,1532
cryptography/__about__.py,sha256=jr-6rAv4vD2Nt5SzB7vEndY62BcERV2n1w0Yv59SpSw,445
cryptography/__init__.py,sha256=mthuUrTd4FROCpUYrTIqhjz6s6T9djAZrV7nZ1oMm2o,364
cryptography/__pycache__/__about__.cpython-312.pyc,,
cryptography/__pycache__/__init__.cpython-312.pyc,,
cryptography/__pycache__/exceptions.cpython-312.pyc,,
cryptography/__pycache__/fernet.cpython-312.pyc,,
cryptography/__pycache__/utils.cpython-312.pyc,,
cryptography/exceptions.py,sha256=835EWILc2fwxw-gyFMriciC2SqhViETB10LBSytnDIc,1087
cryptography/fernet.py,sha256=3Cvxkh0KJSbX8HbnCHu4wfCW7U0GgfUA3v_qQ8a8iWc,6963
cryptography/hazmat/__init__.py,sha256=5IwrLWrVp0AjEr_4FdWG_V057NSJGY_W4egNNsuct0g,455
cryptography/hazmat/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/__pycache__/_oid.cpython-312.pyc,,
cryptography/hazmat/_oid.py,sha256=p8ThjwJB56Ci_rAIrjyJ1f8VjgD6e39es2dh8JIUBOw,17240
cryptography/hazmat/asn1/__init__.py,sha256=hS_EWx3wVvZzfbCcNV8hzcDnyMM8H-BhIoS1TipUosk,293
cryptography/hazmat/asn1/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/asn1/__pycache__/asn1.cpython-312.pyc,,
cryptography/hazmat/asn1/asn1.py,sha256=eMEThEXa19LQjcyVofgHsW6tsZnjp3ddH7bWkkcxfLM,3860
cryptography/hazmat/backends/__init__.py,sha256=O5jvKFQdZnXhKeqJ-HtulaEL9Ni7mr1mDzZY5kHlYhI,361
cryptography/hazmat/backends/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/backends/openssl/__init__.py,sha256=p3jmJfnCag9iE5sdMrN6VvVEu55u46xaS_IjoI0SrmA,305
cryptography/hazmat/backends/openssl/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/backends/openssl/__pycache__/backend.cpython-312.pyc,,
cryptography/hazmat/backends/openssl/backend.py,sha256=tV5AxBoFJ2GfA0DMWSY-0TxQJrpQoexzI9R4Kybb--4,10215
cryptography/hazmat/bindings/__init__.py,sha256=s9oKCQ2ycFdXoERdS1imafueSkBsL9kvbyfghaauZ9Y,180
cryptography/hazmat/bindings/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/bindings/_rust.abi3.so,sha256=SLrDaK0kaiylOknAyD4zYzvSXELo23eviAKVE4JCf5k,12817480
cryptography/hazmat/bindings/_rust/__init__.pyi,sha256=KhqLhXFPArPzzJ7DYO9Fl8FoXB_BagAd_r4Dm_Ze9Xo,1257
cryptography/hazmat/bindings/_rust/_openssl.pyi,sha256=mpNJLuYLbCVrd5i33FBTmWwL_55Dw7JPkSLlSX9Q7oI,230
cryptography/hazmat/bindings/_rust/asn1.pyi,sha256=BrGjC8J6nwuS-r3EVcdXJB8ndotfY9mbQYOfpbPG0HA,354
cryptography/hazmat/bindings/_rust/declarative_asn1.pyi,sha256=2ECFmYue1EPkHEE2Bm7aLwkjB0mSUTpr23v9MN4pri4,892
cryptography/hazmat/bindings/_rust/exceptions.pyi,sha256=exXr2xw_0pB1kk93cYbM3MohbzoUkjOms1ZMUi0uQZE,640
cryptography/hazmat/bindings/_rust/ocsp.pyi,sha256=VPVWuKHI9EMs09ZLRYAGvR0Iz0mCMmEzXAkgJHovpoM,4020
cryptography/hazmat/bindings/_rust/openssl/__init__.pyi,sha256=iOAMDyHoNwwCSZfZzuXDr64g4GpGUeDgEN-LjXqdrBM,1522
cryptography/hazmat/bindings/_rust/openssl/aead.pyi,sha256=4Nddw6-ynzIB3w2W86WvkGKTLlTDk_6F5l54RHCuy3E,2688
cryptography/hazmat/bindings/_rust/openssl/ciphers.pyi,sha256=LhPzHWSXJq4grAJXn6zSvSSdV-aYIIscHDwIPlJGGPs,1315
cryptography/hazmat/bindings/_rust/openssl/cmac.pyi,sha256=nPH0X57RYpsAkRowVpjQiHE566ThUTx7YXrsadmrmHk,564
cryptography/hazmat/bindings/_rust/openssl/dh.pyi,sha256=Z3TC-G04-THtSdAOPLM1h2G7ml5bda1ElZUcn5wpuhk,1564
cryptography/hazmat/bindings/_rust/openssl/dsa.pyi,sha256=qBtkgj2albt2qFcnZ9UDrhzoNhCVO7HTby5VSf1EXMI,1299
cryptography/hazmat/bindings/_rust/openssl/ec.pyi,sha256=zJy0pRa5n-_p2dm45PxECB_-B6SVZyNKfjxFDpPqT38,1691
cryptography/hazmat/bindings/_rust/openssl/ed25519.pyi,sha256=VXfXd5G6hUivg399R1DYdmW3eTb0EebzDTqjRC2gaRw,532
cryptography/hazmat/bindings/_rust/openssl/ed448.pyi,sha256=Yx49lqdnjsD7bxiDV1kcaMrDktug5evi5a6zerMiy2s,514
cryptography/hazmat/bindings/_rust/openssl/hashes.pyi,sha256=OWZvBx7xfo_HJl41Nc--DugVyCVPIprZ3HlOPTSWH9g,984
cryptography/hazmat/bindings/_rust/openssl/hmac.pyi,sha256=BXZn7NDjL3JAbYW0SQ8pg1iyC5DbQXVhUAiwsi8DFR8,702
cryptography/hazmat/bindings/_rust/openssl/kdf.pyi,sha256=xXfFBb9QehHfDtEaxV_65Z0YK7NquOVIChpTLkgAs_k,2029
cryptography/hazmat/bindings/_rust/openssl/keys.pyi,sha256=teIt8M6ZEMJrn4s3W0UnW0DZ-30Jd68WnSsKKG124l0,912
cryptography/hazmat/bindings/_rust/openssl/poly1305.pyi,sha256=_SW9NtQ5FDlAbdclFtWpT4lGmxKIKHpN-4j8J2BzYfQ,585
cryptography/hazmat/bindings/_rust/openssl/rsa.pyi,sha256=2OQCNSXkxgc-3uw1xiCCloIQTV6p9_kK79Yu0rhZgPc,1364
cryptography/hazmat/bindings/_rust/openssl/x25519.pyi,sha256=ewn4GpQyb7zPwE-ni7GtyQgMC0A1mLuqYsSyqv6nI_s,523
cryptography/hazmat/bindings/_rust/openssl/x448.pyi,sha256=juTZTmli8jO_5Vcufg-vHvx_tCyezmSLIh_9PU3TczI,505
cryptography/hazmat/bindings/_rust/pkcs12.pyi,sha256=vEEd5wDiZvb8ZGFaziLCaWLzAwoG_tvPUxLQw5_uOl8,1605
cryptography/hazmat/bindings/_rust/pkcs7.pyi,sha256=txGBJijqZshEcqra6byPNbnisIdlxzOSIHP2hl9arPs,1601
cryptography/hazmat/bindings/_rust/test_support.pyi,sha256=PPhld-WkO743iXFPebeG0LtgK0aTzGdjcIsay1Gm5GE,757
cryptography/hazmat/bindings/_rust/x509.pyi,sha256=n9X0IQ6ICbdIi-ExdCFZoBgeY6njm3QOVAVZwDQdnbk,9784
cryptography/hazmat/bindings/openssl/__init__.py,sha256=s9oKCQ2ycFdXoERdS1imafueSkBsL9kvbyfghaauZ9Y,180
cryptography/hazmat/bindings/openssl/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/bindings/openssl/__pycache__/_conditional.cpython-312.pyc,,
cryptography/hazmat/bindings/openssl/__pycache__/binding.cpython-312.pyc,,
cryptography/hazmat/bindings/openssl/_conditional.py,sha256=DMOpA_XN4l70zTc5_J9DpwlbQeUBRTWpfIJ4yRIn1-U,5791
cryptography/hazmat/bindings/openssl/binding.py,sha256=x8eocEmukO4cm7cHqfVmOoYY7CCXdoF1v1WhZQt9neo,4610
cryptography/hazmat/decrepit/__init__.py,sha256=wHCbWfaefa-fk6THSw9th9fJUsStJo7245wfFBqmduA,216
cryptography/hazmat/decrepit/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/decrepit/ciphers/__init__.py,sha256=wHCbWfaefa-fk6THSw9th9fJUsStJo7245wfFBqmduA,216
cryptography/hazmat/decrepit/ciphers/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/decrepit/ciphers/__pycache__/algorithms.cpython-312.pyc,,
cryptography/hazmat/decrepit/ciphers/algorithms.py,sha256=YrKgHS4MfwWaMmPBYRymRRlC0phwWp9ycICFezeJPGk,2595
cryptography/hazmat/primitives/__init__.py,sha256=s9oKCQ2ycFdXoERdS1imafueSkBsL9kvbyfghaauZ9Y,180
cryptography/hazmat/primitives/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/_asymmetric.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/_cipheralgorithm.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/_serialization.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/cmac.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/constant_time.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/hashes.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/hmac.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/keywrap.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/padding.cpython-312.pyc,,
cryptography/hazmat/primitives/__pycache__/poly1305.cpython-312.pyc,,
cryptography/hazmat/primitives/_asymmetric.py,sha256=RhgcouUB6HTiFDBrR1LxqkMjpUxIiNvQ1r_zJjRG6qQ,532
cryptography/hazmat/primitives/_cipheralgorithm.py,sha256=Eh3i7lwedHfi0eLSsH93PZxQKzY9I6lkK67vL4V5tOc,1522
cryptography/hazmat/primitives/_serialization.py,sha256=chgPCSF2jxI2Cr5gB-qbWXOvOfupBh4CARS0KAhv9AM,5123
cryptography/hazmat/primitives/asymmetric/__init__.py,sha256=s9oKCQ2ycFdXoERdS1imafueSkBsL9kvbyfghaauZ9Y,180
cryptography/hazmat/primitives/asymmetric/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/dh.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/dsa.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/ec.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/ed25519.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/ed448.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/padding.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/rsa.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/types.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/utils.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/x25519.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/__pycache__/x448.cpython-312.pyc,,
cryptography/hazmat/primitives/asymmetric/dh.py,sha256=0v_vEFFz5pQ1QG-FkWDyvgv7IfuVZSH5Q6LyFI5A8rg,3645
cryptography/hazmat/primitives/asymmetric/dsa.py,sha256=Ld_bbbqQFz12dObHxIkzEQzX0SWWP41RLSWkYSaKhqE,4213
cryptography/hazmat/primitives/asymmetric/ec.py,sha256=Vf5ig2PcS3PVnsb5N49Kx1uIkFBJyhg4BWXThDz5cug,12999
cryptography/hazmat/primitives/asymmetric/ed25519.py,sha256=jZW5cs472wXXV3eB0sE1b8w64gdazwwU0_MT5UOTiXs,3700
cryptography/hazmat/primitives/asymmetric/ed448.py,sha256=yAetgn2f2JYf0BO8MapGzXeThsvSMG5LmUCrxVOidAA,3729
cryptography/hazmat/primitives/asymmetric/padding.py,sha256=vQ6l6gOg9HqcbOsvHrSiJRVLdEj9L4m4HkRGYziTyFA,2854
cryptography/hazmat/primitives/asymmetric/rsa.py,sha256=ZnKOo2f34MCCOupC03Y1uR-_jiSG5IrelHEmxaME3D4,8303
cryptography/hazmat/primitives/asymmetric/types.py,sha256=LnsOJym-wmPUJ7Knu_7bCNU3kIiELCd6krOaW_JU08I,2996
cryptography/hazmat/primitives/asymmetric/utils.py,sha256=DPTs6T4F-UhwzFQTh-1fSEpQzazH2jf2xpIro3ItF4o,790
cryptography/hazmat/primitives/asymmetric/x25519.py,sha256=_4nQeZ3yJ3Lg0RpXnaqA-1yt6vbx1F-wzLcaZHwSpeE,3613
cryptography/hazmat/primitives/asymmetric/x448.py,sha256=WKBLtuVfJqiBRro654fGaQAlvsKbqbNkK7c4A_ZCdV0,3642
cryptography/hazmat/primitives/ciphers/__init__.py,sha256=eyEXmjk6_CZXaOPYDr7vAYGXr29QvzgWL2-4CSolLFs,680
cryptography/hazmat/primitives/ciphers/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/ciphers/__pycache__/aead.cpython-312.pyc,,
cryptography/hazmat/primitives/ciphers/__pycache__/algorithms.cpython-312.pyc,,
cryptography/hazmat/primitives/ciphers/__pycache__/base.cpython-312.pyc,,
cryptography/hazmat/primitives/ciphers/__pycache__/modes.cpython-312.pyc,,
cryptography/hazmat/primitives/ciphers/aead.py,sha256=Fzlyx7w8KYQakzDp1zWgJnIr62zgZrgVh1u2h4exB54,634
cryptography/hazmat/primitives/ciphers/algorithms.py,sha256=Q7ZJwcsx83Mgxv5y7r6CyJKSdsOwC-my-5A67-ma2vw,3407
cryptography/hazmat/primitives/ciphers/base.py,sha256=aBC7HHBBoixebmparVr0UlODs3VD0A7B6oz_AaRjDv8,4253
cryptography/hazmat/primitives/ciphers/modes.py,sha256=20stpwhDtbAvpH0SMf9EDHIciwmTF-JMBUOZ9bU8WiQ,8318
cryptography/hazmat/primitives/cmac.py,sha256=sz_s6H_cYnOvx-VNWdIKhRhe3Ymp8z8J0D3CBqOX3gg,338
cryptography/hazmat/primitives/constant_time.py,sha256=xdunWT0nf8OvKdcqUhhlFKayGp4_PgVJRU2W1wLSr_A,422
cryptography/hazmat/primitives/hashes.py,sha256=M8BrlKB3U6DEtHvWTV5VRjpteHv1kS3Zxm_Bsk04cr8,5184
cryptography/hazmat/primitives/hmac.py,sha256=RpB3z9z5skirCQrm7zQbtnp9pLMnAjrlTUvKqF5aDDc,423
cryptography/hazmat/primitives/kdf/__init__.py,sha256=4XibZnrYq4hh5xBjWiIXzaYW6FKx8hPbVaa_cB9zS64,750
cryptography/hazmat/primitives/kdf/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/argon2.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/concatkdf.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/hkdf.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/kbkdf.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/pbkdf2.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/scrypt.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/__pycache__/x963kdf.cpython-312.pyc,,
cryptography/hazmat/primitives/kdf/argon2.py,sha256=UFDNXG0v-rw3DqAQTB1UQAsQC2M5Ejg0k_6OCyhLKus,460
cryptography/hazmat/primitives/kdf/concatkdf.py,sha256=Ua8KoLXXnzgsrAUmHpyKymaPt8aPRP0EHEaBz7QCQ9I,3737
cryptography/hazmat/primitives/kdf/hkdf.py,sha256=M0lAEfRoc4kpp4-nwDj9yB-vNZukIOYEQrUlWsBNn9o,543
cryptography/hazmat/primitives/kdf/kbkdf.py,sha256=oZepvo4evhKkkJQWRDwaPoIbyTaFmDc5NPimxg6lfKg,9165
cryptography/hazmat/primitives/kdf/pbkdf2.py,sha256=1WIwhELR0w8ztTpTu8BrFiYWmK3hUfJq08I79TxwieE,1957
cryptography/hazmat/primitives/kdf/scrypt.py,sha256=XyWUdUUmhuI9V6TqAPOvujCSMGv1XQdg0a21IWCmO-U,590
cryptography/hazmat/primitives/kdf/x963kdf.py,sha256=zLTcF665QFvXX2f8TS7fmBZTteXpFjKahzfjjQcCJyw,1999
cryptography/hazmat/primitives/keywrap.py,sha256=XV4Pj2fqSeD-RqZVvY2cA3j5_7RwJSFygYuLfk2ujCo,5650
cryptography/hazmat/primitives/padding.py,sha256=QT-U-NvV2eQGO1wVPbDiNGNSc9keRDS-ig5cQOrLz0E,1865
cryptography/hazmat/primitives/poly1305.py,sha256=P5EPQV-RB_FJPahpg01u0Ts4S_PnAmsroxIGXbGeRRo,355
cryptography/hazmat/primitives/serialization/__init__.py,sha256=Q7uTgDlt7n3WfsMT6jYwutC6DIg_7SEeoAm1GHZ5B5E,1705
cryptography/hazmat/primitives/serialization/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/serialization/__pycache__/base.cpython-312.pyc,,
cryptography/hazmat/primitives/serialization/__pycache__/pkcs12.cpython-312.pyc,,
cryptography/hazmat/primitives/serialization/__pycache__/pkcs7.cpython-312.pyc,,
cryptography/hazmat/primitives/serialization/__pycache__/ssh.cpython-312.pyc,,
cryptography/hazmat/primitives/serialization/base.py,sha256=ikq5MJIwp_oUnjiaBco_PmQwOTYuGi-XkYUYHKy8Vo0,615
cryptography/hazmat/primitives/serialization/pkcs12.py,sha256=mS9cFNG4afzvseoc5e1MWoY2VskfL8N8Y_OFjl67luY,5104
cryptography/hazmat/primitives/serialization/pkcs7.py,sha256=5OR_Tkysxaprn4FegvJIfbep9rJ9wok6FLWvWwQ5-Mg,13943
cryptography/hazmat/primitives/serialization/ssh.py,sha256=hPV5obFznz0QhFfXFPOeQ8y6MsurA0xVMQiLnLESEs8,53700
cryptography/hazmat/primitives/twofactor/__init__.py,sha256=tmMZGB-g4IU1r7lIFqASU019zr0uPp_wEBYcwdDCKCA,258
cryptography/hazmat/primitives/twofactor/__pycache__/__init__.cpython-312.pyc,,
cryptography/hazmat/primitives/twofactor/__pycache__/hotp.cpython-312.pyc,,
cryptography/hazmat/primitives/twofactor/__pycache__/totp.cpython-312.pyc,,
cryptography/hazmat/primitives/twofactor/hotp.py,sha256=ivZo5BrcCGWLsqql4nZV0XXCjyGPi_iHfDFltGlOJwk,3256
cryptography/hazmat/primitives/twofactor/totp.py,sha256=m5LPpRL00kp4zY8gTjr55Hfz9aMlPS53kHmVkSQCmdY,1652
cryptography/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
cryptography/utils.py,sha256=bZAjFC5KVpfmF29qS_18vvpW3mKxmdiRALcusHhTTkg,4301
cryptography/x509/__init__.py,sha256=xloN0swseNx-m2WFZmCA17gOoxQWqeU82UVjEdJBePQ,8257
cryptography/x509/__pycache__/__init__.cpython-312.pyc,,
cryptography/x509/__pycache__/base.cpython-312.pyc,,
cryptography/x509/__pycache__/certificate_transparency.cpython-312.pyc,,
cryptography/x509/__pycache__/extensions.cpython-312.pyc,,
cryptography/x509/__pycache__/general_name.cpython-312.pyc,,
cryptography/x509/__pycache__/name.cpython-312.pyc,,
cryptography/x509/__pycache__/ocsp.cpython-312.pyc,,
cryptography/x509/__pycache__/oid.cpython-312.pyc,,
cryptography/x509/__pycache__/verification.cpython-312.pyc,,
cryptography/x509/base.py,sha256=OrmTw3y8B6AE_nGXQPN8x9kq-d7rDWeH13gCq6T6D6U,27997
cryptography/x509/certificate_transparency.py,sha256=JqoOIDhlwInrYMFW6IFn77WJ0viF-PB_rlZV3vs9MYc,797
cryptography/x509/extensions.py,sha256=QxYrqR6SF1qzR9ZraP8wDiIczlEVlAFuwDRVcltB6Tk,77724
cryptography/x509/general_name.py,sha256=sP_rV11Qlpsk4x3XXGJY_Mv0Q_s9dtjeLckHsjpLQoQ,7836
cryptography/x509/name.py,sha256=ty0_xf0LnHwZAdEf-d8FLO1K4hGqx_7DsD3CHwoLJiY,15101
cryptography/x509/ocsp.py,sha256=Yey6NdFV1MPjop24Mj_VenjEpg3kUaMopSWOK0AbeBs,12699
cryptography/x509/oid.py,sha256=BUzgXXGVWilkBkdKPTm9R4qElE9gAGHgdYPMZAp7PJo,931
cryptography/x509/verification.py,sha256=gR2C2c-XZQtblZhT5T5vjSKOtCb74ef2alPVmEcwFlM,958

View File

@@ -0,0 +1,5 @@
Wheel-Version: 1.0
Generator: maturin (1.9.4)
Root-Is-Purelib: false
Tag: cp311-abi3-manylinux_2_34_x86_64

View File

@@ -0,0 +1,3 @@
This software is made available under the terms of *either* of the licenses
found in LICENSE.APACHE or LICENSE.BSD. Contributions to cryptography are made
under the terms of *both* these licenses.

View File

@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
https://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,27 @@
Copyright (c) Individual contributors.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of PyCA Cryptography nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@@ -0,0 +1,17 @@
# This file is dual licensed under the terms of the Apache License, Version
# 2.0, and the BSD License. See the LICENSE file in the root of this repository
# for complete details.
from __future__ import annotations
__all__ = [
"__author__",
"__copyright__",
"__version__",
]
__version__ = "46.0.4"
__author__ = "The Python Cryptographic Authority and individual contributors"
__copyright__ = f"Copyright 2013-2025 {__author__}"

View File

@@ -0,0 +1,13 @@
# This file is dual licensed under the terms of the Apache License, Version
# 2.0, and the BSD License. See the LICENSE file in the root of this repository
# for complete details.
from __future__ import annotations
from cryptography.__about__ import __author__, __copyright__, __version__
__all__ = [
"__author__",
"__copyright__",
"__version__",
]

View File

@@ -0,0 +1,52 @@
# This file is dual licensed under the terms of the Apache License, Version
# 2.0, and the BSD License. See the LICENSE file in the root of this repository
# for complete details.
from __future__ import annotations
import typing
from cryptography.hazmat.bindings._rust import exceptions as rust_exceptions
if typing.TYPE_CHECKING:
from cryptography.hazmat.bindings._rust import openssl as rust_openssl
_Reasons = rust_exceptions._Reasons
class UnsupportedAlgorithm(Exception):
def __init__(self, message: str, reason: _Reasons | None = None) -> None:
super().__init__(message)
self._reason = reason
class AlreadyFinalized(Exception):
pass
class AlreadyUpdated(Exception):
pass
class NotYetFinalized(Exception):
pass
class InvalidTag(Exception):
pass
class InvalidSignature(Exception):
pass
class InternalError(Exception):
def __init__(
self, msg: str, err_code: list[rust_openssl.OpenSSLError]
) -> None:
super().__init__(msg)
self.err_code = err_code
class InvalidKey(Exception):
pass

Some files were not shown because too many files have changed in this diff Show More