Echo c9a376ab11 Update dashboard, memory, root (+8 ~6)

2026-04-02 19:42:36 +00:00

Claude Mythos Changes Everything. Your AI Stack Isn't Ready.

URL: https://youtu.be/hV5_XSEBZNg
Duration: 31:20
Saved on: 2026-04-01
Tags: @work @growth #ai #claude #workflow #prompt-engineering

📋 TL;DR

Claude Mythos (Capybara), the first model trained on Nvidia GB300, will be released in 1-2 months and will fundamentally change how we build with AI. The central message: simplification is the key. More powerful models mean fewer procedural instructions, fewer complex scaffolds, and more trust in the model's ability to understand the desired outcome. We need to prepare NOW for this shift, or we will fall behind.

🎯 Key Points

What is Mythos?

The first model trained on Nvidia GB300 chips
A new lineage: Capybara (no longer Sonnet/Opus)
The largest and most powerful model in the world (according to Anthropic)
Example of impact: it found zero-day vulnerabilities in Ghost (50k GitHub stars) that security researchers had never found

4 Pre-Mythos Audit Areas

1️⃣ Prompt Scaffolding

Key question: "Is this instruction here because the model NEEDS it, or because I wanted the model to need it?"
Anthropic: "Consider adding complexity only when it demonstrably improves outcomes"
OpenAI Codex: "Just tell it what you need without writing long instructions"
Example: a 3,000-token customer support prompt, half of it procedural → 30-50% of it may be removable once Mythos arrives
Rule: ask for WHAT + WHY, not HOW
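
The audit question above can be made slightly more mechanical. This is a minimal sketch, assuming a crude heuristic in which a line counts as procedural if it opens with a step marker ("1.", "Step 2", "First", "Then", and so on). The marker list and the names `procedural_share` and `outcome_prompt` are illustrative assumptions, not something from the video.

```python
import re

# Hypothetical heuristic: a line is "procedural" if it opens with a step marker.
STEP_MARKER = re.compile(
    r"^\s*(?:step\s+\d+\b|\d+[.)]\s|(?:first|then|next|finally)\b)",
    re.IGNORECASE,
)


def procedural_share(prompt: str) -> float:
    """Return the fraction of non-empty prompt lines that look like HOW steps."""
    lines = [ln for ln in prompt.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    procedural = sum(1 for ln in lines if STEP_MARKER.match(ln))
    return procedural / len(lines)


def outcome_prompt(what: str, why: str) -> str:
    """Build a WHAT + WHY prompt that deliberately says nothing about HOW."""
    return f"Goal: {what}\nWhy it matters: {why}"


support_prompt = (
    "You are a support agent.\n"
    "1. Greet the customer.\n"
    "2. Ask for the order ID.\n"
    "Then check the refund policy.\n"
    "Keep the customer informed."
)
print(f"procedural share: {procedural_share(support_prompt):.0%}")
```

A high share does not prove the lines are unnecessary; it only flags where to ask the key question first.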

2️⃣ Retrieval & Memory Architecture

Less retrieval logic on your side, more trust in the model
RAG is not dead, but the way of thinking about it is changing
The new pattern: present a well-organized repo → tell the model to look up what it needs
The model gets better at filling its own context window efficiently
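
One way to picture the "well-organized repo" pattern is as two small tools handed to the model instead of a custom retrieval pipeline. This is a sketch under assumptions: the function names `list_files` and `read_file`, the `max_chars` cap, and the idea of exposing them as agent tools are illustrative and not taken from the video.

```python
from pathlib import Path


def list_files(root: str) -> list[str]:
    """Return every file under root as a sorted, root-relative POSIX path."""
    base = Path(root)
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )


def read_file(root: str, relative_path: str, max_chars: int = 4000) -> str:
    """Return up to max_chars of one file, refusing paths outside root."""
    base = Path(root).resolve()
    target = (base / relative_path).resolve()
    # Refuse paths that escape the root (for example "../secrets.txt").
    if not target.is_relative_to(base):
        raise ValueError(f"{relative_path!r} is outside {root!r}")
    return target.read_text(encoding="utf-8")[:max_chars]
```

With tools like these, the model chooses which files to open; the human contribution shifts to keeping names and folder layout self-explanatory.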

3️⃣ Domain Knowledge Hardcoding

Question: which business rules did I write because the model could NOT infer them, and which can it infer now?
Personal example: a 10-line research prompt → one line → better results (the detailed prompt was LIMITING the model)
House style for reports → can be inferred from an example and no longer needs to be specified
The art of prompting is evolving: from "what you put in the prompt" → "what you leave out"

4️⃣ Verification & Evals

We are moving from 85% correct → 99% correct
Recommendation: A SINGLE eval gate at the end that checks EVERYTHING
Stop spending time on intermediate evals - simplify the pipeline
For non-tech: keep HIGH standards - don't accept "99% is fine" if that 1% matters
For tech: comprehensive automated evals - the human becomes the bottleneck at volume
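
The single final gate can be sketched as one function that runs every check against the finished output and reports all failures at once. The check names and the `eval_gate` signature below are hypothetical examples; real checks would be whatever "everything" means for the pipeline in question.

```python
from typing import Callable

Check = Callable[[str], bool]


def eval_gate(output: str, checks: dict[str, Check]) -> tuple[bool, list[str]]:
    """Run every check on the final output; return (passed, failed check names)."""
    failed = [name for name, check in checks.items() if not check(output)]
    return (not failed, failed)


# Hypothetical checks for a customer support reply.
checks: dict[str, Check] = {
    "non_empty": lambda out: bool(out.strip()),
    "mentions_refund": lambda out: "refund" in out.lower(),
    "under_500_chars": lambda out: len(out) <= 500,
}

passed, failed = eval_gate("Your refund was approved.", checks)
print(passed, failed)
```

Collecting every failure instead of stopping at the first one gives whoever (or whatever) reruns the work a complete list to fix in a single pass.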

Practical Implications

Cost & Access:

Very expensive models (probably only on the Max plan at first - $200/month)
You need to decide: do I invest in the "cutting edge curve" or stay one step behind?
ROI: if you have access to Mythos, make the most of it - it may offset the $200 by finding savings in subscriptions

General Simplification:

Expensive models → use tokens efficiently
Don't clutter them with human-described process
Let the model decide the order of tool calls, what to put in context, etc.

Multi-Agent Patterns:

Mythos becomes the planner - you provide the outcome spec + evals + tool suite
Mythos instantiates agents, measures progress, and verifies with evals
Pattern: a separate agent runs the eval (not the one that did the work)
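
The separation between the agent that does the work and the agent that evaluates it can be sketched as a small loop. Everything here is an assumption made for illustration: `worker` and `evaluator` are plain callables standing in for model-backed agents, and the stubs only exist to make the sketch runnable.

```python
from typing import Callable

Worker = Callable[[str, list[str]], str]      # (outcome spec, feedback) -> draft
Evaluator = Callable[[str, str], list[str]]   # (outcome spec, draft) -> problems


def run_until_accepted(
    outcome_spec: str,
    worker: Worker,
    evaluator: Evaluator,
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Loop worker -> evaluator until the evaluator reports no problems."""
    feedback: list[str] = []
    for attempt in range(1, max_attempts + 1):
        draft = worker(outcome_spec, feedback)
        # The evaluator is a different agent from the one that produced the draft.
        feedback = evaluator(outcome_spec, draft)
        if not feedback:
            return draft, attempt
    raise RuntimeError(f"no accepted draft after {max_attempts} attempts: {feedback}")


# Stub agents (hypothetical) so the loop can be exercised without a model.
def stub_worker(spec: str, feedback: list[str]) -> str:
    return "summary with sources" if feedback else "summary"


def stub_evaluator(spec: str, draft: str) -> list[str]:
    return [] if "sources" in draft else ["cite sources"]


print(run_until_accepted("Summarize the report and cite sources", stub_worker, stub_evaluator))
```

Note that the loop only carries the outcome spec and the evaluator's feedback; it never tells the worker how to get there.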

Non-Tech Work:

"Under the desk software" is becoming increasingly sophisticated
You build useful applications without touching code - just by specifying intent
Examples: family calendar, team tools, complex workflows

Scaling Law & Bitter Lesson

"The bitter lesson we have to learn: all the way we have described process, the things that are precious to us are things that are associated with our ability to execute work in a certain series of steps and somehow we've decided that's an important reflection of our work identity. What Claude Mythos and similar models are going to teach us is that that doesn't matter anymore and what matters is the outcome and our ability to name the outcome and let go of the process."

Simplicity works better as models become more powerful
Our "special" contributions (scaffolds, custom RAG, system prompts) → become irrelevant
New skill for 2026: the ability to anticipate how a smarter model changes the workflow AND to adapt

💬 Key Quotes

"Is this instruction here because the model needs it or is it here because I needed the model to need it?"

"Ask for what you want in the end and explain why in plain language. And you don't need to elaborate on how to get there."

"Increasingly across 2026, this is the bitter lesson we have to learn. All of the way we have described process [...] doesn't matter anymore and what matters is the outcome."

"The art of prompting is evolving [...] the skill is evolving because the models are getting better. Increasingly the art of prompting is about what you leave out."

"We are moving toward a point where we want one eval gate at the end of the software process and it needs to check absolutely everything."

"Human talent will not [make up for it]. Like increasingly the whole point of human talent is to simplify and get out of the way so that AI can do its thing."

"Claude Mythos is coming. The inflection point is here. This is another one of those moments when you need to be able to catch the train before it leaves the station."

"When models get bigger, they force you to simplify."

💡 Ideas & Applications for Marius

Immediate (Before Mythos)

Audit existing prompts:
- Ralph PRD generator - can I simplify the instructions?
- Cron jobs (reports, coaching) - can I remove procedural lines?
- System prompts for Echo - what is hardcoded unnecessarily?
Retrieval preparation:
- Better organization of memory/kb/
- Let the model look up what it needs vs. specifying the RAG logic myself
Workflow simplification:
- Where do I specify too much HOW instead of WHAT + WHY?
- Example: receipt processing, email workflows

Post-Mythos (once it launches)

Security test: run Mythos against the ROA repos to look for vulnerabilities
Upgrade Ralph: clearer outcome specs, less procedural
Re-evaluate the Max plan: is $200/month worth it for this level of capability?

Strategic

Skill to cultivate: anticipating how better models change the workflow + adapting quickly
Mindset shift: outcome + evals, not described process
Tool definitions: invest in clear tool descriptions for agents

📚 Source

Video produced by a creator focused on AI/productivity. Tone: urgent but practical, with concrete examples from customer support, software dev, and knowledge work.

Credibility: concrete references (Ghost repo vulnerability, stock market drops for cybersecurity firms, confirmations from Anthropic).

Applicability: high - an actionable checklist for preparation.
