Karpathy's "autoresearch" broke the internet
URL: https://youtu.be/qb90PPbAWz4
Duration: 24:21
Date saved: 2026-03-13
Tags: @work @project
TL;DR
Andrej Karpathy released Auto Research, an AI system that runs ML optimization experiments automatically, 24/7. It works like a "robot intern" that tests code and settings variants, measures the results, and keeps only the improvements. The concept: you give it a goal (e.g. "make this model smarter"); the AI plans experiments, edits Python code, trains on a GPU for ~5 min, reads the metrics, decides the next steps, and repeats the loop. You wake up in the morning with the best version.
Business opportunity: It is very early (early-adopter advantage). Whoever jumps in and experiments now has a big advantage over everyone else. "In the fog, when people don't understand where the opportunity is, is when there's sometimes an opportunity."
Key points
What is Auto Research?
- Loop simplu: Goal → Plan → Edit code → Train on GPU → Read metrics → Keep if better → Repeat
- Hardware needed: Nvidia GPU (tested on an H100, but others work too) or a rented cloud GPU (Lambda Labs, Vast AI, RunPod, Google Colab)
- Does not run directly on a Mac M1/M2 (only via a cloud GPU)
- Mental model: a research bot that runs experiments while you sleep, tries lots of ideas fast, and keeps the winners
How it works (detailed workflow)
- Set the goal - "improve this model's test score" or a business goal
- AI plans experiment - decides what to test
- Edits code & settings - modifies Python code, hyperparameters
- Runs short training (~5 min on GPU)
- Reads metrics - evaluates the result
- Decision: Better? → Save config. Not better? → Log & discard
- Repeat - plans another experiment
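The keep-if-better loop above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the actual autoresearch code: `propose` stands in for the agent's plan-and-edit step (here just a random nudge to one setting), `evaluate` stands in for the ~5 min training run (here a made-up loss where lower is better), and the names `LoopState`, `propose`, `evaluate`, and `run_loop` are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LoopState:
    best_config: dict
    best_score: float
    log: list = field(default_factory=list)  # every experiment, kept or not


def propose(config: dict, rng: random.Random) -> dict:
    """Hypothetical plan + edit step: randomly rescale one setting."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] = candidate[key] * rng.uniform(0.5, 1.5)
    return candidate


def evaluate(config: dict) -> float:
    """Stand-in for a short training run; lower is better (made-up loss)."""
    return (config["lr"] - 0.003) ** 2 + (config["dropout"] - 0.1) ** 2


def run_loop(start: dict, steps: int, seed: int = 0) -> LoopState:
    rng = random.Random(seed)
    state = LoopState(best_config=start, best_score=evaluate(start))
    for step in range(steps):
        candidate = propose(state.best_config, rng)
        score = evaluate(candidate)
        kept = score < state.best_score  # "Better?" decision
        # Log every experiment, including the discarded ones.
        state.log.append({"step": step, "config": candidate, "score": score, "kept": kept})
        if kept:
            state.best_config, state.best_score = candidate, score
    return state


if __name__ == "__main__":
    result = run_loop({"lr": 0.01, "dropout": 0.5}, steps=50)
    print(result.best_config, result.best_score)
```

In the real system the proposal step is an LLM editing training code, so the interesting part is the bookkeeping: one metric, one comparison, and a log of everything that was tried.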
Two versions of the use case:
- ML/Code: optimizing models, hyperparameters, architectures
- Business: Marketing, pricing, conversion optimization, competitor research
10 Business Ideas (aplicații practice)
1. Niche Agent in a Box
- Package tiny auto-research loops for one specific, painful niche
- Examples: Amazon listing optimizer, email sequence tuner for realtors, pricing optimizer for SaaS
- Model: Monthly subscription - "This thing runs experiments 24/7, shows you the winner to click accept"
2. A/B Testing for Marketing (CRO 2.0)
- Landing pages: the agent writes variants (headlines, layouts, offers), pushes them to traffic, and measures conversions
- Ads: Auto-test creatives, angles, audiences - keep combos that lower CAC or raise ROAS
- Model: $5k/month retainer - "Always-on experiment engine"
- Basically the future of Optimizely
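A minimal sketch of the "keep combos that lower CAC or raise ROAS" decision, using the standard definitions (CAC = spend / conversions, ROAS = revenue / spend). The `Variant` class, the `pick_winner` helper, and the `min_visitors` threshold are hypothetical names chosen for illustration; a production setup would also want a proper significance test before declaring a winner.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    visitors: int
    conversions: int
    spend: float    # ad spend attributed to this variant
    revenue: float  # revenue attributed to this variant

    @property
    def conversion_rate(self) -> float:
        return self.conversions / self.visitors if self.visitors else 0.0

    @property
    def cac(self) -> float:
        # Customer acquisition cost: spend per conversion.
        return self.spend / self.conversions if self.conversions else float("inf")

    @property
    def roas(self) -> float:
        # Return on ad spend: revenue per unit of spend.
        return self.revenue / self.spend if self.spend else 0.0


def pick_winner(variants, min_visitors=1000):
    """Return the variant with the highest ROAS among those with enough traffic."""
    eligible = [v for v in variants if v.visitors >= min_visitors]
    if not eligible:
        return None  # not enough data yet; keep the experiment running
    return max(eligible, key=lambda v: v.roas)


if __name__ == "__main__":
    variants = [
        Variant("headline-a", visitors=2000, conversions=40, spend=400.0, revenue=1200.0),
        Variant("headline-b", visitors=2000, conversions=50, spend=400.0, revenue=1000.0),
        Variant("headline-c", visitors=100, conversions=20, spend=50.0, revenue=500.0),
    ]
    print(pick_winner(variants).name)
```

The traffic threshold is the important design choice here: without it, a variant with a handful of lucky visitors (like `headline-c`) would win every time.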
3. Research as a Service
- Loop for: search → read → summarize → compare → repeat
- Use cases:
- Market & competitor research (pricing, features, gaps)
- Investor/M&A due diligence
- Compliance & regulation tracking (crypto, healthcare, finance)
- Model: Per-report fee or monthly subscription for "always fresh dashboards"
4. Power Tool Inside Your Product
- Embed an auto-research agent in an existing SaaS product
- Big "OPTIMIZE" button - user presses, system runs mini research loop
- Examples: tune prompts, pick best pricing, rank suppliers
- Model: Upsell to Pro/Enterprise tiers, or a wedge for conversion
5. Agency: "We Run More Tests Than Anyone"
- Pitch simplu: "We do 100x more testing than other shops for same/lower fee"
- Niches: Shopify conversion lab, B2B SaaS pricing experiments, email optimization
- Model: Monthly retainer + bonus/revshare for hitting specific KPI lifts
6. AutoQuant for Trading
- Run small fast backtests of simple trading rules
- LLM-based factor screens, sentiment filters - overnight on one GPU
- Model: Trade on your own account OR sell signals/strategy reports as a digital product
- ⚠️ Risk: "People will get burned by blindly trusting auto-research - need human in the loop"
7. Always-On Lead Qualification & Follow-up
- Point the agent at the CRM (Salesforce etc.) and inbound leads
- Test rules/messages → grade leads → suggest next actions → draft follow-up
- Salespeople focus only on high-value deals
- Result: More revenue per hour spent
8. Finance Ops Autopilot
- Invoice matching, expense report generation, exception detection
- Continuous improvements to rules & prompts
- Pitch: "We cut your AP/expense time in half"
- Model: Software OR an ops service with a small team + agent
- Exit potential: acquisition by a large fintech or a bank
9. Internal Productivity Lab
- Treat the company like Karpathy's GPU lab
- Define KPIs: response time, close rate, ticket resolution
- Let agents iterate on workflows, templates, routing rules
- Result: Fewer meetings, less grunt work, touch only high-impact decisions
- Higher productivity → higher profit
10. Done-for-You Research/Due Diligence Shop
- Chew through docs, filings, product pages, reviews
- Keep evolving a "living memo" for clients (investors, acquirers, execs)
- Model: Fast structured briefs + monthly update packs (not one-off manual research)
- The speaker: "I would pay for this"
Beyond Business - Impact on Medicine & Science
Morgan Linton idea (medicine):
- Clinical trial design = hyperparameter search
- Current: tens of millions $ minimum
- Future with auto-research: an agent swarm optimizes treatment protocols on small proxy experiments
- Promote the most promising → move to humans to review (humans stay in the loop, but later; experimentation gets deeper, faster, cheaper)
- Impact: Disease treatment, human health
AgentHub - What's Next?
Karpathy also released AgentHub - "GitHub for agents" (GitHub is for humans)
- Concept: Agent swarm collaboration platform
- No main branch, no PRs, no merges - sprawling DAG of commits in every direction
- Message board for agents to coordinate
- First use case: auto-research, but much more general
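To make the "DAG of commits with no main branch" idea concrete, here is a small data-structure sketch. It is not AgentHub's actual design, which these notes do not describe in detail; `CommitDag` and its methods are hypothetical, and allowing more than one parent per commit is an assumption. With no main branch, the natural "current state" is the set of leaf commits: every tip that some agent could still build on.

```python
from dataclasses import dataclass, field


@dataclass
class CommitDag:
    # Maps commit id -> tuple of parent ids (empty tuple for a root commit).
    parents: dict = field(default_factory=dict)

    def add(self, commit_id: str, parents=()) -> None:
        """Add a commit; parents must already exist, which keeps the graph acyclic."""
        if commit_id in self.parents:
            raise ValueError(f"duplicate commit: {commit_id}")
        unknown = [p for p in parents if p not in self.parents]
        if unknown:
            raise KeyError(f"unknown parent(s): {unknown}")
        self.parents[commit_id] = tuple(parents)

    def leaves(self) -> list:
        """Commits with no children: the frontier agents can extend."""
        has_child = {p for ps in self.parents.values() for p in ps}
        return sorted(c for c in self.parents if c not in has_child)

    def ancestors(self, commit_id: str) -> set:
        """All commits reachable by following parent links from commit_id."""
        seen = set()
        stack = list(self.parents[commit_id])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen


if __name__ == "__main__":
    dag = CommitDag()
    dag.add("root")
    dag.add("agent1-try-lr", parents=["root"])
    dag.add("agent2-try-dropout", parents=["root"])
    dag.add("agent1-try-lr-v2", parents=["agent1-try-lr"])
    print(dag.leaves())
```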
Getting Started - Practical Steps
- Hardware: Need an Nvidia GPU (H100 is best, but any Nvidia GPU works)
- No GPU? Rent cloud: Lambda Labs, Vast AI, RunPod, Google Colab
- Mac M1? Does not work directly - use the cloud
- Setup with Claude Code:
- Give it the GitHub repo link (github.com/karpathy/autoresearch - 25k stars already!)
- Ask: "I need help installing auto research by Karpathy"
- Claude Code guides you through: clone repo, install the uv package manager, dependencies, prepare data, run experiment
- Google Colab (easiest path):
- Go to colab.google.com
- Create new notebook
- Change runtime → T4 GPU
- Paste commands Claude Code gives you
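Before pasting any setup commands, it can help to confirm that the machine (or Colab runtime) actually exposes an Nvidia GPU. The sketch below is a generic check, not part of the autoresearch repo; the helper name `nvidia_gpu_names` is hypothetical. It shells out to `nvidia-smi`, which ships with the Nvidia driver, and returns an empty list when no GPU is visible.

```python
import shutil
import subprocess


def nvidia_gpu_names() -> list:
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed (e.g. a Mac or a CPU-only runtime)
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []  # tool present but no usable GPU or driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = nvidia_gpu_names()
    print(gpus if gpus else "No Nvidia GPU visible - use a cloud GPU instead")
```

On a Colab notebook with the T4 runtime selected, the list should contain one entry; an empty list usually means the runtime type was not changed.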
Strategy & Mindset
- "It's very early" - most people don't realize the breakthrough Karpathy is making
- 80/20 thinking: In the fog, opportunity is hidden - when you see legends like Karpathy doing things, PAY ATTENTION
- Tinkering advantage: "You want to tinker with it, have fun, see what it's all about"
- Speed to market: Build fast, learn fast, first mover advantage
Key Technical Details
- Tools needed: uv package manager, Git, Python
- Training time: ~5 min per experiment on GPU
- Loop philosophy: Plan → Act → Read results → Update plan → Repeat
- Output: Logs everything (charts, metrics) + written summary in plain language
Key quotes
"It's like having a super nerd robot intern that runs science experiments on AI models for you all night without you doing the boring stuff."
"If you've seen my video on the Ralph loop where it basically would do engineering 24/7 and you'd wake up to new stuff happening - in simplest terms, that's what auto research is helping you do."
"We do 100 times more testing than other shops for the same or lower fee."
"I always find that in the fog, when people don't really understand where the opportunity is, is when there's sometimes an opportunity."
"One thing I've just learned in my career is just like when I see people like Karpathy doing things like this, you want to pay attention. You want to tinker with it. You want to have some fun with it, and you want to see what it's all about."
"Auto research works even better for optimizing any piece of software. Make an auto folder, add a program.md which is really the foundation of how you're going to be using auto research and a bench script, make a branch and let it rip." - Tobi Lütke (Shopify CEO)
"I woke up this morning and all I can think about is auto research. So many ideas swirling around in my head. Not sure 99% of the world realized the incredible breakthroughs Karpathy is making and just sharing casually on X right now." - Morgan Linton
"Clinical trial design is itself kind of like a hyperparameter search. It feels like an agent swarm could optimize treatment protocols on small proxy experiments, promote the most promising candidates and then move to humans to review." - Morgan Linton
"For me, while I'm not a doctor, what I'm the most excited about when it comes to AI is the impact it will have on human health and critical areas like disease treatment."
"I'm watching him speedrun a $1 billion company." - about Karpathy
"This is another solo podcast that I'm doing on the Startup Ideas podcast. The last time I did this last week, I had a lot of comments that said, 'Yeah, Greg, I actually really like when you just come in solo and just start telling us what's on your mind and stuff like that in real time.'"
Connections to Marius's projects/context
1. Ralph Loop Connection
- The speaker explicitly mentions the Ralph loop as the analog for engineering
- Auto-research = Ralph for ML/research instead of coding
- Potential: combine Ralph + auto-research for automated code + ML optimization?
2. ROA Business Applications
Direct opportunities for Marius/ROA:
A. Internal Optimization
- Finance ops autopilot (idea #8): a perfect fit for automating invoice matching and expense reports in ROA
- Auto-research loop for optimizing Oracle queries, performance tuning
- Test different FastAPI endpoint architectures for roa2web
B. Product Features for Clients
- "Optimize" button in roa2web (idea #4):
- Pricing optimizer for client products
- Cash flow forecasting optimization
- Inventory level optimization
- Sell as a premium feature → upsell to existing clients (80/20 mindset - more value to existing clients)
C. New Product Ideas
- Research as a Service for compliance (idea #3):
- ANAF regulation tracking (already doing it manually!)
- Auto-monitor new declarations (D406, D394, etc.)
- Living memo for clients about tax changes
- Monthly subscription model
- Lead qualification for sales (idea #7):
- Auto-grade inbound leads for ROA
- Draft follow-up emails
- Focus only on high-value prospects
3. Agency Model (idea #5)
- Marius prefers "more work from existing clients at good price"
- Pitch perfect: "We run 100x more experiments on YOUR business (pricing, inventory, forecasting) - same retainer fee"
- Clear differentiation from the competition - nobody else does this
4. Technical Setup
- Marius already has Proxmox infrastructure with GPU potential (check the LXC 104 Ollama setup)
- Can run auto-research locally or rent a cloud GPU when needed
- Already comfortable with Python, FastAPI, automation scripts
5. Learning & Tinkering
Perfectly aligned with Marius's mindset:
- "When you see legends like Karpathy doing things, pay attention"
- Early adopter advantage (foggy opportunity)
- Tinkering = learning = outperform 99.9% of people
- 80/20: minimum effort (rent a GPU when needed), maximum learning
6. Avoided Complexity Trap
⚠️ Mind the anti-complexity rule:
- Do not propose "building AgentHub from scratch"
- Do not propose "doing all 10 ideas at once"
- Focus: Pick ONE idea with the highest impact/lowest effort
- Probably idea #3 (ANAF compliance tracking) or #4 (optimize button in roa2web)
Recommendation for Marius
✅ RECOMMEND TINKERING (not a full project):
Why:
- Very early stage - learning curve now = competitive advantage later
- Low initial investment - Google Colab free tier for experimenting
- Direct ROA applications - compliance tracking, pricing optimization
- Aligned with 80/20 - rent a GPU when needed, don't build infrastructure
Action steps (minimal):
- Set aside 1-2 hours (ideally Monday-Thursday, 15:00-16:00, when things are quieter) to run the tutorial with Claude Code
- Test on a simple ROA use case: product pricing optimization or testing email marketing variants
- If it works → explore idea #3 (ANAF tracking automation)
- If it does not work or is too complex → skip; at least you learned early
NOT recommended:
- Building custom AgentHub
- All 10 business ideas at once
- Heavy infrastructure investment without a proof of concept
Timeline:
- Experimentation: now (March 2026)
- Decision point: after 2-4 hours of tinkering
- Potential integration into ROA: April-May if the proof of concept is solid
Source: Startup Ideas Podcast (Greg, solo episode)
Context: Andrej Karpathy launched auto-research + AgentHub - going viral on X
Reception: 25k GitHub stars in a few days, Tobi Lütke (Shopify CEO) endorsement