Claude Code's New Agent Teams Are Insane (Opus 4.6)

URL: https://youtu.be/VWngYUC63po
Duration: 13:54
Saved on: 2026-02-07
Tags: @work @growth

TL;DR

Comparative experiment: the same prompt (a task manager app) run by 1) a single agent and 2) an agent team, both on Opus 4.6. Result: the agent team was faster in the initial phase (4:50 vs 6:55) but needed rework to fix bugs, so the final time was similar. The single agent produced a more fun UI (emojis); the agent team produced more advanced features (board view, settings panel, export/import) without being asked for them explicitly.

Key insights

Anatomy of Agent Teams vs Sub-agents

Sub-agents (old):

Receive a specific prompt → execute → return ONLY the results
Run in the same Claude Code session/instance
Lower token consumption
Transactional (one-shot) communication
Best for: simple, focused tasks

Agent Teams (new):

Team Lead distributes tasks to specialized teammates
Each teammate = a separate Claude Code instance
Higher token consumption
Direct communication between agents (teammate A ↔ teammate B)
You can talk to any agent directly (Shift + Up/Down to toggle)
Best for: complex problems that need collaboration

How Agent Teams work

Team Lead receives the task → creates a plan → spawns specialized teammates
Teammates run in parallel, each with a distinct scope (the lead can also stage them: in the video the JS dev was spawned only after the UI builder finished)
Cross-communication: if teammate A gets stuck, it can ask teammate B for help
Human-in-the-loop: you can toggle between agents during the build and adjust their scope
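
The lead → plan → teammates flow above can be pictured with a small Python sketch. This is only a conceptual model, not how Claude Code is implemented: the `Task` type, the role names and the `run_teammate` stand-in are all hypothetical, and the "teammates" here are plain threads rather than separate Claude Code instances.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Task:
    role: str   # hypothetical teammate role, e.g. "ui-builder"
    scope: str  # the distinct scope this teammate owns


def plan(request: str) -> list[Task]:
    """Team lead step: split the request into distinct scopes (hard-coded here)."""
    return [
        Task("ui-builder", f"HTML and CSS for: {request}"),
        Task("js-dev", f"application logic for: {request}"),
    ]


def run_teammate(task: Task) -> str:
    """Stand-in for a separate instance working on exactly one scope."""
    return f"[{task.role}] done: {task.scope}"


def run_team(request: str) -> list[str]:
    tasks = plan(request)
    # Each teammate works on its own scope; map() returns results in plan order.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(run_teammate, tasks))


if __name__ == "__main__":
    for report in run_team("basic task manager"):
        print(report)
```

The sketch leaves out the two things that make agent teams distinctive in the notes above: teammates messaging each other directly, and a human switching between them mid-build.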

Use cases recommended by Anthropic

Agent teams > single agent when:

Code review: each agent focuses on a different area (security, performance, readability)
Complex issues: each agent explores a different explanation instead of latching on to a single theme

Single agent is enough when:

The task is focused and clearly delimited
It does not need multiple perspectives
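
Since the video stresses that a team has to be requested explicitly in the prompt ("create an agent team to ..."), a tiny helper can make the per-area split concrete. The function name and the exact wording of the prompt are assumptions for illustration; only the "create an agent team to" opening comes from the video.

```python
def build_team_prompt(goal: str, focus_areas: list[str]) -> str:
    """Compose a prompt that explicitly asks for an agent team, one teammate per area."""
    if not focus_areas:
        raise ValueError("at least one focus area is required")
    lines = [
        f"Create an agent team to {goal}.",
        "Assign one teammate to each of these areas:",
    ]
    lines.extend(f"- {area}" for area in focus_areas)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_team_prompt(
        "review this codebase",
        ["security", "performance", "readability"],
    ))
```

Keeping the areas as an explicit list is the point: it stops a review from collapsing onto whichever theme the first agent happens to notice.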

Experiment: Task Manager App

Identical prompt for both:

Build a basic task manager as single page web app

Single agent (Opus 4.6):

Time: 6:55
Features: task create/edit/delete, priority, projects, subtasks, day/night mode
UI: emojis, more fun, more visually approachable
Bugs: none observed (worked out of the box)

Agent team (UI builder + JS dev):

Initial time: 4:50 (faster)
Time with rework: ~6:20 (4:50 plus about 1:30 of rework; similar overall)
Features: everything the single agent built + board view (kanban drag & drop) + settings panel (export/import JSON, clear data)
UI: more polished, more professional
Bugs: settings and board view did not work → needed rework
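
The timing comparison is simple enough to check by hand, but a short sketch makes it explicit. It uses the figures from the video: 6:55 for the single agent, 4:50 for the team's first pass, and the presenter's round "one and a half minutes" for the rework (so the rework value is approximate).

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: int) -> str:
    """Convert seconds back to 'M:SS'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


single_agent = to_seconds("6:55")
team_initial = to_seconds("4:50")
team_rework = to_seconds("1:30")  # approximate, "one and a half minutes" in the video
team_total = team_initial + team_rework

print("team head start before rework:", to_mmss(single_agent - team_initial))  # 2:05
print("team total with rework:", to_mmss(team_total))                          # 6:20
print("remaining gap:", to_mmss(single_agent - team_total))                    # 0:35
```

With these numbers the team's two-minute head start shrinks to roughly half a minute, which is why the builds are best described as like for like.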

Practical conclusions

When to use agent teams:

Complex projects with multiple components (frontend + backend + database)
When you want advanced features without asking for them explicitly (the agent team "thinks more deeply")
Code review with multiple perspectives

When to use a single agent:

Quick prototypes, simple apps
When the UI matters more than advanced features
When you want something that works without rework

Trade-off:

Agent team → more features, more polished, but higher risk of initial bugs
Single agent → functional on the first pass, fewer features but more stable

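Technical details

Agent teams setup: add the environment variable to ~/.claude/settings.json, then restart Claude Code (a team cannot be started in the session that made the change):

{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}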
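
In the video the presenter asks Claude Code to edit the settings file itself. If you prefer to script the change, the sketch below merges one environment variable into a JSON settings file without touching other keys. Two assumptions to verify against the current Claude Code documentation: that environment variables live under an `env` object in `settings.json`, and that the variable is named `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` (the video only says "claude code experimental"). The demo writes to a temporary directory, not to your real settings.

```python
import json
import tempfile
from pathlib import Path

# Assumed variable name; check the Claude Code docs before relying on it.
ASSUMED_FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_flag(settings_path: Path, name: str = ASSUMED_FLAG, value: str = "1") -> dict:
    """Set env[name] = value in a JSON settings file, preserving existing keys."""
    settings: dict = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)
    settings.setdefault("env", {})[name] = value
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


if __name__ == "__main__":
    # Demo on a throwaway file so nothing in the home directory is modified.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = Path(tmp) / "settings.json"
        demo_path.write_text('{"model": "opus"}', encoding="utf-8")
        print(json.dumps(enable_flag(demo_path), indent=2))
```

To apply it for real, call `enable_flag(Path.home() / ".claude" / "settings.json")` and then restart Claude Code.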

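Available models (listed under /model):

Opus 4.6 (standard)
Opus 4.6 1M context (for long sessions and complex tasks)

Reasoning effort (toggled with the left/right arrow keys in the model picker):

Low (the setting used in this experiment)
Mid
High (described in the video as using an unrestricted number of tokens; best for complex builds)

Navigating between agents:

Shift + Up/Down to toggle between the team lead and teammates
You can give direct instructions to each agent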

Interesting observation

The agent team added features nobody asked for (export/import JSON, settings panel), which suggests it reasons more proactively and in more depth about what a real task manager should contain.

Applicability for Echo

What we learn

Useful pattern: more clearly defined specialized roles

Instead of a generic sub-agent for "process YouTube"
It could be: Transcript Extractor + Insight Analyzer + KB Organizer
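
To make the three-role split concrete, here is a minimal sketch of such a pipeline. Everything in it is hypothetical: the `Note` type, the function names and the keyword matching are placeholders for whatever Echo's sub-agents actually do (in practice each step would be a model call, not string processing).

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    url: str
    transcript: str = ""
    insights: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)


def transcript_extractor(note: Note, raw: str) -> Note:
    """Role 1: normalise whitespace in the raw transcript."""
    note.transcript = " ".join(raw.split())
    return note


def insight_analyzer(note: Note, keywords: list[str]) -> Note:
    """Role 2: keep sentences mentioning any keyword (placeholder for an LLM call)."""
    sentences = [s.strip() for s in note.transcript.split(".") if s.strip()]
    note.insights = [
        s for s in sentences if any(k.lower() in s.lower() for k in keywords)
    ]
    return note


def kb_organizer(note: Note, tag_rules: dict[str, str]) -> Note:
    """Role 3: attach a tag when its trigger word appears in the insights."""
    text = " ".join(note.insights).lower()
    note.tags = sorted(tag for trigger, tag in tag_rules.items() if trigger in text)
    return note


if __name__ == "__main__":
    note = Note(url="https://youtu.be/VWngYUC63po")
    raw = "Agent teams run in parallel.  Sub-agents are cheaper. Thanks for watching."
    transcript_extractor(note, raw)
    insight_analyzer(note, ["agent"])
    kb_organizer(note, {"parallel": "@work", "cheaper": "@growth"})
    print(note.insights)
    print(note.tags)
```

Each role takes and returns the same `Note`, so the steps can be run by one sub-agent in sequence today and handed to separate teammates later without changing the interface.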

Conscious trade-off:

Agent teams = more tokens, more features, risk of bugs
Single agent = fewer tokens, no rework, more stable
Echo uses sub-agents (lighter) for diverse tasks → the right choice for our use case

When agent teams would be worth it for Echo:

If it took on large coding projects (e.g. a new dashboard feature)
Code review for existing scripts
Complex security audits

When we stay with sub-agents:

Email processing, insights extraction, reports → single focused task
Simple coordination between tasks (morning report, evening report)

Full transcript

[... full transcript ...]

Hello legends. In this video, we're going to be diving into the new Claude Opus 4.6 and we'll be explicitly exploring the new agent teams. So, I'm going to be running an experiment in this video where I'm going to be spooling up a Claude Code instance using the Opus 4.6 model, getting one agent to build out a certain project for us, and then I'm going to be spooling up another Claude Code instance, enabling the agent teams mode, then giving the exact same prompt, and then comparing the results.

So I want to quickly touch on the anatomy of an agent team versus what we previously had, which was a sub-agent. So sub-agents work by receiving a very specific prompt. Then that prompt is actioned and completed and then only the results of the completion are shared back with the main context window. Whereas agent teams are a little bit more sophisticated. So when we enable the agent team, we actually first pull up a team lead that manages a number of different prompts depending what kind of tasks we need to complete. The team leader will then distribute those prompts amongst very specialized teammates and all of these teammates actually spool up their own instance of Claude Code. So this is important to note because sub-agents actually work within the one instance or the one session, but then each of these guys actually runs in their own isolated and dedicated session. So with the sub-agent there is lower token consumption and, technically speaking, a lower quality of output. They're meant to be used for like simpler tasks. But when you're using an agent team, because they have their own instance, they can actually use a lot more tokens and they can run more independently.

Now the interesting thing is that all these guys can actually speak between each other. So if teammate A is working on the front end, teammate B is working on the back end.
If teammate A gets stuck, they can directly communicate with teammate B to kind of get unstuck or to try and figure things out, and they can keep communicating back and forth between the team lead and then the teammate. Whereas in this instance, you'd spool up the sub-agent that would complete the task and then communicate it back, and if they had to do another run at completing a specific part of work, it wouldn't actually be a direct communication. It would just kind of be very transactional in nature.

And then the final interesting thing is that we can actually speak to any one of the team members at any time. So when we're in the Claude Code terminal, we'll see later on that we actually have access to toggle between the team leader and any of the open teammates and just communicate with them directly. So we can change their scope of work, get feedback on what they're working on, and even try and unblock them if they get stuck.

Anthropic summarizes the difference really well over here where they say sub-agents are better at working out smaller, more focused tasks, and then agent teams are better at figuring out complex problems where they can collaborate together. So here's a really good example of when you would actually fire off an agent team. So if you want to review a code base, typically when you just issue one agent that task, they'll latch on to one theme or one key area and then struggle to actually diversify their approach. Whereas if you have an agent team, you can give each of those members in a team a different distinct area to focus on. Therefore, each of these areas will get an appropriate coverage from each agent. And the second example from Anthropic is in a very similar light where if you want to understand a complex issue, typically one agent would again just find one possible explanation, one key theme, latch on to it.
So then you actually want to prompt up an agent team and then give different members of the team distinct areas to explore and to look at.

So I've got two instances of Cursor open. I've got a normal instance where I'll just be using one agent to build out this prompt for us and then another instance where I'll be spooling up an agent team and giving it the exact same prompt. So I'm just spooling up the very first instance using this single agent. I'm asking it to build me a very basic task manager app as a single page web application. So it's nothing fancy. It's just enough for us to test this out and see what happens.

So now in our next terminal, in order for us to set this up, we first have to initialize agent teams. So to enable agent teams, we have to go into our settings.json file and then add this environment variable, which is claude code experimental, and then toggle it to equal one. So just before we do that, make sure that you're running the latest version of Claude Code. And you can just check which models you have available to you by going into /model. And once you get into here, you've got two different options for an Opus 4.6 model. You have the regular Opus 4.6. Then you also have the 1 million token context 4.6, which is better for those longer sessions and more complex tasks. If you toggle with your keyboard left and right, you'll also see that on the very bottom, you can toggle your reasoning effort. So you go low effort, mid effort, and then high effort, where high effort basically uses an unrestricted amount of tokens, which will give you better results for more complex tasks and bigger project builds. I'm going to leave mine on low effort and just going to use the regular Opus 4.6 model.

And I'm going to ask Claude Code to initialize the, uh, basically change the settings for me so we can initialize agent teams. So Claude's found that file. Let's just give it permissions for the session.
And it's identified the change that it has to make. So, it's going to be adding exactly what we had in the instructions. So, I'm going to go yes once again and done. Looks like Claude Code has made that change for us in the settings.json file. Now, important to note, we cannot start an agent team in this specific session. We have to restart Claude Code. So, to do that, you can just go into plus and then go into a new terminal, restart Claude Code here. And now Claude will be able to access the agent teams.

So it's important to note when we want to initialize a new agent team at the prompt level, we have to explicitly give those instructions to Claude Code. We have to say create an agent team to X, Y, Z, to complete some kind of task or project for us. And here is our prompt. So create an agent team to build out the following for us. And this is the exact same prompt that we gave to the very first instance on Cursor where we had just a single agent working on this project.

So here we are. Claude Code is immediately telling us I'll create a team to build out this task manager app in parallel. Let me set up the team and then break down the work. So, we'll first be building out the plan of what we need built and then we'll be assigning each of those items from the plan to a separate agent on our team. So, now we're spinning up the teammates for each task that we created. So, the first task is to create the HTML and CSS. So, the UI builder is going to be working on the HTML and CSS. Once that's done, I'll spawn up the JS developer to add the application logic. Let me wait for the first task to complete.

So, after each task was defined, then Claude Code was able to spin up the teammates. So, it spun up the very first teammate which is building out the, uh, looks like the front end, the HTML and CSS. And then on the bottom over here, we can see that there's two teammate agents already deployed. We have the main agent, which is the person that's the actual team lead.
And then we have the UI builder agent. And if I hold shift and I press up and down arrows, I can see that I have access to the team lead to speak with them directly. And if I press shift and then down, I have the UI builder. Now, in this instance, the app is pretty simple, but I've actually built out a couple of apps where there's three agents working in parallel. Let me just click yes for these edits. And then I actually have options of three different agents that I can speak with at one time. And sometimes during the course of the project, you'll actually find that once one specific component is finished and the feedback goes back to the team lead, the team lead will create some new project and actually spin up new agents to work on the next part of the build. So, it gets pretty advanced and it's also really fun to see.

There we go. So, we're actually spinning up different agents now. So task one assigned to the UI builder. Build the HTML structure and CSS theming. Create the foundational HTML structure. We've got a skeleton given to it. And actually a very clear set of instructions for this builder. And now we can see down below if I hold down shift and I toggle up and down, we've got the team lead just as before. Then we have the JavaScript dev, which is the new agent that was just created. And then we have the UI builder.

Coming back to our first agent, let's just click accept on these permissions. Oh, and there we go. It's actually just finished now. Let's pull this up. Okay, so this looks really cool. We have views on our side, due date, priority, projects. We also have toggle for night mode and day mode. Let's create a new task. So, cook dinner. Leave the description empty. We pop up a little calendar. That's awesome. Subtasks. Even subtasks. That's cool. So, buy food. Let's hit enter. All right. Just click new subtask, cook food, and then a new one, eat food. So that's interesting. It's like a layered, it's like a folder. It's like a layered task.
Typically, like from demo task managers that I've built, it's usually one top layer and then you just click close in a top layer. But now you have little subtasks, which is what a lot of actual, like ClickUp, like actual task managers do. They'll have subtasks within a task. So that's really interesting. We have active task completed. So, if I actually just click one, two, and if I click this cook dinner, and it goes into the completed view. That's nice. And let me just see if I right click on here. Okay, we have edit, reopen. Wow, this functionality is actually insane. I actually, I can't believe that I can change priority by doing the right click on here. Move to inbox, move to work, move to personal. Okay, so we have different views over here. So, personal view shows that project. Let's move it across to inbox now. And it moves. This is beautiful. This is really nice. This is Opus 4.6 on its own. And that is very impressive. And if we delete that and let's go to create a new project as well. I'm going to call this dinner. Let's click enter. There we go.

So, as this stands, this is actually very impressive. It looks really good. It doesn't look like it's been made by a botch AI like a lot of the older apps that we've been building across the last few months. And on top of that, this is usable. It's immediately usable. The functionality is very easy, very smooth to use. And it gave us a bunch of different sections, due dates, priorities, and projects as a starting point. This is fantastic. You could literally use this to run your personal life without buying another tool.

So now back in the agent team, we see there was a shutdown request because all the tasks were completed. Let's fire this off and see how it looks. Okay, so this looks really interesting. The first thing I want to note is that the name was the exact same as that first version from the single agent task flow. Otherwise, this looks a lot more polished.
We have the night mode and day mode with a bit more of a transition between both. Let's add a task here. What needs to be done? Cook dinner. Leave the description for now. We can choose the date. So, looks like the date creation function or the task creation functionality is very similar. Oh, we've got subtasks here as well. So, buy food, add a new one. Cook food. And if I hit enter, okay, just creates the task. Nice. Let's actually go back to edit this and have one more. Eat food. We can't forget about eating the food. Save the task. If I right click. Okay. So, our options here are a little bit different. We can edit the task, duplicate, or move it to a different section. And we can just choose numbers here. So, that navigation panel is a little bit different.

If I just come back into here and quickly create an eat food task and I create this. If I do the right click here, I've got a lot more of a navigation panel. So, this is, yeah, it looks a little bit more like less polished, less refined, but it is in my opinion for me a bit more usable cuz I just like having everything easy access here. But you can definitely see like this side panel over here, it does look a lot more polished. So, I'm just going to quickly flip across for both. This is that single agent one. We're using more of these emojis. We could definitely go back and refine this and say don't use the emojis, but I do like how it's more filled out over here. But then the agent team one does look a lot more refined. So I actually like the side panel here. It's very nice.

We didn't have the settings panel in the first version. But if I click this, it doesn't actually go anywhere. And if I go onto board view, it doesn't go anywhere as well. So I'm just going to come back to the agent team. I'm going to click escape to return to the team lead. I'm just going to say, can you confirm the JavaScript is working because some of the buttons, which was settings or board view, they don't work.
So, let's see what happens when we, uh, when we try and fix this. So, there we go. Looks like we have a fix. Now, the board view and the settings buttons will actually do something. So, now coming back to our task manager, if I click on board view, nice. Actually, looks like we can drag and drop between here as well. So the board view now works and it changes the status as we drag it to a different section of our pipeline between urgent to high to medium. It actually applies that to the task as well. And now let's check out the settings panel. So over here we have light mode or dark mode. Okay, default view and then we have export data as JSON or import and then clear all data.

So now to do a postmortem between both builds, either using the single agent or the team of agents. The single agent took 6 minutes and 55 seconds to build out the task manager. And I actually like how this looks more than the agent team app. I like that they're using emojis over here. It's a bit more fun. It makes me want to create tasks and follow through with them. And then looking at the agent team, the first part of the build was actually a little bit faster. So it was 4 minutes and 50 seconds versus 6 minutes and 50 seconds. So 2 minutes faster. But then we had to come back and rework the kanban board button and also the settings button. And that added an additional one and a half minutes to our build time. So I would say the build time is like for like. And in this portal, while I don't necessarily like the UI as much as the first one, I could definitely fix it by prompting up. I do like that it went a little bit deeper and it had the board view. And I do like the settings panel. So I didn't even prompt up that I wanted a settings panel. And I didn't even think of having an export or an import functionality to take tasks between workspaces.

So I've been testing the new Claude Opus 4.6 model on some other projects that I've been building.
I ended up building a car dealership portal that runs tasks. It has a database on Supabase. I installed the Supabase MCP into the 4.6 model and was able to integrate the database for authentication and start building a bit of the backend so it can actually create customers, create tasks and jobs and all this stuff. So, that was very interesting. If you guys want to see more of an advanced build, please let me know in the comments. Thanks for watching and I'll see you in the next
