brian · 22d

Claude dropping subscription access to agents is a win. So much compute is being wasted by meandering calls, and it means the AI companies are compute-constrained and are recognizing it.

Honestly, agents shouldn't run on enterprise models. You should build the modules for your agent with top models but execute the system with local-ish 30-80B models. Qwen3 80B and qwen3.5-35B are both usable at 128 GB VRAM.

Most people use agents for repeating tasks anyway, so it's better to concretize those tasks into modules you can call again and again. My personal feeling is that common users will be boxed out of compute sometime in 2027-28 as corporate use scales faster than energy production, so adapting to a local focus is critical to building something you can actually count on. DGX Spark is interesting, but it is speed-constrained by memory bandwidth and by current NVFP4 compatibility with vllm and sglang. If and when NVFP4 runs well on DGX Spark, that will be a local breakthrough; then I think we will see 80 tokens/s for agentic tasks. TQ3 context quant could also bring incremental improvement, given the memory-bandwidth issue.
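To make the "modules" point concrete, here is a minimal sketch, assuming a local vLLM or SGLang server exposing its OpenAI-compatible API on localhost:8000; the model name, port, and prompt are placeholders for illustration, not anything the post prescribes:

    # A repeating agent task concretized as a module, executed on a local model.
    # Assumes a vLLM/SGLang server with an OpenAI-compatible endpoint at localhost:8000.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    # The fixed, tested prompt for one repeating task, so the agent loop
    # never has to rediscover it with meandering calls to a frontier model.
    SUMMARIZE_DIFF_PROMPT = (
        "Summarize the following git diff in three bullet points, "
        "flagging any change that touches error handling:\n\n{diff}"
    )

    def summarize_diff(diff: str) -> str:
        """One concrete, repeatable task run on the local 30-80B model."""
        response = client.chat.completions.create(
            model="local-model",  # whatever name the local server registers
            messages=[{"role": "user",
                       "content": SUMMARIZE_DIFF_PROMPT.format(diff=diff)}],
            temperature=0.2,
        )
        return response.choices[0].message.content

Calling summarize_diff(diff_text) again and again is the repeatable part: the top model is only needed up front, to design and test the module, while the local model does the execution.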
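On the memory-bandwidth point, a rough way to see why decode speed is capped is a back-of-envelope bound: tokens/s is roughly memory bandwidth divided by the bytes of weights read per token. The numbers below are illustrative assumptions, not measured DGX Spark specs:

    # Back-of-envelope decode throughput for a bandwidth-bound box.
    # All numbers are assumptions for illustration, not measurements.
    bandwidth_bytes_s = 273e9   # assumed unified-memory bandwidth, ~273 GB/s
    active_params = 3e9         # assumed active parameters per token (MoE)
    bytes_per_param = 0.5       # ~4-bit weights, e.g. NVFP4

    tokens_per_s = bandwidth_bytes_s / (active_params * bytes_per_param)
    print(f"~{tokens_per_s:.0f} tokens/s upper bound (weights traffic only)")

This bound ignores KV-cache and activation traffic, which is exactly why a tighter context quant like the TQ3 idea above would help: less data moved per token over the same bandwidth.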