Post History

Current version by Nick Antonaccio

Current Version: May 09, 2026 at 03:09

NOTE from future Nick: the Asus GX10 (DGX Spark) is better for agentic workflows. See https://aibynick.com/thread/31. TL;DR: Apple Mac laptops cost at least twice as much as Strix Halo laptops for the same 128GB of shared memory, so the price/performance of the ASUS ROG Flow Z13 is hard to beat. But if you're building a self-hosted GPU server for long-context workflows, the prompt processing speed of the Nvidia GB10 DGX Spark machines, and the ability to cluster multiple machines easily, is worth the extra $1000.

I got another ASUS ROG Flow Z13 laptop today. For $2,572.59 on Amazon, you get a complete AMD Ryzen AI MAX+ 395 (Strix Halo) machine - not just a GPU - with 128GB of LPDDR5X 8000MHz shareable RAM. That price is for a 'used like new' machine (which is what I bought); you can buy a pristine new one for $2,707, but I think buying new is a wasted few hundred dollars (even new, it's still more than $1,000 less than any machine that can run anything close to comparably sized LLM models).

For that price I'm willing to deal with the small 1TB PCIe Gen 4 SSD. That's big enough to load up all the models I want to use.

Out of the box, this thing runs some really capable LLM models (70b dense and 120b MOE class) quickly enough to be very useful for real work. The laptop is small and light enough to carry easily in a little backpack or briefcase, for totally offline inference tasks while traveling, or anywhere no Internet is available.
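The back-of-envelope math behind those model sizes is simple: weight memory is roughly parameter count times bits per weight. This sketch ignores KV cache and runtime overhead, which add more on top, so treat it as a lower bound:

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights alone.

    Ignores KV cache, activations, and runtime overhead, so the real
    footprint is somewhat larger than this estimate.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A ~120B-parameter model at 8-bit quantization is roughly 120 GB of
# weights, which is why 128GB of shared RAM is the enabling spec here.
print(weight_memory_gb(120, 8))  # -> 120.0
# At 4-bit, the same model needs only about half that:
print(weight_memory_gb(120, 4))  # -> 60.0
```

In practice you also need headroom for context (KV cache grows with context length), which is why the biggest models listed below run comfortably only with the full 128GB pool.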

Just download LM Studio, Jan, Ollama, or KoboldCPP, etc.
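Whichever runtime you pick, Ollama (port 11434) and LM Studio (port 1234) both expose an OpenAI-compatible chat API on localhost, so simple offline scripts can drive them. A minimal sketch, assuming an Ollama server; the model tag here is illustrative, so substitute whatever your runtime actually lists:

```python
import json

# Ollama's OpenAI-compatible endpoint; LM Studio's default is
# http://localhost:1234/v1/chat/completions instead.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str, temperature: float = 0.2) -> bytes:
    """Serialize an OpenAI-style chat request body for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return json.dumps(payload).encode("utf-8")

# To actually send it (requires the local server to be running
# with the model loaded):
# import urllib.request
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=build_chat_request("gpt-oss:20b", "Explain MOE models briefly."),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```

Because the API shape matches OpenAI's, most existing agent and coding tools can be pointed at the local server just by changing the base URL.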

For coding and agentic tasks, download the following models - you can run them all in 8-bit (Q8) quantization on the Strix Halo:

  • Gemma4: 31B dense and 26B MOE models (and smaller versions for speed)

  • Qwen 3.5: 122B-A10B MOE and 27B dense models (and 35B-A3B MOE for speed)

  • GPT-OSS: 120b (and 20b for speed)

For knowledge work, try highly quantized versions of Nemotron Super and Minimax 2.5.

You're able to work completely offline with those models. They're legitimately useful, not just toys.
