Now that I've got 2 Strix Halo machines, one of the biggest things to explore is how well they work in a cluster, to run even bigger models. Minimax will run on just 2 Strix Halo machines. Adding just a few more machines would allow me to run GLM5 at a good quantization and Kimi 2.5 at a usable quantization. This is a really exciting possibility for getting these sorts of models running at the lowest possible cost, not only at home but on the road - it's no huge deal to fit 4 ASUS ROG Flow Z13s in a little travel bag, and the power they require can come from a normal home power outlet almost anywhere.
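For a rough feel of how many machines a given model needs, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: the 128 GB of unified memory per machine, the 80% of that which is actually usable for weights, the roughly 4.5 bits per weight for a Q4-class quant, and the parameter counts passed in (230 billion and 1000 billion are placeholders, so plug in the real numbers for whatever model you care about). It also ignores KV cache and context size, so treat the result as a lower bound.

```python
import math


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def machines_needed(
    params_billion: float,
    bits_per_weight: float,
    ram_gb_per_machine: float = 128.0,  # assumption: top-spec unified memory
    usable_fraction: float = 0.8,       # assumption: share left for weights
) -> int:
    """Smallest number of machines whose usable memory holds the weights."""
    usable_gb = ram_gb_per_machine * usable_fraction
    return math.ceil(weights_gb(params_billion, bits_per_weight) / usable_gb)


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters), for illustration only.
    for name, params in [("230B-class model", 230), ("1000B-class model", 1000)]:
        size = weights_gb(params, 4.5)
        count = machines_needed(params, 4.5)
        print(f"{name}: ~{size:.0f} GB of weights, at least {count} machines")
```

Under those assumptions a 230B-class model at a Q4-class quant lands on 2 machines, and a 1000B-class model needs 6, which is why the really big models mean adding a few more boxes.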
Apple is currently estimating a half-year delay in shipping M5 Ultra machines, and they've stopped making the 512GB version of the M3 Ultra due to the RAM shortage, so even eBay prices on those machines are more than $20k right now.
The only other realistic way to get huge LLMs running at home 'inexpensively' is by using machines with the Nvidia GB10 chip, such as the DGX Spark, which costs around $4500. The ASUS Ascent GX10 is basically the same machine, and you can get one of those with a 1TB drive for only $3500 on Amazon, so that's likely the best contender - and it will perform faster than Strix Halo in a cluster, because it has faster built-in networking (and the better Nvidia stack). If I were serious about running a huge model, I'd focus on that. You'd need to buy cables and spend a minimum of $3-5k on a fast network switch in order to run anything more than 2 GX10s together, so that architecture can get costly quickly - but still nothing like a single $100K+ GX B200 Station.
For my needs, I'm tickled with the Strix Halo. It gets you out of the toy-class models, and works fast enough to be useful with models that can actually get real work done. The portability and lower power consumption of the ASUS ROG Flow Z13 laptop are just awesome, and if a Q4 or Q5 quant of Minimax can be run on just 2 of them, that's serious capability in a tiny mobile setup. I'll likely push the limits of clustering 2 of those laptops together this week, and will report back here...