My $20/month zip file software development workflow with ChatGPT

Nick Antonaccio (Admin)
Apr 21, 2026 at 11:36 (edited, 19 revisions)
#1

TL;DR - package your entire project, with all supporting files/context, into a zip file; upload the zip file to a ChatGPT conversation, and prompt it to add/adjust project functionalities as needed; GPT will surgically edit the zip file and provide a new version to download; upload the zip file to your server, unpack it, and run it; when a chat context does eventually get long enough that performance degrades, save a .mhtml file of the current conversation, upload it, along with your most recent zip file, to a new chat - continue working where you left off, rinse and repeat...

Your zip file content can be much bigger than the context length limit of the LLM - GPT will automatically spawn and manage coordinated sub-agents to complete long tasks which require their own multiple, separate contexts (you don't need to do anything to make this happen, just work with zip files and GPT knows how to break down even the biggest projects into manageable pieces)

Everywhere I look, developers seem to be spending piles of money building software with Claude Code and the Claude LLMs, and using a variety of agents which run on local hardware, burning tremendous volumes of tokens through costly LLM APIs, deploying to unnecessarily pricey proprietary hosting systems, using complex IDEs & other closed tool systems, and otherwise making the process of developing software with LLMs much messier, more complicated, more proprietary, and more expensive than it needs to be.

I've written in another post on this forum about how I've used gemini-3.1-flash-lite-preview with Nullclaw as an effective/inexpensive alternative to Claude Code and the Anthropic LLMs, but none of that class of agentic options forms my go-to workflow, which I currently use to build complex commercial software.

After 30+ years of software development experience, and 3.5 years of using LLMs to build a huge volume of production software, I've settled on a simple workflow which has been ridiculously cheap to use and extraordinarily successful.

I've built and shipped a pile of expansive production applications over the past 6 months, with a zip file workflow that runs entirely in ChatGPT, which requires zero agentic tools installed on any of my local machines or VPSs - and all for a grand total of $20 per month.

I expect that the work I've completed with this $20/month approach would have cost at least $3000-$5000 per month if I'd been using Claude Code and the Anthropic LLMs to achieve the same results - and that savings has all been added directly to my bottom line.

To be clear, the resulting applications created with this ChatGPT zip file workflow do run on local machines and VPSs, but the development workflow happens entirely within ChatGPT, without any other infrastructure required. This means I can employ it on any client machine that has a browser and a command console available (I've even used it on my inexpensive Android phone, in Termux). I switch between machines often throughout the week, and can do that without having to set up any sort of local development environment, on any device, anywhere I go. This makes traveling and regular mobile development a piece of cake to manage.

OpenAI has built all the agentic tooling required for this workflow, directly into ChatGPT.

The production projects I've completed with this workflow are not vibe coded toys (although I've built plenty of little one-off throw-away apps). They are significant works involving hundreds of deployed versions over many months of constant iteration, with multiple clients - including multiple connected apps of many tens of thousands of lines of code each. These projects involve deep, precise logic, schema, UI, and workflow specifications, complex performance optimizations for things like critical real-time interactions, challenging role-based feature integrations, and a wide variety of complex functionalities that go far beyond basic authenticated CRUD interactions. Building traditional CRUD functionality with nice UI interfaces is a no-brainer for this sort of workflow, no matter the level of schema complexity.

Before explaining how the zip file process works, I'd like to point out that most of my clients pay $50 per year for their VPS hosting to deploy apps. I don't ask clients to rely on any unnecessary 3rd party hosted services, except for a few things like Twilio SMS service, merchant payment processing systems, LLM APIs, and some other industry-specific and specialized services which very few organizations choose to self-host. I never rely on companies like Vercel or any other hosting service that marks up basic functionality which can simply be installed on a bare VPS.

Some clients, who need to comply with HIPAA and other regulations, host with companies such as Atlantic.net, to leverage their monitoring, backup, and other service offerings that help satisfy the most troublesome and time/money consuming IT/infrastructure compliance requirements - but again, the working environment for any application I deploy is typically a plain old bare VPS (usually running Ubuntu Linux).

I'm also more than happy to deliver to Microsoft OS server environments when needed. All my tooling runs 100% cross platform (in other unusual OS environments, on older/ancient OS versions, in lightweight local environments that are air-gapped from the Internet, on mobile server devices like old phones and IoT devices, etc.). The client interface for almost all the software I build is a browser.

Using this development process - spending $20 per month for AI development, and $50 per year for hosting - provides a solid baseline foundation for building basically any sort of common application functionality, across the typical sorts of connected application suites needed by most small-to-medium sized businesses.

Nick Antonaccio (Admin)
Apr 20, 2026 at 15:56 (edited, 2 revisions)
#2

How the zip file development process works

When a new software development process is started in ChatGPT, I ask for a downloadable zip file such as projectname1.zip, which contains all generated code and required environment information.

For a Flask project, that includes all the SQLAlchemy model files (database schema), all the UI templates, all the environment variables in a .env file, the libraries needed in a requirements.txt file, all supporting config files, JSON files, image files, documents (.csv, .md, .xlsx, .txt, .doc form-fill templates), etc. - everything required to make the project run, in the typical directory structures they're expected to be in, all in a single zip file.
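For reference, a typical Flask project zip in this workflow might be laid out something like this (all file and folder names here are hypothetical - whatever your project actually uses is fine):

    projectname1.zip
        run.py                 (app entry point)
        requirements.txt       (libraries to pip install)
        .env                   (environment variables)
        app/
            __init__.py
            models.py          (SQLAlchemy schema)
            routes.py
            templates/         (UI templates)
            static/            (CSS, JS, images)
        docs/                  (.csv, .md, .xlsx, .txt, .doc files)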

If I'm continuing a project from an existing workflow somewhere else, I package everything required into a similar single zip file.

Along with the zip file, in the chat conversation, I upload any other required files & documentation, explain all the surrounding context, and provide any necessary data/information required to build the proposed functionality in the current development step.

Any requirements which need to be satisfied for the current iteration of development work get explained in as much detail as possible in the conversation with ChatGPT. As always with LLM prompting, detailed yet concise context generally yields better output. Dividing work into smaller iteration steps is almost always better. Focusing on one manageably sized iterative goal at a time, within a well engineered overall process, is what works.

When working with the ChatGPT zip file workflow, you should ask for the final output of every development session to be provided as an incrementally numbered version of the project zip file (projectname2.zip, projectname3.zip, etc.).

Download each new complete zip file to your local machine (ChatGPT will make a download link available directly in the chat), and then send it to your server with SCP (for example: scp projectname1.zip username@yourdomain.com:~/projectfolder). SCP is built into most operating systems by default, and ChatGPT runs in a browser, so you shouldn't need to install any tooling to enable this workflow.

I typically save all the versions of my project zip file in /saved and /unused folders, within a larger project folder that also contains /documentation, /mhtml, and other supporting folders. This gives me a consistent structure to store the complete history of an entire project, and all historical context, notes, emails with clients, etc., in a single folder.
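For example, a dated master project folder following these conventions might look like this (a hypothetical layout):

    projectname-2026-04-21/
        saved/            (projectname1.zip through the current version)
        unused/           (abandoned or superseded branch versions)
        mhtml/            (saved ChatGPT conversations)
        documentation/    (notes, specs, client emails)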

When I travel or use a new machine, I simply zip up the entire project folder and transfer it to another machine. Every time I make a transfer to a new machine, I update the date in that master project folder name. Using this routine, I can transfer the entire history of multiple projects, with every piece of surrounding context, and everything needed to work on every project, in just a few minutes.

I back up each of these dated master project zip files on local physical hard drives that I keep automatically synced, and on online file servers which are also automatically synced, so there are many redundancies, some of which are physically portable (I always keep a portable flash drive, micro SD, and external drive with me when I travel), and all of which are available online.

Because Flask apps are so small, most of my complete project zip files are only a few megabytes. When they get too big, I prune any unneeded versions of the project zip files (v1.zip, v2.zip, etc.) from the archive - since those versions are always available in previous backup archives, nothing ever gets lost.

On the server, you can have as many applications as you like running simultaneously in separate tmux sessions, each with its own environment containing the necessary installed libraries (in Python, using venv and pip, for example). Simply (a concrete end-to-end sketch follows this list):

  • SSH into your server command line
  • attach to the tmux session of the project you're working on (tmux a -t projectname)
  • unzip -o projectname1.zip in the active environment
  • then run the application (python3 run.py)
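Putting it all together, one full update cycle might look like the following (hypothetical names throughout, and assuming a venv was already created in the project folder):

    scp projectname2.zip username@yourdomain.com:~/projectfolder
    ssh username@yourdomain.com
    tmux a -t projectname              # attach to the project's session
    cd ~/projectfolder
    source venv/bin/activate           # activate the project's Python environment
    unzip -o projectname2.zip          # overwrite the previous version in place
    pip install -r requirements.txt    # pick up any newly added libraries
    python3 run.py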

You can instantly revert to any previous version on the server, just by unzipping and running the required zip file.

I keep a text file in each master project folder on my local machine, which contains all the SCP commands and credentials I need to SCP and SSH into the project server, so logging in on any new machine is just a matter of copy/pasting a line into the command console (it doesn't matter what OS I'm using as my local client - all these pieces like SCP and SSH are generally 100% portable).
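That file doesn't need to be anything fancy - just a line or two to copy/paste (hypothetical host and path):

    scp projectname7.zip username@yourdomain.com:~/projectfolder
    ssh username@yourdomain.com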

Because SSH is built into every common OS by default, again, you don't need to install any software to enable this workflow. You may just need to apt install unzip or tmux on your server (and perhaps nano and/or whatever other small tools you need, and perhaps whatever particular version of Python, or Node, or whatever other ecosystem tools you use) when you use the server for the first time, but using the toolkits I prefer, that's a trivial one-time task that takes just a few seconds.

To be clear, none of this process involves editing any code files on your local machine or on the server, although you can choose to manually edit anything you want - just be sure to share any manual code edits you make, back in your ChatGPT conversation, so GPT can incorporate any manual changes you've made into the current working project zip file. Even if you change something as simple as the port an application runs on, let GPT know, so you don't have to continue to make those edits manually every time you unpack a new zip file version of the application.

Typically, in this process, GPT writes all the production code - the current models of GPT do a better job than most human developers do at writing code, so I tend to simply prompt GPT to make any edits which I would have previously performed manually.

The important thing that happens with a zip file project package is that a large majority of the work you previously needed to perform in code editors, and/or in interactions with a chat bot (copying/pasting code from a conversation, for example, or using an IDE or locally hosted agentic system integrated with an LLM API to edit code), is eliminated. You don't need to edit any code, or install any tools whatsoever. You don't need to give a locally running agent access to your production system, or even your development system. You just have GPT edit your project zip files.

One key point is that with the zip file workflow, the context size of what the LLM can handle is tremendously expanded - and that work is automatically divided up and handed off to multiple agentic processes built into ChatGPT.

So instead of copying/pasting generated code into files that you need to manage manually (or doing this work in an IDE connected to an LLM coding agent), you let ChatGPT use its own built-in agentic tooling to do that work for you. It will intelligently use its own built-in tools, and work within its own internal workspace, to open up the zip file and, for example, use regular expressions to search for and intelligently explore the code in every file contained in the zip package.

Most importantly, ChatGPT will use its own agentic capabilities to spawn as many sub-process contexts as needed to achieve any required development goal, so that your main conversation context doesn't get polluted and filled up by each of those mini-goals. A separate process is launched to open the zip file, another is launched to summarize the content of the contained files, separate processes are typically spawned to write and test newly generated code, further processes surgically replace code in each existing file within the zip package based on all that writing and testing, still more processes may be spawned to package those files back into a newly downloadable file, and so on.

None of those separate processes use up context in your main conversation. Each process completes its task, provides an artifact, and/or reports a summary back to the context that spawned it. GPT does all this entirely automatically with its integrated tooling, knowledge, and the workspace it has available (all separate from your local or hosted dev/prod workspaces).

When working with zip files, GPT automatically knows how to break down complex requirements into smaller steps that can be spawned as separate contexts, and the work completed in those contexts can be coordinated by GPT's built-in agentic workflow capabilities, using built-in tooling, MCP servers installed by OpenAI, etc. By automatically breaking up tasks into throw-away processes that have their own separate contexts, the context size of a project becomes virtually unlimited, and the main conversation never needs to contain the entire workflow methodology devised by GPT.

What I've been amazed by is just how much work OpenAI gives you for the $20 subscription. I've never hit a rate limit with the zip file process, despite clearly burning many tens of millions of tokens at a time in ChatGPT, sometimes building multiple projects at the same time in multiple open ChatGPT sessions, nonstop all day long, for days in a row, for many months in a row. This sort of workload would cost thousands of dollars per month using Claude Code and the Anthropic LLMs - and GPT does a comparably fantastic job not just writing code, but also intuiting your intended goals based on less than perfect prompts.

Keep in mind that your success with any LLM based software development effort depends tremendously on the tech stack you choose. Using Flask and the Python ecosystem is a guaranteed win - there are many billions of lines of code and documentation published in that ecosystem, which every LLM gets trained on deeply. The same is likely true of HTML/CSS/JavaScript, React and the other popular web UI libraries, Java, and other mainstream programming language tools.

Just don't expect to get good code results from an LLM if you're using any sort of lesser known language/library/tooling. You can accomplish some in-context learning by providing documentation - and many lesser known tools actually do provide LLM files to help orient and guide LLMs to more effectively work in-context with unknown tools - but LLMs will always work orders of magnitude more effectively with tools they've been trained on deeply.

During the process of building code, the context of your conversation with GPT should be focused on steering the LLM towards clear, achievable small goals, within a larger set of engineered steps. You should avoid filling the conversation context with written out code - that code should stay within zip files. The conversation context should hold links to developed artifacts (new zip file packages that contain the entire newly developed project and all information/context needed to continue from that point forward), and should display the information needed to understand the decisions which were made, and the efforts that were taken during the time consuming process of building and testing iterative solutions. The results of all that work are contained in the zip file artifact, that artifact can be explored later as needed, and it becomes the solid basis of all future work. You shouldn't need to share previous conversations forever - just share the current zip file and work from it.

GPT generally does a great job of compacting (summarizing) your main conversation as context gets filled up, but you should do your best to keep it lean.

When a conversation context gets too big, GPT's code writing and reasoning performance will begin to degrade. When that happens, save your entire current conversation as a .mhtml file (in Chromium-based browsers, Ctrl+S with the 'Webpage, Single File' option saves .mhtml), start a new conversation with the current project .zip file and that .mhtml file attached, and tell GPT to continue working from where you left off. GPT will spawn a process to read and summarize what it needs to know from the .mhtml file - and it can continue to refer back to that file as your new conversation progresses, especially if you point out important pieces of the previous conversation which it should pay attention to.

I pay a lot of attention to ensuring that important logic is spelled out in the displayed text of any chat conversation, if I think that information may need to be available to future conversations (i.e., I make sure the displayed text of my current conversation contains everything I'd want a future conversation to be able to read and understand), so that I never need to re-do work in a future conversation. I never need to re-type any explanations, and ChatGPT should never need to perform any development work again, if everything is contained in the project zip file.

Another really important pointer is to always include this sentence in your prompts to edit code: 'Please be very careful not to change any other functionality in the application'. Current LLMs are getting much better at avoiding unintended regressions, but I still find that adding that sentence to prompts helps. I also specifically tell GPT that that sentence is very important to remember every time we're building software.

Sometimes I have GPT actually check its own work with prompts like 'Please ensure that no app functionality has been changed, beyond what has been requested in this conversation' and 'Please confirm that the changes between v530.zip and v540.zip are only those specified in this conversation, and that no other functionality in the application has changed'. GPT will perform diffs between all the files in your project, explain what the code changes do, and explain whether or not those code diffs could lead to any unexpected behavior (that's rarely the case, if ever).
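If you want to double-check GPT's report independently, the same comparison is easy to spot-check on any machine with standard tools:

    unzip -q v530.zip -d v530
    unzip -q v540.zip -d v540
    diff -ru v530 v540                 # recursive unified diff of the two versions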

Along the same lines, you can use GPT to evaluate entire existing functionalities, whenever you need to understand exactly what the code currently does, and how any functionality needs to be changed. You can ask for anything from a high level synopsis, to the lowest level details. You can choose to never touch a database schema, or be fully involved with how any schema is created and altered. You can choose to never make a decision about which libraries should be used, or how a logical process is devised, or you can choose to control every step of the logic, down to the individual characters in the code. However involved you are in the details, you can ask GPT to explain what already exists in the functionality of the app, change that functionality, build upon it, etc.

In most cases, if you understand what you're asking GPT to build, it will write very solid code. Just test absolutely everything before pushing to production, and if there are errors or functionality issues, provide GPT with enough debug information to understand anything that's not working, and the full scope of context needed to handle any unexpected edge case data values. Treat the process like working with a team of human developers. Communicate well, clarify intent, provide complete requirement details, test and iterate, and it will do a good job.

One other tip that really helps to make development with GPT move more quickly is to be clear about where in the code a particular functionality is found. For example, I'm constantly using verbiage such as 'In patient chats such as https://mydomain.com/patient/262 and staff chats such as https://mydomain.com/conversation/5 the blah blah blah functionality needs to be updated to do blah blah blah blah.' Providing those route paths saves GPT a lot of work searching among pieces of code scattered throughout the entire code base, to reason about where it needs to work. If you can point out a URL, or a template file, or a function name, etc., let GPT know. It very rarely hurts to provide more concrete details and to clarify context.

Another bit that may save you some time: at this point I almost always run GPT in plain old default thinking mode - I have not recently needed to use extended thinking mode for any typical work (even for complex tasks, as long as tasks are broken down so that every detail in a task can basically be explained in a single prompt). The current default GPT models are so smart, they typically get most coding tasks right first-shot, without any extended thinking, so you likely don't need to waste the time or the tokens on extended thinking.

Once you get the pipeline of this workflow fully established (all server infrastructure installed, environments configured, local SCP and SSH commands saved, project folder established, etc.), it's not uncommon to literally paste emails from clients into existing ChatGPT conversations, and get new functionality completely built without any additional work whatsoever - and those sorts of improvements can continue over hundreds of iteration steps (or more)!

Being accustomed to working within a well conceived set of iterative steps, and understanding how to provide all required context, is the reason experienced engineers typically have better results completing software projects with LLM generated code than non-technical users do. Understanding how all the connected architecture works, and how every piece fits together, is still essential: supporting infrastructure, ecosystem tooling, network/database configuration, logic, CPU cycle and other hardware usage optimization (reducing big O complexity everywhere), etc. Knowing what the potential solution might be to any issue, and simply being able to communicate with clients and collect requirements about what they intend to build, is a huge part of the work required to build any big project - and that work can always be steered more effectively by an engineer who has decades of experience doing all those things entirely from scratch, with hand written code and manual system configuration, throughout every stage and every detail of a project.

The more experience you have with general software development, networking/IT, and business domain knowledge (understanding the workflows your clients engage in daily, what all their data actually means, how it fits together, how it's used to make decisions, etc.), the better you'll be able to succeed using LLMs to complete software projects. Treat LLMs like very talented and knowledgeable team members who work at the speed of light, but who still need to understand the context and specific details of any project, as well as the preferences and points of view of your clients, and the purpose of the project within the bigger set of their established unique business practices - and you'll do far better than just expecting magic output from the AI.

Nick Antonaccio (Admin)
Apr 20, 2026 at 15:40
#3

Minimax

Minimax is the only other hosted frontier model chat interface that provides zip file handling, .mhtml reading, and all the other built-in agentic capabilities needed to read, understand, and surgically replace code in entire large projects contained in a single zip file - automatically spawning sub-agents to complete tasks without filling up the main conversation context window, automatically compacting the main context window as needed, etc. And although Minimax is pretty darn capable, it's nowhere near as good/reliable at completing complex development tasks as any of the GPT 5+ series.

Nick Antonaccio (Admin)
Apr 20, 2026 at 15:42 (edited, 1 revision)
#4

Historical Notes from Rebolforum

Below are some messages from the old Rebolforum discussion about the zip file workflow. These are scattered throughout the full copy of the old Rebolforum posts, which are available in full in the General Discussion thread on this forum - the most important pieces are copied here.

These notes trace a bit of how the technique developed, and include some redundant conversation with other developers, but all the relevant notes from the old discussions are copied in full, so that details about the methodology and benefits are available in this single post:

Nick — 14-Sep-2025/23:47:31-7:00

I'm absolutely still favoring ChatGPT as my core LLM workhorse. It's just the best at working with files, and making long iterative workflows progress ergonomically (for lack of a better word). My favorite methodology is to upload a complete code base for a project, in a single zip file.

I always ask for complete updated code in a downloadable zip file, with an incremented revision number in each new file name (v26.zip for example). I SCP that file to my server, unzip it and overwrite all, then run the app (python3 app.py). On the server this all happens in a folder with a Python environment active - all libraries in requirements.txt (which stays in the zip file). I also keep all current environment variables in a .env file in the zip package.

GPT edits the code directly in each successive zip file, and provides a downloadable zip with all the updated code needed for each revision. This makes it super simple to switch between revisions, revert, jump around, etc. - just unzip any revision numbered zip file and run the app. This workflow is really fast and productive.

At the end of every conversation, I save the entire conversation text, together with all the zip file versions, into a single big zip file, which I backup. This is a really clean way to archive every single step of work I've done on any project - and even to automatically document my work hours and the tasks I've completed in any session (to add to invoices) - I just use GPT to summarize all the work which was completed in the conversation :)

Whenever I start a new session, I simply upload the most recent zip file revision and begin prompting to make revisions from there. If I ever need to merge development paths, I just upload revisions which contain working code for features that I want to add to the main trunk revision. I tell GPT that the master branch features should not be removed or changed, and that the revision branch simply contains some example working code which needs to be merged in.

This whole workflow keeps me ridiculously productive over complex projects which can last weeks-months, without ever having to keep even a full day's work in context. Whenever a conversation context gets too long, I just start another session with a stable master branch version, and work from there. I can rely on GPT to understand exactly how its own code works, so I very rarely need to provide any context - just start making revisions. This is a production quality workflow that actually works without end and produces consistently effective results.

I have brought Gemini Pro 2.5 into the mix many times (and quite a few other frontier models, just to evaluate their effectiveness and to get experience with each of their personalities, strengths & weaknesses), but that's becoming less and less of a necessity. The most important part of the development process is to provide detailed, clear, organized instructions for bite-sized pieces of code and functionality which need to be built in an application. If you're working on manageably sized changes in each iteration, you will be successful. Basically, I still do all the detailed work of organizing application structure and properly engineering the details of how an application's code base is built - I just don't need to write the code manually. If you break the whole process down to the point where you'd actually write code, then you can have the LLM write that particular code. That's an entirely different thing than asking an LLM to create an app that does ___ ...

Nick — 23-Oct-2025/10:03:27-7:00

I've shown several people the workflow that I quickly demonstrate in this video:

https://youtu.be/9k0mwKhfSTY

It's become clear to me that the power of that workflow is not immediately obvious. That workflow with GPT, particularly the use of zip files to contain all the code in a project, and the process of uploading previous conversations saved as .mhtml files, has scaled fantastically well.

I've been working on several medium sized projects of 30,000+ lines, and that workflow has made managing and continuing in any direction with projects, from any point, utterly painless.

One key is that this workflow gives GPT the ability to see every piece of connected code in any entire project, and to understand everything you intend to accomplish - without consuming all the context window size - and without you having to repeat what you think is important. You simply ask GPT to read the entire previous conversation (in the attached .mhtml file). It starts with that understanding, and also sees all the steps you've previously worked through, all the console output you've previously pasted, etc. - so it has access to all the work you've accomplished, the results of every step, everything you've described about your intended goals, etc.

Another key is that by using GPT to surgically replace code in the project zip file, without displaying multiple revisions of the code in the chat history, you avoid output which would otherwise completely consume the context window. The zip file contains everything in the entire project (including every associated piece, such as environment variables, if needed), and GPT can explore, access, and write code to adjust any piece.

It's so critically important that GPT can see the entire project and all previous work which has taken place, without nuking context window length. Attaching the full project zip file, and any previous chat conversation related to a current goal, makes it painless to provide a complete understanding of the full scope and working context, without completely filling up the current working context memory, and without taking any time on your part.

I've noticed very clearly that GPT begins to produce less effective output after long conversations. In the early days, I relied on keeping all the required context within a single conversation, but now I do the opposite. I no longer worry about ending a conversation as soon as I see GPT's performance begin to degrade. Instead, I simply start a new conversation, with the .mhtml of the previous conversation and the current zip file version attached. It *always performs better in the new conversation. Always.

Using SCP to upload the complete current project version, and overwriting the entire project on the server, makes for super simple project management. If I need to collaborate with another developer, I can still use Git. I accept any changes, then the entire current working master version on the server simply gets downloaded as a zip file, and I upload that to GPT. I can revert to any version at any time by simply unzipping a previous version number - and those version numbers are all fully documented in my saved chat conversations. It just takes seconds. I switch between these branches constantly as I build and test. It's utterly painless and fast.

Like most things, the devil is in the details, and the details of this workflow are critically important to how I'm able to work so successfully and easily on larger projects with GPT.

Nick — 23-Oct-2025/10:33:01-7:00

It's also important to note that I still never spend more than $20 per month, total, for all of the generative AI tools I use ($20 for ChatGPT). I very often spend many hours per day, multiple days a week, on ChatGPT, and I've never once hit a rate limit. I've also never run into any development task which I couldn't complete with the help of GPT, for many hundreds of significant tasks, within many dozens of large projects in the past 3+ years. I also use GPT to regularly help with research about engineering decisions, server configuration, communication with team members, clients, etc., which saves an absolutely massive amount of time and energy, and helps lead to better outcomes in every way.

Rolling back to previous versions of an application requires simply unpacking a zip file version on the server - it just takes a few seconds.

I've used the GPT zip file technique explained previously to complete a large number of progressively complex projects, without having to manually write any code from scratch at any point, and my largest current project, which consists of multiple integrated applications, each 10,000 lines or more, is made up of code largely written by GPT - and that code connects to many tables in Baserow which have been created almost entirely by client stakeholders (as well as additional SQLAlchemy connections to other databases, created by GPT).

It is GPT's proven ability to handle applications with code bases of tens of thousands of lines - and most importantly, its ability to manage what appear to be multiple separate 'helper' contexts (in its 'thoughts', which are hidden from the main user display by default, but can be shown) - which has made GPT so much more effective than the other models, without requiring any other tools.

I haven't seen this sort of context management native to any other models. You *can manage that sort of context using agentic systems, but that requires a lot more tooling, all set up with API access, and lots of cost for token use - and to use those tools, you need to give access to your system's command line. I don't like any of that idea at all, and from what I've seen, those sorts of systems are just a huge mess.

Until I see another model manage multiple contexts the way GPT currently does - with the entire project code being examined, the code within an input zip file being surgically updated, and then only the results summarized and made available in the main conversation context - none of the other models will be as effective as GPT, for the way I'm currently building software.

For most pieces of projects, I don't expect to run up against huge context walls, because I expect it's entirely possible to keep pieces of projects limited to 10,000 lines or less - but GPT has already been able to handle much more than that, without *any additional tooling, and it's just so fantastically capable of working with the Flask ecosystem and anything I can attach to it (for example, learning in-context how to use unknown 3rd party REST APIs from docs, learning to use any existing Python back-end libraries, front-end web based UI libraries, database ORMs, real-time features, etc.).

And I still only spend $20 per month to do all this, and never spend any time installing IDEs or any of the big agentic frameworks or tools which need command line access to my machine.

I still jump between working on multiple machines at different locations, throughout my week - all by simply transferring current project zip files between machines, connecting to servers via SSH, and sending files to servers via SCP (all of which only requires command line tools on any modern desktop machine, and which I can even run conveniently on my cheap Android phone).

And I only ever use the web based chat interface to interact with GPT, for all of this (which also requires no installation).

The SSH/SCP setup used to transfer zip files and manage projects on the server is super streamlined and sooooo fast to iterate with - plus I keep every version I ever create of an app on the server, so I can instantly revert to any previous version, in just a few seconds. That all works better for me than the Git fetch and merge scripting automations I had gotten used to using with Anvil. And of course, I can hook up Git if I want other developers to be able to work on branches - everything that I send to an LLM just gets downloaded in a zip file.

It takes me just a few minutes to set up new environments on new VPS servers, or on in-house physical servers managed by an in-house IT team, or on my own machines, etc., on any common OS - and everything is so stinking lightweight. Python is already installed everywhere, and I could literally serve projects on a $40 Android phone if I wanted.

Larger mainstream no-code solutions do need a server machine to run well, but only with the most minimal hardware specifications you'd ever expect to see - the least expensive VPS solutions, which cost just a couple dollars a month to operate.

I've noticed the biggest improvements in my recent development processes, whenever I have to make sweeping changes to existing application functionality, when refactoring and integrating new requirement specifications with old code, when altering/extending existing workflows, UI interfaces, logic, database schema, etc.

Building new features from the ground up has always been the easiest phase of developing software. It's rolling with the changes that clients request, which has always been the hardest part of long term projects - especially when clients ask to tear down and refactor existing code that has been forgotten about for months/years.

This is one of the areas where AI-based development really shines, especially when using the full project zip file workflow that has matured recently with GPT.

All the LLMs are absolute whizzes at refactoring and integrating code with new features. In fact, with GPT's capability to employ the zip file workflow, I've successfully integrated changes across multiple apps connected by APIs, all in single sessions - and by that I mean I've uploaded complete applications in zip files, and made changes to the functionality of all the applications, in a single conversation.

As a comparison, I started using GPT to refactor functions a few years ago, but that was a complex and careful process requiring diligent attention. It typically involved techniques such as providing function stubs, explaining (for example) the parameters which would be sent from a front-end call to a back-end function, and refactoring existing functions over a series of careful passes, with code review, debugging, and testing in between to ensure there were no regressions, errors, etc.

Now I work at a much higher level - typically a level where I feed conceptual requirements, as they come from my clients, to the LLM - often using my client's descriptive words, along with my explanations of the context involved - and the LLM makes all the changes required to UI, logic, database schema, etc., all at once, from that described conceptual idea.

Sometimes there's a bit of debugging required, but that's been reduced by orders of magnitude, over even a year ago. Mostly, I review detailed code explanations provided in the chat conversation, perform updates (upload/unpack zip files and re-run apps), and test functionality.

A big part of the real work is now centered around clearly explaining required functionality, which I try to clarify in writing, during initial conversations about requirement specifications with my clients. I get those requirements written in a format that the LLM will understand, with all the context the LLM will require - which is typically what I also need to understand, about how the solution might be implemented - but that implementation is *not written in stone. I'll give the AI the ability to conceive its own technical solution.

So, with that in mind, the other work I do most is building multiple branches, and then re-integrating features from successfully implemented branches, into my main project branch.

I'll often have several simultaneous branches of new features being developed, and I'll need to take fully working features from one branch, and integrate those features with another branch that has other fully working features implemented, without allowing for regressions in either of the working branches, or among any other existing functionality within the app.

The new zip file workflow has made this so much easier to handle. It just requires lots of clear explanation about what needs to be accomplished, at the structural level, but rarely at the line-by-line code level.

I'll explain in detail to the LLM that in version vXXX.zip we added _ functionality, and need to integrate that _ functionality into vYYY.zip of the app, without changing anything else about the functionality of version vYYY.zip. I've been absolutely astounded at how well GPT has been able to make these sorts of complex conceptual changes, even in significantly sized code bases. It can deal with extremely high level conceptual tasks, especially when the code needed to achieve a task has been completed - it does a better job of integrating existing working code, than having to build all new untested code, all at once.

The productivity gains of being able to work at that very high conceptual level, over working strictly at the level of adjusting functions, parameters, variables, etc., has dramatically altered how difficult it is to complete sweeping changes and code refactoring alterations in applications. It takes time to do all this well, but there are still manifold productivity improvements over what was possible even just a few months ago.

What used to require iterations over weeks of discussions about refactoring functionality can now often be accomplished several times, with several significant pieces of an application, in a single night.

What I'm finding is that I'm just taking on much deeper change requests, and relentlessly working to fully satisfy long term expectations and improvements to projects which may never have even been on the horizon in projects from just a few years ago.

This sort of productivity improvement couldn't have been achieved if I'd relied on doing all the work myself in a framework like Anvil, and just building pieces of code with the LLM. I need to give the LLM access to the entire project and all supporting files/context (server configuration, etc.), as well as freedom to choose tools which it knows best - while I work at defining requirements better, rather than defining narrowly conceived technical solutions better.

One of the most important changes I'm able to better explore and handle successfully now, is building multiple branches of solutions, comparing how they turn out, and picking which works best to form a long term foundation for current functionality and for future extensibility.

Being able to fearlessly build multiple branches in a single sitting, as opposed to the days/weeks/months it would have taken in the past, has changed absolutely everything.

Using my current full-project zip file workflow, I can build and abandon versions as needed, and refactor/combine chosen branches, without anything near the comparable work, time, and fatigue I would have experienced even a few months ago.

The most recent versions of GPT since 5.0 have made this sort of workflow much more easily achievable, and I think all the other frontier LLMs (Claude, Gemini, Grok) are right in line in terms of coding capability (including the best of the most recent open source LLMs: Kimi, Deepseek, GLM, and Minimax).

I just currently rely on the zip file project and context management which GPT has enabled, and none of the other LLMs do natively. That's been the biggest improvement I've experienced which has made workflows explode in productivity.

I'm sure I can write code just as well with the other models, especially in conjunction with other tools that help work on full source code repositories - but GPT's ability to manage context size even with large code bases, requiring no other tools beyond zip files - surgically refactoring the code in existing zip files, and providing the entire updated project as another single zip file - has been an absolute game changer.

I think most developers currently using LLMs, agentic tools and IDE plugins to write code have never even begun to explore this approach, and I haven't yet seen any other tooling which can effectively surpass this methodology in as many ways that affect the whole life cycle of how a project grows, changes, and improves. It's such a simple workflow, and the potential of this process to work on projects involving absolutely massive context is something I haven't even begun to reach the limits of yet.

Nick — 2025-12-25 14:16:56

The short version is that I now use GPT to build most production code, using its native tools to surgically edit code within zip files. This helps dramatically to manage context length and workflow complexity, without needing any other agentic tools, and it keeps costs to $20 per month for even the most complex projects.

I use Flask and all the typical Flask ecosystem libraries to handle ORM, auth, real-time interactions, etc. This keeps project sizes really small/lightweight (and able to be deployed virtually anywhere), and enables GPT to use language tools which it's trained on deeply (so it can typically build solutions easily, with virtually no manual debugging).

When discussions with GPT grow too long, I save the entire GPT conversation as an .mhtml file, to avoid ever needing to re-explain context from previous conversations.

GPT's native ability to work directly with zip files and .mhtml files has eliminated so much work, and has streamlined my development workflow to be much simpler, and manageable for even very large projects.

Managing context by having GPT work with the entire codebase of a project, entirely within zip files, has been the game changer which makes it possible to manage much larger projects with an LLM. In my experience, those few simple features built into GPT have replaced all the heavy agentic tools, IDEs, etc. used in more complex LLM workflows which I see everyone struggling to make work as effectively.

Whatever tools you use, try the zip file development process I've described above, with GPT. You'll be astonished at how well you can build and manage large projects using it alone, without any of the agentic tooling mess that everyone is pushing these days.

Nick — 2026-01-01 19:31:47

I've written about this, but to be clear about how my workflow goes: AI code generation can now handle database development entirely without hand-holding, so when building any sort of small application, I'll let GPT build the database schema and queries as needed to satisfy requirements. When using Flask, SQLAlchemy will be the default ORM. All the schema definitions are included in the code, so zip files contain all the info needed by GPT (or any other LLM) to work with the database. You don't need anything else, when building applications of any scale, to manage internal database schema and queries.
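To illustrate why nothing else is needed, here's a minimal hypothetical Flask-SQLAlchemy sketch (not from any real project) showing how the entire schema travels as ordinary Python inside the zip file:

    from flask import Flask
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///app.db"
    db = SQLAlchemy(app)

    # Because the schema is declared in code, GPT can read and alter it
    # like any other file in the project zip.
    class Patient(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(120), nullable=False)
        phone = db.Column(db.String(20))

    with app.app_context():
        db.create_all()   # create the tables directly from the models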

All that capability together provides a massively capable tool set to handle any situation. I'm currently at version 291 of one single app that's connected to the big Baserow project (~30,000 lines), and there are several other apps in that project suite which connect and communicate with one another via REST API, and by interactions with the Baserow database, all of which have evolved through at least a few dozen version iterations, and all more than 10,000 lines of code each.

GPT has been able to connect each app, entirely without any hand-written code, using the zip file project management system I've described above. That suite of applications is mature and complex, not just in terms of the number of features, but also the complexity of each feature - these apps go far beyond CRUD functionality. They include a deeply capable real-time messaging system wired into other apps, an entirely in-house Nominatim geocoding and OpenStreetMap mapping system (not the simple Google Maps integrations you see everywhere), an intricate SMS notification system, as well as real-time in-app notifications per user and group, and lots of other complex user/group oriented features.

Nick — 2026-01-01 21:14:32

I can't express how important the zip file workflow is to my productivity - it's the reason I'm using GPT almost exclusively these days. Otherwise, I'd have to use a lot of other agentic tools.

Most applications get built incrementally. So in a typical session, I typically start by uploading the current zip file of an existing project, and make incremental changes to it, to satisfy very specific requirements/change requests. If context beyond the existing structure of the application is needed, then .mhtml files of previous conversations can save a massive amount of time explaining any previously discussed information, but I typically only need that when the requirements related to that information haven't been completed - and that's rarely a situation I find myself in any more.

You've seen GPT's ability to surgically update entire projects contained in a zip file. Other models don't do that yet - instead, you typically need to install something like Claude Code and let it run on your local PC, giving it access to system resources on your command line, and using your RAM, CPU, etc. to get work done.

Most important, and what most people seem to miss, is that GPT has developed some absolutely fantastic internal methodologies for managing context size. You can get 1 million tokens of context in a conversation with Gemini, but upload a single large file, and the entire conversation context is spent. I've uploaded 16 MB documentation files to GPT, along with multiple other files, in a single conversation - GPT goes off in *another context, reads the files, summarizes to itself what it needs, without using up the context of your conversation. It does something similar with every code update it makes to a project contained in a project zip file. So in some sessions, I'm dealing with tens of millions of tokens, without ever making GPT sweat.

And when the conversation context does start getting polluted, I just start another conversation with an uploaded zip and/or mhtml file, and it starts fresh, managing contexts for those files, without immediately filling up the conversation context. I haven't seen other models' chat interfaces do that yet. The agentic frameworks do that, but at the expense of needing to work on your machine. I can use ChatGPT on my phone, or on any device, without having to install or give permissions to some mess of agentic tooling.

I saw this article, which was well written, and I'm sure based on solid experience, but it made me realize that I get all the same described benefits using the plain, simple vanilla Flask/Python ecosystem, and agents built into GPT, to generate code and fully build massive projects, without any other tooling whatsoever beyond the cheap VPSs needed to deploy projects:

https://www.kdnuggets.com/tech-stack-for-vibe-coding-modern-applications

I've used Claude Code and a pile of other agentic systems, and nothing comes close to the simplicity and effectiveness (not to mention the low cost, flexibility, and portability) of the system I've described with GPT and zip files. That simple tool set has been *staggeringly effective across an absolutely massive set of tough projects, with hundreds of deep revisions, over many months of development on each project.

I still only spend $20 per month for GPT to get all my production work completed - and that work has produced six figure revenue, working as a sole developer, with no other overhead, employees, or contractors, on a wide variety of projects which I never would have been able to take on otherwise.

After ChatGPT, Minimax now provides the first other hosted chat system which has all the built-in agentic capabilities required to work with full project zip files, the way I've been working with ChatGPT during the past half year.

I just used Minimax to update a project, using the same simple zip file workflow I've been using with GPT (upload the project zip file for a complete full-stack Flask application, prompt the model to complete required functionality updates, and download the updated zip file). It did a great job. Minimax appears to work just as well as GPT with Flask, Python, HTML, JS, etc. - and now it can be used to update complete projects automatically.

None of the other hosted chat systems have enabled this particular blend of built-in agentic capabilities yet. Most other chat systems don't even accept zip files as file attachments yet, so they're desperately behind in terms of providing the workflow I've become accustomed to using recently.

It may seem trivial if you haven't worked this way, but this set of capabilities is what saves me from having to install a pile of agentic tools on any particular local development machine (no Claude Code, no IDE, etc.), and from having to give those tools access to files on my local system. I can work from any device, even my phone, in any location, entirely within a remotely hosted chat environment. This workflow manages all the agentic tooling needed to open the code files in the zip package, as well as the spawning of agentic swarms that handle opening multiple context workflows to reason and research about required solutions - and then provides a file download of the final updated project - all without anything installed on my local computer (which is the current messy state of the art with most other AI based development systems, such as those based on Claude Code).

None of the other commercial or open-source frontier chat systems besides GPT have previously enabled this workflow. I'm absolutely tickled to have an alternative :)

These are some older discussions from when I was just getting started with the zip file technique. I no longer use the zip file scripts to work with other LLMs:

I typically start workflow conversations in GPT5, because its file handling is far superior to the other chat interfaces. You can upload an entire project in a single zip file, without polluting the entire context of a ChatGPT conversation. GPT will even accept and read pieces like SQLite databases and other binary data files - just pile them all into a single uploaded zip file.

After the context of a single chat has gotten too long for GPT, I'll often copy an entire conversation and upload it as a single text file, into a fresh new conversation, and response times will speed up dramatically. This is a fundamentally useful workflow hack, which I never hear people discuss. Without it, GPT would regularly grind to a halt and be useless for anything but toy projects. With it, GPT is a powerhouse (has been since version 4).

If/when GPT struggles with any task, I'll paste the *entire copied conversation from GPT5 (CTRL+a, CTRL+c on the entire UI chat page in ChatGPT) directly into a fresh new Gemini chat interface in Google AI Studio, and go from there with the same prompt workflow.

When copying any full conversation from ChatGPT, OpenWebUI, or any other chat interface, I make sure to always show all the thought processes, and especially all the 'analysis' processes throughout the conversation, so that Gemini and other LLMs can see all the generated code, even for zip files that GPT provides for download (the 'analyzing' sections typically contain all the code used to generate downloadable zip packages which contain multiple code files - just pop that section open so you can copy/paste it). Since Gemini and other LLMs only handle a small subset of file types, copy/pasting all those sections from any GPT output is the only way to neatly provide all the required info to other systems. Gemini and a few Qwen models have fully usable contexts of 1 million tokens (hundreds of pages of chat text), so this is typically a successful generalized workflow (i.e., have your initial project conversation with GPT - upload all existing project files to it - then copy the *entire conversation with thoughts and analysis sections into AI Studio, Qwen chat, OpenWebUI on your own server, LM Studio or Jan on your local machine, etc.)

Be aware that it's much tougher to copy/paste entire long conversations out of Google AI Studio. I'll typically just upload the resulting output files from a conversation with Gemini back into another GPT, Qwen, etc. chat interface, along with some context about project goals, and iterate the same way again, starting fresh with the current state of the generated codebase - exactly as you would with any code you've written.

I prefer to run code manually and feed output exceptions manually back into the chat context, discussing and steering the logic of debug cycles, instead of using automated agents. Current agents move a lot faster than humans, but they tend to spin off into long processes which can massively broaden the scope of a workflow when they don't realize that a simpler solution is available - but that's just my preference. When it's clear why an exception occurred, I'd rather spend a bit more time explaining the context of the error to the LLM than let it pollute and derail the context with thousands of unnecessary debug/test-cycle tokens.

Finally, using only the best-known languages and libraries makes the biggest difference. The quality of LLM output depends heavily on how much relevant material appeared in pre-training. You can provide in-context learning materials (LLMs are great at learning new REST APIs from documentation, for example), and even perform fine-tuning, but the benefit of billions of tokens of example code and documentation in a pre-training corpus will always outperform any later training materials. Flask, HTML, CSS, JS, Bootstrap, and jQuery seem to be extremely well represented in most training data (the total volume and quality of code examples, tutorials, books, etc. ever written and shared on the Internet during the past few decades), so those particular tools are hard to beat when it comes to what LLMs understand deeply: which code patterns in an ecosystem actually work and why, how various alternate solutions can be substituted and combined, etc.
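As a tiny illustration of how little ceremony that stack demands - a throwaway toy of my own, not from any project discussed here:

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Inline template just to keep this sketch self-contained; in a real project
# the page would live in /templates and pull Bootstrap from /static or a CDN
PAGE = """<!doctype html>
<html>
<head>
  <link rel="stylesheet"
        href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css">
</head>
<body class="p-4">
  <h1>{{ title }}</h1>
</body>
</html>"""

@app.route("/")
def index():
    return render_template_string(PAGE, title="Hello from a well-trodden stack")

if __name__ == "__main__":
    app.run(debug=True)
```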

Here's one little tool that's proved useful for my larger AI development workflows (of course the code was generated by GPT in a few seconds):

https://com-pute.com/nick/combine_folder.py

It combines all the files in a given folder, and all of that folder's subfolders, into a single text file, with the path and filename of each file inserted as a header before each file's content. This is helpful because, with such rapid development sessions, I often have the opportunity to explore multiple development paths. Perhaps I'll try different UI libraries, or implement different logical solutions, or experiment with how multiple interactivity patterns affect UX - basically, I'll build multiple complete versions of an application to see which paths are more usable, scalable, maintainable, etc. Perhaps I'll prefer the UI in one version, and the database structure, logical workflow, etc. in another (I just did that yesterday, choosing whether to keep data for a particular workflow in Baserow via API calls, or in a local SQLite database via SQLAlchemy).
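For illustration, here's a minimal sketch of the combiner idea - not the actual script at the URL above; the header format is my own:

```python
import os

def combine_folder(src_dir, out_file):
    """Concatenate every file under src_dir into one text file,
    with a relative-path header before each file's content."""
    with open(out_file, "w", encoding="utf-8") as out:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                out.write(f"\n===== {os.path.relpath(path, src_dir)} =====\n")
                # Text files only - binaries get mangled here; the zip
                # roundtrip script described below handles those via base64
                with open(path, encoding="utf-8", errors="replace") as f:
                    out.write(f.read())

combine_folder("myproject", "myproject_combined.txt")  # hypothetical names
```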

That whole exploratory effort can lead to huge messes of code, with many hundreds of files that are hard to manage - and only GPT currently enables zip file uploads. So I'll pack all the files in the project folder for each version of an application into a single text file (LLMs work most efficiently with pure text), using the script above, upload those single-file text versions of the application code to any LLM's chat interface, and prompt it to combine and integrate the preferred features of each. I can even do this with chat interfaces that don't enable file uploads - just copy/paste text.

This eliminates problems with, for example, Gemini not accepting zip files, or Qwen and others only accepting 5-10 attached files. In Flask, all my UI files are in the /templates and /static folders, and of course all the frontier models understand those conventions.

That iterative revision process can start anew in fresh chat sessions, ad infinitum, where the LLM and I always have a very small set of clean files or paste-able text to deal with. This keeps context size manageable, and I can combine/clean all the useful work from all the versions of exploratory application solutions, which would otherwise be too complicated to keep organized and understood in my head.

That stupid-simple tool dramatically broadens the range of project sizes and workflows that yield useful solutions. Little workflow improvements like that come from many hundreds of hours spent refining LLM-based software development patterns, and for me they have been far more successful than IDEs, agentic workflow tools, etc. Plain old chat with plain old text continues to work best for me. The chat-based workflow is more malleable and versatile, and enables the involvement of a wider range of intelligent frontier models, all of which bring different capabilities, perspectives, and strengths to solving problems and creating solutions.

I extended the utility script above to convert entire zip files to a single text file, and back again to a zip file. The entire folder structure, with all file names, is kept intact in both directions (zip2text and text2zip). You can optionally choose to include binary files as base64.

This enables quick upload of all the source code and supporting files in a project as a single text file, which any LLM can read and alter. Providing the complete source code and every other file used in a project is critical for any LLM development work to go well, and most LLM interfaces can't work directly with zip files, so this makes all the difference in the world.
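To make the mechanism concrete, here's a minimal sketch of the roundtrip idea - again, not the actual script linked below; the marker syntax and base64 handling are my own assumptions:

```python
import base64
import zipfile

# Sketch only: assumes this marker never begins a line inside a file's contents
MARK = "===== FILE: "

def zip2text(zip_path):
    """Flatten a zip archive into one text blob; binary entries become base64."""
    parts = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            data = zf.read(info.filename)
            try:
                body, kind = data.decode("utf-8"), "text"
            except UnicodeDecodeError:
                body, kind = base64.b64encode(data).decode("ascii"), "base64"
            parts.append(f"{MARK}{info.filename} [{kind}]\n{body}")
    return "\n".join(parts)

def text2zip(text, zip_path):
    """Rebuild the zip from the flattened text, restoring paths and binaries."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for chunk in ("\n" + text).split("\n" + MARK)[1:]:
            header, _, body = chunk.partition("\n")
            name, kind = header[:-1].rsplit(" [", 1)  # header ends "name [kind]"
            raw = base64.b64decode(body) if kind == "base64" else body.encode("utf-8")
            zf.writestr(name, raw)
```

The actual script presumably handles more edge cases; the point is just that a whole archive, folder structure and all, survives a trip through plain text.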

When you and the LLM are done making changes to project code, download the updated text file, convert it back to a zip package, then upload that zip file directly to your application server.

I find this process far and away more effective than all the IDE and agentic tools flooding the market. This methodology requires no complex tooling, nor even API access to any LLM.

Instructions are included in the script:

https://com-pute.com/nick/zip_text_roundtrip.py

I haven't done any production work using that little zip2text utility yet, but I have tried it with a small application example using GPT-OSS:20b, on my laptop with its little RTX 3080 Ti GPU, and that small open-source model understood the project and produced a working code update, which converted back into a zip file and ran successfully on a VPS server :)

Nick Antonaccio
Nick AntonaccioAdmin
May 05, 2026 at 12:42 (edited, 1 revision)
#5

The following text is extracted from a recent conversation I had with a colleague - perhaps it helps shed some light on how the zip file approach can be easily integrated with the local agent approach, and how the local agent approach can be switched between remotely hosted LLM API services (OpenRouter, or any LLM API from OpenAI, Anthropic, etc.) and a locally hosted LLM API. What I should add eventually is a quick discussion of how those approaches can be wired directly into a Git workflow:

The idea with the zip file approach is that you keep the entire code base, all configuration settings, documentation, supporting files, etc., in one single zip file. When you want to work on it, you upload the zip to ChatGPT, it rips the zip file apart, updates any code files and settings that need to be adjusted, and gives you back another newly packaged zip file that contains everything you and the LLM need to fully understand and deploy the project.

The real benefit of that routine is that ChatGPT basically has unlimited context to work with, because internally it can spawn endless sub-agents that complete portions of the required work. Those tools are all built into the hosted ChatGPT system (most other hosted chat systems don't currently accept zip file uploads).

When you're done working on a development task, you just upload the zip file to your server, unpack it, and run it (I use SCP to send zip files to the server). That means you always have every version of the application on the server, and you can roll back simply by unzipping any of the older zip files and running the app again.
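Rolling back really is just re-extracting an older archive - something like this on the server (file names and layout are hypothetical):

```python
import shutil
import sys
import zipfile

# Hypothetical layout: app_v001.zip, app_v002.zip, ... kept next to the app dir
def deploy(zip_path, app_dir="app"):
    """Replace app_dir with the contents of one versioned archive.
    Rolling back is just re-running this with an older zip."""
    shutil.rmtree(app_dir, ignore_errors=True)  # clear stale files first
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(app_dir)

if __name__ == "__main__":
    deploy(sys.argv[1])  # e.g. python deploy.py app_v041.zip
```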

By far the biggest benefit of using ChatGPT is having one of the absolute best frontier models doing the work, without any rate limiting at all (as long as you're not wasting your ChatGPT account usage on generating images and other token-heavy processes).

If you want to work with a local agent like Pi, you just unzip the project zip file directly into a folder on the computer where you're running Pi, and have it do its work in that folder. That workflow involves prompting basically like you would with ChatGPT - Pi just enables the LLM to touch all the files on your hard drive: read, write, run, read debug output, and continue iterating in an unattended loop until your goal is satisfied. When you're done with all that work, you zip up the project folder, upload it to your server, and/or later work with that zip file back in ChatGPT.

Since you can use any LLM in Pi - local or hosted - you can switch between working online or offline anytime you need: back and forth between the great intelligence in ChatGPT, with all the basically free tokens OpenAI gives you for $20 a month, and any other LLM, including an offline one when you travel.

This makes the whole development approach completely modular - you can switch between any tools and services you want. As long as everything you need for a project is always contained in a zip file, the entire project stays portable between any tooling you choose. What makes ChatGPT special is that it's the only hosted chat system with all the tools built in to let you work with zip files like that - well, actually, Minimax can do the same thing, but it's not quite as good as ChatGPT, and in the end it costs much, much more to use, even though it's a very inexpensive system compared to everything else. ChatGPT gives me what would be $3,000-$5,000 a month of usage if I were doing the same work with Claude Code.

One thing that might not be apparent about Pi is that it can unzip your project zip file, install all the environment requirements for you, start and stop the app, and package the project directory back into a zip file when you're done working - just ask.
