Local AI Studio — Part 2: Driving ComfyUI From Code

This is Part 2 of the series. In Part 1 we got ComfyUI running on the Mac's GPU. Now the
fun part.

The ComfyUI web interface is lovely for designing a workflow. But for actually
producing things — fifty variations, a nightly batch, one image per blog post — clicking
is the wrong tool. The right tool is hiding in plain sight:

A ComfyUI workflow is just JSON, and there's an HTTP API behind the GUI. So generation is really a function: JSON in, image out.

Workflows are JSON

In "API format," a workflow is a flat object of nodes. Each node has a class_type and an
inputs map, and inputs reference other nodes as ["node_id", output_index]. Here's a
complete, minimal SDXL text-to-image graph:

{
  "1": { "class_type": "CheckpointLoaderSimple",
         "inputs": { "ckpt_name": "sd_xl_base_1.0.safetensors" } },
  "2": { "class_type": "CLIPTextEncode",
         "inputs": { "clip": ["1", 1], "text": "a jaguar on a mossy Mayan ruin, golden light" } },
  "3": { "class_type": "CLIPTextEncode",
         "inputs": { "clip": ["1", 1], "text": "blurry, low quality, watermark" } },
  "4": { "class_type": "EmptyLatentImage",
         "inputs": { "width": 1024, "height": 1024, "batch_size": 1 } },
  "5": { "class_type": "KSampler",
         "inputs": { "model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 777, "steps": 25, "cfg": 7.0,
                     "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1.0 } },
  "6": { "class_type": "VAEDecode", "inputs": { "samples": ["5", 0], "vae": ["1", 2] } },
  "7": { "class_type": "SaveImage", "inputs": { "images": ["6", 0], "filename_prefix": "demo" } }
}

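Read it top to bottom and it's just the pipeline: load checkpoint → encode a positive and a
negative prompt → make an empty latent → sample → decode → save. The second element of each
reference picks the output slot: CheckpointLoaderSimple emits the model, CLIP, and VAE as
outputs 0, 1, and 2, which is why the text encoders read ["1", 1] and the decoder reads ["1", 2].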
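
To make the reference convention concrete, here is a minimal sketch that walks an API-format workflow and lists every link between nodes. The helper name list_links and the two-node excerpt are illustrative assumptions, not part of ComfyUI; the only thing taken from the format itself is that a reference is a two-element [node_id, output_index] list.

```python
# Hypothetical helper (not a ComfyUI API): list every node-to-node link in a workflow.
def list_links(workflow):
    """Return (source_id, output_index, target_id, input_name) for each reference."""
    links = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            # A reference is a two-element list: [node id as a string, output index].
            is_ref = (isinstance(value, list) and len(value) == 2
                      and isinstance(value[0], str) and isinstance(value[1], int))
            if not is_ref:
                continue  # plain literal such as a prompt string or a step count
            if value[0] not in workflow:
                raise KeyError(f"{node_id}.{name} points at missing node {value[0]}")
            links.append((value[0], value[1], node_id, name))
    return links

# Two-node excerpt of the graph above, trimmed so the example stays short.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "a jaguar on a mossy Mayan ruin"}},
}

for src, out_idx, dst, name in list_links(workflow):
    print(f"node {src} output {out_idx} -> node {dst} input '{name}'")
```

A check like this is cheap to run before queueing anything, since a dangling reference is an easy mistake to make when editing the JSON by hand.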

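Tip: you don't have to write these by hand. Build a graph once in the GUI and export it in
API format. Depending on your ComfyUI version, that is a "Save (API Format)" button that
appears after you enable the dev mode options in the settings, or an API export entry in the
Workflow menu; either way it produces exactly this shape. You can also query the running
server at GET /object_info for the precise name and inputs of every node. I lean on that
constantly when wiring up a model I haven't used before.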
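
As a rough sketch of how the /object_info lookup might be used, the snippet below pulls the required inputs for one node class out of the response. The function names are my own, and the response shape (class name → "input" → "required" → input name → [type, options]) is an assumption based on what current servers return, so check it against your own instance. The sample dictionary is a hand-written excerpt so the example runs without a server.

```python
import json
import urllib.request

def fetch_object_info(host="127.0.0.1:8188"):
    """Fetch the full node catalogue from a running ComfyUI server."""
    with urllib.request.urlopen(f"http://{host}/object_info") as resp:
        return json.loads(resp.read())

def required_inputs(object_info, class_type):
    """Map each required input name to its declared type (or list of choices)."""
    required = object_info[class_type]["input"].get("required", {})
    # Each spec is a list whose first element is the type, e.g. ["LATENT"].
    return {name: spec[0] for name, spec in required.items()}

# Hand-written excerpt in the assumed response shape, so this runs offline.
sample = {
    "VAEDecode": {
        "input": {"required": {"samples": ["LATENT"], "vae": ["VAE"]}},
    }
}

print(required_inputs(sample, "VAEDecode"))
```

Against a live server, required_inputs(fetch_object_info(), "KSampler") would list the fields the sampler node in the graph above has to fill in.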

The three endpoints you actually need

The API is bigger than this, but generation only needs three calls:

POST /prompt — queue a workflow. Returns a prompt_id.
GET /history/{prompt_id} — once it's done, this holds the results (and the output filenames).
GET /view?filename=... — download an output file.

A ~30-line runner

Here's the whole helper I use — submit a workflow, wait, save whatever it produced. No
dependencies beyond the standard library:

import json, os, time, uuid, urllib.parse, urllib.request

HOST = "127.0.0.1:8188"

def post(path, data):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(f"http://{HOST}{path}", data=json.dumps(data).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def get(path):
    """GET a path and return the decoded JSON response."""
    with urllib.request.urlopen(f"http://{HOST}{path}") as resp:
        return json.loads(resp.read())

def generate(workflow, out="api_out", timeout=600):
    pid = post("/prompt", {"prompt": workflow, "client_id": str(uuid.uuid4())})["prompt_id"]
    print("queued", pid)
    deadline = time.monotonic() + timeout
    while pid not in (hist := get(f"/history/{pid}")):
        if time.monotonic() > deadline:     # don't spin forever if the job never lands
            raise TimeoutError(f"prompt {pid} not in history after {timeout}s")
        time.sleep(2)                       # poll until it lands in history
    os.makedirs(out, exist_ok=True)
    for node in hist[pid]["outputs"].values():
        for item in node.get("images", []):
            q = urllib.parse.urlencode({k: item.get(k, "") for k in ("filename", "subfolder", "type")})
            with urllib.request.urlopen(f"http://{HOST}/view?{q}") as resp:
                data = resp.read()
            with open(os.path.join(out, item["filename"]), "wb") as f:
                f.write(data)
            print("saved", item["filename"])

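That's the entire integration. Given a small helper that loads an exported workflow and
swaps in the prompt text, generate(load_workflow_and_set_prompt("...")) is a one-liner, and
a batch is just a for loop. Want ten seeds? Loop and overwrite the seed field. Want to walk
away while it renders a hundred images? It's a queue: POST them all up front (generate as
written waits for each one, so split the submit step out first) and the server works
through them.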
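
The seed loop can be sketched like this. The helper with_prompt_and_seed is a hypothetical name, and the default node ids "2" and "5" are assumptions that match the positive prompt and KSampler nodes in the graph above; an export from your own GUI session will likely number its nodes differently.

```python
import copy

def with_prompt_and_seed(workflow, text, seed, prompt_node="2", sampler_node="5"):
    """Return a copy of the workflow with a new prompt text and sampler seed."""
    wf = copy.deepcopy(workflow)             # leave the template untouched
    wf[prompt_node]["inputs"]["text"] = text
    wf[sampler_node]["inputs"]["seed"] = seed
    return wf

# Trimmed template: only the two nodes this helper edits (ids assumed as above).
base = {
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "placeholder"}},
    "5": {"class_type": "KSampler",
          "inputs": {"seed": 777, "steps": 25, "cfg": 7.0}},
}

# Ten variations of one prompt, each with its own seed.
variants = [with_prompt_and_seed(base, "a toucan over a volcano", seed)
            for seed in range(1000, 1010)]

print(len(variants), "workflows, seeds",
      [wf["5"]["inputs"]["seed"] for wf in variants])
```

Each entry in variants is a complete workflow dictionary, so with the full seven-node graph as the template it can be handed straight to generate. Copying the template instead of mutating it keeps the original around for the next batch.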

The recursion I promised

The featured images in this very series are generated this way. The look I was after is the
blog's house style — Cubist fragmentation rendered in a bright Central American folk-art
palette: geometric planes, bold black outlines, saturated primaries, and naive joyful
motifs (suns, birds, hearts, stylized faces), with no text anywhere. That whole aesthetic
lives in a single reusable style string that I prepend to each post's specific subject, so
every cover is unmistakably from the same hand. I had been producing these with a cloud
model; for this series I rewrote the generator to point at ComfyUI instead — same style
string, same per-post concepts, but now the SDXL graph above runs on my desk and the result
is converted to a .webp and dropped straight into the blog's media folder.

So the image at the top of this post was painted by the exact code this post describes.
That's the thing I find genuinely powerful about the API approach: generation stops being a
place you go and becomes a step in your pipeline.

In Part 3 we put real numbers on it (SDXL vs FLUX on Apple Silicon), and I eat my words
about a "speed fix" that turned out to do nothing at all.