Hosted GPU Workflows
Sometimes your local machine doesn't have enough GPU power for what you want to do — maybe you're running large models that need 24+ GB of VRAM, or you want burst capacity for a heavy generation session without investing in new hardware. Sweet Tea supports connecting to hosted GPU runtimes on services like RunPod and Vast.ai, letting you rent the horsepower you need and work with the same Studio interface you already know.
When to Use Hosted GPUs
Hosted GPUs make sense when:
- Your local GPU doesn't have enough VRAM for the models you want to run
- You need faster generation throughput for a time-limited project
- You want a standardized environment that you can spin up and tear down on demand
- You're working from a machine with no GPU at all (laptop, etc.)
For day-to-day work with models that fit your local GPU, running locally is simpler and cheaper. Hosted GPUs are best as an on-demand power boost.
The Three-Layer Model
Hosted GPU sessions work with three layers:
- Studio (your local machine) — The creative control surface. You're still using Prompt Studio, Gallery, and all the same tools locally. Studio just sends generation requests to the remote engine instead of a local one.
- Hosted runtime (the remote GPU) — A cloud instance running ComfyUI with your models, nodes, and Sweet Tea's backend. This is where the actual image generation happens.
- Durable storage (R2 or similar) — Object storage that persists your models, workflows, and outputs between sessions. When you spin up a new pod, it syncs from storage so your environment is ready.
Setting Up a Hosted Session
1. Configure storage credentials
Set up your cloud storage connection with these environment variables (R2 or another S3-compatible service):
R2_ACCOUNT_ID=your_account_id
R2_ACCESS_KEY_ID=your_access_key
R2_SECRET_ACCESS_KEY=your_secret_key
R2_BUCKET=your_bucket_name
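It's worth checking these credentials before you pay for GPU time. A minimal sketch, assuming Cloudflare R2's documented S3-compatible endpoint format (every value here is a placeholder, not a real credential):

```shell
# Placeholder credentials -- substitute your real values.
export R2_ACCOUNT_ID=abc123
export R2_ACCESS_KEY_ID=your_access_key
export R2_SECRET_ACCESS_KEY=your_secret_key
export R2_BUCKET=sweet-tea-assets

# R2 exposes an S3-compatible API at a per-account endpoint:
export R2_ENDPOINT="https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com"
echo "$R2_ENDPOINT"
```

With these set, any S3-compatible client can confirm access, e.g. `aws s3 ls "s3://$R2_BUCKET" --endpoint-url "$R2_ENDPOINT"`.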
2. Provision the runtime
Launch a pod on your provider (RunPod, Vast.ai, etc.) using a Sweet Tea Image template. The image comes with Studio, ComfyUI, and nginx pre-configured.
Default access points on the hosted runtime:
| Port | Service |
|---|---|
| :3000 | Unified proxy (Sweet Tea UI + ComfyUI + file access) |
| :8000 | Sweet Tea backend and UI directly |
| :8188 | ComfyUI directly |
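Before connecting Studio, it helps to confirm each service answers. A hedged sketch using `curl`: the pod address is hypothetical, and only ComfyUI's `/system_stats` route is a known endpoint, so the other two services are simply probed at their roots:

```shell
# Hypothetical pod address -- substitute the one your provider assigns.
POD="http://localhost"

# check <label> <url>: report whether the service responds at all.
check() {
  if curl -fsS --max-time 5 "$2" > /dev/null 2>&1; then
    echo "$1: up"
  else
    echo "$1: unreachable"
  fi
}

check "proxy (:3000)"   "$POD:3000/"
check "backend (:8000)" "$POD:8000/"
check "ComfyUI (:8188)" "$POD:8188/system_stats"   # ComfyUI's status endpoint
```

If the proxy on :3000 is reachable, the direct ports are mainly useful for debugging.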
3. Wait for startup sync
On first boot, the runtime syncs scripts, custom nodes, workflows, and user data from your storage bucket to the local SSD. Larger assets (models, media) continue syncing in the background.
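This two-phase ordering can be imitated if you ever need to rebuild a pod by hand. A dry-run sketch, assuming an `aws`-CLI-style sync and an illustrative bucket layout (the image's real startup script may differ); drop `DRY_RUN` to actually execute:

```shell
# DRY_RUN=echo prints each command instead of running it.
DRY_RUN=echo
R2_BUCKET=${R2_BUCKET:-sweet-tea-assets}

# sync_down <bucket-prefix> <local-dir>
sync_down() { $DRY_RUN aws s3 sync "s3://$R2_BUCKET/$1" "$2"; }

# Small, session-critical assets first (these block startup):
sync_down scripts   /workspace/scripts
sync_down workflows /workspace/workflows

# Heavy assets continue in the background:
sync_down models /workspace/ComfyUI/models &
wait
```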
4. Connect Studio
In your local Studio's Settings > Engine, enter the hosted runtime's URL. Studio connects the same way it would to a local ComfyUI instance.
5. Link your account (optional)
If you want cloud continuity features on the hosted session, use the browser relay flow from Connecting Your Account. The hosted runtime supports the same linking handshake as a local install.
Session Runbook
Once your hosted session is up and running, follow this sequence for a productive session:
- Verify engine health — Confirm Studio shows a connected engine in the status bar.
- Run a validation generation — One simple test with a known-good Pipe to confirm end-to-end health.
- Do your production work — Generate, iterate, curate as normal.
- Run `sync-up` before shutdown — This pushes your session's work (outputs, new models, workflow changes) back to durable storage.
- Verify sync completion — Check the logs for confirmation that sync finished.
- Terminate the runtime — Shut down the pod.
Warning: Don't skip the `sync-up` step. Hosted pod storage is ephemeral — when you terminate the instance, anything not synced back to your bucket is gone. Make `sync-up` part of your shutdown ritual.
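A small guard makes the sync step hard to skip. A sketch of a shutdown wrapper; the `sync-up` command and its "sync complete" log marker are assumptions about the image, so adjust the marker to whatever your logs actually print:

```shell
# guard_shutdown <sync command...>: run the sync, keep its log, and only
# green-light termination if a completion marker appears in the output.
guard_shutdown() {
  local logfile
  logfile=$(mktemp)
  "$@" 2>&1 | tee "$logfile"
  if grep -qi "sync complete" "$logfile"; then
    echo "sync verified: safe to terminate"
  else
    echo "sync NOT verified: keep the pod running" >&2
    return 1
  fi
}

# In a real session: guard_shutdown sync-up
# Stand-in command for illustration:
guard_shutdown echo "sync complete"
```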
Cost Control
Hosted GPU time costs money, so a few habits keep your bills reasonable:
- Keep warm-up short — Have your storage and templates ready before you spin up the pod. Don't leave an instance running while you're configuring things.
- Validate before heavy batches — Run one cheap test generation before committing to expensive large-batch or high-resolution runs.
- Align regions — Keep your storage bucket and GPU instance in the same region to minimize data transfer costs and latency.
- Shut down promptly — When you're done, sync and terminate. Don't leave pods running overnight.
- Use persistent volumes wisely — Some providers offer persistent storage that survives pod restarts. This saves startup sync time but costs extra.
Pre-Shutdown Checklist
Before terminating a hosted session:
- All critical outputs verified and visible in Gallery
- `sync-up` command executed
- Sync completion confirmed in logs
- Session notes recorded (what was done, any issues encountered)
- Runtime terminated
Tip: If you work with hosted GPUs regularly, keep a simple session log — date, pod type, what you worked on, any issues. It helps when you're debugging a "my outputs disappeared" situation two weeks later.
