m-gpux Documentation¶

Welcome to the official docs for m-gpux, a production-focused CLI toolkit for Modal GPU operations.

One CLI to manage profiles, launch GPU runtimes, deploy web apps and LLM APIs, and track cloud costs.

What is m-gpux?¶

m-gpux turns Modal's serverless GPU platform into a streamlined developer experience:

Capability	Description
Multi-profile management	Add, switch, and remove Modal identities, all stored in `~/.modal.toml`
Dev Container Mode	Turn the current folder into a persistent Modal CPU/GPU devbox with Volume-backed `/workspace`
Interactive GPU Hub	Guided wizard to launch Jupyter Lab, run Python scripts, or open a web shell on any GPU
Session Manager	Track running Hub/dev sessions, pull remote workspaces, view logs, and stop apps
Workload Presets	Save repeatable compute, dependency, and exclude settings for common workloads
Web Hosting	Deploy ASGI apps, WSGI apps, and static sites with generated Modal templates, dependency prompts, and deploy/run modes
Docker Compose Lift-and-Shift	Analyze local Compose files, generate Modal deployments, and sync app code into running stacks
Vision Training	Generate sample image data, then train classification models from local folders with configurable model, GPU, optimizer, scheduler, and checkpointing
LLM API Server	Deploy any HuggingFace model as an OpenAI-compatible endpoint with Bearer token auth, streaming, and warm containers
API Key Management	Create, list, show, and revoke `sk-mgpux-*` keys stored locally in `~/.m-gpux/api_keys.json`
Billing Dashboard	Inspect 7/30/90-day usage per profile or aggregated across all accounts
GPU Metrics Probe	Live hardware utilization (GPU %, VRAM, temperature) on running containers
App Lifecycle	Stop any running m-gpux app (Jupyter, shells, hosted apps, LLM servers) from one command

Quick Install¶

pip install m-gpux

Or from source:

git clone https://github.com/PuxHocDL/m-gpux.git
cd m-gpux && pip install -e .

Requirements

Python 3.10+, Modal CLI installed (pip install modal), and at least one Modal account with token_id / token_secret.

Start Here¶

Page	What you'll learn
Getting Started	Install, add your first profile, and launch a GPU session in 5 minutes
Command Reference	Every command, flag, and option with examples
Dev Container Mode	Use `m-gpux dev` as a persistent Modal-powered project workspace
Session Manager	Manage tracked dev and Hub sessions
Workload Presets	Save and rerun common launch configs
Recipes	Practical flows for devboxes, RL training, hosting, and file recovery
Web Hosting	Host FastAPI, Flask, Django, or static sites on Modal with `m-gpux host`
Docker Compose	Analyze, deploy, and sync Compose stacks on Modal
Vision Training	End-to-end image classification workflow on Modal GPUs
Architecture	How m-gpux works internally: proxy layer, template generation, profile resolution
FAQ & Troubleshooting	Common errors and how to fix them

Common Workflows¶

cd my-project
m-gpux dev

m-gpux dev launches a browser terminal backed by a Modal Volume. Local files refresh into /workspace every launch, while remote-only outputs stay available until you pull or clean them.

m-gpux sessions list
m-gpux sessions pull <session-id> --to ./m-gpux-workspace

2. Launch Jupyter on a GPU¶

m-gpux account add
m-gpux hub

The hub generates a modal_runner.py script, shows it for review, then executes modal run to start a GPU-backed Jupyter Lab with a public URL.

Hub terminal update

The hub can launch Jupyter, Python scripts, vLLM serving, or a clean VS Code-like Web Bash terminal. The terminal uses direct bash by default, keeps tmux optional, and reduces WebSocket heartbeat noise for smoother interaction.

3. Deploy an LLM as an OpenAI-compatible API¶

m-gpux serve keys create --name prod
m-gpux serve deploy

The wizard walks through:

Model 11 presets or a custom HuggingFace model ID
GPU choose the hardware for inference
Context length max sequence length
Engine tuning GPU memory utilization, max concurrent sequences, tensor parallel size
Keep warm 0 scales to zero, 1+ keeps container(s) always running
API key pick an existing key or auto-create one

After deploy, monitor your server with the live dashboard:

m-gpux serve dashboard

4. Train an image classification model¶

m-gpux vision sample-data
m-gpux vision train --dataset ./data/m-gpux-vision-sample

The vision wizard walks through:

Dataset folder accepts train/, val/, optional test/ splits or a single root folder with class subdirectories
Model choose from many TorchVision backbones such as ResNet, EfficientNet, ConvNeXt, DenseNet, ViT, Swin, and more
Training knobs GPU, epochs, batch size, image size, optimizer, scheduler, augmentation, mixed precision, and early stopping
Artifacts checkpoints and metrics are persisted in a Modal Volume for later download with modal volume get

After training, run inference on fresh local images:

m-gpux vision predict

m-gpux host asgi --entry main:app

The hosting flow supports:

ASGI FastAPI, Starlette, Quart, Django ASGI
WSGI Flask, Django WSGI
Static plain HTML, CSS, and JavaScript folders

During the wizard, m-gpux asks for:

App name
CPU or GPU compute
Python dependencies or requirements.txt
Upload exclude patterns
Warm replica strategy
deploy vs run

Full web guide

The complete walkthrough lives in Web Hosting, including project layouts, generated Modal patterns, scaling behavior, and troubleshooting.

cd my-compose-project
m-gpux compose check
m-gpux compose up

Use VM mode when the stack needs fuller image behavior:

m-gpux compose vm check
m-gpux compose vm up

If you keep editing local code after launch, m-gpux compose sync can stream changes into the running workspace volume.

7. Save A Reusable Workload Preset¶

m-gpux preset create
m-gpux preset run rl-a100

Hub and dev mode can also ask whether you want to save a preset after you configure a workload.

8. Check costs across all accounts¶

m-gpux billing usage --days 7 --all

Aggregates compute spend from every configured profile into a single Rich table.

9. Stop running apps and release GPUs¶

m-gpux stop --all
m-gpux serve stop

Pro workflow

Keep one profile for personal experiments and one for team workloads, then run m-gpux billing usage --all weekly to track total burn across both.

Supported GPUs¶

m-gpux supports all Modal GPU types:

#	GPU	VRAM	Best for
1	T4	16 GB	Light inference, exploration
2	L4	24 GB	Cost/performance balance
3	A10G	24 GB	Training and inference
4	L40S	48 GB	Large-batch inference
5	A100	40 GB	High-performance training
6	A100-40GB	40 GB	Ampere 40GB variant
7	A100-80GB	80 GB	Large models (30B+)
8	RTX PRO 6000	48 GB	Pro workstation GPU
9	H100	80 GB	Hopper architecture
10	H100!	80 GB	H100 reserved (guaranteed)
11	H200	141 GB	HBM3e, next-gen Hopper
12	B200		Blackwell architecture
13	B200+		B200 reserved (guaranteed)

Links¶

PyPI: pypi.org/project/m-gpux
Repository: github.com/PuxHocDL/m-gpux
Issues: github.com/PuxHocDL/m-gpux/issues
Modal docs: modal.com/docs