Getting Started¶
This guide gets you from zero to running GPU workloads and hosted web apps on Modal in a few minutes.
Prerequisites¶
| Requirement | Why |
|---|---|
| Python 3.10+ | Runtime for the CLI |
| Modal account | You need token_id and token_secret from modal.com/settings |
modal CLI |
Must be installed and available in PATH (pip install modal) |
Install¶
From PyPI¶
pip install m-gpux
From source¶
git clone https://github.com/PuxHocDL/m-gpux.git
cd m-gpux
pip install -e .
Verify the install:
m-gpux --help
Step 1: Add your first profile¶
m-gpux account add
You will be prompted for:
| Field | Description | Where to find it |
|---|---|---|
| Profile name | A label like personal or work |
You choose this |
| Token ID | Modal API token ID | modal.com/settings -> API Tokens |
| Token Secret | Modal API token secret | Same page -> API Tokens, shown once at creation |
Profiles are stored in ~/.modal.toml. You can add as many as you need.
Step 2: Verify your profiles¶
m-gpux account list
You will see a table of all configured profiles, with the active one marked.
Step 3: Launch a GPU session¶
For a persistent remote devbox:
m-gpux dev
For the full wizard:
m-gpux hub
The interactive hub walks you through:
- Select a profile
- Pick a GPU
- Pick an action: Jupyter Lab, Run Python script, Web Bash shell, or vLLM inference
- Review the generated
modal_runner.py - Press Enter to launch, or edit the script first
What happens under the hood
The hub generates a Modal deployment script (modal_runner.py) with your chosen GPU and action, then runs it with Modal. The script is fully editable, so you can add pip packages, change timeouts, or customize the container image before launch.
Smooth browser terminal
The Web Bash shell uses direct bash by default for smoother interaction and cleaner rendering. tmux is still installed; run tmux manually when you want detachable sessions.
Tracked sessions
Detached dev and Hub sessions are tracked locally. Use m-gpux sessions list to find them, m-gpux sessions pull <id> to recover files, and m-gpux sessions stop <id> to release compute.
Step 4: Host your first web app¶
For a FastAPI app:
m-gpux host asgi --entry main:app
For a Flask app:
m-gpux host wsgi --entry app:app
For a static site:
m-gpux host static --dir ./site
The hosting wizard asks for:
- Modal profile
- App name
- CPU or GPU compute
- Dependencies or
requirements.txt - Upload exclude patterns
- Warm replica strategy
deployvsrun
Recommended next read
The complete walkthrough is in Web Hosting, including project layout examples, generated Modal decorators, and troubleshooting tips.
Step 5: Deploy a Compose stack¶
If you already have a docker-compose.yml, you can lift it onto Modal directly:
m-gpux compose check
m-gpux compose up
For heavier Docker-native stacks, use VM mode:
m-gpux compose vm check
m-gpux compose vm up
Compose guide
The dedicated walkthrough is in compose.md, including deployment modes, file discovery, sync, and x-mgpux overrides.
Step 6: Check your costs¶
m-gpux billing usage --days 30 --all
This aggregates usage across all your configured profiles for the last 30 days.
To open the Modal billing dashboard in your browser:
m-gpux billing open
Step 7: Deploy an LLM API¶
Turn any HuggingFace model into a production OpenAI-compatible API with authentication.
1. Create an API key¶
m-gpux serve keys create --name my-key
This generates a sk-mgpux-... key and stores it in ~/.m-gpux/api_keys.json.
2. Deploy with the interactive wizard¶
m-gpux serve deploy
The wizard guides you through model choice, GPU, context length, engine tuning, keep-warm behavior, and API key selection.
3. Test your endpoint¶
curl https://<workspace>--m-gpux-llm-api-serve.modal.run/health
4. Stop when done¶
m-gpux serve stop
m-gpux stop --all
Typical daily workflow¶
Morning
m-gpux account switch work
m-gpux hub
... train / experiment ...
Afternoon
m-gpux host asgi --entry main:app
m-gpux serve deploy
Evening
m-gpux stop --all
m-gpux billing usage --days 1 --all