Skip to content

Getting Started

This guide gets you from zero to running GPU workloads and hosted web apps on Modal in a few minutes.

Prerequisites

Requirement Why
Python 3.10+ Runtime for the CLI
Modal account You need token_id and token_secret from modal.com/settings
modal CLI Must be installed and available in PATH (pip install modal)

Install

From PyPI

pip install m-gpux

From source

git clone https://github.com/PuxHocDL/m-gpux.git
cd m-gpux
pip install -e .

Verify the install:

m-gpux --help

Step 1: Add your first profile

m-gpux account add

You will be prompted for:

Field Description Where to find it
Profile name A label like personal or work You choose this
Token ID Modal API token ID modal.com/settings -> API Tokens
Token Secret Modal API token secret Same page -> API Tokens, shown once at creation

Profiles are stored in ~/.modal.toml. You can add as many as you need.

Step 2: Verify your profiles

m-gpux account list

You will see a table of all configured profiles, with the active one marked.

Step 3: Launch a GPU session

For a persistent remote devbox:

m-gpux dev

For the full wizard:

m-gpux hub

The interactive hub walks you through:

  1. Select a profile
  2. Pick a GPU
  3. Pick an action: Jupyter Lab, Run Python script, Web Bash shell, or vLLM inference
  4. Review the generated modal_runner.py
  5. Press Enter to launch, or edit the script first

What happens under the hood

The hub generates a Modal deployment script (modal_runner.py) with your chosen GPU and action, then runs it with Modal. The script is fully editable, so you can add pip packages, change timeouts, or customize the container image before launch.

Smooth browser terminal

The Web Bash shell uses direct bash by default for smoother interaction and cleaner rendering. tmux is still installed; run tmux manually when you want detachable sessions.

Tracked sessions

Detached dev and Hub sessions are tracked locally. Use m-gpux sessions list to find them, m-gpux sessions pull <id> to recover files, and m-gpux sessions stop <id> to release compute.

Step 4: Host your first web app

For a FastAPI app:

m-gpux host asgi --entry main:app

For a Flask app:

m-gpux host wsgi --entry app:app

For a static site:

m-gpux host static --dir ./site

The hosting wizard asks for:

  1. Modal profile
  2. App name
  3. CPU or GPU compute
  4. Dependencies or requirements.txt
  5. Upload exclude patterns
  6. Warm replica strategy
  7. deploy vs run

Recommended next read

The complete walkthrough is in Web Hosting, including project layout examples, generated Modal decorators, and troubleshooting tips.

Step 5: Deploy a Compose stack

If you already have a docker-compose.yml, you can lift it onto Modal directly:

m-gpux compose check
m-gpux compose up

For heavier Docker-native stacks, use VM mode:

m-gpux compose vm check
m-gpux compose vm up

Compose guide

The dedicated walkthrough is in compose.md, including deployment modes, file discovery, sync, and x-mgpux overrides.

Step 6: Check your costs

m-gpux billing usage --days 30 --all

This aggregates usage across all your configured profiles for the last 30 days.

To open the Modal billing dashboard in your browser:

m-gpux billing open

Step 7: Deploy an LLM API

Turn any HuggingFace model into a production OpenAI-compatible API with authentication.

1. Create an API key

m-gpux serve keys create --name my-key

This generates a sk-mgpux-... key and stores it in ~/.m-gpux/api_keys.json.

2. Deploy with the interactive wizard

m-gpux serve deploy

The wizard guides you through model choice, GPU, context length, engine tuning, keep-warm behavior, and API key selection.

3. Test your endpoint

curl https://<workspace>--m-gpux-llm-api-serve.modal.run/health

4. Stop when done

m-gpux serve stop
m-gpux stop --all

Typical daily workflow

Morning
  m-gpux account switch work
  m-gpux hub
  ... train / experiment ...

Afternoon
  m-gpux host asgi --entry main:app
  m-gpux serve deploy

Evening
  m-gpux stop --all
  m-gpux billing usage --days 1 --all