
Application Development

Runpod API integration

Ship Application Development features without building the integration. Full Runpod API access via Proxy and 50+ MCP-ready tools for AI agents — extend models and mappings to fit your product.


Use Cases

Why integrate with Runpod

Common scenarios for SaaS companies building Runpod integrations for their customers.

01

Embed GPU-powered inference in your SaaS

Let your customers connect their own Runpod account so your product can route AI inference jobs to their serverless endpoints. You ship generative features without owning GPU infrastructure or absorbing inference costs.

02

Offer BYOC GPU provisioning for AI workspaces

MLOps and developer platforms can let users provision GPU Pods on their own Runpod account in one click, with templates pre-configured for training, fine-tuning, or notebook environments.

03

Automate batch AI job orchestration

Workflow and automation platforms can queue thousands of async inference jobs against a user's Runpod serverless endpoint, stream results back, and handle retries and cancellations programmatically.

04

Surface GPU spend in FinOps dashboards

Cloud cost observability tools can pull granular Pod, endpoint, and network volume billing data from each connected Runpod account to flag idle workers and attribute inference costs to teams.

05

Manage scheduled Pod lifecycles to cut idle spend

Cost-control and scheduling tools can programmatically start, stop, and reset user Pods on a schedule — for example, spinning training Pods up overnight and stopping them by morning.
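
As a rough illustration of the scheduling scenario, the decision of whether to start or stop a Pod can be reduced to a small function. This is a sketch only: the function name and the hour-based window are hypothetical, and it simply maps a time of day to the `runpod_pods_start` and `runpod_pods_stop` tools listed later on this page.

```typescript
// Hypothetical helper, not part of Truto or Runpod: pick which Pod tool a
// scheduler would call for a given hour (0-23). The running window may wrap
// past midnight, e.g. 22 -> 6 for overnight training.
type PodTool = "runpod_pods_start" | "runpod_pods_stop";

function toolForHour(hour: number, startHour: number, stopHour: number): PodTool {
  const inWindow =
    startHour <= stopHour
      ? hour >= startHour && hour < stopHour // window within one day
      : hour >= startHour || hour < stopHour; // window wraps past midnight
  return inWindow ? "runpod_pods_start" : "runpod_pods_stop";
}

// Overnight window from 22:00 to 06:00.
console.log(toolForHour(23, 22, 6)); // runpod_pods_start
console.log(toolForHour(9, 22, 6)); // runpod_pods_stop
```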

What You Can Build

Ship these features with Truto + Runpod

Concrete product features your team can ship faster by leveraging Truto’s Runpod integration instead of building from scratch.

01

Sync and async serverless job submission

Submit inference payloads to a user's Runpod endpoint with sync or async semantics, then poll status or cancel and retry stalled jobs.

02

Token and metric streaming UI

Stream accumulated output chunks and hardware telemetry from in-flight serverless jobs to power real-time generation UIs in your product.

03

One-click GPU Pod provisioning

Create, update, start, stop, reset, and delete Pods from a user's Runpod account using saved templates with specific GPU types and container images.

04

Serverless endpoint deployment and autoscaling controls

Programmatically create and update endpoints with configurable GPU IDs, flashboot, and min/max worker bounds so users can deploy models from inside your product.

05

Template and network volume management

Create and manage Pod templates and network volumes to standardize environments and mount model weights into newly provisioned GPU workloads.

06

GPU billing and endpoint health dashboards

Pull billing records for Pods, endpoints, and network volumes plus endpoint health metrics to power cost attribution, idle-worker alerts, and usage analytics.
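
An idle-worker alert of the kind described above might look like the following sketch. The text only says that endpoint health returns `jobs` and `workers`; the nested field names used here (`inQueue`, `inProgress`, `idle`, `running`) are assumptions for illustration and should be checked against the actual response.

```typescript
// Assumed shape: only the top-level `jobs` and `workers` keys come from this
// page. The nested counters are hypothetical field names.
interface EndpointHealth {
  jobs: { inQueue: number; inProgress: number };
  workers: { idle: number; running: number };
}

// Flag an endpoint that has workers up but no queued or running jobs.
function hasIdleCapacity(health: EndpointHealth): boolean {
  const activeJobs = health.jobs.inQueue + health.jobs.inProgress;
  const activeWorkers = health.workers.idle + health.workers.running;
  return activeWorkers > 0 && activeJobs === 0;
}

const sample: EndpointHealth = {
  jobs: { inQueue: 0, inProgress: 0 },
  workers: { idle: 2, running: 0 },
};
console.log(hasIdleCapacity(sample)); // true
```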

SuperAI

Runpod AI agent tools

Comprehensive AI agent toolset with fine-grained control. Integrates with MCP clients like Cursor and Claude, or frameworks like LangChain.

list_all_runpod_openapi_jsons

Get the OpenAPI 3.0 specification for the Runpod API. Returns the complete specification as an opaque JSON document that follows the OpenAPI 3.0 standard; the source documentation enumerates no field-level schema (its example payload is an empty object).

list_all_runpod_docs

Get the interactive Runpod API documentation page. Returns an HTML page rendering the OpenAPI schema UI. This endpoint serves a text/html response rather than structured JSON data, so no enumerable fields are available.

list_all_runpod_pods

List Runpod Pods with optional filters. Returns: env, gpu, id, image, interruptible, locked, machine, name, ports. Supports filtering by computeType, desiredStatus, gpuTypeId, dataCenterId, imageName, and more.

create_a_runpod_pod

Create a new Runpod Pod and optionally deploy it. Returns: env, gpu, id, image, interruptible, locked, machine, machineId, name, ports, templateId, imageName.

get_single_runpod_pod_by_id

Get a single Runpod Pod by id. Returns: env, gpu, id, image, interruptible, locked, machine, name, ports. Required: id.

update_a_runpod_pod_by_id

Update a Runpod Pod by id, potentially triggering a reset. Returns: env, gpu, id, image, interruptible, locked, machine, machineId, name, ports, templateId, imageName. Required: id.

delete_a_runpod_pod_by_id

Delete a Runpod Pod by id. Returns an empty 204 response on success. Required: id.

runpod_pods_start

Start or resume a Runpod Pod. Returns: id, desiredStatus. Required: pod_id.

runpod_pods_stop

Stop a running Runpod Pod. Returns: id, desiredStatus. Required: pod_id.

runpod_pods_reset

Reset a Runpod Pod. Returns: id, desiredStatus. Required: pod_id.

create_a_runpod_pod_update

Update a Runpod Pod by pod_id (synonym for PATCH /pods/{podId}). Returns the updated Pod object including id, imageName, env, machineId, machine details, and desiredStatus. Required: pod_id.

create_a_runpod_pod_restart

Restart a Runpod Pod by its pod_id. Returns an empty 204 response on success. Required: pod_id.

list_all_runpod_endpoints

List all Runpod serverless endpoints. Returns: createdAt, env, id, name, template, version, workers.

create_a_runpod_endpoint

Create a new Runpod serverless endpoint. Returns: allowedCudaVersions, computeType, createdAt, dataCenterIds, env, executionTimeoutMs, gpuCount, gpuTypeIds, id, idleTimeout, instanceIds, minCudaVersion, name, networkVolumeId, networkVolumeIds, scalerType, scalerValue, template, templateId, userId, version, workers, workersMax, workersMin, gpuIds, locations, flashBootType, pods.

get_single_runpod_endpoint_by_id

Get a single Runpod serverless endpoint by id. Returns: createdAt, env, id, name, template, version, workers. Required: id.

update_a_runpod_endpoint_by_id

Update an existing Runpod serverless endpoint by id. Returns: createdAt, env, id, name, template, templateId, version, workers, workersMax, gpuIds. Required: id.

delete_a_runpod_endpoint_by_id

Delete a Runpod serverless endpoint by id. Returns an empty 204 response on success. Required: id.

create_a_runpod_endpoint_update

Update a Runpod endpoint by endpoint_id (synonym for PATCH /endpoints/{endpointId}). Returns the updated endpoint object including id, name, gpuIds, templateId, workersMax, workersMin, scalerType, scalerValue, idleTimeout, locations, networkVolumeId, and flashBootType. Required: endpoint_id.

list_all_runpod_templates

List Runpod templates. Returns: category, earned, env, id, name, ports, readme. Optionally broaden results to include endpoint-bound templates, community public templates, or official Runpod templates using the filter parameters.

create_a_runpod_template

Create a new Runpod template. Returns: category, earned, env, id, imageName, name, ports, readme. Required: imageName, name.

get_single_runpod_template_by_id

Get a single Runpod template by id. Returns: category, earned, env, id, name, ports, readme. Required: id.

update_a_runpod_template_by_id

Update a Runpod template by id. Returns: category, earned, env, id, name, ports, readme. Required: id.

delete_a_runpod_template_by_id

Delete a Runpod template by id. Returns an empty 204 response on success. Required: id.

create_a_runpod_template_update

Update a Runpod template by template_id (synonym for PATCH /templates/{templateId}). Returns the updated template object including id, imageName, name, and env. Required: template_id.

list_all_runpod_networkvolumes

List all Runpod network volumes. Returns: id, name, size, dataCenterId.

create_a_runpod_networkvolume

Create a new Runpod network volume. Returns: dataCenterId, id, name, size. Required: name.

get_single_runpod_networkvolume_by_id

Get a single Runpod network volume by id. Returns: dataCenterId, id, name, size. Required: id.

update_a_runpod_networkvolume_by_id

Update a Runpod network volume by id. Returns: dataCenterId, id, name, size. Required: id.

delete_a_runpod_networkvolume_by_id

Delete a Runpod network volume by id. Returns an empty 204 response on success. Required: id.

create_a_runpod_networkvolume_update

Update a Runpod network volume. Acts as a synonym for PATCH on the network volume resource. Returns the updated network volume including its id and name. Required: network_volume_id.

list_all_runpod_containerregistryauths

List all container registry auths in Runpod. Returns: id, name, created_at, updated_at.

create_a_runpod_containerregistryauth

Create a new container registry auth in Runpod. Returns: id, name, created_at, updated_at.

get_single_runpod_containerregistryauth_by_id

Get a single Runpod container registry auth by id. Returns: id, name, created_at, updated_at. Required: id.

delete_a_runpod_containerregistryauth_by_id

Delete a Runpod container registry auth by id. Returns an empty 204 response on success. Required: id.

list_all_runpod_billing_pods

Retrieve Runpod Pod billing history aggregated into configurable time buckets. Returns a BillingRecords collection; the record-level field schema is defined by the BillingRecords type in the Runpod REST API reference and is not enumerated in the available source documentation. Optionally filter by gpuTypeId or podId and control aggregation granularity via bucketSize and grouping.

list_all_runpod_billing_endpoints

Retrieve Runpod Serverless endpoint billing history aggregated into time buckets. Returns billing record objects grouped by the chosen field; the field-level response shape is not enumerated in the available source documentation. All query parameters are optional.

list_all_runpod_billing_networkvolumes

List Runpod network volume billing history aggregated into configurable time buckets. Returns billing records for the requested time range (response field details are not enumerated in the API docs). Optional: bucketSize (defaults to day), startTime, endTime.

create_a_runpod_pod_start

Start (resume) a stopped Runpod Pod via the podResume mutation. Returns: id, desiredStatus, imageName, data. Required: pod_id. Optionally supply gpuCount and allowedCudaVersions to control GPU allocation and CUDA version filtering when the Pod resumes.

create_a_runpod_pod_stop

Stop a Runpod Pod by pod_id, releasing the GPU while preserving volume data. Returns: id, desiredStatus, data. Required: pod_id.

create_a_runpod_pod_reset

Reset a Runpod Pod, stopping and restarting it in place. Returns the Pod's id and the desiredStatus that results from the reset. Required: pod_id.

list_all_runpod_network_volumes

List all Runpod network volumes available to your account (persistent storage units attachable to Pods and serverless endpoints). Returns: id, name, size, dataCenterId, attributes, data.

get_single_runpod_network_volume_by_id

Get a single Runpod network volume by id. Returns: dataCenterId, id, name, size, attributes, data. Required: id.

create_a_runpod_network_volume

Create a new Runpod network volume for persistent storage attachable to Pods and serverless endpoints. Returns: dataCenterId, id, name, size, attributes, data.

update_a_runpod_network_volume_by_id

Update an existing Runpod network volume by id. Returns: dataCenterId, id, name, size, attributes, data. Required: id.

delete_a_runpod_network_volume_by_id

Delete a Runpod network volume by id. Returns an empty 204 response on success. Required: id.

list_all_runpod_container_registry_auths

List all container registry auth credentials saved in Runpod for connecting to private Docker registries. Returns: id, name.

get_single_runpod_container_registry_auth_by_id

Get a single Runpod container registry auth credential by id. Returns: id, name. Required: id.

create_a_runpod_container_registry_auth

Create a new container registry auth credential in Runpod to enable pulling from a private Docker registry. Returns: id, name, data.

delete_a_runpod_container_registry_auth_by_id

Delete a Runpod container registry auth credential by id. Returns an empty 204 response on success. Required: id.

list_all_runpod_billing_network_volumes

List Runpod network volumes with associated billing information. Returns: amount, time. No required parameters.

create_a_runpod_serverless_sync_job

Submit a synchronous job to a Runpod serverless endpoint and wait for completion before returning the result. Returns: id, status. Required: endpoint_id, input. Default wait time is 90 seconds, adjustable via the wait query parameter (1000–300000 ms); results are retained for 1 minute after completion.

create_a_runpod_serverless_async_job

Submit an asynchronous job to a Runpod serverless endpoint. The job processes in the background; retrieve its result via the /status endpoint. Results are available for 30 minutes after completion. Returns: id, status. Required: endpoint_id, input.

get_single_runpod_serverless_job_status_by_id

Get the status of a Runpod serverless job by job id and endpoint id. Returns: id, status. Required: endpoint_id, id.

get_single_runpod_serverless_job_stream_by_id

Get the accumulated streaming output chunks for a Runpod serverless job by job id. Returns an array of stream chunk objects, each containing output (with text, input_tokens, output_tokens) and metrics (with stream_index, scenario, gpu_kv_cache_usage, running, pending, and throughput stats). Required: endpoint_id, id.

create_a_runpod_serverless_job_cancellation

Cancel a Runpod serverless job that is currently queued or running. Returns: id, status. Required: endpoint_id, job_id.

create_a_runpod_serverless_job_retry

Retry a failed or timed-out Runpod serverless job by requeuing it with its original input parameters and the same job ID, discarding any previous output. Returns: id, status. Required: endpoint_id, job_id. Only works for jobs whose current status is FAILED or TIMED_OUT; expired jobs (async results older than 30 minutes) cannot be retried.

create_a_runpod_serverless_queue_purge

Purge all pending jobs from the queue of a Runpod serverless endpoint. Returns: removed, status. Required: endpoint_id.

get_single_runpod_serverless_endpoint_health_by_id

Get the operational health status of a Runpod serverless endpoint. Returns: jobs, workers. Required: endpoint_id.
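
The limits quoted in the serverless job descriptions above (a `wait` range of 1000 to 300000 ms, retries only from FAILED or TIMED_OUT, and a 30-minute window for async results) can be encoded as small guard functions before a tool is called. The sketch below is illustrative: the helper names and the `finishedAtMs` timestamp, which your own application would have to record, are assumptions.

```typescript
// Limits taken from the tool descriptions on this page.
const WAIT_MIN_MS = 1000;
const WAIT_MAX_MS = 300000;
const ASYNC_RESULT_TTL_MS = 30 * 60 * 1000; // async results kept for 30 minutes
const RETRYABLE_STATUSES = ["FAILED", "TIMED_OUT"];

// Hypothetical helper: clamp a requested sync wait into the documented range.
function clampWaitMs(requestedMs: number): number {
  return Math.min(WAIT_MAX_MS, Math.max(WAIT_MIN_MS, Math.round(requestedMs)));
}

// Hypothetical helper: decide whether a retry call is worth attempting.
// `finishedAtMs` is a timestamp tracked by the caller, not an API field.
function canRetry(status: string, finishedAtMs: number, nowMs: number): boolean {
  if (RETRYABLE_STATUSES.indexOf(status) === -1) return false;
  return nowMs - finishedAtMs < ASYNC_RESULT_TTL_MS;
}

console.log(clampWaitMs(500)); // 1000
console.log(canRetry("FAILED", 0, 60 * 1000)); // true
```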

Why Truto

Why use Truto’s MCP server for Runpod

Other MCP servers give you a static tool list for one app. Truto gives you managed, multi-tenant MCP infrastructure across 550+ integrations.

01

Auto-generated, always up to date

Tools are dynamically generated from curated documentation — not hand-coded. As integrations evolve, tools stay current without manual maintenance.

02

Fine-grained access control

Scope each MCP server to read-only, write-only, specific methods, or tagged tool groups. Expose only what your AI agent needs — nothing more.

03

Multi-tenant by design

Each MCP server is scoped to a single connected account with its own credentials. The URL itself is the auth token — no shared secrets, no credential leaking across tenants.

04

Works with every MCP client

Standard JSON-RPC 2.0 protocol. Paste the URL into Claude, ChatGPT, Cursor, or any MCP-compatible agent framework — tools are discovered automatically.

05

Built-in auth, rate limits, and error handling

Tool calls execute through Truto’s proxy layer with automatic OAuth refresh, rate-limit handling, and normalized error responses. No raw API plumbing in your agent.

06

Expiring and auditable servers

Create time-limited MCP servers for contractors or automated workflows. Optional dual-auth requires both the URL and a Truto API token for high-security environments.
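
For readers wiring up their own client, an MCP tool invocation over JSON-RPC 2.0 is a `tools/call` request carrying the tool name and its arguments. The sketch below only builds the request body; the Pod id is a placeholder, and how the body is sent to the MCP server URL is left out.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// runpod_pods_stop and its pod_id parameter come from the tool list on this
// page; "pod-123" is a made-up placeholder id.
const request = buildToolCall(1, "runpod_pods_stop", { pod_id: "pod-123" });
console.log(JSON.stringify(request));
```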

How It Works

From zero to integrated

Go live with Runpod in under an hour. No boilerplate, no maintenance burden.

01

Link your customer’s Runpod account

Use Truto’s frontend SDK to connect your customer’s Runpod account. We handle all OAuth and API key flows — you don’t need to create the OAuth app.

02

We handle authentication

Don’t spend time refreshing access tokens or figuring out secure storage. We handle it and inject credentials into every API request.

03

Call our API, we call Runpod

Truto’s Proxy API is a 1-to-1 mapping of the Runpod API. You call us, we call Runpod, and pass the response back in the same cycle.

04

Unified response format

Every response follows a single format across all integrations. We translate Runpod’s pagination into unified cursor-based pagination. Data is always in the result attribute.
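
A consumer of this unified format would typically follow the cursor until it runs out. The sketch below assumes a page envelope with the `result` attribute mentioned above plus a `next_cursor` field; that field name, the `collectAll` helper, and the in-memory stand-in for the HTTP call are all hypothetical.

```typescript
// `result` is named in the text above; `next_cursor` is an assumed field name.
interface Page<T> {
  result: T[];
  next_cursor?: string | null;
}

// Follow cursors until none is returned, with a page cap as a safety net.
async function collectAll<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>,
  maxPages = 100,
): Promise<T[]> {
  const items: T[] = [];
  let cursor: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor);
    items.push(...page.result);
    if (!page.next_cursor) break;
    cursor = page.next_cursor;
  }
  return items;
}

// In-memory stand-in for a real HTTP call, so the sketch runs on its own.
const fakePages: { [cursor: string]: Page<string> } = {
  start: { result: ["pod-a", "pod-b"], next_cursor: "p2" },
  p2: { result: ["pod-c"], next_cursor: null },
};
const fakeFetch = async (cursor?: string): Promise<Page<string>> =>
  fakePages[cursor ?? "start"];

collectAll(fakeFetch).then((pods) => console.log(pods.join(", ")));
```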

FAQs

Common questions about Runpod on Truto

Authentication, rate limits, data freshness, and everything else you need to know before you integrate.

How do end users authenticate their Runpod account?

Runpod uses API key authentication. Through Truto, your end users supply a Runpod API key during the connection flow, and Truto stores and injects it on every request — your app never handles the raw credential.

Which Runpod resources can we manage through this integration?

You can manage Pods, Serverless Endpoints, Templates, Network Volumes, and Container Registry Auths, plus submit and control Serverless Jobs (sync, async, status, stream, retry, cancel, queue purge) and read billing data for Pods, Endpoints, and Network Volumes.

Can we stream tokens from a running inference job?

Yes. The serverless job stream endpoint returns accumulated output chunks and hardware metrics for an in-progress job, which you can relay to your end-user's UI as the job progresses.

Is there a unified API for Runpod on Truto?

Not currently. Runpod is exposed as a passthrough integration with first-class tools for each resource, so you get typed access to Runpod-native concepts like Pods, Endpoints, and Serverless Jobs without a normalization layer.

How do we control GPU costs on behalf of users?

You can stop or reset Pods to halt billing, configure endpoint autoscaling with workersMin/workersMax, and read endpoint billing and health data to detect idle workers or endpoints with active capacity but no queued jobs.

How does Truto handle Runpod rate limits and errors?

Truto proxies Runpod's native rate limits and surfaces error responses directly. You can implement retries on transient failures, and for serverless jobs Runpod itself exposes retry and cancellation actions you can call through Truto.
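
One common way to implement retries on transient failures is exponential backoff. The sketch below is a generic pattern, not Truto's or Runpod's documented behavior: treating HTTP 429 and 5xx as transient, the delay constants, and the `callWithRetries` helper are all assumptions to adapt to what you observe.

```typescript
// Assumption: HTTP 429 and 5xx are treated as transient.
function isTransient(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

interface ApiResponse<T> {
  status: number;
  body: T;
}

// Call once, then retry while the response looks transient, up to maxAttempts
// total calls. `sleep` is injectable so the loop can be tested without waiting.
async function callWithRetries<T>(
  call: () => Promise<ApiResponse<T>>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<ApiResponse<T>> {
  let response = await call();
  for (let attempt = 0; attempt < maxAttempts - 1 && isTransient(response.status); attempt++) {
    await sleep(backoffDelayMs(attempt));
    response = await call();
  }
  return response;
}
```

Non-transient errors are returned to the caller unchanged, so application code can decide separately whether a serverless job retry or cancellation is the better response.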


Get Runpod integrated into your app

Our team understands what it takes to make a Runpod integration successful. A short, crisp 30-minute call with folks who understand the problem.