Pay-as-you-go compute for AI models and compute-intensive workloads.
Runpod Serverless is a cloud computing platform that lets you run AI models and compute-intensive workloads without managing servers. You only pay for the actual compute time you use, with no idle costs when your application isn’t processing requests.
An endpoint is the access point for your Serverless application. It provides a URL where users or applications can send requests to run your code. Each endpoint can be configured with different compute resources, scaling settings, and other parameters to suit your specific needs.
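For example, once an endpoint is deployed you can call its synchronous `/runsync` route with an HTTP POST request. Here is a minimal sketch in Python, assuming a hypothetical endpoint ID and API key (substitute your own values):

```python
import requests

# Hypothetical values; replace with your endpoint ID and Runpod API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-api-key"

# Send a request to the endpoint's synchronous route and wait for the result.
response = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello, world!"}},
    timeout=60,
)
print(response.json())
```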
Workers are the container instances that execute your code when requests arrive at your endpoint. Runpod automatically manages the worker lifecycle, starting workers when needed and stopping them when idle to optimize resource usage.
Handler functions are the core of your Serverless application. These functions define how a worker processes incoming requests and returns results. They follow a simple pattern:
```python
import runpod  # Required

def handler(event):
    # Extract input data from the request
    input_data = event["input"]

    # Process the input (replace this with your own code)
    result = process_data(input_data)

    # Return the result
    return result

runpod.serverless.start({"handler": handler})  # Required
```
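As a concrete illustration, here is a minimal runnable version of this pattern. The `prompt` field is a hypothetical example chosen for this sketch, not something the SDK requires:

```python
import runpod

def handler(event):
    # "prompt" is a hypothetical input field used for this example.
    prompt = event["input"].get("prompt", "")

    # Example processing step: return the prompt in uppercase.
    return {"output": prompt.upper()}

runpod.serverless.start({"handler": handler})
```

If you want to exercise a handler before deploying it, the `runpod` Python SDK supports local testing by passing a `--test_input` argument when you run the file directly.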
Best for: Creating a custom worker using an existing template.

Runpod maintains a collection of worker templates on GitHub that you can use as a starting point:
- worker-basic: A minimal template with essential functionality.
- worker-template: A more comprehensive template with additional features.
- Model-specific templates: Specialized templates for common AI tasks (image generation, audio processing, etc.).