How to Automate 100+ Media Pipelines on GCP Using Python and Pulumi

Introduction
In today’s media environment, the speed and efficiency of content delivery are critical. For companies in the media industry, this is often difficult to achieve: complex manual processes require human intervention at every stage of the workflow. Virtual machines must be configured, networks spun up, APIs polled, and assets downloaded. All of this adds up to long, error-prone processes that are hard to scale. Our answer is automation: by defining media pipeline workflows as code, we remove the inefficiency and common errors and make scaling a matter of running a script.
For one of our customers, managing hundreds of media pipelines was a daunting, time-consuming task. Each deployment required manually setting up virtual machines, networking, firewalls, and storage. When the team needed to scale quickly, especially during live events or seasonal spikes, this slow, manual process became a bottleneck, causing delays and missed revenue opportunities. We proposed an automated framework to remove the manual work and free up resources to accelerate the delivery of each media pipeline.
We created a custom automation framework, built with Pulumi and Python, that fully automated their media pipeline deployment. The results were impressive: processes that had taken days were completed in about 30 minutes, and the client lowered operational costs by 80%. The solution gave them scalability and flexibility while keeping their media pipeline systems running reliably.
Client Challenges
The client faced several obstacles, from time-consuming manual deployments to scalability issues and operational inefficiencies. These challenges had become a significant bottleneck in their ability to meet the fast-moving demands of the media industry. The key challenges were:
- Time-Consuming Deployments
The client had over 100 media pipelines, and setting up each one could take up to a week. It wasn’t just infrastructure provisioning: VMs, networking, firewalls, and storage all had to be configured by hand. The time and effort this required left the team slow to react to new requirements, especially for live events or holiday traffic spikes.
- Error-Prone Configurations
Manually configuring the Playout Server, SRT Gateway, Recorder, and Transcoder often led to mistakes. Even a small typo in the API settings could break the entire pipeline. These errors frequently caused downtime, forced engineers to spend hours troubleshooting, and reduced the overall reliability of the service.
- Poor Scalability
The client’s infrastructure couldn’t scale dynamically with demand. Any scaling had to happen manually and meant over-provisioning resources for peak loads, leaving capacity unused and costing the company time and money. This lack of flexibility made it hard to use resources efficiently or respond to changing workloads.
- High Operational Overhead
Each deployment involved multiple teams and steps, with no centralized automation or version control. Every deployment was essentially a fresh start in which infrastructure, APIs, and security rules all had to be reconfigured. This drove up labor costs and made change management and rollback next to impossible.
Our Solution: A Custom Automation Framework
To address these challenges, we designed a custom automation framework that utilized Pulumi with Python, which drastically improved the deployment process, reduced errors, and enabled better scalability. Here’s how we approached the solution:
- Pulumi-Powered Infrastructure-as-Code
We implemented a custom automation framework using Pulumi with Python to automate infrastructure provisioning on GCP. This allowed us to programmatically set up virtual machines, networking, firewalls, and storage. Using infrastructure-as-code enabled consistent, repeatable deployments and made spinning up new environments as easy as running a script (see the sketch after this list).
- Automated API Configuration
To address the configuration bottlenecks, we wrote Python scripts to automate the API setup for all key components, including the Playout Server, SRT Gateway, Recorder, and Transcoder. By defining configuration templates and integrating with the respective APIs, we ensured accurate and consistent deployments, eliminating human error.
- Scalable & Modular Architecture
We designed the automation framework to be modular and adaptable. It allowed the client to easily scale resources and deploy new environments by adjusting variables, making it well-suited for dynamic media workloads.
- Monitoring & Self-Healing
We embedded monitoring and self-healing mechanisms within the infrastructure code. The system could detect if a service or VM failed and automatically restart or recreate it without manual intervention. This significantly improved the reliability and uptime of the client’s media pipelines.
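To make the infrastructure-as-code approach concrete, here is a minimal Pulumi Python sketch of how one pipeline component VM and its firewall rule could be provisioned with the pulumi_gcp provider. The resource names, CIDR range, machine type, and UDP port are illustrative assumptions, not the client’s actual values.

```python
import pulumi
import pulumi_gcp as gcp

# Illustrative network for one media pipeline (names and CIDR are examples).
network = gcp.compute.Network("media-vpc", auto_create_subnetworks=False)
subnet = gcp.compute.Subnetwork(
    "media-subnet",
    network=network.id,
    ip_cidr_range="10.10.0.0/24",
    region="us-central1",
)

# Allow inbound SRT traffic (UDP) on an example port, scoped by network tag.
firewall = gcp.compute.Firewall(
    "allow-srt",
    network=network.id,
    allows=[gcp.compute.FirewallAllowArgs(protocol="udp", ports=["9000"])],
    source_ranges=["0.0.0.0/0"],
    target_tags=["srt-gateway"],
)

# One component VM, e.g. the main SRT gateway node.
instance = gcp.compute.Instance(
    "srt-gateway-main",
    machine_type="e2-standard-4",
    zone="us-central1-a",
    tags=["srt-gateway"],
    boot_disk=gcp.compute.InstanceBootDiskArgs(
        initialize_params=gcp.compute.InstanceBootDiskInitializeParamsArgs(
            image="debian-cloud/debian-12",
        ),
    ),
    network_interfaces=[gcp.compute.InstanceNetworkInterfaceArgs(
        subnetwork=subnet.id,
        access_configs=[gcp.compute.InstanceNetworkInterfaceAccessConfigArgs()],
    )],
)

pulumi.export("srt_gateway_ip",
              instance.network_interfaces[0].access_configs[0].nat_ip)
```

A single `pulumi up` against a program like this creates the whole set of resources in one step, which is what makes spinning up a new environment as easy as running a script.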
Solution Implementation
To automate the deployment of media pipeline infrastructure and services on Google Cloud Platform (GCP), we authored an orchestration system in Python that uses Pulumi to provision and configure infrastructure. It can dynamically assemble a pipeline from the components srt-gateway, playout-pro, esam, recorder, and transcoder, with the specific pipeline attributes defined through user configuration files.
Core Components
The implementation is driven by three main configuration files:
- input.json – Defines the topology of the media pipeline (a sample fragment follows this list):
- Specifies which components (nodes) are required (e.g., srt-gateway, transcoder, etc.)
- Describes node relationships with source-link and destination-link
- Indicates the role (main or backup) and action (create or import)
- Provides instance-specific information like name, ID, and zone for imported resources
- infra_config.yaml – Controls infrastructure parameters:
- Defines VPC and subnet settings, with flags to either reuse or create new ones
- Lists instance types, images, zones, machine specs, and firewall rules for each component
- Supports dynamic setup of boot disks, OS types, and custom user data scripts for provisioning
- config.yaml – Configures the applications deployed on each instance:
- Defines credentials, ports, and route definitions for srt-gateway
- Contains application-specific configs for playout-pro, recorder, transcoder, and esam
- Automates service bootstrapping (e.g., starting SRT routes, configuring recorders, enabling transcoding modes, etc.)
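To illustrate the shape of these files, here is a hypothetical input.json fragment. The keys mirror the description above (component, role, action, source-link, destination-link), but the exact schema in production may differ.

```json
{
  "nodes": [
    {
      "component": "srt-gateway",
      "role": "main",
      "action": "create",
      "destination-link": ["transcoder-main"]
    },
    {
      "component": "transcoder",
      "name": "transcoder-main",
      "role": "main",
      "action": "import",
      "id": "projects/example-project/zones/us-central1-a/instances/transcoder-1",
      "zone": "us-central1-a",
      "source-link": ["srt-gateway-main"]
    }
  ]
}
```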
Workflow
The automation process is centered around a command-line tool with a user-driven interface, offering three main operations: create, destroy, and update-config.
1. Input Parsing
- The script reads the input.json file to identify which components are required and whether they should be created from scratch or imported.
- It constructs a node topology by mapping inter-node communication using source and destination links.
- Each node’s role (main or backup) is recognized to ensure high availability planning.
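A simplified sketch of what this parsing step can look like in Python. The Node dataclass and field names follow the input.json fragment shown earlier and are illustrative, not the production code.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Node:
    component: str                 # e.g. "srt-gateway", "transcoder"
    role: str                      # "main" or "backup"
    action: str                    # "create" or "import"
    sources: list = field(default_factory=list)
    destinations: list = field(default_factory=list)

def build_topology(path: str) -> dict[str, Node]:
    """Read input.json and map inter-node links into a topology."""
    with open(path) as f:
        spec = json.load(f)
    nodes: dict[str, Node] = {}
    for entry in spec["nodes"]:
        # Imported resources carry an explicit name; created ones get a default.
        name = entry.get("name", f"{entry['component']}-{entry['role']}")
        nodes[name] = Node(
            component=entry["component"],
            role=entry["role"],
            action=entry["action"],
            sources=entry.get("source-link", []),
            destinations=entry.get("destination-link", []),
        )
    return nodes
```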
2. Infrastructure Provisioning via Pulumi
- Based on definitions in infra_config.yaml, Pulumi dynamically provisions infrastructure:
- Creates or reuses VPCs and subnets as configured.
- Launches compute instances across different zones and machine types.
- Configures boot disks, startup scripts, and GCP project context.
- Applies firewall rules for allowed ports and protocols (e.g., UDP for SRT streaming).
- Pulumi stacks are initialized per instance, with environment-specific settings injected during provisioning.
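Since a stack is initialized per instance, Pulumi’s Automation API is a natural fit for this step. A hedged sketch, with the stack naming and config values as assumptions:

```python
import pulumi.automation as auto

def deploy_stack(stack_name: str, project: str, program) -> auto.UpResult:
    """Create or select a Pulumi stack for one instance and bring it up."""
    stack = auto.create_or_select_stack(
        stack_name=stack_name,
        project_name=project,
        program=program,  # a callable that declares the GCP resources
    )
    # Inject environment-specific settings before provisioning.
    stack.set_config("gcp:project", auto.ConfigValue(value="example-project"))
    stack.set_config("gcp:zone", auto.ConfigValue(value="us-central1-a"))
    return stack.up(on_output=print)
```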
3. Instance Bootstrapping & Configuration
- After VM provisioning, custom bootstrapping scripts are invoked on each instance:
- These scripts automatically configure application services using values defined in config.yaml.
- Each node sets up its SRT routes, protocol configurations, security credentials, and startup modes.
- Examples of automated setups:
- SRT-Gateway: Sets listener/caller modes, ports, peer connections
- Recorder: Configures auto-recording profiles and schedules
- Transcoder: Applies codec settings and integrates security/authentication
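As an example of this bootstrapping, a hypothetical helper that reads config.yaml and pushes SRT routes to a gateway over REST. The /api/routes endpoint and payload fields are placeholders, not the actual vendor API.

```python
import requests
import yaml

def configure_srt_gateway(host: str, config_path: str) -> None:
    """Push SRT route settings from config.yaml to a gateway instance.

    The /api/routes endpoint and payload shape are illustrative placeholders;
    a real deployment would follow the vendor's REST API reference.
    """
    with open(config_path) as f:
        cfg = yaml.safe_load(f)["srt-gateway"]

    session = requests.Session()
    session.auth = (cfg["username"], cfg["password"])

    for route in cfg["routes"]:
        resp = session.post(
            f"https://{host}/api/routes",
            json={
                "name": route["name"],
                "mode": route["mode"],          # "listener" or "caller"
                "port": route["port"],
                "destination": route.get("destination"),
            },
            timeout=30,
        )
        resp.raise_for_status()
```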
4. Dynamic Configuration File Generation
- A unified configuration file named apis_config.yaml is automatically generated per stack.
- This file merges values from all three config sources to provide a centralized runtime view:
- Node definitions with roles, IPs, and zones.
- Application-level settings such as ports, usernames, SRT modes, etc.
- Inter-node routes with complete protocol and destination information.
- It acts as the foundation for operational orchestration of services across nodes.
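A minimal sketch of the merge step. The apis_config.yaml layout shown is an assumption based on the description above.

```python
import json
import yaml

def generate_apis_config(stack_dir: str) -> None:
    """Merge the three config sources into a single runtime view."""
    with open(f"{stack_dir}/input.json") as f:
        topology = json.load(f)
    with open(f"{stack_dir}/infra_config.yaml") as f:
        infra = yaml.safe_load(f)
    with open(f"{stack_dir}/config.yaml") as f:
        apps = yaml.safe_load(f)

    merged = {
        "nodes": topology["nodes"],   # roles, links, zones
        "infrastructure": infra,      # machine specs, firewall rules
        "applications": apps,         # ports, credentials, SRT modes
    }
    with open(f"{stack_dir}/apis_config.yaml", "w") as f:
        yaml.safe_dump(merged, f, sort_keys=False)
```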
5. Multi-Stack Event Handling
- The system supports provisioning multiple stacks under a single event namespace:
- Each event is a folder containing N stacks (e.g., event-x-stack-1, event-x-stack-2, etc.).
- Pulumi is initialized and managed separately for each stack.
- Configurations are automatically cloned and customized per stack.
- This enables high scalability and flexible grouping of services by customer, region, or use case.
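A sketch of the per-event loop, assuming the deploy_stack helper from the provisioning step, an events/<event>/<stack> folder layout, and a template directory of base configs (all illustrative):

```python
import shutil
from pathlib import Path

def make_program(stack_dir: Path):
    """Hypothetical factory: builds a Pulumi program from this stack's configs."""
    def program():
        ...  # declare resources from stack_dir's configs (see earlier sketch)
    return program

def create_event(event: str, prefix: str, count: int,
                 template_dir: str = "template") -> None:
    """Provision N stacks under one event namespace, e.g. event-x-stack-1..N."""
    for i in range(1, count + 1):
        stack_name = f"{event}-{prefix}-{i}"
        stack_dir = Path("events") / event / stack_name
        # Clone the base configs, then customize per stack.
        shutil.copytree(template_dir, stack_dir, dirs_exist_ok=True)
        deploy_stack(stack_name, project=event, program=make_program(stack_dir))
```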
Automation Flows
1. create – Bootstrap a New Event
- Prompts user to enter:
- Event name
- Stack prefix
- Number of stacks
- Creates corresponding folders and initializes Pulumi environments.
- Generates runtime configuration (apis_config.yaml).
- Boots and configures all infrastructure and applications end-to-end.
2. destroy – Clean Up Infrastructure
- Prompts user for the event name.
- Iterates through each associated stack.
- Executes pulumi destroy to delete cloud resources.
- Cleans up event folders and configuration artifacts.
3. update-config – Refresh API Configuration
- Allows the user to select an existing event and specific stacks.
- For each selected stack:
- Regenerates apis_config.yaml using updated source config files.
- Optionally applies changes via automated scripts or manual upload.
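The entry point can be a small argparse dispatcher like the sketch below. The real tool prompts interactively; flags are used here for brevity, and the handler functions are the helpers sketched earlier (or hypothetical stand-ins).

```python
import argparse

def main() -> None:
    parser = argparse.ArgumentParser(description="Media pipeline orchestrator")
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create", help="bootstrap a new event")
    create.add_argument("--event", required=True)
    create.add_argument("--prefix", default="stack")
    create.add_argument("--count", type=int, default=1)

    destroy = sub.add_parser("destroy", help="tear down an event's stacks")
    destroy.add_argument("--event", required=True)

    update = sub.add_parser("update-config", help="regenerate apis_config.yaml")
    update.add_argument("--event", required=True)
    update.add_argument("--stacks", nargs="*", help="subset of stacks to refresh")

    args = parser.parse_args()
    if args.command == "create":
        create_event(args.event, args.prefix, args.count)   # see earlier sketch
    elif args.command == "destroy":
        destroy_event(args.event)                           # hypothetical helper
    else:
        update_config(args.event, args.stacks)              # hypothetical helper

if __name__ == "__main__":
    main()
```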
Highlights
- Fully Configurable: Structured config files define every deployment parameter, from the VPC to the application-level SRT route.
- Reusable Architecture: Teams can import existing component instances with minimal effort, making the architecture easily reusable across multiple events.
- Dynamic Topology Building: The system intelligently builds source-destination links to simulate media flow pipelines.
- Modular Roles: All logic accounts for role-specific behaviors, ensuring separate configurations for main and backup instances.
- Platform-Agnostic Logic: Though implemented for GCP, all logic is abstracted enough to be portable to other cloud environments.
- Secure & Isolated: Instance-level firewall rules limit traffic exposure, with strict port configurations per service.
- Runtime Update Capability: New configurations can be pushed to live environments without redeploying the infrastructure.
Results & Business Impact

- Deployment time decreased from days to 30 minutes.
- 99% accuracy in infrastructure and API configurations.
- 80% cost savings through resource optimization.
- 70% reduction in operational overhead.
- Greater scalability and reliability with minimal manual intervention.
Conclusion
This implementation demonstrates a clear shift from cluttered, manual setups to standardized, automated infrastructure. By expressing the configuration logic in Python with Pulumi, we deployed faster, gained reliability, and cut costs substantially. The outcome is a scalable, reusable, and cloud-agnostic approach that shortens time to market while reducing operational complexity.
References
- Pulumi Official Documentation.
- Pulumi GCP Provider Documentation.
- Haivision Media Gateway REST API Reference (v4.0.1).
- Secure Reliable Transport (SRT) Protocol – Wikipedia.