import subprocess
import time
subprocess: This module provides a higher-level interface for working with additional processes. It allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. It's a powerful tool for running shell commands directly from Python scripts.
time: This module provides time-related functions. In this script, it's primarily used for the sleep function, which introduces a delay.

wait_for_postgres Function

def wait_for_postgres(host, max_retries=5, delay_seconds=5):
    ...
This function's purpose is to repeatedly check if a PostgreSQL instance is ready to accept connections.
host: The hostname or IP address of the PostgreSQL server.
max_retries: The maximum number of times the function will attempt to reach the database before it gives up. This prevents an infinite loop if the database never becomes available.
delay_seconds: After each failed attempt, the function waits this many seconds before trying again.

Inside the function:
pg_isready: A command-line utility that ships with PostgreSQL. It checks the connection status of a PostgreSQL server, and the function uses it to determine whether the database is ready.
subprocess.run(): This function is used to execute the pg_isready command. If the command exits with a non-zero status (i.e., the database isn't ready) and subprocess.run() was called with check=True, a CalledProcessError exception is raised, which the function catches and handles by waiting and retrying.

if not wait_for_postgres(host="source_postgres"):
    exit(1)
Before the main ELT process starts, the script ensures that the source PostgreSQL database is operational. If the database isn't ready after the specified number of retries, the script terminates with an error code (exit(1)).
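
Because the function body is elided above, here is a minimal sketch of how the retry loop described in this section might look. It is not the script's actual implementation. It assumes subprocess.run() is called with check=True and that pg_isready is given the host via its -h option. The command parameter is hypothetical, added only so the loop can be exercised without a PostgreSQL installation. This version also skips the delay after the final failed attempt, since no further retry follows.

```python
import subprocess
import sys
import time


def wait_for_postgres(host, max_retries=5, delay_seconds=5, command=None):
    """Return True once the readiness command succeeds, False after max_retries failures."""
    # `command` is a hypothetical hook for this sketch; by default the
    # function runs pg_isready against the given host.
    if command is None:
        command = ["pg_isready", "-h", host]

    for attempt in range(1, max_retries + 1):
        try:
            # check=True turns a non-zero exit status into CalledProcessError.
            subprocess.run(command, check=True, capture_output=True, text=True)
            return True
        except subprocess.CalledProcessError as error:
            print(f"Attempt {attempt}/{max_retries} failed (exit status {error.returncode})")
            if attempt < max_retries:
                # Wait before the next attempt.
                time.sleep(delay_seconds)
    return False


# Demo with stand-in commands: the current Python interpreter exiting with a
# chosen status plays the role of pg_isready, so no database is needed.
ready = wait_for_postgres(
    "unused", command=[sys.executable, "-c", "raise SystemExit(0)"]
)
not_ready = wait_for_postgres(
    "unused",
    max_retries=2,
    delay_seconds=0,
    command=[sys.executable, "-c", "raise SystemExit(1)"],
)
```

In the script itself the call would stay as shown earlier, wait_for_postgres(host="source_postgres"), with no command argument. Note that this sketch only catches CalledProcessError; if pg_isready is not installed, subprocess.run() raises FileNotFoundError instead, which would need separate handling.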