Today, database management often presents a critical bottleneck. As PostgreSQL, a powerful open-source relational database, grows in popularity, containerization tools like Docker allow for simpler database deployment and management.
Developers leverage Docker as a containerization platform to run diverse applications and processes, eliminating the need to install them in various environments. Developers can perform Data Processing Operations using any Database Management System (DBMS) like PostgreSQL by pulling in their respective Docker Image files from the Docker Hub.
In this blog, you will learn how to set up a Docker PostgreSQL environment. We’ll show how you can install, configure, and run Postgres in Docker in a few simple steps. Read on to get started.
What is Docker?
Docker is a software platform that allows developers to build, deploy, and run applications inside standardized units called containers. Containers package an application and all its dependencies together in a single isolated environment, making it easier to move and run applications consistently across different computing environments (e.g., development, testing, production).
What is PostgreSQL?
PostgreSQL is an open-source, object-relational database management system (ORDBMS) that is widely used for a variety of applications, ranging from small-scale projects to large web applications, data warehousing, and business intelligence systems.
Why should you containerize Postgres?
Install PostgreSQL Docker containers to leverage several advantages:
- Rapid Deployment: Containers allow you to quickly deploy PostgreSQL databases alongside your main application, eliminating the need for complex local installations and configurations. This streamlined process saves time and simplifies the setup procedure.
- Data Isolation: Containerization separates the PostgreSQL database from your application, so an application failure does not take the database down with it. When the data directory is stored on a volume, you can also launch a new container instance while preserving your data.
- Portability: Containers are designed to run consistently across different environments, making it effortless to move your PostgreSQL database from development to production or across various platforms without compatibility concerns.
- Simplified Configuration: Instead of manually configuring PostgreSQL and managing background processes, containers encapsulate the necessary settings and dependencies within a self-contained unit, reducing the need for in-depth technical knowledge.
- Scalability and Flexibility: Containers are lightweight and can be easily scaled up or down based on your application’s changing requirements, providing flexibility in resource allocation and facilitating efficient use of system resources.
- Consistent Development Environment: By using containers, developers can work with a consistent PostgreSQL environment across different machines, minimizing compatibility issues and ensuring a smoother development process.
How to Install Postgres Docker Container?
The sections below give a step-by-step guide on how to install Docker, download the PostgreSQL image, and run PostgreSQL in a Docker Container.
Step 1: Download and Install Docker
Before starting the application setup process for Docker PostgreSQL Environment, you are required to download and install Docker on your local machine.
- To download the application, visit the official website of Docker. On the welcome page of the Docker website, click on the “Get-Started” option. You will be redirected to the download page, where you can download the Docker version according to your operating system specifications.
- Select the preferred option to start the downloading process. After the Docker setup file is downloaded, install it on your local machine by following the installation steps.
- Next, you can sign in to Docker Hub, where you can access the Docker Image files. These images will be used to run external applications like PostgreSQL.
Step 2: Download the Docker PostgreSQL Image
There are two ways to download the PostgreSQL Docker Image that you need to set up PostgreSQL on Docker. One is to look it up on the Docker Hub website. The other is to pull the Docker PostgreSQL Image from a Command Line Interface (CLI) tool like Command Prompt or PowerShell.
Method 1: Download Docker Image from Website
- To download the Docker PostgreSQL Image, visit Docker Hub using your previously created user account.
- On the welcome page, you can find a search bar at the top. Type “PostgreSQL” to find the Docker Image of the respective application. You will see various Docker Images related to the PostgreSQL database; the official image is named postgres.
- Click the appropriate image to open its page, where you can find the command used to pull the Docker Image. Copy the command and run it in your Command Prompt to download the PostgreSQL image.
Method 2: Pull Docker Image Using CLI
Another way to pull Docker PostgreSQL Image is by accessing it using the Command Prompt instead of reaching its website. To do so, follow these steps:
- Open a new command window, and run the command given below.
docker pull postgres
- To obtain the list of existing Docker Images, run the following command.
docker images
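The plain postgres name resolves to the latest tag, which can change from one pull to the next. A minimal sketch of pinning a major version instead is shown below; the tag 16 and the function name pull_pinned_image are assumptions made for illustration, not part of the original steps.

```shell
#!/bin/sh
# Assumed major version; choose the tag you actually need from Docker Hub.
PG_TAG="16"
PG_IMAGE="postgres:${PG_TAG}"

# Pull the pinned image, then list only the local images of the postgres repository.
pull_pinned_image() {
  docker pull "$PG_IMAGE" && docker images postgres
}
```

Call pull_pinned_image once Docker is running. If you pin a tag here, use the same tag in the docker run commands that follow.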
Step 3: Run the PostgreSQL Container
- In the next step, you can enter the command you copied from the Docker Hub in the Command Prompt.
docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres
- The command above is a minimal example. Customize it with the parameters you need for setting up PostgreSQL on Docker, for example:
docker run --name postgresql -e POSTGRES_USER=myusername -e POSTGRES_PASSWORD=mypassword -p 5432:5432 -v /data:/var/lib/postgresql/data -d postgres
In the command given above,
- --name postgresql sets the name of the Docker Container to postgresql.
- -e POSTGRES_USER is the parameter that sets the superuser name for the Postgres database.
- -e POSTGRES_PASSWORD is the parameter that sets the password of that superuser.
- -p 5432:5432 is the parameter that maps the Host Port to the Docker Container Port. In this case, both ports are given as 5432, so requests sent to port 5432 on the host are forwarded to port 5432 in the container, which is the port where PostgreSQL accepts client connections by default.
- -v /data:/var/lib/postgresql/data is the parameter that mounts the local /data folder at the data directory inside the container. This ensures that Postgres data remains in /data on the host even if the Docker Container is removed.
- -d is the parameter that runs the Docker Container in the detached mode, i.e., in the background. If you accidentally close or terminate the Command Prompt, the Docker Container will still run in the background.
- postgres is the name of the Docker image that was previously downloaded to run the Docker Container.
Step 4: Verify the Container is Running
Now, execute docker ps -a to check the status of the newly created PostgreSQL container.
On executing the command, the output lists the postgresql container. A STATUS value that begins with Up shows that the Docker Container is running, and the PORTS column shows that port 5432 is published.
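A running container does not always mean the server is ready for connections yet. The sketch below polls pg_isready, a client tool that ships with PostgreSQL, inside the container until the server responds. The function name wait_for_postgres and the default of 30 attempts are assumptions; the container name and user come from the run command above.

```shell
#!/bin/sh
# Poll pg_isready inside the container until PostgreSQL accepts connections.
# Arguments: container name, database user, max attempts (default 30, one per second).
wait_for_postgres() {
  container="$1"
  user="$2"
  tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if docker exec "$container" pg_isready -U "$user" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_postgres postgresql myusername && docker exec -it postgresql psql -U myusername opens an interactive psql session as soon as the server is ready.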
Step 5: Starting and Stopping Docker Container
You can start and stop the newly created Docker Container by running the following commands.
- For starting the Docker Container (postgresql is the container name set with --name in Step 3):
docker start postgresql
- For stopping the Docker Container:
docker stop postgresql
There you have it. You have now successfully created a Docker Container running the PostgreSQL Environment.
How to Extend the Image
There are many ways to extend the Postgres image. Let’s look at an overview of a few of them:
Environment Variables
There are many environment variables that you can use, but the most important is POSTGRES_PASSWORD. POSTGRES_PASSWORD is required to use the Postgres image. It sets the superuser password and must not be empty or undefined. The POSTGRES_USER environment variable defines the default superuser.
The PostgreSQL image sets up trust authentication locally, so a password is not required when connecting from localhost, i.e. from inside the same container. However, a password is required when you connect from a different host or container.
The other optional environment variables are:
- POSTGRES_USER: It is used along with POSTGRES_PASSWORD to set the superuser name and password. If it is not set, the default user postgres is used.
- POSTGRES_DB: It defines a different name for the default database that is created when the image is first started. If it is not set, the value of POSTGRES_USER is used.
- POSTGRES_INITDB_ARGS: It is used for sending arguments to postgres initdb.
- POSTGRES_INITDB_WALDIR: It defines another location for the Postgres transaction log. By default, the transaction log is stored in a subdirectory of the main Postgres data folder (PGDATA).
- POSTGRES_HOST_AUTH_METHOD: It controls the auth method for host connections for all databases, all users, and all addresses.
- PGDATA: It defines another location for the database files, such as a subdirectory. It is /var/lib/postgresql/data by default.
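As a rough illustration of how these variables fit together, the sketch below writes them to an env file and passes it to docker run with --env-file. The file name pg.env, the database name appdb, and the function name run_with_env_file are hypothetical; the user and password reuse the values from the run command earlier in this guide.

```shell
#!/bin/sh
# Hypothetical env file location; override ENV_FILE to write it elsewhere.
ENV_FILE="${ENV_FILE:-$PWD/pg.env}"

# One KEY=VALUE pair per line, without quotes around the values.
cat > "$ENV_FILE" <<'EOF'
POSTGRES_USER=myusername
POSTGRES_PASSWORD=mypassword
POSTGRES_DB=appdb
EOF
chmod 600 "$ENV_FILE"  # the file holds a password, so restrict access to it

# Start a container that reads its settings from the env file.
run_with_env_file() {
  docker run --name postgresql --env-file "$ENV_FILE" -p 5432:5432 -d postgres
}
```

Keeping the values in a file keeps the password out of your shell history. These variables only take effect when the container initializes an empty data directory.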
Docker Secrets
You can append _FILE to some of the environment variables to load their values from files in the container. This is an alternative to passing sensitive information directly in environment variables; in particular, passwords can be loaded from Docker Secrets stored in /run/secrets/<secret_name> files. POSTGRES_PASSWORD, POSTGRES_INITDB_ARGS, POSTGRES_DB, and POSTGRES_USER are the only variables that support _FILE.
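The sketch below shows the _FILE mechanism with a plain bind-mounted file standing in for a real Docker Secret. The secrets directory, the file name postgres-passwd, and the function name run_with_password_file are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical directory for the password file; override SECRET_DIR if needed.
SECRET_DIR="${SECRET_DIR:-$PWD/secrets}"
mkdir -p "$SECRET_DIR"

# Store the password in a file instead of passing it on the command line.
printf '%s' 'mysecretpassword' > "$SECRET_DIR/postgres-passwd"
chmod 600 "$SECRET_DIR/postgres-passwd"

# Mount the file read-only and point POSTGRES_PASSWORD_FILE at its path in the container.
run_with_password_file() {
  docker run --name some-postgres \
    -v "$SECRET_DIR/postgres-passwd:/run/secrets/postgres-passwd:ro" \
    -e POSTGRES_PASSWORD_FILE=/run/secrets/postgres-passwd \
    -d postgres
}
```

With Docker Swarm or Compose secrets, the file under /run/secrets is provided for you and only the POSTGRES_PASSWORD_FILE setting is needed.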
Initialization Scripts
You can do additional initialization in an image derived from this one by adding one or more *.sql, *.sql.gz, or *.sh scripts under /docker-entrypoint-initdb.d (create the directory if necessary). The entry point first calls initdb to create the default Postgres user and database. It then runs any *.sql files, runs any executable *.sh scripts, and sources any non-executable *.sh scripts found in that directory before starting the service. These scripts run only when the container starts with an empty data directory; they are skipped if a database already exists.
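A minimal sketch of an initialization script follows. Instead of building a derived image, it mounts a local directory at /docker-entrypoint-initdb.d. The directory name initdb, the file name 01-create-table.sql, the visits table, and the function name run_with_init_scripts are all hypothetical.

```shell
#!/bin/sh
# Hypothetical directory that holds the initialization scripts.
INIT_DIR="${INIT_DIR:-$PWD/initdb}"
mkdir -p "$INIT_DIR"

# A SQL script that the entry point runs on first start with an empty data directory.
cat > "$INIT_DIR/01-create-table.sql" <<'EOF'
CREATE TABLE IF NOT EXISTS visits (
    id SERIAL PRIMARY KEY,
    visited_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
EOF

# Mount the directory read-only at the location the entry point scans.
run_with_init_scripts() {
  docker run --name some-postgres \
    -e POSTGRES_PASSWORD=mysecretpassword \
    -v "$INIT_DIR:/docker-entrypoint-initdb.d:ro" \
    -d postgres
}
```

Numeric prefixes such as 01- are a common way to make the intended order of several scripts explicit.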
Database Configuration
There are several options for setting PostgreSQL server configuration. Two of these are:
- You can use a custom config file and mount it into the container. The sample provided by PostgreSQL, available in the container at /usr/share/postgresql/postgresql.conf.sample, can be used as the starting point for your config file.
- You can set options directly on the run line. The entry point is written so that any options passed to the docker command after the image name are passed along to the Postgres server daemon. You can use -c to set any option available in a .conf file.
docker run -d --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword postgres -c shared_buffers=256MB -c max_connections=200
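The custom config file option can be sketched as follows: copy the sample file out of the image, edit it, then mount it and point the server at it with -c config_file. The local file name my-postgres.conf, the in-container path /etc/postgresql/postgresql.conf, and both function names are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical local path for the edited configuration file.
CONF_FILE="${CONF_FILE:-$PWD/my-postgres.conf}"

# Copy the sample configuration out of the image as a starting point.
export_sample_config() {
  docker run -i --rm postgres cat /usr/share/postgresql/postgresql.conf.sample > "$CONF_FILE"
}

# Mount the edited file and tell the server to load it instead of the default.
run_with_config_file() {
  docker run -d --name some-postgres \
    -v "$CONF_FILE:/etc/postgresql/postgresql.conf" \
    -e POSTGRES_PASSWORD=mysecretpassword \
    postgres -c 'config_file=/etc/postgresql/postgresql.conf'
}
```

Run export_sample_config, edit the file, then call run_with_config_file. Before starting the container, check that listen_addresses in your file allows connections from outside the container.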
Locale Customization
You can set a different locale by extending a Debian-based image with a simple Dockerfile. For example, to set the default locale to de_DE.utf8:
FROM postgres:14.3
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG=de_DE.utf8
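Assuming the Dockerfile above is saved in the current directory, the image can be built and checked roughly as sketched here. The image tag postgres-de and both function names are hypothetical, and the check assumes the container runs with the default postgres superuser.

```shell
#!/bin/sh
# Hypothetical tag for the locale-customized image.
IMAGE_NAME="postgres-de"

# Build the image from the directory that contains the Dockerfile (default: current directory).
build_locale_image() {
  docker build -t "$IMAGE_NAME" "${1:-.}"
}

# List the databases of a running container; the output includes Collate and Ctype columns.
list_database_locales() {
  docker exec "$1" psql -U postgres -l
}
```

After build_locale_image, start a container from postgres-de in the same way as from the stock image, then pass its name to list_database_locales to confirm the locale.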
Additional Extensions
For the default (Debian-based) variants, installing additional extensions (such as PostGIS) is usually as simple as installing the relevant packages. For Alpine variants, any Postgres extension not listed in postgres-contrib has to be compiled in your own image.
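For a Debian-based variant, a derived image with PostGIS might look like the sketch below, which writes a Dockerfile and defines a build helper. The base tag 16, the package name postgresql-${PG_MAJOR}-postgis-3, the image tag postgres-postgis, and the directory name are assumptions; check the package name against the apt repository for your PostgreSQL version.

```shell
#!/bin/sh
# Hypothetical build directory for the derived image.
EXT_DIR="${EXT_DIR:-$PWD/postgres-postgis}"
mkdir -p "$EXT_DIR"

# PG_MAJOR is set by the base image; the package name is an assumption to verify.
cat > "$EXT_DIR/Dockerfile" <<'EOF'
FROM postgres:16
RUN apt-get update \
    && apt-get install -y --no-install-recommends "postgresql-${PG_MAJOR}-postgis-3" \
    && rm -rf /var/lib/apt/lists/*
EOF

# Build the derived image from the generated Dockerfile.
build_extension_image() {
  docker build -t postgres-postgis "$EXT_DIR"
}
```

Installing the package only makes the extension available. It still has to be enabled in each database with CREATE EXTENSION postgis;.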
Use Cases of Docker PostgreSQL
- Better Development and Testing: As a developer working on several machines, it becomes tedious to configure the database each time you switch machines. PostgreSQL in Docker enables the spinning up of PostgreSQL containers quickly. This saves time.
- Continuous Integration Pipeline: When you have implemented a continuous integration (CI) pipeline, running PostgreSQL in a container alongside your application containers gives each automated test run a clean, consistently configured database.
Limitations of Running PostgreSQL with Docker
There are certain limitations of using docker with PostgreSQL, such as:
- Docker containers suit stateless applications best. A database is stateful, so its data must be stored on a volume or a mounted host folder; otherwise the data is lost when the Docker Container is removed.
- It may appear to be a complex tool for a new user.
Conclusion
In conclusion, containerizing PostgreSQL with Docker offers a flexible and efficient way to manage databases, especially for development and testing. While it provides portability, scalability, and ease of setup, understanding its limitations ensures better use in production environments. By following this guide, you can confidently set up, extend, and optimize your Docker PostgreSQL environment for various use cases.
To get an understanding of how Docker is set up with other databases, read about How to Deploy & Connect an SQL Server Docker Container, How to Run & Deploy MongoDB Docker Container?, and Setting up MariaDB Docker Deployment: 3 Easy Steps.
FAQ
How to build an ETL using Python, Docker, PostgreSQL, and Airflow?
- Set Up Docker: Create Docker containers for PostgreSQL and Airflow to streamline the environment.
- Extract: Use Python scripts to connect to data sources (APIs, files, etc.) and fetch data.
- Transform: Process the extracted data using Python libraries like pandas or custom scripts.
- Load: Store the transformed data into PostgreSQL via Python or Airflow tasks.
- Orchestrate: Use Airflow to schedule and monitor the ETL pipeline, managing task dependencies.
Should I run PostgreSQL in Docker?
Yes, running PostgreSQL in Docker is a great choice for isolated environments, consistent configurations, and easy portability. It simplifies setup and makes scaling or testing on multiple environments seamless.
How to persist PostgreSQL data in docker?
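To persist PostgreSQL data, use Docker volumes. Define a named volume in your docker-compose.yml or in your docker run command and mount it at the data directory. In docker-compose.yml, the entry under the database service looks like this:
volumes:
  - pg_data:/var/lib/postgresql/data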
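For context, a complete compose file needs the volume both mounted under the service and declared at the top level. The sketch below writes such a file; the service name db, the directory pg-compose, the image tag 16, and the function name start_stack are assumptions, and the mount path follows the data directory stated earlier in this guide.

```shell
#!/bin/sh
# Hypothetical project directory for the compose file.
COMPOSE_DIR="${COMPOSE_DIR:-$PWD/pg-compose}"
mkdir -p "$COMPOSE_DIR"

# The named volume pg_data is mounted in the service and declared at the top level.
cat > "$COMPOSE_DIR/docker-compose.yml" <<'EOF'
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: mypassword
    ports:
      - "5432:5432"
    volumes:
      - pg_data:/var/lib/postgresql/data

volumes:
  pg_data:
EOF

# Start the stack in the background using the generated file.
start_stack() {
  docker compose -f "$COMPOSE_DIR/docker-compose.yml" up -d
}
```

Because pg_data is a named volume, removing and recreating the db container keeps the data. The volume itself is deleted only when you remove it explicitly.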
Veeresh is a skilled professional specializing in JDBC, REST API, Linux, and Shell Scripting. With a knack for resolving complex issues and implementing Python transformations, he plays a crucial role in enhancing Hevo's data integration solutions.
Ishwarya is a skilled technical writer with over 5 years of experience. She has extensive experience working with B2B SaaS companies in the data industry, and she channels her passion for data science into producing informative content that helps individuals understand the complexities of data integration and analysis.