I have this strange feeling that around Dakar, there is only few people who knows Postgresql. I guess I don't know enough people. This article is not really a tutorial, nor an introduction to Postgresql. It's more like a list of some answers to the questions I received everytime I introduced Postgresql to somebody.

I'm working on Debian Jessie, and the installed version of Postgresql is 9.4.8

Postgresql installed, how to login to the Postgresql server?

To login to the Postgresql server, you need to:

  • login as the UNIX user.
  • run psql command, the Postgresql interactive terminal.

There is others ways, but we'll talk about it later.

After a fresh install of Postgresql on Debian, a separate user (UNIX user) who own the POSTGRESQL files and processes is created and locked. The user name is "postgres". This user is actually the only one who can access the Postgresql server.

The UNIX user "postgres" is locked, it means it cannot be logged in using a password. While it's perfectly possible to unlock the user and set a password for him with sudo passwd postgres, it's advised to first login as root, then use the su command:


You're logged in as "postgres" user. You can now type psql command to login to the Postgresql server, and once inside the psql console, type \l to list the available databases: psql list databases

Templates databases?

As you can see, I've already created some databases. But after a fresh install, you normally have 3 databases: postgres, template0, template1. The postgres database is meant as a default database for users and applications to connect to. It is simply a copy of template1.

The databases template0 and template1 are templates databases, we know it thanks to the datistemplate attribute, which is set to t (true).

[email protected][local] postgres 18:05:25 # select datname, datistemplate, datacl from pg_database
[more] - > where datname like 'template%';
  datname  | datistemplate |               datacl
-----------+---------------+-------------------------------------
 template1 | t             | {=c/postgres,postgres=CTc/postgres}
 template0 | t             | {=c/postgres,postgres=CTc/postgres}
(2 lignes)

Temps : 58,335 ms
[email protected][local] postgres 18:05:46 #


  1. template1: Every time you run CREATE DATABASE, it uses an already existing database as the "template". Basically, it creates a copy of the template database you specify. If no template is specified, it uses "template1" by default. Suppose you want to have a set of specific objects in all your databases. You can add them in template1. Then, every time you'll create a new database, Those specific objects will automatically be available without any extra action.

  2. template0: The template0 database contains only the standard objects predefined by your version of PostgreSQL. The template0 database is intended to be a "virgin" template thatcontain only standard database objects, the ones created by initializing a cluster. It should never be changed. Leaving template0 unmodified will ensure that the administrator will have a standard unmodified template from which to create new databases.



Postgresql Host Based Authentication?:

The pg_hba.conf file is the one that enables client authentication between the PostgreSQL server and the client application. Depending on your configuration, this file may be stored in the database cluster's data directory, or elsewhere. On my Debian for example, it's under /etc/postgresql/9.4/main directory.

Whenever you'll want to connect to the PostgreSQL server, you'll send a connection request. For every connection request PostgreSQL receives, it will check the pg_hba.conf file to:

  • verify that the machine requesting a connection has rights to connect to the specified database.
  • find what are the conditions that the application must meet in order to successfully authenticate.

Please, follow this link to find a more complete description of the pg_hba.conf file.

After a fresh install of Postgresql, you'll be able to login as the postgres user, because of one line inside pg_hba.conf:

# Database administrative login by Unix domain socket
local   all             postgres                 peer


This line is basically saying, from right to left: "using peer authentication, the postgres user, is able to connect to all databases, via local connection: unix domain sockets".

The peer authentication method works by obtaining the client's operating system user name from the kernel and using it as the allowed database user name. That is why to login to the Postgresql server, we first logged in as the UNIX user named postgres.

Cluster?

Inside a single database server:

  • tables are contained in schemas.
  • a database contains one or more schemas.
  • and databases are grouped in cluster.

A cluster is just a single directory undOBer which all data will be stored on disk, the data directory. The SQL standard uses the term catalog cluster.

The term "cluster" in Postgresql is completely different to the general meaning of cluster, which normally refers to groups of computers that work together and are viewed as a single system. It is also un-related to the PostgreSQL command CLUSTER, which is about organizing tables.

The pg_lsclusters command will show you informations about all PostgreSQL clusters:
psql list clusters

Create a cluster?

If for any reason you want to create a new cluster, it is quite easy.

Create the required folders, and create the cluster: create cluster

Start the cluster: start cluster

And from another shell, access the new cluster with the psql command. To choose the cluster you want to access, you'll specify the port: -p PORT_NUMBER: access cluster

Now, if what you want, is just to quickly create a new cluster for playing, a cluster that you will destroy at the end of your play, you can use the pg_virtualenv command.

In fact, the main purpose of this command is not creating clusters. pg_virtualenv is a shell script that will create a throw-away PostgreSQL environment for running regression tests. When you invoke the command without any argument, you'll be dropped inside a new shell, and when you'll exit this shell, the cluster will be destroyed:
pg virtualenv

Write-Ahead Logging?

I think I should write another complete article just on this topic. Just remember that WAL, is how PostgreSQL named the transaction log, also known as REDO log within Oracle databases.

Let's suppose you have a 1GB file, and you want to change 100kB of that file, from a defined offset. What will happen if you have no UPS, and the power go down while doing the write? That is the purpose of WAL: protect your data from hardware or power failures. if the system crashes, the database can be restored to consistency by "replaying" the log entries made since the last checkpoint.

WAL's central concept is that changes to data files must be written only after log records describing the changes have been flushed to permanent storage. Under Debian, the WAL folder is /var/lib/postgresql/9.4/main/pg_xlog/.


I hope you learned something. Do not forget to: