dev-resources.site
for different kinds of informations.
Auditing PostgreSQL Using pgAudit
Auditing in information technology (IT) is a process of examining an organization’s IT infrastructure to ensure compliance with the requirements imposed by recognized standards or established policies. Data protection rules, such as the new GDPR regulations, are becoming increasingly stringent to protect user data, so it’s important that your database audits are set up properly to ensure both your application and user data is secure from vulnerabilities. In this blog post, we will discuss pgAudit - a tool that generates the audit logs needed to facilitate the auditing of PostgreSQL.
What is pgAudit?
The PostgreSQL Audit Extension, pgAudit, is an open source extension that logs the events in a PostgreSQL database in a detailed audit log. It uses the native PostgreSQL logging facility, so the audit logs will be part of the PostgreSQL logs. The extension is based on the 2ndQuadrant pgAudit project authored by Simon Riggs, Abhijit Menon-Sen, and Ian Barwick, and includes enhancements by David Steele from Crunchy Data.
Why pgAudit over log_statement=all?
We can log all statements in PostgreSQL just by setting log_statement=all
. So why use pgAudit at all? The basic statement logging (using log_statement
) will only list the operations performed against the database. It will not provide the ability to filter operations, and the logs won’t be in the proper formatting required for auditing. pgAudit additionally provides granularity for logging specific classes of statements like READ
(SELECT
and COPY
), WRITE
(INSERT
, UPDATE
, DELETE
, etc.), DDL
etc. Furthermore, it provides object level auditing where only operations on specific relations will be logged.
Another advantage of pgAudit over basic statement logging is that it provides the details of the operation performed instead of just logging the operation requested. For example, consider executing the anonymous code block using a DO statement.
DO $$ BEGIN EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)'; END $$;
The basic statement logging will result in:
2020-12-20 23:40:11 UTC:157.230.232.139(53064):sgpostgres@test:[9091]: LOG: statement: DO $$ BEGIN EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)'; END $$;
pgAudit will log the same operation as:
2020-12-20 23:40:11 UTC:157.230.232.139(53064):sgpostgres@test:[9091]: LOG: AUDIT: SESSION,4,1,FUNCTION,DO,,,"DO $$ BEGIN EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)'; END $$;", 2020-12-20 23:40:11 UTC:157.230.232.139(53064):sgpostgres@test:[9091]: LOG: AUDIT: SESSION,4,2,DDL,CREATE TABLE,TABLE,public.important_table,CREATE TABLE important_table (id INT),
The above clearly indicates the pgAudit functionality that logs the operation and its internals with structured output that eases the search.
How to install pgAudit?
pgAudit is an extension that’s available for download from the PostgreSQL repository, or can be compiled and built from source. As a first step, the package needs to be downloaded and installed on the machine running PostgreSQL (this extension package is preinstalled on all ScaleGrid PostgreSQL deployments).
Once installed, it needs to be loaded into PostgreSQL. This is achieved by adding pgaudit
to the shared_preload_libraries
config parameter. A restart of PostgreSQL is required for this configuration change to be effective. The next step is to enable the extension on the database by running CREATE EXTENSION pgaudit
.
Now that the extension is ready, we need to make sure to set the configuration parameters for the extension to start logging. This can be as simple as setting the parameter pgaudit.log
to value all
and the pgAudit will start logging in session
mode.
Now that we know how to install and enable pgAudit, let’s discuss the two audit logging modes it offers, session and object.
Session Audit Logging
In session mode, pgAudit will log all the operations performed by a user. Setting the pgaudit.log
parameter to any of the defined values, other than NONE
, will enable session audit logging. The pgaudit.log
parameter specifies the classes of statements that will be logged in the session mode. The possible values are: READ
, WRITE
, FUNCTION
, ROLE
, DDL
, MISC
, MISC_SET
, ALL
and NONE
.
Setting the pgaudit.log
parameter to ALL
will log all the statements. The parameter can accept multiple classes using a comma-separated list and specific classes can be excluded with a - sign. For example, if you want to log all statements except MISC
class, the value of pgaudit.log
will be ALL, -MISC, -MISC_SET
. You can also enable pgAudit to create a separate log entry for each relation reference in a statement by setting pgaudit.log_relation
to on.
Consider an example of creating a table. The SQL statement would be:
CREATE TABLE persons(ID SERIAL PRIMARY KEY, LNAME varchar(20), FNAME varchar(20));
The corresponding audit log entries are:
2020-12-21 00:00:11 UTC:157.230.232.139(53178):sgpostgres@test:[11514]: LOG: AUDIT: SESSION,5,1,DDL,CREATE SEQUENCE,SEQUENCE,public.persons_id_seq,"CREATE TABLE persons(ID SERIAL PRIMARY KEY, LNAME varchar(20), FNAME varchar(20));", 2020-12-21 00:00:11 UTC:157.230.232.139(53178):sgpostgres@test:[11514]: LOG: AUDIT: SESSION,5,1,DDL,CREATE TABLE,TABLE,public.persons,"CREATE TABLE persons(ID SERIAL PRIMARY KEY, LNAME varchar(20), FNAME varchar(20));", 2020-12-21 00:00:11 UTC:157.230.232.139(53178):sgpostgres@test:[11514]: LOG: AUDIT: SESSION,5,1,DDL,CREATE INDEX,INDEX,public.persons_pkey,"CREATE TABLE persons(ID SERIAL PRIMARY KEY, LNAME varchar(20), FNAME varchar(20));", 2020-12-21 00:00:11 UTC:157.230.232.139(53178):sgpostgres@test:[11514]: LOG: AUDIT: SESSION,5,1,DDL,ALTER SEQUENCE,SEQUENCE,public.persons_id_seq,"CREATE TABLE persons(ID SERIAL PRIMARY KEY, LNAME varchar(20), FNAME varchar(20));",
Object Audit Logging
In particular cases, it may be required to audit only a specific set of relations. In such cases, using session mode will only result in an unnecessarily large number of audit logs not corresponding to the required relations. Object mode is especially suited for this purpose and can audit only a specific set of relations.
Object audit logging is achieved using the PostgreSQL roles. A role can be created and assigned the permissions to access only a specific set of relations. This role should be specified in the configuration parameter pgaudit.role
. Object mode supports only SELECT
, INSERT
, UPDATE
and DELETE
statements. The classes of statements that are logged depends on the permissions granted to the role. For example, if the role has permissions to perform only SELECT
, then only SELECT
statements will be logged.
Below is an example of object audit logging:
Create a role and grant only SELECT
permissions. Set the pgaudit.role
to that role and run the SELECT
SQL statement:
CREATE ROLE audit_person; GRANT SELECT ON persons TO audit_person; SET pgaudit.role = 'audit_person'; SELECT * FROM persons WHERE ID=404;
The above select statement will be logged as:
2020-12-21 00:27:09 UTC:157.230.232.139(54900):sgpostgres@test:[21835]: LOG: AUDIT: OBJECT,10,1,READ,SELECT,TABLE,public.persons,select * from persons where ID=404;,
How to interpret the audit log entry?
So far, we have provided details on how the audit log entry looks, now let’s take a look at the audit log entry format. Each entry starts with the log_line_prefix mentioned for PostgreSQL logging, and then the rest of the output will be in CSV format. Consider the following simple audit log entry:
2020-12-21 00:27:09 UTC:157.230.232.139(54900):sgpostgres@test:[21835]: LOG: AUDIT: OBJECT,10,1,READ,SELECT,TABLE,public.persons,select * from persons where ID=404;,
In the above entry, the value
2020-12-21 00:27:09 UTC:157.230.232.139(54900):sgpostgres@test:[21835]:
is from the log_line_prefix format %t:%r:%u@%d:[%p]:Â
. The audit entry contents start from LOG: AUDIT:Â
value and it follows CSV format. The value format is of the form:
AUDIT_TYPE,STATEMENT_ID,SUBSTATEMENT_ID,CLASS,COMMAND,OBJECT_TYPE,OBJECT_NAME,STATEMENT,PARAMETER
Let’s take a look at each of the fields one by one:
Field | Description | Value from example audit entry |
---|---|---|
AUDIT_TYPE | Indicates the audit mode: SESSION or OBJECT | OBJECT |
STATEMENT_ID | Unique statement identifier for each session | 10 |
SUBSTATEMENT_ID | An identifier for each sub statement within the main statement | 1 |
CLASS | Indicates the class of statements like READ, WRITE etc that are defined values for pgaudit.log parameter. | READ |
COMMAND | The command used in the SQL statement | SELECT |
OBJECT_TYPE | Can be TABLE, INDEX, VIEW, etc. | TABLE |
OBJECT_NAME | The fully qualified object name | public.persons |
STATEMENT | The actual statement executed | select * from persons where ID=404; |
PARAMETER | When the pgaudit.log_parameter is set to true, the quoted CSV of parameters is listed if present, or "none" if there are no parameters. When the pgaudit.log_parameter is not set, the value will be “<not logged>” | <not logged> |
Inference
pgAudit, with all its capabilities, simplifies the process of auditing by generating the audit trail log. Though there are a few caveats, like logging of renamed objects under the same name, it is still a robust tool that provides the required functionality. However, the audit information written in logs may not be just ideal for the auditing process - the auditing process is even better when those logs can be converted to a database schema, and audit data can be loaded to the database so you can easily query the information. This is where the PostgreSQL Audit Log Analyzer (pgAudit Analyze) is helpful. For more information, refer to the github pages of pgAudit and pgAudit Analyze.
Featured ones: