In this article we will discuss about Postgres OAuth2 Authentication and why do we need an oauth2 on postgres.

Common SSO advantages

SSO (in our case with Keycloak) represents many advantages :

  • More than a unique password, a unique login (advantages over an ldap/pg sync)
  • Standard and secure
  • A single product that is only in charge of authentication (and sometimes authorizations, take a look at https://blog.please-open.it/uma/)
  • Session management

Deploying a SSO in an existing infrastructure needs some adaptations and external modules for legacy services. For example, a nginx proxy that supports SSO we use sometimes for web applications that does not have any authentication method available : https://github.com/please-openit/nginx-openresty-oidc.

70’s style data access control with row level filtering

In the 70’s, most applications (such as databases) had multi-user support. Databases (and application) were used from terminals such as a VT100.

 

File source: https://commons.wikimedia.org/wiki/File:Informatics_General_programmer_at_terminal.jpg

Each user had a login and a password with some kind of authorizations on a database, called « grants ».

I.E :

GRANT UPDATE ON ORDER_BACKLOG TO JONES WITH GRANT OPTION

https://www.ibm.com/docs/en/qmf/12.2.0?topic=privileges-sql-grant-statement

This kind of authorizations, directly in the database, is not often used now. An application connects to a database using a single user, then manages its own authorizations directly in the application.

Views are the result of queries, with a grant management system that allows users to query some data from the database.

https://www.oracletutorial.com/oracle-view/

Row Level Security is the next step, a filter is applied on each request the user executes.

https://docs.oracle.com/middleware/12212/biee/BIEMG/GUID-031DAE26-1CEC-40D0-97E7-6EFA0E87377F.htm#BIEMG-GUID-031DAE26-1CEC-40D0-97E7-6EFA0E87377F

Postgrest

We love postgrest product. @mathieupassenau wrote a blog post 2 years ago about this product, how it works and how to deploy it with Keycloak authentication https://www.mathieupassenaud.fr/codeless_backend/.

This product uses row level security, a native policy from postgres that filters data for the user.

https://postgrest.org/en/stable/auth.html#roles-for-each-web-user

How can we replace this yellow box, and merge the authentication part directly in Postgres ?

It has big advantages :

  • oauth2 for postgres allows postgres to be aware of which user requesting data
  • Advantages of using native postgre’s data access control (especially performance)

OAuth2 for pg does not exist, how to workaround this ?

There is no specific way to integrate oauth2 authentication in postgres. Fortunately postgres supports PAM, and PAM is the authentication module interface for linux systems and has a broad choice of community modules available over the web.

PAM module structure

https://en.wikipedia.org/wiki/Pluggableauthenticationmodule

PAM allows programs that rely on authentication to be written independently of the underlying authentication scheme.

Writing a PAM module is only the integration of 2 headers :

#include <security/pam_modules.h>#include <security/pam_ext.h>

So, implement your own functions :

PAM_EXTERN int pam_sm_authenticate(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_chauthtok(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_open_session(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_close_session(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_setcred(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_acct_mgmt(pam_handle_t *pamh, int flags, int argc, const char **argv)

Then, you do whatever you want. Compile it, deploy it to the /lib directory, then you can reference it in /etc/pam.d

I.E :

declare a new file « pg_auth » in /etc/pam.d with this content

auth sufficient pam_oauth2.so https://keycloak/realms/REALMID/protocol/openid-connect/userinfo preferred_usernameaccount sufficient pam_oauth2.so https://keycloak/realms/REALMID/protocol/openid-connect/userinfo preferred_username

  • auth means « authentication » module
  • account means « account retrieving » module

means that the module « pam_oauth2.so » is used for the authentication, with some parameters :

  • an url for the userinfo
  • a field from the result of the userinfo response

How to setup a pam oauth2 with PG (implem & conf)

Oauth2 module for PAM

check the source code : https://github.com/please-openit/pam-oauth2

Authentication will be done by sending an access_token instead of a password. We have to verify this token

The function « query_token_info » does it with a simple « CURL » command to the userinfo endpoint, with the access_token in header :

    int ret = 1;    struct curl_slist *headers = NULL;    char *authorization_header;    CURL *session = curl_easy_init();    if ((authorization_header = malloc(strlen("Authorization: Bearer ") + strlen(authtok) +1))){        strcpy(authorization_header, "Authorization: Bearer ");        strcat(authorization_header, authtok);    }    // else{ check malloc    // ... check session    headers = curl_slist_append(headers, "Content-Type: application/json");    headers = curl_slist_append(headers, authorization_header);    curl_easy_setopt(session, CURLOPT_URL, tokeninfo_url);    curl_easy_setopt(session, CURLOPT_HTTPHEADER, headers);    curl_easy_setopt(session, CURLOPT_WRITEFUNCTION, writefunc);    curl_easy_setopt(session, CURLOPT_WRITEDATA, token_info);    if (curl_easy_perform(session) == CURLE_OK &&            curl_easy_getinfo(session, CURLINFO_RESPONSE_CODE, response_code) == CURLE_OK) {        ret = 0;    }     // ... else { check if curl failed    free(authorization_header);    curl_easy_cleanup(session);

Then we check if the response is 200, and compare the username given with the « preferred_username » field in the « userinfo » response.

If ok, the user is authenticated 🙂

note : you have some components in Python for writing your own PAM module with your favorite language ! https://pypi.org/project/python-pam/

PAM with PG

Postgres can use PAM modules as authentication instead of its own native authentication method ; this is done with configuration in pg_hba.conf file. You can define different authentication methods depending on the host that connects to the database server.

I.E : connect using a custom « pg_auth » pam configuration module

local   all             all             pam pamservice=pg_authhost    all     all     ::1/128         pam pamservice=pg_auth

So, our postgres authentication uses the pam module declared previously, and uses our custom oauth2 implementation.

So now : we authenticate with a username and a valid access_token

Troubleshooting

  • Pg clients will truncate your token most of the time, so you have to verify that the library you use supports long passwords.
  • Postgresql backend prior to 14 does not allow password longer than 995 which is not enough to pass even the smallest access token from keycloak. (you must use postgres 14+)

The magic : connecting to my database with my access token

Now, with the module enabled, we do not use passwords but active access_token from the oauth2 authentication.

Example with PHP :

<?php   $oauthAccessToken    = "ey....="   $host                = "host = 127.0.0.1";   $port                = "port = 5432";   $dbname              = "dbname = testdb";   $credentials         = "user = contact@please-open.it password=$oauthAccessToken";   $db = pg_connect( "$host $port $dbname $credentials"  );   if(!$db) {      echo "Error : Unable to open database\n";   } else {      echo "Opened database successfully\n";   }?>

Conclusion

This technique was set up to address a specific need for one of our clients. It is relevant only if you are already using your postgres to manage data-level security (with grants and row level security) or in a new project if you are planning to do so.

If you are using postgres with a unique user such as 99% of software today, envisaging this technique may include a big refactor of your backend software, so we recommend to use this method only for new projects using postgrest for example.

Loïc Mercier Des Rochettes