In this article we will discuss about Postgres OAuth2 Authentication and why do we need an oauth2 on postgres.

Common SSO advantages

SSO (in our case with Keycloak) represents many advantages :

  • More than a unique password, a unique login (advantages over an ldap/pg sync)
  • Standard and secure
  • A single product that is only in charge of authentication (and sometimes authorizations, take a look at https://blog.please-open.it/uma/)
  • Session management

Deploying a SSO in an existing infrastructure needs some adaptations and external modules for legacy services. For example, a nginx proxy that supports SSO we use sometimes for web applications that does not have any authentication method available : https://github.com/please-openit/nginx-openresty-oidc.

70’s style data access control with row level filtering

In the 70’s, most applications (such as databases) had multi-user support. Databases (and application) were used from terminals such as a VT100.

 

File source: https://commons.wikimedia.org/wiki/File:Informatics_General_programmer_at_terminal.jpg

Each user had a login and a password with some kind of authorizations on a database, called « grants ».

I.E :

GRANT UPDATE ON ORDER_BACKLOG TO JONES WITH GRANT OPTION

https://www.ibm.com/docs/en/qmf/12.2.0?topic=privileges-sql-grant-statement

This kind of authorizations, directly in the database, is not often used now. An application connects to a database using a single user, then manages its own authorizations directly in the application.

Views are the result of queries, with a grant management system that allows users to query some data from the database.

https://www.oracletutorial.com/oracle-view/

Row Level Security is the next step, a filter is applied on each request the user executes.

https://docs.oracle.com/middleware/12212/biee/BIEMG/GUID-031DAE26-1CEC-40D0-97E7-6EFA0E87377F.htm#BIEMG-GUID-031DAE26-1CEC-40D0-97E7-6EFA0E87377F

Postgrest

We love postgrest product. @mathieupassenau wrote a blog post 2 years ago about this product, how it works and how to deploy it with Keycloak authentication https://www.mathieupassenaud.fr/codeless_backend/.

This product uses row level security, a native policy from postgres that filters data for the user.

https://postgrest.org/en/stable/auth.html#roles-for-each-web-user

How can we replace this yellow box, and merge the authentication part directly in Postgres ?

It has big advantages :

  • oauth2 for postgres allows postgres to be aware of which user requesting data
  • Advantages of using native postgre’s data access control (especially performance)

OAuth2 for pg does not exist, how to workaround this ?

There is no specific way to integrate oauth2 authentication in postgres. Fortunately postgres supports PAM, and PAM is the authentication module interface for linux systems and has a broad choice of community modules available over the web.

PAM module structure

https://en.wikipedia.org/wiki/Pluggableauthenticationmodule

PAM allows programs that rely on authentication to be written independently of the underlying authentication scheme.

Writing a PAM module is only the integration of 2 headers :

#include <security/pam_modules.h>#include <security/pam_ext.h>

So, implement your own functions :

PAM_EXTERN int pam_sm_authenticate(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_chauthtok(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_open_session(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_close_session(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_setcred(pam_handle_t *pamh, int flags, int argc, const char **argv)PAM_EXTERN int pam_sm_acct_mgmt(pam_handle_t *pamh, int flags, int argc, const char **argv)

Then, you do whatever you want. Compile it, deploy it to the /lib directory, then you can reference it in /etc/pam.d

I.E :

declare a new file « pg_auth » in /etc/pam.d with this content

auth sufficient pam_oauth2.so https://keycloak/realms/REALMID/protocol/openid-connect/userinfo preferred_usernameaccount sufficient pam_oauth2.so https://keycloak/realms/REALMID/protocol/openid-connect/userinfo preferred_username

  • auth means « authentication » module
  • account means « account retrieving » module

means that the module « pam_oauth2.so » is used for the authentication, with some parameters :

  • an url for the userinfo
  • a field from the result of the userinfo response

How to setup a pam oauth2 with PG (implem & conf)

Oauth2 module for PAM

check the source code : https://github.com/please-openit/pam-oauth2

Authentication will be done by sending an access_token instead of a password. We have to verify this token

The function « query_token_info » does it with a simple « CURL » command to the userinfo endpoint, with the access_token in header :

    int ret = 1;    struct curl_slist *headers = NULL;    char *authorization_header;    CURL *session = curl_easy_init();    if ((authorization_header = malloc(strlen("Authorization: Bearer ") + strlen(authtok) +1))){        strcpy(authorization_header, "Authorization: Bearer ");        strcat(authorization_header, authtok);    }    // else{ check malloc    // ... check session    headers = curl_slist_append(headers, "Content-Type: application/json");    headers = curl_slist_append(headers, authorization_header);    curl_easy_setopt(session, CURLOPT_URL, tokeninfo_url);    curl_easy_setopt(session, CURLOPT_HTTPHEADER, headers);    curl_easy_setopt(session, CURLOPT_WRITEFUNCTION, writefunc);    curl_easy_setopt(session, CURLOPT_WRITEDATA, token_info);    if (curl_easy_perform(session) == CURLE_OK &&            curl_easy_getinfo(session, CURLINFO_RESPONSE_CODE, response_code) == CURLE_OK) {        ret = 0;    }     // ... else { check if curl failed    free(authorization_header);    curl_easy_cleanup(session);

Then we check if the response is 200, and compare the username given with the « preferred_username » field in the « userinfo » response.

If ok, the user is authenticated 🙂

note : you have some components in Python for writing your own PAM module with your favorite language ! https://pypi.org/project/python-pam/

PAM with PG

Postgres can use PAM modules as authentication instead of its own native authentication method ; this is done with configuration in pg_hba.conf file. You can define different authentication methods depending on the host that connects to the database server.

I.E : connect using a custom « pg_auth » pam configuration module

local   all             all             pam pamservice=pg_authhost    all     all     ::1/128         pam pamservice=pg_auth

So, our postgres authentication uses the pam module declared previously, and uses our custom oauth2 implementation.

So now : we authenticate with a username and a valid access_token

Troubleshooting

  • Pg clients will truncate your token most of the time, so you have to verify that the library you use supports long passwords.
  • Postgresql backend prior to 14 does not allow password longer than 995 which is not enough to pass even the smallest access token from keycloak. (you must use postgres 14+)

The magic : connecting to my database with my access token

Now, with the module enabled, we do not use passwords but active access_token from the oauth2 authentication.

Example with PHP :

<?php   $oauthAccessToken    = "ey....="   $host                = "host = 127.0.0.1";   $port                = "port = 5432";   $dbname              = "dbname = testdb";   $credentials         = "user = contact@please-open.it password=$oauthAccessToken";   $db = pg_connect( "$host $port $dbname $credentials"  );   if(!$db) {      echo "Error : Unable to open database\n";   } else {      echo "Opened database successfully\n";   }?>

Conclusion

This technique was set up to address a specific need for one of our clients. It is relevant only if you are already using your postgres to manage data-level security (with grants and row level security) or in a new project if you are planning to do so.

If you are using postgres with a unique user such as 99% of software today, envisaging this technique may include a big refactor of your backend software, so we recommend to use this method only for new projects using postgrest for example.

Loïc Mercier Des Rochettes
Les derniers articles par Loïc Mercier Des Rochettes (tout voir)