Installing and Configuring TimescaleDB for Storing Audit Data

Learn how to install and configure the TimescaleDB datastore for a Cloudentity platform deployment.

TimescaleDB Datastore Overview

TimescaleDB is an open-source time-series database built on top of PostgreSQL. It is designed to handle large amounts of time-series data and to provide fast querying and aggregation. It runs on various platforms, such as Linux, Windows, and macOS.

One of the key features of TimescaleDB is its ability to scale horizontally. This means that as data grows, you can add more machines to the cluster, rather than upgrading a single machine, to maintain fast query performance. This is achieved by using a technique called “time-series partitioning”, in which data is automatically partitioned by time interval.
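
To make the partitioning idea concrete, the following shell sketch computes the start of the time interval that a given timestamp falls into. It only illustrates the arithmetic and is not TimescaleDB's implementation; chunk_start and the one-day interval are assumptions chosen for the example.

```shell
#!/bin/bash
# Illustration only: map a Unix timestamp to the start of its time interval.
# chunk_start is a hypothetical helper, not a TimescaleDB function.
chunk_start() {
  # $1: Unix timestamp in seconds, $2: interval length in seconds
  echo $(( $1 - $1 % $2 ))
}

# Two events on the same UTC day map to the same one-day interval.
chunk_start 1673524800 86400
chunk_start 1673481601 86400
```

Rows whose timestamps yield the same interval start would be stored together, which is why a query restricted to a time range only needs to touch the matching partitions.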

Another beneficial feature of TimescaleDB is its ability to handle large amounts of data, up to billions of rows. This is possible thanks to its hybrid storage engine, which uses disk-based storage for historical data and RAM-based storage for recent data. This enables fast queries on recent data while keeping historical data available for querying and analysis.

Why TimescaleDB

Cloudentity uses TimescaleDB to store audit and analytics/metrics data because it is a powerful and efficient tool for handling large amounts of time-series data. One of the key benefits of TimescaleDB is its ability to scale horizontally, which means that as the volume of audit/analytics/metrics data grows, Cloudentity can add more machines to the cluster, rather than upgrading a single machine, to maintain fast query performance.

Another benefit of TimescaleDB when storing audit data is its ability to handle complex queries efficiently. TimescaleDB provides various time-based aggregate functions, which enable Cloudentity to run complex queries on the audit/analytics/metrics data with high performance. This makes it easier to analyze the audit data to identify patterns, detect anomalies, and extract insights.
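
As an illustration of a time-based aggregate, the sketch below prints a query that counts events per time bucket using TimescaleDB's time_bucket function. The table name audit_events and the column created_at are hypothetical and do not reflect the actual Cloudentity schema; audit_rollup_sql is a helper name made up for this example.

```shell
#!/bin/bash
# Sketch: print a per-bucket event count query.
# audit_events and created_at are hypothetical names, not the Cloudentity schema.
audit_rollup_sql() {
  # $1: bucket width as a PostgreSQL interval literal (defaults to 1 hour)
  cat <<SQL
SELECT time_bucket('${1:-1 hour}', created_at) AS bucket,
       count(*) AS events
FROM audit_events
GROUP BY bucket
ORDER BY bucket;
SQL
}

audit_rollup_sql '15 minutes'
```

The printed statement can be piped to psql against a database where such a table exists.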

TimescaleDB Installation

Important

Of the three databases that we install and configure, TimescaleDB is the only optional one. If you choose not to install TimescaleDB, you won’t be able to use the audit events, analytics, and metrics features and APIs built into our platform.

At Cloudentity, we install and configure TimescaleDB with Helm, a popular package manager for Kubernetes that allows users to install and configure complex software such as TimescaleDB on a Kubernetes cluster. Using Helm makes deploying and managing TimescaleDB simpler and more efficient in several ways.

Firstly, Helm provides a convenient way to define and manage the configuration of TimescaleDB, including the number of nodes, storage settings, and networking settings, in a single, easy-to-read values file that is passed to the chart. This makes it easy to understand and modify the configuration of TimescaleDB as needed.

Additionally, Helm provides the ability to manage and upgrade the TimescaleDB deployment in a controlled and repeatable way. Updates or upgrades to the TimescaleDB software can be rolled out to the cluster in a predictable manner, which reduces the risk of disrupting the service.

When you install the Cloudentity platform on Kubernetes using Helm Charts, the TimescaleDB dependency is included in our kube-acp-stack Helm Chart.

TimescaleDB Version Recommendation

  • Database: 2.8.0 (with Postgres 14.5)
  • Helm chart: 0.16.3

Supported TimescaleDB Versions

  • 2.8.x (with Postgres 14.5)

Install TimescaleDB in Kubernetes Cluster

  1. Create a namespace for TimescaleDB.

    kubectl create namespace acp-db

Prepare configmap

  1. Create a create_extra_dbs.sh script that creates the database role and the database for Cloudentity to use. Write the following content to the file:

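    #!/bin/bash
    # Creates the acp role and the acpdb database that Cloudentity connects to.
    # "$1" is the connection argument this script receives when it is run.

    psql -d "$1" <<__SQL__
    CREATE ROLE acp WITH LOGIN SUPERUSER;
    CREATE DATABASE acpdb OWNER acp;
    GRANT ALL PRIVILEGES ON DATABASE acpdb TO acp;
    __SQL__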
    
  2. Create a ConfigMap from create_extra_dbs.sh in the acp-db namespace.

    kubectl create configmap timescale-post-init --from-file=create_extra_dbs.sh --namespace acp-db
    

Prepare passwords setup

  1. Create a set_passwords.sh file. Remember to replace the example password with your own.

    #!/bin/bash
    psql -d "$1" --file=- --set ON_ERROR_STOP=1 << __SQL__
    SET log_statement TO none;      -- prevent these passwords from being logged
    ALTER USER acp WITH PASSWORD 'PaSsW0rD';
    __SQL__
    
  2. Create a secret from set_passwords.sh in the acp-db namespace.

    kubectl create secret generic timescale-post-init-pw --from-file=set_passwords.sh --namespace acp-db
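
If you prefer not to keep a hand-written password in the file, the script can be generated with a random one. The following is a minimal sketch under stated assumptions: gen_password and write_set_passwords are hypothetical helper names, and the generated file has the same content as the set_passwords.sh example above, with only the password changed.

```shell
#!/bin/bash
# Sketch: generate set_passwords.sh with a random password.
# gen_password and write_set_passwords are hypothetical helper names.

gen_password() {
  # 16 random bytes printed as 32 hex characters; hex needs no SQL or URL escaping.
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

write_set_passwords() {
  # $1: output file, $2: password (must not contain a single quote)
  {
    printf '%s\n' '#!/bin/bash'
    printf '%s\n' 'psql -d "$1" --file=- --set ON_ERROR_STOP=1 << __SQL__'
    printf '%s\n' 'SET log_statement TO none;      -- prevent these passwords from being logged'
    printf "ALTER USER acp WITH PASSWORD '%s';\n" "$2"
    printf '%s\n' '__SQL__'
  } > "$1"
  # Restrict access because the file contains a credential.
  chmod 600 "$1"
}
```

For example, write_set_passwords set_passwords.sh "$(gen_password)" writes the file, after which the kubectl create secret command shown above can be run unchanged. Keep a copy of the generated password, because the same value is needed in the connection URL.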
    

Install TimescaleDB

  1. Prepare a configuration file, for example, config.yaml.

    postInit:
      - configMap:
          name: timescale-post-init
      - secret:
          name: timescale-post-init-pw
    
  2. To install the TimescaleDB database, execute the following command in your terminal:

    helm repo add timescale 'https://charts.timescale.com'
    helm repo update
    helm upgrade --install timescaledb --namespace acp-db timescale/timescaledb-single -f config.yaml --version 0.16.3
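
After the install command returns, it can help to confirm that the release is deployed and its pods are ready. The sketch below assumes the release name timescaledb and the namespace acp-db used above; verify_timescaledb and run are hypothetical helper names, and setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/bash
# Sketch: post-install checks for the release installed above.
# verify_timescaledb and run are hypothetical helper names.

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

verify_timescaledb() {
  ns="${1:-acp-db}"
  release="${2:-timescaledb}"
  # Show the release status, wait until every pod in the namespace is Ready, then list the pods.
  run helm status "$release" --namespace "$ns" &&
    run kubectl wait --for=condition=Ready pod --all --namespace "$ns" --timeout=300s &&
    run kubectl get pods --namespace "$ns"
}
```

Calling verify_timescaledb with no arguments checks the acp-db namespace. The five-minute timeout is an arbitrary choice for the example.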
    

Configure TimescaleDB Dependency

To configure the connection between the Cloudentity platform and TimescaleDB, edit the values.yaml file for your Cloudentity deployment and apply the changes.

Configure Connection Between Cloudentity Platform and TimescaleDB Datastore

If you need to configure the connection between the Cloudentity platform and TimescaleDB:

  1. Refer to the timescale (timescale client) section of the Cloudentity Platform Configuration Reference to learn about available configuration options.

  2. Change the configuration for the connection in the acp.config.data.timescale section of the Cloudentity Platform values.yaml file for your deployment.

  3. Apply the changes to your deployment.

TimescaleDB Integration Configuration Example

If you deployed the TimescaleDB datastore by following the instructions in the Install TimescaleDB in Kubernetes Cluster section, the configuration for the connection between TimescaleDB and the Cloudentity platform looks like the following:

acp:
  enabled: true
  config:
    data:
      timescale:
        enabled: true
        url: postgres://acp:PaSsW0rD@timescaledb.acp-db.svc.cluster.local/acpdb
        migrations:
          path: ./migrations/timescale
          timeout: 1m0s

With this configuration in the values.yaml file, the following configuration is included in the /data/extraconfig.yaml file and passed to your Cloudentity deployment:

timescale:
  enabled: true
  migrations:
    path: ./migrations/timescale
    timeout: 1m0s
  url: postgres://acp:PaSsW0rD@timescaledb.acp-db.svc.cluster.local/acpdb
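
Because the password is embedded in the url value, characters such as @, /, or : in a password need to be percent-encoded. The following sketch builds the URL from its parts. It assumes that the platform parses the value as a standard URI and that the password is ASCII; urlencode and timescale_url are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: build the timescale url with a percent-encoded password.
# urlencode and timescale_url are hypothetical helpers; ASCII input is assumed.

urlencode() (
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character
    s="$rest"
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;             # unreserved characters stay as they are
      *) out="$out$(printf '%%%02X' "'$c")" ;;     # anything else becomes %XX
    esac
  done
  printf '%s\n' "$out"
)

timescale_url() {
  # $1: user, $2: password, $3: host, $4: database
  printf 'postgres://%s:%s@%s/%s\n' "$1" "$(urlencode "$2")" "$3" "$4"
}

timescale_url acp 'PaSsW0rD' timescaledb.acp-db.svc.cluster.local acpdb
```

With the example password, which contains no reserved characters, the output matches the url shown in the configuration above.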

Troubleshooting

If your TimescaleDB deployment is configured incorrectly, an error message similar to the following may appear in the Cloudentity logs:

{"error":"failed to create database client: failed to connect to `host=timescale user=postgres database=acp`: hostname resolving error (lookup timescale on 1.0.0.0:1: server misbehaving)","level":"fatal","msg":"failed to connect to timescale database"}

If there is a connection issue between the Cloudentity platform and TimescaleDB, an error message similar to the following may appear in the Cloudentity logs:

{"error":"failed to create database client: failed to connect to `host=acp-cockroachdb-public user=root database=defaultdb`: dial error (dial tcp 1.0.0.0:1: connect: connection refused)","level":"fatal","msg":"failed to connect to the database"}
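
When reading such messages, it can help to compare the host, user, and database reported in the log with the ones in your configured url. The sketch below prints a connection URL in the same host/user/database form without revealing the password. pg_url_summary is a hypothetical helper, and the parsing is deliberately simple: it assumes a URL of the form scheme://user:password@host[:port]/database.

```shell
#!/bin/bash
# Sketch: summarize a postgres URL in the host/user/database form used in the log lines.
# pg_url_summary is a hypothetical helper; the password is deliberately not printed.
pg_url_summary() (
  rest="${1#*://}"          # drop the scheme
  userinfo="${rest%@*}"     # user[:password], split at the last '@'
  hostpath="${rest##*@}"    # host[:port]/database[?params]
  user="${userinfo%%:*}"
  hostport="${hostpath%%/*}"
  host="${hostport%%:*}"
  db="${hostpath#*/}"
  db="${db%%\?*}"           # drop any query parameters
  printf 'host=%s user=%s database=%s\n' "$host" "$user" "$db"
)

pg_url_summary 'postgres://acp:PaSsW0rD@timescaledb.acp-db.svc.cluster.local/acpdb'
```

If the values in the log differ from this summary, the configured url is probably not the one your deployment is using.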

Updated: Jan 12, 2023