This is the implementation of the Druid data handler for MindsDB.

Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics (OLAP queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important.

Prerequisites

Before proceeding, ensure the following prerequisites are met:

  1. Install MindsDB locally via Docker or use MindsDB Cloud.
  2. To connect Apache Druid to MindsDB, install the handler's required dependencies by following the MindsDB instructions for installing dependencies.
  3. Install or ensure access to Apache Druid.

Implementation

This handler was implemented using the pydruid library, the Python API for Apache Druid.

The arguments used to establish a connection are as follows. They are required unless marked optional:

  • host is the host name or IP address of the Apache Druid database.
  • port is the port that Apache Druid is running on.
  • path is the query path, for example /druid/v2/sql/ for the Druid SQL endpoint.
  • scheme is the URI scheme. This parameter is optional and defaults to http.
  • user is the username used to authenticate with Apache Druid. This parameter is optional.
  • password is the password used to authenticate with Apache Druid. This parameter is optional.
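
To illustrate how these parameters fit together, here is a minimal Python sketch. It is not the handler's actual code: the helper names build_connection_args, sql_endpoint, and open_connection are hypothetical, and passing the parameters straight to pydruid's DB API connect function is an assumption about how they map onto pydruid. The sketch applies the documented default for scheme, omits the optional credentials when they are not given, and shows the URL the parameters describe.

```python
from urllib.parse import urlunsplit


def build_connection_args(host, port, path, scheme="http", user=None, password=None):
    """Collect the connection parameters, applying the documented default for scheme."""
    args = {"host": host, "port": port, "path": path, "scheme": scheme}
    # user and password are optional, so include them only when provided.
    if user is not None:
        args["user"] = user
    if password is not None:
        args["password"] = password
    return args


def sql_endpoint(args):
    """Compose the URL that the parameters describe (scheme://host:port/path)."""
    netloc = f"{args['host']}:{args['port']}"
    return urlunsplit((args["scheme"], netloc, args["path"], "", ""))


def open_connection(args):
    """Open a DB API connection with pydruid (assumed mapping of the parameters)."""
    # Imported lazily so the helpers above work without pydruid installed.
    from pydruid.db import connect

    return connect(**args)


# Hypothetical values matching a local Druid instance.
args = build_connection_args("localhost", 8888, "/druid/v2/sql/")
print(sql_endpoint(args))
```

Calling open_connection(args) requires pydruid to be installed and a reachable Druid instance, so it is left out of the example run above.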

Usage

To connect to Apache Druid from MindsDB with this handler, use the following syntax:

CREATE DATABASE druid_datasource
WITH
    engine = 'druid',
    parameters = {
      "host": "localhost",
      "port": 8888,
      "path": "/druid/v2/sql/",
      "scheme": "http"
    };

Once the connection is established, you can query a table as follows:

SELECT *
FROM druid_datasource.example_tbl;