Using a clickhouse-local database
clickhouse-local is a CLI with an embedded version of ClickHouse. It gives users the power of ClickHouse without having to install a server. In this guide, we will learn how to use a clickhouse-local database from chDB.
Setup
Let's first create a virtual environment:
python -m venv .venv
source .venv/bin/activate
And now we'll install chDB. Make sure you have version 2.0.2 or higher:
pip install "chdb>=2.0.2"
And now we're going to install ipython:
pip install ipython
We're going to use ipython
to run the commands in the rest of the guide, which you can launch by running:
ipython
Installing clickhouse-local
Downloading and installing clickhouse-local is the same as downloading and installing ClickHouse. We can do this by running the following command:
curl https://clickhouse.com/ | sh
To launch clickhouse-local with the data being persisted to a directory, we need to pass in a --path
:
./clickhouse -m --path demo.chdb
Ingesting data into clickhouse-local
The default database only stores data in memory, so we'll need to create a named database to make sure any data we ingest is persisted to disk.
CREATE DATABASE foo;
Let's create a table and insert some random numbers:
CREATE TABLE foo.randomNumbers
ORDER BY number AS
SELECT rand() AS number
FROM numbers(10_000_000);
Let's write a query to see what data we've got:
SELECT quantilesExact(0, 0.5, 0.75, 0.99)(number) AS quants
FROM foo.randomNumbers
┌─quants────────────────────────────────┐
│ [69,2147776478,3221525118,4252096960] │
└───────────────────────────────────────┘
Once you've done that, make sure you exit;
from the CLI as only one process can hold a lock on this directory.
If we don't do that, we'll get the following error when we try to connect to the database from chDB:
ChdbError: Code: 76. DB::Exception: Cannot lock file demo.chdb/status. Another server instance in same directory is already running. (CANNOT_OPEN_FILE)
Connecting to a clickhouse-local database
Go back to the ipython
shell and import the session
module from chDB:
from chdb import session as chs
Initialize a session pointing to demo..chdb
:
sess = chs.Session("demo.chdb")
We can then run the same query that returns the quantiles of numbers:
sess.query("""
SELECT quantilesExact(0, 0.5, 0.75, 0.99)(number) AS quants
FROM foo.randomNumbers
""", "Vertical")
Row 1:
──────
quants: [0,9976599,2147776478,4209286886]
We can also insert data into this database from chDB:
sess.query("""
INSERT INTO foo.randomNumbers
SELECT rand() AS number FROM numbers(10_000_000)
""")
Row 1:
──────
quants: [0,9976599,2147776478,4209286886]
We can then re-run the quantiles query from chDB or clickhouse-local.