A look at developing, compiling, and testing ClickHouse

Posted by evo4ever on Mon, 03 Jan 2022 22:15:16 +0100

ClickHouse (CH), an open-source OLAP database with excellent performance, has become increasingly popular in recent years. Besides taking root at major Internet companies, it has attracted a large number of enthusiastic contributors. As of version v21.10, CH has 1064 contributors worldwide.

SELECT count(1)
FROM system.contributors

Query id: 7cdf54f1-cb50-45c0-99b9-14d73d283e39

┌─count()─┐
│    1064 │
└─────────┘

To contribute code to the community, you first have to go through a complete development, compilation, and testing cycle. This article summarizes some tools and techniques commonly used along the way, in the hope that they help readers who are interested in contributing code to CH.

development

For development environment configuration, refer to the following articles

compile

Download the latest code before compiling

git clone https://github.com/clickhouse/clickhouse

Then run the following three git submodule commands to pull the third-party library code

git submodule update --init --recursive
git submodule foreach git checkout .
git submodule sync --recursive
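To confirm that the submodules really were pulled, one option is to inspect `git submodule status`, which prefixes an entry with `-` when that submodule has not been initialized. The following is a minimal sketch; the function name `count_uninitialized_submodules` is made up for illustration.

```shell
#!/usr/bin/env bash
# Count submodules that `git submodule status` reports as not initialized.
# Such entries are printed with a leading "-" before the commit hash.
count_uninitialized_submodules() {
    # Reads `git submodule status` output on stdin and prints the count.
    # grep exits non-zero when nothing matches, so the status is normalized.
    grep -c '^-' || true
}

# Usage (run from the repository root):
#   git submodule status --recursive | count_uninitialized_submodules
```

A result of 0 suggests every submodule is initialized; anything else means the update step should be repeated.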

The next step is to compile. Since the community has explicitly stated that gcc is not supported, it is recommended to compile with clang-12 or clang-13.

mkdir -p build_clang
cd build_clang
cmake  -G Ninja "-DCMAKE_C_COMPILER=$(command -v clang-13)" "-DCMAKE_CXX_COMPILER=$(command -v clang++-13)"  -DCMAKE_BUILD_TYPE=Debug -DENABLE_TESTS=0 -DENABLE_UTILS=0  -DENABLE_THINLTO=0 -DENABLE_NURAFT=0 -DDISABLE_HERMETIC_BUILD=1 ..
ninja clickhouse
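If it is not known in advance which clang version is installed on a machine, the compiler can be picked at configure time. This is only a sketch: `first_available` is a hypothetical helper, and the candidate names simply mirror the clang-12 and clang-13 versions mentioned above.

```shell
#!/usr/bin/env bash
# Print the first command from the argument list that is found on PATH.
# Returns non-zero if none of the candidates is installed.
first_available() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: prefer clang-13 and fall back to clang-12.
#   CC=$(first_available clang-13 clang-12) || echo "no supported clang found" >&2
#   cmake -G Ninja "-DCMAKE_C_COMPILER=$(command -v "$CC")" ... ..
```

The C++ compiler can be chosen the same way with clang++-13 and clang++-12; both should come from the same clang version.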

debugging

To set up a debugging environment, see: VS Code C++ remote debugging in practice

In addition, it is best to ignore the SIGUSR1 and SIGUSR2 signals while debugging CH (they are used to collect some query metrics); otherwise the debugging session will be interrupted by these signals again and again.

When the debugger is gdb, set:

handle  SIGUSR1  noprint nostop
handle  SIGUSR2  noprint nostop

When the debugger is lldb, set:

pro hand -p false -s false -n false SIGUSR1
pro hand -p false -s false -n false SIGUSR2
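To avoid retyping the gdb settings in every session, they can be kept in a command file and loaded with gdb's `-x` option. A minimal sketch follows; the helper name `write_gdb_signal_file`, the file name, and the binary path are placeholders.

```shell
#!/usr/bin/env bash
# Write the two gdb signal settings from this section to a command file.
write_gdb_signal_file() {
    local target=$1
    cat > "$target" <<'EOF'
handle SIGUSR1 noprint nostop
handle SIGUSR2 noprint nostop
EOF
}

# Usage (file name and binary path are placeholders):
#   write_gdb_signal_file ./ch.gdb
#   gdb -x ./ch.gdb --args /path/to/clickhouse server
```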

test

Although the community already uses GitHub Actions to check newly submitted PRs, that is certainly not as fast as checking locally. Here are some common test tools:

check-style

check-style checks the code style. CH holds code to a fairly high standard, so it is recommended to run the check-style tool after writing code to see what does not meet the community's requirements.

./utils/check-style/check-style | tee style.log

There are two common mistakes for newcomers:

  • Braces are not placed on their own line

  • Lines end with trailing whitespace
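The second mistake is easy to catch before running the full tool. The sketch below is not how check-style itself works; it is just a plain grep for lines ending in spaces or tabs, wrapped in a hypothetical helper named `find_trailing_whitespace`.

```shell
#!/usr/bin/env bash
# List lines that end in spaces or tabs, printed as "file:line:content".
# /dev/null is passed as an extra file so grep always prints the file name.
# Exits 0 whether or not anything is found.
find_trailing_whitespace() {
    grep -n -E '[[:blank:]]+$' "$@" /dev/null || true
}

# Usage (the path is a placeholder):
#   find_trailing_whitespace src/path/to/YourFile.cpp
```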

functional test

Functional tests are the most common kind of test case in CH. The input is a sql or shell file, and the output is the result of executing it. If the actual result differs from the expected result, the functional test fails.

So how do you run a functional test? There are two cases.

For a functional test whose input is a sql file:

export CLICKHOUSE_CLIENT="/path/to/clickhouse client --host XX --port XX --user XX --password XX -m"
cat tests/queries/0_stateless/XXX.sql | $CLICKHOUSE_CLIENT

For a functional test whose input is a shell script:

export CLICKHOUSE_CLIENT="/path/to/clickhouse client --host XX --port XX --user XX --password XX -m"
bash -x  tests/queries/0_stateless/01675_distributed_bytes_to_delay_insert_long.sh
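The commands above only print the actual result; deciding pass or fail still requires comparing it with the expected one. A minimal sketch for the sql case, assuming the expected output is stored in a `.reference` file next to the `.sql` file, as it is for the tests under tests/queries/0_stateless. The function name `run_sql_test` is made up for illustration.

```shell
#!/usr/bin/env bash
# Run one .sql test through $CLICKHOUSE_CLIENT and diff its output against
# the .reference file next to it. Returns diff's exit status (0 = match).
run_sql_test() {
    local sql_file=$1
    local reference_file="${sql_file%.sql}.reference"
    # $CLICKHOUSE_CLIENT is left unquoted on purpose: it holds a command
    # plus its arguments, as in the export line above.
    $CLICKHOUSE_CLIENT < "$sql_file" | diff -u "$reference_file" -
}

# Usage, after exporting CLICKHOUSE_CLIENT as shown above:
#   run_sql_test tests/queries/0_stateless/XXX.sql && echo OK
```

An empty diff means the test passed; otherwise the unified diff shows where the actual output departs from the reference.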

fast test

The community uses fast test to verify PRs quickly. It compiles the code and runs a subset of the functional tests. When a submitted PR fails the fast test, the best approach is to reproduce the failure in a local environment.

export LLVM_VERSION=13
export PULL_REQUEST_NUMBER=31104
export stage="" # Stages, in order: "", clone_root, run, clone_submodules, run_cmake, build, configure, run_tests, and others; set the stage as needed.
export FASTTEST_WORKSPACE=/path/to/fasttest/workspace # fast test downloads the PR code into this directory, then compiles and runs it
cd ./docker/test/fasttest
bash -x run.sh  | tee run.log  

integration test

In CH, many tests depend on a cluster environment or on other components (mysql, zookeeper, hdfs, etc.). Functional tests cannot cover these cases, so the community introduced integration tests.

Integration tests can also be run locally. Taking test_storage_hdfs as an example:

cd $prefix/tests/integration
sudo ./runner --binary $prefix/build_clang/programs/clickhouse  --odbc-bridge-binary $prefix/build_clang/programs/clickhouse-odbc-bridge --base-configs-dir $prefix/programs/server 'test_storage_hdfs -ss'
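The runner command above assumes that both binaries already exist under build_clang. A small pre-flight check can save a failed run; this is only a sketch, and `missing_binaries` is a hypothetical helper, not part of the runner.

```shell
#!/usr/bin/env bash
# Print every path in the argument list that is not an executable file.
# Returns 0 only if all of them are present and executable.
missing_binaries() {
    local bin status=0
    for bin in "$@"; do
        if [ ! -x "$bin" ] || [ -d "$bin" ]; then
            printf '%s\n' "$bin"
            status=1
        fi
    done
    return "$status"
}

# Usage, with $prefix pointing at the repository checkout as above:
#   missing_binaries "$prefix/build_clang/programs/clickhouse" \
#                    "$prefix/build_clang/programs/clickhouse-odbc-bridge" \
#       || echo "build the binaries listed above before starting the runner" >&2
```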
