You should know that the warehouse is safe

Posted by pirri on Sat, 29 Jan 2022 15:42:08 +0100

Absrtact: there are two technical paths to prevent data leakage. One is permission management, which adopts the principle of minimizing authorization to authorize users and applications using data. The other is data encryption, including SQL function encryption and transparent encryption.

This article is shared from Huawei cloud community< [security without trifles] you should know the security of data warehouse - encryption function >, original author: zhangkunhn.

preface

Recently, I encountered a customer scenario involving the permission of sharing schema. The scenario is simple and can be described as follows: some users are data producers and need to create tables and write data in the schema; Other users are consumers of data and read the data in the schema for analysis. An implementation method of schema permission management is that the data producer informs the administrator that the user uses grant select on all tables in schema syntax to grant permissions to the consumer after each new table is created. This method has some limitations. If the manufacturer creates some new tables under the schema, in order to authorize the consumer to use these new tables, you also need to inform the administrator that the user can use grant select on all tables in schema again. Is there a simple solution? The answer is yes. You can use alter default privilege. Alter default privilege is used to grant or reclaim permissions for objects created in the future.

Grammar introduction

 ALTER DEFAULT PRIVILEGES
     [ FOR { ROLE | USER } target_role [, ...] ]
     [ IN SCHEMA schema_name [, ...] ]
     abbreviated_grant_or_revoke;

Of which abreviated_ grant_ or_ The revoke clause is used to specify which objects are authorized or reclaimed. The syntax for authorizing tables is:

 GRANT { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES } 
     [, ...] | ALL [ PRIVILEGES ] }
     ON TABLES 
     TO { [ GROUP ] role_name | PUBLIC } [, ...]

Parameter description

  • target_role

The name of an existing role. If FOR ROLE/USER is omitted, the default value is the current role / user.

Value range: the name of an existing role.

  • schema_name

Name of the existing schema.

target_role must have schema_ CREATE permission for name.

Value range: the name of the existing mode.

  • role_name

The name of the role to which permission is granted or revoked.

Value range: existing role name.

See details Syntax description of ALTER DEFAULT PRIVILEGES

Scenario example

 testdb=# create user creator1 password 'Gauss_234';  
 CREATE USER
 testdb=# create user creator2 password 'Gauss_234';  
 CREATE ROLE
 testdb=# create user user1 password 'Gauss_234';
 CREATE USER
 --Create share schema,grant creator1 and creator2 Create permissions, grant user1 Use permission
 testdb=# create schema shared_schema;  
 CREATE SCHEMA
 testdb=> grant create, usage on schema shared_schema to creator1;
 GRANT
 testdb=> grant create, usage on schema shared_schema to creator2;
 GRANT
 testdb=# grant usage on schema shared_schema to user1;
 GRANT
 --take creator1 and creator2 stay shared_schema To create a table in select Permission granting user1
 testdb=# alter default privileges for user creator1, creator2 in schema shared_schema grant select on tables to user1;
 ALTER DEFAULT PRIVILEGES
 --Cut to creator1,Build table
 testdb=# \c testdb creator1
 You are now connected to database "testdb" as user "creator1".
 testdb=> create table shared_schema.t1 (c1 int);
 CREATE TABLE
 --Cut to creator2,Build table
 testdb=> \c testdb creator2
 You are now connected to database "testdb" as user "creator2".
 testdb=> create table shared_schema.t2 (c1 int);
 CREATE TABLE
 --Cut to user1,query OK
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t1 union select * from shared_schema.t2;
  c1 
 ----
 (0 rows)

View the status quo of granting default permissions

Query system table pg_default_acl can view which schemas are currently granted default permissions. From the defalacl field, you can see that creater1 and creater2 grant user1 to shared respectively_ select permission of objects in schema (r means read).

 testdb=# select r.rolname, n.nspname, a.defaclobjtype, a.defaclacl from
 testdb-#     pg_default_acl a, pg_roles r, pg_namespace n
 testdb-#     where a.defaclrole=r.oid and a.defaclnamespace=n.oid;
  rolname  |    nspname    | defaclobjtype |     defaclacl      
 ----------+---------------+---------------+--------------------
  creator1 | shared_schema | r             | {user1=r/creator1}
  creator2 | shared_schema | r             | {user1=r/creator2}
 (2 rows)

Some details

All users who create objects in the shared schema should appear in the list after alter default privileges for user. Otherwise, if a user creater3 is not in the list, the objects created in the shared schema or those whose Owner is creater3 will not be queried by user1. Because the table created by the creater3 user in the shared schema does not grant the default permission to user1.

 testdb=# create user creator3 password 'Gauss_234';
 CREATE USER
 testdb=# grant create, usage on schema shared_schema to creator3;
 GRANT
 testdb=# \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> create table shared_schema.t3 (c1 int);
 CREATE TABLE
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
 ERROR:  permission denied for relation t3

The administrator can put creater3 into the list through alter default privileges for user and grant user1 the default permission to access the table created by creater3 user, or the creater3 user can authorize user1 through alter default privileges In the previous syntax parameter description, if FOR ROLE/USER is omitted, the default value is the current user.

 testdb=> \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> alter default privileges in schema shared_schema grant select on tables to user1;
 ALTER DEFAULT PRIVILEGES
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
 ERROR:  permission denied for relation t3
 testdb=> \c testdb creator3
 testdb=> create table shared_schema.t4 (c1 int);
 CREATE TABLE
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t4;
  c1 
 ----
 (0 rows)

The third line of the above code shows that the current user is in shared_ The select permission of the table created under schema is granted to user1. In line 7, user1 queries shared_schema.t3 reported insufficient permissions because alter default privileges only deals with future objects. shared_schema.t3 was created before. Let's create a new table shared_schema.t4, user1 user query is normal.

If you want to handle the permissions of an existing table, use the grant statement. See grant syntax description.

 testdb=> \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> grant select on all tables in schema shared_schema to user1;
 ERROR:  permission denied for relation t1
 testdb=> grant select on table shared_schema.t3 to user1;
 GRANT
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
  c1 
 ----
 (0 rows)

Shared in line 3 of the code_ The schema contains three tables created by users, and creator 3 is only the Owner of table t3. Therefore, the permission granted to the whole schema will report an error. After only granting the table t3 with creator 3 as the Owner, the query of user1 user is normal.

Alter default privileges only deals with future objects, grant only deals with existing objects. Further, the objects involved in granting permissions by these two grammars only include the object whose Owner is the current user. If you want to grant permissions to the objects of all owners under the shared schema, you need to use the alter default privileges for user syntax and grant syntax for the administrator user.

Transparent encryption

Application scenario of transparent encryption

Transparent encryption can ensure the security of user data. Replacing disks, disk outflow or illegal direct reading of disk files by O & M will bypass authentication, permission management and audit, resulting in the risk of data leakage. When customers have high confidentiality requirements for business data, it is recommended to use transparent encryption.

Principle of transparent encryption

The transparent encryption function is to encrypt and store the user data stored on the hard disk, and is not aware of the user and the application using SQL on the upper layer. The meaning of transparency means that it is not perceived by customers. Only transparent encryption needs to be configured when creating GaussDB(DWS) clusters. At present, it supports the encrypted storage of row save table and column save table files, and supports the transparent encryption configuration at the cluster level.

Transparent encryption at the cluster level means that all libraries in the cluster and all tables in the library are encrypted storage. Transparent encryption at the cluster level also means that it needs to be configured when creating a cluster. It cannot be modified after the cluster is created. Neither an unencrypted cluster nor an encrypted cluster can be modified to an unencrypted cluster.

encryption algorithm

The core of transparent encryption is algorithm and key. We use AES-128 algorithm and CTR in encryption mode. CTR stream encryption can ensure that the length of plaintext and ciphertext is equal, and will not lead to the expansion of data storage space after encryption.

key management

The use of Huawei public cloud KMS service management ensures the security of users' keys.
The encryption key hierarchy has three layers. These keys are arranged in hierarchical order, including master key (CMK), cluster key (CEK) and database key (DEK).

  • The master key is saved in KMS and used to encrypt CEK.
  • CEK is used to encrypt DEK. The CEK plaintext is saved in the cluster memory and the ciphertext is saved in the service management plane.
  • DEK is used to encrypt the data in the database. The DEK plaintext is saved in the cluster memory and the ciphertext is saved in the service management plane.

Key rotation

For security reasons, users can perform key rotation. Key rotation only rotates the cluster key, regardless of the database secret key.

Subsequent evolution of transparent encryption

The advantage of cluster level transparent encryption is that all data, including user tables and system tables, are encrypted, which is suitable for all encryption requirements. The two sides of a coin tell us that advantages may also be disadvantages. Encrypting all database objects will bring performance overhead to data import and query.

To solve this problem, we will consider supporting fine-grained transparent encryption in the future. For example, table level transparent encryption can be supported. When creating a table, the user specifies the attribute as an encrypted table, and the data of the user table will be encrypted and stored. Users can turn on the encryption attribute in the table containing sensitive data, and do not perceive the encryption and decryption process in the process of query and use. Because the encryption granularity is small, the impact on performance is also small.

Transparent encryption is an effective means to ensure the security of users' core data. This paper introduces the transparent encryption characteristics of GaussDB(DWS) data warehouse from the use scenario and principle, and points out the research direction of subsequent transparent encryption characteristics.

SQL function encryption

Technical background

In cryptography, cryptographic algorithms can be divided into three categories: hash function, symmetric cryptographic algorithm and asymmetric cryptographic algorithm.

  • hash function

Hash function is also called digest algorithm. For data, hash function will generate fixed length data, that is, Hash(data)=result. This process is irreversible, that is, there is no inverse function of hash function, and data cannot be obtained from result. In the scenario where plaintext should not be saved, for example, the password is sensitive information, and the system administrator user should not know the user's plaintext password, the hash algorithm should be used to store the one-way hash value of the password.

In actual use, salt value and iteration times will be added to avoid generating the same hash value with the same password, so as to prevent rainbow table attack.

  • Symmetric cryptographic algorithm

Symmetric cryptographic algorithms use the same key to encrypt and decrypt data. Symmetric cipher algorithm is divided into block cipher algorithm and stream cipher algorithm.

The block cipher algorithm divides the plaintext into fixed length packets and encrypts each packet with a key. Because the packet length is fixed, the plaintext will be filled when the plaintext length is not an integer multiple of the packet length. Due to the existence of padding, the length of ciphertext obtained by block cipher algorithm will be greater than that of plaintext.

The stream cipher algorithm computes the plaintext bit by bit with the key stream. Stream cipher algorithm does not need filling, and the length of ciphertext is equal to the length of plaintext.

  • Asymmetric cryptographic algorithm

Asymmetric cryptographic algorithm, also known as public key cryptographic algorithm. The algorithm uses two keys: public key and private key. The public key is made public to all and the private key is kept confidential. Asymmetric cryptographic algorithms are used in key agreement, digital signature, digital certificate and other fields.

Technical realization

GaussDB(DWS) mainly provides hash function and symmetric cryptographic algorithm. The hash function supports sha256, sha384, sha512 and state secret sm3. Symmetric cryptographic algorithm supports aes128, aes192, aes256 and state secret sm4.

hash function

  • md5(string)

Use MD5 to encrypt the string and take the hexadecimal number as the return value. MD5 has low security and is not recommended.

  • gs_hash(hashstr, hashmethod)

hashmethod algorithm is used to summarize the information of hashstr string and return the information summary string. hashmethod supported: sha256, sha384, sha512, sm3.

testdb=# SELECT gs_hash('GaussDB(DWS)', 'sha256');
                             gs_hash                              
------------------------------------------------------------------
 cc2d1b97c6adfba44bbce7386516f63f16fc6e6a10bd938861d3aba501ac8aab
(1 row)

Symmetric cryptographic algorithm

  • gs_encrypt(encryptstr, keystr, cryptotype, cryptomode, hashmethod)
    The encryption algorithm composed of cryptotype and cryptomode and the HMAC algorithm specified by hashmethod are used to encrypt the encryptstr string with keystr as the key and return the encrypted string.
    Supported cryptotype s: aes128, aes192, aes256, sm4.
    cryptomode supported: cbc.
    hashmethod supported: sha256, sha384, sha512, sm3.
testdb=# SELECT gs_encrypt('GaussDB(DWS)', '1234', 'aes128', 'cbc',  'sha256');
                                                        gs_encrypt                                                        
--------------------------------------------------------------------------------------------------------------------------
 AAAAAAAAAADlzZYiNQK1uB+p1gza4Lu3Moj3HdP4E1uJmqfDYBaXDLMt7RZoE0YVx9h2dMRYBQ5fhFNqqM49sUkeS72o8kX5vWRQvfW3fuocGyp+b+lX9A==
(1 row)
  • gs_decrypt(decryptstr, keystr,cryptotype, cryptomode, hashmethod)

The encryption algorithm composed of cryptotype and cryptomode and the HMAC algorithm specified by hashmethod are used to decrypt the decryptstr string with keystr as the key and return the decrypted string. The keystr used for decryption must be consistent with the keystr used for encryption before normal decryption.

testdb=# SELECT gs_decrypt('AAAAAAAAAADlzZYiNQK1uB+p1gza4Lu3Moj3HdP4E1uJmqfDYBaXDLMt7RZoE0YVx9h2dMRYBQ5fhFNqqM49sUkeS72o8kX5vWRQvfW3fuocGyp+b+lX9A==', '1234', 'aes128', 'cbc',  'sha256');
  gs_decrypt  
--------------
 GaussDB(DWS)
(1 row)

Effect analysis

There is a student table with three attributes: id, name and score. Name can be saved using hash function encryption, and score can be saved using symmetric cryptographic algorithm.

testdb=# create table student (id int, name text, score text);
CREATE TABLE
testdb=# insert into student values (1, gs_hash('alice', 'sha256'), gs_encrypt('95', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1
testdb=# insert into student values (2, gs_hash('bob', 'sha256'), gs_encrypt('92', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1
testdb=# insert into student values (3, gs_hash('peter', 'sha256'), gs_encrypt('98', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1

Users without keys cannot see the name and score columns of encrypted data even if they have the select permission.

testdb=# select * from student;
 id |                               name                               |                                                          score                                                           
----+------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------
  1 | 2bd806c97f0e00af1a1fc3328fa763a9269723c8db8fac4f93af71db186d6e90 | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHg6Qh1b8taF3cY5KDVm+faJK5AT9tjufkr3Wogj3tIpFfiIEb6+miGqPHWcmKnFsArAMoBG9pPDawGs1Qze7xGg==
  2 | 81b637d8fcd2c6da6359e6963113a1170de795e4b725b84d1e0b4cfd9ec58ce9 | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHZOHH7URkyme6r8Hfh1k0UsVbgbREjFMkgB52w+7GtUGqGgUik07ghajSD9PMIDLd/49wBCVROm2/HSOw6jzbxA==
  3 | 026ad9b14a7453b7488daa0c6acbc258b1506f52c441c7c465474c1a564394ff | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHwv6p/OAfDUyVULAqpaHIrYJYMcqLmQSj3K/REyavfMoKB7hgUpEPXfHRutWur37bru68jjt5XcBHFBjZeMgowA==
(3 rows)

Users with keys can view encrypted data through decryption.

testdb=# select id, gs_decrypt(score, '12345', 'aes128', 'cbc', 'sha256') from student;
 id | gs_decrypt 
----+------------
  1 | 95
  2 | 92
  3 | 98
(3 rows)

summary

Data encryption is an effective technology to prevent unauthorized access and protect data leakage. This paper introduces the basic principle of cryptographic algorithm and the encryption function of GaussDB(DWS) data warehouse, including hash function gs_hash, symmetric cryptographic algorithm_ encrypt/gs_ decrypt. An example is given to illustrate the usage scenario of encryption function.

For more information on GuassDB(DWS), WeChat search is welcome to search WeChat DWS for WeChat official account and share with you the latest and most complete PB technology.

 

Click follow to learn about Huawei's new cloud technology for the first time~

Topics: GaussDB