Data Warehouse - Programmer Think - where programmers share thinking

Data Warehouse

Oracle database statement summary

Introduction and description. Four traditional mainstream databases: Oracle, MySQL, SQL Server, DB2. Non-relational databases: Redis, MongoDB. The mainstream databases are relational: there are association relationships between tables. When we say "install a database", we mean installing the database service. When creating a database, it refers t ...

Posted by rockobop on Fri, 25 Feb 2022 14:05:34 +0100

5.4 Integrity constraint naming clause, 5.6 Assertion

5.4 Integrity constraint naming clause. Syntax: CONSTRAINT <integrity constraint name> <integrity constraint>, where <integrity constraint> includes NOT NULL, UNIQUE, PRIMARY KEY phrase, FOREIGN KEY phrase, CHECK phrase, etc. [Example 5.1] Establish a Student registration fo ...

Posted by iceraider on Fri, 04 Feb 2022 10:52:20 +0100

Azkaban for big data task scheduling

Azkaban is a batch workflow task scheduler released by LinkedIn. It is mainly used to run a set of jobs and processes in a specific order within a workflow. Dependencies are configured through simple <key, value> pairs. Azkaban uses job configuration files to establish dependencies betwe ...

Posted by suzuki on Mon, 10 Jan 2022 23:38:51 +0100

Bill data warehouse construction - data warehouse concept and data collection

1. Data warehouse concept. A data warehouse can be abbreviated as DW or DWH. It is a strategic collection that provides data support from all systems for all of an enterprise's decision-making processes. Analyzing the data in a data warehouse can help enterprises improve business processes, control costs, and improve product quality. Data warehouse is n ...

Posted by ddragas on Sat, 01 Jan 2022 01:39:50 +0100

Hive file storage format

Hive supports the following formats for storing data: TEXTFILE (row storage), SEQUENCEFILE (row storage), ORC (column storage), and PARQUET (column storage). 1. Column storage and row storage. The left side of the figure above is a logical table; on the right, the first layout is row storage and the second is column storage. The storage ...

Posted by Lol5916 on Fri, 31 Dec 2021 12:59:52 +0100

MySQL database operation

1. Structure creation: create <structure type> <structure name> <structure description>; 2. Display structures: show <structure type (plural)>; display structure creation details: show create <structure type> <structure name>; 3. Data operations (data table). Add data: insert into <table name> values. View data: select from <table name>. Update data: updat ...

Posted by herreram on Mon, 27 Dec 2021 21:39:55 +0100

Creation of DBLINK from Dameng database DM to SQL SERVER

The DBLINK built here lets the Dameng database on the Linux side access the SQL SERVER database on the Windows side. By configuring ODBC to connect to the SQL SERVER database, DM creates an ODBC DBLINK connection, which realizes the DBLINK between DM and SQL SERVER. The following are the specific operation steps. DBLINK overview: DBLINK (Datab ...

Posted by Dustin013 on Thu, 23 Dec 2021 18:52:15 +0100

Senior big data Development Engineer - Hive learning notes

Hive advanced chapter. Use of Hive. Hive bucketed tables. 1. Principle of bucketed tables. Bucketing is a finer-grained division than partitioning. A Hive table or partitioned table can be further divided into buckets. Bucketing takes the hash value of a column across all of the data and uses it to determine which bucket th ...

Posted by luisluis on Wed, 08 Dec 2021 08:35:11 +0100

Big data offline processing project: data cleaning (ETL) with a MapReduce program

Introduction. Function: clean the collected log data, filtering out invalid data and static resources. Method: write a MapReduce program for processing. Classes involved: 1) Entity class (Bean): describes the fields of the log data, such as client IP, request URL, request status, etc. 2) Tool class: used to process beans: set the validity or invalidity of log ...

Posted by KRAK_JOE on Fri, 03 Dec 2021 17:01:43 +0100

FlinkCDC + Hudi + Hive: hands-on basics of real-time big data ingestion into the lake

Contents: The new architecture and lakehouse integration. 1. Version description. 2. Compile and package Hudi version 0.10.0: (1) use git to clone the latest master from GitHub; (2) compilation and packaging. 3. Create a Flink project: (1) main contents of the POM file; (2) checkpoint; (3) flinkcdc code; (4) hudi code (refer to the official ...

Posted by WebbDawg on Fri, 03 Dec 2021 03:42:39 +0100
