Environment used in this document
Server: CentOS
Jupyter with the Scala kernel spylon-kernel
Main contents of this article
Spark reads data from Hive tables in two main ways: querying the Hive table directly with SQL, or reading Hive tables and Hive partitioned tables through their HDFS files. Initialize the SparkSession t ...
Posted by flash99 on Mon, 20 Sep 2021 18:18:29 +0200
4. HQL (Hive SQL)
1) Add the following configuration to the hive-site.xml file to display the current database name and the column headers of queried tables.
Oozie is an open-source framework built on a workflow engine, contributed to Apache by Cloudera. It serves as the workflow scheduling engine of the Hadoop platform and is used to manage Hadoop jobs. This article, the first in a series, introduces Oozie's task submission phase.
We deduce the implementation f ...
Posted by cowboy_x on Tue, 30 Jun 2020 05:26:29 +0200
A tool for translating SQL statements into MapReduce programs.
Create table statement
CREATE TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
ROW FORMAT DELIMI ...
Posted by mainewoods on Wed, 29 Apr 2020 14:54:07 +0200
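A note on the `page_view` table above: because it is declared with `PARTITIONED BY(dt STRING, country STRING)`, Hive stores each partition in its own subdirectory named `key=value`, nested in the order the partition columns are declared. The following is only a minimal sketch of that layout in Python. The helper `partition_path`, the table location and the partition values are hypothetical and chosen for illustration, and the sketch ignores the escaping Hive applies to special characters in partition values.

```python
def partition_path(table_location, partition_spec):
    """Build the directory of one partition in Hive's key=value layout.

    partition_spec is an ordered list of (column, value) pairs, given in
    the same order as the PARTITIONED BY clause.
    """
    levels = [f"{column}={value}" for column, value in partition_spec]
    return "/".join([table_location.rstrip("/")] + levels)


# Hypothetical table location and partition values, for illustration only.
table_location = "/user/hive/warehouse/page_view"
path = partition_path(table_location, [("dt", "2008-06-08"), ("country", "US")])
print(path)  # /user/hive/warehouse/page_view/dt=2008-06-08/country=US
```

Reading a single partition through its HDFS files then comes down to pointing the reader at a directory of this shape.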
Reference article: Apache Impala detailed installation (the most complete guide to the pitfalls)
Apache Impala detailed installation
Impala is an efficient SQL query tool provided by Cloudera that returns query results in real time. According to the official tests, it is 10 to 100 times faster than Hive, and its SQL queries are even faster than Spark SQL. imp ...
Posted by deth4uall on Tue, 21 Apr 2020 09:18:04 +0200
The Hadoop version is 2.8.3.
Today I ran into a strange problem: as shown in List-1 below, two file blocks are reported missing.
There are 2 missing blocks. The following files may be corrupted:
blk_1073857295 /tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f0 ...
Posted by tomfmason on Wed, 25 Mar 2020 15:31:06 +0100
1. Because the hostname cannot be resolved
1. Cannot find hadoop installation: $HADOOP_HOME or $HADOOP_PREFIX must be set or hadoop must be in the path
2. Unable to instantiate org.a ...
Posted by a6000000 on Tue, 17 Mar 2020 02:31:41 +0100
I have a live Android application, and I have received the following stack trace from the market. I don't know why it is happening: it does not occur in the application code itself, but is (I assume) caused by some event or other in the application.
I am not using Fragments, yet there is still a reference to the FragmentManager. If someone can under ...
Posted by weemee500 on Sun, 01 Mar 2020 05:05:55 +0100
1. Introduction to Sqoop:
Sqoop is an open-source tool mainly used to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL).
It can import data from a relational database (such as MySQL, Oracle, Postgres, etc.) into Hadoop's HDFS,
and it can also export data from HDFS into a relational database. ...
Posted by wolfrock on Wed, 26 Feb 2020 07:30:01 +0100