Java HDFS API programming II

Design patterns in Java: the template pattern. Define the skeleton of an algorithm in a base class and hand the specific implementation of its steps over to subclasses. In other words, the template method fixes the overall process; it does not concern itself with how each individual step is implemented. The specific ...
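The idea above can be sketched in a few lines of Java. The class names below (FileProcessor, UpperCaseProcessor) are illustrative only, not taken from the article:

```java
// Template pattern: the abstract class fixes the algorithm's skeleton;
// subclasses supply the variable steps.
abstract class FileProcessor {
    // Template method: defines the fixed process, final so subclasses cannot change it.
    public final String process(String data) {
        String opened = open(data);
        String result = transform(opened);
        close();
        return result;
    }

    protected String open(String data) { return data.trim(); } // common step
    protected abstract String transform(String data);          // subclass-specific step
    protected void close() { /* hook: optional override */ }
}

class UpperCaseProcessor extends FileProcessor {
    @Override
    protected String transform(String data) { return data.toUpperCase(); }
}

public class TemplateDemo {
    public static void main(String[] args) {
        FileProcessor p = new UpperCaseProcessor();
        System.out.println(p.process("  hello hdfs  ")); // prints HELLO HDFS
    }
}
```

The `final` modifier on `process` is the key design choice: callers always get the same sequence of steps, and subclasses can only vary what happens inside each step.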

Posted by alecapone on Thu, 23 Dec 2021 23:39:28 +0100

HDFS transparent encryption usage: Keystore and Hadoop KMS, encryption zones, key concepts and architecture of transparent encryption, KMS configuration

HDFS transparent encryption: Keystore, Hadoop KMS, and encryption zones. Data in HDFS is stored as blocks on the local disks of the DataNodes, but these blocks are in clear text. If you access the directory where a block is stored at the operating-system level, you can read its contents directly with the cat command ...
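As a hedged sketch of the KMS wiring the excerpt refers to: HDFS clients and the NameNode locate the key provider through `core-site.xml` (older Hadoop 2.x releases used the equivalent `dfs.encryption.key.provider.uri` setting); the host and port below are placeholders:

```xml
<!-- core-site.xml: point HDFS at the KMS (kms-host:9600 is a placeholder) -->
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@kms-host:9600/kms</value>
</property>
```

With the provider configured, a key can be created with `hadoop key create mykey`, and a directory is then turned into an encryption zone with `hdfs crypto -createZone -keyName mykey -path /secure`; files written under that path are encrypted and decrypted transparently.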

Posted by R0CKY on Mon, 13 Dec 2021 01:26:20 +0100

Hadoop HDFS folder creation, file upload, file download, folder deletion, file renaming, file details, and file type check (folder or file)

Abstract: This article introduces the basic API of Hadoop HDFS, covering the Windows-side dependency configuration and the Maven dependency configuration. It closes with hands-on practice: obtaining a connection to a remote HDFS and performing a series of operations on it, including folder creation, file upload, file download, f ...
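The operations listed above map onto the `org.apache.hadoop.fs.FileSystem` API roughly as follows. This is a sketch, not the article's own code: the NameNode address, user name, and paths are placeholders, and a reachable HDFS instance is assumed:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsApiDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to a remote HDFS as a given user (host and user are placeholders).
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf, "hadoop");

        fs.mkdirs(new Path("/demo"));                                          // folder creation
        fs.copyFromLocalFile(new Path("local.txt"), new Path("/demo"));        // file upload
        fs.copyToLocalFile(new Path("/demo/local.txt"), new Path("copy.txt")); // file download
        fs.rename(new Path("/demo/local.txt"), new Path("/demo/renamed.txt")); // file renaming

        for (FileStatus st : fs.listStatus(new Path("/demo"))) {               // file details
            System.out.printf("%s len=%d isDir=%b%n",
                    st.getPath().getName(), st.getLen(), st.isDirectory());    // type check
        }

        fs.delete(new Path("/demo"), true);                                    // recursive folder deletion
        fs.close();
    }
}
```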

Posted by The_Walrus on Thu, 09 Dec 2021 05:10:08 +0100

Big data offline processing project: website log file collection, log splitting, upload to HDFS, and preprocessing

Introduction: This article covers the first stage of a big data offline processing project: data collection. Main contents: 1) use Flume to collect website log file data into access.log; 2) write a shell script that splits the collected log file (otherwise access.log grows too large) and renames the pieces to access_MM/dd/yyyy.log. ...
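A minimal Flume agent for the collection step might look like the following. The agent name, source log path, and HDFS URL are assumptions for illustration, not taken from the article:

```properties
# Hypothetical Flume agent: tail the web server log and sink it to HDFS.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/nginx/access.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/weblogs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```

The date escapes in `hdfs.path` let the sink partition output by day, which pairs naturally with the daily split-and-rename script the article describes.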

Posted by erth on Tue, 30 Nov 2021 12:59:03 +0100

Sqoop principles and basic usage

1. Introduction to Sqoop (1) Introduction: Sqoop is an Apache tool for "transferring data between Hadoop and relational database servers". Import: bring data from MySQL, Oracle, and similar databases into Hadoop storage systems such as HDFS, Hive, and HBase. Export: move data from the Hadoop file system out to relation ...
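The two directions described above correspond to the `sqoop import` and `sqoop export` tools. As a sketch only: the connection string, credentials, table names, and directories below are placeholders:

```shell
# Import: copy a MySQL table into HDFS (all values are placeholders).
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username root --password secret \
  --table orders \
  --target-dir /user/hadoop/orders \
  -m 1

# Export: push files from an HDFS directory back into a relational table.
sqoop export \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username root --password secret \
  --table orders_backup \
  --export-dir /user/hadoop/orders
```

The `-m` flag controls how many parallel map tasks Sqoop launches; `-m 1` keeps the example to a single output file.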

Posted by bouncer on Fri, 26 Nov 2021 14:36:34 +0100

Practice of running the Hadoop WordCount program locally

^_^ 1. Configure local Hadoop. Hadoop 2.7.5 download link: https://pan.baidu.com/s/12ef3m0CV21NhjxO7lBH0Eg (extraction code: hhhh). Unzip the downloaded Hadoop package to drive D so it is easy to find. Then right-click Computer → Properties → Advanced system settings → Environment Variables → select the Path b ...
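Before wiring up Hadoop itself, it helps to see what the WordCount job actually computes. This plain-Java sketch (no Hadoop dependency; class and method names are illustrative) mirrors the map-then-reduce logic of splitting text into tokens and summing per-token counts:

```java
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // The same reduction WordCount performs on a cluster: token -> occurrence count.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String tok : text.toLowerCase().split("\\s+")) {
            if (!tok.isEmpty()) counts.merge(tok, 1, Integer::sum); // "reduce" step
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello hadoop hello HDFS")); // {hadoop=1, hdfs=1, hello=2}
    }
}
```

In the real MapReduce program this logic is split across a Mapper (emit each token with count 1) and a Reducer (sum the counts per token); running it locally, as the article describes, executes both phases in a single JVM.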

Posted by hori76 on Fri, 05 Nov 2021 19:53:07 +0100