Big Data [Page 19] - Programmer Think - where programmers share thinking

Big Data

Redis introduction and simple use

1. Introduction (skip this part if you are not interested). Concept: Redis is a high-performance NoSQL (non-relational) database. What is Redis? Redis is an open-source, high-performance key-value database written in C. According to the officially published test data, with 50 concurrent clients executing 100000 requests, the read speed is 110000 ti ...

Posted by covert215 on Wed, 29 Dec 2021 14:57:39 +0100

Big data - how to use Hadoop on Docker

Introduction: Since Hadoop is software designed for clusters, learning and using it inevitably involves configuring Hadoop on multiple machines, which creates many obstacles for beginners. There are two main obstacles. The first is expensive computer clusters: a cluster environment composed of multiple computers requires ex ...

Posted by titoni on Tue, 28 Dec 2021 23:21:47 +0100

Flume cluster installation and deployment, with introductory Flume cases: the official port-monitoring case and real-time monitoring of multiple appended files in a specified directory

Introduction: This post is a set of study notes on installing and deploying Flume. It covers Flume installation and deployment and two introductory Flume cases: the official case of monitoring port data, and real-time tracking of changes to multiple files in a specified directory. If there are mistakes, p ...

Posted by 3.grosz on Tue, 28 Dec 2021 09:57:51 +0100

Hudi log file format and read/write process

Background: Readers with some understanding of Hudi will know that Hudi has two table types: COW and MOR. A MOR table records its changes in log files. The data of a MOR table is stored in three kinds of files: log files (*.log.*), partition metadata (.hoodie_partition_ ...

Posted by kamsmartx on Tue, 28 Dec 2021 07:14:16 +0100

Python data analysis and mining - word-frequency statistics and visualization of past postgraduate entrance exam English papers over the last ten years (2012-2021)

Statement: This article is published only on CSDN; copies elsewhere are pirated. Please support the original! Original link: https://blog.csdn.net/meenr/article/details/1190393 ...

Posted by rish1103 on Mon, 27 Dec 2021 16:52:50 +0100

Flink state management 1

1. What is state? 1.1 Stateful and stateless: Flink performs stream processing. When a data stream arrives, the first record is processed by an operator in Flink, and a result is produced once execution completes. That result is then, for example, output. The subsequent data, such as the calculation of the ...
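
To make the stateful/stateless distinction in this excerpt concrete, here is a minimal sketch in plain Python. It does not use the Flink API; the function names and the sample records are hypothetical and chosen only for illustration. The stateless operator looks at each record in isolation, while the stateful one remembers a per-key total between records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def stateless_double(records: Iterable[int]) -> Iterator[int]:
    """Stateless operator: each output depends only on the current record."""
    for value in records:
        yield value * 2


def stateful_running_sum(
    records: Iterable[Tuple[str, int]]
) -> Iterator[Tuple[str, int]]:
    """Stateful operator: keeps a running total per key across records."""
    totals: Dict[str, int] = defaultdict(int)  # this dict plays the role of operator state
    for key, value in records:
        totals[key] += value
        yield key, totals[key]


if __name__ == "__main__":
    print(list(stateless_double([1, 2, 3])))
    print(list(stateful_running_sum([("a", 1), ("b", 5), ("a", 2)])))
```

In a real streaming engine the state would additionally have to survive failures, which is the motivation for managed state; the dictionary here only stands in for that idea.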

Posted by arun_desh on Sun, 26 Dec 2021 19:37:55 +0100

Learning notes for the fourth week of SQL

This is a study note for the Coursera course Databases and SQL for Data Science with Python. In this module you will learn the basic concepts related to using Python to connect to databases. In a Jupyter Notebook, you will create tables, load data, query data using SQL, and analyze data using Python. After completing this module, y ...
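
The workflow named in this excerpt (connect, create a table, load data, query with SQL) can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the course itself may use a different database and driver, and the table name, columns, and rows below are hypothetical.

```python
import sqlite3
from typing import List


def names_in_city(city: str) -> List[str]:
    """Create a table, load a few rows, and query them with SQL."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    try:
        cur = conn.cursor()
        # Create a table (hypothetical schema).
        cur.execute(
            "CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )
        # Load data using parameterized inserts.
        cur.executemany(
            "INSERT INTO instructor (id, name, city) VALUES (?, ?, ?)",
            [(1, "Ada", "Toronto"), (2, "Grace", "Chicago"), (3, "Alan", "Toronto")],
        )
        conn.commit()
        # Query data; the placeholder keeps the value out of the SQL string.
        cur.execute(
            "SELECT name FROM instructor WHERE city = ? ORDER BY id", (city,)
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    print(names_in_city("Toronto"))
```

The same steps run unchanged in a Jupyter Notebook cell; only the connection line would differ for another database.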

Posted by the7soft.com on Sun, 26 Dec 2021 14:56:58 +0100

Hadoop environment installation

Hadoop distributed environment. 0. Preliminary preparation: create a normal user.
# Create the fzk user
useradd fzk
# Set the fzk user's password
passwd fzk
# Give the fzk user root privileges so that later commands can be run with sudo (in /etc/sudoers, added below the %wheel line)
fzk ALL=(ALL) ...

Posted by phelpsa on Sun, 26 Dec 2021 10:23:42 +0100

26 data analysis cases -- the fourth station: web server log data collection based on Flume and Kafka

Experimental environment: Python 3.x; Hadoop 2.7.2; Kafka_2.11; Flume-1.9.0. Data package link: https://pan.baidu.com/s/1oZcqAx0EIRF7Aj1xxm3WNw Extraction code: kohe. Experimental steps. Step 1: install and start the httpd s ...

Posted by cemzafer on Sun, 26 Dec 2021 07:37:24 +0100

Detailed use of HDFS

HDFS 1. Shell operations. Upload. -moveFromLocal: cut and paste from local to HDFS: hadoop fs -moveFromLocal <local file> <HDFS directory>. -copyFromLocal: copy files from the local file system to an HDFS path: hadoop fs -copyFromLocal <local file> <HDFS directory>. -put: equivalent to copyFromLocal; the production environment is more used to ...
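
The three upload commands in this excerpt differ only in the flag passed to hadoop fs. A small sketch of how a script might assemble them is shown below; the helper name hdfs_upload_cmd and the example paths are hypothetical, and the function only builds the argument list without running anything.

```python
from typing import List

# Upload modes from the excerpt mapped to the corresponding hadoop fs flags.
UPLOAD_FLAGS = {
    "move": "-moveFromLocal",  # the local source is removed after upload
    "copy": "-copyFromLocal",  # the local source is kept
    "put": "-put",             # behaves like copyFromLocal
}


def hdfs_upload_cmd(local_path: str, hdfs_dir: str, mode: str = "put") -> List[str]:
    """Build the argument list for an HDFS upload (hypothetical helper)."""
    if mode not in UPLOAD_FLAGS:
        raise ValueError(f"unknown upload mode: {mode!r}")
    return ["hadoop", "fs", UPLOAD_FLAGS[mode], local_path, hdfs_dir]


if __name__ == "__main__":
    # Example paths are placeholders, not taken from the original post.
    print(" ".join(hdfs_upload_cmd("data.txt", "/user/demo", mode="move")))
```

On a machine with Hadoop installed, the returned list could be passed to subprocess.run(cmd, check=True) to perform the upload.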

Posted by alcedema on Sat, 25 Dec 2021 17:56:36 +0100
