Redis first day class notes
Course objectives
- Master the operations on the different Redis data types
- Be able to operate Redis through the Java API
- Understand Redis's two persistence mechanisms
- Understand the Redis master-slave replication architecture
- Understand the Redis Sentinel architecture
- Understand the Redis Cluster architecture
NoSQL+MQ phase
- Massive data storage (HDFS) cannot cover every big data storage scenario, so we need to learn more distributed storage systems to handle those specific scenarios
- The NoSQL + MQ stage opens the door to the next stage of the course
Introduction to the development history of NoSQL databases
NoSQL introduction
Historical development of the Web
- Web 1.0 - represented by portal sites such as NetEase and Sina; users mainly read and browse, so websites faced no serious performance problems
- Web 2.0 - highly interactive, user-generated content (Weibo, Tianya, Mop); as more and more users participate, the load on websites, and especially on their databases, keeps growing
Database performance bottleneck
- A stand-alone database has limited performance and can typically support only a few hundred to about 1,000 concurrent connections
- Therefore a Redis cache is introduced to handle the high-concurrency traffic
NoSQL
- Not Relational - generally no SQL support, and relationships between tables cannot be modeled
- Characteristics:
- Good scalability
- Schema-less: can store structured, semi-structured and unstructured data
- Fast
- Supports storage of massive amounts of data
NewSQL: built on top of NoSQL, adds support for some SQL operations
- HBase -> Phoenix
The DB-Engines website tracks the current worldwide popularity ranking of database systems.
Redis introduction
Basic introduction to Redis
- Website behavior analysis / traffic analysis
- Classic traffic metrics:
- PV - Page View; incremented every time a user views a page
- IP - independent IP addresses; each IP is counted only once per day
- UV - Unique Visitor; each user is counted only once per day
- Redis is a NoSQL storage engine based on key-value pairs
Redis application scenarios
- Counters
- Top-N rankings (Weibo hot-search list, trending topics, Douyin live-stream rankings, Taobao e-commerce leaderboards)
- Deduplicated counting
- Storing rules for real-time systems
- Data with a fixed expiration time (e.g. SMS verification codes)
- Caching (protects the database from being overwhelmed under high concurrency)
Redis features
- Very fast: a single instance can handle over 100,000 (10W) concurrent reads/writes per second (Kafka is faster still, roughly 800,000-1,500,000)
- Supports a rich variety of data structures, making operations very flexible
- string
- list
- set
- hash
- zset
- ...
Redis stand-alone environment installation
Redis installation for Linux
- Download the Redis source package
- Because the Redis source code has to be compiled, a C compiler must be installed first
- tcl also needs to be installed (online, e.g. via yum); it is a scripting tool that controls the execution order of commands on Linux (for example, run script 1 first and then script 2), and Redis's test suite depends on it
- Use the make command to compile and install Redis (make plays a role similar to Maven in Java: it drives compilation and packaging of the whole project); C/C++ projects usually ship with a Makefile
- C program compilation process:
- First, each C source file is compiled into a .o file (object code), i.e. the binary instructions the operating system can execute
- Then the object code is linked against the C libraries to form the final executable
- After compilation, a test suite can be run to make sure the program executes correctly on this operating system
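A minimal shell sketch of the steps above. The Redis version and install prefix are assumptions, not from the notes; adjust them for your environment.

```bash
# Install the build dependencies: gcc (C compiler) and tcl (needed by `make test`)
yum install -y gcc tcl

# Download and unpack the Redis source package (version number is only an example)
wget https://download.redis.io/releases/redis-6.2.6.tar.gz
tar -zxvf redis-6.2.6.tar.gz
cd redis-6.2.6

# Compile the sources (driven by the project's Makefile)
make

# Optionally run the test suite (this is what requires tcl)
make test

# Install the compiled binaries (PREFIX is an assumed install location)
make PREFIX=/usr/local/redis install
```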
Note:
- In production, do not use kill -9 to shut down Redis; use `redis-cli -h <hostname> -p <port> shutdown` instead
- If you kill -9 the process directly, some Redis data may be lost
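For example, a graceful shutdown of the instance used in the examples below (hostname and port taken from those examples):

```bash
# Graceful shutdown: Redis persists pending data before the process exits
redis-cli -h node1.itcast.cn -p 6379 shutdown
```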
Redis data types
Operations on strings
Note:
- For accumulator operations, never implement the increment with GET followed by SET (that is not atomic)
- Use INCR/DECR/INCRBY instead
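A short redis-cli sketch of the basic string commands and the atomic counter commands mentioned above (the key name is only an example):

```
# Set and read a string value
SET pv 0
GET pv

# Atomic counter operations - safe under concurrent access
INCR pv          # pv = 1
INCRBY pv 100    # pv = 101
DECR pv          # pv = 100
```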
Operations on hash/list/set/zset
```
# 3. Operating on the list type
# LPUSH inserts data at the head of the list
node1.itcast.cn:6379> LPUSH list 1 2 3 4
(integer) 4
# LRANGE returns the elements in the given range (0 -1 means all elements)
node1.itcast.cn:6379> LRANGE list 0 -1

# 4. Operating on the SET type
# 4.1 Add elements
SADD set_test 1
SADD set_test 1 2 3 4
# SMEMBERS key   # returns all members of the set
# SCARD key      # returns the number of members of the set
# 4.2 Get all elements
SMEMBERS set_test
# 4.3 Get the number of elements
SCARD set_test
# Use the SET structure to store the UVs of a website
SADD uv:2020-01-01 001 002 003
SCARD uv:2020-01-01

# 5. Operating on keys
# 5.1 Delete a key and its data structure
DEL list
# 5.2 Test whether a key exists
EXISTS set_test
# Returns 1 if the key exists, 0 if it does not
# node1.itcast.cn:6379> EXISTS set_test
# (integer) 1
# node1.itcast.cn:6379> EXISTS set_test1
# (integer) 0

# 6. Operating on ZSET (sorted set)
# 6.1 Add page PV values to a ZSET
ZADD pv 100 page1.html 200 page2.html 300 page3.html
# 6.2 How many pages are there?
# ZCARD key
ZCARD pv
# 6.3 Add 10 to the PV value of page1.html
# ZINCRBY key increment member
ZINCRBY pv 10 page1.html
# 6.4 Create two ZSETs holding PV values and store their intersection
ZADD pv_zset1 10 page1.html 20 page2.html
ZADD pv_zset2 5 page1.html 10 page2.html
ZINTERSTORE pv_zset_result 2 pv_zset1 pv_zset2
# 6.7 Get all members of the ZSET
# ZRANGE key start stop [WITHSCORES]
ZRANGE pv_zset_result 0 -1 WITHSCORES
# 6.8 Find the rank of page1.html by page PV (smallest first)
# By default members are ranked in ascending order: 0, 1, 2, 3...
# ZRANK key member
ZRANK pv_zset_result page1.html
# 6.9 Find the rank of page1.html by page PV (largest first)
# ZREVRANK key member
# Note: this is very efficient - it does not re-sort, it simply reads the ZSET in reverse
ZREVRANK pv_zset_result page1.html

# Note:
# 1. ZRANK ranks from smallest to largest, ZREVRANK from largest to smallest
# 2. Ranks start at 0; 0 means first place
```
Operations on Bitmaps
- The values stored in a Bitmap are 0 or 1
- Think of a Bitmap as a one-dimensional array whose elements are 0 and 1
- Every Bitmap operation takes an offset (the bit position)
```
# 7. Bitmap operations
# 6.10 uid = 0, 5, 11, 15, 19
# Use a Bitmap to record whether each user visited the website
SETBIT unique:users:2020-01-01 0 1
SETBIT unique:users:2020-01-01 5 1
SETBIT unique:users:2020-01-01 11 1
SETBIT unique:users:2020-01-01 15 1
SETBIT unique:users:2020-01-01 19 1
# SETBIT unique:users:2020-01-01 1000000 1

# 6.11 Check whether a given user visited the website
# GETBIT key offset
GETBIT unique:users:2020-01-01 0

# 6.12 Count how many users visited the website on 2020-01-01
# BITCOUNT key [start end]
BITCOUNT unique:users:2020-01-01 0 -1

# 6.13 Compute all users who visited the website on 2020-01-01 or 2020-01-02
# BITOP operation destkey key [key ...]
SETBIT unique:users:2020-01-02 0 1
SETBIT unique:users:2020-01-02 6 1
SETBIT unique:users:2020-01-02 12 1
# OR the two bitmaps together
BITOP or unique:users:or:2020-01-01_02 unique:users:2020-01-01 unique:users:2020-01-02
# Count the result
BITCOUNT unique:users:or:2020-01-01_02 0 -1
```
Operations on the HyperLogLog structure
Application scenarios:
- Used only for deduplicated counting over large amounts of data (e.g. website UV)
Limitations:
- HyperLogLog has a small estimation error (about 0.81%), so it is not suitable for statistics that must be exact; UV counting, which does not need exact accuracy, is a good fit for this structure
- HyperLogLog does not store the individual elements. For UV scenarios, to save space it only keeps the cardinality computed by its algorithm and answers queries from that cardinality
```
# PFADD key element [element ...]
# PFCOUNT key

# Requirement: compute the UV value of a website
PFADD taobao:uv:2020-01-01 1
PFADD taobao:uv:2020-01-01 2
PFADD taobao:uv:2020-01-01 1
# Get the UV value (deduplicated)
PFCOUNT taobao:uv:2020-01-01
```
Redis Java API operations
Connecting and closing the Redis client
- Redis is usually accessed through a Jedis connection pool, which allows connection resources to be reused effectively
```java
@BeforeTest
public void beforeTest() {
    // JedisPoolConfig configuration object
    JedisPoolConfig config = new JedisPoolConfig();
    // Allow at most 10 idle connections
    config.setMaxIdle(10);
    // Keep at least 5 idle connections
    config.setMinIdle(5);
    // Wait at most 3000 milliseconds for a connection
    config.setMaxWaitMillis(3000);
    // Allow at most 50 connections in total
    config.setMaxTotal(50);
    jedisPool = new JedisPool(config, "node1.itcast.cn", 6379);
}
```
Note:
- IDEA's code completion can be misleadingly incomplete here; the JedisPool constructor does accept a port number argument
Operating on string data
- Operating on strings from Java mirrors the corresponding shell (redis-cli) commands
- When operating Redis from Flink / Spark Streaming programs later, remember to call close() after the Redis operations so the connection is returned to the pool
```java
@Test
public void stringTest() {
    // Get a Jedis connection from the pool
    Jedis jedis = jedisPool.getResource();
    // 1. Add a string value: key "pv" holds the PV value, initialized to 0
    jedis.set("pv", "0");
    // 2. Query the value of the key
    System.out.println("pv:" + jedis.get("pv"));
    // 3. Change pv to 1000
    jedis.set("pv", "1000");
    // 4. Atomically increment the integer value by 1
    jedis.incr("pv");
    // 5. Atomically increment the integer value by 1000
    jedis.incrBy("pv", 1000);
    System.out.println(jedis.get("pv"));
    // Return the Jedis object to the connection pool
    jedis.close();
}
```
Operating on hash data
Note:
- When we operate Redis from Java in Flink and Spark Streaming stream-processing programs later, numeric accumulation will be involved
- Be sure to use incr and hincrBy for it
```java
@Test
public void hashTest() {
    // Get a Jedis connection from the pool
    Jedis jedis = jedisPool.getResource();
    // 1. Add the following inventory items to the hash structure
    //    a) iphone11 => 10000
    //    b) macbookpro => 9000
    jedis.hset("goods", "iphone11", "10000");
    jedis.hset("goods", "macbookpro", "9000");
    // 2. Get all items in the hash
    Set<String> goodSet = jedis.hkeys("goods");
    System.out.println("All items:");
    for (String good : goodSet) {
        System.out.println(good);
    }
    // 3. Add 3000 to the macbookpro inventory
    // String storeMacBook = jedis.hget("goods", "macbookpro");
    // long longStore = Long.parseLong(storeMacBook);
    // long addStore = longStore + 3000;
    // jedis.hset("goods", "macbookpro", addStore + "");
    jedis.hincrBy("goods", "macbookpro", 3000);
    // 4. Delete the whole hash
    // jedis.del("goods");
    jedis.close();
}
```
Operating on list data
- A list can store duplicate elements and preserves insertion order
- Get all elements with lrange(key, 0, -1)
```java
@Test
public void listTest() {
    // Get a Jedis connection from the pool
    Jedis jedis = jedisPool.getResource();
    // 1. Push the following three phone numbers onto the left of the list:
    //    18511310001, 18912301231, 18123123312
    jedis.lpush("tel_list", "18511310001", "18912301231", "18123123312");
    // 2. Pop one phone number from the right
    jedis.rpop("tel_list");
    // 3. Get all values of the list
    List<String> telList = jedis.lrange("tel_list", 0, -1);
    for (String tel : telList) {
        System.out.println(tel);
    }
    jedis.close();
}
```
Operating on set data
- Computing UV is essentially a deduplication problem
- A set can be used for any business scenario that requires efficient deduplication
```java
@Test
public void setTest() {
    // Get a Jedis connection from the pool
    Jedis jedis = jedisPool.getResource();
    // Computing UV means counting distinct (non-duplicate) visitors
    // 1. Record the UV of page1 in a set: user1 visits the page once
    jedis.sadd("uv", "user1");
    // jedis.sadd("uv", "user3");
    // jedis.sadd("uv", "user1");
    // 2. user2 visits the page once
    jedis.sadd("uv", "user2");
    // 3. user1 visits the page again
    jedis.sadd("uv", "user1");
    // 4. Finally, get the UV value of page1
    System.out.println("uv:" + jedis.scard("uv"));
    jedis.close();
}
```
Redis persistence
RDB persistence scheme
- RDB is a full backup: it snapshots all of the data in Redis into an RDB file
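A sketch of how RDB snapshots are typically triggered, based on the long-standing redis.conf defaults plus the manual BGSAVE command (treat the exact values as assumptions and check your own config):

```
# redis.conf snapshot rules: save <seconds> <changes>
# take a snapshot if at least <changes> writes happen within <seconds>
save 900 1
save 300 10
save 60 10000

# Trigger a snapshot manually from redis-cli; a forked child process writes dump.rdb in the background
BGSAVE
```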
AOF persistence scheme
- AOF saves every Redis write operation, in the form of the command itself, into a file named appendonly.aof
- When Redis restarts, the commands in appendonly.aof are replayed to rebuild the data (similar to how the HDFS NameNode recovers metadata from FsImage and edit logs)
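A sketch of the AOF-related settings, based on common redis.conf defaults (the exact values are assumptions; verify against your own config):

```
# Enable AOF persistence (disabled by default)
appendonly yes
# File that write commands are appended to
appendfilename "appendonly.aof"
# How often the AOF buffer is fsynced to disk: always / everysec / no
appendfsync everysec

# Compact the AOF file in the background from redis-cli
BGREWRITEAOF
```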
Comparing the two:
- RDB recovery is more efficient because it restores the latest data directly. However, when Redis holds a large amount of data, snapshotting also has a performance cost, since an extra process is forked to persist the data; therefore a sensible snapshot strategy must be configured (Redis ships with recommended defaults)
- AOF recovery is less efficient than RDB because all historical write operations have to be replayed, and every write is logged as an instruction when it happens. However, AOF should be considered when data-safety requirements are high.
Redis advanced usage
Redis transactions
- Redis transactions are controlled with the following commands
- MULTI
- EXEC
- DISCARD
- The only "rollback" in Redis transactions is that if a command with a syntax error is queued after the transaction starts, Redis detects it and discards the whole transaction
- However, if a problem only appears at runtime, for example a type mismatch, the transaction is not rolled back
- In that sense Redis transactions do not guarantee isolation and atomicity the way a relational database does
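A small redis-cli sketch of these commands, showing the runtime-error case that is not rolled back:

```
MULTI
SET counter 10
INCR counter
LPUSH counter x   # wrong type for this key - the error only appears when EXEC runs
EXEC
# SET and INCR still took effect; only the LPUSH step reports an error
GET counter       # "11"
# DISCARD (issued instead of EXEC) would abort and drop all queued commands
```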
Redis expiration policies
- How keys with an expiration time set are removed:
- Timed deletion: every expiring key gets its own timer. CPU-unfriendly, because if many keys have expirations, each needs a timer; but memory-friendly, since the memory is released the moment a key expires.
- Lazy deletion: no regular scanning; a key is removed only when a client accesses it and it turns out to be expired. If no client ever accesses it, it is not removed. CPU-friendly but memory-unfriendly (many expired keys may linger for quite a while).
- Periodic deletion: a compromise; Redis periodically scans keys that have expirations and removes the ones that have expired.
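A short redis-cli sketch of setting and checking expirations (the key name is only an example, echoing the SMS verification-code scenario above):

```
# Store an SMS verification code that expires after 60 seconds
SET sms:code:13800000000 123456
EXPIRE sms:code:13800000000 60
# Or set the value and the TTL in a single command
SETEX sms:code:13800000000 60 123456

# Check how long the key has left to live (-2 means it no longer exists)
TTL sms:code:13800000000
```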
Memory eviction strategy
- When Redis runs low on memory, it frees space according to the configured eviction policy
- LRU: the least recently accessed keys are evicted first
- Recommended policy: allkeys-lru
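A hedged sketch of the relevant configuration (the memory limit is an assumption; set it for your own environment):

```
# redis.conf - cap Redis memory usage and choose the eviction policy
maxmemory 2gb
maxmemory-policy allkeys-lru

# The policy can also be changed at runtime from redis-cli
CONFIG SET maxmemory-policy allkeys-lru
```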