When building the recommendation feature of a short video system, the design should avoid recommending videos that a user has already watched. To do this, the viewing records must be saved. They can be stored in a database table or in a Redis set; when fetching recommendations, the watched video ids are excluded first, and the remaining videos are then queried from the database and returned to the client.
First, consider saving the viewing records in the database. If the table is not sharded by user id, the single table grows very large and queries become very slow. Even if it is sharded by user id, the viewing records can reach tens of millions as the number of users and videos grows, so queries against the sharded tables are still slow and cannot meet the requirement for fast lookups.
The second option is the Redis cache: store each user's watched video ids in a Set whose key is a prefix plus the user id. This scheme is feasible, but 1 GB of memory can hold only tens of millions of records, so it is not an efficient, low-memory solution.
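To put the memory concern in numbers, the textbook Bloom filter sizing formula can be evaluated for a comparable volume of records. The sketch below is only arithmetic: the function names, the 10 million record count and the 1% error rate are illustrative assumptions, and the memory a real Redis module allocates may differ from the theoretical figure.

```php
<?php
// Textbook sizing: bits = -n * ln(p) / (ln 2)^2, hashes = (bits / n) * ln 2.
// Assumption: these are theoretical figures, not measurements of any Redis module.

/** Number of bits needed to hold $capacity items at the given false positive rate. */
function bloomFilterBits(int $capacity, float $errorRate): int
{
    return (int) ceil(-$capacity * log($errorRate) / (M_LN2 ** 2));
}

/** Number of hash functions that minimises the false positive rate. */
function optimalHashCount(int $bits, int $capacity): int
{
    return max(1, (int) round($bits / $capacity * M_LN2));
}

$capacity  = 10000000; // hypothetical: 10 million viewing records in total
$errorRate = 0.01;     // hypothetical: 1% of unseen videos wrongly treated as seen

$bits = bloomFilterBits($capacity, $errorRate);
printf("bits per record: %.2f\n", $bits / $capacity);
printf("total size: %.1f MiB\n", $bits / 8 / 1024 / 1024);
printf("hash functions: %d\n", optimalHashCount($bits, $capacity));
```

With these illustrative inputs the formula gives roughly 9.6 bits per record, about 11.4 MiB in total, which is far below the 1 GB quoted for the Set approach. The price is a small share of false positives: a few unseen videos will be skipped as if they had been watched.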
Next, we show how to use the RedisBloom module (https://github.com/RedisLabsModules/redisbloom), which provides a Bloom filter, together with PHP and Redis to implement recommendation de-duplication.
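Before the setup steps, a small pure-PHP sketch may help show what a Bloom filter does. It is not how RedisBloom is implemented: the class name `TinyBloomFilter`, the MD5-based double hashing and the sizes are assumptions chosen only for illustration, and 64-bit PHP is assumed.

```php
<?php
// Illustrative only: RedisBloom uses its own hashing and storage layout.
final class TinyBloomFilter
{
    private $bits;      // bit array packed into a binary string
    private $size;      // number of bits
    private $hashCount; // number of bit positions set per item

    public function __construct(int $size, int $hashCount)
    {
        $this->size = $size;
        $this->hashCount = $hashCount;
        $this->bits = str_repeat("\0", intdiv($size + 7, 8));
    }

    /** Derive the bit positions for an item via double hashing: h1 + i * h2. */
    private function positions(string $item): array
    {
        $parts = unpack('N2', md5($item, true)); // two unsigned 32-bit integers
        $h1 = $parts[1];
        $h2 = $parts[2] | 1; // force an odd step
        $out = [];
        for ($i = 0; $i < $this->hashCount; $i++) {
            $out[] = ($h1 + $i * $h2) % $this->size;
        }
        return $out;
    }

    public function add(string $item): void
    {
        foreach ($this->positions($item) as $p) {
            $byte = intdiv($p, 8);
            $this->bits[$byte] = chr(ord($this->bits[$byte]) | (1 << ($p % 8)));
        }
    }

    /** false means "definitely not added"; true means "probably added". */
    public function mightContain(string $item): bool
    {
        foreach ($this->positions($item) as $p) {
            if ((ord($this->bits[intdiv($p, 8)]) & (1 << ($p % 8))) === 0) {
                return false;
            }
        }
        return true;
    }
}

$seen = new TinyBloomFilter(1024, 3);
foreach (['101', '102', '103'] as $vid) {
    $seen->add($vid);
}
var_dump($seen->mightContain('102')); // bool(true): added items are never missed
```

The property that matters for de-duplication is that an id that was added is never reported as missing. An id that was never added is usually reported as missing, but it can occasionally be reported as present.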
1, First, download the source code and compile it to generate rebloom.so (newer releases name the file redisbloom.so)
2, Load the module with redis-server: redis-server --loadmodule /path/to/rebloom.so
3, Test whether the module is loaded: BF.ADD newFilter foo should return (integer) 1
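The test in step 3 lets BF.ADD create the filter with the module's default settings. If a filter should be sized explicitly, RedisBloom also provides BF.RESERVE key error_rate capacity. A minimal sketch follows, assuming a phpredis-style client that exposes exists() and rawCommand(); the helper name `reserveUserFilter` and the default values are hypothetical and not part of the original code.

```php
<?php
/**
 * Hypothetical helper: create a user's filter with an explicit error rate and capacity.
 * Assumption: $redis is a phpredis-style client exposing exists() and rawCommand().
 */
function reserveUserFilter($redis, int $uid, float $errorRate = 0.01, int $capacity = 10000)
{
    $key = 'video_filter_' . $uid; // same key scheme as the code in this article
    if ($redis->exists($key)) {
        return false; // BF.RESERVE returns an error when the key already exists
    }
    // Arguments are sent as strings: BF.RESERVE key error_rate capacity
    return $redis->rawCommand('BF.RESERVE', $key, (string) $errorRate, (string) $capacity);
}
```

It would have to run before the first bf.add for a user, for example when the account is created.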
4, Import the video ids from the database into the Redis cache with PHP
/**
 * Insert all video IDs into the cache.
 */
function addAllVideoIdCache($videos)
{
    $key = 'vids';
    // Rebuild the sorted set from scratch
    DI()->redis->del($key);
    foreach ($videos as $v) {
        // score = add time, member = video id
        DI()->redis->zAdd($key, $v['addtime'], $v['id']);
    }
    // Return the number of video ids now in the cache
    return DI()->redis->zCount($key, '0', '+inf');
}
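The bulk import covers the initial load only; the cached list also has to follow videos that are published or deleted afterwards. A minimal sketch of two such helpers is shown here, assuming a phpredis-style client is passed in. The function names `addVideoToCache` and `removeVideoFromCache` are hypothetical.

```php
<?php
// Assumption: $redis is a phpredis-style client exposing zAdd() and zRem().

/** Add one newly published video to the cached list (score = add time). */
function addVideoToCache($redis, int $vid, int $addtime): int
{
    return (int) $redis->zAdd('vids', $addtime, (string) $vid);
}

/** Remove one deleted video from the cached list. */
function removeVideoFromCache($redis, int $vid): int
{
    return (int) $redis->zRem('vids', (string) $vid);
}
```

Both return the number of members Redis reports as added or removed, so a return value of 0 suggests the cache was already in the expected state.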
5, Add a viewing record
/**
 * Mark a video as watched by the user
 * @desc Mark a video as watched by the user
 * @param int $uid User id
 * @param int $vid Video id
 * @return int 1 if the id was newly added, 0 if it may already have been recorded
 */
protected function setUserViewCache($uid, $vid)
{
    $key = 'video_filter_' . $uid;
    // BF.ADD creates the filter on first use
    return DI()->redis->rawCommand('bf.add', $key, $vid);
}
6, Get the list of video ids that the user has not watched
/**
 * Get the id list of videos not viewed by the user
 * @desc Get the id list of videos not viewed by the user
 * @param int $uid User id
 * @param int $count Number of ids to return
 * @return int code Operation code: 0 on success, 1001 if the video id cache is empty
 * @return array info
 * @return array info.vids Id list of videos not viewed
 * @return string msg Prompt message
 */
protected function getUserVidsCache($uid, $count)
{
    $rs = array('code' => 0, 'msg' => '', 'info' => array());
    $video_filter_key = 'video_filter_' . $uid;
    $max_count = $count;
    //$config = $this->getConfigPri();
    //$lua_script_sha = $config['lua_script_sha'];
    $lua_script_sha = '62c0f5411ea862bfd4d97fc383d2bb0d6a06289b'; // SHA1 of the Lua script
    $vids_list_key = 'vids';
    $vids_count = DI()->redis->zCount($vids_list_key, '-inf', '+inf');
    if ($vids_count == 0) {
        $rs['code'] = 1001;
        $rs['msg'] = 'Cache: the video id list is empty';
        return $rs; // nothing to filter, so skip the script call
    }
    // Both values are passed to the script as KEYS (numKeys = 2)
    $args = array($video_filter_key, $max_count);
    $res = DI()->redis->evalSha($lua_script_sha, $args, 2);
    $rs['info']['vids'] = $res;
    return $rs;
}
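The Lua script behind the hard-coded SHA is not shown in the article. The sketch below is one possible shape for such a script, not the author's: it walks the 'vids' sorted set from newest to oldest and keeps the ids for which BF.EXISTS returns 0. Unlike the call above, it passes the two key names as KEYS and the count as ARGV[1]. The constant `UNSEEN_VIDS_LUA` and the functions `loadUnseenScript` and `fetchUnseenVids` are hypothetical names, and a phpredis-style client is assumed.

```php
<?php
// Assumption: one possible script; the article's own script is not shown.
const UNSEEN_VIDS_LUA = <<<'LUA'
local filter_key = KEYS[1]
local vids_key = KEYS[2]
local want = tonumber(ARGV[1])
local page = 100
local start = 0
local out = {}
while #out < want do
  -- newest videos first, one page at a time
  local vids = redis.call('ZREVRANGE', vids_key, start, start + page - 1)
  if #vids == 0 then break end
  for _, vid in ipairs(vids) do
    -- 0 means the id was definitely never added to the user's filter
    if redis.call('BF.EXISTS', filter_key, vid) == 0 then
      out[#out + 1] = vid
      if #out >= want then break end
    end
  end
  start = start + page
end
return out
LUA;

/** Register the script with SCRIPT LOAD and return the SHA1 that Redis reports. */
function loadUnseenScript($redis)
{
    return $redis->script('load', UNSEEN_VIDS_LUA);
}

/** Return up to $count video ids the user has probably not watched. */
function fetchUnseenVids($redis, string $sha, int $uid, int $count): array
{
    $keys = ['video_filter_' . $uid, 'vids'];
    $res = $redis->evalSha($sha, array_merge($keys, [$count]), count($keys));
    return is_array($res) ? $res : [];
}
```

Redis identifies a cached script by the SHA1 digest of its source, so the value returned by SCRIPT LOAD can be kept in configuration, much like the commented-out getConfigPri() lines suggest. A user who has watched nearly everything makes the loop scan the whole list, so a production script would probably cap the number of pages it reads.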
That is the whole approach: add each viewing record to the Redis Bloom filter, then use the filter to pick out the videos the user has not watched, which achieves de-duplicated short video recommendation. A real implementation still needs to maintain the video list in the cache as video ids are added and deleted, and to save the users' viewing records in the database so that they can be imported again after the server restarts. Alternatively, Redis backups or the high availability of a cluster can be used to restore the earlier viewing data.
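The re-import after a restart can be sketched with BF.MADD, which adds several items to a filter in one round trip. This is a minimal sketch under assumptions: `restoreUserFilter` is a hypothetical name, the ids are assumed to have been read from the database already, and a phpredis-style client with a variadic rawCommand() is assumed.

```php
<?php
/**
 * Hypothetical helper: rebuild one user's filter from stored viewing records.
 * Assumption: $vids was already loaded from the database by the caller.
 *
 * @return int Number of ids reported as newly added
 */
function restoreUserFilter($redis, int $uid, array $vids, int $batchSize = 500): int
{
    $key = 'video_filter_' . $uid; // same key scheme as setUserViewCache()
    $added = 0;
    foreach (array_chunk($vids, $batchSize) as $chunk) {
        // BF.MADD key item [item ...] replies with one flag per item
        $replies = $redis->rawCommand('BF.MADD', $key, ...array_map('strval', $chunk));
        if (is_array($replies)) {
            $added += count(array_filter($replies));
        }
    }
    return $added;
}
```

Sending the ids in batches keeps each command small. The batch size of 500 is an arbitrary illustrative value.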
In short, the core of recommendation de-duplication in a short video system is to use the Bloom filter algorithm to achieve high efficiency with low memory. Short video recommendation may also involve paging, blacklisted videos or users, recommendation weights, recent videos and other business scenarios. Readers can study the relevant cache implementations to meet those requirements, although in a PHP environment this still takes considerable work. The author believes that de-duplication with Redis's Bloom filter module is a good scheme and offers it here as a starting point. I hope you will join the discussion and help find an even more suitable one.