Data migration

Objectives

- Be able to describe the project's data migration scheme
- Understand the characteristics of HBase
- Be familiar with data packing and transformation during migration
- Be able to complete the full and incremental migration of article data
- Be able to complete the migration of hot article data
1 Why automatic synchronization is needed
MySQL stores both the crawled and the self-built article data. The crawled volume is large, and keeping all of it in MySQL degrades MySQL's performance. We also need to stream the data and run various statistics over it, which MySQL cannot satisfy. We therefore synchronize the full data set from MySQL to HBase, which is well suited to holding large volumes of data, and periodically delete the full data from MySQL.
HBase stores the full data set. The big-data side computes the hot data and synchronizes it back to MySQL and MongoDB: MySQL stores the relational (subject) data, while MongoDB stores the detailed content.
Hot data also expires: what is hot today may not be hot tomorrow, so hot data must be deleted periodically as well. We regularly remove hot data older than one month so that only the current month's hot data is kept.
2. Migration scheme
2.1 Requirement analysis
2.1.1 Functional requirements
Building on the large full data set, the hot data produced by real-time computation needs to be saved. Since keeping a large volume of article data in MySQL hurts MySQL's performance, MySQL + MongoDB is used for storage.
2.1.2 Full data migration scheme
A scheduled task synchronizes the crawled and self-built articles from MySQL to HBase and marks the synchronized rows as synchronized, so they are not synchronized again on the next run.
For background on HBase, see the HBase materials in the Resources folder.
2.1.3 Hot data migration scheme
HBase holds the full data set. The big-data side computes the hot data, which must then be synchronized to MySQL and MongoDB for page display.
2.2 Design ideas

- Periodically read the full data from the MySQL database, pack the related objects into one object, and save it in HBase. After a successful save, update the row's status in the database to synchronized so that it is not synchronized again (see the scheduled-task sketch after this list).
- Use Kafka to listen for the hot-data computation results. When a hot-data message arrives, fetch the packed data from HBase, split it apart, and save the relational data in MySQL and the detailed data in MongoDB.
- Because hot data expires, periodically clear the expired data from MySQL and MongoDB.
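A minimal sketch of the scheduled entry point for the first idea, assuming Spring's `@Scheduled` support and the `ArticleQuantityService.dbToHbase()` method introduced in section 7.6 (the class name and cron expression are illustrative, not part of the project):

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

/**
 * Illustrative scheduled task: periodically pushes the rows that are not
 * yet synchronized from MySQL into HBase.
 * Requires @EnableScheduling on a configuration class.
 */
@Component
public class MigrationScheduleSketch {

    @Autowired
    private ArticleQuantityService articleQuantityService;

    // Every 10 minutes; the interval is an assumption
    @Scheduled(cron = "0 0/10 * * * ?")
    public void syncDbToHbase() {
        // Reads the unsynchronized articles, writes them to HBase,
        // and flips sync_status via the HBaseInvok callback
        articleQuantityService.dbToHbase();
    }
}
```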
2.3 Issues to note in data synchronization

- HBase data is queried mainly by rowkey. Here the rowkey design uses the MySQL primary key id as the rowkey, so a query fetches the data directly by rowkey (see the row-layout sketch after this list).
- The data synchronized to HBase comes from several tables: one logical record is composed of several objects. When storing, a separate column family is used for each object so that the different objects' fields are kept apart.
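A minimal sketch of what one row looks like under this design, written against the `HBaseClent.putData` method from section 4.1.3 (the table name constant is the project's; the package names and field values are illustrative assumptions):

```java
// One article (MySQL id 1001) becomes one HBase row:
//   rowkey        = "1001" (the MySQL primary key)
//   column family = the stored object's class name, one family per object
//   columns       = that object's field names
String rowKey = "1001";

// ApArticle fields go into the ApArticle family
hBaseClent.putData(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, rowKey,
        "com.heima.model.article.pojos.ApArticle", // assumed fully qualified class name
        new String[]{"id", "title"}, new String[]{"1001", "hello"});

// ApAuthor fields live in their own family on the same row
hBaseClent.putData(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, rowKey,
        "com.heima.model.article.pojos.ApAuthor",
        new String[]{"id", "name"}, new String[]{"7", "zhangsan"});
```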
3. Integrate HBase and MongoDB into the project
The integration lives in the heima-leadnews-common module.
hbase.properties
```properties
hbase.zookeeper.quorum=172.16.1.52:2181
hbase.client.keyvalue.maxsize=500000
```
mongo.properties
```properties
# mongoDB connection
mongo.host=47.94.7.85
mongo.port=27017
mongo.dbname=mongoDB
```
pom.xml
```xml
<!--mongoDB-->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb</artifactId>
</dependency>
<!--HBase-->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>2.1.5</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </exclusion>
        <exclusion>
            <artifactId>slf4j-log4j12</artifactId>
            <groupId>org.slf4j</groupId>
        </exclusion>
    </exclusions>
</dependency>
```
Hosts file configuration
Configure the domain name in the server's hosts file, changing the address to match your own server:

```
172.16.1.52 itcast
```
4 Introduction to common components
4.1 HBase related operations
The HBase utility class saves data to HBase and provides methods for storing and deleting.
4.1.1 Project import
Import the heima-leadnews-migration project from the data folder.
4.1.2 Public storage classes
(1)StorageData
A record in the common storage structure is made up of multiple StorageEntry key-value pairs.
StorageData is the most important storage object: it holds one bean's field information and is responsible for converting a bean into this structure and back again.
The class relies on two important helpers, the ReflectUtils reflection utility and the DataConvertUtils type converter (mainly used for date conversion); both also live in the heima-leadnews-common module.
Code location: com.heima.common.common.storage.StorageData
```java
/**
 * Store Data
 */
@Setter
@Getter
@ToString
public class StorageData {

    /**
     * Target class name
     */
    private String targetClassName;

    /**
     * List of stored fields
     */
    private List<StorageEntry> entryList = new ArrayList<StorageEntry>();

    /**
     * Add an entry
     */
    public void addStorageEntry(StorageEntry entry) {
        entryList.add(entry);
    }

    /**
     * Add an entry by key and value
     */
    public void addStorageEntry(String key, String value) {
        entryList.add(new StorageEntry(key, value));
    }

    /**
     * Add entries from a Map
     */
    public void putHBaseEntry(Map<String, String> map) {
        if (null != map && !map.isEmpty()) {
            map.forEach((k, v) -> addStorageEntry(new StorageEntry(k, v)));
        }
    }

    /**
     * Get all column names as an array
     */
    public String[] getColumns() {
        List<String> columnList = entryList.stream().map(StorageEntry::getKey).collect(Collectors.toList());
        if (null != columnList && !columnList.isEmpty()) {
            return columnList.toArray(new String[columnList.size()]);
        }
        return null;
    }

    /**
     * Get all values as an array
     */
    public String[] getValues() {
        List<String> valueList = entryList.stream().map(StorageEntry::getValue).collect(Collectors.toList());
        if (null != valueList && !valueList.isEmpty()) {
            return valueList.toArray(new String[valueList.size()]);
        }
        return null;
    }

    /**
     * Get the entries as a Map
     */
    public Map<String, Object> getMap() {
        Map<String, Object> entryMap = new HashMap<String, Object>();
        entryList.forEach(entry -> entryMap.put(entry.getKey(), entry.getValue()));
        return entryMap;
    }

    /**
     * Convert the current StorageData into a concrete object
     */
    public Object getObjectValue() {
        Object bean = null;
        if (StringUtils.isNotEmpty(targetClassName) && null != entryList && !entryList.isEmpty()) {
            bean = ReflectUtils.getClassForBean(targetClassName);
            if (null != bean) {
                for (StorageEntry entry : entryList) {
                    Object value = DataConvertUtils.convert(entry.getValue(), ReflectUtils.getFieldAnnotations(bean, entry.getKey()));
                    ReflectUtils.setPropertie(bean, entry.getKey(), value);
                }
            }
        }
        return bean;
    }

    /**
     * Convert a Bean into a StorageData
     */
    public static StorageData getStorageData(Object bean) {
        StorageData hbaseData = null;
        if (null != bean) {
            hbaseData = new StorageData();
            hbaseData.setTargetClassName(bean.getClass().getName());
            hbaseData.setEntryList(getStorageEntryList(bean));
        }
        return hbaseData;
    }

    /**
     * Build the entry list from a bean
     */
    private static List<StorageEntry> getStorageEntryList(Object bean) {
        PropertyDescriptor[] propertyDescriptorArray = ReflectUtils.getPropertyDescriptorArray(bean);
        return Arrays.asList(propertyDescriptorArray).stream().map(propertyDescriptor -> {
            String key = propertyDescriptor.getName();
            Object value = ReflectUtils.getPropertyDescriptorValue(bean, propertyDescriptor);
            value = DataConvertUtils.unConvert(value, ReflectUtils.getFieldAnnotations(bean, propertyDescriptor));
            return new StorageEntry(key, DataConvertUtils.toString(value));
        }).collect(Collectors.toList());
    }
}
```
Main methods:

- Add a StorageEntry:

```java
public void addStorageEntry(StorageEntry entry)
```

This method has several overloads that add a StorageEntry object to the entry list.

- Get the corresponding object back:

```java
public Object getObjectValue()
```

This method converts the stored entry data back into a bean, using the ReflectUtils reflection helper.

- Convert a bean into the StorageData storage structure:

```java
public static StorageData getStorageData(Object bean)
```

This method converts arbitrary beans into one uniform storage structure for persistence. A short usage sketch follows.
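A minimal round-trip sketch of these methods, assuming a simple ApAuthor bean with Lombok getters/setters (the field values are illustrative):

```java
// Pack a bean into the generic storage structure
ApAuthor author = new ApAuthor();
author.setId(7);
author.setName("zhangsan");
StorageData data = StorageData.getStorageData(author);

// targetClassName remembers which class to rebuild;
// the entries hold the field name/value pairs
System.out.println(data.getTargetClassName()); // com.heima...ApAuthor
System.out.println(data.getMap());             // {id=7, name=zhangsan}

// Rebuild a typed bean from the stored entries via reflection
ApAuthor restored = (ApAuthor) data.getObjectValue();
```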
(2)StorageEntity
The common storage entity; it aggregates the StorageData records of one logical row.
Code location: com.heima.common.common.storage.StorageEntity
```java
/**
 * A stored entity.
 * One entity can hold multiple StorageData records.
 */
@Setter
@Getter
public class StorageEntity {

    /**
     * List of stored data
     */
    private List<StorageData> dataList = new ArrayList<StorageData>();

    /**
     * Add one StorageData
     */
    public void addStorageData(StorageData storageData) {
        dataList.add(storageData);
    }
}
```
(3)StorageEntry
One key-value field of the common storage object.
Code location: com.heima.common.common.storage.StorageEntry
```java
/**
 * Store Entry.
 * A k-v structure that holds one field name and field value of an object.
 */
@Setter
@Getter
public class StorageEntry {

    /**
     * No-arg constructor
     */
    public StorageEntry() {
    }

    /**
     * Constructor
     */
    public StorageEntry(String key, String value) {
        this.key = key;
        this.value = value;
    }

    /**
     * Field key
     */
    private String key;

    /**
     * Field value
     */
    private String value;
}
```
4.1.3 Tools related to HBase operations
(1)HBaseConstants class
Constants class holding the HBase table name.
Code location: com.heima.hbase.constants.HBaseConstants
```java
public class HBaseConstants {

    public static final String APARTICLE_QUANTITY_TABLE_NAME = "APARTICLE_QUANTITY_TABLE_NAME";
}
```
(2)HBaseInvok
The callback interface for HBase operations.
Code location: com.heima.hbase.entity.HBaseInvok
```java
/**
 * Callback interface for HBase.
 * Invoked after an HBase operation completes.
 */
public interface HBaseInvok {

    /**
     * Callback method
     */
    public void invok();
}
```
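Because HBaseInvok has a single abstract method, a callback can be supplied as a lambda; a minimal sketch (the action in the body is illustrative):

```java
// Runs after the row has been stored successfully,
// e.g. to flip the article's sync flag
HBaseInvok invok = () -> System.out.println("row stored, mark as synchronized");
invok.invok();
```

In the business layer this role is filled by ArticleHBaseInvok (section 7.1), which carries the ApArticle to hand back to the callback.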
(3)HBaseStorage
The HBase storage object; it extends StorageEntity.
Code location: com.heima.hbase.entity.HBaseStorage
```java
/**
 * HBase storage object; extends StorageEntity.
 * Used to store various objects under one row.
 */
@Setter
@Getter
public class HBaseStorage extends StorageEntity {

    /**
     * Primary key (rowkey)
     */
    private String rowKey;

    /**
     * HBase callback interface, used to pass in the callback method
     */
    private HBaseInvok hBaseInvok;

    /**
     * Get the column family names (one per stored class)
     */
    public List<String> getColumnFamily() {
        return getDataList().stream().map(StorageData::getTargetClassName).collect(Collectors.toList());
    }

    /**
     * Run the callback
     */
    public void invok() {
        if (null != hBaseInvok) {
            hBaseInvok.invok();
        }
    }
}
```
(4)HBaseClent
Utility class for HBase client operations (the class name keeps the project's original spelling).
Code location: com.heima.common.hbase.HBaseClent
```java
/**
 * HBase related basic operations
 *
 * @since 1.0.0
 */
@Log4j2
public class HBaseClent {

    /**
     * Static configuration
     */
    private Configuration conf = null;

    private Connection connection = null;

    public HBaseClent(Configuration conf) {
        this.conf = conf;
        try {
            connection = ConnectionFactory.createConnection(conf);
        } catch (IOException e) {
            log.error("obtain HBase connection failed");
        }
    }

    public boolean tableExists(String tableName) {
        if (StringUtils.isNotEmpty(tableName)) {
            Admin admin = null;
            try {
                admin = connection.getAdmin();
                return admin.tableExists(TableName.valueOf(tableName));
            } catch (Exception e) {
                log.debug("Failed to check whether the table exists", e);
            } finally {
                close(admin, null, null);
            }
        }
        return false;
    }

    /**
     * Create a table
     *
     * @param tableName    table name
     * @param columnFamily column family names
     */
    public boolean creatTable(String tableName, List<String> columnFamily) {
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            List<ColumnFamilyDescriptor> familyDescriptors = new ArrayList<ColumnFamilyDescriptor>(columnFamily.size());
            columnFamily.forEach(cf -> {
                familyDescriptors.add(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes(cf)).build());
            });
            TableDescriptor tableDescriptor = TableDescriptorBuilder.newBuilder(TableName.valueOf(tableName))
                    .setColumnFamilies(familyDescriptors)
                    .build();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                log.debug("table Exists!");
            } else {
                admin.createTable(tableDescriptor);
                log.debug("create table Success!");
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Create table {0} failed", tableName), e);
            return false;
        } finally {
            close(admin, null, null);
        }
        return true;
    }

    /**
     * Create a pre-partitioned table
     *
     * @param tableName    table name
     * @param columnFamily column family names
     * @param splitKeys    pre-split region keys
     * @return whether creation succeeded
     */
    public boolean createTableBySplitKeys(String tableName, List<String> columnFamily, byte[][] splitKeys) {
        Admin admin = null;
        try {
            if (StringUtils.isBlank(tableName) || columnFamily == null || columnFamily.size() == 0) {
                log.error("===Parameters tableName|columnFamily should not be null,Please check!===");
                return false;
            }
            admin = connection.getAdmin();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                return true;
            } else {
                List<ColumnFamilyDescriptor> familyDescriptors = new ArrayList<>(columnFamily.size());
                columnFamily.forEach(cf -> {
                    familyDescriptors.add(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes(cf)).build());
                });
                TableDescriptor tableDescriptor = TableDescriptorBuilder.newBuilder(TableName.valueOf(tableName))
                        .setColumnFamilies(familyDescriptors)
                        .build();
                // Specify the splitKeys
                admin.createTable(tableDescriptor, splitKeys);
                log.info("===Create Table " + tableName + " Success!columnFamily:" + columnFamily.toString() + "===");
            }
        } catch (IOException e) {
            log.error("", e);
            return false;
        } finally {
            close(admin, null, null);
        }
        return true;
    }

    /**
     * Build custom partition splitKeys
     */
    public byte[][] getSplitKeys(String[] keys) {
        if (keys == null) {
            // Default to 10 partitions
            keys = new String[]{"1|", "2|", "3|", "4|", "5|", "6|", "7|", "8|", "9|"};
        }
        byte[][] splitKeys = new byte[keys.length][];
        // Ascending sort
        TreeSet<byte[]> rows = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
        for (String key : keys) {
            rows.add(Bytes.toBytes(key));
        }
        Iterator<byte[]> rowKeyIter = rows.iterator();
        int i = 0;
        while (rowKeyIter.hasNext()) {
            byte[] tempRow = rowKeyIter.next();
            rowKeyIter.remove();
            splitKeys[i] = tempRow;
            i++;
        }
        return splitKeys;
    }

    /**
     * Compute region split keys between startKey and endKey
     */
    public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        BigInteger lowestKey = new BigInteger(startKey, 16);
        BigInteger highestKey = new BigInteger(endKey, 16);
        BigInteger range = highestKey.subtract(lowestKey);
        BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
        lowestKey = lowestKey.add(regionIncrement);
        for (int i = 0; i < numRegions - 1; i++) {
            BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
            byte[] b = String.format("%016x", key).getBytes();
            splits[i] = b;
        }
        return splits;
    }

    /**
     * Get a table handle
     */
    private Table getTable(String tableName) throws IOException {
        return connection.getTable(TableName.valueOf(tableName));
    }

    /**
     * List the names of all tables in the database
     */
    public List<String> getAllTableNames() {
        List<String> result = new ArrayList<>();
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            TableName[] tableNames = admin.listTableNames();
            for (TableName tableName : tableNames) {
                result.add(tableName.getNameAsString());
            }
        } catch (IOException e) {
            log.error("Failed to get the table names of all tables", e);
        } finally {
            close(admin, null, null);
        }
        return result;
    }

    /**
     * List all column family names of a table
     */
    public List<String> getAllFamilyNames(String tableName) {
        ColumnFamilyDescriptor[] familyDescriptorList = null;
        List<String> familyNameList = new ArrayList<String>();
        try {
            Table table = getTable(tableName);
            familyDescriptorList = table.getDescriptor().getColumnFamilies();
        } catch (IOException e) {
            log.error("Failed to get all column families of the table", e);
        } finally {
            // close(admin, null, null);
        }
        if (null != familyDescriptorList && familyDescriptorList.length > 0) {
            familyNameList = Arrays.stream(familyDescriptorList).map(ColumnFamilyDescriptor::getNameAsString).collect(Collectors.toList());
        }
        return familyNameList;
    }

    /**
     * Scan all data in the specified table
     */
    public Map<String, Map<String, String>> getResultScanner(String tableName) {
        Scan scan = new Scan();
        return this.queryData(tableName, scan);
    }

    /**
     * Scan the specified table between startRowKey and stopRowKey
     */
    public Map<String, Map<String, String>> getResultScanner(String tableName, String startRowKey, String stopRowKey) {
        Scan scan = new Scan();
        if (StringUtils.isNoneBlank(startRowKey) && StringUtils.isNoneBlank(stopRowKey)) {
            scan.withStartRow(Bytes.toBytes(startRowKey));
            scan.withStopRow(Bytes.toBytes(stopRowKey));
        }
        return this.queryData(tableName, scan);
    }

    /**
     * Query data through a row-prefix filter
     */
    public Map<String, Map<String, String>> getResultScannerPrefixFilter(String tableName, String prefix) {
        Scan scan = new Scan();
        if (StringUtils.isNoneBlank(prefix)) {
            Filter filter = new PrefixFilter(Bytes.toBytes(prefix));
            scan.setFilter(filter);
        }
        return this.queryData(tableName, scan);
    }

    /**
     * Query data through a column-prefix filter
     */
    public Map<String, Map<String, String>> getResultScannerColumnPrefixFilter(String tableName, String prefix) {
        Scan scan = new Scan();
        if (StringUtils.isNoneBlank(prefix)) {
            Filter filter = new ColumnPrefixFilter(Bytes.toBytes(prefix));
            scan.setFilter(filter);
        }
        return this.queryData(tableName, scan);
    }

    /**
     * Query rows whose row key contains the given keyword
     */
    public Map<String, Map<String, String>> getResultScannerRowFilter(String tableName, String keyword) {
        Scan scan = new Scan();
        if (StringUtils.isNoneBlank(keyword)) {
            Filter filter = new RowFilter(CompareOperator.GREATER_OR_EQUAL, new SubstringComparator(keyword));
            scan.setFilter(filter);
        }
        return this.queryData(tableName, scan);
    }

    /**
     * Query rows whose column names contain the given keyword
     */
    public Map<String, Map<String, String>> getResultScannerQualifierFilter(String tableName, String keyword) {
        Scan scan = new Scan();
        if (StringUtils.isNoneBlank(keyword)) {
            Filter filter = new QualifierFilter(CompareOperator.GREATER_OR_EQUAL, new SubstringComparator(keyword));
            scan.setFilter(filter);
        }
        return this.queryData(tableName, scan);
    }

    /**
     * Query data by table name and scan criteria
     */
    private Map<String, Map<String, String>> queryData(String tableName, Scan scan) {
        // <rowKey, row data>
        Map<String, Map<String, String>> result = new HashMap<>();
        ResultScanner rs = null;
        Table table = null;
        try {
            table = getTable(tableName);
            rs = table.getScanner(scan);
            for (Result r : rs) {
                // One row of data
                Map<String, String> columnMap = new HashMap<>();
                String rowKey = null;
                for (Cell cell : r.listCells()) {
                    if (rowKey == null) {
                        rowKey = Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength());
                    }
                    columnMap.put(
                            Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength()),
                            Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
                if (rowKey != null) {
                    result.put(rowKey, columnMap);
                }
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to traverse all data in the specified table,tableName:{0}", tableName), e);
        } finally {
            close(null, rs, table);
        }
        return result;
    }

    /**
     * Query one row precisely by tableName and rowKey
     */
    public Map<String, String> getRowData(String tableName, String rowKey) {
        // Returned key-value pairs
        Map<String, String> result = new HashMap<>();
        Get get = new Get(Bytes.toBytes(rowKey));
        Table table = null;
        try {
            table = getTable(tableName);
            Result hTableResult = table.get(get);
            if (hTableResult != null && !hTableResult.isEmpty()) {
                for (Cell cell : hTableResult.listCells()) {
                    result.put(
                            Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength()),
                            Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to query the data of one row,tableName:{0},rowKey:{1}", tableName, rowKey), e);
        } finally {
            close(null, null, table);
        }
        return result;
    }

    public Result getHbaseResult(String tableName, String rowKey) {
        Get get = new Get(Bytes.toBytes(rowKey));
        Table table = null;
        Result hTableResult = null;
        try {
            table = getTable(tableName);
            hTableResult = table.get(get);
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to query the data of one row,tableName:{0},rowKey:{1}", tableName, rowKey), e);
        } finally {
            close(null, null, table);
        }
        return hTableResult;
    }

    /**
     * Read one column family of a row as a StorageData
     */
    public StorageData getStorageData(String tableName, String rowKey, String familyName) {
        Result hTableResult = getHbaseResult(tableName, rowKey);
        return getStorageDataForfamilyName(hTableResult, familyName);
    }

    public List<StorageData> getStorageDataList(String tableName, String rowKey, List<String> familyNameList) {
        List<StorageData> hBaseDataList = new ArrayList<StorageData>();
        Result hTableResult = getHbaseResult(tableName, rowKey);
        for (String familyName : familyNameList) {
            StorageData hBaseData = getStorageDataForfamilyName(hTableResult, familyName);
            if (null != hBaseData) {
                hBaseDataList.add(hBaseData);
            }
        }
        return hBaseDataList;
    }

    public StorageData getStorageDataForfamilyName(Result hTableResult, String familyName) {
        StorageData storageData = null;
        if (hTableResult != null && !hTableResult.isEmpty()) {
            Map<byte[], byte[]> familyMap = hTableResult.getFamilyMap(Bytes.toBytes(familyName));
            if (null != familyMap && !familyMap.isEmpty()) {
                storageData = new StorageData();
                for (Map.Entry<byte[], byte[]> entry : familyMap.entrySet()) {
                    storageData.setTargetClassName(familyName);
                    storageData.addStorageEntry(Bytes.toString(entry.getKey()), Bytes.toString(entry.getValue()));
                }
            }
        }
        return storageData;
    }

    /**
     * Query one cell by tableName, rowKey, familyName and column
     */
    public String getColumnValue(String tableName, String rowKey, String familyName, String columnName) {
        String str = null;
        Get get = new Get(Bytes.toBytes(rowKey));
        Table table = null;
        try {
            table = getTable(tableName);
            Result result = table.get(get);
            if (result != null && !result.isEmpty()) {
                Cell cell = result.getColumnLatestCell(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
                if (cell != null) {
                    str = Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength());
                }
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to query the data of the specified cell,tableName:{0},rowKey:{1},familyName:{2},columnName:{3}", tableName, rowKey, familyName, columnName), e);
        } finally {
            close(null, null, table);
        }
        return str;
    }

    /**
     * Query multiple versions of one cell
     *
     * @param versions number of versions to read
     */
    public List<String> getColumnValuesByVersion(String tableName, String rowKey, String familyName, String columnName, int versions) {
        List<String> result = new ArrayList<>(versions);
        Table table = null;
        try {
            table = getTable(tableName);
            Get get = new Get(Bytes.toBytes(rowKey));
            get.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
            // How many versions to read
            get.readVersions(versions);
            Result hTableResult = table.get(get);
            if (hTableResult != null && !hTableResult.isEmpty()) {
                for (Cell cell : hTableResult.listCells()) {
                    result.add(Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to query the data of multiple versions of the specified cell,tableName:{0},rowKey:{1},familyName:{2},columnName:{3}", tableName, rowKey, familyName, columnName), e);
        } finally {
            close(null, null, table);
        }
        return result;
    }

    /**
     * Add or update data in a table
     */
    public void putData(String tableName, String rowKey, String familyName, String[] columns, String[] values) {
        Table table = null;
        try {
            table = getTable(tableName);
            putData(table, rowKey, tableName, familyName, columns, values);
        } catch (Exception e) {
            log.error(MessageFormat.format("Failed to add or update data,tableName:{0},rowKey:{1},familyName:{2}", tableName, rowKey, familyName), e);
        } finally {
            close(null, null, table);
        }
    }

    private void putData(Table table, String rowKey, String tableName, String familyName, String[] columns, String[] values) {
        try {
            // Set the rowkey
            Put put = new Put(Bytes.toBytes(rowKey));
            if (columns != null && values != null && columns.length == values.length) {
                for (int i = 0; i < columns.length; i++) {
                    if (columns[i] != null && values[i] != null) {
                        put.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columns[i]), Bytes.toBytes(values[i]));
                    } else {
                        throw new NullPointerException(MessageFormat.format("Neither column name nor column data can be empty,column:{0},value:{1}", columns[i], values[i]));
                    }
                }
            }
            table.put(put);
            log.debug("putData add or update data Success,rowKey:" + rowKey);
            table.close();
        } catch (Exception e) {
            log.error(MessageFormat.format("Failed to add or update data,tableName:{0},rowKey:{1},familyName:{2}", tableName, rowKey, familyName), e);
        }
    }

    /**
     * Assign a value to one cell of the table
     */
    public void setColumnValue(String tableName, String rowKey, String familyName, String column1, String value1) {
        Table table = null;
        try {
            table = getTable(tableName);
            // Set the rowKey
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(column1), Bytes.toBytes(value1));
            table.put(put);
            log.debug("add data Success!");
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to assign a value to a cell of the table,tableName:{0},rowKey:{1},familyName:{2},column:{3}", tableName, rowKey, familyName, column1), e);
        } finally {
            close(null, null, table);
        }
    }

    /**
     * Delete the specified cell
     */
    public boolean deleteColumn(String tableName, String rowKey, String familyName, String columnName) {
        Table table = null;
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                table = getTable(tableName);
                Delete delete = new Delete(Bytes.toBytes(rowKey));
                // Set the columns to delete
                delete.addColumns(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
                table.delete(delete);
                log.debug(MessageFormat.format("familyName({0}):columnName({1})is deleted!", familyName, columnName));
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to delete the specified column,tableName:{0},rowKey:{1},familyName:{2},column:{3}", tableName, rowKey, familyName, columnName), e);
            return false;
        } finally {
            close(admin, null, table);
        }
        return true;
    }

    /**
     * Delete the specified row by rowKey
     */
    public boolean deleteRow(String tableName, String rowKey) {
        Table table = null;
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                table = getTable(tableName);
                Delete delete = new Delete(Bytes.toBytes(rowKey));
                table.delete(delete);
                log.debug(MessageFormat.format("row({0}) is deleted!", rowKey));
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to delete the specified row,tableName:{0},rowKey:{1}", tableName, rowKey), e);
            return false;
        } finally {
            close(admin, null, table);
        }
        return true;
    }

    /**
     * Delete the specified column family
     */
    public boolean deleteColumnFamily(String tableName, String columnFamily) {
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                admin.deleteColumnFamily(TableName.valueOf(tableName), Bytes.toBytes(columnFamily));
                log.debug(MessageFormat.format("familyName({0}) is deleted!", columnFamily));
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to delete the specified column family,tableName:{0},columnFamily:{1}", tableName, columnFamily), e);
            return false;
        } finally {
            close(admin, null, null);
        }
        return true;
    }

    /**
     * Delete a table
     */
    public boolean deleteTable(String tableName) {
        Admin admin = null;
        try {
            admin = connection.getAdmin();
            if (admin.tableExists(TableName.valueOf(tableName))) {
                admin.disableTable(TableName.valueOf(tableName));
                admin.deleteTable(TableName.valueOf(tableName));
                log.debug(tableName + "is deleted!");
            }
        } catch (IOException e) {
            log.error(MessageFormat.format("Failed to delete the specified table,tableName:{0}", tableName), e);
            return false;
        } finally {
            close(admin, null, null);
        }
        return true;
    }

    /**
     * Close resources
     */
    private void close(Admin admin, ResultScanner rs, Table table) {
        if (admin != null) {
            try {
                admin.close();
            } catch (IOException e) {
                log.error("close Admin fail", e);
            }
        }
        if (rs != null) {
            rs.close();
        }
        if (table != null) {
            try {
                table.close();
            } catch (IOException e) {
                log.error("close Table fail", e);
            }
        }
    }
}
```
(5)HBaseConfig
Configuration class that creates and configures the HBaseClent object.
Code location: com.heima.common.hbase.HBaseConfig
```java
/**
 * HBase configuration class; reads the hbase.properties configuration file
 */
@Setter
@Getter
@Configuration
@PropertySource("classpath:hbase.properties")
public class HBaseConfig {

    /**
     * Zookeeper quorum address the HBase client connects to
     */
    @Value("${hbase.zookeeper.quorum}")
    private String zookip_quorum;

    /**
     * Maximum size of a single KeyValue
     */
    @Value("${hbase.client.keyvalue.maxsize}")
    private String maxsize;

    /**
     * Create the HBaseClent
     */
    @Bean
    public HBaseClent getHBaseClient() {
        org.apache.hadoop.conf.Configuration hBaseConfiguration = getHbaseConfiguration();
        return new HBaseClent(hBaseConfiguration);
    }

    /**
     * Build the Hadoop Configuration object
     */
    private org.apache.hadoop.conf.Configuration getHbaseConfiguration() {
        org.apache.hadoop.conf.Configuration hBaseConfiguration = new org.apache.hadoop.conf.Configuration();
        hBaseConfiguration.set("hbase.zookeeper.quorum", zookip_quorum);
        hBaseConfiguration.set("hbase.client.keyvalue.maxsize", maxsize);
        return hBaseConfiguration;
    }
}
```
(6)HBaseStorageClient
The HBase storage client: our own wrapper around the HBaseClent utility that works with the storage entities above. It is provided ready to use; we do not need to write it ourselves.
Code location: com.heima.common.hbase.HBaseStorageClient (in the heima-leadnews-common module)
```java
/**
 * HBase storage client
 */
@Component
@Log4j2
public class HBaseStorageClient {

    /**
     * Injected HBaseClent utility
     */
    @Autowired
    private HBaseClent hBaseClent;

    /**
     * Add a list of storages to HBase
     *
     * @param tableName        table name
     * @param hBaseStorageList storage list
     */
    public void addHBaseStorage(final String tableName, List<HBaseStorage> hBaseStorageList) {
        if (null != hBaseStorageList && !hBaseStorageList.isEmpty()) {
            hBaseStorageList.stream().forEach(hBaseStorage -> {
                addHBaseStorage(tableName, hBaseStorage);
            });
        }
    }

    /**
     * Add one storage to HBase
     *
     * @param tableName    table name
     * @param hBaseStorage storage
     */
    public void addHBaseStorage(String tableName, HBaseStorage hBaseStorage) {
        if (null != hBaseStorage && StringUtils.isNotEmpty(tableName)) {
            hBaseClent.creatTable(tableName, hBaseStorage.getColumnFamily());
            String rowKey = hBaseStorage.getRowKey();
            List<StorageData> storageDataList = hBaseStorage.getDataList();
            boolean result = addStorageData(tableName, rowKey, storageDataList);
            if (result) {
                hBaseStorage.invok();
            }
        }
    }

    /**
     * Add data to HBase
     *
     * @param tableName       table name
     * @param rowKey          primary key
     * @param storageDataList storage data list
     */
    public boolean addStorageData(String tableName, String rowKey, List<StorageData> storageDataList) {
        long currentTime = System.currentTimeMillis();
        log.info("Start adding StorageData to Hbase,tableName:{},rowKey:{}", tableName, rowKey);
        if (null != storageDataList && !storageDataList.isEmpty()) {
            storageDataList.forEach(hBaseData -> {
                String columnFamliyName = hBaseData.getTargetClassName();
                String[] columnArray = hBaseData.getColumns();
                String[] valueArray = hBaseData.getValues();
                if (null != columnArray && null != valueArray) {
                    hBaseClent.putData(tableName, rowKey, columnFamliyName, columnArray, valueArray);
                }
            });
        }
        log.info("Adding StorageData to Hbase complete,tableName:{},rowKey:{},duration:{}", tableName, rowKey, System.currentTimeMillis() - currentTime);
        return true;
    }

    /**
     * Get one object by table name and rowKey
     *
     * @param tableName table name
     * @param rowKey    primary key
     * @param tClass    type of the object to fetch
     * @param <T>       generic T
     * @return the restored entity
     */
    public <T> T getStorageDataEntity(String tableName, String rowKey, Class<T> tClass) {
        T tValue = null;
        if (StringUtils.isNotEmpty(tableName)) {
            StorageData hBaseData = hBaseClent.getStorageData(tableName, rowKey, tClass.getName());
            if (null != hBaseData) {
                tValue = (T) hBaseData.getObjectValue();
            }
        }
        return tValue;
    }

    /**
     * Get a list of objects of the given types by table name and rowKey
     *
     * @param tableName table name
     * @param rowKey    rowKey
     * @param typeList  list of types
     * @return list of restored objects
     */
    public List<Object> getStorageDataEntityList(String tableName, String rowKey, List<Class> typeList) {
        List<Object> entityList = new ArrayList<Object>();
        List<String> strTypeList = typeList.stream().map(x -> x.getName()).collect(Collectors.toList());
        List<StorageData> storageDataList = hBaseClent.getStorageDataList(tableName, rowKey, strTypeList);
        for (StorageData storageData : storageDataList) {
            entityList.add(storageData.getObjectValue());
        }
        return entityList;
    }

    /**
     * Get the underlying HBaseClent client
     */
    public HBaseClent gethBaseClent() {
        return hBaseClent;
    }
}
```
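A minimal usage sketch of HBaseStorageClient, assuming it is injected into a Spring bean and that apArticle/apAuthor beans are at hand (the row key and field values are illustrative; the table constant is the project's):

```java
// Pack two beans into one row; each bean becomes its own column family
HBaseStorage storage = new HBaseStorage();
storage.setRowKey("1001"); // the MySQL primary key as rowkey
storage.addStorageData(StorageData.getStorageData(apArticle));
storage.addStorageData(StorageData.getStorageData(apAuthor));
hBaseStorageClient.addHBaseStorage(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, storage);

// Read one typed bean back out of the same row
ApArticle restored = hBaseStorageClient.getStorageDataEntity(
        HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, "1001", ApArticle.class);
```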
4.1.4 Test code
```java
@SpringBootTest
@RunWith(SpringRunner.class)
public class HbaseTest {

    @Autowired
    private HBaseClent hBaseClent;

    @Test
    public void testCreateTable() {
        List<String> columnFamily = new ArrayList<>();
        columnFamily.add("test_cloumn_family1");
        columnFamily.add("test_cloumn_family2");
        boolean ret = hBaseClent.creatTable("hbase_test_table_name", columnFamily);
    }

    @Test
    public void testDelTable() {
        hBaseClent.deleteTable("hbase_test_table_name");
    }

    @Test
    public void testSaveData() {
        String[] columns = {"name", "age"};
        String[] values = {"zhangsan", "28"};
        hBaseClent.putData("hbase_test_table_name", "test_row_key_001", "test_cloumn_family1", columns, values);
    }

    @Test
    public void testFindByRowKey() {
        Result hbaseResult = hBaseClent.getHbaseResult("hbase_test_table_name", "test_row_key_001");
        System.out.println(hbaseResult);
    }
}
```
4.2 MongoDB operation tool class
MongoDB is a document database and also needs to store several different objects, so we reuse the StorageEntity structure used for HBase, described above.
We use Spring's MongoTemplate to operate the database.
The related entities are introduced below.
(1)MongoConstant
Constants for MongoDB operations; defines the collection name used by the migration.
Code location: com.heima.common.mongo.constants.MongoConstant
```java
public class MongoConstant {

    public static final String APARTICLE_MIGRATION_TABLE = "APARTICLE_MIGRATION_TABLE";
}
```
(2)MongoStorageEntity
MongoStorageEntity is the storage structure for MongoDB data; it extends StorageEntity and specifies the collection name and the primary key.
Code location: com.heima.common.mongo.entity.MongoStorageEntity
```java
/**
 * MongoDB storage entity.
 * @Document names the collection the entity is stored in.
 */
@Document(collection = "mongo_storage_data")
@Setter
@Getter
public class MongoStorageEntity extends StorageEntity {

    /**
     * Primary key.
     * @Id marks this field as the primary key.
     */
    @Id
    private String rowKey;
}
```
(3)MongoDBconfigure
Configuration class for MongoDB operations.
Code location: com.heima.common.mongo.MongoDBconfigure
```java
@Configuration
@PropertySource("classpath:mongo.properties")
public class MongoDBconfigure {

    @Value("${mongo.host}")
    private String host;

    @Value("${mongo.port}")
    private int port;

    @Value("${mongo.dbname}")
    private String dbName;

    @Bean
    public MongoTemplate getMongoTemplate() {
        return new MongoTemplate(getSimpleMongoDbFactory());
    }

    public SimpleMongoDbFactory getSimpleMongoDbFactory() {
        return new SimpleMongoDbFactory(new MongoClient(host, port), dbName);
    }
}
```
(4) Test code
```java
@SpringBootTest(classes = MigrationApplication.class)
@RunWith(SpringJUnit4ClassRunner.class)
public class MongoTest {

    @Autowired
    private MongoTemplate mongotemplate;

    @Autowired
    private HBaseStorageClient hBaseStorageClient;

    @Test
    public void test() {
        Class<?>[] classes = new Class<?>[]{ApArticle.class, ApArticleContent.class, ApAuthor.class};
        //List<Object> entityList = hBaseStorageClient.getHbaseDataEntityList(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, "1", Arrays.asList(classes));
        List<String> strList = Arrays.asList(classes).stream().map(x -> x.getName()).collect(Collectors.toList());
        List<StorageData> storageDataList = hBaseStorageClient.gethBaseClent().getStorageDataList(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, "1", strList);
        MongoStorageEntity mongoStorageEntity = new MongoStorageEntity();
        mongoStorageEntity.setDataList(storageDataList);
        mongoStorageEntity.setRowKey("1");
        MongoStorageEntity tmp = mongotemplate.findById("1", MongoStorageEntity.class);
        if (null != tmp) {
            mongotemplate.remove(tmp);
        }
        MongoStorageEntity tq = mongotemplate.insert(mongoStorageEntity);
        System.out.println(tq);
    }

    @Test
    public void test1() {
        MongoStorageEntity mongoStorageEntity = mongotemplate.findById("1", MongoStorageEntity.class);
        if (null != mongoStorageEntity && null != mongoStorageEntity.getDataList()) {
            mongoStorageEntity.getDataList().forEach(x -> {
                System.out.println(x.getObjectValue());
            });
        }
    }
}
```
7 Business layer code
7.1 HBase operation entity classes
(1)ArticleCallBack
Callback interface for HBase-related article operations.
Code location: com.heima.migration.entity.ArticleCallBack
```java
public interface ArticleCallBack {

    public void callBack(ApArticle apArticle);
}
```
(2)ArticleHBaseInvok
Wraps the object to call back with and the invok method that executes the callback.
Code location: com.heima.migration.entity.ArticleHBaseInvok
```java
/**
 * Callback object
 */
@Setter
@Getter
public class ArticleHBaseInvok implements HBaseInvok {

    /**
     * The object to pass back to the callback
     */
    private ApArticle apArticle;

    /**
     * The callback interface to invoke
     */
    private ArticleCallBack articleCallBack;

    public ArticleHBaseInvok(ApArticle apArticle, ArticleCallBack articleCallBack) {
        this.apArticle = apArticle;
        this.articleCallBack = articleCallBack;
    }

    /**
     * Execute the callback
     */
    @Override
    public void invok() {
        if (null != apArticle && null != articleCallBack) {
            articleCallBack.callBack(apArticle);
        }
    }
}
```
(3)ArticleQuantity
Wraps the whole set of objects to be stored for one article.
Code location: com.heima.migration.entity.ArticleQuantity
```java
/**
 * Utility class that packs all article-related data
 */
@Setter
@Getter
public class ArticleQuantity {

    /**
     * Article relational entity
     */
    private ApArticle apArticle;

    /**
     * Article configuration entity
     */
    private ApArticleConfig apArticleConfig;

    /**
     * Article content entity
     */
    private ApArticleContent apArticleContent;

    /**
     * Article author entity
     */
    private ApAuthor apAuthor;

    /**
     * Callback interface
     */
    private HBaseInvok hBaseInvok;

    public Integer getApArticleId() {
        if (null != apArticle) {
            return apArticle.getId();
        }
        return null;
    }

    /**
     * Convert this ArticleQuantity into an HBaseStorage object
     */
    public HBaseStorage getHbaseStorage() {
        HBaseStorage hbaseStorage = new HBaseStorage();
        hbaseStorage.setRowKey(String.valueOf(apArticle.getId()));
        hbaseStorage.setHBaseInvok(hBaseInvok);
        StorageData apArticleData = StorageData.getStorageData(apArticle);
        if (null != apArticleData) {
            hbaseStorage.addStorageData(apArticleData);
        }
        StorageData apArticleConfigData = StorageData.getStorageData(apArticleConfig);
        if (null != apArticleConfigData) {
            hbaseStorage.addStorageData(apArticleConfigData);
        }
        StorageData apArticleContentData = StorageData.getStorageData(apArticleContent);
        if (null != apArticleContentData) {
            hbaseStorage.addStorageData(apArticleContentData);
        }
        StorageData apAuthorData = StorageData.getStorageData(apAuthor);
        if (null != apAuthorData) {
            hbaseStorage.addStorageData(apAuthorData);
        }
        return hbaseStorage;
    }

    /**
     * Get the StorageData list
     */
    public List<StorageData> getStorageDataList() {
        List<StorageData> storageDataList = new ArrayList<StorageData>();
        StorageData apArticleStorageData = StorageData.getStorageData(apArticle);
        if (null != apArticleStorageData) {
            storageDataList.add(apArticleStorageData);
        }
        StorageData apArticleContentStorageData = StorageData.getStorageData(apArticleContent);
        if (null != apArticleContentStorageData) {
            storageDataList.add(apArticleContentStorageData);
        }
        StorageData apArticleConfigStorageData = StorageData.getStorageData(apArticleConfig);
        if (null != apArticleConfigStorageData) {
            storageDataList.add(apArticleConfigStorageData);
        }
        StorageData apAuthorStorageData = StorageData.getStorageData(apAuthor);
        if (null != apAuthorStorageData) {
            storageDataList.add(apAuthorStorageData);
        }
        return storageDataList;
    }

    public ApHotArticles getApHotArticles() {
        ApHotArticles apHotArticles = null;
        if (null != apArticle) {
            apHotArticles = new ApHotArticles();
            apHotArticles.setArticleId(apArticle.getId());
            apHotArticles.setReleaseDate(apArticle.getPublishTime());
            apHotArticles.setScore(1);
            // apHotArticles.setTagId();
            apHotArticles.setTagName(apArticle.getLabels());
            apHotArticles.setCreatedTime(new Date());
        }
        return apHotArticles;
    }
}
```
7.2 Article configuration interface
7.2.1 mapper
New method in ApArticleConfigMapper:
```java
List<ApArticleConfig> selectByArticleIds(List<String> articleIds);
```
ApArticleConfigMapper.xml
<select id="selectByArticleIds" resultMap="BaseResultMap"> select <include refid="Base_Column_List"/> from ap_article_config where article_id in <foreach item="item" index="index" collection="list" open="(" separator="," close=")"> #{item} </foreach> </select>
7.2.2 service
Service for article configuration operations.
Interface location: com.heima.migration.service.ApArticleConfigService
```java
public interface ApArticleConfigService {

    List<ApArticleConfig> queryByArticleIds(List<String> ids);

    ApArticleConfig getByArticleId(Integer id);
}
```
ApArticleConfigServiceImpl
Implementation of the ApArticleConfig operations.
Code location: com.heima.migration.service.impl.ApArticleConfigServiceImpl
```java
@Service
public class ApArticleConfigServiceImpl implements ApArticleConfigService {

    @Autowired
    private ApArticleConfigMapper apArticleConfigMapper;

    @Override
    public List<ApArticleConfig> queryByArticleIds(List<String> ids) {
        return apArticleConfigMapper.selectByArticleIds(ids);
    }

    @Override
    public ApArticleConfig getByArticleId(Integer id) {
        return apArticleConfigMapper.selectByArticleId(id);
    }
}
```
7.3 Article content interface
7.3.1 mapper definition
New method in ApArticleContentMapper:
```java
List<ApArticleContent> selectByArticleIds(List<String> articleIds);
```
ApArticleContentMapper.xml
<select id="selectByArticleIds" resultMap="BaseResultMap"> select <include refid="Base_Column_List"/> , <include refid="Blob_Column_List"/> from ap_article_content where article_id IN <foreach collection="list" item="id" index="index" open="(" close=")" separator=","> #{id} </foreach> </select>
7.3.2 service
Service for article content operations.
Interface location: com.heima.migration.service.ApArticleContenService
```java
public interface ApArticleContenService {

    List<ApArticleContent> queryByArticleIds(List<String> ids);

    ApArticleContent getByArticleIds(Integer id);
}
```
ApArticleContenServiceImpl
Implementation of the ApArticleContent operations.
Code location: com.heima.migration.service.impl.ApArticleContenServiceImpl
```java
@Service
public class ApArticleContenServiceImpl implements ApArticleContenService {

    @Autowired
    private ApArticleContentMapper apArticleContentMapper;

    @Override
    public List<ApArticleContent> queryByArticleIds(List<String> ids) {
        return apArticleContentMapper.selectByArticleIds(ids);
    }

    @Override
    public ApArticleContent getByArticleIds(Integer id) {
        return apArticleContentMapper.selectByArticleId(id);
    }
}
```
7.4 Article interface
7.4.1 mapper definition
New methods in ApArticleMapper:
```java
/**
 * Query
 *
 * @param apArticle
 * @return
 */
List<ApArticle> selectList(ApArticle apArticle);

/**
 * Update
 *
 * @param apArticle
 */
void updateSyncStatus(ApArticle apArticle);
```
ApArticleMapper.xml
<sql id="Base_Column_Where"> <where> <if test="title!=null and title!=''"> and title = #{title} </if> <if test="authorId!=null and authorId!=''"> and author_id = #{authorId} </if> <if test="authorName!=null and authorName!=''"> and author_name = #{authorName} </if> <if test="channelId!=null and channelId!=''"> and channel_id = #{channelId} </if> <if test="channelName!=null and channelName!=''"> and channel_name = #{channelName} </if> <if test="layout!=null and layout!=''"> and layout = #{layout} </if> <if test="flag!=null and flag!=''"> and flag = #{flag} </if> <if test="views!=null and views!=''"> and views = #{views} </if> <if test="syncStatus!=null"> and sync_status = #{syncStatus} </if> </where> </sql> <select id="selectList" resultMap="resultMap"> select <include refid="Base_Column_List"/> from ap_article <include refid="Base_Column_Where"/> </select> <update id="updateSyncStatus"> UPDATE ap_article SET sync_status = #{syncStatus} WHERE id=#{id} </update>
7.4.2 service
Service for ApArticle operations.
Interface location: com.heima.migration.service.ApArticleService
```java
public interface ApArticleService {

    public ApArticle getById(Long id);

    /**
     * Get the unsynchronized data
     */
    public List<ApArticle> getUnsyncApArticleList();

    /**
     * Update the synchronization status
     */
    void updateSyncStatus(ApArticle apArticle);
}
```
ApArticleServiceImpl
Implementation of the ApArticle operations.
Code location: com.heima.migration.service.impl.ApArticleServiceImpl
```java
@Log4j2
@Service
public class ApArticleServiceImpl implements ApArticleService {

    @Autowired
    private ApArticleMapper apArticleMapper;

    public ApArticle getById(Long id) {
        return apArticleMapper.selectById(id);
    }

    /**
     * Get the unsynchronized data
     */
    public List<ApArticle> getUnsyncApArticleList() {
        ApArticle apArticleQuery = new ApArticle();
        apArticleQuery.setSyncStatus(false);
        return apArticleMapper.selectList(apArticleQuery);
    }

    /**
     * Update the data synchronization status
     */
    public void updateSyncStatus(ApArticle apArticle) {
        log.info("Start updating data synchronization status, apArticle: {}", apArticle);
        if (null != apArticle) {
            apArticle.setSyncStatus(true);
            apArticleMapper.updateSyncStatus(apArticle);
        }
    }
}
```
7.5 Article author interface
7.5.1 mapper definition
New method in ApAuthorMapper:
```java
List<ApAuthor> selectByIds(List<Integer> ids);
```
ApAuthorMapper.xml
<select id="selectByIds" resultMap="BaseResultMap"> select * from ap_author where id in <foreach item="item" index="index" collection="list" open="(" separator="," close=")"> #{item} </foreach> </select>
7.5.2 service
Service for ApAuthor operations.
Interface location: com.heima.migration.service.ApAuthorService
```java
public interface ApAuthorService {

    List<ApAuthor> queryByIds(List<Integer> ids);

    ApAuthor getById(Long id);
}
```
ApAuthorServiceImpl
Implementation of the ApAuthor operations.
Code location: com.heima.migration.service.impl.ApAuthorServiceImpl
```java
@Service
public class ApAuthorServiceImpl implements ApAuthorService {

    @Autowired
    private ApAuthorMapper apAuthorMapper;

    @Override
    public List<ApAuthor> queryByIds(List<Integer> ids) {
        return apAuthorMapper.selectByIds(ids);
    }

    @Override
    public ApAuthor getById(Long id) {
        if (null != id) {
            return apAuthorMapper.selectById(id.intValue());
        }
        return null;
    }
}
```
7.6 Integrated migration interface
ArticleQuantityService
The service that operates on ArticleQuantity objects; an ArticleQuantity packs all the data related to one article.
Interface location: com.heima.migration.service.ArticleQuantityService
```java
public interface ArticleQuantityService {

    /**
     * Get the ArticleQuantity list
     */
    public List<ArticleQuantity> getArticleQuantityList();

    /**
     * Get an ArticleQuantity by articleId
     */
    public ArticleQuantity getArticleQuantityByArticleId(Long id);

    /**
     * Get an ArticleQuantity from HBase by articleId
     */
    public ArticleQuantity getArticleQuantityByArticleIdForHbase(Long id);

    /**
     * Synchronize the database to HBase
     */
    public void dbToHbase();

    /**
     * Synchronize one article from the database to HBase by articleId
     */
    public void dbToHbase(Integer articleId);
}
```
ArticleQuantityServiceImpl
Implementation of the ArticleQuantity operations.
Code location: com.heima.migration.service.impl.ArticleQuantityServiceImpl
/**
 * Queries the unsynchronized data and packages it into ArticleQuantity objects
 */
@Service
@Log4j2
public class ArticleQuantityServiceImpl implements ArticleQuantityService {

    @Autowired
    private ApArticleContenService apArticleContenService;

    @Autowired
    private ApArticleConfigService apArticleConfigService;

    @Autowired
    private ApAuthorService apAuthorService;

    @Autowired
    private HBaseStorageClient hBaseStorageClient;

    @Autowired
    private ApArticleService apArticleService;

    /**
     * Query the list of unsynchronized data
     *
     * @return
     */
    public List<ArticleQuantity> getArticleQuantityList() {
        log.info("generate ArticleQuantity list");
        // Query the unsynchronized data
        List<ApArticle> apArticleList = apArticleService.getUnsyncApArticleList();
        if (apArticleList.isEmpty()) {
            return null;
        }
        // Collect the article id list
        List<String> apArticleIdList = apArticleList.stream()
                .map(apArticle -> String.valueOf(apArticle.getId()))
                .collect(Collectors.toList());
        // Collect the author id list
        List<Integer> apAuthorIdList = apArticleList.stream()
                .map(apArticle -> apArticle.getAuthorId() == null ? null : apArticle.getAuthorId().intValue())
                .filter(x -> x != null)
                .collect(Collectors.toList());
        // Batch-query the content list by apArticleIdList
        List<ApArticleContent> apArticleContentList = apArticleContenService.queryByArticleIds(apArticleIdList);
        // Batch-query the configuration list by apArticleIdList
        List<ApArticleConfig> apArticleConfigList = apArticleConfigService.queryByArticleIds(apArticleIdList);
        // Batch-query the author list by apAuthorIdList
        List<ApAuthor> apAuthorList = apAuthorService.queryByIds(apAuthorIdList);
        // Combine the different objects into ArticleQuantity objects
        List<ArticleQuantity> articleQuantityList = apArticleList.stream().map(apArticle -> {
            return new ArticleQuantity() {{
                // Set the ApArticle object
                setApArticle(apArticle);
                // Filter out the matching ApArticleContent object by apArticle.getId()
                List<ApArticleContent> apArticleContents = apArticleContentList.stream()
                        .filter(x -> x.getArticleId().equals(apArticle.getId()))
                        .collect(Collectors.toList());
                if (null != apArticleContents && !apArticleContents.isEmpty()) {
                    setApArticleContent(apArticleContents.get(0));
                }
                // Filter out the matching ApArticleConfig object by apArticle.getId()
                List<ApArticleConfig> apArticleConfigs = apArticleConfigList.stream()
                        .filter(x -> x.getArticleId().equals(apArticle.getId()))
                        .collect(Collectors.toList());
                if (null != apArticleConfigs && !apArticleConfigs.isEmpty()) {
                    setApArticleConfig(apArticleConfigs.get(0));
                }
                // Filter out the matching ApAuthor object by apArticle.getAuthorId().intValue()
                List<ApAuthor> apAuthors = apAuthorList.stream()
                        .filter(x -> x.getId().equals(apArticle.getAuthorId().intValue()))
                        .collect(Collectors.toList());
                if (null != apAuthors && !apAuthors.isEmpty()) {
                    setApAuthor(apAuthors.get(0));
                }
                // Set the callback used to update the synchronization status:
                // after the row is inserted into HBase successfully, the status is changed to synchronized
                setHBaseInvok(new ArticleHBaseInvok(apArticle, (x) -> apArticleService.updateSyncStatus(x)));
            }};
        }).collect(Collectors.toList());
        if (null != articleQuantityList && !articleQuantityList.isEmpty()) {
            log.info("generate ArticleQuantity list complete, size:{}", articleQuantityList.size());
        } else {
            log.info("generate ArticleQuantity list complete, size:{}", 0);
        }
        return articleQuantityList;
    }

    public ArticleQuantity getArticleQuantityByArticleId(Long id) {
        if (null == id) {
            return null;
        }
        ArticleQuantity articleQuantity = null;
        ApArticle apArticle = apArticleService.getById(id);
        if (null != apArticle) {
            articleQuantity = new ArticleQuantity();
            articleQuantity.setApArticle(apArticle);
            ApArticleContent apArticleContent = apArticleContenService.getByArticleIds(id.intValue());
            articleQuantity.setApArticleContent(apArticleContent);
            ApArticleConfig apArticleConfig = apArticleConfigService.getByArticleId(id.intValue());
            articleQuantity.setApArticleConfig(apArticleConfig);
            ApAuthor apAuthor = apAuthorService.getById(apArticle.getAuthorId());
            articleQuantity.setApAuthor(apAuthor);
        }
        return articleQuantity;
    }

    public ArticleQuantity getArticleQuantityByArticleIdForHbase(Long id) {
        if (null == id) {
            return null;
        }
        ArticleQuantity articleQuantity = null;
        List<Class> typeList = Arrays.asList(ApArticle.class, ApArticleContent.class, ApArticleConfig.class, ApAuthor.class);
        List<Object> objectList = hBaseStorageClient.getStorageDataEntityList(
                HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, DataConvertUtils.toString(id), typeList);
        if (null != objectList && !objectList.isEmpty()) {
            articleQuantity = new ArticleQuantity();
            for (Object value : objectList) {
                if (value instanceof ApArticle) {
                    articleQuantity.setApArticle((ApArticle) value);
                } else if (value instanceof ApArticleContent) {
                    articleQuantity.setApArticleContent((ApArticleContent) value);
                } else if (value instanceof ApArticleConfig) {
                    articleQuantity.setApArticleConfig((ApArticleConfig) value);
                } else if (value instanceof ApAuthor) {
                    articleQuantity.setApAuthor((ApAuthor) value);
                }
            }
        }
        return articleQuantity;
    }

    /**
     * Full synchronization from the database to HBase
     */
    public void dbToHbase() {
        long currentTime = System.currentTimeMillis();
        List<ArticleQuantity> articleQuantityList = getArticleQuantityList();
        if (null != articleQuantityList && !articleQuantityList.isEmpty()) {
            log.info("Scheduled database-to-HBase synchronization started, unsynchronized rows found: {}", articleQuantityList.size());
            List<HBaseStorage> hbaseStorageList = articleQuantityList.stream()
                    .map(ArticleQuantity::getHbaseStorage)
                    .collect(Collectors.toList());
            hBaseStorageClient.addHBaseStorage(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, hbaseStorageList);
        } else {
            log.info("Scheduled database-to-HBase synchronization found no unsynchronized data");
        }
        log.info("Scheduled database-to-HBase synchronization finished, time consumed: {}", System.currentTimeMillis() - currentTime);
    }

    @Override
    public void dbToHbase(Integer articleId) {
        long currentTime = System.currentTimeMillis();
        log.info("Asynchronous database-to-HBase synchronization started, articleId: {}", articleId);
        if (null != articleId) {
            ArticleQuantity articleQuantity = getArticleQuantityByArticleId(articleId.longValue());
            if (null != articleQuantity) {
                HBaseStorage hBaseStorage = articleQuantity.getHbaseStorage();
                hBaseStorageClient.addHBaseStorage(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, hBaseStorage);
            }
        }
        log.info("Asynchronous database-to-HBase synchronization finished, articleId: {}, time consumed: {}", articleId, System.currentTimeMillis() - currentTime);
    }
}
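The `ArticleHBaseInvok` callback wired in above is not shown in this section. A minimal sketch of what it might look like, assuming `HBaseInvok` is a single-method callback interface that the storage client invokes after a successful insert, and that the second constructor argument is a `java.util.function.Consumer<ApArticle>` (both are assumptions, not the project's confirmed types):

import java.util.function.Consumer;

// Hypothetical callback contract; the real project type may differ.
interface HBaseInvok {
    void invok();
}

/**
 * Minimal sketch (assumption): invoked by the storage client after a row
 * is written to HBase successfully, flipping the article's sync status.
 */
class ArticleHBaseInvok implements HBaseInvok {

    private final ApArticle apArticle;
    private final Consumer<ApArticle> callback;

    public ArticleHBaseInvok(ApArticle apArticle, Consumer<ApArticle> callback) {
        this.apArticle = apArticle;
        this.callback = callback;
    }

    @Override
    public void invok() {
        // Runs only after the HBase insert succeeds, so a failed write
        // leaves the row unsynchronized and it is retried next time
        callback.accept(apArticle);
    }
}

Deferring the status update to a post-insert callback is what guarantees that a row marked "synchronized" in MySQL really exists in HBase.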
7.7 hot article interface
ApHotArticleService
Service interface for ApHotArticles operations
Interface location: com.heima.migration.service.ApHotArticleService
public interface ApHotArticleService {

    List<ApHotArticles> selectList(ApHotArticles apHotArticlesQuery);

    void insert(ApHotArticles apHotArticles);

    /**
     * Synchronize hot data from HBase
     *
     * @param apArticleId
     */
    public void hotApArticleSync(Integer apArticleId);

    void deleteById(Integer id);

    /**
     * Query expired data
     *
     * @return
     */
    public List<ApHotArticles> selectExpireMonth();

    void deleteHotData(ApHotArticles apHotArticle);
}
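`selectExpireMonth()` needs a query that returns rows older than one month. The backing mapper (`ApHotArticlesMapper`, injected in the implementation below) is not included in this section; a minimal annotation-style MyBatis sketch, with hypothetical table and column names (`ap_hot_articles`, `created_time`):

import java.util.List;
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.annotations.Select;

@Mapper
public interface ApHotArticlesMapper {

    // Hypothetical table and column names; the real schema may differ.
    @Select("SELECT * FROM ap_hot_articles " +
            "WHERE created_time < DATE_SUB(NOW(), INTERVAL 1 MONTH)")
    List<ApHotArticles> selectExpireMonth();
}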
ApHotArticleServiceImpl
Implementation of the ApHotArticles operations
Code location: com.heima.migration.service.impl.ApHotArticleServiceImpl
/**
 * Hot data operation service class
 */
@Service
@Log4j2
public class ApHotArticleServiceImpl implements ApHotArticleService {

    @Autowired
    private ApHotArticlesMapper apHotArticlesMapper;

    @Autowired
    private MongoTemplate mongoTemplate;

    @Autowired
    private ArticleQuantityService articleQuantityService;

    @Autowired
    private HBaseStorageClient hBaseStorageClient;

    @Override
    public List<ApHotArticles> selectList(ApHotArticles apHotArticlesQuery) {
        return apHotArticlesMapper.selectList(apHotArticlesQuery);
    }

    /**
     * Delete by id
     *
     * @param id
     */
    @Override
    public void deleteById(Integer id) {
        log.info("Delete hot data, apArticleId: {}", id);
        apHotArticlesMapper.deleteById(id);
    }

    /**
     * Query data older than one month
     *
     * @return
     */
    @Override
    public List<ApHotArticles> selectExpireMonth() {
        return apHotArticlesMapper.selectExpireMonth();
    }

    /**
     * Delete expired hot data from mysql, HBase and MongoDB
     *
     * @param apHotArticle
     */
    @Override
    public void deleteHotData(ApHotArticles apHotArticle) {
        deleteById(apHotArticle.getId());
        String rowKey = DataConvertUtils.toString(apHotArticle.getId());
        hBaseStorageClient.gethBaseClent().deleteRow(HBaseConstants.APARTICLE_QUANTITY_TABLE_NAME, rowKey);
        MongoStorageEntity mongoStorageEntity = mongoTemplate.findById(rowKey, MongoStorageEntity.class);
        if (null != mongoStorageEntity) {
            mongoTemplate.remove(mongoStorageEntity);
        }
    }

    /**
     * Insert operation
     *
     * @param apHotArticles
     */
    @Override
    public void insert(ApHotArticles apHotArticles) {
        apHotArticlesMapper.insert(apHotArticles);
    }

    /**
     * Hot data synchronization entry point
     *
     * @param apArticleId
     */
    @Override
    public void hotApArticleSync(Integer apArticleId) {
        log.info("Start synchronizing hot data, apArticleId: {}", apArticleId);
        ArticleQuantity articleQuantity = getHotArticleQuantity(apArticleId);
        if (null != articleQuantity) {
            // Synchronize the hot data to mysql
            hotApArticleToDBSync(articleQuantity);
            // Synchronize the hot data to MongoDB
            hotApArticleMongoSync(articleQuantity);
            log.info("Hot data synchronization completed, apArticleId: {}", apArticleId);
        } else {
            log.error("No matching hot data found, apArticleId: {}", apArticleId);
        }
    }

    /**
     * Build the ArticleQuantity object for the hot data,
     * falling back to HBase if it is no longer in mysql
     *
     * @param apArticleId
     * @return
     */
    private ArticleQuantity getHotArticleQuantity(Integer apArticleId) {
        Long id = Long.valueOf(apArticleId);
        ArticleQuantity articleQuantity = articleQuantityService.getArticleQuantityByArticleId(id);
        if (null == articleQuantity) {
            articleQuantity = articleQuantityService.getArticleQuantityByArticleIdForHbase(id);
        }
        return articleQuantity;
    }

    /**
     * Synchronize hot data from HBase to mysql
     *
     * @param articleQuantity
     */
    public void hotApArticleToDBSync(ArticleQuantity articleQuantity) {
        Integer apArticleId = articleQuantity.getApArticleId();
        log.info("Start synchronizing hot data from HBase to mysql, apArticleId: {}", apArticleId);
        if (null == apArticleId) {
            log.error("apArticleId does not exist, synchronization skipped");
            return;
        }
        ApHotArticles apHotArticlesQuery = new ApHotArticles() {{
            setArticleId(apArticleId);
        }};
        List<ApHotArticles> apHotArticlesList = apHotArticlesMapper.selectList(apHotArticlesQuery);
        if (null != apHotArticlesList && !apHotArticlesList.isEmpty()) {
            log.info("Mysql data already synchronized, no need to synchronize again, apArticleId: {}", apArticleId);
        } else {
            ApHotArticles apHotArticles = articleQuantity.getApHotArticles();
            apHotArticlesMapper.insert(apHotArticles);
        }
        log.info("Hot data synchronization from HBase to mysql done, apArticleId: {}", apArticleId);
    }

    /**
     * Synchronize hot data from HBase to MongoDB
     *
     * @param articleQuantity
     */
    public void hotApArticleMongoSync(ArticleQuantity articleQuantity) {
        Integer apArticleId = articleQuantity.getApArticleId();
        log.info("Start synchronizing hot data from HBase to MongoDB, apArticleId: {}", apArticleId);
        if (null == apArticleId) {
            log.error("apArticleId does not exist, synchronization skipped");
            return;
        }
        String rowKeyId = DataConvertUtils.toString(apArticleId);
        MongoStorageEntity mongoStorageEntity = mongoTemplate.findById(rowKeyId, MongoStorageEntity.class);
        if (null != mongoStorageEntity) {
            log.info("MongoDB data already synchronized, no need to synchronize again, apArticleId: {}", apArticleId);
        } else {
            List<StorageData> storageDataList = articleQuantity.getStorageDataList();
            if (null != storageDataList && !storageDataList.isEmpty()) {
                mongoStorageEntity = new MongoStorageEntity();
                mongoStorageEntity.setDataList(storageDataList);
                mongoStorageEntity.setRowKey(rowKeyId);
                mongoTemplate.insert(mongoStorageEntity);
            }
        }
        log.info("Hot data synchronization from HBase to MongoDB done, apArticleId: {}", apArticleId);
    }
}
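`hotApArticleMongoSync` looks the document up with `mongoTemplate.findById(rowKeyId, ...)`, so the entity's id must be the HBase rowkey. The entity class is not shown in this section; a minimal sketch of what `MongoStorageEntity` might look like (the collection name is an assumption):

import java.util.List;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

// Minimal sketch (assumption): the MongoDB document keyed by the HBase rowkey.
@Document(collection = "mongo_storage")  // collection name is hypothetical
public class MongoStorageEntity {

    @Id
    private String rowKey;               // the HBase rowkey doubles as the Mongo _id
    private List<StorageData> dataList;  // the unpacked per-object field data

    public String getRowKey() { return rowKey; }
    public void setRowKey(String rowKey) { this.rowKey = rowKey; }
    public List<StorageData> getDataList() { return dataList; }
    public void setDataList(List<StorageData> dataList) { this.dataList = dataList; }
}

Reusing the rowkey as the Mongo id keeps all three stores addressable by the same article id, which is also what lets `deleteHotData` remove a row from mysql, HBase and MongoDB with one key.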
8 Scheduled data synchronization
8.1 full data synchronization from mysql to HBase
/**
 * Full data synchronization from mysql to HBase
 */
@Component
@DisallowConcurrentExecution
@Log4j2
public class MigrationDbToHBaseQuartz extends AbstractJob {

    @Autowired
    private ArticleQuantityService articleQuantityService;

    @Override
    public String[] triggerCron() {
        /**
         * Fires every 5 minutes, e.g.:
         * 2019/8/9 10:15:00
         * 2019/8/9 10:20:00
         * 2019/8/9 10:25:00
         * 2019/8/9 10:30:00
         * 2019/8/9 10:35:00
         */
        return new String[]{"0 0/5 * * * ?"};
    }

    @Override
    protected void executeInternal(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        log.info("Database-to-HBase synchronization task started");
        articleQuantityService.dbToHbase();
        log.info("Database-to-HBase synchronization task completed");
    }
}
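Both jobs only supply cron expressions and an `executeInternal` body; the shared parent `AbstractJob` is not shown in this section. A minimal sketch, assuming it is a Quartz `QuartzJobBean` whose `triggerCron()` values are turned into `CronTrigger`s by the project's scheduling configuration (that wiring is an assumption):

import org.springframework.scheduling.quartz.QuartzJobBean;

// Minimal sketch (assumption): subclasses provide the cron expressions,
// and the scheduling configuration builds CronTriggers from them.
public abstract class AbstractJob extends QuartzJobBean {

    /**
     * Cron expressions used to build this job's triggers.
     */
    public abstract String[] triggerCron();
}

Note that `@DisallowConcurrentExecution` on the synchronization job prevents a slow run from overlapping with the next 5-minute trigger, so the same unsynchronized rows are never processed twice in parallel.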
8.2 regularly delete expired data
/**
 * Delete expired data regularly
 */
@Component
@Log4j2
public class MigrationDeleteHotDataQuartz extends AbstractJob {

    @Autowired
    private ApHotArticleService apHotArticleService;

    @Override
    public String[] triggerCron() {
        /**
         * Fires every day at 22:30, e.g.:
         * 2019/8/9 22:30:00
         * 2019/8/10 22:30:00
         * 2019/8/11 22:30:00
         * 2019/8/12 22:30:00
         * 2019/8/13 22:30:00
         */
        return new String[]{"0 30 22 * * ?"};
    }

    @Override
    protected void executeInternal(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        long currentTime = System.currentTimeMillis();
        log.info("Start deleting expired database data");
        deleteExpireHotData();
        log.info("Deleting expired database data finished, time consumed: {}", System.currentTimeMillis() - currentTime);
    }

    /**
     * Delete expired hot data
     */
    public void deleteExpireHotData() {
        List<ApHotArticles> apHotArticlesList = apHotArticleService.selectExpireMonth();
        if (null != apHotArticlesList && !apHotArticlesList.isEmpty()) {
            for (ApHotArticles apHotArticle : apHotArticlesList) {
                apHotArticleService.deleteHotData(apHotArticle);
            }
        }
    }
}
9 Message-driven data synchronization
9.1 Synchronization after a successful article review
9.1.1 message sending
(1) Message name definition and message sending method declaration
maven_test.properties
kafka.topic.article-audit-success=kafka.topic.article.audit.success.sigle.test
kafka.properties
kafka.topic.article-audit-success=${kafka.topic.article-audit-success}
Add a new attribute to com.heima.common.kafka.KafkaTopicConfig
/**
 * Audit-success topic
 */
String articleAuditSuccess;
com.heima.common.kafka.KafkaSender
/**
 * Send the audit-success message
 */
public void sendArticleAuditSuccessMessage(ArticleAuditSuccess message) {
    ArticleAuditSuccessMessage temp = new ArticleAuditSuccessMessage();
    temp.setData(message);
    this.sendMesssage(kafkaTopicConfig.getArticleAuditSuccess(), UUID.randomUUID().toString(), temp);
}
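The envelope type `ArticleAuditSuccessMessage` is not shown in this section. Given how it is filled by the sender (`setData`) and read by the listener (`getData`), a minimal sketch might be (the shape is an assumption inferred from those two call sites):

// Minimal sketch (assumption): a thin envelope carrying the audit-success payload,
// serialized to JSON by the sender and deserialized by the listener.
public class ArticleAuditSuccessMessage {

    private ArticleAuditSuccess data;

    public ArticleAuditSuccess getData() { return data; }
    public void setData(ArticleAuditSuccess data) { this.data = data; }
}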
(2) Modify the automatic review code. Both the crawler side and the self-media side need to be changed.
After the review succeeds, send the message.
Crawler
// Article review succeeded
ArticleAuditSuccess articleAuditSuccess = new ArticleAuditSuccess();
articleAuditSuccess.setArticleId(apArticle.getId());
articleAuditSuccess.setType(ArticleAuditSuccess.ArticleType.CRAWLER);
articleAuditSuccess.setChannelId(apArticle.getChannelId());
kafkaSender.sendArticleAuditSuccessMessage(articleAuditSuccess);
Self-media
// Article review succeeded
ArticleAuditSuccess articleAuditSuccess = new ArticleAuditSuccess();
articleAuditSuccess.setArticleId(apArticle.getId());
articleAuditSuccess.setType(ArticleAuditSuccess.ArticleType.MEDIA);
articleAuditSuccess.setChannelId(apArticle.getChannelId());
kafkaSender.sendArticleAuditSuccessMessage(articleAuditSuccess);
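Both senders populate the same three fields, so the payload class presumably looks roughly like the sketch below (field types are assumptions inferred from the setters above and from the listener passing `articleId` to `dbToHbase(Integer)`):

// Minimal sketch (assumption): the audit-success payload shared by both senders.
public class ArticleAuditSuccess {

    public enum ArticleType { CRAWLER, MEDIA }

    private Integer articleId;
    private ArticleType type;    // distinguishes the crawler and self-media sources
    private Integer channelId;

    public Integer getArticleId() { return articleId; }
    public void setArticleId(Integer articleId) { this.articleId = articleId; }
    public ArticleType getType() { return type; }
    public void setType(ArticleType type) { this.type = type; }
    public Integer getChannelId() { return channelId; }
    public void setChannelId(Integer channelId) { this.channelId = channelId; }
}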
9.1.2 message reception
/**
 * Listener for article audit-success messages
 */
@Component
@Log4j2
public class MigrationAuditSucessArticleListener implements KafkaListener<String, String> {

    /**
     * General conversion mapper
     */
    @Autowired
    ObjectMapper mapper;

    /**
     * Kafka topic configuration
     */
    @Autowired
    KafkaTopicConfig kafkaTopicConfig;

    @Autowired
    private ArticleQuantityService articleQuantityService;

    @Override
    public String topic() {
        return kafkaTopicConfig.getArticleAuditSuccess();
    }

    /**
     * Listen for messages
     *
     * @param data
     * @param consumer
     */
    @Override
    public void onMessage(ConsumerRecord<String, String> data, Consumer<?, ?> consumer) {
        log.info("Kafka audit-success message received: {}", data);
        String value = (String) data.value();
        if (null != value) {
            ArticleAuditSuccessMessage message = null;
            try {
                message = mapper.readValue(value, ArticleAuditSuccessMessage.class);
            } catch (IOException e) {
                // A malformed message would otherwise cause an NPE below
                log.error("Failed to deserialize audit-success message", e);
                return;
            }
            ArticleAuditSuccess auto = message.getData();
            if (null != auto) {
                // Synchronize this article into HBase
                Integer articleId = auto.getArticleId();
                if (null != articleId) {
                    articleQuantityService.dbToHbase(articleId);
                }
            }
        }
    }
}
9.2 hot article synchronization
Create the listener class: com.heima.migration.kafka.listener.MigrationHotArticleListener
/**
 * Hot article listener class
 */
@Component
@Log4j2
public class MigrationHotArticleListener implements KafkaListener<String, String> {

    /**
     * General conversion mapper
     */
    @Autowired
    ObjectMapper mapper;

    /**
     * Kafka topic configuration
     */
    @Autowired
    KafkaTopicConfig kafkaTopicConfig;

    /**
     * Hot article service
     */
    @Autowired
    private ApHotArticleService apHotArticleService;

    @Override
    public String topic() {
        return kafkaTopicConfig.getHotArticle();
    }

    /**
     * Listen for messages
     *
     * @param data
     * @param consumer
     */
    @Override
    public void onMessage(ConsumerRecord<String, String> data, Consumer<?, ?> consumer) {
        log.info("Kafka hot-data synchronization message received: {}", data);
        String value = (String) data.value();
        if (null != value) {
            ApHotArticleMessage message = null;
            try {
                message = mapper.readValue(value, ApHotArticleMessage.class);
            } catch (IOException e) {
                // A malformed message would otherwise cause an NPE below
                log.error("Failed to deserialize hot-article message", e);
                return;
            }
            Integer articleId = message.getData().getArticleId();
            if (null != articleId) {
                // Synchronize the hot data out of HBase
                apHotArticleService.hotApArticleSync(articleId);
            }
        }
    }
}
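As with the audit-success flow, the listener deserializes into `ApHotArticleMessage` and only reads `getData().getArticleId()`. A minimal sketch of that message type, assuming the payload carries just the article id (the nested `Payload` class and its shape are hypothetical; the real project type may wrap a richer object):

// Minimal sketch (assumption): the hot-article message envelope; only
// articleId is read by the listener, so the payload is kept minimal here.
public class ApHotArticleMessage {

    public static class Payload {
        private Integer articleId;
        public Integer getArticleId() { return articleId; }
        public void setArticleId(Integer articleId) { this.articleId = articleId; }
    }

    private Payload data;

    public Payload getData() { return data; }
    public void setData(Payload data) { this.data = data; }
}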