ActiveMQ Consumer Plug-in Development Notes

Posted by galmar on Sat, 22 Jan 2022 15:09:53 +0100

Source code

https://github.com/tangwenixng/soyuan-activemq-plugin

Overview

Prerequisites

The plug-in is developed against Kettle 8.1.0.0-365.

It is not guaranteed to work with other versions, because inherited parent classes such as BaseStreamingDialog change between versions.

This plug-in is modeled on the source code of the official Kafka plug-in:

https://github.com/pentaho/big-data-plugin/tree/master/kettle-plugins/kafka

Topics are not supported for now, only queues. You can modify the source code if you need them (the amount of work should be small).

Required template

First, you must create the following four classes:

  • ActiveMQConsumer extends BaseStreamStep implements StepInterface
  • ActiveMQConsumerData extends TransExecutorData implements StepDataInterface
  • ActiveMQConsumerDialog extends BaseStreamingDialog implements StepDialogInterface
  • ActiveMQConsumerMeta extends BaseStreamStepMeta implements StepMetaInterface

Note that these four classes extend special parent classes, unlike ordinary steps: an ordinary plug-in extends the BaseStep*** classes.

Then create the multilingual (resource) configuration file; its structure is shown in the following figure.

Next, we go through the four classes just listed.

ActiveMQConsumerMeta

ActiveMQConsumerMeta is a very important class.

  1. When you click the OK button, the attribute values seen in the visual Dialog (such as Text boxes) are saved to the corresponding member variables in ActiveMQConsumerMeta. When the step Dialog is first opened (that is, when the open method runs, as described later), the member variables are read from ActiveMQConsumerMeta and assigned to the Text boxes.
  2. When you click Save in the Kettle editor, the attributes in ActiveMQConsumerMeta are written to the file (ktr) through the getXML() method. When you click Run, Kettle calls loadXML() to read the contents of the ktr file into the ActiveMQConsumerMeta member variables. readRep and saveRep work in the same way for repositories.

That is the main work of the Meta class. The points that need attention in the code are described in detail below:

Step annotation

@Step(
        id = "ActiveMQConsumer",
        name = "ActiveMQConsumer.TypeLongDesc",
        description = "ActiveMQConsumer.TypeTooltipDesc",
        image = "com/soyuan/steps/activemq/resources/activemq.svg",
        categoryDescription = "i18n:org.pentaho.di.trans.step:BaseStep.Category.Streaming",
        i18nPackageName = "com.soyuan.steps.activemq",
        documentationUrl = "ActiveMQConsumer.DocumentationURL",
        casesUrl = "ActiveMQConsumer.CasesURL",
        forumUrl = "ActiveMQConsumer.ForumURL"
)
@InjectionSupported(localizationPrefix = "ActiveMQConsumerMeta.Injection.")

The @Step annotation declares a step. Kettle automatically scans for this annotation and registers the class in the plug-in container.

  • id: must be globally unique
  • name: the plug-in name shown in the visual interface. The value ActiveMQConsumer.TypeLongDesc refers to a key in the properties resource file
  • @InjectionSupported(localizationPrefix = "ActiveMQConsumerMeta.Injection."): the prefix ActiveMQConsumerMeta.Injection. is used together with the member variables in ActiveMQConsumerMeta. For example:
/**
     * Connection address
     */
@Injection( name = "BROKER_URL" )
private String brokerUrl;

Here BROKER_URL is combined with the prefix ActiveMQConsumerMeta.Injection. to form the key ActiveMQConsumerMeta.Injection.BROKER_URL.

This key is also defined in the properties resource file.
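To make the key composition concrete, here is a minimal standalone sketch. It does not use Kettle at all; it only mimics the "prefix + injection name" lookup with java.util.Properties. The class name InjectionKeyDemo, the helper injectionKey and the sample property values are assumptions made up for illustration, not part of the plug-in.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Standalone illustration only: Kettle resolves these keys itself.
// This class just mimics the "prefix + injection name" composition.
class InjectionKeyDemo {
    // Same prefix as in @InjectionSupported(localizationPrefix = ...)
    static final String PREFIX = "ActiveMQConsumerMeta.Injection.";

    // Hypothetical helper: builds the resource key for one @Injection name
    static String injectionKey(String injectionName) {
        return PREFIX + injectionName;
    }

    // Hypothetical sample of a messages properties file (values are made up)
    static Properties sampleMessages() {
        String content =
            "ActiveMQConsumer.TypeLongDesc=ActiveMQ Consumer\n"
          + "ActiveMQConsumerMeta.Injection.BROKER_URL=Broker connection URL\n"
          + "ActiveMQConsumerMeta.Injection.QUEUE=Queue name\n";
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties messages = sampleMessages();
        // Look up the description that belongs to @Injection(name = "BROKER_URL")
        System.out.println(messages.getProperty(injectionKey("BROKER_URL")));
    }
}
```

If a key is missing from the properties file, getProperty returns null in this sketch, which is a quick way to spot an @Injection name without a matching entry.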

Construction method

public ActiveMQConsumerMeta() {
  super();
  ...
  setSpecificationMethod(ObjectLocationSpecificationMethod.FILENAME);
}
  • Note: the call setSpecificationMethod(ObjectLocationSpecificationMethod.FILENAME) is required. The ObjectLocationSpecificationMethod.FILENAME value set here is later used by ActiveMQConsumerDialog.getData()

Interface method

@Override
public StepInterface getStep(StepMeta stepMeta, StepDataInterface stepDataInterface, int copyNr, TransMeta transMeta, Trans trans) {
  return new ActiveMQConsumer(stepMeta, stepDataInterface, copyNr, transMeta, trans);
}

@Override
public StepDataInterface getStepData() {
  return new ActiveMQConsumerData();
}

These two methods are required by the interface. Just follow the template.

Member variable

Look at the code comments

//Fixed idiom: the BaseMessages class uses PKG to read values from the properties file
private static Class<?> PKG = ActiveMQConsumerMeta.class;

/**
 * The following constants define the tag names used in the XML
 */
public static final String BROKER_URL = "brokerUrl";
public static final String QUEUE_NAME = "queue";

public static final String TRANSFORMATION_PATH = "transformationPath";
public static final String BATCH_SIZE = "batchSize";
public static final String BATCH_DURATION = "batchDuration";

public static final String OUTPUT_FIELD_TAG_NAME = "OutputField";
public static final String INPUT_NAME_ATTRIBUTE = "input";
public static final String TYPE_ATTRIBUTE = "type";

public static final String ADVANCED_CONFIG = "advancedConfig" ;
private static final String CONFIG_OPTION = "option";
private static final String OPTION_PROPERTY = "property";
private static final String OPTION_VALUE = "value";


/**
 * Connection address
 */
@Injection( name = "BROKER_URL" )
private String brokerUrl;

/**
 * Queue name
 */
@Injection(name="QUEUE")
private String queue;

/**
 * Injected configuration. Note: transient.
 * The values are assigned in the Dialog.
 */
@Injection(name = "NAMES", group = "CONFIGURATION_PROPERTIES")
protected transient List<String> injectedConfigNames;

@Injection(name = "VALUES", group = "CONFIGURATION_PROPERTIES")
protected transient List<String> injectedConfigValues;

private ActiveMQConsumerField msgIdField;
private ActiveMQConsumerField msgField;
private ActiveMQConsumerField timestampField;

/**
 * Stores the advancedConfig options that are saved in the XML
 */
private Map<String, String> config = new LinkedHashMap<>();

brokerUrl, queue, config, msgIdField and the other member variables are the core. They are passed between the Dialog and ActiveMQConsumer (the StepInterface).

injectedConfigNames and injectedConfigValues only help to build the config variable (they can be dropped).

The config variable corresponds to the attributes in the Options tab, and its entries can be changed (deleted or added).

msgField is wrapped in the ActiveMQConsumerField class, whose Name and Type enums make it easy to extend and pass around (more details later).
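The tag constants above name the XML elements that getXML() writes and loadXML() reads back. The round trip can be sketched without Kettle as follows. This is only an illustration under assumptions: the class MetaXmlSketch, the methods toXml and fromXml, the wrapping step element and the sample values are all hypothetical, and the real plug-in uses Kettle's own XML helpers rather than the JDK parser shown here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Standalone sketch: mimics what getXML()/loadXML() do, without Kettle.
class MetaXmlSketch {
    // Tag names taken from the constants listed above
    static final String BROKER_URL = "brokerUrl";
    static final String QUEUE_NAME = "queue";

    String brokerUrl;
    String queue;

    // Hypothetical stand-in for getXML(): one tag per member variable
    String toXml() {
        return "<step>" + tag(BROKER_URL, brokerUrl) + tag(QUEUE_NAME, queue) + "</step>";
    }

    // Hypothetical stand-in for loadXML(): read each tag back into its member variable
    static MetaXmlSketch fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            MetaXmlSketch meta = new MetaXmlSketch();
            meta.brokerUrl = text(doc, BROKER_URL);
            meta.queue = text(doc, QUEUE_NAME);
            return meta;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse step XML", e);
        }
    }

    // Writes one element, escaping the characters that are special in XML text
    private static String tag(String name, String value) {
        String safe = value == null ? "" : value
            .replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        return "<" + name + ">" + safe + "</" + name + ">";
    }

    private static String text(Document doc, String name) {
        return doc.getElementsByTagName(name).item(0).getTextContent();
    }

    public static void main(String[] args) {
        MetaXmlSketch meta = new MetaXmlSketch();
        meta.brokerUrl = "tcp://localhost:61616"; // example value, not from the plug-in
        meta.queue = "demo.queue";                // example value, not from the plug-in
        System.out.println(meta.toXml());
    }
}
```

The point of the sketch is the symmetry: every member variable that toXml writes under a tag must be read back under the same tag in fromXml, otherwise the value is lost when the ktr file is reopened.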

Other methods

@Override
public RowMeta getRowMeta(String origin, VariableSpace space) throws KettleStepException {
  RowMeta rowMeta = new RowMeta();
  putFieldOnRowMeta(getMsgIdField(), rowMeta, origin, space);
  putFieldOnRowMeta(getMsgField(), rowMeta, origin, space);
  putFieldOnRowMeta(getTimestampField(), rowMeta, origin, space);
  return rowMeta;
}

private void putFieldOnRowMeta(ActiveMQConsumerField field, RowMetaInterface rowMeta,
                               String origin, VariableSpace space) throws KettleStepException {
  if (field != null && !Utils.isEmpty(field.getOutputName())) {
    try {
      String value = space.environmentSubstitute(field.getOutputName());
      ValueMetaInterface v = ValueMetaFactory.createValueMeta(value,
                                                              field.getOutputType().getValueMetaInterfaceType());
      //Record the step name as the origin of this value
      v.setOrigin(origin);
      rowMeta.addValueMeta(v);
    } catch (KettlePluginException e) {
      throw new KettleStepException(BaseMessages.getString(
        PKG,
        "ActiveMQConsumerInputMeta.UnableToCreateValueType",
        field
      ), e);
    }
  }
}

public List<ActiveMQConsumerField> getFieldDefinitions() {
  return Lists.newArrayList(getMsgIdField(), getMsgField(), getTimestampField());
}

protected void setField(ActiveMQConsumerField field) {
  field.getInputName().setFieldOnMeta(this, field);
}

  • getRowMeta returns the output fields, that is, which columns a row of data consists of. It is called during step initialization (ActiveMQConsumer#init).
  • putFieldOnRowMeta assembles one column (its name and type).
  • getFieldDefinitions returns the list of output fields (simply a list of the member variables).
  • setField(ActiveMQConsumerField) is the more flexible part; it is described later.

ActiveMQConsumerDialog

ActiveMQConsumerDialog extends BaseStreamingDialog. BaseStreamingDialog already implements the open method, so you do not need to write it again. You only need to override the following methods:

  • getDialogTitle(): sets the title
  • buildSetup(Composite wSetupComp): builds the Setup page (the essential information: server address, queue name)
  • getData(): fills the Text widgets of the Setup page, and the widgets of any other tabs (including those of the parent class), with the values from the meta
  • createAdditionalTabs(): creates additional tabs
  • additionalOks(BaseStreamStepMeta meta): called when OK is clicked; saves the data in the Dialog (the Setup page and the additional tab pages) to the meta
  • getFieldNames(): if a Field tab is created, returns the output names from that tab (column 2)
  • getFieldTypes(): if a Field tab is created, returns the types from that tab (column 3)

Construction method

public ActiveMQConsumerDialog(Shell parent, Object in, TransMeta tr, String sname) {
  super(parent, in, tr, sname);
  this.consumerMeta = (ActiveMQConsumerMeta) in;
}

Note that the second parameter is declared as Object (it is actually the ActiveMQConsumerMeta object).

getData()

@Override
protected void getData() {
  ...
  switch ( specificationMethod ) {
    case FILENAME:
      wTransPath.setText(Const.NVL(meta.getFileName(), ""));
      break;
    case REPOSITORY_BY_NAME:
      String fullPath = Const.NVL(meta.getDirectoryPath(), "") + "/" + Const.NVL(meta.getTransName(), "");
      wTransPath.setText(fullPath);
      break;
    case REPOSITORY_BY_REFERENCE:
      referenceObjectId = meta.getTransObjectId();
      getByReferenceData(referenceObjectId);
      break;
    default:
      break;
  }
  ...
}

This block can be copied as is.

additionalOks()

Saves the data in the Dialog to the meta: both the Setup page and the additional tab pages.

@Override
protected void additionalOks(BaseStreamStepMeta meta) {
  consumerMeta.setBrokerUrl(wBrokerUrl.getText());
  consumerMeta.setQueue(wQueue.getText());
  //Set the field value to meta
  setFieldsFromTable();
  //Set the value in option to meta
  setOptionsFromTable();
}

Note the setFieldsFromTable() method, which saves the fields:

/**
 * Set the field value to meta
 */
private void setFieldsFromTable() {
  int itemCount = fieldsTable.getItemCount();
  for (int rowIndex = 0; rowIndex < itemCount; rowIndex++) {
    TableItem row = fieldsTable.getTable().getItem(rowIndex);
    String inputName = row.getText(1);
    String outputName = row.getText(2);
    String outputType = row.getText(3);

    final ActiveMQConsumerField.Name ref = ActiveMQConsumerField.Name.valueOf(inputName.toUpperCase());

    final ActiveMQConsumerField field = new ActiveMQConsumerField(ref, outputName,
                                                                ActiveMQConsumerField.Type.valueOf(outputType));
    consumerMeta.setField(field);
  }
}

Each row in the Field table is turned into an ActiveMQConsumerField object, which is then set on the meta.

consumerMeta.setField(field) eventually calls a specific setter such as consumerMeta.setMsgField(). Study the ActiveMQConsumerField class carefully to see how.
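The dispatch from setField to a specific setter can be sketched as an enum whose constants each know which member variable they belong to. This is a simplified standalone illustration, not the plug-in's actual code: the classes Field and Meta below are hypothetical stand-ins for ActiveMQConsumerField and ActiveMQConsumerMeta, and only the names setFieldOnMeta, getFieldFromMeta, MESSAGEID, MESSAGE and TIMESTAMP are taken from the snippets in this post.

```java
// Standalone sketch of the "enum constant picks the setter" pattern.
class FieldDispatchSketch {

    // Simplified stand-in for ActiveMQConsumerField
    static class Field {
        final Name inputName;
        final String outputName;

        Field(Name inputName, String outputName) {
            this.inputName = inputName;
            this.outputName = outputName;
        }
    }

    // Simplified stand-in for ActiveMQConsumerMeta
    static class Meta {
        Field msgIdField;
        Field msgField;
        Field timestampField;

        // One generic entry point; the enum constant decides which variable is set
        void setField(Field field) {
            field.inputName.setFieldOnMeta(this, field);
        }
    }

    // Each constant knows how to store itself on, and read itself from, the meta
    enum Name {
        MESSAGEID {
            void setFieldOnMeta(Meta meta, Field field) { meta.msgIdField = field; }
            Field getFieldFromMeta(Meta meta) { return meta.msgIdField; }
        },
        MESSAGE {
            void setFieldOnMeta(Meta meta, Field field) { meta.msgField = field; }
            Field getFieldFromMeta(Meta meta) { return meta.msgField; }
        },
        TIMESTAMP {
            void setFieldOnMeta(Meta meta, Field field) { meta.timestampField = field; }
            Field getFieldFromMeta(Meta meta) { return meta.timestampField; }
        };

        abstract void setFieldOnMeta(Meta meta, Field field);
        abstract Field getFieldFromMeta(Meta meta);
    }

    public static void main(String[] args) {
        Meta meta = new Meta();
        // Same lookup style as in setFieldsFromTable(): text from the table row to enum constant
        Name ref = Name.valueOf("message".toUpperCase());
        meta.setField(new Field(ref, "msg"));
        System.out.println(meta.msgField.outputName);
    }
}
```

With this design, adding a new output field means adding one enum constant and one member variable; the Dialog code that loops over the table rows does not need to change.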

getFieldNames()

From the description, getFieldNames() and getFieldTypes() simply extract the values in the Field tab, but what are they actually used for?

As shown in the figure above, after you click New and save, the values in the Field tab appear in the Get records from stream step of the new file.

ActiveMQConsumerData

ActiveMQConsumerData extends TransExecutorData and has only one member variable, RowMetaInterface outputRowMeta, which stores the row metadata.

ActiveMQConsumer

ActiveMQConsumer extends BaseStreamStep, so there is no need to override processRow(); only the init() method needs to be overridden.

@Override
public boolean init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface) {
  ActiveMQConsumerMeta meta = (ActiveMQConsumerMeta) stepMetaInterface;
  ActiveMQConsumerData data = (ActiveMQConsumerData) stepDataInterface;
  if (!super.init(meta,data)){
    logError(BaseMessages.getString(PKG, "ActiveMQConsumer.Error.InitFailed"));
    return false;
  }
  try {
    //Create [row metadata] - that is, which fields are exported
    data.outputRowMeta = meta.getRowMeta(getStepname(), this);
  } catch (KettleStepException e) {
    log.logError(e.getMessage(), e);
  }

  //Create activemq connection
  final Connection connection;
  try {
    connection = ActiveMQFactory.getConn(meta.getActiveMQEntity());
    //subtransExecutor: executes the sub-transformation
    window = new FixedTimeStreamWindow<>(
      subtransExecutor,
      data.outputRowMeta,
      getDuration(),
      getBatchSize());

    source = new ActiveMQStreamSource(connection, meta, data, this);
  } catch (JMSException e) {
    log.logError(e.getMessage(),e);
    return false;
  }
  return true;
}

That is the whole init method. Let's go through it piece by piece.

try {
  //Create [row metadata] - that is, which fields are exported
  data.outputRowMeta = meta.getRowMeta(getStepname(), this);
} catch (KettleStepException e) {
  log.logError(e.getMessage(), e);
}

meta.getRowMeta(getStepname(), this) was introduced above in ActiveMQConsumerMeta. It builds the row metadata, that is, the column names and types.

connection = ActiveMQFactory.getConn(meta.getActiveMQEntity()) reads the server address, queue name and other information from the meta and uses it to obtain the connection.

//subtransExecutor: executes the sub-transformation
window = new FixedTimeStreamWindow<>(
  subtransExecutor,
  data.outputRowMeta,
  getDuration(),
  getBatchSize());

Write it exactly like this; it passes data.outputRowMeta (the row metadata) to the window.

source = new ActiveMQStreamSource(connection, meta, data, this);

source is a member variable of the parent class BaseStreamStep, declared as protected StreamSource<List<Object>> source, so our ActiveMQStreamSource is an implementation of StreamSource<List<Object>>.

Its main responsibility is to consume ActiveMQ messages and hand them to the window. You do not need to care about how they are passed on.

Let's now look at the ActiveMQStreamSource code.

ActiveMQStreamSource

There is such a piece of code in the open() method:

final List<ValueMetaInterface> valueMetas = consumerData.outputRowMeta.getValueMetaList();
positions = new HashMap<>(valueMetas.size());

for (int i = 0; i < valueMetas.size(); i++) {
  for (ActiveMQConsumerField.Name name : ActiveMQConsumerField.Name.values()) {
    final ActiveMQConsumerField field = name.getFieldFromMeta(consumerMeta);
    String outputName = field.getOutputName();
    if (outputName != null && outputName.equals(valueMetas.get(i).getName())) {
      positions.putIfAbsent(name, i);
    }
  }
}

The purpose is to find the position of each column in the output row, for example Message at position 1 and MessageId at position 2.
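The lookup can be tried out in isolation with the following sketch. It is a simplified illustration under assumptions: PositionsSketch and its positions method are hypothetical, plain strings stand in for ValueMetaInterface and ActiveMQConsumerField, and an EnumMap is used here instead of the HashMap in the original code. Indexes in the sketch are zero-based list indexes.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Standalone sketch: maps each configured field to its column index in the output row.
class PositionsSketch {
    enum Name { MESSAGEID, MESSAGE, TIMESTAMP }

    // configuredNames: output name chosen for each field (fields may be missing)
    // columns: column names of the output row, in order
    static Map<Name, Integer> positions(Map<Name, String> configuredNames, List<String> columns) {
        Map<Name, Integer> positions = new EnumMap<>(Name.class);
        for (int i = 0; i < columns.size(); i++) {
            for (Name name : Name.values()) {
                String outputName = configuredNames.get(name);
                if (outputName != null && outputName.equals(columns.get(i))) {
                    // Keep the first matching column if a name appears twice
                    positions.putIfAbsent(name, i);
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        Map<Name, String> configured = new EnumMap<>(Name.class);
        configured.put(Name.MESSAGEID, "MessageId");
        configured.put(Name.MESSAGE, "Message");
        System.out.println(positions(configured, Arrays.asList("MessageId", "Message")));
    }
}
```

A field that has no output name configured simply gets no entry in the map, which is what lets the row-building code skip it later.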

callable = new ActiveMQConsumerCallable(connection, super::close);
future = executorService.submit(callable);

The actual consuming thread is ActiveMQConsumerCallable:

while (!closed.get()) {
  final TextMessage msg = (TextMessage) consumer.receive(1000L);
  if (msg != null) {
    List<List<Object>> rows = new ArrayList<>(1);

    final List<Object> row = processMessageAsRow(msg);
    rows.add(row);

    acceptRows(rows);

    session.commit();
  }
}

The loop keeps polling ActiveMQ. When a message arrives, it calls processMessageAsRow(msg) to convert it into a row and then acceptRows(rows) to pass it on to the subsequent steps.
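The shape of this loop (poll with a timeout, skip when nothing arrived, stop when a closed flag is set) can be tried without a broker. In the sketch below a BlockingQueue stands in for the JMS consumer and a java.util.function.Consumer stands in for acceptRows; the class PollingLoopSketch and all of its members are assumptions for illustration only. The real code additionally commits the JMS session after each accepted message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Standalone sketch of the polling loop; no JMS involved.
class PollingLoopSketch implements Callable<Void> {
    private final BlockingQueue<String> broker;               // stands in for the JMS consumer
    private final Consumer<List<List<Object>>> acceptRows;    // stands in for acceptRows(rows)
    private final AtomicBoolean closed = new AtomicBoolean(false);

    PollingLoopSketch(BlockingQueue<String> broker, Consumer<List<List<Object>>> acceptRows) {
        this.broker = broker;
        this.acceptRows = acceptRows;
    }

    // Asks the loop to stop after the current poll
    void close() {
        closed.set(true);
    }

    @Override
    public Void call() {
        while (!closed.get()) {
            String msg;
            try {
                // Like a receive with timeout: null when nothing arrives in time
                msg = broker.poll(100, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop
                break;
            }
            if (msg != null) {
                List<Object> row = new ArrayList<>();
                row.add(msg);
                List<List<Object>> rows = new ArrayList<>(1);
                rows.add(row);
                acceptRows.accept(rows);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("hello");
        PollingLoopSketch[] loop = new PollingLoopSketch[1];
        // Stop after the first message so the demo terminates
        loop[0] = new PollingLoopSketch(queue, rows -> {
            System.out.println(rows);
            loop[0].close();
        });
        loop[0].call();
    }
}
```

The timeout matters: a blocking receive without one would never re-check the closed flag, so the step could not shut down while the queue is idle.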

List<Object> processMessageAsRow(TextMessage msg) throws JMSException {
  Object[] rowData = RowDataUtil.allocateRowData(consumerData.outputRowMeta.size());

  if (positions.get(ActiveMQConsumerField.Name.MESSAGEID) != null) {
    rowData[positions.get(ActiveMQConsumerField.Name.MESSAGEID)] = msg.getJMSMessageID();
  }

  if (positions.get(ActiveMQConsumerField.Name.MESSAGE) != null) {
    rowData[positions.get(ActiveMQConsumerField.Name.MESSAGE)] = msg.getText();
  }

  if (positions.get(ActiveMQConsumerField.Name.TIMESTAMP) != null) {
    rowData[positions.get(ActiveMQConsumerField.Name.TIMESTAMP)] = msg.getJMSTimestamp();
  }

  return Arrays.asList(rowData);
}

processMessageAsRow puts the data obtained from ActiveMQ into the corresponding columns, which is why the positions map (positions = new HashMap<>(valueMetas.size())) is built at the beginning.
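The same row-filling step can be exercised without JMS. In this sketch FakeMessage is a hypothetical stand-in for TextMessage, and RowFillSketch, toRow and put are made-up names; only the idea of writing each value at its precomputed position comes from the code above. Unlike the original, the sketch looks up each position once instead of twice.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Standalone sketch: fills a row array using precomputed column positions.
class RowFillSketch {
    enum Name { MESSAGEID, MESSAGE, TIMESTAMP }

    // Hypothetical stand-in for a JMS TextMessage
    static class FakeMessage {
        final String id;
        final String text;
        final long timestamp;

        FakeMessage(String id, String text, long timestamp) {
            this.id = id;
            this.text = text;
            this.timestamp = timestamp;
        }
    }

    static List<Object> toRow(FakeMessage msg, Map<Name, Integer> positions, int rowSize) {
        Object[] row = new Object[rowSize];
        put(row, positions.get(Name.MESSAGEID), msg.id);
        put(row, positions.get(Name.MESSAGE), msg.text);
        put(row, positions.get(Name.TIMESTAMP), msg.timestamp);
        return Arrays.asList(row);
    }

    // Writes the value only if the field is part of the output row
    private static void put(Object[] row, Integer position, Object value) {
        if (position != null) {
            row[position] = value;
        }
    }

    public static void main(String[] args) {
        Map<Name, Integer> positions = new EnumMap<>(Name.class);
        positions.put(Name.MESSAGE, 0);
        positions.put(Name.TIMESTAMP, 1);
        System.out.println(toRow(new FakeMessage("ID:1", "hello", 123L), positions, 2));
    }
}
```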

So far, the main steps of ActiveMQ Consumer plug-in development have been introduced.

Topics: Java kettle