What is model deployment?
In a typical machine learning or deep learning project, the modeling workflow consists of defining the problem, collecting data, understanding the data, processing the data, and building models. To make a model available to end users, however, we also need to deploy it. Model deployment is the work of delivering a machine learning model to customers or stakeholders. It can be roughly divided into the following three steps:
- Model persistence;
Persistence means turning temporary data (such as data in memory, which is lost when the process exits) into durable data (such as records in a database or files on disk, which can be kept long term). A trained model normally lives in memory, so it has to be persisted before it can be deployed. In Python, models are most commonly persisted as files.
- Select a suitable server to load the persisted model;
- Refine the service interface to support data exchange between the front end and the back end.
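The first two steps (persist the model to a file, then load it elsewhere) can be sketched as follows. This is only an illustration under stated assumptions: `ThresholdModel` is a hypothetical toy model invented for this example, and its learned parameter is written to a JSON file so that the sketch needs nothing beyond the standard library. With scikit-learn estimators, the object itself is more commonly serialized with `pickle` or `joblib`.

```python
import json
import os
import tempfile


class ThresholdModel:
    """Hypothetical toy model: predicts 1 when x exceeds a learned threshold."""

    def __init__(self, threshold=None):
        self.threshold = threshold

    def fit(self, xs, ys):
        # Place the threshold midway between the means of the two classes.
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

    def save(self, path):
        # Persist the learned parameter from memory to a file on disk.
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"threshold": self.threshold}, f)

    @classmethod
    def load(cls, path):
        # Rebuild the model from the persisted file, e.g. on the serving host.
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))


# "Training" side: fit the model and persist it.
model = ThresholdModel().fit([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
model.save(model_path)

# "Serving" side: load the persisted model and use it for prediction.
restored = ThresholdModel.load(model_path)
predictions = restored.predict([3.0, 7.0])
```

The restored object has no link to the training code's in-memory state; everything it needs comes from the file, which is what makes it possible to move the model to another machine.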
Introduction to model deployment tools
MLflow
MLeap
PMML
Dependent packages:
- sklearn
- sklearn2pmml
The trained machine learning model is converted to PMML format so that it can be called from Java.
The Python code is as follows:
    from sklearn import tree
    from sklearn.datasets import load_iris
    from sklearn2pmml.pipeline import PMMLPipeline
    from sklearn2pmml import sklearn2pmml

    if __name__ == '__main__':
        iris = load_iris()  # Classic iris dataset
        X = iris.data       # Sample features
        y = iris.target     # Classification target
        # Classify with a decision tree
        pipeline = PMMLPipeline([("classifier", tree.DecisionTreeClassifier())])
        pipeline.fit(X, y)  # Train
        sklearn2pmml(pipeline, "iris.pmml", with_repr=True)  # Write the PMML file
The Java program reads the model file and makes predictions. The code is as follows:
    import org.dmg.pmml.FieldName;
    import org.dmg.pmml.PMML;
    import org.jpmml.evaluator.*;
    import org.xml.sax.SAXException;

    import javax.xml.bind.JAXBException;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.*;

    public class TestPmml {

        public static void main(String args[]) throws Exception {
            String fp = "iris.pmml";
            TestPmml obj = new TestPmml();
            Evaluator model = obj.loadPmml(fp);
            List<Map<String, Object>> inputs = new ArrayList<>();
            inputs.add(obj.getRawMap(5.1, 3.5, 1.4, 0.2));
            inputs.add(obj.getRawMap(4.9, 3, 1.4, 0.2));
            for (int i = 0; i < inputs.size(); i++) {
                Map<String, Object> output = obj.predict(model, inputs.get(i));
                System.out.println("X=" + inputs.get(i) + " -> y=" + output.get("y"));
            }
        }

        private Evaluator loadPmml(String fp) throws FileNotFoundException, JAXBException, SAXException {
            InputStream is = new FileInputStream(fp);
            PMML pmml = org.jpmml.model.PMMLUtil.unmarshal(is);
            try {
                is.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            ModelEvaluatorFactory factory = ModelEvaluatorFactory.newInstance();
            return factory.newModelEvaluator(pmml);
        }

        private Map<String, Object> getRawMap(Object a, Object b, Object c, Object d) {
            Map<String, Object> data = new HashMap<String, Object>();
            data.put("x1", a);
            data.put("x2", b);
            data.put("x3", c);
            data.put("x4", d);
            return data;
        }

        /**
         * Run the model and get the results.
         */
        private Map<String, Object> predict(Evaluator evaluator, Map<String, Object> data) {
            Map<FieldName, FieldValue> input = getFieldMap(evaluator, data);
            Map<String, Object> output = evaluate(evaluator, input);
            return output;
        }

        /**
         * Convert the original input to PMML format input.
         */
        private Map<FieldName, FieldValue> getFieldMap(Evaluator evaluator, Map<String, Object> input) {
            List<InputField> inputFields = evaluator.getInputFields();
            Map<FieldName, FieldValue> map = new LinkedHashMap<FieldName, FieldValue>();
            for (InputField field : inputFields) {
                FieldName fieldName = field.getName();
                Object rawValue = input.get(fieldName.getValue());
                FieldValue value = field.prepare(rawValue);
                map.put(fieldName, value);
            }
            return map;
        }

        /**
         * Run the model and get the results.
         */
        private Map<String, Object> evaluate(Evaluator evaluator, Map<FieldName, FieldValue> input) {
            Map<FieldName, ?> results = evaluator.evaluate(input);
            List<TargetField> targetFields = evaluator.getTargetFields();
            Map<String, Object> output = new LinkedHashMap<String, Object>();
            for (int i = 0; i < targetFields.size(); i++) {
                TargetField field = targetFields.get(i);
                FieldName fieldName = field.getName();
                Object value = results.get(fieldName);
                if (value instanceof Computable) {
                    Computable computable = (Computable) value;
                    value = computable.getResult();
                }
                output.put(fieldName.getValue(), value);
            }
            return output;
        }
    }
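The reason a model trained in Python can be evaluated from Java is that PMML is an XML format, so any language with an XML parser can read it. The sketch below inspects the field declarations of a PMML document using only the Python standard library. The fragment is hand-written and heavily abbreviated for illustration (the model element itself is omitted); it is not the actual output of sklearn2pmml, and the namespace and version shown are assumptions about what a given file declares.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated PMML-style fragment (assumption: PMML 4.4 namespace).
# The field names mirror the x1..x4 inputs and y target used in the Java code.
PMML_FRAGMENT = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="y" optype="categorical" dataType="integer"/>
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="x3" optype="continuous" dataType="double"/>
    <DataField name="x4" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>
""".strip()

# Elements live in the PMML namespace, so lookups need a prefix mapping.
NS = {"pmml": "http://www.dmg.org/PMML-4_4"}

root = ET.fromstring(PMML_FRAGMENT)
fields = {
    field.get("name"): field.get("dataType")
    for field in root.findall("pmml:DataDictionary/pmml:DataField", NS)
}

for name, data_type in fields.items():
    print(f"{name}: {data_type}")
```

Checking the declared field names this way can help when the keys passed from the calling code do not match what the model expects.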
Pyspark
Sklearn
ONNX introduction
The example "Convert a Python model to ONNX format" gives a simple introduction to model conversion. For details, please refer to the official ONNX tutorial.
TensorRT
TensorFlow Serving
Web service deployment
This approach uses a web framework to wrap the prediction model in a web service interface. It is a common way to deploy models online.
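As a rough illustration of wrapping a model in a web service interface, the sketch below exposes a prediction function over HTTP and calls it with a JSON request. Several things here are assumptions made for the example: Python's built-in `http.server` stands in for a real web framework, the `/predict` path and the JSON field names are invented, and `predict` is a hypothetical placeholder for a model that would normally be loaded from a persisted file.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a loaded model: returns 1 if the features sum to more than 10."""
    return 1 if sum(features) > 10 else 0


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read the JSON request body and run the model on its features.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence per-request logging in this sketch.


def call_service(port, features):
    """Client side: POST the features as JSON and decode the JSON response."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/predict",
            body=json.dumps({"features": features}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    port = server.server_address[1]
    results = [
        call_service(port, [5.1, 3.5, 1.4, 0.2]),
        call_service(port, [4.9, 3, 1.4, 0.2]),
    ]
finally:
    server.shutdown()
    server.server_close()

print(results)
```

The caller only needs to know the URL and the JSON shape, not the language or library the model was trained with, which is the main appeal of this deployment style. A production service would add input validation, error handling, and a production-grade server.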
Docker rookie tutorial