1, Code encapsulation and external interfaces
When encapsulating the code, note that many computed results across the project are dumped to local disk so that they are not recomputed on every run. These dumped results should therefore be loaded into memory once in advance, rather than calling load on every request.
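One way to keep a dumped result in memory is to wrap the load in a cached function, so the file is read only on first use. This is a minimal sketch that assumes the result was saved with pickle; `load_cached` and the demo file are hypothetical names, not part of the project.

```python
import functools
import os
import pickle
import tempfile


@functools.lru_cache(maxsize=None)
def load_cached(path):
    """Read a pickled result from disk once; later calls reuse the in-memory object."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Demo: dump a computed result once, then load it repeatedly.
demo_path = os.path.join(tempfile.mkdtemp(), "result.pkl")
with open(demo_path, "wb") as f:
    pickle.dump({"python": 0}, f)

first = load_cached(demo_path)
second = load_cached(demo_path)  # served from the cache, no second disk read
```

Loading inside `__init__` of a long-lived object, as the classes below do, achieves the same effect.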
1. Complete intent identification code encapsulation
Complete the code that determines the user's intent, that is, classify the user's input sentence with the fastText model.

    import fastText
    import re
    from lib import jieba_cut

    # Two classifiers: one for the first segmentation, one for the second.
    # The original assigned both models to fc_word_mode, which overwrote the first.
    fc_mode = fastText.load_model("./classify/data/ft_classify.model")
    fc_word_mode = fastText.load_model("./classify/data/ft_classify_words.model")


    def is_QA(sentence_info):
        python_qs_list = [" ".join(sentence_info["cuted_sentence"])]
        result = fc_mode.predict(python_qs_list)

        python_qs_list = [" ".join(sentence_info["cuted_word_sentence"])]
        words_result = fc_word_mode.predict(python_qs_list)

        for index, (label, acc, word_label, word_acc) in enumerate(zip(*result, *words_result)):
            label = label[0]
            acc = acc[0]
            word_label = word_label[0]
            word_acc = word_acc[0]
            # Express both predictions as the probability of __label__QA:
            # if the predicted label is __label__chat, then P(QA) = 1 - P(chat)
            if label == "__label__chat":
                label = "__label__QA"
                acc = 1 - acc
            if word_label == "__label__chat":
                word_label = "__label__QA"
                word_acc = 1 - word_acc
            if acc > 0.95 or word_acc > 0.95:
                # It's QA
                return True
            else:
                return False
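The decision rule in `is_QA` can be isolated from the models and checked on its own. The sketch below assumes a two-class classifier (`__label__QA` and `__label__chat`) and uses hypothetical helper names; it takes `(label, probability)` pairs instead of calling fastText.

```python
def qa_probability(label, prob):
    """Return P(QA) given the predicted label and its probability (two-class assumption)."""
    return prob if label == "__label__QA" else 1.0 - prob


def is_qa(first_pred, second_pred, threshold=0.95):
    """Each argument is a (label, probability) pair from one of the two classifiers."""
    return (qa_probability(*first_pred) > threshold
            or qa_probability(*second_pred) > threshold)


# One classifier is almost sure the input is not chat, so it counts as QA.
confident = is_qa(("__label__chat", 0.02), ("__label__chat", 0.60))
# Neither classifier reaches the threshold.
unsure = is_qa(("__label__chat", 0.90), ("__label__QA", 0.50))
```

With the 0.95 threshold, a single confident classifier is enough to route the message to the question answering branch.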
2. Complete the encapsulation of chatbot code
Expose a predict interface.

    """
    Prepare the chat model
    """
    import pickle
    from lib import jieba_cut
    import numpy as np
    from chatbot import Sequence2Sequence


    class Chatbot:
        def __init__(self, ws_path="./chatbot/data/ws.pkl", save_path="./chatbot/model/seq2seq_chatbot.ckpt"):
            # Load the vocabulary once, when the object is created
            with open(ws_path, "rb") as f:
                self.ws_chatbot = pickle.load(f)
            self.save_path = save_path
            #TODO .....

        def predict(self, s):
            """
            :param s: sentence that has not been segmented
            :return: the generated reply
            """
            #TODO ...
            return ans
3. Complete the encapsulation of the recall stage of the question answering system

    """
    Recall method
    """
    import os
    import pickle


    class Recall:
        def __init__(self, topk=20):
            # Prepare the models and data needed for Q&A
            self.topk = topk

        def predict(self, sentence):
            """
            :param sentence:
            :return: [recall list],[entity]
            """
            #TODO recall
            return recall_list

        def get_answer(self, s):
            return self.QA_dict[s]
4. Complete the encapsulation of the question and answer ranking model

    """
    Deep learning ranking
    """
    import tensorflow as tf
    import pickle
    from DNN2 import SiamsesNetwork
    from lib import jieba_cut


    class DNNSort():
        def __init__(self):
            # The mean of the word-level and character-level model scores is the final result
            self.dnn_sort_words = DNNSortWords()
            self.dnn_sort_single_word = DNNSortSingleWord()

        def predict(self, s, c_list):
            sort1 = self.dnn_sort_words.predict(s, c_list)
            sort2 = self.dnn_sort_single_word.predict(s, c_list)
            for i in sort1:
                sort1[i] = (sort1[i] + sort2[i]) / 2
            sorts = sorted(sort1.items(), key=lambda x: x[-1], reverse=True)
            # Best candidate and its averaged score
            return sorts[0][0], sorts[0][1]


    class DNNSortWords:
        def __init__(self, ws_path="./DNN2/data/ws_80000.pkl", save_path="./DNN2/model_keras/esim_model_softmax.ckpt"):
            with open(ws_path, "rb") as f:
                self.ws = pickle.load(f)
            self.save_path = save_path
            #TODO ...

        def predict(self, s, c_list):
            """
            :param s: sentence that has not been segmented
            :param c_list: list of candidates to compare against
            :return: dict mapping each candidate to its similarity
            """
            #TODO ...
            return sim_dict


    class DNNSortSingleWord:
        def __init__(self, ws_path="./DNN2/data/ws_word.pkl", save_path="./DNN2/data/esim_word_model_softmax.ckpt"):
            with open(ws_path, "rb") as f:
                self.ws = pickle.load(f)
            self.save_path = save_path
            #TODO ...

        def predict(self, s, c_list):
            """
            :param s: sentence that has not been segmented
            :param c_list: list of candidates to compare against
            :return: dict mapping each candidate to its similarity
            """
            #TODO ...
            return sim_dict
5. Realize the saving of chat records
For each user, consecutive messages less than 10 minutes apart are treated as one round of conversation. If no message arrives within 10 minutes, the round is considered finished, and a message sent after that starts the next round. Saving the chat topic of each round makes basic dialogue management possible later on. For example, if the user has just asked a question about Python and a follow-up question contains no subject, the entity Python stored in redis is used as its subject.
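The subject fallback from the example can be sketched as a small helper. `resolve_entity` is a hypothetical name; it assumes entities are kept as lists of strings, with the previous round's list read back from redis.

```python
def resolve_entity(current_entities, last_entities):
    """Use the entities found in the current message, else fall back to the previous ones."""
    return current_entities if current_entities else last_entities


last = ["python"]
follow_up = resolve_entity([], last)        # question with no subject of its own
new_topic = resolve_entity(["java"], last)  # question that names a new subject
```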
The main implementation logic is:
- redis is used to store basic user data
- Using mongodb to store conversation records
The specific ideas are as follows:
- Obtain the conversation id from the user id, and check whether a current conversation exists
- If the conversation id exists:
    - Update the entity of the conversation and the time of the last message, and reset the expiration time of the conversation id
    - Save data to mongodb
- If the conversation id does not exist:
    - Create the user's basic information (user_id, entity, conversation time)
    - Store the user's basic information in redis, and set the conversation id and its expiration time at the same time
    - Save data to mongodb

    """
    Get and update user information
    """
    from pymongo import MongoClient
    import redis
    from uuid import uuid1
    import time
    import json

    """
    ### redis
    {
        user_id: "id",
        user_background: {},
        last_entity: [],
        last_conversation_time: int(time)
    }
    userid_conversation_id: ""

    ### mongodb stores conversation records
    {user_id:, conversation_id:, from: user/bot, message: "", create_time, entity: [], attention: []}
    """

    HOST = "localhost"
    # 10 minutes. If there is no message for 10 consecutive minutes, the session is over
    CNVERSION_EXPERID_TIME = 60 * 10


    class MessageManager:
        def __init__(self):
            self.client = MongoClient(host=HOST)
            self.m = self.client["toutiao"]["dialogue"]
            self.r = redis.Redis(host=HOST, port=6379, db=10)

        def last_entity(self, user_id):
            """Entity from the previous message"""
            # The field is written as "last_entity" below, so read the same name
            return json.loads(self.r.hget(user_id, "last_entity"))

        def gen_conversation_id(self):
            return uuid1().hex

        def bot_message_pipeline(self, user_id, message):
            """Save the reply record of the robot"""
            conversation_id_key = "{}_conversion_id".format(user_id)
            conversation_id = self.user_exist(conversation_id_key)
            if conversation_id:
                # Reset the expiration time of the conversation id
                self.r.expire(conversation_id_key, CNVERSION_EXPERID_TIME)
                data = {"user_id": user_id,
                        "conversation_id": conversation_id,
                        "from": "bot",
                        "message": message,
                        "create_time": int(time.time()),
                        }
                # insert_one replaces Collection.save, which PyMongo 4 removed
                self.m.insert_one(data)
            else:
                raise ValueError("No session id, but the robot tried to reply....")

        def user_message_pipeline(self, user_id, message, create_time, attention, entity=None):
            # Identify user related information
            # 1. Does the user exist
            # 2.1 if the user exists, return the latest entity of the user and save the latest conversation
            # 3.1 judge whether it is a new conversation. If it is, open a new one and update the user's conversation information
            # 3.2 if it is not a new conversation, update the conversation information of the user
            # 3. Update the user's basic information
            # 4. Return user related information
            # 5. Call the prediction interface and send the dialog structure
            entity = entity or []  # avoid a mutable default argument

            # The data to be saved is still missing conversation_id
            data = {"user_id": user_id,
                    "from": "user",
                    "message": message,
                    "create_time": create_time,
                    "entity": json.dumps(entity),
                    "attention": attention,
                    }

            conversation_id_key = "{}_conversion_id".format(user_id)
            conversation_id = self.user_exist(conversation_id_key)
            print("conversation_id", conversation_id)
            if conversation_id:
                if entity:
                    # Update the current user's last_entity
                    self.r.hset(user_id, "last_entity", json.dumps(entity))
                # Update last conversation time
                self.r.hset(user_id, "last_conversion_time", create_time)
                # Reset the expiration time of the conversation
                self.r.expire(conversation_id_key, CNVERSION_EXPERID_TIME)

                # Save chat records to mongodb
                data["conversation_id"] = conversation_id
                self.m.insert_one(data)
                print("mongodb data saved successfully")
            else:
                # The conversation does not exist
                user_basic_info = {
                    "user_id": user_id,
                    "last_conversion_time": create_time,
                    "last_entity": json.dumps(entity)
                }
                # hset(..., mapping=...) replaces the deprecated hmset (redis-py >= 3.5)
                self.r.hset(user_id, mapping=user_basic_info)
                print("redis stored user_basic_info successfully")
                conversation_id = self.gen_conversation_id()
                print("generate conversation_id", conversation_id)

                # Set the id of the session together with its expiration time
                self.r.set(conversation_id_key, conversation_id, ex=CNVERSION_EXPERID_TIME)
                # Save chat records to mongodb
                data["conversation_id"] = conversation_id
                self.m.insert_one(data)
                print("mongodb data saved successfully")

        def user_exist(self, conversation_id_key):
            """
            Determine whether the user has an active conversation
            :param conversation_id_key: redis key that holds the user's conversation id
            :return: the conversation id, or None if there is no active conversation
            """
            conversation_id = self.r.get(conversation_id_key)
            if conversation_id:
                conversation_id = conversation_id.decode()
            print("load conversation_id", conversation_id)
            return conversation_id
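To try the 10-minute window without running Redis or MongoDB, the expiring key can be replaced by an in-memory stand-in. The sketch below is an assumption-laden illustration: `InMemorySessions` and its injected clock are hypothetical, and only the reuse-or-create decision for the conversation id is modelled.

```python
import time
from uuid import uuid1

EXPIRE_SECONDS = 60 * 10  # same 10-minute window as in the text


class InMemorySessions:
    """Hypothetical stand-in for the Redis key that expires after 10 minutes."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.sessions = {}  # user_id -> (conversation_id, expires_at)

    def touch(self, user_id):
        """Return the active conversation id, creating a new one if none is active."""
        now = self.clock()
        entry = self.sessions.get(user_id)
        if entry and entry[1] > now:
            conversation_id = entry[0]     # still inside the window: same round
        else:
            conversation_id = uuid1().hex  # expired or unknown user: new round
        # Every message pushes the expiration time forward again
        self.sessions[user_id] = (conversation_id, now + EXPIRE_SECONDS)
        return conversation_id


# Simulated clock so the example runs instantly.
fake_now = [1000.0]
store = InMemorySessions(clock=lambda: fake_now[0])
first = store.touch("u1")
fake_now[0] += 5 * 60    # 5 minutes later: same round
second = store.touch("u1")
fake_now[0] += 11 * 60   # 11 minutes of silence: new round
third = store.touch("u1")
```

Here `first` and `second` share a conversation id, while `third` gets a new one, which mirrors what the expiring Redis key does in `MessageManager`.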
2, External interface
1. Use gRPC to provide external services
1.1 Install the gRPC environment
- Install gRPC: `pip install grpcio`
- Install the Python library that ProtoBuf depends on: `pip install protobuf`
- Install the protobuf compilation tools for Python gRPC: `pip install grpcio-tools`
1.2 Define the gRPC interface

    // chatbot.proto file
    syntax = "proto3";

    message ReceivedMessage {
        string user_id = 1;       // User id
        string user_message = 2;  // Message sent by the current user
        int32 create_time = 3;    // Time when the current message was sent
    }

    message ResponsedMessage {
        string user_response = 1; // Message returned to the user
        int32 create_time = 2;    // Time when the reply was returned
    }

    service ChatBotService {
        rpc Chatbot (ReceivedMessage) returns (ResponsedMessage);
    }
1.3 Compile and generate the protobuf files
Compile with the following command to generate chatbot_pb2.py and chatbot_pb2_grpc.py:
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. ./chatbot.proto
1.4 Use gRPC to provide the service

    import dialogue
    from classify import is_QA
    from dialogue.process_sentence import process_user_sentence
    from chatbot_grpc import chatbot_pb2_grpc
    from chatbot_grpc import chatbot_pb2
    import time


    class chatServicer(chatbot_pb2_grpc.ChatBotServiceServicer):
        def __init__(self):
            # Load the various models in advance
            self.recall = dialogue.Recall(topk=20)
            self.dnnsort = dialogue.DNNSort()
            self.chatbot = dialogue.Chatbot()
            self.message_manager = dialogue.MessageManager()

        def Chatbot(self, request, context):
            user_id = request.user_id
            message = request.user_message
            create_time = request.create_time
            # Basic processing of the user's input, such as word segmentation
            message_info = process_user_sentence(message)
            if is_QA(message_info):
                attention = "QA"
                # Save dialog data
                self.message_manager.user_message_pipeline(user_id, message, create_time, attention,
                                                           entity=message_info["entity"])
                recall_list, entity = self.recall.predict(message_info)
                line, score = self.dnnsort.predict(message, recall_list)
                if score > 0.7:
                    ans = self.recall.get_answer(line)
                    user_response = ans["ans"]
                else:
                    user_response = "Sorry, I haven't learned this problem yet..."
            else:
                attention = "chat"
                # Save dialog data
                self.message_manager.user_message_pipeline(user_id, message, create_time, attention,
                                                           entity=message_info["entity"])
                user_response = self.chatbot.predict(message)

            # Save the reply of the robot, then return it
            self.message_manager.bot_message_pipeline(user_id, user_response)
            create_time = int(time.time())
            return chatbot_pb2.ResponsedMessage(user_response=user_response, create_time=create_time)


    def serve():
        import grpc
        from concurrent import futures
        # Multithreaded server
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
        # Register local service
        chatbot_pb2_grpc.add_ChatBotServiceServicer_to_server(chatServicer(), server)
        # Listening port
        server.add_insecure_port("[::]:9999")
        # Start receiving requests
        server.start()
        # server.start() does not block, so keep the main thread alive.
        # Use ctrl+c to exit the service
        try:
            while True:
                time.sleep(1000)
        except KeyboardInterrupt:
            server.stop(0)


    if __name__ == '__main__':
        serve()
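The branching inside `Chatbot` can be exercised without gRPC or the trained models by passing the components in as plain callables. This is a sketch only: `route` and every stub below are hypothetical stand-ins, and the 0.7 threshold and fallback sentence are copied from the servicer.

```python
QA_SCORE_THRESHOLD = 0.7
FALLBACK = "Sorry, I haven't learned this problem yet..."


def route(message, is_qa, recall, rank, get_answer, chat):
    """Return (attention, reply) for one message.

    The five callables stand in for the components the servicer loads in __init__.
    """
    if not is_qa(message):
        return "chat", chat(message)
    candidates = recall(message)
    best, score = rank(message, candidates)
    if score > QA_SCORE_THRESHOLD:
        return "QA", get_answer(best)
    return "QA", FALLBACK


# Stub components with made-up behaviour, for illustration only.
answers = {"what is python": "Python is a programming language."}
stubs = dict(
    is_qa=lambda m: "python" in m,
    recall=lambda m: list(answers),
    get_answer=answers.get,
    chat=lambda m: "hello!",
)

qa_reply = route("what is python", rank=lambda m, c: (c[0], 0.9), **stubs)
low_score_reply = route("what is python", rank=lambda m, c: (c[0], 0.3), **stubs)
chat_reply = route("hi there", rank=lambda m, c: (c[0], 0.9), **stubs)
```

Keeping the routing in a plain function like this makes the threshold easy to test before the service is wired up.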
2. Use supervisor to complete the management of services
2.1 Write a simple execution script

    #!/bin/bash
    # Change to the directory that contains this script; stop if that fails
    cd "$(dirname "$0")" || exit 1
    #source activate ds
    python grpc_predict.py

Add executable permission: `chmod +x run.sh`
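2.2 Install and configure supervisor
When this was written, the official supervisor release supported only Python 2, so the Python 3 version was installed from GitHub with the command below. Supervisor 4.0 and later support Python 3 and can also be installed with `pip3 install supervisor`.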
pip3 install git+https://github.com/Supervisor/supervisor
- Write the supervisor configuration file. In the conf file, a semicolon starts a comment.
    ;conf.d
    [program:chat_service]
    command=/root/chat_service/run.sh ; command to execute
    stdout_logfile=/root/chat_service/log/out.log ; location of the stdout log
    stderr_logfile=/root/chat_service/log/error.log ; location of the error log
    directory=/root/chat_service ; working directory
    autostart=true ; start automatically
    autorestart=true ; restart automatically
    startretries=10 ; maximum number of failed start attempts
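Before handing the file to supervisor, the program section can be given a quick sanity check with Python's standard `configparser`. This is only an approximation of how supervisord reads its configuration, and the shortened `CONF_TEXT` below is an illustrative copy of the section above.

```python
import configparser

CONF_TEXT = """
[program:chat_service]
command=/root/chat_service/run.sh ; command to execute
directory=/root/chat_service ; working directory
autostart=true
autorestart=true
startretries=10
"""

# Treat " ;" as the start of an inline comment, as in the conf file
parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
parser.read_string(CONF_TEXT)

program = parser["program:chat_service"]
command = program["command"]
autostart = program.getboolean("autostart")
retries = program.getint("startretries")
```

A typo in a key or a missing section header shows up here as a `KeyError` or a parsing error instead of a silent supervisor misconfiguration.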
- Add the above configuration file to the main supervisor configuration

    ;/etc/supervisord/supervisor.conf
    [include]
    files=/root/chat_service/conf.d
- Run supervisor
supervisord -c /etc/supervisord/supervisor.conf