JUCS - Journal of Universal Computer Science 31(8): 788-830, doi: 10.3897/jucs.121782
Enhancing Chatbot Responses through Improved T5 Model Incorporating Aggregated Multi-Head Attention Mechanism and Bidirectional Long Short-Term Memory
Muthukumaran N., Vignesh A.
‡ Sri Eshwar College of Engineering, Coimbatore, India
Open Access
Abstract
Artificial Intelligence (AI) chatbots have become indispensable for natural language interaction, with transformer-based models driving advances in conversational agent (CA) systems. Although state-of-the-art models such as RoBERTa, ALSI-Transformer, MEDN-Transformer, SG-Net Transformer, BART, and GPT-3 have achieved remarkable context understanding and response generation, they still face limitations: difficulty retaining context over extended interactions, syntactic ambiguity, and propagation of bias from training data, all of which raise concerns for ethical and interpretable AI systems. This research proposes an advanced transformer model, the Improved T5 (IT5), designed to address these issues. IT5 integrates Aggregated Multi-Head Attention (AMHA) and Bidirectional Long Short-Term Memory (BiLSTM) into the T5 framework to improve context retention and response nuance and to reduce bias. In addition, a retraining mechanism updates IT5’s knowledge base after every 50 new question-answer pairs to keep chatbot responses fair and relevant. The model was evaluated on the NarrativeQA, SQuAD, MS MARCO, and InsuranceQA datasets, where IT5 achieved the highest BLEU scores of 0.7533, 0.7012, 0.7155, and 0.7373, respectively. Across these datasets it consistently showed lower word error rate (WER) scores of 0.1957, 0.2106, 0.2254, and 0.1953, and higher ROUGE-L scores of 0.8875, 0.8991, 0.8731, and 0.8933. IT5 also excelled in accuracy (0.98, 0.97, 0.96, 0.97), precision (0.96, 0.98, 0.97, 0.96), recall (0.95, 0.96, 0.97, 0.98), and F1 score (0.95, 0.98, 0.96, 0.96), surpassing six state-of-the-art models: RoBERTa, ALSI-Transformer, MEDN-Transformer, SG-Net Transformer, BART, and GPT-3. The findings demonstrate IT5’s superior ability to generate meaningful, fair, and high-quality responses, establishing it as a frontrunner for robust and ethical conversational AI across a range of applications.
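For readers unfamiliar with the WER metric reported above, the following is a minimal sketch of the standard word-level definition: the edit distance (substitutions, deletions, and insertions) between a reference answer and a generated answer, divided by the number of reference words. It is not the authors' evaluation code. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the abstract does not state which tokenization or normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: tokenization is plain whitespace splitting,
    which is an assumption and may differ from the paper's setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a lower value means the generated response is closer to the reference, which is why lower WER is reported as better. A value above 1.0 is possible when the hypothesis is much longer than the reference.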
Keywords
Artificial intelligence, Chatbot, Improved T5 Model, Aggregated Multi-Head Attention Mechanism, Bidirectional Long Short-Term Memory