JUCS - Journal of Universal Computer Science 29(9): 1069-1089, doi: 10.3897/jucs.94657
PlantKViT: A Combination Model of Vision Transformer and KNN for Forest Plants Classification
Nguyen Van Hieu, Ngo Le Huy Hien§, Luu Van Huy, Nguyen Huy Tuong, Pham Thi Kim Thoa
‡ The University of Danang - University of Science and Technology, Da Nang, Vietnam
§ Leeds Beckett University, Leeds, United Kingdom
Open Access
Natural ecosystems contain thousands of plant species, and distinguishing them is usually a manual, complicated, and time-consuming task. Because it demands considerable expertise, identifying forest plant species relies on teams of botanical experts. The emergence of Machine Learning, especially Deep Learning, has opened up a new approach to plant classification. However, the application of deep learning models to plant classification remains limited. This paper proposes a model, named PlantKViT, that combines the Vision Transformer architecture with the KNN algorithm to identify forest plants. The proposed model is efficient and makes it convenient to add new plant species. Experiments compared Resnet-152, ConvNeXt, and the PlantKViT model on forest plant classification. Training and evaluation were carried out on the DanangForestPlant dataset, which contains 10,527 images covering 489 species of forest plants. The proposed PlantKViT model reached an accuracy of 93%, compared with 89% for the ConvNeXt model and only 76% for the Resnet-152 model. The authors also developed a website and two applications, called ‘plant id’ and ‘Danangplant’, on the iOS and Android platforms respectively. The PlantKViT model shows potential for forest plant identification not only on the dataset studied here but also on forest plants worldwide. Future work should focus on extending the dataset and enhancing the accuracy and performance of forest plant identification.
Forest plants, Plant identification, Resnet-152, ConvNeXt, Transformer-Learning, Deep learning models, K-nearest-neighbor
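The idea of pairing a feature extractor with KNN, and why it makes adding new species convenient, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `EmbeddingKNN`, the cosine-similarity metric, the value of `k`, and the toy 3-dimensional vectors (standing in for Vision Transformer embeddings) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class EmbeddingKNN:
    """Hypothetical KNN classifier over precomputed image embeddings."""

    def __init__(self, k=3):
        self.k = k
        self.vectors = []  # stored reference embeddings
        self.labels = []   # species label for each stored embedding

    def add(self, vector, label):
        # Adding a species only appends reference embeddings;
        # no network weights are retrained.
        self.vectors.append(list(vector))
        self.labels.append(label)

    def predict(self, query):
        # Rank stored embeddings by similarity to the query, most similar first.
        ranked = sorted(
            zip(self.vectors, self.labels),
            key=lambda pair: cosine_similarity(query, pair[0]),
            reverse=True,
        )
        # Majority vote among the k nearest neighbours.
        top_labels = [label for _, label in ranked[: self.k]]
        return Counter(top_labels).most_common(1)[0][0]


# Toy vectors stand in for embeddings produced by a Vision Transformer (assumption).
knn = EmbeddingKNN(k=3)
knn.add([1.0, 0.0, 0.0], "species_a")
knn.add([0.9, 0.1, 0.0], "species_a")
knn.add([0.0, 1.0, 0.0], "species_b")
knn.add([0.1, 0.9, 0.0], "species_b")
first_prediction = knn.predict([0.95, 0.05, 0.0])

# A new species is registered by adding its embeddings to the reference set.
knn.add([0.0, 0.0, 1.0], "species_c")
knn.add([0.0, 0.1, 0.9], "species_c")
second_prediction = knn.predict([0.0, 0.05, 0.95])
```

In this sketch the classifier's knowledge lives entirely in the stored embeddings, so extending it to a new species is a data operation rather than a training run. A real system would replace the toy vectors with embeddings from a trained network and would likely use an optimized nearest-neighbour search.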