Effect of Typos Correction on the validation performance of Duplicate Bug Reports Detection

Authors: Behzad Soleimani Neysiani
Conference Title: 10th International Conference on Information and Knowledge Technology (IKT)
Holding Date of Conference: 2019-12-31 - 2021-01-01
Event Place: 1 - Tehran
Presented by: ICT Research Institute
Presentation: SPEECH
Conference Level: International Conferences

Abstract

Typos are common in human-written text such as bug reports in software triage systems: more than half of bug reports contain typos. Interconnected typos are a common type of typo in bug reports. Both heuristic and non-heuristic approaches exist for automatic typo correction. In addition, there are four datasets (Android, Eclipse, Mozilla, and Open Office) in which typos have been identified and, in some cases, corrected. This study evaluates the effect of typo correction on duplicate bug report detection (DBRD). Experimental results on the Android dataset show that typo correction improves the validation performance of DBRD by at most 1% on average, which is negligible. It is also better not to remove typos from bug reports for DBRD. Automatic typo correction is only slightly useful for DBRD as a pre-processing operator, but it can be more helpful while users are writing bug reports, where it can correct their typos in real time.
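
To make the idea of typo correction as a pre-processing operator concrete, the following is a minimal sketch, not the method used in the study. It assumes a tiny hand-made vocabulary, uses Python's standard `difflib.get_close_matches` as a stand-in for a typo corrector, and uses Jaccard similarity over token sets as a stand-in for a duplicate-detection score. The names `VOCABULARY`, `tokenize`, `correct_typos`, and `jaccard`, and the two sample reports, are all hypothetical.

```python
import difflib
import re

# Hypothetical vocabulary; a real system would rely on a full dictionary
# or on terms mined from the bug repository itself.
VOCABULARY = ["application", "crashes", "when", "opening",
              "the", "camera", "settings", "screen"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def correct_typos(tokens, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each out-of-vocabulary token with its closest vocabulary word.

    Tokens with no sufficiently similar vocabulary word are kept unchanged,
    so nothing is removed from the report.
    """
    corrected = []
    for token in tokens:
        if token in vocabulary:
            corrected.append(token)
            continue
        matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two token lists, treated as sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Two made-up reports describing the same problem; the second has typos.
report_a = "Application crashes when opening the camera settings"
report_b = "Aplication crashs when openning the camera setings"

# Similarity on the raw text: 3 shared tokens out of 11 distinct ones.
raw_score = jaccard(tokenize(report_a), tokenize(report_b))

# Similarity after correcting typos in both reports.
clean_score = jaccard(correct_typos(tokenize(report_a)),
                      correct_typos(tokenize(report_b)))

print(f"raw similarity:       {raw_score:.3f}")
print(f"corrected similarity: {clean_score:.3f}")
```

This toy pair is deliberately contrived so that the mechanism is visible; it says nothing about the size of the effect on real data, which the study reports as at most 1% on average for the Android dataset.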

tags: Typo; Correction; Duplicate; Bug Report; Text Mining; Information Retrieval;