| Authors | Behzad Soleimani Neysiani |
| Conference | 24th International Conference of Information Technology (IVUS 2019) |
| Conference dates | 2019-04-25 - 2019-04-27 |
| Conference location | 122 - Kaunas |
| Presented on behalf of university | University of Lithuania |
| Presentation type | Oral presentation |
| Conference level | International |
Abstract
Triagers use software triage systems such as Bugzilla to
prioritize bug reports, find duplicates, and assign reports to
developers. These processes should be automated, especially for
huge open source projects. Automating them requires mining the
bug reports with text mining, information retrieval, and natural
language processing techniques.
User bug reports contain many typos, which lower the accuracy
of artificial intelligence techniques. These typos can be
detected with standard dictionaries, but correcting them
requires human knowledge of the context of the bug reports.
Notably, neither Google Translator nor Microsoft Office Word
can detect interconnected terms (a common type of typo in bug
reports) that contain more than two meaningful terms.
This research provides a novel language-independent approach,
based on natural language processing and the structure of the
human neural network, for fast detection and correction of
interconnected typos. A new tree-based method is proposed for
term matching, and two algorithms are proposed for quickly
finding the longest term in an interconnected typo. The dataset
used includes 180 thousand typos drawn from four well-known bug
report datasets of the Android, Eclipse, Mozilla Firefox, and
Open Office projects.
The proposed method is then evaluated on these typos against the
state of the art. The results show that the runtime performance
of the proposed method is the same as that of related works,
while the average word length is improved, and that more than
57% of the typos in the dataset can be classified as
interconnected typos.
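
The abstract does not spell out the tree-based term matching or the longest-term search, so the following is only a minimal sketch of the general idea, assuming a character trie built from a dictionary and a greedy longest-match split. The names `TrieNode`, `build_trie`, `longest_term_at`, and `split_interconnected` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a character trie (an assumed stand-in for the tree structure)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_term = False  # True if the path from the root spells a dictionary term


def build_trie(terms: List[str]) -> TrieNode:
    """Insert every dictionary term into a trie and return its root."""
    root = TrieNode()
    for term in terms:
        node = root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.is_term = True
    return root


def longest_term_at(root: TrieNode, text: str, start: int) -> int:
    """Return the end index of the longest dictionary term starting at `start`, or -1."""
    node = root
    best_end = -1
    for i in range(start, len(text)):
        nxt = node.children.get(text[i])
        if nxt is None:
            break
        node = nxt
        if node.is_term:
            best_end = i + 1  # remember the longest match seen so far
    return best_end


def split_interconnected(root: TrieNode, text: str) -> Optional[List[str]]:
    """Greedily split `text` into longest dictionary terms; None if a part is unknown."""
    parts: List[str] = []
    pos = 0
    while pos < len(text):
        end = longest_term_at(root, text, pos)
        if end == -1:
            return None
        parts.append(text[pos:end])
        pos = end
    return parts


dictionary = build_trie(["bug", "report", "system", "re", "port"])
print(split_interconnected(dictionary, "bugreportsystem"))
```

Because the split is greedy, it prefers "report" over "re" followed by "port"; it can also fail on inputs where only a shorter first term leads to a complete split, which a backtracking variant would handle at extra cost.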