| Authors | Behzad Soleimani Neysiani |
| Conference Title | 24th International Conference of Information Technology (IVUS 2019) |
| Holding Date of Conference | 2019-04-25 - 2019-04-27 |
| Event Place | 122 - Kaunas |
| Presented by | University of Lithuania |
| Presentation | SPEECH |
| Conference Level | International Conferences |
Abstract
Triagers handle bug reports in software triage
systems such as Bugzilla: they prioritize reports, find duplicates,
and assign reports to developers. These processes should be
automated, especially for large open-source projects. To automate
them, bug reports must be mined with text mining, information
retrieval, and natural language processing techniques.
User bug reports contain many typos, which lower the
accuracy of artificial intelligence techniques. These typos can be
detected using standard dictionaries, but correcting them
requires human knowledge of the context of the bug report.
Notably, neither Google Translate nor Microsoft
Office Word can detect interconnected terms (a common type of
typo in bug reports) that contain more than two meaningful terms.
This research provides a novel language-independent approach
for fast detection and correction of interconnected typos, based on
natural language processing and the structure of human neural
networks. A new tree-based method is proposed for term matching,
and two algorithms are proposed for fast longest-term finding in an
interconnected typo. The dataset contains 180 thousand typos
drawn from four well-known bug report datasets: Android, Eclipse,
Mozilla Firefox, and Open Office. The proposed method is
evaluated on these typos against the state of the art. The results
show that the runtime performance of the proposed method is the
same as that of related work, while the average word length is
improved, and more than 57% of the typos in the dataset can be
classified as interconnected typos.
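
To make the idea of tree-based term matching and longest-term finding more concrete, the sketch below splits an interconnected typo into dictionary terms using a character trie. It is an illustration only, not the paper's algorithms: the names `TrieNode`, `build_trie`, `prefix_lengths`, and `split_interconnected` are hypothetical, the word list is a made-up stand-in for a real dictionary, and the longest-first search with backtracking is one common way to approach the problem.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a character trie; is_term marks the end of a dictionary term."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_term = False


def build_trie(terms: List[str]) -> TrieNode:
    """Insert every dictionary term into a trie and return its root."""
    root = TrieNode()
    for term in terms:
        node = root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.is_term = True
    return root


def prefix_lengths(root: TrieNode, text: str, start: int) -> List[int]:
    """Lengths of all dictionary terms beginning at text[start], longest first."""
    lengths: List[int] = []
    node = root
    for i in range(start, len(text)):
        node = node.children.get(text[i])
        if node is None:
            break
        if node.is_term:
            lengths.append(i - start + 1)
    return lengths[::-1]


def split_interconnected(root: TrieNode, text: str) -> Optional[List[str]]:
    """Split text into dictionary terms, trying the longest term first.

    Returns None when no complete split exists.
    """
    dead_ends = set()  # start positions already known to have no valid split

    def solve(start: int) -> Optional[List[str]]:
        if start == len(text):
            return []
        if start in dead_ends:
            return None
        for length in prefix_lengths(root, text, start):
            rest = solve(start + length)
            if rest is not None:
                return [text[start:start + length]] + rest
        dead_ends.add(start)
        return None

    return solve(0)


# Hypothetical mini-dictionary, for illustration only.
root = build_trie(["null", "point", "pointer", "exception", "bug", "bugs", "screen"])
print(split_interconnected(root, "nullpointerexception"))  # ['null', 'pointer', 'exception']
print(split_interconnected(root, "bugscreen"))             # ['bug', 'screen']
```

In the second call the longest term at the start is "bugs", but the remainder "creen" is not in the dictionary, so the search falls back to "bug" followed by "screen". A purely greedy longest match would fail on this input, which is why the sketch backtracks.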