Please use this identifier to cite or link to this item: https://rda.sliit.lk/handle/123456789/3395
Full metadata record
DC Field | Value | Language
dc.contributor.author | Wijesinghe, W.M.S.K | -
dc.contributor.author | Tissera, M | -
dc.date.accessioned | 2023-05-16T06:58:19Z | -
dc.date.available | 2023-05-16T06:58:19Z | -
dc.date.issued | 2022-12-09 | -
dc.identifier.citation | W. M. S. K. Wijesinghe and M. Tissera, "Sinhala Named Entity Recognition Model: Domain-Specific Classes in Sports," 2022 4th International Conference on Advancements in Computing (ICAC), Colombo, Sri Lanka, 2022, pp. 138-143, doi: 10.1109/ICAC57685.2022.10025148. | en_US
dc.identifier.issn | 979-8-3503-9810-6 | -
dc.identifier.uri | https://rda.sliit.lk/handle/123456789/3395 | -
dc.description.abstract | Named Entity Recognition (NER) is a crucial subtask that must be solved in most Natural Language Processing (NLP) tasks. However, constructing an NER system for the Sinhala language is challenging because Sinhala is a low-resource language. Therefore, the proposed approach attempts to design a mechanism for identifying specific named entities in the sports domain. First, a domain-specific corpus was built from Sinhala sports e-news articles. Then a semi-automated, rule-based component named “Class_Label_Suggester” was built to annotate pre-defined named entities. After automatic annotation, the output was validated manually with little effort. Finally, the NER model was trained on the annotated data using Linear Perceptron, Stochastic Gradient Descent (SGD), Multinomial Naive Bayes (MNB), and Passive Aggressive classifiers. Although all of these Machine Learning (ML) algorithms showed approximately 98% accuracy, the MNB model demonstrated the highest accuracy for the identified class labels: 99.76% for ‘Ground’, 99.53% for ‘School’, 98.55% for ‘Tournament’, and 97.87% for ‘Other’. The precision values for these classes were 81%, 72%, 62%, and 98%, respectively. An accurately annotated Sinhala dataset and the trained Sinhala NER model are the main contributions of the study. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.relation.ispartofseries | 2022 4th International Conference on Advancements in Computing (ICAC); | -
dc.subject | Sports | en_US
dc.subject | Domain-Specific Classes | en_US
dc.subject | Recognition Model | en_US
dc.subject | Named Entity | en_US
dc.subject | Sinhala Named | en_US
dc.title | Sinhala Named Entity Recognition Model: Domain-Specific Classes in Sports | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/ICAC57685.2022.10025148 | en_US
Appears in Collections: 4th International Conference on Advancements in Computing (ICAC) | 2022

Files in This Item:
File | Description | Size | Format
Sinhala_Named_Entity_Recognition_Model_Domain-Specific_Classes_in_Sports.pdf (Until 2050-12-31) | - | 471.44 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.