Methods for Handling Spontaneous Health Arabic Queries using unsupervised machine learning
Document Type
Article
Publication Date
Winter 12-21-2022
Abstract
The goal of this work is to demonstrate that combining sublanguage analysis with linguistic processing techniques is both essential and feasible for building robust NL-based systems. Merging accurate language processing with analysis of the sublanguage should improve the correctness and resilience of the processing. As a proof of concept, we created an experimental system (HASE) to test this hypothesis. HASE is a search system for Arabic documents in the health and medical domain. To study the sublanguage, we employed machine learning techniques. The initial corpus consists of 40 thousand unedited queries. HASE is built on top of SOLR, with an integrated Arabic linguistic processing component. Responses are generated using an information retrieval (IR) approach. Altibby (the largest health content provider) is actively deploying HASE in Jordan. The IR component achieves a 90% F-measure when tested with actual noisy free text.
Recommended Citation
D. Daoud, S. A. El-Seoud, F. Alhosban and A. Farhat, "Methods for Handling Spontaneous Health Arabic Queries using unsupervised machine learning," 2022 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 2022, pp. 1-6, doi: 10.1109/ICCA56443.2022.10039617.