Article
Journal :
Journal of Intelligent Information Systems
ISSN : 1573-7675
Publisher :
Springer
Period : May 2022
Volume : 59 Number : 2
Pages : 501-522
Details
SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis
The main objective of multilingual sentiment analysis is to analyze reviews regardless of the original language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is a challenge since each language is different in terms of syntax, grammar, etc. This paper presents a new language-independent representation approach for sentiment analysis, SentiCode. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between different languages. Instead, it exploits common features of languages, such as part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine/deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated by the ablation method. The results highlight the 70% accuracy of SentiCode, with the best trade-off between efficiency and computing time (training and testing) in a total of about 0.67 seconds, which is very convenient for real-time applications.
Key words :
Multilingual sentiment analysis, SentiCode, Machine learning, Natural language processing, Part-of-speech tags
Ref. laboratory citation :
misc-lab-376
DOI :
10.1007/s10844-022-00714-8
Link :
Full text
ACM :
M. R. Kanfoud and A. Bouramoul. 2022. SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis. Journal of Intelligent Information Systems, 59, 2 (May 2022), Springer, 501-522. DOI: https://doi.org/10.1007/s10844-022-00714-8.
APA :
Kanfoud, M. R., & Bouramoul, A. (2022, May). SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis. Journal of Intelligent Information Systems, 59(2), Springer, 501-522. DOI: https://doi.org/10.1007/s10844-022-00714-8
IEEE :
M. R. Kanfoud and A. Bouramoul, "SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis," Journal of Intelligent Information Systems, vol. 59, no. 2, Springer, pp. 501-522, May 2022. DOI: https://doi.org/10.1007/s10844-022-00714-8.
BibTeX :
@article{misc-lab-376,
author = {Kanfoud, Mohamed Raouf and Bouramoul, Abdelkrim},
title = {SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis},
journal = {Journal of Intelligent Information Systems},
volume = {59},
number = {2},
issn = {1573-7675},
pages = {501--522},
publisher = {Springer},
year = {2022},
month = {May},
doi = {10.1007/s10844-022-00714-8},
url = {https://link.springer.com/article/10.1007/s10844-022-00714-8},
keywords = {Multilingual sentiment analysis, SentiCode, Machine learning, Natural language processing, Part-of-speech tags}
}
RIS :
TY  - JOUR
TI  - SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis
AU  - Kanfoud, Mohamed Raouf
AU  - Bouramoul, Abdelkrim
PY  - 2022
SN  - 1573-7675
JO  - Journal of Intelligent Information Systems
VL  - 59
IS  - 2
SP  - 501
EP  - 522
PB  - Springer
AB  - The main objective of multilingual sentiment analysis is to analyze reviews regardless of the original language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is a challenge since each language is different in terms of syntax, grammar, etc. This paper presents a new language-independent representation approach for sentiment analysis, SentiCode. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between different languages. Instead, it exploits common features of languages, such as part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine/deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated by the ablation method. The results highlight the 70% accuracy of SentiCode, with the best trade-off between efficiency and computing time (training and testing) in a total of about 0.67 seconds, which is very convenient for real-time applications.
KW  - Multilingual sentiment analysis
KW  - SentiCode
KW  - Machine learning
KW  - Natural language processing
KW  - Part-of-speech tags
DO  - 10.1007/s10844-022-00714-8
UR  - https://link.springer.com/article/10.1007/s10844-022-00714-8
ID  - misc-lab-376
ER  - 