Impact of topic modeling on rule-based Persian metaphor classification and its frequency estimation
H Abdi Ghavidel, P Khosravizadeh, A Rahimi
International Journal of Information and Communication Technology Research, 2015
Abstract
The impact of several topic modeling techniques has been well established in many aspects of Persian language processing. In this paper, we investigate the influence of the Latent Dirichlet Allocation (LDA) technique on metaphor processing and show that it helps measure metaphor frequency effectively. In the first step, we apply LDA to the Persian corpus known as the Bijankhan corpus to extract classes containing the words that share the most natural semantic proximity. Then, we develop a rule-based classifier for identifying natural and metaphorical sentences. The underlying assumption is that the classifier allocates a topic to each word in a sentence. If the overall topic of the sentence diverges from the topic of one of the words in the sentence, metaphoricity is detected. We ran the classifier on the whole corpus and observed that roughly at least two and at most four sentences in the corpus carry metaphoricity. With an F-measure of 68.17% on 100 randomly selected sentences, the classifier suggests that LDA-based metaphoricity analysis is promising for Persian language processing.
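The divergence rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the overall topic of a sentence is computed, so a simple majority vote over word topics is assumed here. The names `WORD_TOPIC`, `sentence_topic`, and `is_metaphorical` are hypothetical, and the toy English word-to-topic table merely stands in for the word classes that LDA would extract from the Bijankhan corpus.

```python
from collections import Counter

# Hypothetical word -> topic table; in the paper's setting this would come
# from LDA classes extracted from the corpus (toy English data used here).
WORD_TOPIC = {
    "lawyer": 0, "court": 0, "judge": 0,   # assumed "legal" topic
    "shark": 1, "ocean": 1,                # assumed "sea" topic
}

def sentence_topic(tokens, word_topic):
    """Return the most frequent topic among known words (assumed majority vote)."""
    topics = [word_topic[t] for t in tokens if t in word_topic]
    if not topics:
        return None  # no known words, so no overall topic can be assigned
    return Counter(topics).most_common(1)[0][0]

def is_metaphorical(tokens, word_topic):
    """Flag a sentence when any known word's topic differs from the overall topic."""
    overall = sentence_topic(tokens, word_topic)
    if overall is None:
        return False
    return any(word_topic[t] != overall for t in tokens if t in word_topic)

if __name__ == "__main__":
    print(is_metaphorical(["lawyer", "court", "shark"], WORD_TOPIC))
    print(is_metaphorical(["lawyer", "court", "judge"], WORD_TOPIC))
```

In this sketch, words missing from the table are ignored and a sentence with no known words is treated as non-metaphorical; both choices are assumptions rather than details taken from the paper.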
sharif.ir