Investigating Esperanto's Statistical Proportions Relative to other Languages using Neural Networks and Zipf's Law.
BZ Manaris, L Pellicoro, GJ Pothering, H Hodges
Artificial Intelligence and Applications, 2006
Abstract
Esperanto is a constructed natural language, which was intended to be an easy-to-learn lingua franca. Zipf's law models the statistical proportions of various phenomena in human ecology, including natural languages. Given Esperanto's artificial origins, one wonders how "natural" it appears, relative to other natural languages, in the context of Zipf's law. To explore this question, we collected a total of 283 books from six languages: English, French, German, Italian, Spanish, and Esperanto. We applied Zipf-based metrics to our corpus to extract distributions for word, word distance, word bigram, word trigram, and word length for each book. Statistical analyses show that Esperanto's statistical proportions are similar to those of other languages. We then trained artificial neural networks (ANNs) to classify books according to language. The ANNs achieved high accuracy rates (86.3% to 98.6%). Subsequent analysis identified German as having the most unique proportions, followed by Esperanto, Italian, Spanish, English, and French. Analysis of misclassified patterns shows that Esperanto's statistical proportions most closely resemble those of German and Spanish, and least resemble those of French and Italian.
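To make the idea of a Zipf-based word metric concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how the metrics were computed, so the tokenizer, the function names `rank_frequency` and `zipf_slope`, and the choice of a least-squares fit on the log-log rank-frequency curve are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Assumption: a word is any run of Unicode letters, lowercased.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    `counts` must be sorted in descending order, so that the
    position in the list (starting at 1) is the rank.
    """
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Calling `zipf_slope(rank_frequency(book_text))` on the text of one book gives a single number per book. A distribution that follows Zipf's law in its classic form, with frequency inversely proportional to rank, gives a slope close to -1. The same fit could in principle be applied to the other distributions named in the abstract, such as bigram or word-length counts, though how the study actually derived its features is not stated here.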
researchgate.net