Improving embedding-based unsupervised keyphrase extraction by incorporating structural information

M Song, H Liu, Y Feng, L Jing - Findings of the Association for …, 2023 - aclanthology.org
Findings of the Association for Computational Linguistics: ACL 2023, 2023 - aclanthology.org
Abstract
Keyphrase extraction aims to extract a set of phrases that capture the central idea of the source document. In a structured document, there are certain locations (e.g., the title or the first sentence) where a keyphrase is most likely to appear. However, when extracting keyphrases from a document, most existing embedding-based unsupervised keyphrase extraction models ignore the indicative role of the highlights in these locations, which leads to incorrect keyphrase extraction. In this paper, we propose a new Highlight-Guided Unsupervised Keyphrase Extraction model (HGUKE) to address this issue. Specifically, HGUKE first models the phrase-document relevance via the highlights of the document. Next, HGUKE calculates the cross-phrase relevance between all candidate phrases. Finally, HGUKE aggregates these two relevance scores into an importance score for each candidate phrase, which is used to rank and extract keyphrases. Experimental results on three benchmarks demonstrate that HGUKE outperforms state-of-the-art unsupervised keyphrase extraction baselines.
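
The three steps named in the abstract (phrase-document relevance via highlights, cross-phrase relevance, aggregation) can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything concrete here is an assumption made for illustration: cosine similarity as the relevance measure, a single highlight embedding standing in for the title or first sentence, mean similarity to the other candidates as the cross-phrase relevance, and a plain sum as the aggregation. The names `cosine_matrix` and `rank_phrases` are hypothetical, and the embeddings are toy vectors rather than outputs of a pre-trained model.

```python
import numpy as np


def cosine_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def rank_phrases(phrases, phrase_vecs, highlight_vec, top_k=3):
    """Rank candidate phrases with an illustrative two-part score.

    The formulas below are assumptions for this sketch, not the paper's.
    """
    phrase_vecs = np.asarray(phrase_vecs, dtype=float)
    highlight_vec = np.asarray(highlight_vec, dtype=float).reshape(1, -1)

    # Step 1 (assumed form): phrase-document relevance, measured as the
    # similarity of each phrase to the highlight embedding.
    doc_rel = cosine_matrix(phrase_vecs, highlight_vec)[:, 0]

    # Step 2 (assumed form): cross-phrase relevance, measured as the mean
    # similarity of each phrase to every other candidate phrase.
    sim = cosine_matrix(phrase_vecs, phrase_vecs)
    np.fill_diagonal(sim, 0.0)  # exclude each phrase's similarity to itself
    cross_rel = sim.sum(axis=1) / max(len(phrases) - 1, 1)

    # Step 3 (assumed form): aggregate the two scores by summing them.
    scores = doc_rel + cross_rel

    order = np.argsort(-scores)  # highest score first
    return [(phrases[i], float(scores[i])) for i in order[:top_k]]


if __name__ == "__main__":
    candidates = ["keyphrase extraction", "embedding model", "coffee break"]
    toy_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]  # toy 2-d embeddings
    toy_highlight = [1.0, 0.0]  # stands in for a title embedding
    for phrase, score in rank_phrases(candidates, toy_vecs, toy_highlight):
        print(f"{phrase}: {score:.2f}")
```

In a real pipeline the toy vectors would be replaced by embeddings from a pre-trained language model, and the aggregation would follow whatever the paper actually specifies.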