Combining contextualized word representation and sub-document level analysis through Bi-LSTM+CRF architecture for clinical de-identification

Authors

Rosario Catelli, Valentina Casola, Giuseppe De Pietro, Hamido Fujita, Massimo Esposito

Publication date

2021/2/15

Journal

Knowledge-Based Systems

Volume

213

Pages

106649

Publisher

Elsevier

Description

Clinical de-identification aims to identify Protected Health Information in clinical data, enabling data sharing and publication. The first automatic de-identification systems were based on rules or on machine learning methods, and were limited by language changes, lack of context awareness, and time-consuming feature engineering. Newer deep learning techniques for sequence labeling have shown better results while reducing feature engineering effort and relying on vector-space word representations. However, they are not able to jointly represent the polysemic and context-dependent nature of words, as well as their morpho-syntactic mutations characteristic of handwriting. To address these limitations, a new de-identification approach based on deep learning techniques for Named Entity Recognition has been proposed, whose key factors are: (i) a Bidirectional Long Short-Term Memory + Conditional …

Total citations

Cited by 55

Citations per year: 2021: 8, 2022: 23, 2023: 16, 2024: 8

Scholar articles

R Catelli, V Casola, G De Pietro, H Fujita, M Esposito - Knowledge-Based Systems, 2021

Cited by 55; all 4 versions