Address standardization using supervised machine learning

A Kaleem, KM Ghori, Z Khanzada, MN Malik - interpretation, 2011 - researchgate.net
Abstract
Data mining has become an important task in today's information-rich environments. Strong results depend on accurate historical reporting, while inaccurate and dirty records yield weak and incorrect analysis. Unfortunately, organizations store addresses in unstructured formats, resulting in multiple representations of the same entities. These addresses need to be cleansed before they can be used in data mining. In this paper we present a supervised machine learning procedure based on the Hidden Markov Model (HMM). This automated probabilistic approach is used to segment a set of Asian addresses into their atomic units and standardize them. The results show that the technique can also standardize large sets of complex, unformatted addresses.
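
To make the idea of HMM-based address segmentation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the three-state label set (`HOUSE`, `STREET`, `CITY`), the coarse token features in `symbol`, the add-one smoothing, and the tiny labeled training set are all assumptions chosen for illustration. The sketch estimates start, transition, and emission probabilities by counting over labeled examples (the supervised step), then uses Viterbi decoding to assign each token of a new address to its most likely field.

```python
import math
from collections import defaultdict

# Hypothetical label set and observation symbols (not taken from the paper).
STATES = ["HOUSE", "STREET", "CITY"]
SYMBOLS = ["NUM", "STREET_WORD", "WORD"]

def symbol(token):
    """Map a raw token to a coarse observation symbol (illustrative feature scheme)."""
    t = token.lower()
    if any(ch.isdigit() for ch in t):
        return "NUM"
    if t in {"road", "street", "lane"}:
        return "STREET_WORD"
    return "WORD"

def train(labeled):
    """Estimate log-probabilities by counting labeled tokens, with add-one smoothing."""
    start, trans, emit = defaultdict(int), defaultdict(int), defaultdict(int)
    for seq in labeled:
        prev = None
        for token, state in seq:
            emit[(state, symbol(token))] += 1
            if prev is None:
                start[state] += 1
            else:
                trans[(prev, state)] += 1
            prev = state

    def log_dist(counts, row, columns):
        total = sum(counts[(row, c)] for c in columns) + len(columns)
        return {c: math.log((counts[(row, c)] + 1) / total) for c in columns}

    n = sum(start.values()) + len(STATES)
    log_start = {s: math.log((start[s] + 1) / n) for s in STATES}
    log_trans = {s: log_dist(trans, s, STATES) for s in STATES}
    log_emit = {s: log_dist(emit, s, SYMBOLS) for s in STATES}
    return log_start, log_trans, log_emit

def viterbi(tokens, model):
    """Return the most likely state sequence for the given tokens."""
    if not tokens:
        return []
    log_start, log_trans, log_emit = model
    obs = [symbol(t) for t in tokens]
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in STATES}]
    back = []
    for o in obs[1:]:
        scores, ptrs = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            ptrs[s] = prev
        best.append(scores)
        back.append(ptrs)
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for ptrs in reversed(back):  # follow back-pointers from the last token
        state = ptrs[state]
        path.append(state)
    return path[::-1]

def segment(address, model):
    """Split an address into tokens and group them into fields by decoded state."""
    tokens = address.replace(",", " ").split()
    fields = defaultdict(list)
    for token, state in zip(tokens, viterbi(tokens, model)):
        fields[state].append(token)
    return {state: " ".join(words) for state, words in fields.items()}

# Made-up training examples, for illustration only.
TRAINING = [
    [("12", "HOUSE"), ("Mall", "STREET"), ("Road", "STREET"), ("Lahore", "CITY")],
    [("45", "HOUSE"), ("Canal", "STREET"), ("Street", "STREET"), ("Karachi", "CITY")],
    [("7", "HOUSE"), ("Park", "STREET"), ("Lane", "STREET"), ("Islamabad", "CITY")],
]
MODEL = train(TRAINING)

print(segment("221 Station Road, Multan", MODEL))
```

With this toy model, the unseen address is split into a house number, a street, and a city even though none of its words appear in the training data, because the decoder relies on token features and field order rather than exact vocabulary. A realistic system would need a much richer state set, more informative features, and far more labeled addresses.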