Measurement of text similarity: a survey

J Wang, Y Dong - Information, 2020 - mdpi.com
Text similarity measurement is the basis of natural language processing tasks, which play an
important role in information retrieval, automatic question answering, machine translation …

Trajectories of efficiency measurement: A bibliometric analysis of DEA and SFA

HW Lampe, D Hilgers - European journal of operational research, 2015 - Elsevier
This study surveys the increasing research field of performance measurement by making
use of a bibliometric literature analysis. We concentrate on two approaches, namely Data …

[BOOK][B] Data cleaning

IF Ilyas, X Chu - 2019 - books.google.com
This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …

[PDF][PDF] A survey on similarity measures in text mining

MK Vijaymeena, K Kavitha - Machine Learning and Applications: An …, 2016 - academia.edu
The volume of text resources has been increasing in digital libraries and on the internet.
Organizing these text documents has become a practical need. For organizing great number …

[PDF][PDF] A survey of text similarity approaches

WH Gomaa, AA Fahmy - International Journal of Computer Applications, 2013 - Citeseer
Measuring the similarity between words, sentences, paragraphs and documents is an
important component in various tasks such as information retrieval, document clustering …

[BOOK][B] The data matching process

P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …

Fixminer: Mining relevant fix patterns for automated program repair

A Koyuncu, K Liu, TF Bissyandé, D Kim, J Klein… - Empirical Software …, 2020 - Springer
Patching is a common activity in software development. It is generally performed on a source
code base to address bugs or add new functionalities. In this context, given the recurrence of …

[PDF][PDF] The stringdist package for approximate string matching.

MPJ Van der Loo - R J., 2014 - journal.r-project.org
Comparing text strings in terms of distance functions is a common and fundamental task in
many statistical text-processing applications. Thus far, string distance functionality has been …

Using a probabilistic model to assist merging of large-scale administrative records

T Enamorado, B Fifield, K Imai - American Political Science Review, 2019 - cambridge.org
Since most social science research relies on multiple data sources, merging data sets is an
essential part of researchers' workflow. Unfortunately, a unique identifier that unambiguously …

Data-Centric Systems and Applications

MJ Carey, S Ceri, P Bernstein, U Dayal, C Faloutsos… - Italy: Springer, 2006 - Springer
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …