Daniel Deutsch
Verified email at seas.upenn.edu - Homepage
Towards question-answering as an automatic metric for evaluating the content quality of a summary
D Deutsch, T Bedrax-Weiss, D Roth
Transactions of the Association for Computational Linguistics 9, 774-789, 2021
A statistical analysis of summarization evaluation metrics using resampling methods
D Deutsch, R Dror, D Roth
Transactions of the Association for Computational Linguistics 9, 1132-1146, 2021
Understanding the extent to which content quality metrics measure the information quality of summaries
D Deutsch, D Roth
Proceedings of the 25th Conference on Computational Natural Language …, 2021
The devil is in the errors: Leveraging large language models for fine-grained machine translation evaluation
P Fernandes, D Deutsch, M Finkelstein, P Riley, AFT Martins, G Neubig, ...
arXiv preprint arXiv:2308.07286, 2023
SacreROUGE: An open-source library for using and developing summarization evaluation metrics
D Deutsch, D Roth
arXiv preprint arXiv:2007.05374, 2020
Re-examining system-level correlations of automatic summarization evaluation metrics
D Deutsch, R Dror, D Roth
arXiv preprint arXiv:2204.10216, 2022
On the limitations of reference-free evaluations of generated text
D Deutsch, R Dror, D Roth
arXiv preprint arXiv:2210.12563, 2022
Results of WMT23 metrics shared task: Metrics might be guilty but references are not innocent
M Freitag, N Mathur, C Lo, E Avramidis, R Rei, B Thompson, T Kocmi, ...
Proceedings of the Eighth Conference on Machine Translation, 578-628, 2023
A general-purpose algorithm for constrained sequential inference
D Deutsch, S Upadhyay, D Roth
Proceedings of the 23rd Conference on Computational Natural Language …, 2019
GEMv2: Multilingual NLG benchmarking in a single line of code
S Gehrmann, A Bhattacharjee, A Mahendiran, A Wang, A Papangelis, ...
arXiv preprint arXiv:2206.11249, 2022
The Eval4NLP 2023 shared task on prompting large language models as explainable metrics
C Leiter, J Opitz, D Deutsch, Y Gao, R Dror, S Eger
arXiv preprint arXiv:2310.19792, 2023
MetricX-23: The Google submission to the WMT 2023 metrics shared task
J Juraska, M Finkelstein, D Deutsch, A Siddhant, M Mirzazadeh, M Freitag
Proceedings of the Eighth Conference on Machine Translation, 756-767, 2023
A distributional and orthographic aggregation model for English derivational morphology
D Deutsch, J Hewitt, D Roth
Proceedings of the 56th Annual Meeting of the Association for Computational …, 2018
Ties matter: Meta-evaluating modern metrics with pairwise accuracy and tie calibration
D Deutsch, G Foster, M Freitag
arXiv preprint arXiv:2305.14324, 2023
Summary cloze: A new task for content selection in topic-focused summarization
D Deutsch, D Roth
Proceedings of the 2019 Conference on Empirical Methods in Natural Language …, 2019
Training and meta-evaluating machine translation evaluation metrics at the paragraph level
D Deutsch, J Juraska, M Finkelstein, M Freitag
arXiv preprint arXiv:2308.13506, 2023
Incorporating question answering-based signals into abstractive summarization via salient span selection
D Deutsch, D Roth
arXiv preprint arXiv:2111.07935, 2021
Ties matter: Modifying Kendall’s tau for modern metric meta-evaluation
D Deutsch, G Foster, M Freitag
arXiv preprint arXiv:2305.14324, 2023
Needle in a haystack: An analysis of high-agreement workers on MTurk for summarization
L Zhang, S Mille, Y Hou, D Deutsch, E Clark, Y Liu, S Mahamood, ...
arXiv preprint arXiv:2212.10397, 2022
Is killed more significant than fled? A contextual model for salient event detection
D Jindal, D Deutsch, D Roth
Proceedings of the 28th International Conference on Computational …, 2020