On-the-fly aligned data augmentation for sequence-to-sequence ASR

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 377 - Related articles - All 7 versions

[PDF] acm.org

A survey of mix-based data augmentation: Taxonomy, methods, applications, and explainability

C Cao, F Zhou, Y Dai, J Wang, K Zhang - ACM Computing Surveys, 2022 - dl.acm.org

Data augmentation (DA) is indispensable in modern machine learning and deep neural
networks. The basic idea of DA is to construct new training data to improve the model's …

Cited by 21 - Related articles - All 2 versions

[PDF] arxiv.org

STEMM: Self-learning with speech-text manifold mixup for speech translation

Q Fang, R Ye, L Li, Y Feng, M Wang - arXiv preprint arXiv:2203.10426, 2022 - arxiv.org

How to learn a better speech representation for end-to-end speech-to-text translation (ST)
with limited labeled data? Existing techniques often attempt to transfer powerful machine …

Cited by 89 - Related articles - All 8 versions

[HTML] mdpi.com

A new approach for detecting fundus lesions using image processing and deep neural network architecture based on YOLO model

C Santos, M Aguiar, D Welfer, B Belloni - Sensors, 2022 - mdpi.com

Diabetic Retinopathy is one of the main causes of vision loss, and in its initial stages, it
presents with fundus lesions, such as microaneurysms, hard exudates, hemorrhages, and …

Cited by 36 - Related articles - All 9 versions

[PDF] wiley.com

An End‐to‐End Cardiac Arrhythmia Recognition Method with an Effective DenseNet Model on Imbalanced Datasets Using ECG Signal

H Ullah, MB Bin Heyat, F Akhtar, Sumbul… - Computational …, 2022 - Wiley Online Library

Electrocardiography (ECG) is a well‐known noninvasive technique in medical science that
provides information about the heart's rhythm and current conditions. Automatic ECG …

Cited by 25 - Related articles - All 12 versions

[HTML] mdpi.com

An automatic premature ventricular contraction recognition system based on imbalanced dataset and pre-trained residual network using transfer learning on …

H Ullah, MBB Heyat, F Akhtar, AY Muaad… - Diagnostics, 2022 - mdpi.com

The development of automatic monitoring and diagnosis systems for cardiac patients over
the internet has been facilitated by recent advancements in wearable sensor devices from …

Cited by 19 - Related articles - All 8 versions

[PDF] wiley.com

[Retracted] An Effective and Lightweight Deep Electrocardiography Arrhythmia Recognition Model Using Novel Special and Native Structural Regularization …

H Ullah, MB Bin Heyat, H AlSalman… - Journal of …, 2022 - Wiley Online Library

Recently, cardiac arrhythmia recognition from electrocardiography (ECG) with deep learning
approaches is becoming popular in clinical diagnosis systems due to its good prognosis …

Cited by 22 - Related articles - All 10 versions

[PDF] arxiv.org

Sample, translate, recombine: Leveraging audio alignments for data augmentation in end-to-end speech translation

TK Lam, S Schamoni, S Riezler - arXiv preprint arXiv:2203.08757, 2022 - arxiv.org

End-to-end speech translation relies on data that pair source-language speech inputs with
corresponding translations into a target language. Such data are notoriously scarce, making …

Cited by 25 - Related articles - All 6 versions

Optimizing data usage for low-resource speech recognition

Y Qian, Z Zhou - IEEE/ACM Transactions on Audio, Speech …, 2022 - ieeexplore.ieee.org

Automatic speech recognition has made huge progress recently. However, the current
modeling strategy still suffers a large performance degradation when facing the low …

Cited by 15 - Related articles - All 2 versions

[PDF] arxiv.org

Fast random approximation of multi-channel room impulse response

Y Luo, R Gu - 2024 IEEE International Conference on Acoustics …, 2024 - ieeexplore.ieee.org

The training of modern neural-network-based speech processing systems typically requires
a large amount of reverberant data to make the systems robust against reverberation …

Cited by 5 - Related articles - All 2 versions