Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

A survey of mix-based data augmentation: Taxonomy, methods, applications, and explainability

C Cao, F Zhou, Y Dai, J Wang, K Zhang - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation (DA) is indispensable in modern machine learning and deep neural
networks. The basic idea of DA is to construct new training data to improve the model's …

STEMM: Self-learning with speech-text manifold mixup for speech translation

Q Fang, R Ye, L Li, Y Feng, M Wang - arXiv preprint arXiv:2203.10426, 2022 - arxiv.org
How to learn a better speech representation for end-to-end speech-to-text translation (ST)
with limited labeled data? Existing techniques often attempt to transfer powerful machine …

A new approach for detecting fundus lesions using image processing and deep neural network architecture based on YOLO model

C Santos, M Aguiar, D Welfer, B Belloni - Sensors, 2022 - mdpi.com
Diabetic Retinopathy is one of the main causes of vision loss, and in its initial stages, it
presents with fundus lesions, such as microaneurysms, hard exudates, hemorrhages, and …

An End‐to‐End Cardiac Arrhythmia Recognition Method with an Effective DenseNet Model on Imbalanced Datasets Using ECG Signal

H Ullah, MB Bin Heyat, F Akhtar, Sumbul… - Computational …, 2022 - Wiley Online Library
Electrocardiography (ECG) is a well‐known noninvasive technique in medical science that
provides information about the heart's rhythm and current conditions. Automatic ECG …

An automatic premature ventricular contraction recognition system based on imbalanced dataset and pre-trained residual network using transfer learning on …

H Ullah, MBB Heyat, F Akhtar, AY Muaad… - Diagnostics, 2022 - mdpi.com
The development of automatic monitoring and diagnosis systems for cardiac patients over
the internet has been facilitated by recent advancements in wearable sensor devices from …

[Retracted] An Effective and Lightweight Deep Electrocardiography Arrhythmia Recognition Model Using Novel Special and Native Structural Regularization …

H Ullah, MB Bin Heyat, H AlSalman… - Journal of …, 2022 - Wiley Online Library
Recently, cardiac arrhythmia recognition from electrocardiography (ECG) with deep learning
approaches is becoming popular in clinical diagnosis systems due to its good prognosis …

Sample, translate, recombine: Leveraging audio alignments for data augmentation in end-to-end speech translation

TK Lam, S Schamoni, S Riezler - arXiv preprint arXiv:2203.08757, 2022 - arxiv.org
End-to-end speech translation relies on data that pair source-language speech inputs with
corresponding translations into a target language. Such data are notoriously scarce, making …

Optimizing data usage for low-resource speech recognition

Y Qian, Z Zhou - IEEE/ACM Transactions on Audio, Speech …, 2022 - ieeexplore.ieee.org
Automatic speech recognition has made huge progress recently. However, the current
modeling strategy still suffers a large performance degradation when facing the low …

Fast random approximation of multi-channel room impulse response

Y Luo, R Gu - 2024 IEEE International Conference on Acoustics …, 2024 - ieeexplore.ieee.org
The training of modern neural-network-based speech processing systems typically requires
a large amount of reverberant data to make the systems robust against reverberation …