SampleRNN: An unconditional end-to-end neural audio generation model (2016)

F Ren, Y Bao - International Journal of Information Technology & …, 2020 - World Scientific

In the field of artificial intelligence, human–computer interaction (HCI) technology and its
related intelligent robot technologies are essential and interesting contents of research …

Cited by 132 Related articles All 10 versions

An overview of image caption generation methods

H Wang, Y Zhang, X Yu - Computational intelligence and …, 2020 - Wiley Online Library

In recent years, with the rapid development of artificial intelligence, image caption has
gradually attracted the attention of many researchers in the field of artificial intelligence and …

Cited by 109 Related articles All 12 versions

From artificial neural networks to deep learning for music generation: history, concepts and trends

JP Briot - Neural Computing and Applications, 2021 - Springer

The current wave of deep learning (the hyper-vitamined return of artificial neural networks)
applies not only to traditional statistical machine learning tasks: prediction and classification …

Cited by 112 Related articles All 10 versions

Audio super resolution using neural networks

V Kuleshov, SZ Enam, S Ermon - arXiv preprint arXiv:1708.00853, 2017 - arxiv.org

We introduce a new audio processing technique that increases the sampling rate of signals
such as speech or music using deep convolutional neural networks. Our model is trained on …

Cited by 148 Related articles All 5 versions

A survey of research on recurrent neural networks (循环神经网络研究综述)

刘建伟 (Liu Jianwei), 宋志妍 (Song Zhiyan) - 控制与决策 (Control and Decision), 2022 - kzyjc.alljournals.cn

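Recurrent neural networks are the main implementation of neural network sequence models and have developed rapidly in recent years. They are the standard approach to machine translation,
machine question answering, and sequential video analysis, and are also used for automatic handwriting synthesis, speech processing, image generation, and so on …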

Cited by 16 Related articles All 2 versions

Review of end-to-end speech synthesis technology based on deep learning

Z Mu, X Yang, Y Dong - arXiv preprint arXiv:2104.09995, 2021 - arxiv.org

As an indispensable part of modern human-computer interaction system, speech synthesis
technology helps users get the output of intelligent machine more easily and intuitively, thus …

Cited by 36 Related articles All 2 versions

SRECG: ECG signal super-resolution framework for portable/wearable devices in cardiac arrhythmias classification

TM Chen, YH Tsai, HH Tseng, KC Liu… - IEEE Transactions …, 2023 - ieeexplore.ieee.org

A combination of cloud-based deep learning (DL) algorithms with portable/wearable (P/W)
devices has been developed as a smart health care system to support automatic cardiac …

Cited by 26 Related articles All 4 versions

DIA-TTS: deep-inherited attention-based text-to-speech synthesizer

J Yu, Z Xu, X He, J Wang, B Liu, R Feng, S Zhu… - Entropy, 2022 - mdpi.com

Text-to-speech (TTS) synthesizers have been widely used as a vital assistive tool in various
fields. Traditional sequence-to-sequence (seq2seq) TTS such as Tacotron2 uses a single …

Cited by 5 Related articles All 7 versions

A novel method for Mandarin speech synthesis by inserting prosodic structure prediction into Tacotron2

J Liu, Z Xie, C Zhang, G Shi - International Journal of Machine Learning …, 2021 - Springer

Speech synthesis, an artificial intelligence technology that employs computers to imitate
human speech, has played a crucial role in human–computer interaction since it can …

Cited by 9 Related articles

A review of the text-to-speech synthesizer for human robot interaction for patients with Alzheimer's disease

J Yu, Y Yao, R Feng, T Liang, W Wang, J Li - Digital Medicine, 2023 - journals.lww.com

With the rapid aging of the population worldwide, the number of people with mild
cognitive impairment (MCI) has also increased substantially. To ease the problem that not all …

Cited by 1 Related articles All 2 versions