The power of generative AI: A review of requirements, models, input–output formats, evaluation metrics, and challenges

A Bandi, PVSR Adapa, YEVPK Kuchi - Future Internet, 2023 - mdpi.com
Generative artificial intelligence (AI) has emerged as a powerful technology with numerous
applications in various domains. There is a need to identify the requirements and evaluation …

Artificial intelligence in the creative industries: a review

N Anantrasirichai, D Bull - Artificial Intelligence Review, 2022 - Springer
This paper reviews the current state of the art in artificial intelligence (AI) technologies and
applications in the context of the creative industries. A brief background of AI, and …

DiffWave: A versatile diffusion model for audio synthesis

Z Kong, W Ping, J Huang, K Zhao… - arXiv preprint arXiv …, 2020 - arxiv.org
In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional
and unconditional waveform generation. The model is non-autoregressive, and converts the …

WaveGrad: Estimating gradients for waveform generation

N Chen, Y Zhang, H Zen, RJ Weiss, M Norouzi… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper introduces WaveGrad, a conditional model for waveform generation which
estimates gradients of the data density. The model is built on prior work on score matching …

Jukebox: A generative model for music

P Dhariwal, H Jun, C Payne, JW Kim… - arXiv preprint arXiv …, 2020 - assets.pubpub.org
We introduce Jukebox, a model that generates music with singing in the raw audio domain.
We tackle the long context of raw audio using a multiscale VQ-VAE to compress it to discrete …

MelGAN: Generative adversarial networks for conditional waveform synthesis

K Kumar, R Kumar, T De Boissiere… - Advances in neural …, 2019 - proceedings.neurips.cc
Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating
coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is …

DDSP: Differentiable digital signal processing

J Engel, L Hantrakul, C Gu, A Roberts - arXiv preprint arXiv:2001.04643, 2020 - arxiv.org
Most generative models of audio directly generate samples in one of two domains: time or
frequency. While sufficient to express any signal, these representations are inefficient, as …

GenAICHI: generative AI and HCI

M Muller, LB Chilton, A Kantosalo, CP Martin… - CHI conference on …, 2022 - dl.acm.org
This workshop applies human centered themes to a new and powerful technology,
generative artificial intelligence (AI). Unlike AI systems that produce decisions or …

High fidelity speech synthesis with adversarial networks

M Bińkowski, J Donahue, S Dieleman, A Clark… - arXiv preprint arXiv …, 2019 - arxiv.org
Generative adversarial networks have seen rapid development in recent years and have led
to remarkable improvements in generative modelling of images. However, their application …

A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions

S Ji, J Luo, X Yang - arXiv preprint arXiv:2011.06801, 2020 - arxiv.org
The utilization of deep learning techniques in generating various contents (such as image,
text, etc.) has become a trend. Especially music, the topic of this paper, has attracted …