ArcFace: Additive angular margin loss for deep face recognition

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 190 - Related articles - All 6 versions

A survey of convolutional neural networks: analysis, applications, and prospects

Z Li, F Liu, W Yang, S Peng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

A convolutional neural network (CNN) is one of the most significant networks in the deep
learning field. Since CNN made impressive achievements in many areas, including but not …

Cited by 3618 - Related articles - All 7 versions

AdaFace: Quality adaptive margin for face recognition

M Kim, AK Jain, X Liu - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com

Recognition in low quality face datasets is challenging because facial attributes are
obscured and degraded. Advances in margin-based loss functions have resulted in …

Cited by 465 - Related articles - All 7 versions

SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

Cited by 220 - Related articles - All 7 versions

Mitigating neural network overconfidence with logit normalization

H Wei, R Xie, H Cheng, L Feng… - … conference on machine …, 2022 - proceedings.mlr.press

Detecting out-of-distribution inputs is critical for the safe deployment of machine learning
models in the real world. However, neural networks are known to suffer from the …

Cited by 284 - Related articles - All 4 versions

Encoder-based domain tuning for fast personalization of text-to-image models

R Gal, M Arar, Y Atzmon, AH Bermano… - ACM Transactions on …, 2023 - dl.acm.org

Text-to-image personalization aims to teach a pre-trained diffusion model to reason about
novel, user provided concepts, embedding them into new scenes guided by natural …

Cited by 161 - Related articles - All 4 versions

Efficient geometry-aware 3D generative adversarial networks

ER Chan, CZ Lin, MA Chan… - Proceedings of the …, 2022 - openaccess.thecvf.com

Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using
only collections of single-view 2D photographs has been a long-standing challenge …

Cited by 1332 - Related articles - All 8 versions

WavLM: Large-scale self-supervised pre-training for full stack speech processing

S Chen, C Wang, Z Chen, Y Wu, S Liu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …

Cited by 1703 - Related articles - All 5 versions

DiffusionCLIP: Text-guided diffusion models for robust image manipulation

G Kim, T Kwon, JC Ye - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com

Recently, GAN inversion methods combined with Contrastive Language-Image Pretraining
(CLIP) enable zero-shot image manipulation guided by text prompts. However, their …

Cited by 649 - Related articles - All 9 versions

Diffusion autoencoders: Toward a meaningful and decodable representation

K Preechakul, N Chatthee… - Proceedings of the …, 2022 - openaccess.thecvf.com

Diffusion probabilistic models (DPMs) have achieved remarkable quality in image
generation that rivals GANs'. But unlike GANs, DPMs use a set of latent variables that lack …

Cited by 395 - Related articles - All 6 versions