Tracking everything everywhere all at once

Q Wang, YY Chang, R Cai, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a new test-time optimization method for estimating dense and long-range motion
from a video sequence. Prior optical flow or particle video tracking algorithms typically …

PointOdyssey: A large-scale synthetic dataset for long-term point tracking

Y Zheng, AW Harley, B Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework,
for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to …

TAP-Vid: A benchmark for tracking any point in a video

C Doersch, A Gupta, L Markeeva… - Advances in …, 2022 - proceedings.neurips.cc
Generic motion understanding from video involves not only tracking objects, but also
perceiving how their surfaces deform and move. This information is useful to make …

Scene representation transformer: Geometry-free novel view synthesis through set-latent scene representations

MSM Sajjadi, H Meyer, E Pot… - Proceedings of the …, 2022 - openaccess.thecvf.com
A classical problem in computer vision is to infer a 3D scene representation from few images
that can be used to render novel views at interactive rates. Previous work focuses on …

TAPIR: Tracking any point with per-frame initialization and temporal refinement

C Doersch, Y Yang, M Vecerik… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried
point on any physical surface throughout a video sequence. Our approach employs two …

OBJECT 3DIT: Language-guided 3D-aware image editing

O Michel, A Bhattad, E VanderBilt… - Advances in …, 2024 - proceedings.neurips.cc
Existing image editing tools, while powerful, typically disregard the underlying 3D geometry
from which the image is projected. As a result, edits made using these tools may become …

Quality not quantity: On the interaction between dataset design and robustness of CLIP

T Nguyen, G Ilharco, M Wortsman… - Advances in Neural …, 2022 - proceedings.neurips.cc
Web-crawled datasets have enabled remarkable generalization capabilities in recent
image-text models such as CLIP (Contrastive Language-Image pre-training) or Flamingo, but little …

Simple unsupervised object-centric learning for complex and naturalistic videos

G Singh, YF Wu, S Ahn - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Unsupervised object-centric learning aims to represent the modular, compositional, and
causal structure of a scene as a set of object representations and thereby promises to …

Infinite photorealistic worlds using procedural generation

A Raistrick, L Lipson, Z Ma, L Mei… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce Infinigen, a procedural generator of photorealistic 3D scenes of the natural
world. Infinigen is entirely procedural: every asset, from shape to texture, is generated from …

BlenderProc2: A procedural pipeline for photorealistic rendering

M Denninger, D Winkelbauer, M Sundermeyer… - Journal of Open Source …, 2023 - elib.dlr.de
BlenderProc2 is a procedural pipeline that can render realistic images for the training of
neural networks. Our pipeline can be employed in various use cases, including …