Cophy: Counterfactual learning of physical dynamics

J Duan, S Yu, HL Tan, H Zhu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …

Cited by 241 · Related articles · All 8 versions

Benchmarks for automated commonsense reasoning: A survey

E Davis - ACM Computing Surveys, 2023 - dl.acm.org

More than one hundred benchmarks have been developed to test the commonsense
knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …

Cited by 39 · Related articles · All 4 versions

Clevrer: Collision events for video representation and reasoning

K Yi, C Gan, Y Li, P Kohli, J Wu, A Torralba… - arXiv preprint arXiv …, 2019 - arxiv.org

The ability to reason about temporal and causal events from videos lies at the core of human
intelligence. Most video reasoning benchmarks, however, focus on pattern recognition from …

Cited by 450 · Related articles · All 6 versions

When physics meets machine learning: A survey of physics-informed machine learning

C Meng, S Seo, D Cao, S Griesemer, Y Liu - arXiv preprint arXiv …, 2022 - arxiv.org

Physics-informed machine learning (PIML), referring to the combination of prior knowledge
of physics, which is the high level abstraction of natural phenomenons and human …

Cited by 75 · Related articles · All 2 versions

Causalworld: A robotic manipulation benchmark for causal structure and transfer learning

O Ahmed, F Träuble, A Goyal, A Neitz, Y Bengio… - arXiv preprint arXiv …, 2020 - arxiv.org

Despite recent successes of reinforcement learning (RL), it remains a challenge for agents
to transfer learned skills to related environments. To facilitate research addressing this …

Cited by 132 · Related articles · All 5 versions

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

R Girdhar, D Ramanan - arXiv preprint arXiv:1910.04744, 2019 - arxiv.org

Computer vision has undergone a dramatic revolution in performance, driven in large part
through deep features trained on large-scale supervised datasets. However, much of these …

Cited by 176 · Related articles · All 3 versions

Dynamic visual reasoning by learning differentiable physics models from video and language

M Ding, Z Chen, T Du, P Luo… - Advances In Neural …, 2021 - proceedings.neurips.cc

In this work, we propose a unified framework, called Visual Reasoning with Differentiable
Physics (VRDP), that can jointly learn visual concepts and infer physics models of objects …

Cited by 69 · Related articles · All 8 versions

Capturing the objects of vision with neural networks

B Peters, N Kriegeskorte - Nature human behaviour, 2021 - nature.com

Human visual perception carves a scene at its physical joints, decomposing the world into
objects, which are selectively attended, tracked and predicted as we engage our …

Cited by 56 · Related articles · All 12 versions

Grounding physical concepts of objects and events through dynamic visual reasoning

Z Chen, J Mao, J Wu, KYK Wong… - arXiv preprint arXiv …, 2021 - arxiv.org

We study the problem of dynamic visual reasoning on raw videos. This is a challenging
problem; currently, state-of-the-art models often require dense supervision on physical …

Cited by 96 · Related articles · All 6 versions

Learning what makes a difference from counterfactual examples and gradient supervision

D Teney, E Abbasnedjad, A van den Hengel - Computer Vision–ECCV …, 2020 - Springer

One of the primary challenges limiting the applicability of deep learning is its susceptibility to
learning spurious correlations rather than the underlying mechanisms of the task of interest …

Cited by 128 · Related articles · All 8 versions