Interpreting CLIP's Image Representation via Text-Based Decomposition
We investigate the CLIP image encoder by analyzing how individual model components
affect the final representation. We decompose the image representation as a sum across …
Interpreting CLIP's Image Representation via Text-Based Decomposition
Y Gandelsman, AA Efros, J Steinhardt - arXiv e-prints, 2023 - ui.adsabs.harvard.edu
We investigate the CLIP image encoder by analyzing how individual model components
affect the final representation. We decompose the image representation as a sum across …
Interpreting CLIP's Image Representation via Text-Based Decomposition
Y Gandelsman, AA Efros, J Steinhardt - The Twelfth International … - openreview.net
We investigate the CLIP image encoder by analyzing how individual model components
affect the final representation. We decompose the image representation as a sum across …