Inspecting the geographical representativeness of images from text-to-image models
Recent progress in generative models has resulted in models that produce both realistic as
well as relevant images for most textual inputs. These models are being used to generate …
Exploring the limitations in how ChatGPT introduces environmental justice issues in the United States: A case study of 3,108 counties
The potential of Generative AI, such as ChatGPT, has sparked discussions among
researchers and the public. This study empirically explores the capabilities and limitations of …
Survey of cultural awareness in language models: Text and beyond
Large-scale deployment of large language models (LLMs) in various applications, such as
chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure …
A survey on advancements in image-text multimodal models: From general techniques to biomedical implementations
With the significant advancements of Large Language Models (LLMs) in the field of Natural
Language Processing (NLP), the development of image-text multimodal models has …
Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition
Existing object recognition models have been shown to lack robustness in diverse
geographical scenarios due to domain shifts in design and context. Class representations …
A survey on image-text multimodal models
With the significant advancements of Large Language Models (LLMs) in the field of Natural
Language Processing (NLP), the development of image-text multimodal models has …
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration
To create culturally inclusive vision-language models (VLMs), the foremost requirement is
developing a test benchmark that can diagnose the models' ability to respond to questions …
GD-COMET: A Geo-Diverse Commonsense Inference Model
With the increasing integration of AI into everyday life, it's becoming crucial to design AI
systems that serve users from diverse backgrounds by making them culturally aware. In this …
See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding
Vision-language models (VLMs) can respond to queries about images in many languages.
However, beyond language, culture affects how we see things. For example, individuals …
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor
Large multimodal models (LMMs) have shown remarkable performance in the visual
commonsense reasoning (VCR) task, which aims to answer a multiple-choice question …