MPCCT: Multimodal vision-language learning paradigm with context-based compact Transformer
C Chen, D Han, CC Chang - Pattern Recognition, 2024 - Elsevier
Transformer and its variants have become the preferred option for multimodal vision-
language paradigms. However, they struggle with tasks that demand high-dependency …
A multimodal hybrid parallel network intrusion detection model
S Shi, D Han, M Cui - Connection Science, 2023 - Taylor & Francis
With the rapid growth of Internet data traffic, the means of malicious attack become more
diversified. The single modal intrusion detection model cannot fully exploit the rich feature …
CLVIN: Complete language-vision interaction network for visual question answering
C Chen, D Han, X Shen - Knowledge-Based Systems, 2023 - Elsevier
The emergence of the Transformer optimizes the interactive modeling of multimodal
information in visual question answering (VQA) tasks, helping machines better understand …
Intelligent productivity transformation: corporate market demand forecasting with the aid of an AI virtual assistant
B Liu, M Li, Z Ji, H Li, J Luo - Journal of Organizational and End User …, 2024 - igi-global.com
With the penetration of deep learning technology into forecasting and decision support
systems, enterprises have an increasingly urgent need for accurate forecasting of time …
Multi-modal adaptive gated mechanism for visual question answering
Y Xu, L Zhang, X Shen - PLOS ONE, 2023 - journals.plos.org
Visual Question Answering (VQA) is a multimodal task that uses natural language to ask and
answer questions based on image content. For multimodal tasks, obtaining accurate …
BoostedDim attention: A novel data-driven approach to improving LiDAR-based lane detection
O Patil, BB Nair, R Soni, A Thayyilravi… - Ain Shams Engineering …, 2024 - Elsevier
Lane detection is a fundamental component of advanced driver assistance systems,
facilitating critical functionalities like Lane Keep/Change Assistance, Lane Departure …
Relational reasoning and adaptive fusion for visual question answering
X Shen, D Han, L Zong, Z Guo, J Hua - Applied Intelligence, 2024 - Springer
Visual relationship modeling plays an indispensable role in visual question answering
(VQA). VQA models need to fully understand the visual scene and positional relationships …
ARDN: Attention Re-distribution Network for Visual Question Answering
J Yi, D Han, C Chen, X Shen, L Zong - Arabian Journal for Science and …, 2024 - Springer
The Transformer-based architecture, with its efficient parallel computation, long-range
dependency modeling, and context-aware capabilities, has showcased remarkable …
Subgraph representation learning with self-attention and free adversarial training
D Qin, X Tang, J Lu - Applied Intelligence, 2024 - Springer
Due to its capacity to capture subgraph information within graph data, subgraph
representation learning has garnered considerable attention in recent years. However …