International Journal of Drug Delivery Technology
Volume 16, Issue 11s, 2026

Multimodal Fusion Framework for Real-time Deceptive Behavior Detection Through Integrated Analysis of Micro-expressions, Vocal Prosody, and Physiological Markers

1 Dr. Gudapati Syam Prasad, 2 Dr. Nancharaiah Murari, 3 Dr. G Srilakshmi, 4 Dr. B. Raja Srinivasa Reddy, 5 Dr. R. Kiran Kumar, 6 T. Subha Mastan Rao, 7 Dr. Subba Rao Polamuri

1Professor & HOD, Department of Computer Science and Engineering, Sri Vasavi Institute of Engineering & Technology, Nandamuru. Email: syamprasad.gudapati@gmail.com

2Associate Professor, Department of CSE, Sri Vasavi Institute of Engineering and Technology, Nandamuru. Email: nancharaiahmurari@gmail.com

3Associate Professor, IT Department, SRK Institute of Technology, Vijayawada. Email: sree.gpk@gmail.com

4Principal & Professor, Dept of CSE, Sri Vasavi Institute of Engineering and Technology, Pedana - 521369, Andhra Pradesh. Email: brsreddy208@sviet.edu.in

5Department of CSE, Krishna University, Machilipatnam. Email: kirankreddi@gmail.com

6Associate Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Educational Foundation, Vaddeswaram. Email: mastan1061@gmail.com

7Associate Professor, Department of Computer Science and Engineering, Aditya University, Surampalem, Andhra Pradesh, India. Email: psr.subbu546@gmail.com


ABSTRACT

Deceptive behavior detection remains a critical challenge in forensic psychology, security screening, and human-computer interaction. Traditional approaches that analyze a single modality often fail to capture the complex nature of deceptive communication. This research introduces a novel multimodal fusion framework that simultaneously analyzes micro-expressions, vocal prosodic features, and physiological markers to improve the accuracy of real-time lie detection. Our proposed Deep Multimodal Attention Network (DMAN) employs transformer-based architectures with cross-modal attention mechanisms to identify subtle correlations between facial expressions, voice characteristics, and autonomic nervous system responses during deceptive episodes.

The framework incorporates three specialized neural networks: a Temporal Convolutional Network (TCN) for micro-expression analysis, a Multi-scale Spectral CNN for prosodic feature extraction, and a Recurrent Neural Network (RNN) for physiological signal processing. These networks are integrated through a novel attention-weighted fusion layer that dynamically assigns an importance weight to each modality based on its reliability in the specific conversational context. Experimental validation on a newly constructed dataset of 2,500 interview sessions yields 94.7% accuracy, significantly outperforming existing single-modality approaches.
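The attention-weighted fusion step described above can be illustrated with a minimal sketch. It is not the DMAN implementation: it assumes each branch has already produced a fixed-length embedding, that a scalar reliability score per modality is available (here passed in by hand, whereas in a trained model it would be learned), and that an unavailable modality is represented by `None`. The names `fuse`, `softmax`, and `reliability_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(embeddings, reliability_scores):
    """Weighted sum of per-modality embeddings.

    embeddings: dict mapping modality name -> list of floats, or None
        when that modality is unavailable (assumed convention).
    reliability_scores: dict mapping modality name -> float; a stand-in
        for scores a trained attention layer would produce.
    Returns (fused_vector, weights_by_modality).
    """
    available = [m for m, e in embeddings.items() if e is not None]
    if not available:
        raise ValueError("at least one modality must be present")

    # Softmax only over the modalities that are present, so the
    # weights still sum to 1 when a modality is missing.
    weights = softmax([reliability_scores[m] for m in available])

    dim = len(embeddings[available[0]])
    fused = [0.0] * dim
    for modality, weight in zip(available, weights):
        for i, value in enumerate(embeddings[modality]):
            fused[i] += weight * value
    return fused, dict(zip(available, weights))


if __name__ == "__main__":
    embeddings = {"face": [1.0, 0.0], "voice": [0.0, 1.0], "physio": None}
    scores = {"face": 0.0, "voice": 0.0, "physio": 5.0}
    fused, weights = fuse(embeddings, scores)
    print(fused, weights)
```

Because the softmax is taken only over the modalities that are present, the physiological score is ignored in this example and the two remaining modalities share the weight equally. The returned weights are the kind of quantity that an attention visualization would display.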

This research advances automated deception detection by addressing temporal synchronization challenges, handling missing modality data, and providing interpretable decision-making through attention visualization. Our findings indicate that micro-expressions contribute 38% to final predictions, vocal features 35%, and physiological markers 27%, with cross-modal interactions further shaping the fused decision. This work establishes new benchmarks for multimodal deception detection and provides practical frameworks for real-world deployment in security and investigative applications.
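As an illustration of the temporal synchronization problem mentioned above, the sketch below resamples one signal onto another stream's timestamps by linear interpolation. The abstract does not state which alignment method the framework uses, so interpolation is an assumption made here for illustration, and the function name `resample` and the sample timestamps are hypothetical.

```python
from bisect import bisect_right


def resample(timestamps, values, target_times):
    """Linearly interpolate a signal at the requested times.

    timestamps: strictly increasing sample times of the source signal.
    values: source samples, same length as timestamps.
    target_times: times at which aligned samples are needed.
    Targets outside the source range are clamped to the end values.
    """
    aligned = []
    for t in target_times:
        if t <= timestamps[0]:
            aligned.append(values[0])
        elif t >= timestamps[-1]:
            aligned.append(values[-1])
        else:
            # Index of the first source timestamp strictly greater than t.
            j = bisect_right(timestamps, t)
            t0, t1 = timestamps[j - 1], timestamps[j]
            frac = (t - t0) / (t1 - t0)
            aligned.append(values[j - 1] + frac * (values[j] - values[j - 1]))
    return aligned


if __name__ == "__main__":
    # Hypothetical physiological samples taken once per second,
    # aligned to video frame times that fall between them.
    physio_t = [0.0, 1.0, 2.0]
    physio_v = [0.0, 10.0, 20.0]
    frame_t = [0.5, 1.0, 1.5, 3.0]
    print(resample(physio_t, physio_v, frame_t))
```

Once every stream is expressed on a shared timeline, per-frame features from the three branches can be compared or fused sample by sample.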

Keywords: Multimodal fusion, micro-expression analysis, vocal prosody, physiological markers, deception detection, transformer networks, attention mechanisms

How to cite this article: Syam Prasad G, Murari N, Srilakshmi G, Srinivasa Reddy BR, Kiran Kumar R, Mastan Rao TS, Polamuri S. Multimodal Fusion Framework for Real-time Deceptive Behavior Detection Through Integrated Analysis of Micro-expressions, Vocal Prosody, and Physiological Markers. Int J Drug Deliv Technol. 2026;16(11s): 31-39; DOI: 10.25258/ijddt.16.11s.4

Source of support: Nil.

Conflict of interest: None.