The Multi-modal Machine Comprehension Group (MMC Group) is a research group focusing on multimodal information mining and processing. The group brings together researchers from academia and industry, both local and international, across fields of Computer Science such as Natural Language Processing, Audio Processing, and Image Processing. Its main goal is to develop multimodal processing systems that address today's common problems.

Research topics

  • Modeling multimodal knowledge such as images, text, and sound
  • Predictive models based on multimodal data: text, images, and sound
  • Explainable AI models

Members

  • Dr. Nguyen Tien Huy (Team Leader) - Faculty of Information Technology, VNUHCM-University of Science
  • Dr. Le Thanh Tung - Faculty of Information Technology, VNUHCM-University of Science
  • Assoc. Prof. Dr. Nguyen Le Minh - JAIST, Japan
  • Dr. Pho Ngoc Dang Khoa - Trusting Social
  • MSc. Nguyen Tran Duy Minh - Faculty of Information Technology, VNUHCM-University of Science
  • BSc. Nguyen Duc Anh - Faculty of Information Technology, VNUHCM-University of Science

Typical research projects

  • Object-less vision-language model for classifying visual questions for the blind (University of Science: Sep 2021)
  • Predicting the answerability of visual questions for visually impaired people using a Residual Attention model (University of Science: Mar 2022)

Scientific publications

  1. Tung Le, Huy Tien Nguyen, and Minh Le Nguyen. 2021. “Multi Visual and Textual Embedding on Visual Question Answering for Blind People.” Neurocomputing 465:451–64. doi: https://doi.org/10.1016/j.neucom.2021.08.117.

  2. Tung Le, Khoa Pho, Thong Bui, Huy Tien Nguyen, and Minh Le Nguyen. 2022. “Object-Less Vision-Language Model on Visual Question Classification for Blind People.” Pp. 180–187 in Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, SciTePress.

  3. Duy-Minh Nguyen-Tran, Tung Le, Minh Le Nguyen, and Huy Tien Nguyen. 2022. “Bi-directional Cross-Attention Network on Vietnamese Visual Question Answering.” In Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation (PACLIC).

  4. Duy-Minh Nguyen-Tran, Tung Le, Khoa Pho, Minh Le Nguyen, and Huy Tien Nguyen. 2022. “RVT-Transformer: Residual Attention in Answerability Prediction on Visual Question Answering for Blind People.” In Proceedings of the 14th International Conference on Computational Collective Intelligence (ICCCI).

  5. Anh Duc Nguyen, Tung Le, and Huy Tien Nguyen. 2022. “Combining Multi-vision Embedding in Contextual Attention for Vietnamese Visual Question Answering.” In Pacific-Rim Symposium on Image and Video Technology (PSIVT).

  6. Tung Le, Huy Tien Nguyen, and Minh Le Nguyen. 2021. “Vision And Text Transformer For Predicting Answerability On Visual Question Answering.” Pp. 934–938 in 2021 IEEE International Conference on Image Processing (ICIP).

  7. Tung Le, Thong Bui, Huy Tien Nguyen, and Minh Le Nguyen. 2021. “Bi-Direction Co-Attention Network on Visual Question Answering for Blind People.” Pp. 335–442 in Fourteenth International Conference on Machine Vision (ICMV 2021). Vol. 12084.