
A research group from FIT - HCMUS has received a scientific paper award at the A*-ranked ACM MM 2022 conference


Congratulations to the research group on the acceptance of their scientific paper.

Tran Thanh Phuc, class of 2018, Faculty of Information Technology.

Tran Quang Tien, class of 2018, Faculty of Information Technology.

Scientific paper: “A Textual-Visual Entailment-based Unsupervised Algorithm for Cheapfake Detection”, in the Grand Challenge: Detecting CheapFakes, part of the A*-ranked 30th ACM International Conference on Multimedia 2022 (Lisbon, Portugal).

Congratulations to the research group from fit@hcmus on winning an award and having their paper “A Textual-Visual Entailment-based Unsupervised Algorithm for Cheapfake Detection” accepted in the Grand Challenge: Detecting CheapFakes, part of the A*-ranked ACM MM 2022 conference (Lisbon, Portugal).

About the Grand Challenge on Detecting CheapFakes:

The “ACM MM 2022 Grand Challenge on Detecting CheapFakes” includes two tasks:

  • Task 1: Participants are asked to propose solutions that detect triplets of two captions and one image in which the image is used out of context. More specifically, given a triplet (Image, Caption1, Caption2) as input, the proposed model predicts the corresponding class label: out of context (OOC) or not out of context (NOOC). The aim of this task is not to determine which caption is right or wrong, but to identify whether the image's content is being misrepresented. The task is meant to support human fact-checking, because finding conflicting image-caption triplets narrows the search space.

  • Task 2: A Task 1 verdict on context does not by itself establish whether the statements are accurate. In practice, a specific image may not come with several captions. In that situation, the job is to verify whether the caption matches the image. This is a challenging task even for humans who lack knowledge about the origin of the image. Participants are asked to propose solutions that determine whether a given image-caption pair is real or fake.
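To make the Task 1 input and output concrete, here is a minimal Python sketch. The `Triplet` type, the word-overlap rule in `toy_predict`, and the `accuracy` helper are all hypothetical names introduced for illustration; the overlap rule is only a placeholder, not the challenge baseline and not the team's method.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

OOC = "OOC"    # the image is used out of context
NOOC = "NOOC"  # the image is not used out of context


@dataclass(frozen=True)
class Triplet:
    """One Task 1 sample: an image and two captions attached to it."""
    image_path: str
    caption1: str
    caption2: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (a crude stand-in for a real model)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def toy_predict(sample: Triplet, threshold: float = 0.2) -> str:
    """Placeholder rule (assumption): flag OOC when the captions share few words."""
    overlap = token_overlap(sample.caption1, sample.caption2)
    return OOC if overlap < threshold else NOOC


def accuracy(predict: Callable[[Triplet], str],
             samples: Sequence[Triplet],
             labels: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the reference label."""
    if len(samples) != len(labels) or not samples:
        raise ValueError("samples and labels must be non-empty and equally long")
    correct = sum(predict(s) == y for s, y in zip(samples, labels))
    return correct / len(samples)
```

Any real predictor would replace `toy_predict` with a model that also looks at the image; the interface (triplet in, OOC or NOOC out) stays the same.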

About the research method:

The method builds a model that predicts whether the context is consistent within a triplet of two captions and one image.

The goal is not to find the wrong caption but to detect a contradiction in context. The research group proposed two closely related directions: one using a supervised model and one using a combined model. The combined approach achieved better results than the basic method.

In addition, the team’s method combines state-of-the-art (SOTA) models from different Natural Language Processing tasks, which opens up several directions for developing a complete system. Combined, these methods overcome weaknesses of COSMOS and increase accuracy by 7.2% on the contest’s test dataset. The method was also applied to the second task, where it reached an accuracy of 73%.
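As a rough illustration of how outputs from several NLP models could be combined into one OOC/NOOC decision, consider the sketch below. It is an assumption-laden example, not the team's actual rule: the `CaptionSignals` fields, the `combine` function, and the 0.5 threshold are hypothetical, and the scores are assumed to come from upstream entailment and similarity models that are not shown.

```python
from dataclasses import dataclass

NLI_LABELS = {"entailment", "neutral", "contradiction"}


@dataclass(frozen=True)
class CaptionSignals:
    """Hypothetical model outputs for one caption pair."""
    nli_label: str     # entailment relation between the two captions
    similarity: float  # semantic similarity of the captions, in [0, 1]


def combine(signals: CaptionSignals, sim_threshold: float = 0.5) -> str:
    """Illustrative combination rule (assumption, not the published method)."""
    if signals.nli_label not in NLI_LABELS:
        raise ValueError(f"unknown NLI label: {signals.nli_label!r}")
    if signals.nli_label == "contradiction":
        return "OOC"   # captions contradict each other
    if signals.nli_label == "entailment":
        return "NOOC"  # one caption follows from the other
    # Neutral pairs: fall back to the similarity score.
    return "NOOC" if signals.similarity >= sim_threshold else "OOC"
```

The design point is that the entailment label decides clear cases, while the similarity score only breaks ties for neutral pairs.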

The team members mainly come from the Computer Security Club, Faculty of Information Technology, University of Science (VNU-HCM), in collaboration with post-graduate students from the University of Information Technology (VNU-HCM) and academics from NICT (Japan) and the University of Bergen (Norway).

  • Student Tran Quang Tien (Honors Program, class of 2018). 

  • Student Tran Thanh Phuc (Honors Program, class of 2018).

  • Post-graduate student La Tuan Vinh (University of Information Technology). 

  • MSc. Tran Anh Duy (University of Science).

  • Dr. Dao Minh Son (NICT, Japan). 

  • Assoc. Prof. Dr. Dang Nguyen Duc Tien (University of Bergen, Norway). 
