Scientific Research
Information Retrieval and Text Mining

The group focuses on information retrieval models, cross-language information retrieval, text mining (for example, of legal texts), and big data analytics.

Research topics

  • Information retrieval models (IR Model)
  • Cross-language information retrieval (Cross-Language IR)
  • Text mining
  • Big data analytics

Members

  • Assoc. Prof. Dr. Hồ Bảo Quốc
  • Dr. Lê Thị Nhàn
  • Dr. Nguyễn Trường Sơn
  • Lê Hoài Nam (PhD candidate)

Research projects

  • Mining an English-Vietnamese corpus from the Internet – institutional-level project, University of Science (Trường KHTN)
  • Building tools and resources for Vietnamese natural language processing – KC01 sub-project
  • Building a Vietnamese-French-English cross-language information retrieval system – international cooperation project funded by AUF
  • Text mining for legal texts – project at the level of Vietnam National University, Ho Chi Minh City

Possible collaboration activities

  • Information retrieval projects (monolingual and multilingual)
  • Projects on information extraction from text (medical, legal…)
  • Big data analytics projects

Publications

  • Huan Doan, Dinh Thuan Nguyen, Bao Quoc Ho: Building a Measure to Integrate into a Hybrid Data Mining Method to Analyze the Risk of Customer. Advanced Computer and Communication Engineering Technology, 01/2016: pages 843-851; ISBN: 978-3-319-24582-9, DOI:10.1007/978-3-319-24584-3_71
  • Le Nguyen Hoai Nam, Ho Bao Quoc: The Hybrid Filter Feature Selection Methods for Improving High-Dimensional Text Categorization. International Journal of Uncertainty Fuzziness and Knowledge-Based Systems 04/2017; 25(02):235-265, DOI:10.1142/S021848851750009X
  • Nguyen Truong Son, Le Minh Nguyen, Ho Bao Quoc, Akira Shimazu: Recognizing logical parts in legal texts using neural architectures. 2016 Eighth International Conference on Knowledge and Systems Engineering (KSE); 10/2016, DOI:10.1109/KSE.2016.7758062
  • Nghia Huynh, Lam Vu, Quoc Ho: A hybrid approach for DocTime classification. 2016 Eighth International Conference on Knowledge and Systems Engineering (KSE); 10/2016, DOI:10.1109/KSE.2016.7758053
  • Le Nguyen Hoai Nam, Ho Bao Quoc: A Comprehensive Filter Feature Selection for Improving Document Classification. The 29th Pacific Asia Conference on Language, Information and Computation (PACLIC 2015), Shanghai, China; 11/2015
  • Le Nguyen Hoai Nam, Ho Bao Quoc: A Combined Approach for Filter Feature Selection in Document Classification. 27th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2015), Vietri sur Mare, Italy; 11/2015, DOI:10.1109/ICTAI.2015.56
  • Nghia Huynh, Quoc Ho: A Combined Approach for Disease/Disorder Template Filling. 7th International Conference on Knowledge and Systems Engineering, Ho Chi Minh, Vietnam; 10/2015, DOI:10.1109/KSE.2015.62
  • Quoc Ho, Nghia Huynh: TeamHCMUS: Analysis of Clinical Text. The 9th International Workshop on Semantic Evaluation, Denver, Colorado; 06/2015, DOI:10.18653/v1/S15-2063
  • Son Nguyen, Quoc Ho, Minh Nguyen: JAIST: A two-phase machine learning approach for identifying discourse relations in newswire texts. Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task; 01/2015, DOI:10.18653/v1/K15-2010
  • Thinh D. Bui, Quoc B. Ho: An approach for automatically structuring Vietnamese legal text. Proceedings of the International Conference on Asian Language Processing 2014, IALP 2014; 12/2014, DOI:10.1109/IALP.2014.6973500
  • Huu Nghia Huynh, Bao-Quoc Ho: A Rule-based Approach for Relation Extraction from Clinical Documents. Asian Conference on Information Systems 2014, Nha Trang, Vietnam; 12/2014
  • Son Nguyen, Ho Bao Quoc, Nguyen Le Minh: Recognizing logical parts in Vietnamese Legal Texts using Conditional Random Fields. The 11th IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF2015); 11/2014, DOI:10.1109/RIVF.2015.7049865
  • Bui Dac Thinh, Nguyen Truong Son, Ho Bao Quoc: Towards a Conceptual Search for Vietnamese Legal Text. 13th International Conference on Computer Information Systems and Industrial Management Applications; 11/2014, DOI:10.13140/2.1.2885.0563
  • Bao-Quoc Ho, Huu Nghia Huynh, Son Lam Vu: ShARe/CLEFeHealth: A Hybrid Approach for Task 2. Working Notes for CLEF 2014 Conference, Sheffield, UK; 09/2014
  • Xuan Quang Pham, Minh Quang Le, Bao Quoc Ho: A Hybrid approach for biomedical event extraction. Proceedings of the BioNLP Shared Task 2013 Workshop; 08/2013
  • Thành Nguyễn, Bao-Quoc Ho: Semantic Integration In MMR For Multi-Document Summarization. FAIR - Fundamental and Applied IT Research, Hue University of Sciences, Hue City, Vietnam; 06/2013, DOI:10.13140/2.1.1779.6160
  • Hoai-Duc Tuan-Nguyen, Bao-Quoc Ho, Tuan-Dung Bui, Minh-Chau Hoang: A Grammatically Structured Noun Phrase Extractor for Vietnamese. Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on; 02/2012, DOI:10.1109/rivf.2012.6169837
  • Quang Le Minh, Son Nguyen Truong, Quoc Ho Bao: A pattern approach for biomedical event annotation. Proceedings of the BioNLP Shared Task 2011 Workshop; 06/2011
  • Quang-Vinh Tran, Ryutaro Ichise, Bao-Quoc Ho: Cluster-based similarity aggregation for ontology matching. Proceedings of the 6th International Workshop on Ontology Matching, Bonn, Germany, October 24, 2011; 01/2011
  • Bao-Quoc Ho, Van B. Dang, Minh V. Luong, Thuy T.B. Dong: English-Vietnamese Cross-Language Information Retrieval: An experimental study. Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on; 08/2008, DOI:10.1109/RIVF.2008.4586341
  • Ho Bao Quoc: Vietnamese Text Retrieval: Test Collection and First Experimentations. The First International Workshop on Evaluating Information Access; 04/2007
  • Van B. Dang, Bao-Quoc Ho: Automatic Construction of English-Vietnamese Parallel Corpus through Web Mining. Research, Innovation and Vision for the Future, 2007 IEEE International Conference on; 04/2007, DOI:10.1109/RIVF.2007.369166
  • Bao-Quoc Ho, Dong Thi Bich Thuy, Jean-Pierre Chevallet, Marie-France Bruandet: A structured indexing model based on noun phrases. 4th International Conference on Computer Sciences: Research, Innovation and Vision for the Future, February 12-16, 2006, Ho Chi Minh City, Vietnam; 01/2006, DOI:10.1109/RIVF.2006.1696423
  • Bao-Quoc Ho, Jean-Pierre Chevallet, Marie-France Bruandet: Recherche d'Information Bilingue français-vietnamien. Actes de la Deuxième Conférence Internationale Associant Chercheurs Vietnamiens et Francophones en Informatique, Hanoï Vietnam, 2-5 Février 2004; 01/2004
  • Marie-France Bruandet, Jean-Pierre Chevallet, Dong Thi Bich Thuy, Ho Bao Quoc: An approach to Vietnamese Information Retrieval. RIVF; 01/2002