About the project

Author: Krzysztof Jassem

Research on multimodal information retrieval and extraction is a response to the rapidly growing availability of data in various formats: text, audio, graphics, and video. The simultaneous analysis, integration, and contextual “understanding” of these formats open up opportunities in science, industry, and public administration, but also demand complex technical and methodological solutions.

The most critical research challenge is to create universal methods for merging data from different modalities so that the system can automatically extract key information regardless of its source. It is crucial to ensure the understanding of the material by extracting facts and their interrelations; this requires advanced data fusion mechanisms and sophisticated knowledge representation techniques.

Scalability remains paramount, both in terms of efficiently handling large datasets and adapting the system to specific domains (through additional training techniques such as fine-tuning). Data security and privacy, especially in medical or industrial projects, demand not only compliance with legal regulations but also the development of validation procedures and quality controls for automated processing. Lastly, adapting to domain-specific vocabularies and diverse data acquisition methods (e.g., video systems with varying quality parameters) poses a significant challenge, necessitating high model flexibility and the continuous advancement of AI tools.

Key scientific and research challenges:

  1. Integration of data from various modalities
  2. Structuring and semantic analysis of information
  3. Adapting models to specific domains
  4. Managing data quality and privacy

By leveraging advanced artificial intelligence algorithms and collaborating with partners who have many years of R&D experience, the project stands a real chance of finding applications in concrete market solutions.

The research carried out at the Center for Artificial Intelligence has been commercialized in projects including the following:

| Name | Technology | Commercial Partner | Description |
|---|---|---|---|
| Medico | Semantic search, graphics/audio retrieval | WN PWN | Virtual Medical Assistant |
| Shipyard Extractor | Information extraction from graphics | Remontowa Shipyard | Information extraction from order offers |
| Pons Assistant | Bilingual semantic search | Pons | Language Learning Assistant |
| Speacair | Speech Recognition and Normalization | Samsung | Speech Recognition in a Dialogue System |
| Ferryt Navigator | Retrieval Augmented Generation | DomData | Documentation-based Chat Assistant |

 


Project leaflet

CSI_Project_PolSV_Jassem.pdf


Partners taking part in this project


Adam Mickiewicz University in Poznań
