Advancing Large AI Models: Integration of New Data Modalities and Expansion of Capabilities (AI, Data and Robotics Partnership) (RIA)
Projects are expected to contribute to one or more of the following outcomes:
- Enhanced applicability of large AI systems to new domains through the integration of innovative data modalities, such as sensor measurements (e.g. in robotics, IoT) or remote sensing (e.g. earth observation), as input.
- Improvement of the capabilities of current multimodal large AI systems and expansion of the number of data modalities jointly handled by one AI system, leading to broader application potential and improved AI performance.
Large artificial intelligence (AI) models refer to a new generation of general-purpose AI models (i.e., generative AI) capable of adapting to diverse domains and tasks without significant modification. Notable examples, such as OpenAI's GPT-4V and Meta's Llama 2 or DINOv2, have demonstrated a wide and growing variety of capabilities.
The swift progression of large AI models in recent years holds immense potential to revolutionize various industries, due to their ability to adapt to diverse tasks and domains. For them to achieve their potential, access to vast data repositories, significant computing resources, and skilled engineers is required. A promising avenue of research is the development of multi-modal large AI models that can seamlessly integrate multiple modalities, including text, structured data, computer code, visual or audio media, robotics or IoT sensors, and remote sensing data.
This topic centres on the development of innovative multimodal large AI models, covering both the training of foundation models and their subsequent fine-tuning. These models should show superior capabilities across a wide array of downstream tasks. The emphasis is both on integrating new input data modalities into large AI models and on developing multimodal large AI models with significantly higher capabilities and/or the ability to handle a greater number of modalities.
Moreover, projects should contribute to reinforcing Europe's research excellence in the field of large AI models by driving substantial scientific progress and innovation in key large AI areas. This includes the development of novel methods for pretraining multimodal foundation models. Additionally, novel approaches to effective and efficient fine-tuning of such models should be pursued.
Research activities should explore innovative methodologies for enhancing the representation, alignment, and interaction among the different data modalities, thereby substantially improving the overall performance and trustworthiness of these models. Also of interest are advances in efficient computation for the pre-training, execution and fine-tuning of foundation models that reduce their computational and environmental impact, as well as advances that increase the safety of models.
Proposals should outline how the models will incorporate trustworthiness, considering factors such as explainability, security, and privacy in line with provisions in the upcoming Artificial Intelligence Act. Additionally, the models should incorporate characteristics that align with European values, and provide improved multilingual capabilities, where relevant.
Proposals should address at least one of the following focus areas:
- the integration of innovative modalities of data for large AI models during training and inference. Examples of innovative modalities include event streams, structured data and sensor measurements. The incorporation of such new modalities could potentially bring unforeseen enhancements to model performance and enable their application in new domains like weather forecasting, robotics, and manufacturing.
- enhanced multimodal models that exceed the current state of the art, with either significantly improved capabilities or the ability to handle a larger number of modalities. This focus area also encompasses models capable of multi-modal output generation. Current large-scale multimodal models most commonly engage with only vision and language.
Each proposal is expected to address all of the following:
- Data Collection, Processing and Cross-modal Alignment. The proposal should describe convincingly the characteristics and availability of the large, trustworthy data sources, as well as the trustworthy data processing to be utilised within the project, detailing the data processing steps to ensure reliability, accountability and transparency, and the alignment of data among the different modalities. A modest portion (up to 10%) of the budget may be allocated to data collection activities; proposals may involve relevant data owners in this task, if necessary. Importantly, the proposal should delineate how potential privacy and IPR issues associated with the data will be managed and mitigated.
- Multimodal Foundation Model Pretraining. The pretrained multimodal foundation model is expected to demonstrate high capabilities across a wide range of tasks. The pretraining tasks used should be agnostic of down-stream tasks. These activities also cover the development of the codebase and implementation of small-scale experiments. A minor portion (up to 10%) of the budget may be allocated for the acquisition of computing resources for codebase development and small-scale experiments, though the primary source of computing resources for pretraining should be sought from external high-performance computing facilities such as EuroHPC or National centres. The proposal should describe convincingly the strategy to access these computing resources.
- Fine-Tuning of Multimodal Foundation Models: The proposal should clearly detail the activities pursued to fine-tune the model for diverse downstream tasks demonstrating illustrative potential use-cases. The tasks' output may either be of a single modality or multimodality. Research activities should investigate innovative methodologies designed to bolster the interplay between different data modalities, thereby enhancing the overall performance of these models.
- Testing and Evaluation: The proposal should detail the development of workflows, benchmarks, testing procedures, and pertinent tools for evaluating both foundation and fine-tuned models. Attention should be paid to the performance, transparency, bias, robustness, accuracy, and security of the models, through appropriate testing procedures (e.g., red teaming for safety and security), in compliance with the future AI Act.
Proposals should adopt a multidisciplinary research team, as appropriate, to cover all the above issues.
Proposals should adhere to Horizon Europe's guidelines regarding Open Science practices as well as the FAIR data principles. Open access should be provided to research outputs - including training datasets, software tools, model architecture and hyperparameters, as well as model weights - unless a legitimate interest or constraint applies. Additionally, proposals are encouraged to deliver results under open-source licenses.
All proposals are expected to embed mechanisms to assess and demonstrate progress (with qualitative and quantitative KPIs, benchmarking and progress monitoring, including participation in international evaluation contests, as well as illustrative application use-cases demonstrating concrete potential added value). They are also expected to share communicable results with the European R&D community through the AI-on-demand platform, Common European data spaces and, if necessary, other relevant digital resource platforms, in order to enhance the European AI, Data and Robotics ecosystem through the sharing of results and best practice.
Proposals are also expected to dedicate tasks and resources to collaborate with and provide input to the open innovation challenge under HORIZON-CL4-2023-HUMAN-01-04. Research teams involved in the proposals are expected to participate in the respective Innovation Challenges.
This topic implements the co-programmed European Partnership on AI, data and robotics.
Specific Topic Conditions: Activities are expected to start at TRL 2-3 and achieve TRL 4-5 by the end of the project – see General Annex B.
General conditions
1. Admissibility conditions: described in Annex A and Annex E of the Horizon Europe Work Programme General Annexes
Proposal page limits and layout: described in Part B of the Application Form available in the Submission System
2. Eligible countries: described in Annex B of the Work Programme General Annexes
A number of non-EU/non-Associated Countries that are not automatically eligible for funding have made specific provisions for making funding available for their participants in Horizon Europe projects. See the information in the Horizon Europe Programme Guide.
3. Other eligibility conditions: described in Annex B of the Work Programme General Annexes
If projects use satellite-based earth observation, positioning, navigation and/or related timing data and services, beneficiaries must make use of Copernicus and/or Galileo/EGNOS (other data and services may additionally be used).
In order to achieve the expected outcomes, and safeguard the Union's strategic assets, interests, autonomy, and security, participation in this topic is limited to legal entities established in Member States, associated countries, OECD and Mercosur countries, countries with which the EU cooperates under a Trade and Technology Council, and countries with which the EU has a Digital Partnership. Proposals including legal entities which are not established in these countries will be ineligible.
This decision has been taken on the grounds that, in the area of research covered by this topic, EU open strategic autonomy is particularly at stake. It is important to avoid a situation of technological dependency on a non-EU source, in a global context that requires the EU to take action to build on its strengths, and to carefully assess and address any strategic weaknesses, vulnerabilities and high-risk dependencies which put at risk the attainment of its ambitions.
For the duly justified and exceptional reasons listed in the paragraph above, in order to guarantee the protection of the strategic interests of the Union and its Member States, entities established in an eligible country listed above, but which are directly or indirectly controlled by a non-eligible country or by a non-eligible country entity, may not participate in the action unless it can be demonstrated, by means of guarantees provided by their eligible country of establishment, that their participation in the action would not negatively impact the Union's strategic assets, interests, autonomy, or security[[ The guarantees shall in particular substantiate that, for the purpose of the action, measures are in place to ensure that:
a) control over the applicant legal entity is not exercised in a manner that restrains or restricts its ability to carry out the action and to deliver results, that imposes restrictions concerning its infrastructure, facilities, assets, resources, intellectual property or know-how needed for the purpose of the action, or that undermines its capabilities and standards necessary to carry out the action;
b) access by a non-eligible country or by a non-eligible country entity to sensitive information relating to the action is prevented; and the employees or other persons involved in the action have a national security clearance issued by an eligible country, where appropriate;
c) ownership of the intellectual property arising from, and the results of, the action remain within the recipient during and after completion of the action, are not subject to control or restrictions by non-eligible countries or non-eligible country entity, and are not exported outside the eligible countries, nor is access to them from outside the eligible countries granted, without the approval of the eligible country in which the legal entity is established.]].
The participants directly subject to this eligibility condition are not only beneficiaries, affiliated entities and associated partners but also subcontractors. Their participation is therefore subject to an ex-ante ownership control assessment by the EC and, if relevant, the EC acceptance of a guarantee approved by an eligible country[[Notwithstanding that this eligibility condition specifically applies to beneficiaries, affiliated entities, associated partners and subcontractors, applicants are reminded that the restrictions on place of establishment and control extend to all participants under the grant agreement. See MGA, Annex 5, SPECIFIC RULES FOR CARRYING OUT THE ACTION (ARTICLE 18) Implementation in case of restrictions due to strategic assets, interests, autonomy or security of the EU and its Member States.]].
4. Financial and operational capacity and exclusion: described in Annex C of the Work Programme General Annexes
5. Evaluation and award:
- Award criteria, scoring and thresholds are described in Annex D of the Work Programme General Annexes
- Submission and evaluation processes are described in Annex F of the Work Programme General Annexes and the Online Manual
- Indicative timeline for evaluation and grant agreement: described in Annex F of the Work Programme General Annexes
6. Legal and financial set-up of the grants: described in Annex G of the Work Programme General Annexes
Specific conditions
7. Specific conditions: described in the [specific topic of the Work Programme]
Documents
Call documents:
Standard application form — call-specific application form is available in the Submission System
Standard application form (HE RIA, IA)
Standard evaluation form — will be used with the necessary adaptations
Standard evaluation form (HE RIA, IA)
MGA
Call-specific instructions
Ownership control assessment declaration
Additional documents:
HE Main Work Programme 2023–2024 – 1. General Introduction
HE Main Work Programme 2023–2024 – 7. Digital, Industry and Space
HE Main Work Programme 2023–2024 – 13. General Annexes
HE Framework Programme and Rules for Participation Regulation 2021/695
HE Specific Programme Decision 2021/764
Rules for Legal Entity Validation, LEAR Appointment and Financial Capacity Assessment
EU Grants AGA — Annotated Model Grant Agreement
Funding & Tenders Portal Online Manual
Please read carefully all provisions below before the preparation of your application.
Online Manual is your guide on the procedures from proposal submission to managing your grant.
Horizon Europe Programme Guide contains the detailed guidance to the structure, budget and political priorities of Horizon Europe.
Funding & Tenders Portal FAQ – find the answers to most frequently asked questions on submission of proposals, evaluation and grant management.
Research Enquiry Service – ask questions about any aspect of European research in general and the EU Research Framework Programmes in particular.
National Contact Points (NCPs) – get guidance, practical information and assistance on participation in Horizon Europe. There are also NCPs in many non-EU and non-associated countries (‘third-countries’).
Enterprise Europe Network – contact your EEN national contact for advice to businesses with special focus on SMEs. The support includes guidance on the EU research funding.
IT Helpdesk – contact the Funding & Tenders Portal IT helpdesk for questions such as forgotten passwords, access rights and roles, technical aspects of submission of proposals, etc.
European IPR Helpdesk assists you on intellectual property issues.
CEN-CENELEC Research Helpdesk and ETSI Research Helpdesk – the European Standards Organisations advise you how to tackle standardisation in your project proposal.
The European Charter for Researchers and the Code of Conduct for their recruitment – consult the general principles and requirements specifying the roles, responsibilities and entitlements of researchers, employers and funders of researchers.
Partner Search Services help you find a partner organisation for your proposal.
