
Simone Conia

Assistant Professor

Sapienza University of Rome

About Me

I am an Assistant Professor at Sapienza University of Rome, where I conduct research at the intersection of Natural Language Processing and Machine Learning. My work focuses on advancing the capabilities and understanding of Large Language Models, particularly in multilingual settings.

I have authored over 40 publications in top-tier conferences (including ACL, EMNLP, AAAI, IJCAI, NAACL), contributing to the fields of AI and NLP. My research has been recognized with multiple awards, including Outstanding Paper Awards at EMNLP 2024 and NAACL 2021. I have also served as a Distinguished Program Committee member at IJCAI, and earlier in my career I was an Honor Student at Sapienza, a podium finisher at CyberChallenge.IT, and winner of the Google Startup Workshop.

Large Language Models

Understanding and improving the capabilities of modern language models

Multilingual NLP

Developing NLP systems that work across diverse languages

Retrieval Augmented Generation

Enhancing language models with external knowledge retrieval

Evaluation

Creating robust metrics and benchmarks for NLP systems

Citation Statistics

Citations by Year (2020-2025)

2020: 23 · 2021: 63 · 2022: 140 · 2023: 174 · 2024: 399 · 2025: 253

Total Citations: 1,063
h-index: 14
i10-index: 17

Awards & Recognition

Outstanding Paper - EMNLP 2024
Outstanding Paper - NAACL 2021
Distinguished PC - IJCAI
Honor Student - Sapienza University
3rd place - CyberChallenge.IT
1st place - Google Startup Workshop


Current Research

Large Language Models

Investigating the fundamental capabilities and limitations of large language models, with a focus on improving their reasoning abilities and reducing hallucinations.

Multilingual NLP

Developing robust natural language processing systems that can effectively handle multiple languages, with particular attention to low-resource languages.

Retrieval Augmented Generation

Enhancing language models by integrating external knowledge retrieval mechanisms to improve factual accuracy and reduce knowledge gaps.

Selected Publications

Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs

Simone Conia, Daniel Lee, Min Li, Umar Farooq Minhas, Saloni Potdar, Yunyao Li

In Proceedings of EMNLP 2024

KG-MT introduces a novel end-to-end approach that integrates multilingual knowledge graphs into neural machine translation via dense retrieval, enabling significant improvements over state-of-the-art systems in translating culturally nuanced entity names.

Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs

Simone Conia, Min Li, Daniel Lee, Umar Minhas, Ihab Ilyas, Yunyao Li

In Proceedings of EMNLP 2023

M-NTA is a novel unsupervised approach that combines Machine Translation, Web Search, and Large Language Models to automatically generate high-quality multilingual textual information for knowledge graphs, significantly improving coverage and precision for non-English languages.

Unifying Cross-Lingual Semantic Role Labeling with Heterogeneous Linguistic Resources

Simone Conia, Andrea Bacciu, Roberto Navigli

In Proceedings of NAACL 2021

This paper introduces a unified model for cross-lingual Semantic Role Labeling that learns to map heterogeneous linguistic formalisms across languages without word alignment or translation, enabling robust and simultaneous annotation with multiple inventories.

Experience & Education

Assistant Professor

2023 - Present

Sapienza University of Rome

Leading research in Natural Language Processing with focus on Large Language Models, Multilingual NLP, and Retrieval Augmented Generation. Teaching graduate courses in Computer Science.

Research External Collaborator

2023 - Present

Apple

Collaborating on research projects related to natural language understanding and generation, focusing on improving the performance of language models in real-world applications, especially in multilingual contexts.

Research Scientist Intern

2023

Apple

Conducted research on multilingual language models, focusing on enhancing their understanding and generation capabilities across various languages. Developed novel methodologies for improving knowledge-related question answering tasks across multiple languages.

Ph.D. in Computer Science

2019 - 2023

Sapienza University of Rome

Specialized in Natural Language Processing and Machine Learning. Dissertation on multilingual language understanding and generation.

Get In Touch

Contact Information

Research Interests

Large Language Models · Multilingual NLP · Retrieval Augmented Generation · Evaluation · Machine Learning

I'm always interested in discussing research collaborations and innovative projects in NLP and AI.