About Me
I am a PhD student at the
School of Computer and Communication Sciences (IC),
EPFL, Switzerland. I am a doctoral research assistant in the
NLP and LSIR labs
under the supervision of Prof. Antoine Bosselut
and Prof. Karl Aberer.
Currently, I am doing a 4-month research internship at Google Zurich.
NEW: I’m on the job market, looking for a position starting in the Fall of 2025.
Research Interests
My research interests broadly encompass natural language processing (NLP) and machine learning, with a particular focus on improving the multilingual capabilities of large language models (LLMs), especially in low-resource settings.
I work across the entire pipeline of training multilingual LLMs, including:
- Pretraining Data Construction: Language identification, data filtering, and preprocessing to ensure high-quality datasets.
- Multilingual Data Mixtures: Designing effective data strategies for balanced language representation.
- Language-Aware Tokenization & Architectures: Developing multilingual tokenizers and LLMs that better handle low-resource languages.
- Robust Multilingual Evaluation: Curating robust benchmarks to assess model performance across languages.
Education
PhD in Computer & Communication Sciences | EPFL, Switzerland, 2019 - 2025 (expected)
MSc in Computer Engineering (Artificial Intelligence) | Shiraz University, Iran, 2013 - 2016
- Thesis: Deep Learning for Image Recognition
- Advisor: Prof. Ali Hamzeh
BSc in Computer Engineering (Software Engineering) | Shiraz University, Iran, 2009 - 2013
Work Experience
Research Intern | Google Research, Zurich, Switzerland [January - April 2025]
Working on a project to optimize long-context inference, improving LLMs' efficiency in processing and understanding extended inputs.
Doctoral Research Assistant | EPFL, Lausanne, Switzerland [2019 - Present]
Contributed to multiple research projects on multilingual LLMs, covering the entire training pipeline.
Led the multilingual effort within the SwissAI initiative.
Collaborated on projects with Google, Cohere, and HuggingFace.
Supervised junior researchers and summer interns. Served as a teaching assistant in several courses.
Scientific Assistant | Machine Learning and Optimization Laboratory, EPFL, Lausanne, Switzerland [Sept. 2018 - Sept. 2019]
I was involved in the mlbench project, a benchmark framework for distributed machine learning.
Research Intern | Data Analytics Laboratory, ETH, Zurich, Switzerland [May - July 2017]
As an intern in Thomas Hofmann's lab working under the supervision of Carsten Eickhoff,
I worked on a modular, patient-centric information retrieval system designed for precision oncology applications.
The result of the project was a submission to the TREC 2017 Precision Medicine track.
Research Intern | Max Planck Institute for Software Systems, Kaiserslautern, Germany [February - April 2017]
As an intern under the supervision of Manuel Gomez Rodriguez,
I worked on a project analyzing the dynamics of citation networks: quantifying the value of a set of published papers and modeling knowledge diffusion across a citation network.
R&D Engineer | Center of Intelligent Vision & Image Processing, Shiraz University, Shiraz, Iran [2016 - 2017]
I worked on projects focused on object detection, facial expression analysis, and real-time face recognition and tracking.
Publications
-
ICLR'25
Angelika Romanou, Negar Foroutan, Anna Sotnikova, et al.
International Conference on Learning Representations (ICLR), 2025.
-
PNAS'24
Beatriz Borges*, Negar Foroutan*, Deniz Bayazit*, Anna Sotnikova*, et al.
Proceedings of the National Academy of Sciences (PNAS), 2024.
-
EMNLP'24
Deniz Bayazit, Negar Foroutan, Zeming Chen, Gail Weiss, Antoine Bosselut
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
-
arXiv
Constanza Fierro, Negar Foroutan, Desmond Elliott, Anders Søgaard
arXiv preprint, 2024.
-
EMNLP'23
Negar Foroutan, Mohammadreza Banaei, Karl Aberer, Antoine Bosselut
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
-
ACL'23
Yasmine Karoui, Rémi Lebret, Negar Foroutan, Karl Aberer
Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
-
EMNLP'22
Negar Foroutan, Mohammadreza Banaei, Rémi Lebret, Antoine Bosselut, Karl Aberer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
-
Negar Foroutan, Angelika Romanou, Stéphane Massonnet, Rémi Lebret, Karl Aberer
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022.
-
Negar Foroutan, Martin Jaggi
Workshop on "Beyond first-order methods in ML systems" at ICML, 2020.
-
TCSS'17
Negar Foroutan, Ali Hamzeh
IEEE Transactions on Computational Social Systems.
-
Negar Foroutan, Jannick Griner, Nicolas Mesot, Leandro von Werra and Carsten Eickhoff
TREC Precision Medicine 2017.
-
ICCKE'15
Negar Foroutan, Ardavan Afshar, Bahareh Ashenagar, Ali Hamzeh
5th International Conference on Computer and Knowledge Engineering (ICCKE).