Iana Atanassova, Ph.D.

Assistant Professor | Maître de Conférences HDR

Centre Tesnière - CRIT, Université de Bourgogne Franche-Comté
30 rue Mégevand, 25030 Besançon Cedex, France

iana.atanassova@univ-fcomte.fr

ORCiD: 0000-0003-3571-4006

Research

Subjects and Projects

My research is in the field of semantic information retrieval and extraction, Natural Language Processing and, more specifically, full-text scientific paper processing. I explore the problems of semantic annotation, linguistic modeling, semantic publishing and information extraction from scientific texts with a view to developing semantic-driven text navigation interfaces.

Semantic annotation

I am interested in knowledge-based approaches to semantic annotation, text classification and information extraction. I examine the ways in which linguistic resources and rule-based processing can contribute to the precision of machine learning systems, statistical models and deep learning. For example, I have been working on the full-text semantic processing of scientific papers and the classification of sentences into categories such as method, result, hypothesis, etc.
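As a rough illustration of rule-based sentence categorisation, here is a minimal Python sketch. The cue patterns and the function name classify_sentence are hypothetical and chosen only for this example. They are not the linguistic resources used in the work described above, and a real system would rely on much richer rules.

```python
import re

# Hypothetical cue patterns, for illustration only (not the author's resources).
# Categories are tested in insertion order; the first match wins.
CUE_PATTERNS = {
    "method": re.compile(
        r"\b(we (use|used|apply|applied|propose)|was performed|were collected)\b",
        re.IGNORECASE,
    ),
    "result": re.compile(
        r"\b(results? (show|shows|showed|indicate)|we (found|observed)|significantly)\b",
        re.IGNORECASE,
    ),
    "hypothesis": re.compile(
        r"\b(we hypothesi[sz]e|may|might|could be explained)\b",
        re.IGNORECASE,
    ),
}


def classify_sentence(sentence):
    """Return the first category whose cue pattern matches, else 'other'."""
    for category, pattern in CUE_PATTERNS.items():
        if pattern.search(sentence):
            return category
    return "other"


if __name__ == "__main__":
    for s in [
        "We used a corpus of 1,000 papers.",
        "The results show a clear trend.",
        "We hypothesize that the effect is lexical.",
    ]:
        print(classify_sentence(s), "<-", s)
```

Labels produced by such rules could also be fed as features to a statistical classifier, which is one way that rule-based processing can support machine learning.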

Information Retrieval, Semantic Web, LOD

I have been conducting a series of experiments on faceted semantic search on scientific corpora using knowledge-based approaches for building semantic-driven IR interfaces. This problem is closely related to the problem of ontology population and the generation of Linked Open Data from papers.
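To make the idea of faceted search over annotated text concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each sentence is stored as a dictionary with "section" and "category" annotations. The field names and the helpers facet_counts and filter_by_facets are hypothetical and do not describe the actual interfaces.

```python
from collections import Counter


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


def filter_by_facets(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in selected.items())
    ]


if __name__ == "__main__":
    # Toy annotated sentences (invented data, for illustration only).
    records = [
        {"text": "We used a random sample.", "section": "Methods", "category": "method"},
        {"text": "Accuracy improved.", "section": "Results", "category": "result"},
        {"text": "Earlier work found an effect.", "section": "Introduction", "category": "result"},
    ]
    print(facet_counts(records, "category"))
    print(filter_by_facets(records, category="result", section="Results"))
```

In a production setting a search server would normally compute the facet counts. The sketch only shows the logic of narrowing a result set by semantic annotations.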

The structure of scientific papers

By analysing the IMRaD (Introduction, Methods, Results and Discussion) structure of scientific papers, we can show that this rhetorical framework affects the distribution of references, the lexical distributions and many other phenomena that we can observe in papers. Studying the properties of scientific papers provides data for numerous applications, especially IR, Bibliometrics and the analysis of citation networks, automatic summarization and others. Our aim is to gain a deeper understanding of scientific writing in order to design robust and efficient methods for the intelligent exploitation of scientific corpora using semantic technology.
This research has led to studies of the argumentative structures and the composition of abstracts, citation context analysis, and applications to visualization and science maps.
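One simple way to observe how references are distributed across the IMRaD sections is to count in-text citation markers per section and normalise the counts. The Python sketch below assumes numeric markers such as [3] or [3, 7] and a paper already split into named sections. The pattern and the function name reference_distribution are illustrative assumptions, not the method used in the published studies.

```python
import re
from collections import Counter

# Assumed citation style: numeric markers like [3] or [3, 7].
# Author-year styles would need a different pattern.
CITATION_MARKER = re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")


def reference_distribution(sections):
    """Map each section name to its share of all in-text citation markers.

    A marker such as [2, 3] is counted once, so the shares are computed
    over markers, not over individual cited works.
    """
    counts = Counter()
    for name, text in sections.items():
        counts[name] = len(CITATION_MARKER.findall(text))
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in sections}
    return {name: counts[name] / total for name in sections}


if __name__ == "__main__":
    # Toy paper (invented text, for illustration only).
    paper = {
        "Introduction": "Prior work [1] and later studies [2, 3] address this.",
        "Methods": "We follow the protocol of [4].",
        "Results": "No citations appear here.",
        "Discussion": "As in [1], the effect is stable.",
    }
    print(reference_distribution(paper))
```

Aggregating such per-paper shares over a large corpus gives an empirical picture of where citations tend to occur within the IMRaD structure.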

Ph.D. Theses in Progress

I am the principal supervisor of the following PhD students at the Université de Bourgogne Franche-Comté, France:
  • Student: Youcef Ihab Morsi (since 2016). Thesis title: "Information extraction in Arabic". We design a linguistically-motivated approach to morphosyntactic and semantic processing that exploits the specific structure of Arabic. The results are to be compared with state-of-the-art parsers such as Universal Dependency parsers.
  • Student: Francois C. Rey (since 2016). Thesis title: "Categorisation of speculative sentences in scientific texts". We study the expression of speculations and uncertainty in scientific papers and design methods for the automatic extraction and classification of text segments.
  • Student: Séda Ozturk (since 2017). Thesis title: "Analysing Scientific Papers for the Extraction and Characterization of Datasets". In the context of Open Science, we design methods to extract and classify information on datasets and results obtained by using them in scientific papers.

Past projects

  • Project WEBSO+ (2016-2018, funded by Interreg France-Switzerland): development of a platform for strategic technological and competitive intelligence and e-reputation integrating semantic processing, text classification and sentiment analysis features
  • Project SARS (2015-2017, funded by Franche-Comté, France): tools for scientific writing using lexical databases and Information Retrieval
  • As Post-Doctoral Fellow at Concordia University, Montreal, Canada (2014):
    • Design and implementation of processing chains for the digitizing, OCR and semantic annotation of journal articles. Collaboration with BANQ (Bibliothèque et Archives nationales du Québec) and Érudit. Semantic analysis of journal articles with applications to social and historical studies, in collaboration with Prof. Jean-Philippe Warren.
    • Studying the impact of the IMRaD (Introduction, Methods, Results and Discussion) structure of scientific papers on the distribution of citations. Citation context analysis, in collaboration with Marc Bertin, Vincent Larivière and Yves Gingras. Applications to visualization and science maps.
  • As associate researcher at STIH, Paris-Sorbonne University, Paris, France (2012 - 2013): research in the field of semantic annotation of scientific articles, Information Retrieval, Information Extraction and Interfaces.

Editorial Work and Science Evaluation

Advisory Board

Member of the advisory board of John Benjamins' NLP book series.

Editor

Editor of the Research Topic Mining Scientific Papers: NLP-enhanced Bibliometrics in Frontiers in Research Metrics and Analytics.

Reviewer for journals

Reviewer for the following journals:

Organising committees

Member of the organising committees of the following workshops:

ISO-AFNOR Commission

Member of the ISO-AFNOR commission, work group ISO/TC 37/SC 4 "Language Resource Management".

National referent for the standard ISO 24617 "Semantic Annotation Framework - Part 6 (SemAF-Basics) and Part 8 (SemAF-DRel)".

Programme committees

Member of the programme committees of the following conferences and workshops:

  • 7th International Workshop On Mining Scientific Publications (WOSP-2018), LREC 2018
  • Computational Linguistics in Bulgaria (CLIB 2018)
  • Recent Advances in Natural Language Processing (RANLP-2017)
  • Workshop of Scholarly Web Mining (SWM) at the International Conference on Web Search and Data Mining (WSDM 2017) (SWM-2017)
  • Workshop Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL) at the Joint Conference on Digital Libraries (JCDL '16) (BIRNDL-2016)
  • Workshop series "Bibliometric-enhanced Information Retrieval" (BIR) at the European Conference on Information Retrieval (ECIR) (BIR-2016), since 2016
  • Workshop on "Natural Language Processing for Slavic Languages" ("Traitement Automatique des Langues Slaves" - TASLA), 22ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN - 2015), Caen, France, 2015
  • The International Florida Artificial Intelligence Research Society Conference (FLAIRS-28, FLAIRS-29), since 2015
  • Poster Judge at The American Association for the Advancement of Science (AAAS) 2015 Annual Meeting, San Jose, California, United States
  • The International ACM Conference on Management of computational and collective intElligence in Digital EcoSystems (MEDES) (MEDES-2014, MEDES-2015, MEDES-2016), since 2014

Publications

Datasets

Other Communications and Academic Merit

  • Invited Keynote lectures: INFuture Conference "Integrating ICT in Society", 2017, Zagreb, Croatia
  • Nomination for the annual International Society for Scientometrics and Informetrics (ISSI) Paper of the Year award: Marc Bertin, Iana Atanassova, Vincent Larivière, and Yves Gingras. The Invariant Distribution of References in Scientific Papers. JASIST, 2016
  • TDM Stories: The structure of papers, OpenMinTeD Interview

Memberships and associations

Teaching activities

Courses

Since 2014, at the Université de Franche-Comté, France

  • course "Computer Programming for Natural Language Processing 1" (lectures and tutorial classes), Master-1 "Natural Language Processing"
    Imperative programming: Python and Perl. Development of applications for Natural Language Processing.
  • course "Models for Natural Language Processing" (lectures), B.Sc. "English Language", specialty "Natural Language Processing"
  • course "Cognitive Science" (lectures and tutorial classes), Master-1 "Natural Language Processing"
  • course "Computer Programming 2" (lectures and tutorial classes), Master-2 "Natural Language Processing"
  • course "Computational Linguistics" (lectures and tutorial classes), B.Sc. "English Language", specialty "Natural Language Processing"
  • course "Formal Linguistics and Computer Programming" (tutorial classes), B.Sc. "English Language", specialty "Natural Language Processing"
  • course "Introduction to Natural Language Processing" (tutorial classes), B.Sc. "English Language", specialty "Natural Language Processing"

2011/2012, Paris-Sorbonne University (Paris-IV), Paris, France

  • course "Introduction to Informatics and Natural Language Processing" (lectures and tutorial classes), Master-1 "French Language and Applications"
  • course "Preparation for the Certificate of Informatics and Internet (C2i) - level 1" (lectures and tutorial classes)

2009/2010 and 2010/2011, Paris-Sorbonne University (Paris-IV), Paris, France

  • course "Methodology" (tutorial classes), Master-1 "French Language and Information Technologies"
  • course "Logics and Introduction to Informatics" (tutorial classes), B.Sc. "French Language and Information Technologies"
  • course "General Mathematics and Analysis - 1" (lectures and tutorial classes), B.Sc. "French Language and Information Technologies"
  • course "Preparation for the Certificate of Informatics and Internet (C2i) - level 1" (lectures and tutorial classes)

2006/2007, 2007/2008 and 2008/2009, Paris-Sorbonne University (Paris-IV), Paris, France

  • course "Informatic tools" (tutorial classes), Master-1 "Philosophy and Sociology"

Master's thesis supervision

  • Student: Léo Annebi (2018). Subject: "Information Extraction and Scientific Monitoring in the domain of Aerospace Research and Technology"
  • Student: Inès Hatira (2018). Subject: "Controlled Language for Alert Messages in Smart Cities: Application to Pedestrian Safety"
  • Student: Camélia El Cadi (2018). Subject: "Controlled Language for Alert Messages in Smart Cities: Automatic Translation of Security Messages"
  • Student: Nicolas Gutherlé (2018). Subject: "Controlled Language for Alert Messages in Smart Cities: Controlling the Phonological Ambiguity"
  • Student: Youcef Ihab Morsi (2016). Subject: "Vocabulary-based Text Complexity Assessment in Arabic"
  • Student: Marc Delhotal (2016). Subject: "Automatic categorisation of geographic location in scientific texts"
  • Student: Francois-C. Rey (2016). Subject: "Automatic generation of resources for English language learning"
  • Student: Laurie Lougrada (2016). Subject: "Grammar and spell checkers"

Curriculum Vitae

Current position

Since 09/2014
Assistant Professor (Maître de Conférences HDR)
Centre Tesnière - CRIT
Université de Franche-Comté
Université de Bourgogne Franche-Comté
Besançon, France

Past positions

02/2017
Visiting Scholar, Research Group in Computational Linguistics, University of Wolverhampton, United Kingdom

02/2014 - 09/2014
Post-doctoral Fellow, Concordia University, Montreal, Canada

03/2012 - 12/2013
Research & Development, MyScienceWork, Paris, France / Luxembourg

09/2011 - 09/2013
Invited lecturer & Associate researcher, Sens, Texte, Informatique, Histoire (STIH)
Paris-Sorbonne University (Paris-IV), Faculty of Applied Human Sciences, Paris, France
09/2009 - 09/2011
Associate researcher (A.T.E.R.), Langues, Logiques, Informatique, Cognition (LaLIC)
Paris-Sorbonne University (Paris-IV), Faculty of Applied Human Sciences, Paris, France
09/2006 - 09/2009
Allocataire de recherche, Paris-Sorbonne University (Paris-IV), Paris, France

Academic qualification

12/2015
Habilitation to Direct Research (Habilitation à Diriger des Recherches - HDR) in Natural Language Processing, University of Franche-Comté, Centre Tesnière
Title: Analysis of Scientific Discourse. Applications in Information Retrieval, Information Extraction and Semantic Web
Supervisor: Prof. Sylviane Cardey-Greenfield
09/2006 - 01/2012
Ph.D. thesis in Mathematics, Informatics and Applications in Human Sciences, Paris-Sorbonne University (Paris-IV), Faculty of Applied Human Sciences, Paris, France
Thesis title: Information Retrieval and Text Navigation through the Exploitation of the Automatic Semantic Annotation of the Excom Engine
Supervisor: Prof. Jean-Pierre Desclés
09/2005 - 06/2006
M.Sc. "Information and Communication", Paris-Sorbonne University (Paris-IV), Faculty of Applied Human Sciences, Paris, France
Thesis title: Automatic Semantic Annotation in Bulgarian and in French. Text syntheses.
Supervisor: Prof. Jean-Pierre Desclés

09/2004 - 06/2005
Master "Computational Linguistics", Sofia University, Faculty of Classical and Modern Philologies, Sofia, Bulgaria
09/2002 - 06/2005
"English Studies", Sofia University, Faculty of Classical and Modern Philologies, Sofia, Bulgaria (Completed the first 3 years out of 4 of the B.Sc. program)
09/2000 - 06/2004
B.Sc. in Mathematics, Sofia University, Faculty of Mathematics and Informatics, Sofia, Bulgaria

Toolbox

Some of the tools that I use to develop semantic processing applications:
Solr Search Server, D2RQ, Circos visualization tool, Shiny server, NLTK, CoreNLP