ABOUT ME
Professional info
I graduated from the Language and Communication Technologies Master's Programme (EMLCT) with the distinction 'Cum Laude'. I am currently based in Nice, France, where I worked as a Research Engineer at the Inria Research Center. We worked on detecting cyberbullying events in social media datasets for different languages. I showed how emotion and sentiment analysis contribute to the detection of cyberbullying instances.
As a Computational Linguist working in Python, I am interested in Natural Language Processing and Machine Learning. I am always open to learning new technical concepts, and I love working on languages.
- In my master's thesis, I worked on sentiment classification.
- I am familiar with ML and NLP libraries such as scikit-learn, Keras, and NLTK.
- As a motivated and hardworking person and a good team player, I would make valuable contributions wherever I work.
- I gained project management experience at Google, where I worked as a Speech Linguistics Project Manager.
Work experience
Independent Computational Linguist and Researcher in France and The Netherlands (Remote)
June 2019 - Present
I work as an Independent Computational Linguist and Researcher based in Europe (France, Netherlands). I provide freelance computational linguistics and language consultancy services to various industrial and research collaborators (e.g. Lionbridge, Unbabel, DefinedCrowd, Babelscape srl).
Some projects I am involved in:
--> providing services to create high-quality textual data.
--> contributing to Turkish WordNet.
--> annotating offensive words and internet slang.
--> quality check services.
--> translation quality evaluations.
--> providing high-quality speech data (e.g. Alexa).
--> native speaker expertise on the provided speech data.
--> building text classification models by implementing machine/deep learning algorithms.
--> contributing to shared tasks (SemEval 2020 - Task 12: OffensEval 2020, Identification of Turkish Offensive Language on social media datasets).
Research Engineer at Inria Research Center, Sophia Antipolis, France
March 2018 - June 2019
Worked on:
--> Cyberbullying Detection
--> Abusive Language Detection
--> Sentiment Analysis
--> Emotion Analysis
Keywords: Machine Learning, Natural Language Processing, Text Classification, Bullying Detection, Emotion and Sentiment Analysis
Speech Linguistics Project Manager (Turkish) at Google, Berlin, Germany (via Adecco)
April 2014 - June 2015
- I worked in the team that built the Turkish voice for OK Google, in collaboration with engineers and researchers in the fields of speech technologies, Natural Language Processing and Machine Learning.
- I was based in the Google Berlin office. I regularly collaborated with colleagues in the New York, Mountain View, London, Istanbul, Norway, Denmark, Sweden and Dublin Google offices. I was responsible for recruiting my teammates for the Berlin office and a professional voice actress to work in Turkey. I was also responsible for quality checks on the work carried out by vendor companies.
More about responsibilities:
- Working on a number of speech research projects (i.e. ASR, TTS)
- Managing a team of Data Evaluators
- Overseeing and managing all work related to achieving high data quality for speech projects in Turkish
- Training, managing and overseeing the work of my team
- Creating verbalisation rules, such as expanding URLs, email addresses and numbers
- Creating annotation conventions through RegExp scripting
- Evaluating data quality
- Providing expertise on pronunciation and phonotactics
- Working with QA tools according to given guidelines and using in-house tools
Linguist (Telecommute) at Appen Language Solutions
December 2012 - May 2013
Phonemic Transcription in Turkish.
English Language Teacher in Multiple Schools
2007 - 2012
Awards
Languages
- Turkish (Native language)
- English (Full professional proficiency)
- Dutch (Flemish, Intermediate level)
- German (Elementary level)
- French (Elementary level)
Key Skills
- Python
- Natural Language Processing
- Project Management
- Language and Communication Technologies
Education
University of Groningen, The Netherlands (Cum Laude)
Erasmus Mundus Language and Communication Technologies (EMLCT) Master's Programme (2nd year)
2016 - 2017
Master's thesis title: "Using Linguistic Features and External Lexical Resources for Improving Sentence-level Sentiment Classification"
Courses taken:
- Statistics (e.g. SPSS)
- Corpus Linguistics (in Python)
- Semantic Web Technology (e.g. Protégé, ontology building, SPARQL)
- Linguistic Theory (Syntax and Semantics)
- Learning from Data (Machine Learning in Python; sklearn, keras, gensim)
University of the Basque Country, Spain
Erasmus Mundus Language and Communication Technologies Master's Programme (1st year)
2015 - 2016
Courses taken:
- Programming (e.g. Python)
- Computational Linguistics, Morphology, and Syntax (e.g. FOMA, NLTK)
- Artificial Intelligence (e.g. Protégé, CLIPS)
- Statistics, Machine Learning (e.g. WEKA) and Corpus Linguistics
- Advanced Semantics
- Computational Logic, Discourse, and Pragmatics (e.g. E-Theorem Prover, Prolog)
- Applications on Language Technologies (i.e. Statistical Machine Translation, Rule Based Translation, Information Retrieval and Extraction, NLP and Education)
- Speech Processing (i.e. Signal Processing, Text to Speech, Automatic Speech Recognition)
Publications
- Corazza, M., Menini, S., Arslan, P., Sprugnoli, R., Cabrio, E., Tonelli, S., & Villata, S. (2018, September). InriaFBK at GermEval 2018: Identifying Offensive Tweets Using Recurrent Neural Networks. In GermEval 2018 Workshop.
- Corazza, M., Menini, S., Arslan, P., Sprugnoli, R., Cabrio, E., Tonelli, S., & Villata, S. (2018, December). Comparing Different Supervised Approaches to Hate Speech Detection. In EVALITA 2018.
- Arslan, P., Corazza, M., Cabrio, E., & Villata, S. (2019, April). Overwhelmed by Negative Emotions? Maybe You Are Being Cyber-bullied! In The 34th ACM/SIGAPP Symposium on Applied Computing (SAC ’19), April 8–12, 2019, Limassol, Cyprus. ACM, New York, NY, USA, Article 4, 3 pages. https://doi.org/10.1145/3297280.3297573