Going Neurotic With Neural Word Embeddings… again!

18 July 2017

Luís Sarmento, Tonic App

Word embeddings, such as Word2Vec or GloVe, are vector representations that capture lexical-semantic properties of words. They offer a practical way of transferring knowledge between two machine learning models, and they greatly reduce the learning time required to solve various NLP tasks. There is great practical interest in experimenting with different word embedding models. Neural-based models, due to their flexibility, are a great framework for that experimentation. However, that very same flexibility also brings many degrees of freedom to the experimentation, which becomes a challenge in itself. In this talk, we will present Syntagma, a Python toolkit (still under development) that enables rapid experimentation with neural word embedding models. We will present preliminary results of experimenting with some of the hyper-parameters of a baseline word embedding model (similar to Word2Vec), and we will discuss the next steps for Syntagma.
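
As an illustration of the kind of hyper-parameter the abstract refers to, here is a minimal sketch of how the context window size changes the training data seen by a Word2Vec-style skip-gram model. It is not Syntagma code and does not reflect its API; the function name skipgram_pairs, the window parameter, and the example sentence are all assumptions made for this example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for a symmetric context window.

    Hypothetical helper for illustration only; not part of Syntagma.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Clip the window at the sentence boundaries.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is never its own context
                pairs.append((center, tokens[j]))
    return pairs


# Made-up example sentence, already tokenized by whitespace.
sentence = "word embeddings capture lexical semantic properties".split()

narrow = skipgram_pairs(sentence, window=1)
wide = skipgram_pairs(sentence, window=2)

# A wider window yields more, and more distant, training pairs.
print(len(narrow), len(wide))
print(wide[:4])
```

Even in this toy setting, a single hyper-parameter changes both the amount and the nature of the training signal, which is one reason the space of possible experiments grows so quickly.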



Luís Sarmento holds a PhD in Computer Science from the University of Porto (2010), with a background in Electrical Engineering (BSc+MSc) and Artificial Intelligence (MSc). He has been working in the fields of Natural Language / Search for about 15 years, both as a member of research groups at the University of Porto and in industry. In 2010 he joined Portugal Telecom / SAPO as tech lead for Big-Data and Recommender Systems, and in 2012 he joined Amazon where, until early 2017, he led research teams in the fields of Query Understanding and Voice Shopping. He is now CTO of Tonic App (http://www.tonicapp.com/), a startup developing productivity tools for medical doctors.