TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


A complete guide to transfer learning from English to other languages using Sentence Embedding BERT models

This story aims to give machine learning practitioners a detailed look at Sentence Embedding BERT models.

8 min read · May 21, 2020


The idea of sentence embedding using Siamese BERT-Networks 📋

Neural vector representations of words have become ubiquitous across all subfields of Natural Language Processing (NLP) and are familiar to most practitioners [1]. Among these techniques, sentence embedding holds especially broad application potential: it attempts to encode a sentence or short paragraph into a fixed-length vector in a dense vector space, and the resulting vectors are then evaluated by how well their cosine similarities mirror human judgments of semantic relatedness [2].
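The cosine-similarity evaluation described above can be sketched in a few lines. The vectors below are toy stand-ins for real sentence embeddings (a hypothetical 4-dimensional space, far smaller than a BERT model's 768 dimensions), invented purely for illustration:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy stand-ins for sentence embeddings (real SBERT vectors are much larger).
emb_a = np.array([0.8, 0.1, 0.3, 0.5])    # e.g. "A man is playing guitar."
emb_b = np.array([0.7, 0.2, 0.4, 0.5])    # e.g. "Someone plays a guitar."
emb_c = np.array([-0.5, 0.9, -0.2, 0.1])  # e.g. "The stock market fell today."

print(cosine_similarity(emb_a, emb_b))  # near 1: semantically related pair
print(cosine_similarity(emb_a, emb_c))  # much lower: unrelated pair
```

Because the score depends only on the angle between vectors, it is insensitive to embedding magnitude, which is one reason cosine similarity is the standard comparison for sentence embeddings.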

Sentence embeddings have wide applications in NLP, such as information retrieval, clustering, automatic essay scoring, and semantic textual similarity. To date, several popular methodologies exist for generating fixed-length sentence embeddings: Continuous Bag of Words (CBOW), sequence-to-sequence models (seq2seq), and Bidirectional Encoder Representations from Transformers (BERT).

  • CBOW [3] is an extremely naive…
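As a concrete reference point, a CBOW-style baseline in its simplest form just averages the word vectors of a sentence into one fixed-length vector. The tiny word-vector table below is invented for illustration (real pre-trained vectors such as GloVe have hundreds of dimensions):

```python
import numpy as np

# Hypothetical pre-trained word vectors, 3 dimensions for readability.
word_vectors = {
    "the": np.array([0.1, 0.0, 0.2]),
    "cat": np.array([0.9, 0.3, 0.1]),
    "sat": np.array([0.2, 0.8, 0.4]),
}

def cbow_sentence_embedding(sentence: str) -> np.ndarray:
    """Average the vectors of all known words: one fixed-length embedding."""
    vectors = [word_vectors[w] for w in sentence.lower().split()
               if w in word_vectors]
    return np.mean(vectors, axis=0)

emb = cbow_sentence_embedding("The cat sat")
print(emb)  # a single 3-dim vector, regardless of sentence length
```

Averaging discards word order entirely, which is exactly why this baseline is considered naive compared with seq2seq or BERT-based encoders.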




Written by Chien Vu

PhD, Researcher | I love to connect and share | LinkedIn Top Voice https://www.linkedin.com/in/vumichien/
