Word Embeddings

Introduction

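Word embeddings represent words as dense, real-valued vectors. Words with similar meaning or usage end up close together, so distances and directions between vectors capture semantic relationships.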
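
To make "semantic relationships" concrete, here is a minimal sketch of cosine similarity, the measure most commonly used to compare embeddings. The three-dimensional vectors are hypothetical and hand-picked for illustration; real embeddings are learned from text and typically have 50 to 300 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumption: values chosen by hand, not learned)
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])
print(f"cat~dog: {cat_dog:.3f}, cat~car: {cat_car:.3f}")
```

With these toy values, "cat" scores much closer to "dog" than to "car", which is the behaviour a trained embedding is expected to show for related words.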

Using Pre-trained Embeddings

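import gensim.downloader as api

# Download pre-trained GloVe vectors (fetched on first use, then cached locally)
model = api.load("glove-wiki-gigaword-100")

# Find similar words: returns a list of (word, cosine similarity) pairs
print(model.most_similar("king"))          # top 10 by default
print(model.most_similar("cat", topn=5))

# Word analogies: king - man + woman ≈ ?
print(model.most_similar(positive=["king", "woman"], negative=["man"])[0])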
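
The analogy query can be sketched by hand as vector arithmetic: add the "positive" vectors, subtract the "negative" ones, and return the nearest remaining word by cosine similarity. This is a simplified illustration on hypothetical two-dimensional vectors. The helper name `analogy` is made up for this sketch, and gensim's own implementation may differ in details such as vector normalization.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical toy vectors: axis 0 loosely "royalty", axis 1 loosely "gender"
toy = {
    "king":   [0.9, 0.9],
    "queen":  [0.85, -0.8],
    "man":    [0.1, 0.9],
    "woman":  [0.1, -0.9],
    "prince": [0.7, 0.8],
    "apple":  [-0.5, 0.1],
}

def analogy(positive, negative, vectors):
    """Return the word nearest to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    # Exclude the query words themselves from the candidates
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))

print(analogy(["king", "woman"], ["man"], toy))
```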

Word2Vec

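from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens (far too small for meaningful vectors)
sentences = [["cat", "sat", "on", "mat"], ["dog", "ran", "fast"]]
# vector_size: embedding dimension; window: context size; min_count=1 keeps every word
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1)

vector = model.wv["cat"]                  # NumPy array of shape (100,)
similar = model.wv.most_similar("cat")    # list of (word, similarity) pairs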
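
To illustrate what the `window` parameter controls, the sketch below lists the (center word, context word) pairs that a context window produces for one sentence. It is a simplified illustration in plain Python, not gensim's training loop. The function name `context_pairs` is hypothetical, and refinements such as negative sampling and subsampling of frequent words are left out.

```python
def context_pairs(sentence, window):
    """Pair each word with its neighbours at most `window` positions away."""
    pairs = []
    for i, center in enumerate(sentence):
        start = max(0, i - window)
        end = min(len(sentence), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, sentence[j]))
    return pairs

sentence = ["cat", "sat", "on", "mat"]
pairs = context_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A larger window pairs each word with more distant neighbours. With `window=5`, every word in this four-word sentence is paired with every other word.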

Embedding Layer in Keras

from tensorflow.keras import layers

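# Maps integer token IDs in [0, 10000) to trainable 128-dimensional vectors.
# input_length is deprecated in Keras 3; the sequence length comes from the input.
embedding = layers.Embedding(input_dim=10000, output_dim=128)
# Input: (batch_size, 100) integer token IDs
# Output: (batch_size, 100, 128)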
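
Conceptually, an embedding layer is a lookup table with one row per token ID. The plain-Python sketch below mimics that lookup to show where the extra output dimension comes from. The sizes are small stand-ins for 10000 and 128, the names `table` and `embed` are hypothetical, and the random values take the place of weights that a real layer would learn during training.

```python
import random

vocab_size, embed_dim = 10, 4  # small stand-ins for input_dim and output_dim

# One row of embed_dim values per token ID (assumption: random, untrained)
rng = random.Random(0)
table = [[rng.uniform(-0.05, 0.05) for _ in range(embed_dim)]
         for _ in range(vocab_size)]

def embed(batch):
    """Replace every token ID with its row from the table."""
    return [[table[token_id] for token_id in sequence] for sequence in batch]

batch = [[1, 5, 9], [0, 0, 3]]  # shape (2, 3): 2 sequences of 3 token IDs
output = embed(batch)
shape = (len(output), len(output[0]), len(output[0][0]))
print(shape)  # (2, 3, 4)
```

The same token ID always maps to the same row, which is why repeated words in a sequence receive identical vectors.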

Practice Problems

  1. Use pre-trained word vectors
  2. Train a Word2Vec model
  3. Explore word relationships
  4. Use embeddings in neural networks
  5. Visualize word vectors
