Natural Language Processing (AING426)

Text preprocessing, including text cleaning and tokenization, removal of special characters, case conversion, spelling correction, removal of stop words, stemming, and lemmatization. Part-of-speech tagging, constituency parsing, and dependency parsing. The Bag-of-Words model, n-grams, term frequency, and TF-IDF document representation. Document similarity. Document classification using the TF-IDF representation. Word embeddings; Word2Vec representations using the continuous bag-of-words (CBOW) and skip-gram models; the GloVe model for word embeddings. Text summarization and topic models. Introduction to transformer neural networks: input embeddings, positional encoding, and multi-head attention. Pretrained language models and transfer learning from pretrained models. Using pretrained tokenizers to convert tokens to index numbers, selecting a batch size, and creating iterators. Loading a pretrained model and fine-tuning its parameters. Illustrative sketches of several of these topics follow.
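The preprocessing pipeline described above can be sketched in a few lines of Python. The example below uses NLTK, which is an assumed library choice (the description does not name a toolkit), and a made-up input sentence; it covers case conversion, special-character removal, tokenization, stop-word removal, stemming, and lemmatization.

```python
# Minimal preprocessing sketch using NLTK (assumed toolkit; example sentence is illustrative).
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

def preprocess(text):
    text = text.lower()                              # case conversion
    text = re.sub(r"[^a-z\s]", " ", text)            # remove special characters
    tokens = nltk.word_tokenize(text)                # tokenization
    stop = set(stopwords.words("english"))
    tokens = [t for t in tokens if t not in stop]    # stop-word removal
    stems = [PorterStemmer().stem(t) for t in tokens]
    lemmas = [WordNetLemmatizer().lemmatize(t) for t in tokens]
    return stems, lemmas

print(preprocess("The cats were running quickly through the gardens!"))
```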
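The TF-IDF representation, document similarity, and TF-IDF-based classification topics can be illustrated with scikit-learn, again an assumed library choice; the corpus, labels, and classifier (logistic regression) below are toy placeholders.

```python
# Sketch of TF-IDF representation, document similarity, and classification (assumed scikit-learn stack).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.linear_model import LogisticRegression

docs = ["the cat sat on the mat",
        "dogs chase cats in the yard",
        "stock prices rose sharply today",
        "markets fell after the earnings report"]
labels = ["animals", "animals", "finance", "finance"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))     # unigram and bigram features
X = vectorizer.fit_transform(docs)                    # TF-IDF document-term matrix

print(cosine_similarity(X[0], X[1]))                  # cosine similarity between two documents

clf = LogisticRegression().fit(X, labels)             # classifier on TF-IDF features
print(clf.predict(vectorizer.transform(["the dog sat near the cat"])))
```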
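The CBOW and skip-gram variants of Word2Vec can be sketched with gensim, an assumed library choice; the tiny tokenized corpus and hyperparameters are illustrative only.

```python
# Sketch of Word2Vec training in CBOW and skip-gram modes (assumed gensim library, toy corpus).
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "chased", "the", "cat"],
             ["dogs", "and", "cats", "are", "pets"]]

# sg=0 trains the continuous bag-of-words model; sg=1 trains the skip-gram model
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["cat"][:5])                         # first few dimensions of a word vector
print(skipgram.wv.most_similar("cat", topn=2))    # nearest neighbours in embedding space
```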
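The final topics (pretrained tokenizers, batch size and iterators, and fine-tuning a pretrained model) fit naturally into a short PyTorch training loop. The sketch below assumes the Hugging Face transformers library; the model name, batch size, learning rate, and sentiment-style data are all placeholder assumptions, not prescribed by the course.

```python
# Minimal fine-tuning sketch (assumed Hugging Face transformers + PyTorch; data and model name are placeholders).
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["great movie", "terrible plot", "loved it", "not worth watching"]
labels = torch.tensor([1, 0, 1, 0])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")   # tokens -> index numbers

dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], labels)
loader = DataLoader(dataset, batch_size=2, shuffle=True)                     # batch size and iterator

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(2):                                                       # a couple of passes for illustration
    for input_ids, attention_mask, batch_labels in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=batch_labels)
        out.loss.backward()                                                  # fine-tune all model parameters
        optimizer.step()
```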
