Encoder, decoder and encoder-decoder transformers are a type of neural network currently at the bleeding edge in NLP. This article…
What is MinHash? MinHash is a technique for estimating the similarity between two sets. It was first introduced in information…
What is SimHash? SimHash is a technique for generating a fixed-length "fingerprint" or "hash" of a variable-length input, such as…
This article discusses one of the most valuable tools when analysing textual data in natural language processing — fuzzy string…
Abstractive text summarization is a valuable tool in Python when working with large documents, or when you quickly want to summarize…
This is a complete guide on utilising NLTK to build a whole preprocessing pipeline. Take the time to read through…
In this guide, we cover how to start with the bag-of-words technique in Python. We first cover what a bag-of-words…
Text classification is an important natural language processing (NLP) technique that allows us to turn unstructured data into structured data;…
Text similarity is a useful natural language processing (NLP) tool. It allows you to find similar pieces of text…
What is text generation in NLP? Text generation is a subfield of natural language processing (NLP) that deals with generating…