Neri Van Otten

MinHash — How To Deal With Finding Similarity At Scale With Python Code To Get Started

What is MinHash? MinHash is a technique for estimating the similarity between two sets. It was first introduced in information…

3 years ago

SimHash — The Ultimate Guide And How To Get Started Guide In Python

What is SimHash? SimHash is a technique for generating a fixed-length "fingerprint" or "hash" of a variable-length input, such as…

3 years ago

How To Implement Fuzzy String Matching [4 Ways In Python]

This article discusses one of the most valuable tools when analysing textual data in natural language processing — fuzzy string…

3 years ago

How To Implement Abstractive Text Summarization In Python [2 Ways]

Abstractive text summarization is a valuable tool in Python when working with large documents, or when you want to quickly summarize…

3 years ago

Top 15 Most Popular Machine Learning And Deep Learning Algorithms For NLP

This list covers the top 7 machine learning algorithms and 8 deep learning algorithms used for NLP. If you are…

3 years ago

Reinforcement Learning In NLP Made Simple & 5 Relevant Tools To Get Started

This article covers reinforcement learning and its application in natural language processing (NLP). It also covers the latest developments in…

3 years ago

Top 14 Steps To Build A Complete NLTK Preprocessing Pipeline In Python

This is a complete guide on utilising NLTK to build a whole preprocessing pipeline. Take the time to read through…

3 years ago

How To Implement Bag-Of-Words In Python [2 Ways: scikit-learn & NLTK]

In this guide, we cover how to start with the bag-of-words technique in Python. We first cover what a bag-of-words…

3 years ago

Text Classification: How To In Python [Best 2 Ways Machine Learning & Deep Learning]

Text classification is an important natural language processing (NLP) technique that allows us to turn unstructured data into structured data;…

3 years ago

Top 7 Ways To Implement Document & Text Similarity In Python: NLTK, Scikit-learn, BERT, RoBERTa, FastText and PyTorch

Text similarity is a useful natural language processing (NLP) tool. It allows you to find similar pieces of text…

3 years ago