How To Implement Data Augmentation In Python [Image & Text (NLP)]

by Neri Van Otten | Mar 9, 2023 | Artificial Intelligence, Machine Learning, Natural Language Processing

The top 7 ways of implementing data augmentation for both images and text, with the top 3 Python libraries for image processing and NLP.

What is data augmentation?

Data augmentation is a technique used in machine learning and computer vision to increase the size of a dataset by creating new variations of existing data. It aims to make machine learning models better at generalising by giving them more training data with different versions of the same information.

Data augmentation techniques can include a range of transformations such as flipping, rotation, scaling, cropping, adding noise, changing brightness or contrast, and more. By applying these transformations to the original data, the resulting augmented data can provide the model with additional examples to learn from and can also help to reduce overfitting, as the model is exposed to a broader range of data.

Some typical applications of augmentation include image classification, object detection, and natural language processing, among others. In addition, augmentation can be performed on-the-fly during training or offline before training by generating and storing the augmented data in a separate dataset.
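The difference between the two modes can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production pipeline: `augment` is a hypothetical stand-in for any real transformation, and the "dataset" is just a list of numbers.

```python
import random

def augment(sample, rng):
    """Hypothetical augmentation: add a small random jitter to a number."""
    return sample + rng.uniform(-0.1, 0.1)

def offline_augment(dataset, copies, rng):
    """Offline: build and store the augmented dataset once, before training."""
    augmented = list(dataset)
    for sample in dataset:
        augmented.extend(augment(sample, rng) for _ in range(copies))
    return augmented

def on_the_fly_batches(dataset, batch_size, rng):
    """On-the-fly: yield freshly augmented batches every time the data is iterated."""
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]
        yield [augment(sample, rng) for sample in batch]

rng = random.Random(0)
data = [1.0, 2.0, 3.0, 4.0]

# Offline: 4 originals plus 2 stored variants of each
stored = offline_augment(data, copies=2, rng=rng)

# On-the-fly: each epoch sees different random variants of the same samples
epoch_1 = list(on_the_fly_batches(data, batch_size=2, rng=rng))
epoch_2 = list(on_the_fly_batches(data, batch_size=2, rng=rng))
```

Offline augmentation costs storage but no extra time during training, while on-the-fly augmentation stores nothing and gives the model new variants in every epoch.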

Data augmentation in deep learning

Data augmentation is a widespread technique in deep learning for generating new training data from existing data. It is used to increase the amount of training data and introduce variation into it, improving the robustness and generalisation of deep learning models.

Here are some standard augmentation techniques used in deep learning:

  1. Image augmentation: This involves manipulating images to create variations in the training data, such as rotating, flipping, cropping, scaling, or adding noise.
  2. Text augmentation: This involves manipulating text to create variations in the training data, such as replacing words with synonyms, adding or removing words, or changing the word order.
  3. Audio augmentation: This involves manipulating audio data to create variations in the training data, such as adding noise, changing pitch or tempo, or adding echoes.
  4. Video augmentation: This involves manipulating video data to create variations in the training data, such as flipping, rotating, or cropping frames or changing the playback speed.

Deep learning frameworks such as TensorFlow and PyTorch include built-in modules for data augmentation. For example, in TensorFlow, the Keras ImageDataGenerator class can be used for image augmentation, while in PyTorch, the torchvision.transforms module provides various image transformation functions.

It is important to note that augmentation can increase the training time and computational cost of deep learning models, and some techniques may not be suitable for all types of data. Therefore, it is crucial to carefully evaluate and choose the appropriate augmentation techniques based on the specific problem and dataset.

Top 7 data augmentation techniques for images

Data augmentation techniques involve applying transformations or modifications to the existing dataset to generate additional training samples. Here are some standard augmentation techniques used in image processing:

  1. Rotation: Rotating images by a certain angle can help train models to detect and recognise objects from different angles.
  2. Flipping: Flipping images horizontally or vertically can increase the dataset’s size and help the model learn symmetry and invariance.
  3. Cropping and Resizing: Cropping images to different sizes or resolutions can create new samples that contain only a portion of the original image, which can be helpful for object detection tasks.
  4. Adding Noise: Adding random noise to images can reduce overfitting and improve the robustness of the model.
  5. Brightness and Contrast Adjustment: Changing the brightness or contrast of images can create new samples with different lighting conditions, which can help the model to generalise better.
  6. Zooming: Zooming in or out on images can create new samples with different levels of detail, which can help the model to recognise objects at different scales.
  7. Translation: Translating images by a certain amount in any direction can create new samples with different positions of the objects in the picture.
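Several of the transformations above can be sketched directly with NumPy. This is a minimal illustration, not a replacement for a dedicated library: the input is a random array standing in for a real photo, all function names are hypothetical, and rotation is limited to multiples of 90 degrees because arbitrary angles and zooming require interpolation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed input: a random 64x64 RGB image standing in for a real photo
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def rotate_90(img, k=1):
    """Rotate by k * 90 degrees counter-clockwise."""
    return np.rot90(img, k=k, axes=(0, 1))

def random_crop(img, size, rng):
    """Cut a random size x size window out of the image."""
    top = rng.integers(0, img.shape[0] - size + 1)
    left = rng.integers(0, img.shape[1] - size + 1)
    return img[top:top + size, left:left + size]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the valid pixel range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Scale pixel values around mid-grey (contrast), then shift them (brightness)."""
    out = (img.astype(np.float32) - 127.5) * contrast + 127.5 + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

def translate(img, dx, dy):
    """Shift the image by (dx, dy) pixels, filling the exposed border with zeros."""
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0], max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

flipped = flip_horizontal(image)
rotated = rotate_90(image)
cropped = random_crop(image, 48, rng)
noisy = add_gaussian_noise(image, sigma=10.0, rng=rng)
brighter = adjust_brightness_contrast(image, brightness=30.0, contrast=1.2)
shifted = translate(image, dx=5, dy=-3)
```

Each function returns a new array of the same dtype, so the operations can be chained in any order to build more varied samples.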

Applying filters, such as blurring or sharpening, can also be used for augmentation.
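As one example of a filter, a simple box blur can be written with NumPy alone. This is a hedged sketch under a few assumptions: the function name `box_blur` is hypothetical, the input is a 2-D greyscale array, and borders are handled by repeating the edge pixels.

```python
import numpy as np

def box_blur(img, radius=1):
    """Blur a 2-D greyscale image by averaging each pixel's square neighbourhood."""
    size = 2 * radius + 1
    # Repeat edge pixels so the output keeps the input shape
    padded = np.pad(img.astype(np.float32), radius, mode="edge")
    total = np.zeros(img.shape, dtype=np.float32)
    # Sum the size x size shifted views of the padded image
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return np.clip(np.rint(total / size**2), 0, 255).astype(np.uint8)

# A single bright pixel is spread evenly over its 3x3 neighbourhood
spot = np.zeros((5, 5), dtype=np.uint8)
spot[2, 2] = 90
blurred = box_blur(spot, radius=1)
```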

Top 7 data augmentation techniques for NLP

Data augmentation techniques can also be applied to text data, particularly in natural language processing (NLP). Here are some standard augmentation techniques used in NLP:

  1. Synonym Replacement: Replacing words in a sentence with their synonyms can create new samples that retain the same meaning but use different words.
  2. Back Translation: Translating a sentence from one language to another and back to the original language can create new samples with different phrasing and word choices.
  3. Random Deletion: Randomly deleting words from a sentence can create new samples with different sentence structures and help the model learn to deal with missing information.
  4. Random Swap: Randomly swapping two words in a sentence can create new samples with different sentence structures and word orders.
  5. Random Insertion: Randomly inserting words into a sentence can create new samples with additional information and help the model learn to handle noisy input.
  6. Changing Tenses: Changing the tense of verbs in a sentence can create new samples with different temporal contexts.
  7. Masking: Replacing words in a sentence with a unique token can create new samples with missing information, which can help the model learn to fill in gaps in the input.
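The word-level operations in this list can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the `SYNONYMS` table is a toy stand-in for a real resource such as WordNet, all function names are hypothetical, and sentences are tokenised by a plain whitespace split. Back translation and tense changes need a translation model or a grammar-aware tool, so they are left out.

```python
import random

# Toy synonym table; a real pipeline would typically use WordNet or embeddings
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}

def synonym_replacement(words, rng):
    """Replace every word that has an entry in SYNONYMS with a random synonym."""
    return [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]

def random_deletion(words, p, rng):
    """Drop each word with probability p, but never return an empty sentence."""
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

def random_swap(words, rng):
    """Swap the words at two randomly chosen positions."""
    if len(words) < 2:
        return list(words)
    i, j = rng.sample(range(len(words)), 2)
    swapped = list(words)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped

def random_insertion(words, vocabulary, rng):
    """Insert one random word from vocabulary at a random position."""
    inserted = list(words)
    inserted.insert(rng.randint(0, len(words)), rng.choice(vocabulary))
    return inserted

def random_mask(words, p, rng, mask_token="[MASK]"):
    """Replace each word with mask_token with probability p."""
    return [mask_token if rng.random() < p else w for w in words]

rng = random.Random(42)
sentence = "the quick brown fox is happy".split()

replaced = synonym_replacement(sentence, rng)
deleted = random_deletion(sentence, 0.2, rng)
swapped = random_swap(sentence, rng)
inserted = random_insertion(sentence, ["very", "really"], rng)
masked = random_mask(sentence, 0.15, rng)
```

Passing an explicit `random.Random` instance keeps the augmentations reproducible, which makes it easier to inspect what the model was actually trained on.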

Top 3 data augmentation libraries in Python for images

Several Python libraries provide augmentation functionality for machine learning tasks. Here are some of the most popular ones for image processing:

1. Keras ImageDataGenerator

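This class provides an easy-to-use interface for image augmentation in Keras, a popular deep learning framework. It supports a variety of augmentation techniques such as rotation, shearing, zooming, and flipping. Note that recent TensorFlow releases mark ImageDataGenerator as deprecated in favour of Keras preprocessing layers, so check the documentation for the version you use.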

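# Import path assumes TensorFlow's bundled Keras (tf.keras)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,        # rotate by up to 20 degrees in either direction
    width_shift_range=0.1,    # shift horizontally by up to 10% of the width
    height_shift_range=0.1,   # shift vertically by up to 10% of the height
    shear_range=0.2,          # shear intensity
    zoom_range=0.2,           # zoom in or out by up to 20%
    horizontal_flip=True,     # randomly mirror images left to right
    fill_mode='nearest'       # fill newly exposed pixels with the nearest value
)

# datagen.flow(x_train, y_train, batch_size=32) then yields augmented batches
# (x_train and y_train stand for your own image and label arrays)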

2. Albumentations

This powerful and flexible library for image augmentation supports a wide range of transformations. It can be used with various deep learning frameworks, such as PyTorch and TensorFlow. It runs on the CPU, building on NumPy and OpenCV, and is optimised for speed.

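import albumentations as A

transform = A.Compose([
    A.Rotate(limit=20),                   # rotate by up to 20 degrees (applied with probability 0.5)
    A.RandomCrop(width=256, height=256),  # input must be at least 256x256 pixels
    A.HorizontalFlip(),                   # mirror left to right with probability 0.5
    A.RandomBrightnessContrast(),         # randomly vary brightness and contrast
    A.Normalize()                         # scale pixels, by default with ImageNet mean and std
])

# Apply to a NumPy image array of shape (H, W, C):
# augmented = transform(image=image)["image"]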

3. imgaug

This library provides a flexible and extensible platform for image augmentation with support for various transformations. It also supports batching and multiprocessing for the efficient processing of large datasets.

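import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Rotate((0, 45)),                            # rotate by 0 to 45 degrees
    iaa.Flipud(0.5),                                # flip vertically half of the time (default p differs between versions)
    iaa.GaussianBlur(sigma=(0, 3.0)),               # blur with a random sigma up to 3.0
    iaa.AdditiveGaussianNoise(scale=(0, 0.1*255)),  # add per-pixel Gaussian noise
    iaa.Crop(px=(0, 16))                            # crop 0 to 16 pixels from each side
])

# Apply to a batch of uint8 images: images_aug = seq(images=images)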

These libraries provide a variety of augmentation techniques that can be combined and customised to suit the specific needs of your machine learning task.

Top 3 data augmentation libraries in Python for text

Data augmentation for text can help improve the performance of natural language processing (NLP) models by increasing the diversity and quantity of training data. Here are some popular Python libraries for text augmentation:

1. NLPAug

This library provides various text augmentation techniques, including substitution based on word embeddings, back-translation, and contextual word embeddings. The augmented text can be used for NLP tasks such as text classification, sentiment analysis, and machine translation.

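import nlpaug.augmenter.word as naw

# Requires the transformers and torch packages; the BERT model is downloaded on first use
aug = naw.ContextualWordEmbsAug(
    model_path='bert-base-uncased',
    action="substitute"  # replace words using BERT's contextual predictions
)
# Recent nlpaug versions return a list of augmented strings
augmented_text = aug.augment("This is a sentence.")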

2. TextBlob

TextBlob is a general-purpose text processing library rather than a dedicated augmentation tool, but its simple, easy-to-use WordNet interface can be used to look up synonyms for synonym replacement.

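from textblob import Word

word = Word("happy")
# Gather lemma names from every WordNet synset, skipping the word itself
# (the first lemma of the first synset is often just the original word)
synonyms = sorted({name for s in word.synsets for name in s.lemma_names() if name != word})
augmented_word = synonyms[0] if synonyms else word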

3. AugLy

This library provides a range of augmentation techniques for various data types, including text. Its text module supports techniques such as simulated typos, punctuation insertion, and replacement of characters with similar-looking ones.

import augly.text as textaugs

# simulate_typos introduces misspellings and keyboard-style typing errors
augmented_text = textaugs.simulate_typos("This is a sentence.")

These libraries provide a variety of text augmentation techniques that can be used to generate additional training data for NLP tasks, improving the model’s ability to generalise and perform well on unseen data. Note that text augmentation can sometimes produce semantically incorrect or nonsensical text, so it is important to carefully evaluate the generated data before using it for training.
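One rough way to screen generated text automatically is to compare it with the original sentence and discard candidates that are either identical or have drifted too far. The sketch below uses token overlap (Jaccard similarity) as the measure. It rests on assumptions: the function names and the 0.5 threshold are illustrative choices, and token overlap is only a crude proxy that cannot judge meaning, so manual spot checks are still worthwhile.

```python
def token_overlap(original, augmented):
    """Jaccard similarity between the lower-cased token sets of two sentences."""
    a, b = set(original.lower().split()), set(augmented.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def filter_augmented(original, candidates, min_overlap=0.5):
    """Keep candidates that are similar to the original but not identical to it."""
    return [
        c for c in candidates
        if c != original and token_overlap(original, c) >= min_overlap
    ]

original = "the service was quick and friendly"
candidates = [
    "the service was fast and friendly",     # one word changed
    "the service was quick and friendly",    # identical, adds nothing
    "bananas orbit the purple spreadsheet",  # drifted too far
]
kept = filter_augmented(original, candidates)
```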

Does data augmentation combat overfitting?

Data augmentation is often used to combat overfitting in machine learning. Overfitting occurs when a model learns to fit the training data too closely and fails to generalise well to new, unseen data.

Data augmentation helps prevent overfitting by increasing the variety and quantity of training data, which can help the model learn more generalised and robust features.

By generating new data instances through data augmentation, the model is exposed to more data variations and is, therefore, less likely to memorise the training data and more likely to learn meaningful patterns. Additionally, augmentation can reduce the effects of class imbalance in the data by generating new instances of underrepresented classes.
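The class-imbalance idea can be sketched as follows: augmented copies of minority-class samples are added until every class is as large as the biggest one. This is a minimal illustration under stated assumptions, with a hypothetical `augment` function (a random word swap) standing in for any real augmentation and a tiny made-up dataset.

```python
import random
from collections import Counter

def augment(text, rng):
    """Hypothetical augmentation: swap two random words in the sentence."""
    words = text.split()
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def balance_with_augmentation(samples, rng):
    """Add augmented minority-class samples until all classes match the largest."""
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = list(samples)
    for label, texts in by_label.items():
        for _ in range(target - len(texts)):
            balanced.append((augment(rng.choice(texts), rng), label))
    return balanced

rng = random.Random(0)
samples = [
    ("great product works well", "pos"),
    ("really happy with this", "pos"),
    ("love it so much", "pos"),
    ("arrived broken and late", "neg"),
]
balanced = balance_with_augmentation(samples, rng)
counts = Counter(label for _, label in balanced)
```

Balancing of this kind should be applied to the training split only, so that validation and test data keep the real class distribution.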

However, it is essential to note that data augmentation alone is not enough to prevent overfitting. Other techniques such as regularisation, early stopping, and model architecture optimisation are also needed to ensure the best performance and generalisation of the model.

Conclusion

Data augmentation is a powerful technique for improving the performance and robustness of machine learning models. It involves generating new training data from existing data, which can increase the diversity and quantity of data and introduce variation that helps models generalise better to unseen data.

Many techniques are available for augmentation, including image, text, audio, and video augmentation. In addition, several popular libraries and frameworks support implementing data augmentation in machine learning pipelines.

However, it is crucial to carefully evaluate the augmentation techniques and their impact on the model’s performance. Some methods may not be suitable for all data types and can introduce unwanted noise or bias. Overall, data augmentation can be a valuable tool in the machine learning toolbox for improving the performance and robustness of models on a wide range of tasks.

About the Author

Neri Van Otten


Neri Van Otten is the founder of Spot Intelligence, a machine learning engineer with over 12 years of experience specialising in Natural Language Processing (NLP) and deep learning innovation. Dedicated to making your projects succeed.
