How To Implement Data Augmentation In Python [Image & Text]

The top 7 ways of implementing data augmentation for both images and text, with the top 3 Python libraries to use for image processing and NLP.

What is data augmentation?

Data augmentation is a technique used in machine learning and computer vision to increase the size of a dataset by creating new variations of existing data. It aims to make machine learning models better at generalising by giving them more training data with different versions of the same information.

Data augmentation techniques can include a range of transformations such as flipping, rotation, scaling, cropping, adding noise, changing brightness or contrast, and more. By applying these transformations to the original data, the resulting augmented data can provide the model with additional examples to learn from and can also help to reduce overfitting, as the model is exposed to a broader range of data.

Some typical applications of augmentation include image classification, object detection, and natural language processing, among others. In addition, augmentation can be performed on-the-fly during training or offline before training by generating and storing the augmented data in a separate dataset.
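The difference between offline and on-the-fly augmentation can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production pipeline: the `add_jitter` function and the tiny numeric dataset are hypothetical stand-ins for a real augmentation and real data.

```python
import random

def add_jitter(sample, rng, scale=0.1):
    """Hypothetical augmentation: perturb each value by a small random amount."""
    return [x + rng.uniform(-scale, scale) for x in sample]

def augment_offline(dataset, copies, rng):
    """Offline: build and keep an enlarged dataset before training starts."""
    augmented = list(dataset)
    for sample in dataset:
        augmented.extend(add_jitter(sample, rng) for _ in range(copies))
    return augmented

def augment_on_the_fly(dataset, epochs, rng):
    """On-the-fly: yield a fresh random variant of every sample in each epoch."""
    for _ in range(epochs):
        for sample in dataset:
            yield add_jitter(sample, rng)

rng = random.Random(0)
dataset = [[0.0, 1.0], [2.0, 3.0]]

# Offline: 2 originals plus 3 stored variants of each
offline = augment_offline(dataset, copies=3, rng=rng)

# On-the-fly: nothing is stored, variants are produced as training consumes them
online = list(augment_on_the_fly(dataset, epochs=2, rng=rng))
```

The offline version costs storage but makes every epoch identical, while the on-the-fly version costs compute during training but shows the model a different variant each time.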

Data augmentation in deep learning

Data augmentation is a widespread technique in deep learning for generating new training data from existing data. It is used to increase the amount of training data and introduce variation into it, improving the robustness and generalisation of deep learning models.

Here are some standard augmentation techniques used in deep learning:

Image augmentation: This involves manipulating images to create variations in the training data, such as rotating, flipping, cropping, scaling, or adding noise.
Text augmentation: This involves manipulating text to create variations in the training data, such as replacing synonyms, adding or removing words, or changing the word order.
Audio augmentation: This involves manipulating audio data to create variations in the training data, such as adding noise, changing pitch or tempo, or adding echoes.
Video augmentation: This involves manipulating video data to create variations in the training data, such as flipping, rotating, or cropping frames or changing the playback speed.
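Image and text augmentation are covered in detail later, so here is a rough sketch of what audio augmentation can look like with NumPy alone. The synthetic 440 Hz tone, the function names, and the parameter defaults are all assumptions made for illustration; dedicated audio libraries offer more careful implementations.

```python
import numpy as np

rng = np.random.default_rng(0)

# One second of a 440 Hz sine wave as a stand-in for a real recording
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

def add_noise(x, snr_db=20.0):
    """Add white Gaussian noise at the requested signal-to-noise ratio."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(noise_power), x.shape)

def time_shift(x, max_shift):
    """Roll the waveform by a random number of samples (wraps around the ends)."""
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.roll(x, shift)

def change_speed(x, rate):
    """Resample by linear interpolation; rate > 1 shortens the clip and raises the pitch."""
    new_len = int(round(len(x) / rate))
    positions = np.linspace(0, len(x) - 1, new_len)
    return np.interp(positions, np.arange(len(x)), x)

def add_echo(x, delay, decay=0.5):
    """Mix in a delayed, attenuated copy of the signal (delay in samples, must be > 0)."""
    out = x.copy()
    out[delay:] += decay * x[:-delay]
    return out

noisy = add_noise(signal)
shifted = time_shift(signal, max_shift=1_000)
faster = change_speed(signal, rate=1.25)
echoed = add_echo(signal, delay=sample_rate // 20)
```

Note that this simple resampling changes pitch and tempo together; changing one without the other needs more involved signal processing.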

Different deep learning frameworks, like TensorFlow and PyTorch, have built-in utilities that can be used for data augmentation. For example, in TensorFlow, the Keras ImageDataGenerator class can be used for image augmentation (it is marked as deprecated in recent TensorFlow releases, which recommend Keras preprocessing layers instead), while in PyTorch, the torchvision.transforms module provides various image transformation functions.

It is important to note that augmentation can increase the training time and computational cost of deep learning models, and some techniques may not be suitable for all types of data. Therefore, it is crucial to carefully evaluate and choose the appropriate augmentation techniques based on the specific problem and dataset.

Top 7 data augmentation techniques for images

Data augmentation techniques involve applying transformations or modifications to the existing dataset to generate additional training samples. Here are some standard augmentation techniques used in image processing:

Rotation: Rotating images by a certain degree can be helpful in training models to recognise objects from different angles.
Flipping: Flipping images horizontally or vertically can help increase the dataset’s size and teach the model to be invariant to mirrored orientations.
Cropping and Resizing: Cropping images to different sizes or resolutions can create new samples that contain only a portion of the original image, which can be helpful for object detection tasks.
Adding Noise: Adding random noise to images can reduce overfitting and improve the robustness of the model.
Brightness and Contrast Adjustment: Changing the brightness or contrast of images can create new samples with different lighting conditions, which can help the model to generalise better.
Zooming: Zooming in or out on images can create new samples with different levels of detail, which can help the model to recognise objects at different scales.
Translation: Translating images by a certain amount in any direction can create new samples with different positions of the objects in the picture.

Applying different filters to images, such as blurring, can also be used for augmentation.
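To make these transformations concrete, here is a minimal NumPy sketch of several of them. It is an illustration under simplifying assumptions, not a replacement for the libraries covered below: the random array stands in for a real photo, rotation is limited to multiples of 90 degrees (arbitrary angles need interpolation), zooming and filtering are left out, and all function names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def rotate_90(img, k=1):
    """Rotate by k * 90 degrees counter-clockwise."""
    return np.rot90(img, k)

def random_crop(img, size):
    """Cut a random size x size window out of the image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def add_gaussian_noise(img, sigma=10.0):
    """Add zero-mean Gaussian noise and clip back to the valid 0-255 range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Scale pixel values around mid-grey (contrast) and shift them (brightness)."""
    out = (img.astype(np.float32) - 127.5) * contrast + 127.5 + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

def translate(img, dx, dy):
    """Shift by dx pixels right and dy pixels down, filling uncovered areas with zeros."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0], max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

# Random 32x32 RGB array as a stand-in for a real photo
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

augmented = [
    horizontal_flip(image),
    rotate_90(image),
    random_crop(image, 24),
    add_gaussian_noise(image),
    adjust_brightness_contrast(image, brightness=20, contrast=1.2),
    translate(image, dx=4, dy=-3),
]
```

In practice, several of these are usually chained with randomly drawn parameters so that each training image is transformed differently every time it is seen.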

Top 7 data augmentation techniques for NLP

Data augmentation techniques can also be applied to text data, particularly in natural language processing (NLP). Here are some standard augmentation techniques used in NLP:

Synonym Replacement: Replacing words in a sentence with their synonyms can create new samples that retain the same meaning but use different words.
Back Translation: Translating a sentence from one language to another and back to the original language can create new samples with different phrasing and word choices.
Random Deletion: Randomly deleting words from a sentence can create new samples with different sentence structures and help the model learn to deal with missing information.
Random Swap: Randomly swapping two words in a sentence can create new samples with different sentence structures and word orders.
Random Insertion: Randomly inserting words into a sentence can create new samples with additional information and help the model learn to handle noisy input.
Changing Tenses: Changing the tense of verbs in a sentence can create new samples with different temporal contexts.
Masking: Replacing words in a sentence with a unique token can create new samples with missing information, which can help the model learn to fill in gaps in the input.
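Several of these operations need nothing beyond the Python standard library. The sketch below covers synonym replacement, random deletion, random swap, random insertion, and masking; back translation and tense changes are omitted because they need a translation model or a grammar-aware tool. The `SYNONYMS` table is a toy dictionary invented for this example, and the probabilities are arbitrary defaults.

```python
import random

# Toy synonym table, purely illustrative; a real pipeline might use WordNet or embeddings
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
    "big": ["large", "huge"],
}

def synonym_replacement(words, rng):
    """Replace every word that has an entry in SYNONYMS with a random synonym."""
    return [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]

def random_deletion(words, rng, p=0.2):
    """Drop each word with probability p, always keeping at least one word."""
    kept = [w for w in words if rng.random() > p]
    return kept or [rng.choice(words)]

def random_swap(words, rng):
    """Swap the words at two randomly chosen positions."""
    words = list(words)
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_insertion(words, rng):
    """Insert a synonym of some word from the sentence at a random position."""
    candidates = [w for w in words if w in SYNONYMS]
    words = list(words)
    if candidates:
        new_word = rng.choice(SYNONYMS[rng.choice(candidates)])
        words.insert(rng.randrange(len(words) + 1), new_word)
    return words

def random_masking(words, rng, p=0.15, mask_token="[MASK]"):
    """Replace each word with a mask token with probability p."""
    return [mask_token if rng.random() < p else w for w in words]

rng = random.Random(42)
sentence = "the quick dog is happy".split()
for augment in (synonym_replacement, random_deletion, random_swap,
                random_insertion, random_masking):
    print(augment.__name__, "->", " ".join(augment(sentence, rng)))
```

Splitting on whitespace is the simplest possible tokenisation; punctuation and casing would need extra handling on real text.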

Top 3 data augmentation libraries in Python for images

Several Python libraries provide augmentation functionality for machine learning tasks. Here are some of the most popular ones for image processing:

1. Keras ImageDataGenerator

This class provides an easy-to-use interface for image augmentation in Keras, a popular deep learning framework. It supports a variety of augmentation techniques such as rotation, shearing, zooming, and flipping.

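from keras.preprocessing.image import ImageDataGenerator

# Each range sets the limits from which a random transformation is drawn per image
datagen = ImageDataGenerator(
    rotation_range=20,        # rotate by up to 20 degrees in either direction
    width_shift_range=0.1,    # shift horizontally by up to 10% of the width
    height_shift_range=0.1,   # shift vertically by up to 10% of the height
    shear_range=0.2,          # shear intensity (shear angle in degrees)
    zoom_range=0.2,           # zoom in or out by up to 20%
    horizontal_flip=True,     # randomly mirror images left to right
    fill_mode='nearest'       # fill newly exposed pixels with the nearest pixel value
)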

2. Albumentations

This powerful and flexible library for image augmentation supports a wide range of transformations. It can be used with various deep learning frameworks and runs on the CPU, building on NumPy and OpenCV for fast processing.

import albumentations as A

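transform = A.Compose([
    A.Rotate(limit=20),                   # rotate by a random angle in [-20, 20] degrees
    A.RandomCrop(width=256, height=256),  # input must be at least 256x256 pixels
    A.HorizontalFlip(),                   # mirror left to right with probability 0.5
    A.RandomBrightnessContrast(),         # randomly change brightness and contrast
    A.Normalize()                         # normalise with ImageNet mean and std by default
])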

3. imgaug

This library provides a flexible and extensible platform for image augmentation with support for various transformations. It also supports batching and multiprocessing for the efficient processing of large datasets.

import imgaug.augmenters as iaa

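seq = iaa.Sequential([
    iaa.Rotate((0, 45)),                            # rotate by a random angle between 0 and 45 degrees
    iaa.Flipud(),                                   # flip images vertically
    iaa.GaussianBlur(sigma=(0, 3.0)),               # blur with a random sigma between 0 and 3.0
    iaa.AdditiveGaussianNoise(scale=(0, 0.1*255)),  # add Gaussian noise with a random standard deviation
    iaa.Crop(px=(0, 16))                            # crop 0 to 16 pixels from each side
])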

These libraries provide a variety of augmentation techniques that can be combined and customised to suit the specific needs of your machine learning task.

Top 3 data augmentation libraries in Python for text

Data augmentation for text can help improve the performance of natural language processing (NLP) models by increasing the diversity and quantity of training data. Here are some popular Python libraries for text augmentation:

1. NLPAug

This library provides various text augmentation techniques, including word embeddings, back-translation, and contextual word embeddings. It supports various NLP tasks such as text classification, sentiment analysis, and machine translation.

import nlpaug.augmenter.word as naw

# Substitute words with alternatives that BERT predicts from the surrounding context
aug = naw.ContextualWordEmbsAug(
    model_path='bert-base-uncased',
    action="substitute"
)
augmented_text = aug.augment("This is a sentence.")

2. TextBlob

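TextBlob is a general-purpose text processing library rather than a dedicated augmentation tool, but its WordNet integration makes it easy to look up synonyms for synonym replacement.

from textblob import Word

word = Word("happy")
# Collect WordNet lemma names across all synsets, skipping the word itself
# (the first lemma of the first synset is often just the original word)
synonyms = {name for syn in word.synsets for name in syn.lemma_names() if name != word}
augmented_word = sorted(synonyms)[0] if synonyms else str(word)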

3. AugLy

This library provides a range of augmentation techniques for various data types, including text. Its text module includes techniques such as typo simulation and punctuation insertion.

import augly.text as textaugs

# Introduce typos such as misspellings and keyboard slips
augmented_text = textaugs.simulate_typos("This is a sentence.")

# Insert punctuation characters into the text
punctuated_text = textaugs.insert_punctuation_chars("This is a sentence.")

These libraries provide a variety of text augmentation techniques that can be used to generate additional training data for NLP tasks, improving the model’s ability to generalise and perform well on unseen data. Note, however, that text augmentation can sometimes produce semantically incorrect or nonsensical text, so the generated data should be carefully evaluated before it is used for training.
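One cheap first check on generated text is to discard candidates that are identical to the original or that share too few words with it. The sketch below uses Jaccard overlap of tokens with an arbitrary threshold of 0.5; both the helper names and the threshold are assumptions for illustration, and word overlap is only a rough proxy that cannot confirm the meaning was preserved.

```python
def jaccard_similarity(a, b):
    """Share of unique lowercase tokens that two sentences have in common."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def filter_augmented(original, candidates, min_sim=0.5):
    """Keep candidates that changed something but still overlap with the original."""
    kept = []
    for candidate in candidates:
        sim = jaccard_similarity(original, candidate)
        if candidate != original and sim >= min_sim:
            kept.append(candidate)
    return kept

original = "the service was quick and friendly"
candidates = [
    "the service was fast and friendly",   # one word replaced
    "the service was quick and friendly",  # unchanged duplicate
    "friendly banana quick",               # most of the content lost
]
kept = filter_augmented(original, candidates)
print(kept)  # ['the service was fast and friendly']
```

A filter like this is best combined with manual spot checks of a sample of the generated sentences.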

Does data augmentation combat overfitting?

Data augmentation is often used to combat overfitting in machine learning. Overfitting occurs when a model learns to fit the training data too closely and fails to generalise well to new, unseen data.

Data augmentation helps prevent overfitting by increasing the variety and quantity of training data, which can help the model learn more generalised and robust features.

By generating new data instances through data augmentation, the model is exposed to more data variations and is, therefore, less likely to memorise the training data and more likely to learn meaningful patterns. Additionally, augmentation can reduce the effects of class imbalance in the data by generating new instances of underrepresented classes.
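The class imbalance point can be illustrated with a small sketch that tops up underrepresented classes with augmented copies until every class is as large as the biggest one. The `jitter` augmentation, the toy feature vectors, and the function names are hypothetical; with images or text, any of the augmentations described earlier could take its place.

```python
import random
from collections import Counter

def balance_with_augmentation(samples, labels, augment, rng):
    """Add augmented copies of minority-class samples until all classes match the largest."""
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(augment(rng.choice(pool), rng))
            out_labels.append(label)
    return out_samples, out_labels

def jitter(sample, rng, scale=0.05):
    """Hypothetical augmentation for numeric features: add small uniform noise."""
    return [x + rng.uniform(-scale, scale) for x in sample]

rng = random.Random(0)
samples = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8]]
labels = ["cat", "cat", "cat", "dog"]

balanced_samples, balanced_labels = balance_with_augmentation(samples, labels, jitter, rng)
print(Counter(balanced_labels))  # Counter({'cat': 3, 'dog': 3})
```

Balancing like this should be applied to the training split only, so that validation and test sets still reflect the real class distribution.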

However, it is essential to note that data augmentation alone is often not enough to prevent overfitting. Other techniques such as regularisation, early stopping, and model architecture optimisation are often needed as well to ensure the best performance and generalisation of the model.

Conclusion

Data augmentation is a powerful technique for improving the performance and robustness of machine learning models. It involves generating new training data from existing data, which can increase the diversity and quantity of data and introduce variation that helps models generalise better to unseen data.

Many techniques are available for augmentation, including image, text, audio, and video augmentation. In addition, several popular libraries and frameworks support implementing data augmentation in machine learning pipelines.

However, it is crucial to carefully evaluate the augmentation techniques and their impact on the model’s performance. Some methods may not be suitable for all data types and can introduce unwanted noise or bias. Overall, data augmentation can be a valuable tool in the machine learning toolbox for improving the performance and robustness of models on a wide range of tasks.