Last Updated on August 6, 2022
If you work on a machine learning problem related to images, not only do you need to collect some images as training data, but you also need to employ augmentation to create variations in the images. This is especially true for more complex object recognition problems.
There are many ways to do image augmentation. You may use external libraries or write your own functions for that. There are also some modules in TensorFlow and Keras for augmentation.
In this post, you will discover how you can use the Keras preprocessing layers as well as the tf.image module in TensorFlow for image augmentation.
After reading this post, you will know:
- What the Keras preprocessing layers are, and how to use them
- What functions the tf.image module provides for image augmentation
- How to use augmentation together with the tf.data dataset
Let's get started.

Image augmentation with Keras preprocessing layers and tf.image.
Photo by Steven Kamenar. Some rights reserved.
Overview
This article is divided into five sections; they are:
- Getting Images
- Visualizing the Images
- Keras Preprocessing Layers
- Using tf.image API for Augmentation
- Using Preprocessing Layers in Neural Networks
Getting Images
Before you see how you can do augmentation, you need to get the images. Ultimately, you need the images to be represented as arrays, for example, in H×W×3 8-bit integers for the RGB pixel values. There are many ways to get the images. Some can be downloaded as a ZIP file. If you're using TensorFlow, you may get some image datasets from the tensorflow_datasets library.
In this tutorial, you will use the citrus leaves images, which is a small dataset of less than 100MB. It can be downloaded from tensorflow_datasets as follows:
import tensorflow_datasets as tfds

ds, meta = tfds.load('citrus_leaves', with_info=True, split='train', shuffle_files=True)
Running this code the first time will download the image dataset into your computer with the following output:
Downloading and preparing dataset 63.87 MiB (download: 63.87 MiB, generated: 37.89 MiB, total: 101.76 MiB) to ~/tensorflow_datasets/citrus_leaves/0.1.2...
Extraction completed...: 100%|██████████████████████████████| 1/1 [00:06<00:00,  6.54s/ file]
Dl Size...: 100%|██████████████████████████████████████████| 63/63 [00:06<00:00,  9.63 MiB/s]
Dl Completed...: 100%|███████████████████████████████████████| 1/1 [00:06<00:00,  6.54s/ url]
Dataset citrus_leaves downloaded and prepared to ~/tensorflow_datasets/citrus_leaves/0.1.2. Subsequent calls will reuse this data.
The function above returns the images as a tf.data dataset object, together with the metadata. This is a classification dataset. You can print the training labels with the following:
...
for i in range(meta.features['label'].num_classes):
    print(meta.features['label'].int2str(i))
This prints:
Black spot
canker
greening
healthy
If you run this code again at a later time, you will reuse the downloaded images. But the other way to load the downloaded images into a tf.data dataset is to use the image_dataset_from_directory() function.
As you can see from the screen output above, the dataset is downloaded into the directory ~/tensorflow_datasets. If you look at that directory, you will see a directory structure as follows:
…/Citrus/Leaves
├── Black spot
├── Melanose
├── canker
├── greening
└── healthy
The directories are the labels, and the images are files stored under their corresponding directories. You can let the function read the directory recursively into a dataset:
import tensorflow as tf
from tensorflow.keras.utils import image_dataset_from_directory

# set to fixed image size 256x256
PATH = ".../Citrus/Leaves"
ds = image_dataset_from_directory(PATH,
                                  validation_split=0.2, subset="training",
                                  image_size=(256,256), interpolation="bilinear",
                                  crop_to_aspect_ratio=True,
                                  seed=42, shuffle=True, batch_size=32)
You may want to set batch_size=None if you do not want the dataset to be batched. Usually, you want the dataset to be batched for training a neural network model.
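For instance, below is a minimal sketch, reusing the PATH and split settings above, of loading unbatched samples:
...
# a sketch: load the same directory without batching; each element is a single (image, label) pair
unbatched_ds = image_dataset_from_directory(PATH,
                                            validation_split=0.2, subset="training",
                                            image_size=(256,256), seed=42,
                                            shuffle=True, batch_size=None)
for image, label in unbatched_ds.take(1):
    print(image.shape)  # (256, 256, 3): one image, no batch dimension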
Visualizing the Images
It is important to visualize the augmentation result, so you can verify that the augmented images are what you want them to be. You can use matplotlib for this.
In matplotlib, you have the imshow() function to display an image. However, for the image to be displayed correctly, the image should be presented as an array of 8-bit unsigned integers (uint8).
Given that you have a dataset created using image_dataset_from_directory(), you can get the first batch (of 32 images) and display a few of them using imshow(), as follows:
...
import matplotlib.pyplot as plt

fig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))

for images, labels in ds.take(1):
    for i in range(3):
        for j in range(3):
            ax[i][j].imshow(images[i*3+j].numpy().astype("uint8"))
            ax[i][j].set_title(ds.class_names[labels[i*3+j]])
plt.show()
Here, you see a display of nine images in a grid, labeled with their corresponding classification labels using ds.class_names. The images should be converted to NumPy arrays in uint8 for display. This code displays an image like the following:
The complete code, from loading the images to display, is as follows:
from tensorflow.keras.utils import image_dataset_from_directory
import matplotlib.pyplot as plt

# use image_dataset_from_directory() to load images, with image size scaled to 256x256
PATH = '.../Citrus/Leaves'  # modify to your path
ds = image_dataset_from_directory(PATH,
                                  validation_split=0.2, subset="training",
                                  image_size=(256,256), interpolation="mitchellcubic",
                                  crop_to_aspect_ratio=True,
                                  seed=42, shuffle=True, batch_size=32)

# Take one batch from dataset and display the images
fig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))

for images, labels in ds.take(1):
    for i in range(3):
        for j in range(3):
            ax[i][j].imshow(images[i*3+j].numpy().astype("uint8"))
            ax[i][j].set_title(ds.class_names[labels[i*3+j]])
plt.show()
Note that if you're using tensorflow_datasets to get the images, the samples are presented as a dictionary instead of a tuple of (image, label). You should change your code slightly to the following:
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt

# use tfds.load() or image_dataset_from_directory() to load images
ds, meta = tfds.load('citrus_leaves', with_info=True, split='train', shuffle_files=True)
ds = ds.batch(32)

# Take one batch from dataset and display the images
fig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))

for sample in ds.take(1):
    images, labels = sample["image"], sample["label"]
    for i in range(3):
        for j in range(3):
            ax[i][j].imshow(images[i*3+j].numpy().astype("uint8"))
            ax[i][j].set_title(meta.features['label'].int2str(labels[i*3+j]))
plt.show()
For the rest of this post, assume the dataset is created using image_dataset_from_directory(). You may need to tweak the code slightly if your dataset is created differently, for example, by first converting dictionary samples into tuples as sketched below.
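This is a minimal sketch, assuming ds was loaded with tfds.load() as above:
...
# a sketch: convert dictionary samples into (image, label) tuples
def to_tuple(sample):
    return sample["image"], sample["label"]

tuple_ds = ds.map(to_tuple)  # each element is now a tuple, as the rest of this post assumes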
Keras Preprocessing Layers
Keras comes with many neural network layers, such as convolution layers, that you need to train. There are also layers with no parameters to train, such as flatten layers to convert an array like an image into a vector.
The preprocessing layers in Keras are specifically designed to be used in the early stages of a neural network. You can use them for image preprocessing, such as to resize or rotate the image or to adjust the brightness and contrast. While the preprocessing layers are supposed to be part of a larger neural network, you can also use them as functions. Below is how you can use the resizing layer as a function to transform some images and display them side-by-side with the originals:
...

# create a resizing layer
out_height, out_width = 128, 256
resize = tf.keras.layers.Resizing(out_height, out_width)

# show original vs resized
fig, ax = plt.subplots(2, 3, figsize=(6,4))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # resize
        ax[1][i].imshow(resize(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("resize")
plt.show()
The images are 256×256 pixels each, and the resizing layer will make them 128×256 pixels (the height and width specified above). The output of the above code is as follows:
Since the resizing layer is a function, you can chain it to the dataset itself. For example,
...
def augment(image, label):
    return resize(image), label

resized_ds = ds.map(augment)

for image, label in resized_ds:
    ...
The dataset ds has samples in the form of (image, label). Hence you created a function that takes in such a tuple and preprocesses the image with the resizing layer. You then assigned this function as an argument for map() on the dataset. When you draw a sample from the new dataset created with the map() function, the image will be a transformed one.
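As a side note, map() can also run the function on several elements in parallel; a minimal sketch using the standard tf.data option:
...
# a sketch: let tf.data apply the augmentation function to multiple elements in parallel
resized_ds = ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)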
There are more preprocessing layers available. Some are demonstrated below.
As you saw above, you can resize an image. You can also randomly enlarge or shrink its height or width. Similarly, you can zoom in or zoom out on an image. Below is an example of manipulating the image size in various ways for a maximum of 30% increase or decrease:
...

# Create preprocessing layers
out_height, out_width = 128, 256
resize = tf.keras.layers.Resizing(out_height, out_width)
height = tf.keras.layers.RandomHeight(0.3)
width = tf.keras.layers.RandomWidth(0.3)
zoom = tf.keras.layers.RandomZoom(0.3)

# Visualize images and augmentations
fig, ax = plt.subplots(5, 3, figsize=(6,14))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # resize
        ax[1][i].imshow(resize(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("resize")
        # height
        ax[2][i].imshow(height(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("height")
        # width
        ax[3][i].imshow(width(images[i]).numpy().astype("uint8"))
        ax[3][i].set_title("width")
        # zoom
        ax[4][i].imshow(zoom(images[i]).numpy().astype("uint8"))
        ax[4][i].set_title("zoom")
plt.show()
This code shows images as follows:
While you specified a fixed dimension in resize, you get a random amount of manipulation in the other augmentations.
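If you want several random transformations applied together, one convenient pattern is to chain the layers in a tf.keras.Sequential model. Below is a minimal sketch (the particular layers are an arbitrary choice):
...
# a sketch: compose several random layers into one reusable augmentation pipeline
augment_pipeline = tf.keras.Sequential([
    tf.keras.layers.RandomZoom(0.3),
    tf.keras.layers.RandomFlip("horizontal"),
])

for images, labels in ds.take(1):
    augmented = augment_pipeline(images)  # zoom then flip, each with its own randomness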
You can also do flipping, rotation, cropping, and geometric translation using preprocessing layers:
...
# Create preprocessing layers
flip = tf.keras.layers.RandomFlip("horizontal_and_vertical")  # or "horizontal", "vertical"
rotate = tf.keras.layers.RandomRotation(0.2)
crop = tf.keras.layers.RandomCrop(out_height, out_width)
translation = tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2)

# Visualize augmentations
fig, ax = plt.subplots(5, 3, figsize=(6,14))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # flip
        ax[1][i].imshow(flip(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("flip")
        # crop
        ax[2][i].imshow(crop(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("crop")
        # translation
        ax[3][i].imshow(translation(images[i]).numpy().astype("uint8"))
        ax[3][i].set_title("translation")
        # rotate
        ax[4][i].imshow(rotate(images[i]).numpy().astype("uint8"))
        ax[4][i].set_title("rotate")
plt.show()
This code shows the following images:
And finally, you can do augmentations on color adjustments as well:
...
brightness = tf.keras.layers.RandomBrightness([-0.8, 0.8])
contrast = tf.keras.layers.RandomContrast(0.2)

# Visualize augmentation
fig, ax = plt.subplots(3, 3, figsize=(6,7))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # brightness
        ax[1][i].imshow(brightness(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("brightness")
        # contrast
        ax[2][i].imshow(contrast(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("contrast")
plt.show()
This shows the images as follows:
For completeness, below is the code to display the results of the various augmentations:
from tensorflow.keras.utils import image_dataset_from_directory
import tensorflow as tf
import matplotlib.pyplot as plt

# use image_dataset_from_directory() to load images, with image size scaled to 256x256
PATH = '.../Citrus/Leaves'  # modify to your path
ds = image_dataset_from_directory(PATH,
                                  validation_split=0.2, subset="training",
                                  image_size=(256,256), interpolation="mitchellcubic",
                                  crop_to_aspect_ratio=True,
                                  seed=42, shuffle=True, batch_size=32)

# Create preprocessing layers
out_height, out_width = 128, 256
resize = tf.keras.layers.Resizing(out_height, out_width)
height = tf.keras.layers.RandomHeight(0.3)
width = tf.keras.layers.RandomWidth(0.3)
zoom = tf.keras.layers.RandomZoom(0.3)

flip = tf.keras.layers.RandomFlip("horizontal_and_vertical")
rotate = tf.keras.layers.RandomRotation(0.2)
crop = tf.keras.layers.RandomCrop(out_height, out_width)
translation = tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2)

brightness = tf.keras.layers.RandomBrightness([-0.8, 0.8])
contrast = tf.keras.layers.RandomContrast(0.2)

# Visualize images and augmentations
fig, ax = plt.subplots(5, 3, figsize=(6,14))
for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # resize
        ax[1][i].imshow(resize(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("resize")
        # height
        ax[2][i].imshow(height(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("height")
        # width
        ax[3][i].imshow(width(images[i]).numpy().astype("uint8"))
        ax[3][i].set_title("width")
        # zoom
        ax[4][i].imshow(zoom(images[i]).numpy().astype("uint8"))
        ax[4][i].set_title("zoom")
plt.show()

fig, ax = plt.subplots(5, 3, figsize=(6,14))
for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # flip
        ax[1][i].imshow(flip(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("flip")
        # crop
        ax[2][i].imshow(crop(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("crop")
        # translation
        ax[3][i].imshow(translation(images[i]).numpy().astype("uint8"))
        ax[3][i].set_title("translation")
        # rotate
        ax[4][i].imshow(rotate(images[i]).numpy().astype("uint8"))
        ax[4][i].set_title("rotate")
plt.show()

fig, ax = plt.subplots(3, 3, figsize=(6,7))
for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # brightness
        ax[1][i].imshow(brightness(images[i]).numpy().astype("uint8"))
        ax[1][i].set_title("brightness")
        # contrast
        ax[2][i].imshow(contrast(images[i]).numpy().astype("uint8"))
        ax[2][i].set_title("contrast")
plt.show()
Finally, it is important to point out that most neural network models can work better if the input images are scaled. While we usually use 8-bit unsigned integers for the pixel values in an image (e.g., for display using imshow() as above), a neural network prefers the pixel values to be between 0 and 1 or between -1 and +1. This can be done with preprocessing layers too. Below is how you can update one of the examples above to add a scaling layer into the augmentation:
...
out_height, out_width = 128, 256
resize = tf.keras.layers.Resizing(out_height, out_width)
rescale = tf.keras.layers.Rescaling(1/127.5, offset=-1)  # rescale pixel values to [-1,1]

def augment(image, label):
    return rescale(resize(image)), label

rescaled_resized_ds = ds.map(augment)

for image, label in rescaled_resized_ds:
    ...
Using tf.image API for Augmentation
Besides the preprocessing layers, the tf.image module also provides some functions for augmentation. Unlike the preprocessing layers, these functions are intended to be used in a user-defined function that is assigned to a dataset with map(), as you saw above.
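For instance, below is a minimal sketch (the particular operations are illustrative choices) of wrapping tf.image functions into an augmentation function and attaching it to the dataset with map():
...
# a sketch: a user-defined augmentation function built on tf.image, attached with map()
def augment_image(image, label):
    image = tf.image.random_flip_left_right(image)   # random horizontal flip
    image = tf.image.random_brightness(image, 50.0)  # random brightness shift; pixels here are on the 0-255 scale
    return image, label

augmented_ds = ds.map(augment_image)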
The functions provided by tf.image are not duplicates of the preprocessing layers, although there is some overlap. Below is an example of using the tf.image functions to resize and crop images:
...

fig, ax = plt.subplots(5, 3, figsize=(6,14))

for images, labels in ds.take(1):
    for i in range(3):
        # original
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # resize
        h = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))
        w = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))
        ax[1][i].imshow(tf.image.resize(images[i], [h,w]).numpy().astype("uint8"))
        ax[1][i].set_title("resize")
        # crop: minval avoids a zero-size bounding box
        y, x, h, w = (128 * tf.random.uniform((4,), minval=0.1)).numpy().astype("int32")
        ax[2][i].imshow(tf.image.crop_to_bounding_box(images[i], y, x, h, w).numpy().astype("uint8"))
        ax[2][i].set_title("crop")
        # central crop
        x = tf.random.uniform([], minval=0.4, maxval=1.0)
        ax[3][i].imshow(tf.image.central_crop(images[i], x).numpy().astype("uint8"))
        ax[3][i].set_title("central crop")
        # crop to (h,w) at random offset
        h, w = (256 * tf.random.uniform((2,), minval=0.1)).numpy().astype("int32")
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[4][i].imshow(tf.image.stateless_random_crop(images[i], [h,w,3], seed).numpy().astype("uint8"))
        ax[4][i].set_title("random crop")
plt.show()
Below is the output of the above code:
While the display of images matches what you might expect from the code, the use of tf.image functions is quite different from that of the preprocessing layers. Every tf.image function is different. Therefore, you can see that the crop_to_bounding_box() function takes pixel coordinates, but the central_crop() function takes a fraction ratio as its argument.
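To make the contrast concrete, here is a minimal sketch, assuming image is one of the 256×256 images above and the numbers are arbitrary examples:
...
# crop_to_bounding_box() takes pixel coordinates: offset (y, x), then size (height, width)
box = tf.image.crop_to_bounding_box(image, 20, 30, 100, 150)  # a 100x150 crop starting at pixel (20, 30)
# central_crop() takes a single fraction of the image to keep, centered
middle = tf.image.central_crop(image, 0.5)                    # keep the central 50% along each dimension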
These functions also differ in the way randomness is handled. Some of them do not assume random behavior at all. Therefore, for a random resize, the exact output size should be generated with a random number generator separately before calling the resize function. Some other functions, such as stateless_random_crop(), can do the augmentation randomly, but a pair of random seeds in int32 needs to be specified explicitly.
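Because the stateless functions are deterministic in the seed, the same seed pair always reproduces the same output; a minimal sketch:
...
# stateless ops are deterministic in the seed: identical seeds give identical crops
seed = (42, 0)  # a pair of int32 values
crop1 = tf.image.stateless_random_crop(image, [128, 128, 3], seed)
crop2 = tf.image.stateless_random_crop(image, [128, 128, 3], seed)
print(tf.reduce_all(crop1 == crop2).numpy())  # True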
To continue the example, there are functions for flipping an image and extracting the Sobel edges:
...
fig, ax = plt.subplots(5, 3, figsize=(6,14))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # flip
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[1][i].imshow(tf.image.stateless_random_flip_left_right(images[i], seed).numpy().astype("uint8"))
        ax[1][i].set_title("flip left-right")
        # flip
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[2][i].imshow(tf.image.stateless_random_flip_up_down(images[i], seed).numpy().astype("uint8"))
        ax[2][i].set_title("flip up-down")
        # sobel edge
        sobel = tf.image.sobel_edges(images[i:i+1])
        ax[3][i].imshow(sobel[0, ..., 0].numpy().astype("uint8"))
        ax[3][i].set_title("sobel y")
        # sobel edge
        ax[4][i].imshow(sobel[0, ..., 1].numpy().astype("uint8"))
        ax[4][i].set_title("sobel x")
plt.show()
This shows the following:
And the following are the functions to manipulate the brightness, contrast, and colors:
...
fig, ax = plt.subplots(5, 3, figsize=(6,14))

for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # brightness
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[1][i].imshow(tf.image.stateless_random_brightness(images[i], 0.3, seed).numpy().astype("uint8"))
        ax[1][i].set_title("brightness")
        # contrast
        ax[2][i].imshow(tf.image.stateless_random_contrast(images[i], 0.7, 1.3, seed).numpy().astype("uint8"))
        ax[2][i].set_title("contrast")
        # saturation
        ax[3][i].imshow(tf.image.stateless_random_saturation(images[i], 0.7, 1.3, seed).numpy().astype("uint8"))
        ax[3][i].set_title("saturation")
        # hue
        ax[4][i].imshow(tf.image.stateless_random_hue(images[i], 0.3, seed).numpy().astype("uint8"))
        ax[4][i].set_title("hue")
plt.show()
This code shows the following:
Below is the complete code to display all of the above:
from tensorflow.keras.utils import image_dataset_from_directory
import tensorflow as tf
import matplotlib.pyplot as plt

# use image_dataset_from_directory() to load images, with image size scaled to 256x256
PATH = '.../Citrus/Leaves'  # modify to your path
ds = image_dataset_from_directory(PATH,
                                  validation_split=0.2, subset="training",
                                  image_size=(256,256), interpolation="mitchellcubic",
                                  crop_to_aspect_ratio=True,
                                  seed=42, shuffle=True, batch_size=32)

# Visualize tf.image augmentations

fig, ax = plt.subplots(5, 3, figsize=(6,14))
for images, labels in ds.take(1):
    for i in range(3):
        # original
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # resize
        h = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))
        w = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))
        ax[1][i].imshow(tf.image.resize(images[i], [h,w]).numpy().astype("uint8"))
        ax[1][i].set_title("resize")
        # crop: minval avoids a zero-size bounding box
        y, x, h, w = (128 * tf.random.uniform((4,), minval=0.1)).numpy().astype("int32")
        ax[2][i].imshow(tf.image.crop_to_bounding_box(images[i], y, x, h, w).numpy().astype("uint8"))
        ax[2][i].set_title("crop")
        # central crop
        x = tf.random.uniform([], minval=0.4, maxval=1.0)
        ax[3][i].imshow(tf.image.central_crop(images[i], x).numpy().astype("uint8"))
        ax[3][i].set_title("central crop")
        # crop to (h,w) at random offset
        h, w = (256 * tf.random.uniform((2,), minval=0.1)).numpy().astype("int32")
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[4][i].imshow(tf.image.stateless_random_crop(images[i], [h,w,3], seed).numpy().astype("uint8"))
        ax[4][i].set_title("random crop")
plt.show()

fig, ax = plt.subplots(5, 3, figsize=(6,14))
for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # flip
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[1][i].imshow(tf.image.stateless_random_flip_left_right(images[i], seed).numpy().astype("uint8"))
        ax[1][i].set_title("flip left-right")
        # flip
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[2][i].imshow(tf.image.stateless_random_flip_up_down(images[i], seed).numpy().astype("uint8"))
        ax[2][i].set_title("flip up-down")
        # sobel edge
        sobel = tf.image.sobel_edges(images[i:i+1])
        ax[3][i].imshow(sobel[0, ..., 0].numpy().astype("uint8"))
        ax[3][i].set_title("sobel y")
        # sobel edge
        ax[4][i].imshow(sobel[0, ..., 1].numpy().astype("uint8"))
        ax[4][i].set_title("sobel x")
plt.show()

fig, ax = plt.subplots(5, 3, figsize=(6,14))
for images, labels in ds.take(1):
    for i in range(3):
        ax[0][i].imshow(images[i].numpy().astype("uint8"))
        ax[0][i].set_title("original")
        # brightness
        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype("int32")
        ax[1][i].imshow(tf.image.stateless_random_brightness(images[i], 0.3, seed).numpy().astype("uint8"))
        ax[1][i].set_title("brightness")
        # contrast
        ax[2][i].imshow(tf.image.stateless_random_contrast(images[i], 0.7, 1.3, seed).numpy().astype("uint8"))
        ax[2][i].set_title("contrast")
        # saturation
        ax[3][i].imshow(tf.image.stateless_random_saturation(images[i], 0.7, 1.3, seed).numpy().astype("uint8"))
        ax[3][i].set_title("saturation")
        # hue
        ax[4][i].imshow(tf.image.stateless_random_hue(images[i], 0.3, seed).numpy().astype("uint8"))
        ax[4][i].set_title("hue")
plt.show()
These augmentation functions should be enough for most uses. But if you have specific ideas on augmentation, you will probably need a better image processing library. OpenCV and Pillow are common but powerful libraries that let you transform images more flexibly.
Using Preprocessing Layers in Neural Networks
You used the Keras preprocessing layers as functions in the examples above. But they can also be used as layers in a neural network, which is trivial to do. Below is an example of how you can incorporate preprocessing layers into a classification network and train it using a dataset:
from tensorflow.keras.utils import image_dataset_from_directory
import tensorflow as tf
import matplotlib.pyplot as plt

# use image_dataset_from_directory() to load images, with image size scaled to 256x256
PATH = '.../Citrus/Leaves'  # modify to your path
ds = image_dataset_from_directory(PATH,
                                  validation_split=0.2, subset="training",
                                  image_size=(256,256), interpolation="mitchellcubic",
                                  crop_to_aspect_ratio=True,
                                  seed=42, shuffle=True, batch_size=32)

AUTOTUNE = tf.data.AUTOTUNE
ds = ds.cache().prefetch(buffer_size=AUTOTUNE)

num_classes = 5
model = tf.keras.Sequential([
  tf.keras.layers.RandomFlip("horizontal_and_vertical"),
  tf.keras.layers.RandomRotation(0.2),
  tf.keras.layers.Rescaling(1/127.5, offset=-1),  # scale pixel values to [-1,1]
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(ds, epochs=3)
Running this code gives the following output:
Found 609 files belonging to 5 classes.
Using 488 files for training.
Epoch 1/3
16/16 [==============================] - 5s 253ms/step - loss: 1.4114 - accuracy: 0.4283
Epoch 2/3
16/16 [==============================] - 4s 259ms/step - loss: 0.8101 - accuracy: 0.6475
Epoch 3/3
16/16 [==============================] - 4s 267ms/step - loss: 0.7015 - accuracy: 0.7111
In the code above, you created the dataset with cache() and prefetch(). This is a performance technique that allows the dataset to prepare data asynchronously while the neural network is being trained. This would be significant if the dataset has some other augmentation assigned using the map() function, as in the sketch below.
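A minimal sketch of that combination, reusing the augment() function from the resize-and-rescale example earlier. The cache() call is placed before the random map() so that every epoch still sees fresh augmentations:
...
# a sketch: cache the deterministic part, augment on the fly, then prefetch
ds = ds.cache()                                            # cache decoded images
ds = ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)  # random augmentation re-runs every epoch
ds = ds.prefetch(buffer_size=tf.data.AUTOTUNE)             # overlap data preparation with training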
You will see some improvement in accuracy if you remove the RandomFlip and RandomRotation layers because you make the problem easier. However, as you want the network to predict well on a wide variation of image quality and properties, using augmentation can help your resulting network become more powerful. Note also that these random layers only transform the images during training; a short sketch of this behavior follows.
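Per the Keras documentation, the random preprocessing layers apply their transformations only when training; at inference time (e.g., in model.predict()), they pass images through unchanged. A minimal sketch, assuming images is a batch drawn from the dataset above:
...
flip = tf.keras.layers.RandomFlip("horizontal_and_vertical")
augmented = flip(images, training=True)     # randomly flipped
passthrough = flip(images, training=False)  # identity: no augmentation at inference time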
Further Reading
Below is some documentation from TensorFlow related to the examples above:
Summary
In this post, you have seen how you can use the tf.data dataset with image augmentation functions from Keras and TensorFlow.
Specifically, you learned:
- How to use the preprocessing layers from Keras, both as a function and as part of a neural network
- How to create your own image augmentation function and apply it to the dataset using the map() function
- How to use the functions provided by the tf.image module for image augmentation