{ "cells": [ { "cell_type": "markdown", "id": "5dd2450c", "metadata": {}, "source": [ "# Tiler Notebook\n", "\n", "Takes input images from a yolov5-structured directory and tiles them into smaller, overlapping squares. Bounding boxes are carried over to the tiled images.\n", "\n", "The images and labels are assumed to be in a yolov5 directory structure as follows:\n", "\n", "```text\n", "- path\n", "    - train (existing images and labels)\n", "        - images\n", "        - labels\n", "    - val (existing images and labels)\n", "        - images\n", "        - labels\n", "    - tiled_train (will be created; if it already exists, it will be overwritten by the function call)\n", "        - images\n", "        - labels\n", "        - empty_imgs\n", "```\n", "\n", "This notebook can be **run from anywhere** on the machine; just use an absolute path, as in the example.\n", "\n", "### References\n", "\n", "- The tiling itself is mostly borrowed and modified from this repo and the accompanying article.\n", "  - GitHub repo ([link](https://github.com/slanj/yolo-tiling))\n", "  - Medium article ([link](https://towardsdatascience.com/tile-slice-yolo-dataset-for-small-objects-detection-a75bf26f7fa2?gi=97fbf079fdec))\n", "  - **We made the following IMPROVEMENTS to the code.**\n", "    - Added multi-threading (parallel workers via `multiprocessing.Pool`)\n", "      - Tiling time dropped from about 18.5 minutes to about 3.5 minutes\n", "    - Configured to work with yolov5-style directories. The original was yolov4 style (images and labels in the same folder)\n", "    - Added overlapped zones between tiles\n", "    - Added better progress tracking (tqdm instead of a running list of files being processed)\n", "    - Fixed a bug where a line was sometimes returned from the intersection of polygons\n", "    - Added robustness against labeling errors on our side. 
It won't crash if given a zero-height box anymore.\n", "- This Stack Overflow answer was very helpful with the multithreading ([link](https://stackoverflow.com/questions/5666576/show-the-progress-of-a-python-multiprocessing-pool-imap-unordered-call/40133278#40133278))\n" ] }, { "cell_type": "markdown", "id": "ce34b9e5", "metadata": {}, "source": [ "## Do Imports & Define Functions\n", "\n", "At a high level, the function `tile_one_overlap()` does most of the heavy lifting. It takes a path to an image and slices that image into tiles of `slice_size` by `slice_size` pixels. It saves each tile and its corresponding yolo labels in the directories that are passed in. Each tile overlaps its neighbor by `ol_size` pixels. The function also pads the image with grey before slicing so that the tiles fit evenly into the padded image.\n", "\n", "The wrapper function `tile_one_ol_multi_wrap` is just there to make the thing work with multi-threading: it takes all of the arguments as a single tuple so that it can be handed to `Pool.imap_unordered`.\n", "\n", "The `tile_train_multi` function is what you'll actually call to run the notebook. You pass in the folder to tile, the size of the tiles, the overlap between adjacent tiles, and the number of threads to run on. 
(Passing `None` for the number of threads uses all CPUs.)" ] }, { "cell_type": "code", "execution_count": 11, "id": "c6f3f0a0", "metadata": { "scrolled": true }, "outputs": [], "source": [ "## Import Stuff\n", "\n", "!python3 -m pip install -qr /workspace/yolo-tiling/requirements.txt\n", "\n", "import pandas as pd\n", "import numpy as np\n", "from PIL import Image\n", "from shapely.geometry import Polygon\n", "import glob\n", "import argparse\n", "import os\n", "import random\n", "from tqdm import tqdm\n", "from threading import Thread\n", "from shutil import copyfile\n", "import multiprocessing as mp\n", "\n", "\n", "## Define functions\n", "\n", "def get_pad_and_nsquares(oldwidth,oldheight,slice_size, ol_size):\n", "    '''\n", "    oldwidth - width of old (unpadded) image in pixels\n", "    oldheight - height of old (unpadded) image in pixels\n", "    slice_size - size of tiles in px (tiles are square)\n", "    ol_size - size in px of the overlapped region between tiles\n", "    \n", "    returns tuple with (in this order):\n", "        - wdsquare - the width of the padded image in tiles (overlap included)\n", "        - htsquare - the height of the padded image in tiles (overlap included)\n", "        - tbpad - the height in px of the padded border added to both the top and the bottom of the image\n", "        - lrpad - the width in px of the padded border added to both the left and the right of the image\n", "    '''\n", "\n", "    htsquare = (oldheight - ol_size) // (slice_size - ol_size) + 1\n", "    wdsquare = (oldwidth - ol_size) // (slice_size - ol_size) + 1\n", "    \n", "    tbpad = ((htsquare * slice_size - (htsquare-1) * ol_size) - oldheight + 1)//2\n", "    lrpad = ((wdsquare * slice_size - (wdsquare-1) * ol_size) - oldwidth + 1)//2\n", "    \n", "    return (wdsquare, htsquare, tbpad, lrpad)\n", "\n", "def map_tile_xy_to_orig(og_w, og_h, tile_sz, ol, tile_row, tile_col, tile_x, tile_y):\n", "    '''\n", "    Takes in coordinates in tile space and translates them to original image space\n", "    y is 0 at top of the image and increases in the downward direction \n", "    x is 0 at left of the image 
and increases in the rightward direction\n", "    \n", "    inputs:\n", "        og_w - width of original (unpadded) image\n", "        og_h - height of original (unpadded) image\n", "        tile_sz - size of a square tile image in pixels\n", "        ol - overlap between adjacent tile images in pixels\n", "        tile_row - row number of the tile of interest\n", "        tile_col - column number of the tile of interest\n", "        tile_x - x value on tile of interest that we want to convert to original image space\n", "        tile_y - y value on tile of interest that we want to convert to original image space\n", "    \n", "    returns a tuple of (og_img_x, og_img_y, in_ol) \n", "        og_img_x - the x of the tile point as it would be located in the original image\n", "        og_img_y - the y of the tile point as it would be located in the original image\n", "        in_ol - a boolean, True if the point passed is in the overlap zone, False if not\n", "    '''\n", "    \n", "    w_tiles, h_tiles, tbpad, lrpad = get_pad_and_nsquares(og_w,og_h,tile_sz, ol)\n", "    padded_img_x = tile_x + (tile_col * (tile_sz - ol)) # where tile_x maps to on padded image\n", "    padded_img_y = tile_y + (tile_row * (tile_sz - ol)) # where tile_y maps to on padded image\n", "    og_img_x = padded_img_x - lrpad\n", "    og_img_y = padded_img_y - tbpad\n", "    \n", "    last_row, last_col = h_tiles - 1, w_tiles - 1\n", "    in_r_ol = (tile_x > (tile_sz - ol)) and (tile_col != last_col)\n", "    in_l_ol = (tile_x < ol) and (tile_col != 0)\n", "    in_b_ol = (tile_y > (tile_sz - ol)) and (tile_row != last_row) # y grows downward, so large y is the bottom edge\n", "    in_t_ol = (tile_y < ol) and (tile_row != 0)\n", "    in_ol = in_r_ol or in_l_ol or in_t_ol or in_b_ol\n", "    \n", "    return (og_img_x, og_img_y, in_ol)\n", "\n", "def tile_one_overlap(imname, labname, newpath, newlabpath, falsepath, slice_size, ol_size):\n", "    '''\n", "    imname - name of given image file with path ex. \"/workspace/data/train/images/rhino-63.JPG\"\n", "    labname - name of a given label file with path ex. 
\"/workspace/data/train/labels/rhino-63.txt\"\n", " newpath - path to folder where image tiles that have bounded regions will be stored\n", " newlabpath - path to folder where labels associated with new image tiles will be stored\n", " falsepath - path to folder where image tiles that have no bounded regions will be stored\n", " slice_size - size of tiles in px (tiles are square)\n", " ol_size - size in px of the overlapped region between tiles\n", " \n", " returns a tuple with information about any \"edge case\" yolo boxes it encounters\n", " '''\n", " \n", " ext = \".\"+imname.split(\".\")[-1]\n", " \n", " im = Image.open(imname)\n", " imr = np.array(im, dtype=np.uint8)\n", " oldheight = imr.shape[0]\n", " oldwidth = imr.shape[1]\n", " \n", " ## Pad the image with grey such that we can evenly divide it into squares\n", " padcolor = 128\n", " \n", " wdsquare, htsquare ,tbpad, lrpad = get_pad_and_nsquares(oldwidth,oldheight,slice_size, ol_size)\n", " \n", " lrpadding = np.ones((oldheight,lrpad,3),dtype=np.uint8)*padcolor\n", " \n", " imr = np.hstack((lrpadding,imr,lrpadding))\n", " width = imr.shape[1]\n", "\n", " tbpadding = np.ones((tbpad, width,3),dtype=np.uint8)*padcolor\n", " \n", " imr = np.vstack((tbpadding,imr,tbpadding))\n", " height = imr.shape[0]\n", " \n", " labels = pd.read_csv(labname, sep=' ', names=['class', 'x1', 'y1', 'w', 'h'])\n", "\n", " # we need to rescale coordinates from 0-1 to real image height and width\n", " labels[['x1']] = labels[['x1']] * oldwidth + lrpad\n", " labels[['w']] = labels[['w']] * oldwidth\n", " labels[['y1']] = labels[['y1']] * oldheight + tbpad\n", " labels[['h']] = labels[['h']] * oldheight\n", "\n", " \n", " boxes = []\n", " badboxfound = \"\" #empty string evaluates to false\n", " nonpolyfound = \"\"\n", "\n", " # convert bounding boxes to shapely polygons. 
We need to invert Y and find polygon vertices from center points\n", " for row in labels.iterrows():\n", " x1 = row[1]['x1'] - row[1]['w']/2\n", " y1 = (height - row[1]['y1']) - row[1]['h']/2\n", " x2 = row[1]['x1'] + row[1]['w']/2\n", " y2 = (height - row[1]['y1']) + row[1]['h']/2\n", " \n", " if x1 == x2 or y1 ==y2: \n", " badboxfound = imname # will evaluate as True in conditionals\n", " else:\n", " boxes.append((int(row[1]['class']), Polygon([(x1, y1), (x2, y1), (x2, y2), (x1, y2)])))\n", "\n", " counter = 0\n", " # print('Image:', imname)\n", " # create tiles and find intersection with bounding boxes for each tile\n", " for i in range(htsquare):\n", " for j in range(wdsquare):\n", " x1 = j*(slice_size-ol_size)\n", " y1 = height - (i*(slice_size-ol_size))\n", " x2 = (j+1)*(slice_size-ol_size) + ol_size - 1\n", " y2 = height - ((i+1)*(slice_size-ol_size) + ol_size) + 1\n", "\n", " pol = Polygon([(x1, y1), (x2, y1), (x2, y2), (x1, y2)])\n", " imsaved = False\n", " slice_labels = []\n", "\n", " for box in boxes:\n", " if pol.intersects(box[1]):\n", " inter = pol.intersection(box[1])\n", " \n", " if inter.geom_type != 'Polygon':\n", " nonpolyfound = imname #pretty sure this would only happen if edge of bound is exactly on edge of tile\n", " \n", " else:\n", " \n", " if not imsaved:\n", " sliced = imr[height-y1:height-y2+1, x1:x2+1]\n", " sliced_im = Image.fromarray(sliced)\n", " filename = imname.split('/')[-1]\n", " slice_path = newpath + \"/\" + filename.replace(ext, f'_{i}_{j}{ext}') \n", " slice_labels_path = newlabpath + \"/\" + filename.replace(ext, f'_{i}_{j}.txt') \n", " # print(slice_path)\n", " sliced_im.save(slice_path)\n", " imsaved = True \n", "\n", " # get smallest rectangular polygon (with sides parallel to the coordinate axes) that contains the intersection\n", " # new_box = inter.envelope #Not sure envelope is needed. 
Sides should be parallel already\n", "                        new_box = inter\n", "\n", "                        # get central point for the new bounding box \n", "                        centre = new_box.centroid\n", "\n", "                        # get coordinates of polygon vertices\n", "                        x, y = new_box.exterior.coords.xy\n", "\n", "                        # get bounding box width and height normalized to slice size\n", "                        new_width = (max(x) - min(x)) / slice_size\n", "                        new_height = (max(y) - min(y)) / slice_size\n", "\n", "                        # we have to normalize central x and invert y for yolo format\n", "                        new_x = (centre.coords.xy[0][0] - x1) / slice_size\n", "                        new_y = (y1 - centre.coords.xy[1][0]) / slice_size\n", "\n", "                        # only keep the label if more than half of the original box lies inside this tile\n", "                        if (new_box.area/box[1].area) > 0.5:\n", "                            counter += 1\n", "                            slice_labels.append([box[0], new_x, new_y, new_width, new_height])\n", "\n", "            if len(slice_labels) > 0:\n", "                slice_df = pd.DataFrame(slice_labels, columns=['class', 'x1', 'y1', 'w', 'h'])\n", "                # print(slice_df)\n", "                slice_df.to_csv(slice_labels_path, sep=' ', index=False, header=False, float_format='%.6f')\n", "\n", "            if not imsaved and falsepath:\n", "                sliced = imr[height-y1:height-y2+1, x1:x2+1]\n", "                sliced_im = Image.fromarray(sliced)\n", "                filename = imname.split('/')[-1]\n", "                slice_path = falsepath + \"/\" + filename.replace(ext, f'_{i}_{j}{ext}') \n", "\n", "                sliced_im.save(slice_path)\n", "                # print('Slice without boxes saved')\n", "                imsaved = True\n", "    \n", "    return (badboxfound,nonpolyfound)\n", "    \n", "\n", "def tile_one_ol_multi_wrap(args):\n", "    '''\n", "    args - a single tuple holding all arguments for tile_one_overlap, in the order that function takes them\n", "    \n", "    returns the result of tile_one_overlap, or None if the label file (args[1]) does not exist\n", "    '''\n", "    try:\n", "        if os.path.isfile(args[1]):\n", "            return tile_one_overlap(*args)\n", "    except Exception: \n", "        print(\"Error on img\",args[0])\n", "        raise\n", "\n", "    \n", "def tile_train_multi(path, slice_size, ol_size = 0, nthread = None):\n", "    '''\n", "    path - abs path to folder containing train and val ex. 
/workspace/data\n", "    slice_size - length in px of each side of tile (they're square)\n", "    ol_size - length in px of overlap between tiles\n", "    nthread - number of threads to use when tiling out the images (if None, one for each available cpu) \n", "    \n", "    assumed directory structure:\n", "    - path\n", "        - train (existing images and labels)\n", "            - images\n", "            - labels\n", "        - val (existing images and labels)\n", "            - images\n", "            - labels\n", "        - tiled_train (will be created)\n", "            - images\n", "            - labels\n", "            - empty_imgs\n", "    '''\n", "    \n", "    badboximgs = []\n", "    nonpolyimgs = []\n", "    \n", "    ## Make strings for new directories\n", "    train_img_path = path + \"/train/images\"\n", "    train_lab_path = path + \"/train/labels\"\n", "    tiled_path = path + \"/tiled_train\"\n", "    tiled_good_img_path = tiled_path + \"/images\"\n", "    tiled_empty_img_path = tiled_path + \"/empty_imgs\"\n", "    tiled_lab_path = tiled_path + \"/labels\"\n", "    \n", "    ## Delete the old tiled directory if it exists and start anew\n", "    if os.path.isdir(tiled_path):\n", "        ! 
rm -rf $tiled_path\n", "    os.makedirs(tiled_good_img_path)\n", "    os.makedirs(tiled_empty_img_path)\n", "    os.makedirs(tiled_lab_path)\n", "    \n", "    ## Get list of existing training image file names \n", "    og_train_list = os.listdir(train_img_path)\n", "    \n", "    ## Make iterables so that we can use the pool's imap_unordered function and multithread\n", "    fimg = lambda x : train_img_path + \"/\" + x \n", "    imgfileiter = map(fimg, og_train_list)\n", "\n", "    def flab(img):\n", "        # swap only the final extension for .txt (str.replace could also hit a match earlier in the name)\n", "        return train_lab_path + \"/\" + os.path.splitext(img)[0] + \".txt\"\n", "    labfileiter = map(flab, og_train_list)\n", "\n", "    tgiplist = [tiled_good_img_path]*len(og_train_list)\n", "    tlplist = [tiled_lab_path]*len(og_train_list)\n", "    teiplist = [tiled_empty_img_path]*len(og_train_list)\n", "    sslist = [slice_size]*len(og_train_list)\n", "    olslist = [ol_size]*len(og_train_list)\n", "    \n", "    inputtuplelist = list(zip(imgfileiter,labfileiter,tgiplist,tlplist,teiplist,sslist,olslist))\n", "    \n", "    norescount = 0\n", "    \n", "    p = mp.Pool(processes = nthread)\n", "    \n", "    for res in tqdm(p.imap_unordered(tile_one_ol_multi_wrap, inputtuplelist), total= len(inputtuplelist)):\n", "        if res is not None:\n", "            if res[0]:\n", "                badboximgs.append(res[0])\n", "            if res[1]:\n", "                nonpolyimgs.append(res[1])\n", "        else:\n", "            # the wrapper returns None when an image has no label file (a background image)\n", "            norescount += 1\n", "\n", "    p.close()\n", "    p.join()\n", "    \n", "    \n", "    print(f\"{norescount} background images (these are not tiled)\")\n", "    print()\n", "    print(\"{} bad box images found (images processed with zero height boxes ignored)\".format(len(badboximgs)))\n", "    for img in badboximgs:\n", "        print(img)\n", "    print()\n", "    print(\"{} nonpolygon intersections found (nothing wrong with this image just diagnostics for kevin)\".format(len(nonpolyimgs)))\n", "    for img in nonpolyimgs:\n", "        print(img)\n", "    \n", "    return None" ] }, { "cell_type": "markdown", "id": "4b79b7f4", "metadata": {}, "source": [ "## Call the function to tile the images" ] }, { "cell_type": "code", "execution_count": 4, "id": "07fe026c", 
"metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 1812/1812 [03:30<00:00, 8.62it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "164 background images (these are not tiled)\n", "\n", "3 bad box images found (images processed with zero height boxes ignored)\n", "/workspace/data/train/images/ostrich-10.jpg\n", "/workspace/data/train/images/human_07-0024.jpg\n", "/workspace/data/train/images/ostrich_03-1001.JPG\n", "\n", "1 nonpolygon interesections found (nothing wrong with this image just diagnostics for kevin)\n", "/workspace/data/train/images/human1-1080.jpg\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "working_dir = \"/workspace/data\"\n", "tile_size = 512\n", "overlap = 128\n", "threads = None # passing None uses all available threads/CPUs\n", "\n", "tile_train_multi(working_dir,tile_size, overlap, threads)" ] }, { "cell_type": "markdown", "id": "6a311e3e", "metadata": {}, "source": [ "## Combine Tiled and Un-Tiled Images in `train`\n", "\n", "This creates an additional directory (`original_train`) and copies the images and labels from `train` into it as a backup. It then adds the tiled images and labels from `tiled_train` to the `train` directory." ] }, { "cell_type": "markdown", "id": "5751bf2b", "metadata": {}, "source": [ "#### Combine tiled images and untiled images into `train` folder" ] }, { "cell_type": "code", "execution_count": 7, "id": "9fead7f7", "metadata": {}, "outputs": [], "source": [ "## Make a copy of the original training data; if we already have one, make sure that's what is in the train folder before combining\n", "if not os.path.isdir(working_dir+\"/original_train\"):\n", "    ! cp -r $working_dir/train $working_dir/original_train\n", "else:\n", "    ! rm -rf $working_dir/train\n", "    ! 
cp -r $working_dir/original_train $working_dir/train\n", "\n", "## combine the tiled images and labels into the train folder with the original training images \n", "! cp -r $working_dir/tiled_train/images/* $working_dir/train/images/\n", "! cp -r $working_dir/tiled_train/labels/* $working_dir/train/labels/" ] }, { "cell_type": "markdown", "id": "55baefd2", "metadata": {}, "source": [ "## See if I can get the original bounding boxes back out of my tiled images" ] }, { "cell_type": "code", "execution_count": 12, "id": "b68ef413", "metadata": {}, "outputs": [], "source": [ "def consolidate_boxes():\n", " pass" ] }, { "cell_type": "markdown", "id": "647161ba", "metadata": {}, "source": [ "## Optional: Return the `train` folder to its original state" ] }, { "cell_type": "markdown", "id": "b4068aab", "metadata": {}, "source": [ "#### Return the `train` folder to its original state (de-combine)" ] }, { "cell_type": "code", "execution_count": 3, "id": "e109675e", "metadata": {}, "outputs": [], "source": [ "if os.path.isdir(working_dir+\"/original_train\"):\n", " ! rm -rf $working_dir/train\n", " ! cp -r $working_dir/original_train $working_dir/train\n", " \n", " # ## Optionally delete original_train when you're done \n", " # # You probably DON'T want to do this\n", " # # ! rm -rf $working_dir/original_train" ] }, { "cell_type": "markdown", "id": "48b4b969", "metadata": {}, "source": [ "## Optional: Testing Code" ] }, { "cell_type": "markdown", "id": "09e4773c", "metadata": {}, "source": [ "#### Test the point mapper mapping back to orig space. " ] }, { "cell_type": "code", "execution_count": 17, "id": "85315b15", "metadata": {}, "outputs": [], "source": [ "def test_tile_map_on_image(img_w_path, slice_size, ol_size, dot_density):\n", " '''\n", " Test the tile point mapping function on a single image. The image will get covered with points where\n", " the hypothetical tiles would be. Corner points for tiles will be blue on the image. Points in overlap\n", " will be red. 
Non overlap region points will be green.\n", " \n", " inputs\n", " - img_w_path - The complete path to the image file we'll be testing\n", " '''\n", " \n", " ## Read in image\n", " im = Image.open(img_w_path)\n", " imr = np.array(im, dtype=np.uint8)\n", " oldheight = imr.shape[0]\n", " oldwidth = imr.shape[1]\n", " \n", " wdsquare, htsquare ,tbpad, lrpad = get_pad_and_nsquares(oldwidth,oldheight,slice_size, ol_size)\n", " \n", " ## Create some colors\n", " red = [255,0,0]\n", " green = [0,255,0]\n", " blue = [0,0,255]\n", " \n", " ## Add some colored points to the figure\n", " for ii in range(wdsquare):\n", " for jj in range(htsquare):\n", " for kk in range(slice_size//dot_density):\n", " for ll in range(slice_size//dot_density):\n", " og_img_x, og_img_y, in_ol = map_tile_xy_to_orig(oldwidth,\n", " oldheight, slice_size, ol_size, \n", " jj, ii, kk*dot_density, ll*dot_density)\n", " \n", " if (og_img_x >= 0) and (og_img_x < oldwidth) and (og_img_y >=0) and (og_img_y < oldheight):\n", " if in_ol:\n", " imr[og_img_y,og_img_x] = red\n", " else:\n", " imr[og_img_y,og_img_x] = green\n", " \n", " corners = [(0,0), (0,slice_size-1), (slice_size-1, 0), (slice_size-1, slice_size-1)]\n", " for corner in corners:\n", " og_img_x, og_img_y, _ = map_tile_xy_to_orig(oldwidth,\n", " oldheight, slice_size, ol_size, \n", " jj, ii, corner[0], corner[1])\n", " if (og_img_x >= 0) and (og_img_x < oldwidth) and (og_img_y >=0) and (og_img_y < oldheight):\n", " imr[og_img_y,og_img_x] = blue\n", "\n", " ext = \".\" + img_w_path.split(\".\")[-1]\n", " test_image = Image.fromarray(imr)\n", " test_path = img_w_path.replace(ext, f'_dottest{ext}') \n", " # print(slice_path)\n", " test_image.save(test_path)\n", "\n", "test_tile_map_on_image(\"/workspace/test_data/train/images/Blank_Test.JPG\",512,128,10)" ] }, { "cell_type": "code", "execution_count": 6, "id": "4e7dc54e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n" ] } ], "source": [ "def 
SCRATCH():\n", " a = np.array([[1,2,3],[4,5,6]])\n", " print(a[1,0]) #(should be \"4\", row 1, col 0)\n", "SCRATCH()" ] }, { "cell_type": "code", "execution_count": null, "id": "6e194568", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }