Transfer Learning
In order to collect your own datasets for training customized models to classify objects or scenes of your choosing, we've created an easy-to-use tool called camera-capture for capturing and labelling images on your Jetson from live video:
The tool will create datasets with the following directory structure on disk:
‣ train/
    • class-A/
    • class-B/
    • ...
‣ val/
    • class-A/
    • class-B/
    • ...
‣ test/
    • class-A/
    • class-B/
    • ...
where class-A, class-B, etc. will be subdirectories containing the data for each object class that you've defined in a class label file. The names of these class subdirectories will match the class label names that we'll create below. These subdirectories will automatically be populated by the tool for the train, val, and test sets from the classes listed in the label file, and a sequence of JPEG images will be saved under each.
Note that the above is the organizational structure expected by the PyTorch training script that we've been using. If you inspect the Cat/Dog and PlantCLEF datasets, you'll see they're organized in the same way.
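To make that expectation concrete, below is a minimal sketch (not the actual train.py, and the dataset path is a placeholder) of how a dataset organized this way can be loaded in PyTorch with torchvision's ImageFolder, which assigns class indices by sorting the subdirectory names alphabetically:

# Minimal sketch (not the actual train.py) of loading a dataset laid out as
# train/val/test subdirectories with one folder per class.
# The dataset path below is a placeholder.
import torch
from torchvision import datasets, transforms

data_dir = "/path/to/your/dataset"          # placeholder path

transform = transforms.Compose([
    transforms.Resize((224, 224)),          # resnet-style input size
    transforms.ToTensor(),
])

# ImageFolder maps each class subdirectory to an integer label,
# sorted alphabetically -- the same ordering as labels.txt
train_data = datasets.ImageFolder(data_dir + "/train", transform=transform)
val_data   = datasets.ImageFolder(data_dir + "/val",   transform=transform)

print(train_data.classes)                   # e.g. ['background', 'brontosaurus', ...]

train_loader = torch.utils.data.DataLoader(train_data, batch_size=8, shuffle=True)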
First, create an empty directory for storing your dataset and a text file that will define the class labels (usually called labels.txt). The label file contains one class label per line, and is alphabetized (this is important so the ordering of the classes in the label file matches the ordering of the corresponding subdirectories on disk). As mentioned above, the camera-capture tool will automatically populate the necessary subdirectories for each class from this label file.
Here's an example labels.txt file with 5 classes:
background
brontosaurus
tree
triceratops
velociraptor
And here's the corresponding directory structure that the tool will create:
‣ train/
    • background/
    • brontosaurus/
    • tree/
    • triceratops/
    • velociraptor/
‣ val/
    • background/
    • brontosaurus/
    • tree/
    • triceratops/
    • velociraptor/
‣ test/
    • background/
    • brontosaurus/
    • tree/
    • triceratops/
    • velociraptor/
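The camera-capture tool creates these subdirectories for you, but if you'd like to prepare or verify the skeleton yourself, a short sketch like the following will build it from labels.txt (the dataset path is a placeholder, and this is only for illustration):

# Sketch: build the train/val/test class subdirectories from labels.txt.
# camera-capture does this automatically -- this is only for illustration.
import os

dataset_dir = "/path/to/your/dataset"    # placeholder path

with open(os.path.join(dataset_dir, "labels.txt")) as f:
    classes = [line.strip() for line in f if line.strip()]

for split in ("train", "val", "test"):
    for cls in classes:
        os.makedirs(os.path.join(dataset_dir, split, cls), exist_ok=True)

print("created %d class folders per split: %s" % (len(classes), classes))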
Next, we'll cover the command-line options for starting the tool.
The source for the camera-capture tool can be found under jetson-inference/tools/camera-capture/, and like the other programs from the repo it gets built to the aarch64/bin directory and installed under /usr/local/bin/.
The camera-capture tool accepts 3 optional command-line arguments:
- --camera flag setting the camera device to use
    - MIPI CSI cameras are used by specifying the sensor index (0 or 1, etc.)
    - V4L2 USB cameras are used by specifying their /dev/video node (/dev/video0, /dev/video1, etc.)
    - The default is to use MIPI CSI sensor 0 (--camera=0)
- --width and --height flags setting the camera resolution (default is 1280x720)
    - The resolution should be set to a format that the camera supports.
    - Query the available formats with the following commands:
$ sudo apt-get install v4l-utils
$ v4l2-ctl --list-formats-ext
Below are some example commands for launching the tool:
$ camera-capture # using default MIPI CSI camera (1280x720)
$ camera-capture --camera=/dev/video0 # using V4L2 camera /dev/video0 (1280x720)
$ camera-capture --width=640 --height=480 # using default MIPI CSI camera (640x480)
note: for example cameras to use, see these sections of the Jetson Wiki:
- Nano: https://eLinux.org/Jetson_Nano#Cameras
- Xavier: https://eLinux.org/Jetson_AGX_Xavier#Ecosystem_Products_.26_Cameras
- TX1/TX2: developer kits include an onboard MIPI CSI sensor module (OV5693)
Below is the Data Capture Control window, which allows you to pick the desired path to the dataset and load the class label file that you created above. It then presents options for selecting the current object class and the train/val/test set that you are currently collecting data for:
First, open the dataset path and class labels. The tool will then create the dataset structure discussed above (unless these subdirectories already exist), and you will see your object labels populated inside the Current Class drop-down.
Then position the camera at the object or scene you have currently selected in the drop-down, and click the Capture button (or press the spacebar) when you're ready to take an image. The images will be saved under that class subdirectory in the train, val, or test set. The status bar displays how many images have been saved under that category.
It's recommended to collect at least 100 training images per class before attempting training. A rule of thumb for the validation set is that it should be roughly 10-20% the size of the training set, and the size of the test set is simply dictated by how many static images you want to test on. You can also just run the camera to test your model if you'd like.
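If you want to check your progress against these guidelines, a quick sketch like the one below tallies the saved images per class in each split (the dataset path is a placeholder, and it assumes the images are saved with a .jpg extension):

# Sketch: count how many images have been captured per class/split,
# to check whether you've reached ~100 training images per class.
import os, glob

dataset_dir = "/path/to/your/dataset"    # placeholder path

for split in ("train", "val", "test"):
    split_dir = os.path.join(dataset_dir, split)
    if not os.path.isdir(split_dir):
        continue
    for cls in sorted(os.listdir(split_dir)):
        count = len(glob.glob(os.path.join(split_dir, cls, "*.jpg")))
        print("%-6s %-15s %d images" % (split, cls, count))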
It's important that your data is collected from varying object orientations, camera viewpoints, lighting conditions, and ideally with different backgrounds to create a model that is robust to noise and changes in environment. If you find that your model isn't performing as well as you'd like, try adding more training data and playing around with the conditions.
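Random augmentation at training time can also help simulate some of this variation. As a point of reference, a typical torchvision augmentation pipeline for this kind of classification training looks roughly like the sketch below; the training script applies its own transforms, so treat this only as an illustration of the idea:

# Sketch of a typical training-time augmentation pipeline -- random crops,
# flips, and color jitter mimic some of the viewpoint and lighting variation
# described above.  Not a drop-in replacement for the transforms in train.py.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),                # vary framing/scale
    transforms.RandomHorizontalFlip(),                # vary orientation
    transforms.ColorJitter(0.2, 0.2, 0.2),            # vary lighting/color
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])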
When you've collected a bunch of data, you can try training a model on it, just like we've done before. The training process is the same as the previous examples, and the same PyTorch scripts are used:
$ cd jetson-inference/python/training/classification
$ python train.py --model-dir=<YOUR-MODEL> <PATH-TO-YOUR-DATASET>
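Under the hood, this fine-tunes a pretrained resnet-18 on your classes. Conceptually, the key transfer-learning step looks something like the simplified sketch below (not the actual train.py, and the class count is just the example from above):

# Simplified sketch of the transfer-learning setup: start from a resnet-18
# pretrained on ImageNet and replace its final layer so it predicts your
# dataset's classes instead.  Not the actual train.py -- just the core idea.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5                                   # e.g. the 5 labels above

model = models.resnet18(pretrained=True)          # reuse pretrained features
model.fc = nn.Linear(model.fc.in_features, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# the training loop then iterates over the DataLoader, computing
#   loss = criterion(model(images), labels)
# and calling loss.backward() / optimizer.step() each batch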
Like before, after training you'll need to convert your PyTorch model to ONNX:
$ python onnx_export.py --model-dir=<YOUR-MODEL>
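For reference, this export step boils down to a call to torch.onnx.export() with named input/output tensors. Below is a simplified sketch (not the actual onnx_export.py); the tensor names input_0 and output_0 correspond to the --input_blob/--output_blob flags used at inference time in the commands further down:

# Simplified sketch of the ONNX export -- not the actual onnx_export.py.
# A dummy input traces the network, and the tensor names match the
# --input_blob/--output_blob flags used at inference time.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5
model = models.resnet18()
model.fc = nn.Linear(model.fc.in_features, num_classes)
# (load your trained weights into `model` here before exporting)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)         # N x C x H x W

torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["input_0"],
                  output_names=["output_0"],
                  verbose=True)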
The converted model will be saved under <YOUR-MODEL>/resnet18.onnx, which you can then load with the imagenet-console and imagenet-camera programs like we did in the previous examples:
DATASET=<PATH-TO-YOUR-DATASET>
# C++
imagenet-camera --model=<YOUR-MODEL>/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt
# Python
imagenet-camera.py --model=<YOUR-MODEL>/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt
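You can also load the re-trained model from your own Python code. Below is a rough sketch using the jetson.inference bindings to classify a single image; the paths are placeholders and the argument handling is simplified, so treat it as an outline rather than a verbatim recipe:

# Rough sketch: load the re-trained ONNX model with the jetson.inference
# bindings and classify one image.  Paths are placeholders and argument
# handling is simplified.
import jetson.inference
import jetson.utils
import sys

net = jetson.inference.imageNet(argv=[
    "--model=<YOUR-MODEL>/resnet18.onnx",
    "--labels=<PATH-TO-YOUR-DATASET>/labels.txt",
    "--input_blob=input_0",
    "--output_blob=output_0",
])

img, width, height = jetson.utils.loadImageRGBA(sys.argv[1])   # image to classify
class_idx, confidence = net.Classify(img, width, height)

print("classified as '%s' (%.2f%% confidence)" % (net.GetClassDesc(class_idx), confidence * 100))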
If you need to, go back and collect more training data and re-train your model again. You can restart training and pick up where you left off using the --resume and --epoch-start flags (run python train.py --help for more info). Remember to re-export the model to ONNX after re-training.
This is the last step of the Hello AI World tutorial, which covers inferencing and transfer learning on Jetson with TensorRT and PyTorch. To recap, together we've covered:
- Using image recognition networks to classify images
- Coding your own image recognition programs in Python and C++
- Classifying video from a live camera stream
- Performing object detection to locate object coordinates
- Re-training models with PyTorch using transfer learning
- Collecting your own datasets and training your own models
Next we encourage you to experiment and apply what you've learned to other projects, perhaps taking advantage of Jetson's embedded form-factor, such as an autonomous robot or an intelligent camera-based system. Here are some example ideas that you could play around with:
- use GPIO to trigger external actuators or LEDs when an object is detected
- an autonomous robot that can find or follow an object
- a handheld battery-powered camera + Jetson + mini-display
- an interactive toy or treat dispenser for your pet
- a smart doorbell camera that greets your guests
For more examples to inspire your creativity, see the Jetson Projects page. Have fun and good luck!
You can also follow our Two Days to a Demo tutorial, which covers training of even larger datasets in the cloud or on a PC using discrete NVIDIA GPU(s). Two Days to a Demo also covers semantic segmentation, which is like image classification, but on a per-pixel level instead of predicting one class for the entire image.