Deep Learning with Caffe, help

nightpoison

Hello all,

I’m looking for some help with a new area of development I’ve been working in. I’m brand new to the deep learning field. I’ve been assigned some projects at work and have been using NVIDIA DIGITS, which honestly dumbs down the process. The newest project needs me to use Caffe directly. My end goal is to train a network model that will be used for vehicle detection. The model description I’m using is SSD MobileNet, as the end product will be running on a Raspberry Pi 3 or something similar, with a Movidius NCS handling the brunt of the detection workload. I already have my software running on the Pi with the NCS, and it’s running at about 7 fps, which I’m good with for my current needs. I’m using a pre-trained model taken from NCAPPZOOV2 for the NCS. I would like to fine-tune that pre-trained model for only “Vehicles”. Right now it does cars, buses, people, bikes, and 16 other items for a total of 20. I just want one object, “Vehicle”.

I would like to use the SSD MobileNet model found in the NCAPPZOO, as it’s already proven to compile for the NCS. The prototxt file is already configured for the NCS compiler. I have a couple of questions.

When I was working with DIGITS it was very simple and straightforward. I would take my images and, using annotation software like LabelImg, create a directory of .xml files with the annotation information. I would then convert the .xml files to .txt files, one for each image, each with a full list of all the objects I wanted labeled within that image. In the end the directories would look something like this.

- train
- - - images
- - - - - 0.png
- - - - - …
- - - - - 99.png
- - - labels
- - - - - 0.txt
- - - - - …
- - - - - 99.txt
- val
- - - images
- - - - - 100.png
- - - - - …
- - - - - 130.png
- - - labels
- - - - - 100.txt
- - - - - …
- - - - - 130.txt

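The annotation .txt files would look like this, with one line per object; the four values after the first three numbers are the bounding box corners in pixels: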

Vehicle 0.00 0 0.00 1171.00 142.00 1231.00 203.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Vehicle 0.00 0 0.00 1110.00 100.00 1177.00 175.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
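For anyone following along, the .xml to .txt conversion step can be sketched roughly as below. This is only a minimal sketch, assuming the .xml files are in the Pascal VOC layout that LabelImg writes (`object`, `name`, `bndbox` with `xmin`/`ymin`/`xmax`/`ymax`) and that the bounding box sits in fields 5 to 8 of each output line, as in the samples above. The function name, the `label_override` parameter, and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation in Pascal VOC layout (assumed LabelImg output).
SAMPLE_XML = """
<annotation>
  <filename>0.png</filename>
  <object>
    <name>car</name>
    <bndbox>
      <xmin>1171</xmin><ymin>142</ymin><xmax>1231</xmax><ymax>203</ymax>
    </bndbox>
  </object>
</annotation>
"""

def voc_xml_to_label_lines(xml_text, label_override=None):
    """Turn one VOC-style XML annotation into 15-field label lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.findall("object"):
        # Optionally collapse every class name into a single label.
        name = label_override or obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Fields other than the label and the box are left at zero,
        # matching the sample lines in the post.
        lines.append(
            f"{name} 0.00 0 0.00 {xmin:.2f} {ymin:.2f} {xmax:.2f} {ymax:.2f} "
            + " ".join(["0.00"] * 7)
        )
    return lines

label_lines = voc_xml_to_label_lines(SAMPLE_XML, label_override="Vehicle")
print("\n".join(label_lines))
```

In practice each image's lines would be written to a .txt file with the same base name as the image.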

I know that DIGITS uses Caffe, but specifically NVCaffe, the fork developed by NVIDIA. So while the two are similar, they are not the same. The GUI for DIGITS simplifies things, but when you switch to vanilla Caffe it throws you for a loop. I don’t understand what the differences are, and the documentation on Caffe seems to be very poor compared to, say, TensorFlow. I was able to find very clear documentation on TensorFlow, but it seems all the documentation for Caffe is geared toward image classification and not detection. Does anyone know of clear documentation for setting up the dataset and training a new model from a pretrained model in Caffe?

I know that to retrain I need the solver and weights. I need a dataset with properly formatted annotations, though I’m not sure how to annotate them. I also need to modify the solver to indicate that I’m only looking for 1 output class as opposed to the 20 in the original.
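On the class-count part, here is a rough sketch of what the one-class setup might look like, assuming the model comes from the SSD fork of Caffe. That fork reads a label map file made of `item { name, label, display_name }` entries and reserves label 0 for background, and its class count lives in the network prototxt rather than in the solver. Whether the NCAPPZOO model follows exactly this layout is an assumption, and the helper name below is made up.

```python
def build_labelmap(class_names):
    """Build SSD-style label map text; label 0 is reserved for background."""
    items = [("none_of_the_above", 0, "background")]
    items += [(name, idx, name) for idx, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display_name in items:
        blocks.append(
            "item {\n"
            f'  name: "{name}"\n'
            f"  label: {label}\n"
            f'  display_name: "{display_name}"\n'
            "}"
        )
    return "\n".join(blocks) + "\n"

labelmap_text = build_labelmap(["Vehicle"])
# Background counts as a class, so one object class gives a class count of 2.
num_classes = labelmap_text.count("item {")
print(labelmap_text)
print("num_classes:", num_classes)
```

With background included, the original 20-object model would have a class count of 21, and a single “Vehicle” model would have 2.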

If you have experience with Caffe, Movidius, Mobilenet, etc, please let me know. I could really use some help.

Michael
 
Hahahaha, this is funny. I know there are better forums for this, but I've asked software questions here in the past and have received more helpful support and information than when I've posted on forums like stackoverflow, etc.

I've found that it's hard to beat the massive wealth and variety of knowledge available here.

drumenigma, I have visited stackoverflow, and I've found many discussions and blog posts, but I'm having difficulty understanding them. Most blog posts approach the process from the point of view of image classification rather than object detection.

HarryPottar, thanks, I'll check both your links out. Python is going to be an issue: I've installed Caffe, but Python doesn't seem to want to import caffe. So the installation of the Python dependencies seems to have failed.
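A quick way to narrow down the import problem is to check whether Python can see the module at all. This is a minimal sketch, assuming a Makefile build where the Python bindings are built separately (`make pycaffe`) and end up in the `python` directory of the Caffe source tree. The `CAFFE_ROOT` environment variable and the helper name are just illustrative assumptions, not something Caffe sets for you.

```python
import importlib.util
import os
import sys

def can_import(module_name, extra_path=None):
    """Return True if module_name can be found, optionally after adding a path."""
    if extra_path and os.path.isdir(extra_path) and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    return importlib.util.find_spec(module_name) is not None

# Hypothetical location of the Caffe checkout; adjust to the real path.
caffe_root = os.environ.get("CAFFE_ROOT")
caffe_python_dir = os.path.join(caffe_root, "python") if caffe_root else None

print("caffe importable:", can_import("caffe", caffe_python_dir))
```

If this prints False, the bindings were probably never built or are not on the Python path. If it prints True but `import caffe` still fails, the error message usually points at a missing Python dependency instead.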
 
Why would you retrain the model, if all you need is to group cars/buses/etc as your new vehicle class? Can’t you just take the fitted label and do a few if/else statements to create a vehicle/non-vehicle label as the last step?
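The grouping idea could look something like the sketch below. It assumes each detection comes back as a (label, score, box) tuple and that "car", "bus" and "motorbike" are among the 20 class names. Both the tuple layout and the exact class names are assumptions, so check them against the model's own label list.

```python
# Assumed names of the classes that should count as vehicles.
VEHICLE_CLASSES = {"car", "bus", "motorbike"}

def to_vehicle_detections(detections, min_score=0.5):
    """Keep only vehicle-type detections and relabel them as 'Vehicle'."""
    grouped = []
    for label, score, box in detections:
        if label in VEHICLE_CLASSES and score >= min_score:
            grouped.append(("Vehicle", score, box))
    return grouped

# Made-up detections for illustration: (label, score, (xmin, ymin, xmax, ymax)).
sample = [
    ("car", 0.91, (1171, 142, 1231, 203)),
    ("person", 0.88, (40, 60, 90, 200)),
    ("bus", 0.30, (300, 100, 500, 260)),
]
vehicles = to_vehicle_detections(sample)
print(vehicles)
```

This only relabels what the existing model already finds, so it does nothing for accuracy.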
 
I'm fine with grouping them for identification purposes, but the model as it stands is not accurate enough for my needs. I need to train it on more images covering a much wider variety of sizes and conditions. For instance, it doesn't do well at all if it's raining. So I need to fine-tune it with vehicles in the rain, probably at night, etc. Yeah, I didn't make that clear in my original post, so I absolutely agree with your suggestion.

I'm ultimately probably going to ditch the pre-trained model altogether and train from scratch for best performance, but use the same model description. Right now I'm just in a phase-one proof of concept. I would rather not train from scratch on an AWS system, which could take weeks. Using a pretrained model will be much faster for this stage.
 