Used PyTorch, OpenCV, and Pillow in Python.
Trained a character-agnostic text detector on the Chars74K dataset, together with images of indoor and outdoor scenes that contain no text. The character-agnostic model is an AlexNet network pre-trained on ImageNet.
Reduced the time to label a single image by 34% by training a smaller network with knowledge distillation, making the model capable of labelling thousands of images in a few hours. Also explored and analyzed other methods for neural network compression.
Used the distilled network with sliding windows to annotate images in the UCSD SVT and NEOCR datasets and derive bounding boxes.
Trained a Fully Convolutional Network (FCN) for street text localisation and detection on the resulting weakly labelled dataset.
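As an illustration of the knowledge distillation step mentioned above, here is a minimal sketch of a commonly used distillation loss: a temperature-softened KL term between teacher and student outputs, combined with the usual cross-entropy on the hard label. It is written in plain Python for readability rather than PyTorch. The function names, the temperature of 4.0, and the weighting `alpha` are assumptions chosen for illustration, not the settings used in this project.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) term and a hard (label) term.

    temperature and alpha are illustrative defaults (assumptions).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL(teacher || student) on temperature-softened distributions.
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard term: cross-entropy of the student against the ground-truth label.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Two-class example (text / no text): a student that agrees with the teacher
# incurs a smaller loss than one that disagrees.
teacher = [2.0, -1.0]
loss_match = distillation_loss([2.0, -1.0], teacher, label=0)
loss_mismatch = distillation_loss([-1.0, 2.0], teacher, label=0)
```

In a PyTorch training loop the same quantity would typically be computed on batched tensors, with the teacher's outputs detached from the graph so that only the smaller network is updated.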
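The sliding-window annotation step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `score_fn` is a hypothetical stand-in for cropping a window from the image and running the distilled network on it, the window size, stride, and threshold are placeholder values, and the positive windows are merged into a single enclosing box, which is cruder than what a real annotation pipeline would likely do.

```python
def sliding_windows(width, height, window, stride):
    """Yield square (x, y, w, h) windows, left to right and top to bottom."""
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            yield (x, y, window, window)


def annotate(width, height, score_fn, window=32, stride=16, threshold=0.5):
    """Return the windows whose text score reaches the threshold.

    score_fn is hypothetical: it maps a window (x, y, w, h) to a text
    probability, e.g. by running a classifier on the cropped patch.
    """
    return [box for box in sliding_windows(width, height, window, stride)
            if score_fn(box) >= threshold]


def enclosing_box(boxes):
    """Merge windows into one enclosing (x, y, w, h) box (a simplification)."""
    x0 = min(x for x, _, _, _ in boxes)
    y0 = min(y for _, y, _, _ in boxes)
    x1 = max(x + w for x, _, w, _ in boxes)
    y1 = max(y + h for _, y, _, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


# Toy example on a 64x32 image: pretend text appears only in the right half.
def toy_score(box):
    return 1.0 if box[0] >= 32 else 0.0


positives = annotate(64, 32, toy_score)
merged = enclosing_box([(0, 0, 32, 32), (32, 0, 32, 32)])
```

Boxes produced this way are only as reliable as the classifier behind `score_fn`, which is why the resulting labels are treated as weak supervision.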