PyData NYC by Akira Shibata

2. Who am I
Akira Shibata, PhD. TW: @punkphysicist
CEO, Shiroyagi Corporation (shiroyagi.co.jp)
Kamelio: Personalised News Curation
Kamect: Contents Discovery Platform
2004 - 2010: Data Scientist @ NYU, statistical data modelling @ LHC, CERN
2010 - 2013: Boston Consulting Group
Copyright 2014 Shiroyagi Corporation. All rights reserved.

13. Python puts all our tools together
Pipeline: 0 Image in -> 1 Detect regions -> 2 Object recog. -> 3 Scoring -> 4 Cropping
Tools (in slide order): Matlab + Scipy, C++ + Libraries, Numpy, PIL
Glue: IPython and Python script

16. Step 1. Region detection: telling it where to look
How do we find regions to feed into object recognition?
The default strategy was to look at the center.

17. Step 1. Exhaustive windows -> segmentation
Exhaustive windows: search over position, scale, and aspect ratio.
Segmentation: group parts of the image at different scales.
Exhaustive search is far too slow for use with Deep Learning.
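
To give a feel for why exhaustive search is costly, here is a rough sketch that only counts the windows a sliding-window search would visit. The function name, the parameter grid, and the image size are all assumptions chosen for illustration; they are not taken from the talk.

```python
def count_windows(width, height, sizes, aspect_ratios, stride):
    """Count sliding windows over every position, scale, and aspect ratio.

    Each (size, ratio) pair defines a window of roughly size*size pixels
    whose width/height ratio is `ratio` (a hypothetical parametrisation).
    """
    total = 0
    for size in sizes:
        for ratio in aspect_ratios:
            win_w = int(round(size * ratio ** 0.5))
            win_h = int(round(size / ratio ** 0.5))
            if win_w > width or win_h > height:
                continue  # window does not fit in the image
            n_x = (width - win_w) // stride + 1
            n_y = (height - win_h) // stride + 1
            total += n_x * n_y
    return total


# Illustrative grid: 3 scales x 3 aspect ratios, stride of 8 pixels
n_windows = count_windows(500, 375, sizes=[64, 128, 256],
                          aspect_ratios=[0.5, 1.0, 2.0], stride=8)
print(n_windows)
```

Every one of these windows would need its own forward pass through the network, which is why a segmentation-based proposal method that returns far fewer regions is attractive.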

18. Step 1. Region detection: in practice
(1) Install Matlab and the Selective Search algorithm from its author.
(2) Run Matlab as a subprocess:
    import shlex, subprocess
    cmd = "selective_search({'image_file.jpg'}, 'output.mat')"
    mc = 'matlab -nojvm -r "try; %s; catch; exit; end; exit"' % cmd
    pid = subprocess.Popen(shlex.split(mc), stdout=open('/dev/null', 'w'), cwd=script_dirname)
    pid.wait()  # block until Matlab has finished writing output.mat
(3) Import the output using scipy.io:
    import numpy as np, scipy.io
    all_boxes = list(scipy.io.loadmat('output.mat')['all_boxes'][0])
    # Subtract 1 from the first two coordinates of every box (Matlab indices start at 1)
    subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
    all_boxes = [boxes - subtractor for boxes in all_boxes]
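
The coordinate shift applied to the imported boxes relies on NumPy broadcasting. A minimal sketch with made-up box values (the numbers are illustrative only, not real Selective Search output):

```python
import numpy as np

# Hypothetical boxes as they might come out of Matlab (1-based start indices)
boxes = np.array([[1, 1, 100, 150],
                  [21, 41, 80, 120]])

# Shape (1, 4): broadcasts across every row of `boxes`
subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]

# Only the first two columns are shifted; the last two are left unchanged
boxes_zero_based = boxes - subtractor
print(boxes_zero_based)
```

Leaving the last two coordinates untouched means they can be used directly as exclusive slice ends in NumPy.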

26. Step 2. Object recognition: in practice
(1) Install a bunch of libraries and Caffe: CUDA, Boost, OpenCV, BLAS…
(2) Import the wrapper and configure:
    import caffe
    import numpy as np
    MODEL_FILE = 'models/bvlc_…_ilsvrc13/deploy.prototxt'
    PRETRAINED_FILE = 'models/…/bvlc_…_ilsvrc13.caffemodel'
    MEAN_FILE = 'caffe/imagenet/ilsvrc_2012_mean.npy'
    detector = caffe.Detector(MODEL_FILE, PRETRAINED_FILE, mean=np.load(MEAN_FILE),
                              raw_scale=255, channel_swap=[2, 1, 0])
(3) Pass the found regions for object detection:
    self.detect_windows(zip(image_fnames, windows_list))

27. Step 2. Object recognition: Result
     Obj              Score
 0   domestic cat      1.03649377823
 1   domestic cat      0.0617411136627
 2   domestic cat     -0.097744345665
 3   domestic cat     -0.738470971584
 4   chair            -0.988844156265
 5   skunk            -0.999914288521
 6   tv or monitor    -1.00460898876
 7   rubber eraser    -1.01068615913
 8   chair            -1.04896986485
 9   rubber eraser    -1.09035253525
10   band aid         -1.09691572189
Takes minutes to detect all windows.

28. Step 2. Object recognition: Result
     Obj            Score
 0   person          0.126184225082
 1   person          0.0311727523804
 2   person         -0.0777613520622
 3   neck brace     -0.39757412672
 4   person         -0.415030777454
 5   drum           -0.421649754047
 6   neck brace     -0.481261610985
 7   tie            -0.649109125137
 8   neck brace     -0.719438135624
 9   face powder    -0.789100408554
10   face powder    -0.838757038116

30. Step 3. Scoring
(1) For every pixel, sum up the score from all detections:
    for i in range(len(detections)):
        # ymin, ymax, xmin, xmax and score are those of detection i
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
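
A self-contained sketch of this accumulation step is given here. The `score_map` helper and the `((ymin, xmin, ymax, xmax), score)` tuple layout are assumptions made for illustration; the actual detection structure used in the talk is not shown on the slide.

```python
import math

import numpy as np


def score_map(shape, detections):
    """Sum exp(score) over the window of every detection.

    `detections` is assumed to be a list of ((ymin, xmin, ymax, xmax), score)
    tuples with exclusive ymax/xmax (a hypothetical layout).
    """
    arr = np.zeros(shape, dtype=float)
    for (ymin, xmin, ymax, xmax), score in detections:
        # exp() keeps every contribution positive, even for negative scores
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
    return arr


# Three made-up detections on a 6x6 image
detections = [((0, 0, 4, 4), 1.0),
              ((2, 2, 6, 6), 0.0),
              ((5, 0, 6, 2), -1.0)]
heat = score_map((6, 6), detections)
print(heat.round(2))
```

Pixels covered by several overlapping windows accumulate the largest values, so the peak of `heat` marks the region the detector is most confident about.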

33. Step 4. Cropping
(1) Generate all possible crop areas:
    while y+hws <= h:
        while x+hws <= w:
            window_locs = np.vstack((window_locs, [x, y, x+hws, y+hws]))
            # (the updates of x and y are not shown on the slide)
(2) Find the crop that encloses the highest point of interest in the centre:
    for i, window_loc in enumerate(window_locs):
        x1, y1, x2, y2 = window_loc
        if max_val != np.max(arr_con[y1:y2, x1:x2]):
            scores[i] = np.nan  # window does not contain the peak
        else:
            # squared distance from the window centre to the peak (xp, yp)
            scores[i] = ((x1+x2)/2. - xp)**2 + ((y1+y2)/2. - yp)**2
(3) Crop and save!
    img_pil = Image.open(fn)
    # smallest distance wins; nanargmin skips the windows marked NaN
    crop_area = tuple(int(v) for v in window_locs[np.nanargmin(scores)])
    img_crop = img_pil.crop(crop_area)
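
The window search can be sketched end to end as follows. This is a simplified variant under stated assumptions, not the code from the talk: `best_crop`, `size`, and `step` are hypothetical names, the windows are square, and the heat map is a toy array with a single peak.

```python
import numpy as np


def best_crop(heat, size, step=1):
    """Return (x1, y1, x2, y2) of the square window that contains the
    heat-map peak and whose centre lies closest to that peak."""
    h, w = heat.shape
    if size > h or size > w:
        raise ValueError("crop size exceeds image size")
    yp, xp = np.unravel_index(np.argmax(heat), heat.shape)
    max_val = heat[yp, xp]

    windows, dists = [], []
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            if heat[y:y + size, x:x + size].max() != max_val:
                continue  # this window misses the peak
            cx, cy = x + size / 2.0, y + size / 2.0
            windows.append((x, y, x + size, y + size))
            dists.append((cx - xp) ** 2 + (cy - yp) ** 2)
    return windows[int(np.argmin(dists))]


# Toy heat map with one peak at row 6, column 7
heat = np.zeros((10, 10))
heat[6, 7] = 5.0
crop = best_crop(heat, size=4)
print(crop)
```

The returned tuple follows the (left, upper, right, lower) order that PIL's `Image.crop` expects, so it could be passed to it directly.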
