
Caffe Learning Notes 7 -- Image Classification and Filter Visualization


This is the first of the Notebook Examples in the Caffe documentation; the original notebook is at http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb

This example uses the CaffeNet model (the BVLC reference network trained on ImageNet) to classify the cat image that ships with Caffe, visualizes the image features layer by layer, and compares running on the CPU versus the GPU.


1. Prepare the model and import the required modules:

Since we only run CaffeNet in its test phase here, we need the pretrained weights. From the Caffe root directory, run:

./scripts/download_model_binary.py models/bvlc_reference_caffenet

Alternatively, download bvlc_reference_caffenet.caffemodel directly from http://dl.caffe.berkeleyvision.org/ and, once the download finishes, place it in

$CAFFE_ROOT/models/bvlc_reference_caffenet, where $CAFFE_ROOT is the Caffe root directory.

Then run the following in IPython:

import numpy as np                     # numerical arrays, imported as np
import matplotlib.pyplot as plt        # plotting, imported as plt
import sys

caffe_root = "/home/username/caffe-master"   # Caffe root directory
sys.path.append("/usr/lib/python2.7/dist-packages")    # make sure the directory containing pycaffe is on the path
model_file = caffe_root + "/models/bvlc_reference_caffenet/deploy.prototxt"  # CaffeNet network definition
pretrained = caffe_root + "/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel"    # pretrained weights

image_file = caffe_root + "/examples/images/cat.jpg"    # test image

npload = caffe_root + "/python/caffe/imagenet/ilsvrc_2012_mean.npy"    # ImageNet mean image file

import caffe

plt.rcParams["figure.figsize"] = (10, 10)           # figure size of 10 x 10 inches
plt.rcParams["image.interpolation"] = "nearest"     # nearest-neighbor interpolation when displaying images
plt.rcParams["image.cmap"] = "gray"                 # grayscale colormap for single-channel images
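Before constructing the network it is worth checking that the weights file is actually in place; a minimal check along the lines of the original notebook, using the pretrained path defined above:

import os
if not os.path.isfile(pretrained):
    print("Weights not found -- run ./scripts/download_model_binary.py models/bvlc_reference_caffenet first.")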


2. Set CPU mode, load the model and weights, and configure the input preprocessing


caffe.set_mode_cpu()
net = caffe.Net(model_file, pretrained, caffe.TEST)    # construct the network in test phase
transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
transformer.set_transpose("data", (2, 0, 1))           # move image channels to the outermost dimension: (H, W, C) -> (C, H, W)
transformer.set_mean("data", np.load(npload).mean(1).mean(1))    # subtract the per-channel mean pixel value of the ImageNet training set
transformer.set_raw_scale("data", 255)                 # rescale from [0, 1] to [0, 255]; the reference model works on pixel values in [0, 255]
transformer.set_channel_swap("data", (2, 1, 0))        # the reference model expects BGR, so swap channels from RGB to BGR

3. Classify the test image

net.blobs["data"].reshape(50, 3, 227, 227)             # batch of 50, 3-channel 227 x 227 inputs
net.blobs["data"].data[...] = transformer.preprocess("data", caffe.io.load_image(image_file))  # load and preprocess the image
out = net.forward()
print("Predicted class is #{}.".format(out["prob"][0].argmax()))

4. Display the image:

plt.imshow(transformer.deprocess("data", net.blobs["data"].data[0]))
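In a notebook or an IPython session with inline plotting the image appears automatically; from a plain Python shell you would additionally call:

plt.show()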


5. Fetch the ImageNet labels

From the command line, run the following in the Caffe root directory:

./data/ilsvrc12/get_ilsvrc_aux.sh

Then, back in Python:

imagenet_labels_filename = caffe_root + "/data/ilsvrc12/synset_words.txt"
labels = np.loadtxt(imagenet_labels_filename, str, delimiter="\t")   # one label per line, tab-delimited
top_k = net.blobs["prob"].data[0].flatten().argsort()[-1:-6:-1]      # indices of the five highest probabilities
print(labels[top_k])



The output should look like this:
["n02123045 tabby, tabby cat" "n02123159 tiger cat"
 "n02124075 Egyptian cat" "n02119022 red fox, Vulpes vulpes"
 "n02127052 lynx, catamount"]

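If you also want to see how confident the network is, you can print the probability next to each of the top-5 labels; a small extension of the snippet above, reusing its top_k variable:

top_probs = net.blobs["prob"].data[0].flatten()[top_k]   # probabilities of the top-5 classes, highest first
for p, label in zip(top_probs, labels[top_k]):
    print("{:.4f}  {}".format(p, label))
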
6. Time a forward pass on the CPU

net.forward()  # call once for allocation
%timeit net.forward()
Output: 1 loops, best of 3: 4.53 s per loop

7. Run on the GPU

caffe.set_device(0)
caffe.set_mode_gpu()
net.forward()  # call once for allocation
%timeit net.forward()
Output: 1 loops, best of 3: 397 ms per loop
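%timeit is an IPython magic, so it is only available inside IPython. Outside of IPython, a rough equivalent using only the standard library is a sketch like this:

import time

start = time.time()
net.forward()                                            # one full forward pass
print("forward pass took {:.3f} s".format(time.time() - start))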

8. Feature shapes at each layer

[(k, v.data.shape) for k, v in net.blobs.items()]

Output: each entry gives the blob name followed by its shape, i.e. (batch size, number of channels/filters, height, width):

[("data", (50, 3, 227, 227)),
 ("conv1", (50, 96, 55, 55)),
 ("pool1", (50, 96, 27, 27)),
 ("norm1", (50, 96, 27, 27)),
 ("conv2", (50, 256, 27, 27)),
 ("pool2", (50, 256, 13, 13)),
 ("norm2", (50, 256, 13, 13)),
 ("conv3", (50, 384, 13, 13)),
 ("conv4", (50, 384, 13, 13)),
 ("conv5", (50, 256, 13, 13)),
 ("pool5", (50, 256, 6, 6)),
 ("fc6", (50, 4096)),
 ("fc7", (50, 4096)),
 ("fc8", (50, 1000)),
 ("prob", (50, 1000))]
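As a quick sanity check on these shapes: conv1 has 96 filters of size 11 x 11 applied with a stride of 4 (the stride comes from deploy.prototxt), so on a 227 x 227 input the output spatial size is (227 - 11) / 4 + 1 = 55, matching the ("conv1", (50, 96, 55, 55)) entry above.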

9. Parameter shapes

[(k, v[0].data.shape) for k, v in net.params.items()]
Output (the weight shapes of each layer):

[("conv1", (96, 3, 11, 11)),
 ("conv2", (256, 48, 5, 5)),
 ("conv3", (384, 256, 3, 3)),
 ("conv4", (384, 192, 3, 3)),
 ("conv5", (256, 192, 3, 3)),
 ("fc6", (4096, 9216)),
 ("fc7", (4096, 4096)),
 ("fc8", (1000, 4096))]
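Each entry of net.params is a list of [weights, biases], and the listing above only shows the weight shapes (v[0]). The bias shapes can be inspected the same way:

[(k, v[1].data.shape) for k, v in net.params.items()]   # v[1] holds the bias vector of each layer

Each bias is a 1-D vector with one entry per output channel or unit, e.g. (96,) for conv1.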

10. A helper function for visualization

# take an array of shape (n, height, width) or (n, height, width, channels)
# and visualize each (height, width) thing in a grid of size approx. sqrt(n) by sqrt(n)
def vis_square(data, padsize=1, padval=0):
    # normalize data to the [0, 1] range for display
    data -= data.min()
    data /= data.max()
    
    # force the number of filters to be square
    n = int(np.ceil(np.sqrt(data.shape[0])))
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode="constant", constant_values=(padval, padval))
    
    # tile the filters into an image
    data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])
    
    plt.imshow(data)



11. Visualize the conv1 filters (96 in total):

# the parameters are a list of [weights, biases]
filters = net.params["conv1"][0].data
vis_square(filters.transpose(0, 2, 3, 1))
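The same helper also works on activations rather than weights; following the original notebook, the responses of the first 36 conv1 filters to the cat image can be shown like this:

feat = net.blobs["conv1"].data[0, :36]   # first 36 feature maps for the first image in the batch
vis_square(feat, padval=1)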

