A Detailed Guide to Training a CNN Image Classifier on Your Own Dataset with TensorFlow

2020-01-04 15:55:49

Contributed by: a site user

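Training a convolutional neural network on image data breaks down into the following steps:

1. Read the image files
2. Generate the training batches
3. Define the model (parameter initialization, convolution and pooling layer parameters, and the network itself)
4. Train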

1 Reading the image files

import os
import numpy as np

def get_files(filename):
    # Each subdirectory of `filename` is one class; its name must be an integer.
    class_train = []
    label_train = []
    for train_class in os.listdir(filename):
        class_dir = os.path.join(filename, train_class)
        for pic in os.listdir(class_dir):
            class_train.append(os.path.join(class_dir, pic))
            label_train.append(train_class)
    temp = np.array([class_train, label_train])
    temp = temp.transpose()
    # shuffle the samples (each row is a [path, label] pair)
    np.random.shuffle(temp)
    # after the transpose, image paths are in column 0 and labels in column 1
    image_list = list(temp[:, 0])
    label_list = list(temp[:, 1])
    label_list = [int(i) for i in label_list]
    return image_list, label_list

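Here the name of each subdirectory serves as the label, that is, the class. Its data type has to be fixed (an integer here), because it is converted to a tensor later on.

The image paths and labels are then converted to lists, because some of the TensorFlow functions used later take list data as input.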
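The directory convention above can be sketched without TensorFlow. The example below is only an illustration: the helper names `list_labeled_files` and `build_demo_tree` and the file names are hypothetical, and it assumes class folders named `0` and `1`.

```python
import os
import tempfile

def list_labeled_files(root):
    """Return (path, label) pairs, using each subdirectory name as an int label."""
    pairs = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        for file_name in sorted(os.listdir(class_dir)):
            pairs.append((os.path.join(class_dir, file_name), int(class_name)))
    return pairs

def build_demo_tree(root):
    """Create an empty stand-in dataset: two files in class 0, one in class 1."""
    for class_name, files in {"0": ["a.jpg", "b.jpg"], "1": ["c.jpg"]}.items():
        os.makedirs(os.path.join(root, class_name))
        for file_name in files:
            open(os.path.join(root, class_name, file_name), "w").close()

with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    demo_pairs = list_labeled_files(demo_root)
    demo_labels = [label for _, label in demo_pairs]
    print(demo_labels)
```

A folder whose name is not an integer would make `int(class_name)` raise a `ValueError`, which is the same constraint that `get_files` places on the dataset.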

2 Generating the training batches

import tensorflow as tf  # uses the TensorFlow 1.x queue-based input API

def get_batches(image, label, resize_w, resize_h, batch_size, capacity):
    # convert the lists of image paths and labels to tensors
    image = tf.cast(image, tf.string)
    label = tf.cast(label, tf.int64)
    queue = tf.train.slice_input_producer([image, label])
    label = queue[1]
    image_c = tf.read_file(queue[0])
    image = tf.image.decode_jpeg(image_c, channels=3)
    # crop or pad to the target size (arguments are height first, then width)
    image = tf.image.resize_image_with_crop_or_pad(image, resize_h, resize_w)
    # (x - mean) / adjusted_stddev
    image = tf.image.per_image_standardization(image)
    image_batch, label_batch = tf.train.batch([image, label],
                                              batch_size=batch_size,
                                              num_threads=64,
                                              capacity=capacity)
    images_batch = tf.cast(image_batch, tf.float32)
    labels_batch = tf.reshape(label_batch, [batch_size])
    return images_batch, labels_batch

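First, tf.cast converts the lists to TensorFlow data types, and tf.train.slice_input_producer builds an input queue from them.

The label needs no further processing. The image entry only holds a path, so the next few calls read the file at that path and decode it into an image that can be used for training.

A CNN is sensitive to the size of its input. The call to tf.image.resize_image_with_crop_or_pad brings every image to the same size (by center-cropping or zero-padding, not by scaling), and tf.image.per_image_standardization then standardizes each image by subtracting that image's own mean and dividing by its adjusted standard deviation, which makes training easier.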
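The standardization step can be written out by hand. This is a minimal pure-Python sketch of the formula (x - mean) / adjusted_stddev, where adjusted_stddev = max(stddev, 1/sqrt(N)) and N is the number of values in the image. The function name `per_image_standardize` and the flat-list representation are assumptions made for illustration, not the library implementation.

```python
import math

def per_image_standardize(pixels):
    """Standardize one image given as a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # The floor 1/sqrt(n) keeps a uniform image from dividing by zero.
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]

flat_image = [0.0, 64.0, 128.0, 255.0]
standardized = per_image_standardize(flat_image)  # zero mean, unit variance
uniform = per_image_standardize([7.0, 7.0, 7.0, 7.0])  # all zeros, no division error
```

Because the statistics are computed per image, no dataset-wide mean has to be stored for use at test time.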

Next, tf.train.batch groups the examples into training batches.

Finally, the batch is cast to the right data type and reshaped, which yields the batches used for training.
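What the queue pipeline delivers can be approximated with a plain generator: an endless stream of fixed-size batches, reshuffled on each pass, in which every path stays paired with its label. The sketch below is an assumption-laden stand-in (the name `batch_generator` and the demo file names are hypothetical), not a replacement for the TensorFlow input queue, and it expects a non-empty dataset.

```python
import random

def batch_generator(paths, labels, batch_size, seed=0):
    """Yield (paths, labels) batches forever, reshuffling on every pass."""
    rng = random.Random(seed)
    order = list(range(len(paths)))
    pending = []
    while True:
        rng.shuffle(order)
        for i in order:
            pending.append(i)
            if len(pending) == batch_size:
                # Index both lists with the same positions to keep pairs aligned.
                yield [paths[j] for j in pending], [labels[j] for j in pending]
                pending = []

demo_paths = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg"]
demo_labels = [0, 1, 0, 1, 0]
gen = batch_generator(demo_paths, demo_labels, batch_size=2)
demo_batches = [next(gen) for _ in range(4)]
```

A batch may straddle two passes over the data, so every batch has exactly `batch_size` elements, which matches the fixed `[batch_size]` reshape in `get_batches`.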

3 Defining the model

(1) Defining and initializing the trainable parameters

def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# init weights
weights = {
    "w1": init_weights([3, 3, 3, 16]),
    "w2": init_weights([3, 3, 16, 128]),
    "w3": init_weights([3, 3, 128, 256]),
    "w4": init_weights([4096, 4096]),
    "wo": init_weights([4096, 2])
}

# init biases
biases = {
    "b1": init_weights([16]),
    "b2": init_weights([128]),
    "b3": init_weights([256]),
    "b4": init_weights([4096]),
    "bo": init_weights([2])
}

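Each layer of a CNN is a decision model of the form y = wx + b. The convolutional layers produce feature vectors, which are then substituted for x in the computation, so initial parameters (weights and biases) have to be defined for every layer. The shape of "w4" is explained further below.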

(2) Defining the operations of the different layers

def conv2d(x, w, b):
    x = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def pooling(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

def norm(x, lsize=4):
    return tf.nn.lrn(x, depth_radius=lsize, bias=1, alpha=0.001 / 9.0, beta=0.75)

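Only three kinds of layer are defined here: a convolutional layer, a pooling layer, and a normalization layer (local response normalization, tf.nn.lrn).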
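The normalization layer is the least familiar of the three, so here is a hedged pure-Python sketch of local response normalization for the channel values of a single pixel. Each value is divided by (bias + alpha * sum of squares over neighbouring channels) raised to beta. The name `lrn_1d` is hypothetical, and the defaults simply mirror the arguments passed to `norm` above.

```python
def lrn_1d(channels, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75):
    """Local response normalization across the channel values of one pixel."""
    out = []
    for d, value in enumerate(channels):
        # The window covers up to depth_radius channels on each side of channel d.
        lo = max(0, d - depth_radius)
        hi = min(len(channels), d + depth_radius + 1)
        sqr_sum = sum(c * c for c in channels[lo:hi])
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

demo_channels = [1.0, 2.0, 3.0, 4.0]
demo_normalized = lrn_1d(demo_channels)
```

With such a small alpha the output stays close to the input; the effect only becomes noticeable when neighbouring channels carry large activations.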

(3) Defining the model

def mmodel(images):
    l1 = conv2d(images, weights["w1"], biases["b1"])
    l2 = pooling(l1)
    l2 = norm(l2)
    l3 = conv2d(l2, weights["w2"], biases["b2"])
    l4 = pooling(l3)
    l4 = norm(l4)
    l5 = conv2d(l4, weights["w3"], biases["b3"])
    l6 = pooling(l5)
    # the first dimension after the reshape must equal the batch size
    l6 = tf.reshape(l6, [-1, weights["w4"].get_shape().as_list()[0]])
    l7 = tf.nn.relu(tf.matmul(l6, weights["w4"]) + biases["b4"])
    soft_max = tf.add(tf.matmul(l7, weights["wo"]), biases["bo"])
    return soft_max

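The model is fairly simple and uses three convolutional layers. The l7 step is a fully connected layer, so the feature maps have to be reshaped first: l6 is given the shape [-1, first dimension of "w4"]. The first dimension of "w4" must therefore be chosen so that the -1 position works out to batch_size. Then, after the final multiplication by "wo", the output has shape [batch_size, class_num].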
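The value 4096 in "w4" follows from the input size. As a sketch of the arithmetic, assuming 32x32 inputs (the size used in the training call later in the article), stride-1 SAME convolutions that keep the spatial size, and three stride-2 SAME pooling layers, the helper names below are hypothetical:

```python
import math

def same_pool_output(size, stride=2):
    """Spatial size after a SAME-padded pooling layer with the given stride."""
    return math.ceil(size / stride)

def flattened_size(input_size, num_pools, channels):
    """Number of features per image after the last pooling layer."""
    size = input_size
    for _ in range(num_pools):
        size = same_pool_output(size)
    return size * size * channels

fc_inputs = flattened_size(input_size=32, num_pools=3, channels=256)
print(fc_inputs)  # 32 -> 16 -> 8 -> 4, and 4 * 4 * 256 = 4096

# With 64x64 inputs each image would carry 16384 features, so reshaping to
# [-1, 4096] would produce 4 rows per image instead of one.
rows_per_image = flattened_size(64, 3, 256) // fc_inputs
```

This is why a different input size requires a different first dimension for "w4".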

(4) Defining the evaluation metrics

First define the loss function. This is the quantity that training minimizes.

def loss(logits, label_batches):
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label_batches)
    cost = tf.reduce_mean(cross_entropy)
    return cost

def get_accuracy(logits, labels):
    acc = tf.nn.in_top_k(logits, labels, 1)
    acc = tf.cast(acc, tf.float32)
    acc = tf.reduce_mean(acc)
    return acc

get_accuracy measures the classification accuracy. During training the loss should go down and the accuracy should go up; only then is the training converging.
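Both metrics are easy to check by hand on a few examples. The following is a minimal pure-Python sketch, with hypothetical function names and made-up logits: the loss for one example is -log(softmax(logits)[label]), and top-1 accuracy is the fraction of examples whose largest logit sits at the label index. How exact ties are scored may differ from tf.nn.in_top_k, so the demo avoids them.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]) for one example, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def mean_loss(batch_logits, labels):
    losses = [sparse_softmax_cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

def top1_accuracy(batch_logits, labels):
    hits = [max(range(len(z)), key=z.__getitem__) == y
            for z, y in zip(batch_logits, labels)]
    return sum(hits) / len(hits)

demo_logits = [[2.0, 0.0], [0.0, 1.0], [-1.0, 3.0]]
demo_targets = [0, 1, 0]  # the third example is misclassified
demo_loss = mean_loss(demo_logits, demo_targets)
demo_acc = top1_accuracy(demo_logits, demo_targets)
```

An untrained two-class model that outputs equal logits has a loss of ln 2 (about 0.693), which is a useful reference value when reading the training log.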

(5) Defining the training method

def training(loss, lr):
    train_op = tf.train.RMSPropOptimizer(lr, 0.9).minimize(loss)
    return train_op

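There are many optimizers to choose from; see the official documentation for the full list. Note that a different optimizer may require the parameters above to be defined differently and handled separately, otherwise errors may occur.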
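For intuition about what the optimizer does on each step, here is a sketch of the textbook RMSProp update on a one-variable problem. It is an illustration under stated assumptions, not the TensorFlow implementation: library versions can differ in details such as how the accumulator is initialized and where epsilon is added. The function name and the toy objective f(x) = x^2 are hypothetical; the decay of 0.9 matches the second argument passed to the optimizer above.

```python
import math

def rmsprop_minimize(grad_fn, x0, lr=0.01, decay=0.9, epsilon=1e-10, steps=500):
    """Minimize a scalar function of one variable with textbook RMSProp."""
    x = x0
    mean_square = 0.0  # running average of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        mean_square = decay * mean_square + (1.0 - decay) * g * g
        # Dividing by the RMS of recent gradients normalizes the step size.
        x -= lr * g / math.sqrt(mean_square + epsilon)
    return x

# Gradient of f(x) = x^2 is 2x; the minimum is at x = 0.
x_final = rmsprop_minimize(lambda x: 2.0 * x, x0=1.0)
```

Because the step is normalized by the gradient magnitude, the iterate ends up hovering near the minimum with steps on the order of the learning rate rather than settling exactly on it.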

4 Training

import numpy as np
import tensorflow as tf
import inputData  # module containing get_files and get_batches
import model      # module containing mmodel, loss, get_accuracy and training

def run_training():
    data_dir = 'C:/Users/wk/Desktop/bky/dataSet/'
    image, label = inputData.get_files(data_dir)
    image_batches, label_batches = inputData.get_batches(image, label, 32, 32, 16, 20)
    p = model.mmodel(image_batches)
    cost = model.loss(p, label_batches)
    train_op = model.training(cost, 0.001)
    acc = model.get_accuracy(p, label_batches)

    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)

    # start the threads that fill the input queue
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        for step in np.arange(1000):
            print(step)
            if coord.should_stop():
                break
            _, train_acc, train_loss = sess.run([train_op, acc, cost])
            print("loss:{} accuracy:{}".format(train_loss, train_acc))
    except tf.errors.OutOfRangeError:
        print("Done!!!")
    finally:
        coord.request_stop()
    coord.join(threads)
    sess.close()

While training a neural network, the model should be saved so that training can be resumed later or the trained model can be used for testing. To do that, create a saver that writes the model to disk.

import os

def run_training():
    data_dir = 'C:/Users/wk/Desktop/bky/dataSet/'
    log_dir = 'C:/Users/wk/Desktop/bky/log/'
    image, label = inputData.get_files(data_dir)
    image_batches, label_batches = inputData.get_batches(image, label, 32, 32, 16, 20)
    print(image_batches.shape)
    # this call uses the rewritten mmodel(images, batch_size) shown further below
    p = model.mmodel(image_batches, 16)
    cost = model.loss(p, label_batches)
    train_op = model.training(cost, 0.001)
    acc = model.get_accuracy(p, label_batches)

    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    saver = tf.train.Saver()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        for step in np.arange(1000):
            print(step)
            if coord.should_stop():
                break
            _, train_acc, train_loss = sess.run([train_op, acc, cost])
            print("loss:{} accuracy:{}".format(train_loss, train_acc))
            # write a checkpoint every 100 steps
            if step % 100 == 0:
                check = os.path.join(log_dir, "model.ckpt")
                saver.save(sess, check, global_step=step)
    except tf.errors.OutOfRangeError:
        print("Done!!!")
    finally:
        coord.request_stop()
    coord.join(threads)
    sess.close()

Information about the saved models is recorded in a file named checkpoint, which looks roughly like this:

model_checkpoint_path: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-0"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"

Several other files are generated as well; they hold the model parameters and related data. At test time the program reads the checkpoint file to locate and load these actual data files.
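To make the file format concrete, the sketch below parses the text of such a checkpoint file and extracts the latest path and its step number. It is for illustration only: the function name `parse_checkpoint_index` is hypothetical, and real code would normally rely on tf.train.get_checkpoint_state rather than parsing the file by hand.

```python
def parse_checkpoint_index(text):
    """Return (latest_path, all_paths) from the text of a `checkpoint` file."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        # Split at the first colon only, so "C:/..." in the value stays intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        path = value.strip().strip('"')
        if key.strip() == "model_checkpoint_path":
            latest = path
        elif key.strip() == "all_model_checkpoint_paths":
            all_paths.append(path)
    return latest, all_paths

demo_text = '''model_checkpoint_path: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-0"
all_model_checkpoint_paths: "C:/Users/wk/Desktop/bky/log/model.ckpt-100"'''

latest_path, all_paths = parse_checkpoint_index(demo_text)
# The training step is the suffix after the last "-" in the path.
latest_step = int(latest_path.rsplit("-", 1)[-1])
```

The step suffix comes from the `global_step` argument passed to `saver.save` during training.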

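Once the network has been built and trained, testing it directly with the earlier code fails with a shape mismatch error, roughly stating that the input of the convolutional layer does not match the shape of the image. The cause is that the earlier code defined weights and biases outside the model function, so calling the model raises a ValueError.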

The parameters therefore have to be defined inside the model, so that loading the trained parameters actually initializes the model. The model function is rewritten as follows:

def mmodel(images, batch_size):
    with tf.variable_scope('conv1') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 3, 16],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[16],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(images, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name=scope.name)

    with tf.variable_scope('pooling1_lrn') as scope:
        pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                               padding='SAME', name='pooling1')
        norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm1')

    with tf.variable_scope('conv2') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 16, 128],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[128],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(norm1, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name='conv2')

    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm2')
        pool2 = tf.nn.max_pool(norm2, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1],
                               padding='SAME', name='pooling2')

    with tf.variable_scope('local3') as scope:
        # flatten each image; dim is read from the static shape
        reshape = tf.reshape(pool2, shape=[batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = tf.get_variable('weights',
                                  shape=[dim, 4096],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[4096],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)

    with tf.variable_scope('softmax_linear') as scope:
        weights = tf.get_variable('softmax_linear',
                                  shape=[4096, 2],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[2],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        softmax_linear = tf.add(tf.matmul(local3, weights), biases, name='softmax_linear')
    return softmax_linear

Testing the trained model

First, load a single test image:

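from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

def get_one_image(img_dir):
    image = Image.open(img_dir)
    plt.imshow(image)
    # resize to the 32x32 input size used during training
    image = image.resize([32, 32])
    image_arr = np.array(image)
    return image_arr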

Load the model and compute the test result:

def test(test_file):
    log_dir = 'C:/Users/wk/Desktop/bky/log/'
    image_arr = get_one_image(test_file)

    with tf.Graph().as_default():
        # feed the image through a placeholder so the graph actually depends on it
        x = tf.placeholder(tf.float32, shape=[32, 32, 3])
        image = tf.image.per_image_standardization(x)
        image = tf.reshape(image, [1, 32, 32, 3])
        print(image.shape)
        p = model.mmodel(image, 1)
        logits = tf.nn.softmax(p)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            ckpt = tf.train.get_checkpoint_state(log_dir)
            if ckpt and ckpt.model_checkpoint_path:
                global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
                saver.restore(sess, ckpt.model_checkpoint_path)
                print('Loading success')
            else:
                print('No checkpoint')
            prediction = sess.run(logits, feed_dict={x: image_arr})
            # index of the most probable class
            max_index = np.argmax(prediction)
            print(max_index)

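The first part standardizes the test image into the form the network expects as input. The block from tf.train.get_checkpoint_state to saver.restore loads the model files, after which the image is simply fed through the model.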
