How to Play AlphaGo-Style Gomoku with a CNN (Gomoku Master)

Author | 李秋键
Editor | 郭芮

Produced by | CSDN (ID: CSDNnews)

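In recent years, AI in games has developed rapidly. In particular, since AlphaGo defeated top human Go players, interest in AI has surged, and game AI has seen broader and deeper application. Game AI in fact has a long history: many games revolve around fighting an "enemy", and that enemy is the AI. Some are low-level AIs whose behavior is fixed and never varies; others are slightly more advanced and introduce random factors. Either way, such an AI is essentially a fixed script, and once the player works out its patterns, the game quickly loses its appeal.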

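A deep learning AI is different. It has many layers of parameters and many possible choices, which makes it behave more intelligently and makes it very hard for a player to find its patterns. This is one of the main things deep learning brings to games. Today we will use a CNN to build an intelligent Gomoku player.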

Preparation

We use Python 3.6.5. The program was tested on Windows 10, Windows 7, Linux, and macOS, which also illustrates how portable and easy to migrate Python code is across platforms.

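The Python libraries used are: tkinter, to lay out the board and implement play; SGFfile, to read game records and load the training model; os, to read and write local files; and TensorFlow, to build and train the CNN. The code below uses the TensorFlow 1.x API (for example tf.placeholder and tf.InteractiveSession).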

Building the Board

1. Initializing the board:

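The attributes set during initialization have the following meanings:

someoneWin: whether someone has won
humanChessed: whether the human player has made a move
IsStart: whether the game has started
player: which side the player is on
playmethod: game mode, playing against the robot or against the AI
bla_start_pos: the center position where Black plays its opening move
bla_chessed: the stones Black has already played
whi_chessed: the stones White has already played
board: the board
window: the window
var: a variable that records the chosen player color
var1: a variable that records the choice between robot and AI
can: the canvas used to draw the board
net_board: the coordinates of the board's points
robot: the robot
sgf: the game record handler
cnn: the CNN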

The code is as follows:

    from tkinter import Tk, IntVar, Canvas, Label

    # Methods of the game class. Robot, SGFflie and myCNN come from the project's own modules.
    def __init__(self):
        self.someoneWin = False
        self.humanChessed = False
        self.IsStart = False
        self.player = 0
        self.playmethod = 0
        self.bla_start_pos = [235, 235]
        self.whi_chessed = []
        self.bla_chessed = []
        self.board = self.init_board()
        self.window = Tk()
        self.var = IntVar()
        self.var.set(0)
        self.var1 = IntVar()
        self.var1.set(0)
        self.window.title("myGoBang")
        # Window size 600x470, placed 80 px from the left and top of the screen
        self.window.geometry("600x470+80+80")
        self.window.resizable(0, 0)
        self.can = Canvas(self.window, bg="#EEE8AC", width=470, height=470)
        self.draw_board()
        self.can.grid(row=0, column=0)
        self.net_board = self.get_net_board()
        self.robot = Robot(self.board)
        self.sgf = SGFflie()
        self.cnn = myCNN()
        self.cnn.restore_save()

    def init_board(self):
        """Initialize the board: a 15x15 grid where -1 means empty"""
        list1 = [[-1]*15 for i in range(15)]
        return list1

2. Board layout:

The main job here is to draw the board and the stones. The code is as follows:

    def draw_board(self):
        """Draw the board"""
        # Horizontal lines; the outer border is drawn thicker
        for row in range(15):
            if row == 0 or row == 14:
                self.can.create_line((25, 25 + row * 30), (445, 25 + row * 30), width=2)
            else:
                self.can.create_line((25, 25 + row * 30), (445, 25 + row * 30), width=1)
        # Vertical lines
        for col in range(15):
            if col == 0 or col == 14:
                self.can.create_line((25 + col * 30, 25), (25 + col * 30, 445), width=2)
            else:
                self.can.create_line((25 + col * 30, 25), (25 + col * 30, 445), width=1)
        # The five star points
        self.can.create_oval(112, 112, 118, 118, fill="black")
        self.can.create_oval(352, 112, 358, 118, fill="black")
        self.can.create_oval(112, 352, 118, 358, fill="black")
        self.can.create_oval(232, 232, 238, 238, fill="black")
        self.can.create_oval(352, 352, 358, 358, fill="black")

    def draw_chessed(self):
        """Draw the stones that have already been played"""
        if len(self.whi_chessed) != 0:
            for tmp in self.whi_chessed:
                oval = pos_to_draw(*tmp[0:2])
                self.can.create_oval(oval, fill="white")
        if len(self.bla_chessed) != 0:
            for tmp in self.bla_chessed:
                oval = pos_to_draw(*tmp[0:2])
                self.can.create_oval(oval, fill="black")

    def draw_a_chess(self, x, y, player=None):
        """Draw a single stone on the board"""
        _x, _y = pos_in_qiju(x, y)
        oval = pos_to_draw(x, y)
        if player == 0:
            # Black: stored as 1 in the board array
            self.can.create_oval(oval, fill="black")
            self.bla_chessed.append([x, y, 0])
            self.board[_x][_y] = 1
        elif player == 1:
            # White: stored as 0 in the board array
            self.can.create_oval(oval, fill="white")
            self.whi_chessed.append([x, y, 1])
            self.board[_x][_y] = 0
        else:
            print(AttributeError("Please choose a player"))
        return
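The drawing code calls pos_in_qiju and pos_to_draw, which the article does not show. The sketch below is one plausible version, assuming the geometry used in draw_board (first line at 25 px, lines 30 px apart). The stone radius and the snap_to_grid helper are hypothetical and not from the original project.

```python
# Hypothetical helpers: the article calls pos_in_qiju and pos_to_draw
# but never shows them. MARGIN and CELL follow draw_board; RADIUS is assumed.
MARGIN = 25   # pixel offset of the first grid line
CELL = 30     # pixel distance between neighbouring lines
RADIUS = 12   # assumed stone radius in pixels


def pos_in_qiju(x, y):
    """Convert the pixel coordinates of an intersection to grid indices."""
    return (x - MARGIN) // CELL, (y - MARGIN) // CELL


def pos_to_draw(x, y):
    """Return the bounding box (x0, y0, x1, y1) of a stone centred at (x, y)."""
    return x - RADIUS, y - RADIUS, x + RADIUS, y + RADIUS


def snap_to_grid(px, py):
    """Snap an arbitrary click position to the nearest intersection (in pixels)."""
    col = min(14, max(0, round((px - MARGIN) / CELL)))
    row = min(14, max(0, round((py - MARGIN) / CELL)))
    return MARGIN + col * CELL, MARGIN + row * CELL
```

With these definitions the center point (235, 235) maps to grid index (7, 7), which matches bla_start_pos in the initialization code.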

3. Win condition:

The winner is decided by checking whether five stones of one color lie in a line.

    def have_five(self, chessed):
        """Check whether there are five stones in a line"""
        if len(chessed) == 0:
            return False
        for row in range(15):
            for col in range(15):
                x = 25 + row * 30
                y = 25 + col * 30
                # Along the y axis
                if self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x, y + 30), chessed) == True and \
                        self.check_chessed((x, y + 60), chessed) == True and \
                        self.check_chessed((x, y + 90), chessed) == True and \
                        self.check_chessed((x, y + 120), chessed) == True:
                    return True
                # Along the x axis
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y), chessed) == True and \
                        self.check_chessed((x + 60, y), chessed) == True and \
                        self.check_chessed((x + 90, y), chessed) == True and \
                        self.check_chessed((x + 120, y), chessed) == True:
                    return True
                # Main diagonal
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y + 30), chessed) == True and \
                        self.check_chessed((x + 60, y + 60), chessed) == True and \
                        self.check_chessed((x + 90, y + 90), chessed) == True and \
                        self.check_chessed((x + 120, y + 120), chessed) == True:
                    return True
                # Anti-diagonal
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y - 30), chessed) == True and \
                        self.check_chessed((x + 60, y - 60), chessed) == True and \
                        self.check_chessed((x + 90, y - 90), chessed) == True and \
                        self.check_chessed((x + 120, y - 120), chessed) == True:
                    return True
                else:
                    pass
        return False

    def check_win(self):
        """Check whether someone has won"""
        if self.have_five(self.whi_chessed) == True:
            label = Label(self.window, text="White Win!", background='#FFF8DC', font=("宋体", 15, "bold"))
            label.place(relx=0, rely=0, x=480, y=40)
            return True
        elif self.have_five(self.bla_chessed) == True:
            label = Label(self.window, text="Black Win!", background='#FFF8DC', font=("宋体", 15, "bold"))
            label.place(relx=0, rely=0, x=480, y=40)
            return True
        else:
            return False
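The same four-direction scan can also be run directly on the 15x15 board array instead of on pixel coordinates. The sketch below is an alternative formulation, not the author's code. It assumes the board encoding from draw_a_chess (-1 empty, 1 black, 0 white), and has_five is a hypothetical name.

```python
def has_five(board, stone):
    """Return True if `stone` occupies five consecutive cells in any line."""
    size = len(board)
    # Right, down, down-right, down-left: every line is found from one of its ends.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(size):
        for c in range(size):
            if board[r][c] != stone:
                continue
            for dr, dc in directions:
                end_r, end_c = r + 4 * dr, c + 4 * dc
                # Skip directions where five cells would run off the board.
                if not (0 <= end_r < size and 0 <= end_c < size):
                    continue
                if all(board[r + k * dr][c + k * dc] == stone for k in range(5)):
                    return True
    return False


# Usage: five black stones (value 1) in row 7, columns 3 to 7.
demo = [[-1] * 15 for _ in range(15)]
for col in range(3, 8):
    demo[7][col] = 1
black_wins = has_five(demo, 1)
white_wins = has_five(demo, 0)
```

Working on grid indices avoids repeating the 30-pixel offsets and makes the bounds check explicit.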

The resulting UI looks like this:

Deep Learning Model

1. Initializing the neural network:

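The first two layers are convolutional layers, each followed by 2x2 max pooling. They are followed by a fully connected layer with dropout, and finally a softmax output layer over the 225 board positions. This is essentially an ordinary CNN. The basic parameters are in the code below: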

    import tensorflow as tf  # TensorFlow 1.x API

    # Methods of the myCNN class
    def __init__(self):
        '''Initialize the neural network'''
        self.sess = tf.InteractiveSession()
        # paras
        self.W_conv1 = self.weight_varible([5, 5, 1, 32])
        self.b_conv1 = self.bias_variable([32])
        # conv layer-1
        self.x = tf.placeholder(tf.float32, [None, 225])
        self.y = tf.placeholder(tf.float32, [None, 225])
        self.x_image = tf.reshape(self.x, [-1, 15, 15, 1])
        self.h_conv1 = tf.nn.relu(self.conv2d(self.x_image, self.W_conv1) + self.b_conv1)
        self.h_pool1 = self.max_pool_2x2(self.h_conv1)
        # conv layer-2
        self.W_conv2 = self.weight_varible([5, 5, 32, 64])
        self.b_conv2 = self.bias_variable([64])
        self.h_conv2 = tf.nn.relu(self.conv2d(self.h_pool1, self.W_conv2) + self.b_conv2)
        self.h_pool2 = self.max_pool_2x2(self.h_conv2)
        # full connection
        self.W_fc1 = self.weight_varible([4 * 4 * 64, 1024])
        self.b_fc1 = self.bias_variable([1024])
        self.h_pool2_flat = tf.reshape(self.h_pool2, [-1, 4 * 4 * 64])
        self.h_fc1 = tf.nn.relu(tf.matmul(self.h_pool2_flat, self.W_fc1) + self.b_fc1)
        # dropout
        self.keep_prob = tf.placeholder(tf.float32)
        self.h_fc1_drop = tf.nn.dropout(self.h_fc1, self.keep_prob)
        # output layer: softmax
        self.W_fc2 = self.weight_varible([1024, 225])
        self.b_fc2 = self.bias_variable([225])
        self.y_conv = tf.nn.softmax(tf.matmul(self.h_fc1_drop, self.W_fc2) + self.b_fc2)
        # model training
        self.cross_entropy = -tf.reduce_sum(self.y * tf.log(self.y_conv))
        self.train_step = tf.train.AdamOptimizer(1e-3).minimize(self.cross_entropy)
        self.correct_prediction = tf.equal(tf.argmax(self.y_conv, 1), tf.argmax(self.y, 1))
        self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
        self.saver = tf.train.Saver()
        init = tf.global_variables_initializer()  # initialize variables if no saved model exists
        self.sess.run(init)

    def weight_varible(self, shape):
        '''Weight variable'''
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(self, shape):
        '''Bias variable'''
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(self, x, W):
        '''Convolution'''
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(self, x):
        '''2x2 max pooling'''
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
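The 4 * 4 * 64 input size of the fully connected layer follows from the padding mode. With padding='SAME', a stride-1 convolution keeps the spatial size, and a stride-2 pooling layer produces ceil(n / 2) outputs. The short calculation below checks this for the 15x15 board; same_pool_output is a hypothetical helper name used only for illustration.

```python
import math


def same_pool_output(size, stride=2):
    """Spatial size after a pooling layer with padding='SAME'."""
    return math.ceil(size / stride)


board = 15
after_pool1 = same_pool_output(board)           # 15 -> 8
after_pool2 = same_pool_output(after_pool1)     # 8 -> 4
flat_features = after_pool2 * after_pool2 * 64  # 64 channels after conv layer-2
print(after_pool1, after_pool2, flat_features)  # 8 4 1024
```

That the flattened size equals the 1024 units of the fully connected layer is a coincidence of these particular dimensions.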

2. Saving and restoring the model:

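    def restore_save(self, method=1):
        '''Save or restore the model (method=1: restore, method=0: save)'''
        # A forward slash works on Windows, Linux and macOS alike;
        # 'save\model.ckpt' is only a directory path on Windows.
        if method == 1:
            self.saver.restore(self.sess, 'save/model.ckpt')
            #print("Model restored")
        elif method == 0:
            saver = tf.train.Saver(write_version=tf.train.SaverDef.V2)
            saver.save(self.sess, 'save/model.ckpt')
            #print('Model saved')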

3. Prediction and training functions:

    def predition(self, qiju):
        '''Prediction function: returns the pixel position of the predicted move'''
        _qiju = self.createdataformqiju(qiju)
        pre = self.sess.run(tf.argmax(self.y_conv, 1), feed_dict={self.x: _qiju, self.keep_prob: 1.0})
        point = [0, 0]
        l = pre[0]
        # Convert the flat index (0..224) back to canvas coordinates
        for i in range(15):
            if ((i + 1) * 15) > l:
                point[0] = int(i * 30 + 25)
                point[1] = int((l - i * 15) * 30 + 25)
                break
        return point

    def train(self, qiju):
        '''Training function'''
        sgf = SGFflie()
        _x, _y = sgf.createTraindataFromqipu(qiju)
        for i in range(10):
            # keep_prob must be fed as well, because train_step depends on the dropout layer
            self.sess.run(self.train_step, feed_dict={
                self.x: _x,
                self.y: _y,
                self.keep_prob: 0.5
            })
        self.restore_save(method=0)

    def train1(self, x, y):
        '''Another training function'''
        for i in range(100):
            self.sess.run(self.train_step, feed_dict={
                self.x: x,
                self.y: y,
                self.keep_prob: 0.5
            })
        print('Finished one round of training')
        #self.restore_save(method=0)
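The loop in predition that turns the argmax index into a canvas position is equivalent to a divmod by the board width. The sketch below shows that mapping in isolation. The function names are hypothetical, and the margin of 25 px and spacing of 30 px are taken from the drawing code.

```python
def index_to_pixel(index, size=15, margin=25, cell=30):
    """Map a flat network output index (0..224) to canvas coordinates."""
    i, j = divmod(index, size)
    return [i * cell + margin, j * cell + margin]


def pixel_to_index(x, y, size=15, margin=25, cell=30):
    """Inverse mapping: canvas coordinates of an intersection to a flat index."""
    return ((x - margin) // cell) * size + (y - margin) // cell


center = index_to_pixel(112)             # index 112 is the cell at (7, 7)
round_trip = pixel_to_index(*center)
```

Here center is [235, 235], the same point as bla_start_pos.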

4. Generating the data:

    def createdataformqiju(self, qiju):
        '''Turn the 15x15 board into a single 225-element input row'''
        data = []
        tmp = []
        for row in qiju:
            for point in row:
                if point == -1:
                    tmp.append(0.0)    # empty
                elif point == 0:
                    tmp.append(2.0)    # white
                elif point == 1:
                    tmp.append(1.0)    # black
        data.append(tmp)
        return data

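The difference between using a CNN on a board and using it for image recognition lies in the input. In image recognition, the inputs are the pixel values of the image itself. Here, the inputs are custom board features taken from game records: each board position, starting from the top-left corner, has a predefined slot, so loading a game record tells the computer where the black and white stones currently are. The network is trained on these positions together with the game outcomes, and the trained model can then predict the move most likely to lead to a win, which gives the effect of intelligent play.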
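As a minimal sketch of this input format, the code below encodes a board into the 225-value vector used by createdataformqiju and builds a one-hot target for a move. The one-hot label layout is an assumption: it is consistent with the [None, 225] shape of the y placeholder and the argmax in the accuracy computation, but createTraindataFromqipu is not shown in the article. The names encode_board and one_hot_move are hypothetical.

```python
EMPTY, WHITE, BLACK = -1, 0, 1  # board values used in the article's code


def encode_board(board):
    """Flatten a 15x15 board row by row into 225 floats (0 empty, 1 black, 2 white)."""
    mapping = {EMPTY: 0.0, BLACK: 1.0, WHITE: 2.0}
    return [mapping[value] for row in board for value in row]


def one_hot_move(row, col, size=15):
    """Assumed label format: 1.0 at the index of the move to play, 0.0 elsewhere."""
    label = [0.0] * (size * size)
    label[row * size + col] = 1.0
    return label


# Usage: Black has played the center, White has answered next to it.
board = [[EMPTY] * 15 for _ in range(15)]
board[7][7] = BLACK
board[7][8] = WHITE
x = encode_board(board)   # one input row for the network
y = one_hot_move(6, 7)    # target: Black's next move at grid cell (6, 7)
```

Each (x, y) pair then corresponds to one row of the self.x and self.y placeholders.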

Final result: