OpenCV Recipes: Artificial Neural Networks

In this post, you will learn how to build an ANN and train it to perform image classification and object recognition.

Machine Learning and Artificial Neural Networks

Artificial neural networks (ANNs) are a subset of machine learning (ML). Inspired by the human understanding process, an ANN works like a human brain: it is composed of different interconnected layers of neurons, where each neuron receives information from the previous ones, processes it, and sends it on to the next, until the final output is produced. The feedback that tunes the network comes from labeled outputs in the case of supervised learning, or from criteria matching in the case of unsupervised learning.

What characterizes an ANN? Machine learning is defined as the field of computer science focused on finding patterns within datasets, and artificial neural networks pursue this purpose by emulating the connectivity of the human brain, splitting pattern detection across several layers of nodes, which we call neurons.

At the same time, other machine learning algorithms such as support vector machines (SVMs), built around targeted pattern recognition and classification, have also become widely popular; SVMs are among the most accurate machine learning algorithms. ANNs, however, have a broader range of application: they can detect patterns in most kinds of data structures (SVMs mainly work with feature vectors), and they can be parameterized more extensively to achieve different goals within the same implementation.

Furthermore, another advantage of ANNs over other ML strategies is that an ANN is a probabilistic classifier that allows multi-class classification, meaning it can detect more than one object in an image; an SVM, on the other hand, is a non-probabilistic binary classifier.

How does an ANN work?

An MLP network is formed of at least three layers:

  • Input layer: This is a passive layer, which means it does not modify the data. It receives information from the outside world and sends it into the network. The number of nodes (neurons) in this layer depends on the number of features or descriptive pieces of information extracted from the image. For example, when feature vectors are used, there is one node for each column of the vector.
  • Hidden layer: It transforms the inputs into something that the output layer or another hidden layer can use. This layer behaves like a black box, perceiving patterns in the inputs it receives and evaluating the weight of each one. Its behavior is defined by the equation of its activation function.
  • Output layer: The number of nodes in this layer is defined by the neural network we have chosen to build; for a classifier, it typically has one node per target class.

Suppose we want to build a three-layer neural network, with one layer of each type. The number of nodes in the input layer is determined by the dimensionality of the data, and the number of nodes in the output layer is defined by the number of models (classes). The number of nodes, and even of layers, in the hidden part depends on the complexity of the problem and on the accuracy we want to bring to the network: higher dimensionality improves the accuracy of the results, but it also increases the computational cost. Another decision for the hidden layer is which activation function to use. A common choice is the sigmoid function, but there are other options, such as tanh or ReLU.

Looking deeper, we can say that every neuron in a hidden layer behaves in a similar way: it retrieves the values from the previous layer (the input nodes), computes their weighted sum (the weights are different for each neuron) plus a bias term, and transforms the total with an activation function f:

y = f(∑ᵢ wᵢxᵢ + b)
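As a minimal illustration of this formula (a sketch only, with made-up numbers, not part of the recipe's code), a single neuron can be written in a few lines of NumPy:

import numpy as np

def sigmoid(z):
    # Logistic activation, squashing z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.1, 0.9])   # values from the previous layer
w = np.array([0.4, -0.2, 0.7])  # one weight per incoming connection
b = 0.05                        # bias term

y = sigmoid(np.dot(w, x) + b)   # y = f(sum_i w_i * x_i + b)
print(y)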

How do we define a multilayer perceptron (MLP)?

The MLP is a branch of ANNs widely used in pattern recognition, thanks to its ability to recognize patterns in noisy or unexpected environments. MLPs can be used to implement both supervised and unsupervised learning. In addition, an MLP can also be used to implement another kind of learning, reinforcement learning, in which the network is adjusted through rewards and penalties.

Defining an ANN-MLP involves deciding on the structure of its layers, and how many nodes each of them has. First, we need to decide what the goal of our network is. For example, we could implement an object recognizer, in which case the number of nodes in the output layer will match the number of different objects we want to recognize. Following the object-recognition example from the earlier post, in the case of recognizing bags, footwear, and clothing, the output layer will have three nodes, and their values will be mapped as tuples of probabilities rather than fixed values such as [1, 0, 0], [0, 1, 0], and [0, 0, 1]. This makes it possible to recognize several classes in the same image, for example, a girl carrying a backpack and wearing flip-flops.

Once the outcome of our network has been decided, we should define which meaningful pieces of information about each object can be fed into the network. Several techniques can serve as image feature descriptors: we can use a histogram of oriented gradients (HOG) to compute the orientation of gradients in localized portions of an image, a color histogram to represent the distribution of colors in the image, or a dense feature detector combined with the SIFT or SURF algorithm to extract image features. Since the number of descriptors inserted into the input layer needs to be the same for every image, we will use a bag-of-words strategy, collecting every set of descriptors into a histogram of visual words, just as we did in the earlier post on object recognition. A brief sketch of two of the descriptor options mentioned here follows.
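For instance, a color histogram and a HOG descriptor can each be computed with OpenCV in a few lines. This is only an illustrative sketch (the file name is hypothetical), since the recipe itself relies on dense SIFT plus bag of words:

import cv2

img = cv2.imread('sample.jpg')  # hypothetical input image

# Color histogram: value distribution of each BGR channel, 32 bins each
color_hist = [cv2.calcHist([img], [ch], None, [32], [0, 256]) for ch in range(3)]

# HOG: gradient orientations over a fixed-size detection window
hog = cv2.HOGDescriptor()
hog_vector = hog.compute(cv2.resize(img, (64, 128)))

print(hog_vector.shape, [h.shape for h in color_hist])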

Finally, we come to the hidden layer. This layer has no strictly defined structure, which makes it a tricky decision. Researchers have discussed at length how to choose the number of hidden layers and the number of nodes in them; everything depends on the complexity of the problem, and a balance has to be found between performance and accuracy. In the case of a simple object recognizer with only three models, no more than one hidden layer is needed. Regarding the number of nodes in it, we can adopt, for example, the research of Heaton, which establishes the following rules (a small sketch applying them comes right after the list):

  • The number of hidden neurons should be between the size of the input layer and the size of the output layer
  • The number of hidden neurons should be two thirds of the size of the input layer, plus the size of the output layer
  • The number of hidden neurons should be less than twice the size of the input layer
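As a quick sanity check of these rules, here is a small sketch; the sizes match the recipe's 32-centroid codebook and three classes, but the helper itself is ours, not part of the original code:

def heaton_hidden_size(input_size, output_size):
    # Second rule: two thirds of the input layer size, plus the output layer size
    hidden = int(input_size * (2 / 3) + output_size)
    # The result should also satisfy the first and third rules
    assert output_size <= hidden <= input_size
    assert hidden < 2 * input_size
    return hidden

print(heaton_hidden_size(32, 3))  # -> 24 hidden neurons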

How do we implement an ANN-MLP classifier?

After this theoretical explanation of how an artificial neural network works, we will implement one ourselves.

To save some time and code, we will take advantage of the create_features.py file created earlier to extract all the feature descriptors that we will use as inputs to our MLP network.

By running the following command, we obtain the map files needed for the next steps:

$ python create_features.py --samples bag images/bagpack/ --samples dress images/dress/ --samples footwear images/footwear/ --codebook-file models/codebook.pkl --feature-map-file models/feature_map.pkl
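Before moving on, you can double-check what was written by loading the pickle file; this is a quick sketch, assuming the paths used above:

import pickle

with open('models/feature_map.pkl', 'rb') as f:
    feature_map = pickle.load(f)

# Each entry pairs a label with a (1, num_clusters) feature vector
print(len(feature_map))
print(feature_map[0]['label'], feature_map[0]['feature_vector'].shape)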

In the feature_map.pkl file we have the feature vectors of every image, which will take part in the training stage. First, let's create a class for our ANN classifier, which sets up the sizes of the network's layers:

class ClassifierANN(object):
    def __init__(self, feature_vector_size, label_words):
        self.ann = cv2.ml.ANN_MLP_create()
        # Number of centroids used to build the feature vectors
        input_size = feature_vector_size
        # Number of models to recognize
        output_size = len(label_words)
        # Applying Heaton's rule: two thirds of the input layer size,
        # plus the size of the output layer
        hidden_size = int(input_size * (2 / 3) + output_size)
        nn_config = np.array([input_size, hidden_size, output_size], dtype=np.int32)
        self.label_words = label_words
        self.ann.setLayerSizes(nn_config)
        # Symmetrical sigmoid as the activation function
        self.ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
        # Map models as tuples of probabilities
        self.le = preprocessing.LabelBinarizer()
        self.le.fit(label_words)  # label words are, e.g., ['dress', 'footwear', 'backpack']

As outputs, we decided to implement tuples of probabilities instead of the fixed binary values [0, 0, 1], [0, 1, 0], and [1, 0, 0], the goal being multi-class detection. The symmetrical sigmoid (ANN_MLP_SIGMOID_SYM) is used as the activation function; it is the default choice for MLPs in OpenCV, and its output lies in the range [-1, 1]. This way, the outputs produced by our network define probabilities instead of hard classification results, making it possible to recognize two or three objects in the same sample image.
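To make the multi-class idea concrete, here is a hedged sketch of how a raw output vector could be read as several labels at once; the threshold and the scores below are illustrative values, not actual outputs of the trained network:

import numpy as np

labels = ['backpack', 'dress', 'footwear']
raw_output = np.array([0.83, -0.65, 0.71])  # made-up scores in [-1, 1]

# Every class whose score clears the threshold is considered present
detected = [label for label, score in zip(labels, raw_output) if score > 0.5]
print(detected)  # -> ['backpack', 'footwear']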

For the training process, we split the dataset into two different sets: training and testing. We define a ratio for the split (using 75% of the samples for training is recommended in most cases), and the samples are picked at random to prevent bias. How does this work?

class ClassifierANN(object):
    # ... (continued from above)

    def train(self, training_set):
        label_words = [item['label'] for item in training_set]
        dim_size = training_set[0]['feature_vector'].shape[1]
        train_samples = np.asarray(
            [np.reshape(x['feature_vector'], (dim_size,)) for x in training_set]
        )
        # Convert item labels into encoded binary tuples
        train_response = np.array(self.le.transform(label_words), dtype=np.float32)
        self.ann.train(np.array(train_samples, dtype=np.float32),
                       cv2.ml.ROW_SAMPLE,
                       train_response)
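Putting the pieces together, training can be kicked off as follows; this is a sketch mirroring the __main__ block of the full listing below (split_feature_map, which performs the random split, is defined there):

# Assuming feature_map was loaded from models/feature_map.pkl as shown earlier
training_set, testing_set = split_feature_map(feature_map, 0.75)
label_words = np.unique([item['label'] for item in training_set])

classifier = ClassifierANN(len(feature_map[0]['feature_vector'][0]), label_words)
classifier.train(training_set)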

Evaluating the trained network

To evaluate the robustness and accuracy of the trained MLP network, we will compute its confusion matrix (also known as an error matrix), which describes the performance of our classification model. We evaluate it using the testing set:

class ClassifierANN(object):
    # ... (continued from above)

    def get_confusion_matrix(self, testing_set):
        feature_vectors, expected_labels = self._get_network_io(testing_set)
        confusion_matrix = self._init_confusion_matrix(self.label_words)
        retval, test_outputs = self.ann.predict(feature_vectors)
        for expected_output, test_output in zip(expected_labels, test_outputs):
            expected_model = self.classify(expected_output)
            predicted_model = self.classify(test_output)
            confusion_matrix[expected_model][predicted_model] += 1
        return confusion_matrix

    def classify(self, encoded_word, threshold=0.5):
        models = self.le.inverse_transform(np.asarray([encoded_word]), threshold)
        return models[0]

    def _init_confusion_matrix(self, label_words):
        confusion_matrix = OrderedDict()
        for label in label_words:
            confusion_matrix[label] = OrderedDict()
            for label2 in label_words:
                confusion_matrix[label][label2] = 0
        return confusion_matrix
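For three labels, the returned structure is a nested OrderedDict mapping each expected label to its predicted-label counts. A hypothetical result (the counts are made up) could look like this:

# confusion_matrix[expected_label][predicted_label] -> count
OrderedDict([
    ('backpack', OrderedDict([('backpack', 18), ('dress', 1), ('footwear', 2)])),
    ('dress',    OrderedDict([('backpack', 0), ('dress', 20), ('footwear', 1)])),
    ('footwear', OrderedDict([('backpack', 2), ('dress', 0), ('footwear', 19)])),
])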

We compute the per-class accuracy of our trained network from its true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN):

def print_accuracy(confusion_matrix):
    acc_models = OrderedDict()
    for model in confusion_matrix.keys():
        acc_models[model] = {'TP': 0, 'TN': 0, 'FP': 0, 'FN': 0}

    # Tally one-vs-rest counts for every class from each matrix cell
    for expected_model, predicted_models in confusion_matrix.items():
        for predicted_model, value in predicted_models.items():
            for model in acc_models:
                if model == expected_model == predicted_model:
                    acc_models[model]['TP'] += value
                elif model == expected_model:
                    acc_models[model]['FN'] += value
                elif model == predicted_model:
                    acc_models[model]['FP'] += value
                else:
                    acc_models[model]['TN'] += value

    for model, rep in acc_models.items():
        acc = (rep['TP'] + rep['TN']) / (rep['TP'] + rep['TN'] + rep['FN'] + rep['FP'])
        print('%s \t %f' % (model, acc))
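As a quick worked example of that formula, take the 'backpack' class from the hypothetical matrix above: TP = 18, FN = 1 + 2 = 3, FP = 0 + 2 = 2, and every remaining cell counts as TN (20 + 1 + 0 + 19 = 40):

# Per-class accuracy for the hypothetical 'backpack' row above
accuracy = (18 + 40) / (18 + 40 + 3 + 2)
print(accuracy)  # -> 0.9206...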

Collecting every code block in this section, we have the complete implementation of the ClassifierANN class, ready to use:

import argparse
import random
import cv2
import numpy as np
import pickle
import math
from sklearn import preprocessing
from collections import OrderedDict


class ClassifierANN(object):
    def __init__(self, feature_vector_size, label_words):
        self.ann = cv2.ml.ANN_MLP_create()
        # Number of centroids used to build the feature vectors
        input_size = feature_vector_size
        # Number of models to recognize
        output_size = len(label_words)
        # Applying Heaton's rule: two thirds of the input layer size,
        # plus the size of the output layer
        hidden_size = int(input_size * (2 / 3) + output_size)
        nn_config = np.array([input_size, hidden_size, output_size], dtype=np.int32)
        self.label_words = label_words
        self.ann.setLayerSizes(nn_config)
        # Symmetrical sigmoid as the activation function
        self.ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
        # Map models as tuples of probabilities
        self.le = preprocessing.LabelBinarizer()
        self.le.fit(label_words)  # label words are, e.g., ['dress', 'footwear', 'backpack']

    def train(self, training_set):
        label_words = [item['label'] for item in training_set]
        dim_size = training_set[0]['feature_vector'].shape[1]
        train_samples = np.asarray(
            [np.reshape(x['feature_vector'], (dim_size,)) for x in training_set]
        )
        # Convert item labels into encoded binary tuples
        train_response = np.array(self.le.transform(label_words), dtype=np.float32)
        self.ann.train(np.array(train_samples, dtype=np.float32),
                       cv2.ml.ROW_SAMPLE,
                       train_response)

    def get_confusion_matrix(self, testing_set):
        feature_vectors, expected_labels = self._get_network_io(testing_set)
        confusion_matrix = self._init_confusion_matrix(self.label_words)
        retval, test_outputs = self.ann.predict(feature_vectors)
        for expected_output, test_output in zip(expected_labels, test_outputs):
            expected_model = self.classify(expected_output)
            predicted_model = self.classify(test_output)
            confusion_matrix[expected_model][predicted_model] += 1
        return confusion_matrix

    def classify(self, encoded_word, threshold=0.5):
        models = self.le.inverse_transform(np.asarray([encoded_word]), threshold)
        return models[0]

    def _get_network_io(self, features_map):
        label_words = [item['label'] for item in features_map]
        dim_size = features_map[0]['feature_vector'].shape[1]
        # The ANN expects 32-bit floating-point samples
        inputs = np.asarray(
            [np.reshape(x['feature_vector'], (dim_size,)) for x in features_map],
            dtype=np.float32
        )
        outputs = np.array(self.le.transform(label_words), dtype=np.float32)
        return inputs, outputs

    def _init_confusion_matrix(self, label_words):
        confusion_matrix = OrderedDict()
        for label in label_words:
            confusion_matrix[label] = OrderedDict()
            for label2 in label_words:
                confusion_matrix[label][label2] = 0
        return confusion_matrix


def build_arg_parser():
    parser = argparse.ArgumentParser(description='Trains an ANN-MLP classifier from a feature map')
    parser.add_argument("--feature-map-file", dest="feature_map_file", required=True,
                        help="Input pickle file containing the feature map")
    parser.add_argument("--training-set", dest="training_set", required=True,
                        help="Percentage taken for training. ie 0.75")
    parser.add_argument("--ann-file", dest="ann_file", required=False,
                        help="Output file where ANN will be stored")
    parser.add_argument("--le-file", dest="le_file", required=False,
                        help="Output file where the LabelBinarizer will be stored")
    return parser


def print_confusion_matrix(confusion_matrix):
    models = confusion_matrix.keys()
    # Header: predicted labels as columns
    print('\t\t', '\t'.join([model.ljust(15) for model in models]))
    # One row per expected (actual) label
    for expected_model, predicted_values in confusion_matrix.items():
        values = [str(value).ljust(10) for value in predicted_values.values()]
        print('%s\t\t%s' % (expected_model.ljust(15), '\t'.join(values)))


def print_accuracy(confusion_matrix):
    acc_models = OrderedDict()
    for model in confusion_matrix.keys():
        acc_models[model] = {'TP': 0, 'TN': 0, 'FP': 0, 'FN': 0}

    # Tally one-vs-rest counts for every class from each matrix cell
    for expected_model, predicted_models in confusion_matrix.items():
        for predicted_model, value in predicted_models.items():
            for model in acc_models:
                if model == expected_model == predicted_model:
                    acc_models[model]['TP'] += value
                elif model == expected_model:
                    acc_models[model]['FN'] += value
                elif model == predicted_model:
                    acc_models[model]['FP'] += value
                else:
                    acc_models[model]['TN'] += value

    for model, rep in acc_models.items():
        acc = (rep['TP'] + rep['TN']) / (rep['TP'] + rep['TN'] + rep['FN'] + rep['FP'])
        print('%s \t %f' % (model, acc))


def split_feature_map(feature_map, training_set_per):
    # Group the samples by label so every class keeps the same ratio
    feature_map_dict = dict()
    for item in feature_map:
        label = item['label']
        if label not in feature_map_dict:
            feature_map_dict[label] = list()
        feature_map_dict[label].append(item)

    training_feature_map = []
    testing_feature_map = []
    for label, feature_map_list in feature_map_dict.items():
        split_index = math.trunc(len(feature_map_list) * training_set_per)
        # Shuffle before slicing to prevent ordering bias
        random.shuffle(feature_map_list)
        training_feature_map += feature_map_list[:split_index]
        testing_feature_map += feature_map_list[split_index:]
    return training_feature_map, testing_feature_map


if __name__ == '__main__':
    args = build_arg_parser().parse_args()

    # Load the feature map
    with open(args.feature_map_file, 'rb') as f:
        feature_map = pickle.load(f)

    training_set, testing_set = split_feature_map(feature_map, float(args.training_set))
    label_words = np.unique([item['label'] for item in training_set])
    cnn = ClassifierANN(len(feature_map[0]['feature_vector'][0]), label_words)
    cnn.train(training_set)
    print("===== Confusion Matrix =====")
    confusion_matrix = cnn.get_confusion_matrix(testing_set)
    print_confusion_matrix(confusion_matrix)
    print("===== ANN Accuracy =====")
    print_accuracy(confusion_matrix)

    if args.ann_file and args.le_file:
        print("===== Saving ANN =====")
        # ANN_MLP has its own persistence format (YAML/XML)
        cnn.ann.save(args.ann_file)
        with open(args.le_file, 'wb') as f:
            pickle.dump(cnn.le, f)
        print('Saved in:', args.ann_file)

As you may have noticed, we saved the ANN into two separate files: the ANN_MLP class has its own save and load methods, while the label words used to train the network need to be stored separately. Pickle provides us with the ability to serialize and de-serialize object structures, and to save them to and load them from disk.
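A minimal sketch of loading both pieces back from disk, assuming the file names used in the command below:

import pickle
import cv2

ann = cv2.ml.ANN_MLP_load('models/ann.yaml')
with open('models/le.pkl', 'rb') as f:
    le = pickle.load(f)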

Run the following command to obtain the model files; the confusion matrix and the accuracy figures will be printed along with them:

$ python training.py --feature-map-file models/feature_map.pkl --training-set 0.8 --ann-file models/ann.yaml --le-file models/le.pkl

Classifying an image

To implement our ANN classifier, we need to reuse the methods of the FeatureExtractor class from the create_features.py file of the earlier post on object recognition, which will allow us to compute the feature vector of any image we want to evaluate.

Make sure the create_features.py file is included in the same folder. Now, we are ready to implement the classifier:

create_features.py

import os
import sys
import argparse
import _pickle as pickle
import json

import cv2
import numpy as np
from sklearn.cluster import KMeans


class DenseDetector():
    def __init__(self, step_size=20, feature_scale=20, img_bound=20):
        # Create a dense feature detector
        self.initXyStep = step_size
        self.initFeatureScale = feature_scale
        self.initImgBound = img_bound

    def detect(self, img):
        keypoints = []
        rows, cols = img.shape[:2]
        # Sample keypoints on a regular grid; note that the KeyPoint
        # x coordinate is the column and y is the row
        for y in range(self.initImgBound, rows, self.initFeatureScale):
            for x in range(self.initImgBound, cols, self.initFeatureScale):
                keypoints.append(cv2.KeyPoint(float(x), float(y), self.initXyStep))
        return keypoints


class SIFTExtractor():
    def __init__(self):
        self.extractor = cv2.xfeatures2d.SIFT_create()

    def compute(self, image, kps):
        if image is None:
            print("Not a valid image")
            raise TypeError

        gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        kps, des = self.extractor.compute(gray_image, kps)
        return kps, des


# Vector quantization
class Quantizer(object):
    def __init__(self, num_clusters=32):
        self.num_dims = 128
        self.extractor = SIFTExtractor()
        self.num_clusters = num_clusters
        self.num_retries = 10

    def quantize(self, datapoints):
        # Create KMeans object
        kmeans = KMeans(self.num_clusters,
                        n_init=max(self.num_retries, 1),
                        max_iter=10, tol=1.0)

        # Run KMeans on the datapoints
        res = kmeans.fit(datapoints)

        # Extract the centroids of those clusters
        centroids = res.cluster_centers_

        return kmeans, centroids

    def normalize(self, input_data):
        sum_input = np.sum(input_data)
        if sum_input > 0:
            return input_data / sum_input
        else:
            return input_data

    # Extract a feature vector from the image
    def get_feature_vector(self, img, kmeans, centroids):
        kps = DenseDetector().detect(img)
        kps, fvs = self.extractor.compute(img, kps)
        labels = kmeans.predict(fvs)
        fv = np.zeros(self.num_clusters)

        # Build the histogram of visual words (bag of words)
        for i, item in enumerate(fvs):
            fv[labels[i]] += 1

        fv_image = np.reshape(fv, (1, fv.shape[0]))
        return self.normalize(fv_image)


class FeatureExtractor(object):
    def extract_image_features(self, img):
        # Dense feature detector
        kps = DenseDetector().detect(img)

        # SIFT feature extractor
        kps, fvs = SIFTExtractor().compute(img, kps)

        return fvs

    # Extract the centroids from the feature points
    def get_centroids(self, input_map, num_samples_to_fit=10):
        kps_all = []

        count = 0
        cur_label = ''
        # Fit the codebook on the first num_samples_to_fit images per label
        for item in input_map:
            if count >= num_samples_to_fit:
                if cur_label != item['label']:
                    count = 0
                else:
                    continue

            count += 1

            if count == num_samples_to_fit:
                print("Built centroids for", item['label'])

            cur_label = item['label']
            img = cv2.imread(item['image'])
            img = resize_to_size(img, 150)

            fvs = self.extract_image_features(img)
            kps_all.extend(fvs)

        kmeans, centroids = Quantizer().quantize(kps_all)
        return kmeans, centroids

    def get_feature_vector(self, img, kmeans, centroids):
        return Quantizer().get_feature_vector(img, kmeans, centroids)


def build_arg_parser():
    parser = argparse.ArgumentParser(description='Creates features for given images')
    parser.add_argument("--samples", dest="cls", nargs="+", action="append", required=True,
                        help="Folders containing the training images.\nThe first element needs to be the class label.")
    parser.add_argument("--codebook-file", dest='codebook_file', required=True,
                        help="Base file name to store the codebook")
    parser.add_argument("--feature-map-file", dest='feature_map_file', required=True,
                        help="Base file name to store the feature map")

    return parser


# Load the images from the input folder
def load_input_map(label, input_folder):
    combined_data = []

    if not os.path.isdir(input_folder):
        raise IOError("The folder " + input_folder + " doesn't exist")

    # Parse the input folder and assign the labels
    for root, dirs, files in os.walk(input_folder):
        for filename in (x for x in files if x.endswith('.jpg')):
            combined_data.append({'label': label,
                                  'image': os.path.join(root, filename)})

    return combined_data


def extract_feature_map(input_map, kmeans, centroids):
    feature_map = []

    for item in input_map:
        temp_dict = {}
        temp_dict['label'] = item['label']

        print("Extracting features for", item['image'])
        img = cv2.imread(item['image'])
        img = resize_to_size(img, 150)

        temp_dict['feature_vector'] = FeatureExtractor().get_feature_vector(img, kmeans, centroids)

        if temp_dict['feature_vector'] is not None:
            feature_map.append(temp_dict)

    return feature_map


# Resize the shorter dimension to 'new_size'
# while maintaining the aspect ratio
def resize_to_size(input_image, new_size=150):
    h, w = input_image.shape[0], input_image.shape[1]
    ds_factor = new_size / float(h)

    if w < h:
        ds_factor = new_size / float(w)

    new_size = (int(w * ds_factor), int(h * ds_factor))
    return cv2.resize(input_image, new_size)


if __name__ == '__main__':
    args = build_arg_parser().parse_args()

    input_map = []
    for cls in args.cls:
        assert len(cls) >= 2, "Format for classes is `<label> file`"
        label = cls[0]
        input_map += load_input_map(label, cls[1])

    # Building the codebook
    print("===== Building codebook =====")
    kmeans, centroids = FeatureExtractor().get_centroids(input_map)
    if args.codebook_file:
        with open(args.codebook_file, 'wb') as f:
            print('kmeans', kmeans)
            print('centroids', centroids)
            pickle.dump((kmeans, centroids), f)

    # Input data and labels
    print("===== Building feature map =====")
    feature_map = extract_feature_map(input_map, kmeans, centroids)
    if args.feature_map_file:
        with open(args.feature_map_file, 'wb') as f:
            pickle.dump(feature_map, f)

classify_data.py

import argparse
import _pickle as pickle

import cv2
import numpy as np

import create_features as cf

# Classifying an image
class ImageClassifier(object):
    def __init__(self, ann_file, le_file, codebook_file):
        # ANN_MLP has its own load method (the file is in YAML/XML format)
        self.ann = cv2.ml.ANN_MLP_load(ann_file)

        # Load the LabelBinarizer used during training
        with open(le_file, 'rb') as f:
            self.le = pickle.load(f)

        # Load the codebook
        with open(codebook_file, 'rb') as f:
            self.kmeans, self.centroids = pickle.load(f)

    def classify(self, encoded_word, threshold=None):
        models = self.le.inverse_transform(np.asarray(encoded_word), threshold)
        return models[0]

    # Method to get the output image tag
    def getImageTag(self, img):
        # Resize the input image
        img = cf.resize_to_size(img)
        # Extract the feature vector
        feature_vector = cf.FeatureExtractor().get_feature_vector(img, self.kmeans, self.centroids)
        # Classify the feature vector and get the output tag
        retval, image_tag = self.ann.predict(np.asarray(feature_vector, dtype=np.float32))
        return self.classify(image_tag)


def build_arg_parser():
    parser = argparse.ArgumentParser(description='Extracts features from an image and classifies it')
    parser.add_argument("--input-image", dest="input_image", required=True,
                        help="Input image to be classified")
    parser.add_argument("--codebook-file", dest="codebook_file", required=True,
                        help="File containing the codebook")
    parser.add_argument("--ann-file", dest="ann_file", required=True,
                        help="File containing the trained ANN")
    parser.add_argument("--le-file", dest="le_file", required=True,
                        help="File containing the LabelBinarizer class")
    return parser

if __name__ == '__main__':
    args = build_arg_parser().parse_args()
    codebook_file = args.codebook_file
    input_image = cv2.imread(args.input_image)

    tag = ImageClassifier(args.ann_file, args.le_file, codebook_file).getImageTag(input_image)
    print("Output class:", tag)

Run the following command to classify an image:

$ python classify_data.py --codebook-file models/codebook.pkl --ann-file models/ann.yaml --le-file models/le.pkl --input-image ./images/test.png