Anatomy of High-Performance Matrix Multiplication

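When we train machine learning models today, we usually rely on a library such as TensorFlow, and these libraries typically improve training performance by orders of magnitude. Below, we use convolution as an example to dissect high-performance matrix computation.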

'''
Convolve `input` with `kernel` to generate `output`
    input.shape = [input_channels, input_height, input_width]
    kernel.shape = [num_filters, input_channels, kernel_height, kernel_width]
    output.shape = [num_filters, output_height, output_width]
'''
for filter in 0..num_filters
    for channel in 0..input_channels
        for out_h in 0..output_height
            for out_w in 0..output_width
                for k_h in 0..kernel_height
                    for k_w in 0..kernel_width
                        output[filter, out_h, out_w] +=
                            kernel[filter, channel, k_h, k_w] *
                            input[channel, out_h + k_h, out_w + k_w]
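The six nested loops above can be turned into a small runnable sketch. The version below assumes NumPy arrays, stride 1, and no padding. The pseudocode does not state the output size, but its indexing `input[channel, out_h + k_h, out_w + k_w]` only stays in bounds if `output_height = input_height - kernel_height + 1` (and likewise for the width). The function name `naive_conv2d` is hypothetical and used only for illustration; it is not the API of TensorFlow or any other library.

```python
import numpy as np


def naive_conv2d(inp, kernel):
    """Direct convolution with six nested loops (assumed: stride 1, no padding)."""
    input_channels, input_height, input_width = inp.shape
    num_filters, kernel_channels, kernel_height, kernel_width = kernel.shape
    assert kernel_channels == input_channels, "channel counts must match"

    # Output size implied by the index out_h + k_h staying inside the input.
    output_height = input_height - kernel_height + 1
    output_width = input_width - kernel_width + 1

    out_dtype = np.result_type(inp, kernel)
    output = np.zeros((num_filters, output_height, output_width), dtype=out_dtype)

    for f in range(num_filters):
        for c in range(input_channels):
            for out_h in range(output_height):
                for out_w in range(output_width):
                    for k_h in range(kernel_height):
                        for k_w in range(kernel_width):
                            output[f, out_h, out_w] += (
                                kernel[f, c, k_h, k_w]
                                * inp[c, out_h + k_h, out_w + k_w]
                            )
    return output


# Small usage example: 2 input channels, 4x4 image, three 3x3 all-ones filters.
x = np.arange(2 * 4 * 4, dtype=np.float64).reshape(2, 4, 4)
w = np.ones((3, 2, 3, 3))
y = naive_conv2d(x, w)
print(y.shape)
```

With an all-ones kernel, each output element is simply the sum of the input window it covers, which makes the result easy to check by hand. Every multiply-add here runs in the Python interpreter, so this form is useful only as a correctness reference, not as a performance baseline.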
