NumPy ufuncs (Universal Functions)

NumPy provides two fundamental kinds of object: ndarray and ufunc. ufunc is short for "universal function", a function that operates on every element of an array.
Many ufuncs are implemented in C, so they run very fast.
ufuncs are also more flexible than the functions in the math module. math functions generally take scalar inputs, whereas NumPy functions accept vectors or matrices. Working on whole vectors or matrices avoids explicit loops, which matters a great deal in machine learning and deep learning.
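
The scalar-versus-array difference described above can be illustrated with a minimal sketch. The variable names (values, roots, math_accepts_list) are made up for this illustration.

```python
import math
import numpy as np

values = [1.0, 4.0, 9.0]

# math.sqrt expects a single real number, so passing a list raises TypeError.
try:
    math.sqrt(values)
    math_accepts_list = True
except TypeError:
    math_accepts_list = False

# np.sqrt is a ufunc: it is applied to every element and returns an ndarray.
roots = np.sqrt(values)

print(math_accepts_list)               # False
print(roots.tolist())                  # [1.0, 2.0, 3.0]
print(isinstance(np.sqrt, np.ufunc))   # True
```

To get the same result with math.sqrt, you would have to write the loop yourself, for example [math.sqrt(v) for v in values].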

Why use ufuncs?

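ufuncs implement vectorization in NumPy, which is much faster than iterating over elements in Python.
They also support broadcasting and provide methods such as reduce and accumulate, which are very useful in computation.
ufuncs also accept additional arguments, for example:
where: a boolean array or condition that defines where the operation is applied.
dtype: the type of the returned elements.
out: the output array in which the result is stored.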
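
As a rough sketch of the arguments and methods listed above, the following uses np.add. The arrays a, b, mask and result are illustrative names, not part of the original text.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# where + out: add only where the mask is True; the other positions
# keep whatever the output array already contained (zeros here).
mask = np.array([True, False, True, False])
result = np.zeros(4, dtype=np.int64)
np.add(a, b, out=result, where=mask)
print(result.tolist())            # [11, 0, 33, 0]

# dtype: request the type used for the computation and the result.
as_float = np.add(a, b, dtype=np.float64)
print(as_float.dtype)             # float64

# reduce and accumulate are methods available on binary ufuncs.
total = np.add.reduce(a)          # 1 + 2 + 3 + 4
running = np.add.accumulate(a)    # running sum
print(total, running.tolist())    # 10 [1, 3, 6, 10]

# Broadcasting: the scalar 100 is combined with every element of a.
shifted = np.add(a, 100)
print(shifted.tolist())           # [101, 102, 103, 104]
```

When where is used without out, the unselected positions of the result are left uninitialized, so passing an explicit out array, as assumed here, is the safer pattern.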

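Some commonly used NumPy functions

Not every function in this table is a ufunc in the strict sense: sqrt(), sin(), cos(), abs(), log(), log10(), log2() and exp() are ufuncs, while the rest are ordinary NumPy functions that operate on whole arrays.

Function	Usage
sqrt()	Square root of each element
sin(), cos()	Trigonometric functions
abs()	Absolute value of each element
dot()	Matrix (dot) product
log(), log10(), log2()	Logarithmic functions
exp()	Exponential function
cumsum(), cumprod()	Cumulative sum, cumulative product
sum()	Sum of the elements
mean()	Mean
median()	Median
std()	Standard deviation
var()	Variance
corrcoef()	Correlation coefficients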
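
A short sketch of a few functions from the table, with a check of which ones are actual np.ufunc instances. The sample array data and the names list are chosen only for illustration.

```python
import numpy as np

data = np.array([1.0, 4.0, 9.0, 16.0])

# Element-wise functions return an array of the same shape.
print(np.sqrt(data).tolist())      # [1.0, 2.0, 3.0, 4.0]

# Cumulative and aggregate functions work on the array as a whole.
print(np.cumsum(data).tolist())    # [1.0, 5.0, 14.0, 30.0]
print(np.mean(data))               # 7.5
print(np.median(data))             # 6.5

# Only some of the functions in the table are true ufunc objects.
names = ["sqrt", "sin", "exp", "log", "dot", "sum", "mean"]
is_ufunc = {name: isinstance(getattr(np, name), np.ufunc) for name in names}
print(is_ufunc)
```

The dictionary shows that sqrt, sin, exp and log are ufuncs, while dot, sum and mean are regular functions.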

Performance comparison: math vs. numpy functions

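Example

import time
import math
import numpy as np

# Time math.sin applied element by element in a Python loop
x = [i * 0.001 for i in np.arange(1000000)]
start = time.perf_counter()  # time.clock() was removed in Python 3.8
for i, t in enumerate(x):
    x[i] = math.sin(t)
print("math.sin:", time.perf_counter() - start)

# Time np.sin applied to the whole array at once
x = [i * 0.001 for i in np.arange(1000000)]
x = np.array(x)
start = time.perf_counter()
np.sin(x)
print("numpy.sin:", time.perf_counter() - start)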

Output:

math.sin: 0.5169950000000005
numpy.sin: 0.05381199999999886

As these timings show, numpy.sin is nearly 10 times faster than math.sin here; the exact ratio depends on the machine.
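
A single timed run can be noisy. One possible alternative, sketched here with the standard timeit module, repeats each measurement and keeps the best time. The helper names sin_with_math and sin_with_numpy and the smaller array size are assumptions made to keep the sketch quick to run.

```python
import math
import timeit
import numpy as np

n = 100_000
values = np.arange(n) * 0.001
values_list = values.tolist()

def sin_with_math():
    # One math.sin call per element, driven by a Python-level loop.
    return [math.sin(t) for t in values_list]

def sin_with_numpy():
    # A single ufunc call over the whole array.
    return np.sin(values)

# Run each function 5 times per measurement, 3 measurements, keep the best.
t_math = min(timeit.repeat(sin_with_math, number=5, repeat=3))
t_numpy = min(timeit.repeat(sin_with_numpy, number=5, repeat=3))

print(f"math.sin:  {t_math:.4f} s")
print(f"numpy.sin: {t_numpy:.4f} s")

# Both approaches should agree on the values they compute.
same_values = np.allclose(sin_with_math(), sin_with_numpy())
print(same_values)
```

The absolute numbers will differ from those quoted above, since they depend on hardware, Python version and NumPy build.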

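Vectorization

Converting iterative statements into vector-based operations is called vectorization.
It is faster because modern CPUs are optimized for such operations.
Consider adding the elements of two lists:
list 1: [1, 2, 3, 4]
list 2: [4, 5, 6, 7]
One approach is to iterate over both lists and add each pair of elements.

Without ufuncs, we can use Python's built-in zip() function: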

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []
for i, j in zip(x, y):
    z.append(i + j)
print(z)

Output:

[5, 7, 9, 11]

For this, NumPy has a ufunc named add(x, y) that computes the same values without an explicit loop:

Example

import numpy as np
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = np.add(x, y)
print(z)

Output:

[ 5  7  9 11]
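
As a small follow-up sketch, the + operator on ndarrays performs the same element-wise addition as np.add, and broadcasting lets a scalar be combined with every element. The names z_func, z_op, z_scalar and z_lists are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([4, 5, 6, 7])

z_func = np.add(x, y)   # explicit ufunc call
z_op = x + y            # operator form, same element-wise addition
z_scalar = x + 10       # broadcasting a scalar over the array

print(z_func.tolist())    # [5, 7, 9, 11]
print(z_op.tolist())      # [5, 7, 9, 11]
print(z_scalar.tolist())  # [11, 12, 13, 14]

# With plain Python lists, + concatenates instead of adding element-wise.
z_lists = [1, 2, 3, 4] + [4, 5, 6, 7]
print(z_lists)            # [1, 2, 3, 4, 4, 5, 6, 7]
```

This is why the inputs need to be ndarrays (or be passed through np.add, which converts them) for + to mean element-wise addition.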

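Loops versus vector operations

Making full use of NumPy's built-in functions to vectorize computations can greatly improve running speed. Many of these functions run in compiled code that can use SIMD instructions, so the vectorized version below is far faster than the explicit loop. A GPU could deliver even more performance, but NumPy itself does not support GPUs.
Consider the following code: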

Example

import time
import numpy as np

x1 = np.random.rand(1000000)
x2 = np.random.rand(1000000)

# Compute the dot product with an explicit loop
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i] * x2[i]
toc = time.process_time()
print("dot = " + str(dot) + "\n for loop ----- elapsed time = " + str(1000 * (toc - tic)) + "ms")

# Compute the dot product with NumPy
tic = time.process_time()
dot = np.dot(x1, x2)
toc = time.process_time()
print("dot = " + str(dot) + "\n vectorized version ---- elapsed time = " + str(1000 * (toc - tic)) + "ms")

Output:

dot = 250215.601995
 for loop ----- elapsed time = 798.3389819999998ms
dot = 250215.601995
 vectorized version ---- elapsed time = 1.885051999999554ms
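
To check that the loop and the vectorized forms compute the same quantity, a minimal sketch with a fixed seed and a smaller array might look like this. The seed, the array size and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
x1 = rng.random(1000)
x2 = rng.random(1000)

# Explicit loop: accumulate the products one pair at a time.
loop_dot = 0.0
for a, b in zip(x1, x2):
    loop_dot += a * b

# Three vectorized ways to express the same dot product.
vec_dot = np.dot(x1, x2)
op_dot = x1 @ x2
sum_dot = np.sum(x1 * x2)

# Results can differ in the last few bits because the order of
# floating-point additions differs, so compare with a tolerance.
print(np.isclose(loop_dot, vec_dot))   # True
print(np.isclose(vec_dot, op_dot))     # True
print(np.isclose(vec_dot, sum_dot))    # True
```

Comparing with np.isclose rather than == avoids false mismatches caused by floating-point rounding.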
