Sunday/Mar/20 17:30
https://www.intel.ai/plaidml/
Note: The 2.3.0 release will be the last major release of multi-backend Keras. Multi-backend Keras is superseded by tf.keras. Bugs present in multi-backend Keras will only be fixed until April 2020 (as part of minor releases).
So using Keras with PlaidML will probably need nGraph (https://www.ngraph.ai) in the future.
conda create --name plaidml python
conda activate plaidml
pip install pybind11 pyopencl plaidml-keras plaidbench
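A quick way to confirm that the install landed in the active environment is to check that the packages are importable. This is a minimal sketch, assuming the pip packages above expose the top-level module names plaidml, plaidbench, pyopencl and pybind11. The helper name find_missing is made up for these notes.

```python
import importlib.util


def find_missing(modules):
    """Return the names in `modules` that the import system cannot find."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed top-level module names for the pip packages installed above.
    wanted = ["plaidml", "plaidbench", "pyopencl", "pybind11"]
    missing = find_missing(wanted)
    print("missing:", missing if missing else "none")
```

If anything is reported missing, the usual cause is that pip ran outside the activated environment.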
Set up shell variables:
export KERAS_BACKEND="plaidml.keras.backend"
export RUNFILES_DIR="/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
export PLAIDML_NATIVE_PATH="/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"
Or set the variables in your code (before importing keras):
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
os.environ["RUNFILES_DIR"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
os.environ["PLAIDML_NATIVE_PATH"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"
# Alternate file locations for other install layouts:
#os.environ["RUNFILES_DIR"] = "/usr/local/share/plaidml"
#os.environ["PLAIDML_NATIVE_PATH"] = "/usr/local/lib/libplaidml.dylib"
#os.environ["RUNFILES_DIR"] = "/Library/Frameworks/Python.framework/Versions/3.7/share/plaidml"
#os.environ["PLAIDML_NATIVE_PATH"] = "/Library/Frameworks/Python.framework/Versions/3.7/lib/libplaidml.dylib"
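The hard-coded paths above all follow the same pattern: share/plaidml and lib/libplaidml.dylib under some install prefix. A minimal sketch that searches a list of prefixes instead of hard-coding one, assuming sys.prefix points at the conda environment or framework Python in use. The helper names find_plaidml_paths and configure_plaidml_env are made up for these notes, and the .dylib file name is macOS-specific.

```python
import os
import sys


def find_plaidml_paths(prefixes, lib_name="libplaidml.dylib"):
    """Return (runfiles_dir, native_path) for the first prefix holding both, else None."""
    for prefix in prefixes:
        runfiles = os.path.join(prefix, "share", "plaidml")
        native = os.path.join(prefix, "lib", lib_name)
        if os.path.isdir(runfiles) and os.path.isfile(native):
            return runfiles, native
    return None


def configure_plaidml_env(prefixes=None):
    """Set the three variables from these notes; return True if the paths were found."""
    if prefixes is None:
        # Assumption: sys.prefix covers the conda and framework layouts,
        # and /usr/local is the remaining layout listed above.
        prefixes = [sys.prefix, "/usr/local"]
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    found = find_plaidml_paths(prefixes)
    if found is None:
        return False
    os.environ["RUNFILES_DIR"], os.environ["PLAIDML_NATIVE_PATH"] = found
    return True
```

Call configure_plaidml_env() before importing keras. A False return means neither prefix had both paths, so they need to be set by hand as shown above.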
(plaidml) ➜ ~ plaidml-setup
PlaidML Setup (0.7.0)
Thanks for using PlaidML!
The feedback we have received from our users indicates an ever-increasing need
for performance, programmability, and portability. During the past few months,
we have been restructuring PlaidML to address those needs. To make all the
changes we need to make while supporting our current user base, all development
of PlaidML has moved to a branch — plaidml-v1. We will continue to maintain and
support the master branch of PlaidML and the stable 0.7.0 release.
Read more here: https://github.com/plaidml/plaidml
Some Notes:
* Bugs and other issues: https://github.com/plaidml/plaidml/issues
* Questions: https://stackoverflow.com/questions/tagged/plaidml
* Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
* PlaidML is licensed under the Apache License 2.0
Default Config Devices:
llvm_cpu.0 : CPU (via LLVM)
metal_intel(r)_uhd_graphics_630.0 : Intel(R) UHD Graphics 630 (Metal)
metal_amd_radeon_pro_5500m.0 : AMD Radeon Pro 5500M (Metal)
Experimental Config Devices:
llvm_cpu.0 : CPU (via LLVM)
opencl_amd_radeon_pro_5500m_compute_engine.0 : AMD AMD Radeon Pro 5500M Compute Engine (OpenCL)
opencl_intel_uhd_graphics_630.0 : Intel Inc. Intel(R) UHD Graphics 630 (OpenCL)
metal_intel(r)_uhd_graphics_630.0 : Intel(R) UHD Graphics 630 (Metal)
metal_amd_radeon_pro_5500m.0 : AMD Radeon Pro 5500M (Metal)
Using experimental devices can cause poor performance, crashes, and other nastiness.
Enable experimental device support? (y,n)[n]:
Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:
1 : llvm_cpu.0
2 : metal_intel(r)_uhd_graphics_630.0
3 : metal_amd_radeon_pro_5500m.0
Default device? (1,2,3)[1]:3
Selected device:
metal_amd_radeon_pro_5500m.0
Almost done. Multiplying some matrices...
Tile code:
function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.
Save settings to /Users/ahlawat/.plaidml? (y,n)[y]:y
Success!
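As the setup output notes, the saved default can be overridden with PLAIDML_DEVICE_IDS. A minimal sketch of doing that from Python for a single process, assuming the variable is read when the backend first opens a device, which is why it is set before importing keras. The helper name select_device is made up for these notes, and the device id is copied from the listing above.

```python
import os


def select_device(device_id):
    """Point PlaidML at one device for this process only; return the id that was set."""
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    os.environ["PLAIDML_DEVICE_IDS"] = device_id
    return os.environ["PLAIDML_DEVICE_IDS"]


# Device id copied from the plaidml-setup listing above.
select_device("metal_amd_radeon_pro_5500m.0")
# import keras  # import only after the variables are set
```

The "Opening device" log line printed at startup is a simple way to confirm which device was actually picked.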
(plaidml) ➜ ~ plaidbench keras mobilenet
Running 1024 examples with mobilenet, batch size 1, on backend plaid
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 0.392s (compile), 14.776s (execution)
-----------------------------------------------------------------------------------------
Network Name Inference Latency Time / FPS
-----------------------------------------------------------------------------------------
mobilenet 14.43 ms 0.00 ms / 1000000000.00 fps
Correctness: PASS, max_error: 1.675534622336272e-05, max_abs_error: 7.674098014831543e-07, fail_ratio: 0.0
(plaidml) ➜ ~
(plaidml) ➜ ~ plaidbench --batch-size 16 keras --train mobilenet
Running 1024 examples with mobilenet, batch size 16, on backend plaid
Loading CIFAR data
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.6/mobilenet_1_0_224_tf.h5
17227776/17225924 [==============================] - 2s 0us/step
Compiling network...Epoch 1/1
INFO:plaidml:Analyzing Ops: 2213 of 2804 operations complete
16/16 [==============================] - 11s 696ms/step - loss: 10.1056 - acc: 0.0000e+00
Warming up...Epoch 1/1
32/32 [==============================] - 1s 36ms/step - loss: 8.9861 - acc: 0.0000e+00
Running...
Epoch 1/1
1024/1024 [==============================] - 37s 36ms/step - loss: 3.5184 - acc: 0.1035
Example finished, elapsed: 11.346s (compile), 37.112s (execution)
-----------------------------------------------------------------------------------------
Network Name Inference Latency Time / FPS
-----------------------------------------------------------------------------------------
mobilenet 36.24 ms 0.00 ms / 1000000000.00 fps
Correctness: untested. Could not find golden data to compare against.
(plaidml) ➜ ~
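In both runs above, the "Inference Latency" figure matches the execution time divided by the 1024 examples, while the "0.00 ms / 1000000000.00 fps" column does not appear to carry useful information. A small sketch of that arithmetic, with the numbers copied from the two logs. The function names are made up for these notes.

```python
def latency_ms(execution_seconds, examples):
    """Average time per example, in milliseconds."""
    return execution_seconds / examples * 1000.0


def throughput(execution_seconds, examples):
    """Examples processed per second."""
    return examples / execution_seconds


# Execution times copied from the two plaidbench runs above (1024 examples each).
for label, seconds in [("inference", 14.776), ("training", 37.112)]:
    print(label,
          round(latency_ms(seconds, 1024), 2), "ms per example,",
          round(throughput(seconds, 1024), 1), "examples/s")
```

This reproduces the 14.43 ms and 36.24 ms shown in the tables.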
(plaidml) ➜ Development python dl-test.py
Using plaidml.keras.backend backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 2s 0us/step
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 17s 277us/step - loss: 0.2737 - acc: 0.9167 - val_loss: 0.0598 - val_acc: 0.9809
Epoch 2/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0894 - acc: 0.9737 - val_loss: 0.0388 - val_acc: 0.9871
Epoch 3/12
60000/60000 [==============================] - 13s 215us/step - loss: 0.0658 - acc: 0.9806 - val_loss: 0.0372 - val_acc: 0.9870
Epoch 4/12
60000/60000 [==============================] - 13s 213us/step - loss: 0.0549 - acc: 0.9832 - val_loss: 0.0316 - val_acc: 0.9893
Epoch 5/12
60000/60000 [==============================] - 13s 215us/step - loss: 0.0470 - acc: 0.9855 - val_loss: 0.0290 - val_acc: 0.9906
Epoch 6/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0419 - acc: 0.9872 - val_loss: 0.0324 - val_acc: 0.9889
Epoch 7/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0376 - acc: 0.9887 - val_loss: 0.0279 - val_acc: 0.9918
Epoch 8/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0345 - acc: 0.9897 - val_loss: 0.0307 - val_acc: 0.9904
Epoch 9/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0318 - acc: 0.9903 - val_loss: 0.0275 - val_acc: 0.9913
Epoch 10/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0304 - acc: 0.9909 - val_loss: 0.0304 - val_acc: 0.9914
Epoch 11/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0288 - acc: 0.9909 - val_loss: 0.0289 - val_acc: 0.9920
Epoch 12/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0262 - acc: 0.9920 - val_loss: 0.0291 - val_acc: 0.9909
Test loss: 0.02908747911453247
Test accuracy: 0.9909
(plaidml) ➜ Development
(plaidml) ➜ Development cat dl-test.py
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
os.environ["RUNFILES_DIR"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
os.environ["PLAIDML_NATIVE_PATH"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"
# Don't use tensorflow.keras anywhere; use the standalone keras package instead
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
batch_size = 128
num_classes = 10
epochs = 12
# input image dimensions
img_rows, img_cols = 28, 28
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
(plaidml) ➜ Development
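The to_categorical step in dl-test.py turns integer class labels into one-hot rows. A plain-Python sketch of the same idea, for illustration only. The helper one_hot is made up for these notes, and keras.utils.to_categorical returns a NumPy array rather than nested lists.

```python
def one_hot(labels, num_classes):
    """Return one row per label with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label {} outside 0..{}".format(label, num_classes - 1))
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


print(one_hot([2, 0], 3))  # [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
```

These rows are the target format that categorical_crossentropy expects alongside the 10-way softmax output.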
(plaidml) ➜ Development python dl-vgg.py
Using plaidml.keras.backend backend.
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 10s 0us/step
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5
574717952/574710816 [==============================] - 61s 0us/step
Running initial batch (compiling tile program)
Timing inference...
Ran in 2.783154249191284 seconds
(plaidml) ➜ Development
(plaidml) ➜ Development cat dl-vgg.py
import numpy as np
import os
import time
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
import keras
import keras.applications as kapp
from keras.datasets import cifar10
(x_train, y_train_cats), (x_test, y_test_cats) = cifar10.load_data()
batch_size = 8
x_train = x_train[:batch_size]
x_train = np.repeat(np.repeat(x_train, 7, axis=1), 7, axis=2)
model = kapp.VGG19()
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
print("Running initial batch (compiling tile program)")
y = model.predict(x=x_train, batch_size=batch_size)
# Now start the clock and run 10 batches
print("Timing inference...")
start = time.time()
for i in range(10):
    y = model.predict(x=x_train, batch_size=batch_size)
print("Ran in {} seconds".format(time.time() - start))
(plaidml) ➜ Development
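dl-vgg.py runs one untimed batch first, because the first predict call includes compiling the Tile program, and then times ten more. The same warm-up-then-time pattern as a reusable sketch, using only the standard library. The helper time_calls is made up for these notes, and fn stands in for something like lambda: model.predict(x=x_train, batch_size=batch_size).

```python
import time


def time_calls(fn, warmup=1, runs=10):
    """Call fn `warmup` times untimed, then return total seconds for `runs` timed calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with the model call being measured.
    total = time_calls(lambda: sum(range(100000)))
    print("Ran in {:.4f} seconds ({:.4f} s per call)".format(total, total / 10))
```

With the numbers logged above, 2.783 seconds over 10 batches of 8 images is about 0.278 seconds per batch, or roughly 35 ms per image.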