AI-Labs | Sharad Ahlawat

AI-Labs


MBP AI rig

https://www.intel.ai/plaidml/
Note: The 2.3.0 release will be the last major release of multi-backend Keras. Multi-backend Keras is superseded by tf.keras. Bugs present in multi-backend Keras will only be fixed until April 2020 (as part of minor releases).

So using Keras with PlaidML will require https://www.ngraph.ai in the future

conda create --name plaidml
conda activate plaidml
pip install pybind11 pyopencl plaidml-keras plaidbench

Set up shell variables:
export KERAS_BACKEND="plaidml.keras.backend"
export RUNFILES_DIR="/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
export PLAIDML_NATIVE_PATH="/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"

OR set variables in your code:
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
os.environ["RUNFILES_DIR"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
os.environ["PLAIDML_NATIVE_PATH"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"

#alternate file locations in different scenarios:
#os.environ["RUNFILES_DIR"] = "/usr/local/share/plaidml"
#os.environ["PLAIDML_NATIVE_PATH"] = "/usr/local/lib/libplaidml.dylib"
#os.environ["RUNFILES_DIR"] = "/Library/Frameworks/Python.framework/Versions/3.7/share/plaidml"
#os.environ["PLAIDML_NATIVE_PATH"] = "/Library/Frameworks/Python.framework/Versions/3.7/lib/libplaidml.dylib"
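
The paths above vary between installs. A minimal sketch that picks the first location pair that exists on disk, assuming only the candidate paths listed in these notes; `pick_plaidml_paths` and `configure_plaidml_env` are hypothetical helper names, not part of PlaidML:

```python
import os

# Candidate (RUNFILES_DIR, PLAIDML_NATIVE_PATH) pairs, copied from the notes above.
CANDIDATES = [
    ("/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml",
     "/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"),
    ("/usr/local/share/plaidml",
     "/usr/local/lib/libplaidml.dylib"),
    ("/Library/Frameworks/Python.framework/Versions/3.7/share/plaidml",
     "/Library/Frameworks/Python.framework/Versions/3.7/lib/libplaidml.dylib"),
]


def pick_plaidml_paths(candidates=CANDIDATES, exists=os.path.exists):
    """Return the first (runfiles_dir, native_lib) pair where both paths exist, else None."""
    for runfiles_dir, native_lib in candidates:
        if exists(runfiles_dir) and exists(native_lib):
            return runfiles_dir, native_lib
    return None


def configure_plaidml_env(environ=os.environ, **kwargs):
    """Set the three variables used above; call this before `import keras`."""
    environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    paths = pick_plaidml_paths(**kwargs)
    if paths is not None:
        environ["RUNFILES_DIR"], environ["PLAIDML_NATIVE_PATH"] = paths
    return paths
```

Calling `configure_plaidml_env()` at the top of a script would replace the three hard-coded `os.environ` lines; if it returns `None`, none of the known locations were found.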


(plaidml) ➜ ~ plaidml-setup

PlaidML Setup (0.7.0)

Thanks for using PlaidML!

The feedback we have received from our users indicates an ever-increasing need
for performance, programmability, and portability. During the past few months,
we have been restructuring PlaidML to address those needs. To make all the
changes we need to make while supporting our current user base, all development
of PlaidML has moved to a branch — plaidml-v1. We will continue to maintain and
support the master branch of PlaidML and the stable 0.7.0 release.

Read more here: https://github.com/plaidml/plaidml

Some Notes:
* Bugs and other issues: https://github.com/plaidml/plaidml/issues
* Questions: https://stackoverflow.com/questions/tagged/plaidml
* Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
* PlaidML is licensed under the Apache License 2.0


Default Config Devices:
llvm_cpu.0 : CPU (via LLVM)
metal_intel(r)_uhd_graphics_630.0 : Intel(R) UHD Graphics 630 (Metal)
metal_amd_radeon_pro_5500m.0 : AMD Radeon Pro 5500M (Metal)

Experimental Config Devices:
llvm_cpu.0 : CPU (via LLVM)
opencl_amd_radeon_pro_5500m_compute_engine.0 : AMD AMD Radeon Pro 5500M Compute Engine (OpenCL)
opencl_intel_uhd_graphics_630.0 : Intel Inc. Intel(R) UHD Graphics 630 (OpenCL)
metal_intel(r)_uhd_graphics_630.0 : Intel(R) UHD Graphics 630 (Metal)
metal_amd_radeon_pro_5500m.0 : AMD Radeon Pro 5500M (Metal)

Using experimental devices can cause poor performance, crashes, and other nastiness.

Enable experimental device support? (y,n)[n]:

Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:

1 : llvm_cpu.0
2 : metal_intel(r)_uhd_graphics_630.0
3 : metal_amd_radeon_pro_5500m.0

Default device? (1,2,3)[1]:3

Selected device:
metal_amd_radeon_pro_5500m.0

Almost done. Multiplying some matrices...
Tile code:
function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.

Save settings to /Users/ahlawat/.plaidml? (y,n)[y]:y
Success!
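
The Tile function printed by plaidml-setup is a matrix multiply: A[x,y] sums B[x,z] * C[z,y] over the shared index z. A plain-Python sketch of the same contraction, for illustration only; `tile_matmul` is a made-up name, and PlaidML compiles the Tile code for the selected device rather than running loops like these:

```python
def tile_matmul(B, C):
    """Compute A[x][y] = sum over z of B[x][z] * C[z][y] for nested lists."""
    X, Z = len(B), len(B[0])
    if len(C) != Z:
        raise ValueError("inner dimensions differ")
    Y = len(C[0])
    A = [[0] * Y for _ in range(X)]
    for x in range(X):
        for y in range(Y):
            # "+(...)" in Tile is a sum aggregation over the unbound index z.
            A[x][y] = sum(B[x][z] * C[z][y] for z in range(Z))
    return A


print(tile_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```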


(plaidml) ➜ ~ plaidbench keras mobilenet
Running 1024 examples with mobilenet, batch size 1, on backend plaid
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 0.392s (compile), 14.776s (execution)

-----------------------------------------------------------------------------------------
Network Name Inference Latency Time / FPS
-----------------------------------------------------------------------------------------
mobilenet 14.43 ms 0.00 ms / 1000000000.00 fps
Correctness: PASS, max_error: 1.675534622336272e-05, max_abs_error: 7.674098014831543e-07, fail_ratio: 0.0
(plaidml) ➜ ~


(plaidml) ➜ ~ plaidbench --batch-size 16 keras --train mobilenet
Running 1024 examples with mobilenet, batch size 16, on backend plaid
Loading CIFAR data
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.6/mobilenet_1_0_224_tf.h5
17227776/17225924 [==============================] - 2s 0us/step
Compiling network...Epoch 1/1
INFO:plaidml:Analyzing Ops: 2213 of 2804 operations complete
16/16 [==============================] - 11s 696ms/step - loss: 10.1056 - acc: 0.0000e+00
Warming up...Epoch 1/1
32/32 [==============================] - 1s 36ms/step - loss: 8.9861 - acc: 0.0000e+00
Running...
Epoch 1/1
1024/1024 [==============================] - 37s 36ms/step - loss: 3.5184 - acc: 0.1035
Example finished, elapsed: 11.346s (compile), 37.112s (execution)

-----------------------------------------------------------------------------------------
Network Name Inference Latency Time / FPS
-----------------------------------------------------------------------------------------
mobilenet 36.24 ms 0.00 ms / 1000000000.00 fps
Correctness: untested. Could not find golden data to compare against.
(plaidml) ➜ ~
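
The "Time / FPS" column in both plaidbench tables shows what looks like a placeholder (0.00 ms / 1000000000.00 fps), so throughput has to be worked out from the elapsed times. A small sketch of that arithmetic, using only the figures printed above; the function names are illustrative:

```python
def examples_per_second(examples, seconds):
    """Throughput from an example count and a wall-clock duration."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return examples / seconds


def ms_per_example(examples, seconds):
    """Average latency in milliseconds per example."""
    return 1000.0 * seconds / examples


# Example counts and execution times copied from the two plaidbench runs above.
runs = {
    "inference, batch size 1": (1024, 14.776),
    "training, batch size 16": (1024, 37.112),
}
for name, (n, secs) in runs.items():
    print(f"{name}: {ms_per_example(n, secs):.2f} ms/example, "
          f"{examples_per_second(n, secs):.1f} examples/s")
```

The per-example latencies computed this way agree with the "Inference Latency" column (14.43 ms and 36.24 ms).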


(plaidml) ➜ Development python dl-test.py
Using plaidml.keras.backend backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 2s 0us/step
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 17s 277us/step - loss: 0.2737 - acc: 0.9167 - val_loss: 0.0598 - val_acc: 0.9809
Epoch 2/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0894 - acc: 0.9737 - val_loss: 0.0388 - val_acc: 0.9871
Epoch 3/12
60000/60000 [==============================] - 13s 215us/step - loss: 0.0658 - acc: 0.9806 - val_loss: 0.0372 - val_acc: 0.9870
Epoch 4/12
60000/60000 [==============================] - 13s 213us/step - loss: 0.0549 - acc: 0.9832 - val_loss: 0.0316 - val_acc: 0.9893
Epoch 5/12
60000/60000 [==============================] - 13s 215us/step - loss: 0.0470 - acc: 0.9855 - val_loss: 0.0290 - val_acc: 0.9906
Epoch 6/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0419 - acc: 0.9872 - val_loss: 0.0324 - val_acc: 0.9889
Epoch 7/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0376 - acc: 0.9887 - val_loss: 0.0279 - val_acc: 0.9918
Epoch 8/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0345 - acc: 0.9897 - val_loss: 0.0307 - val_acc: 0.9904
Epoch 9/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0318 - acc: 0.9903 - val_loss: 0.0275 - val_acc: 0.9913
Epoch 10/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0304 - acc: 0.9909 - val_loss: 0.0304 - val_acc: 0.9914
Epoch 11/12
60000/60000 [==============================] - 13s 218us/step - loss: 0.0288 - acc: 0.9909 - val_loss: 0.0289 - val_acc: 0.9920
Epoch 12/12
60000/60000 [==============================] - 13s 219us/step - loss: 0.0262 - acc: 0.9920 - val_loss: 0.0291 - val_acc: 0.9909
Test loss: 0.02908747911453247
Test accuracy: 0.9909
(plaidml) ➜ Development


(plaidml) ➜ Development cat dl-test.py
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
os.environ["RUNFILES_DIR"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/share/plaidml"
os.environ["PLAIDML_NATIVE_PATH"] = "/Users/ahlawat/opt/miniconda3/envs/plaidml/lib/libplaidml.dylib"

# Don't use tensorflow.keras anywhere, instead use keras

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
batch_size = 128
num_classes = 10
epochs = 12
# input image dimensions
img_rows, img_cols = 28, 28
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
(plaidml) ➜ Development
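
To sanity-check the network in dl-test.py, the layer sizes can be worked out by hand. A sketch of that arithmetic, assuming the Keras defaults the script relies on (unpadded "valid" convolutions with stride 1, and pooling stride equal to the pool size); `conv_out` is an illustrative helper:

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' (unpadded) convolution or pooling window."""
    return (size - kernel) // stride + 1


side = 28                            # img_rows = img_cols = 28
side = conv_out(side, 3)             # Conv2D(32, 3x3)   -> 26
side = conv_out(side, 3)             # Conv2D(64, 3x3)   -> 24
side = conv_out(side, 2, stride=2)   # MaxPooling2D(2x2) -> 12
flat = side * side * 64              # Flatten -> 12 * 12 * 64 features

# Weights plus biases for each trainable layer.
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense1": flat * 128 + 128,
    "dense2": 128 * 10 + 10,
}
total_params = sum(params.values())
print(flat, total_params)
```

The total should match what `model.summary()` reports for this model.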


(plaidml) ➜ Development python dl-vgg.py
Using plaidml.keras.backend backend.
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 10s 0us/step
INFO:plaidml:Opening device "metal_amd_radeon_pro_5500m.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5
574717952/574710816 [==============================] - 61s 0us/step
Running initial batch (compiling tile program)
Timing inference...
Ran in 2.783154249191284 seconds
(plaidml) ➜ Development


(plaidml) ➜ Development cat dl-vgg.py
import numpy as np
import os
import time

os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"

import keras
import keras.applications as kapp
from keras.datasets import cifar10

(x_train, y_train_cats), (x_test, y_test_cats) = cifar10.load_data()
batch_size = 8
x_train = x_train[:batch_size]
x_train = np.repeat(np.repeat(x_train, 7, axis=1), 7, axis=2)
model = kapp.VGG19()
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

print("Running initial batch (compiling tile program)")
y = model.predict(x=x_train, batch_size=batch_size)

# Now start the clock and run 10 batches
print("Timing inference...")
start = time.time()
for i in range(10):
    y = model.predict(x=x_train, batch_size=batch_size)
print("Ran in {} seconds".format(time.time() - start))
(plaidml) ➜ Development
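
The timing loop in dl-vgg.py reports only total seconds. A small reusable sketch that also derives a rate; `time_batches` and its `clock` parameter are illustrative names, and the last line just redoes the arithmetic for the run above (10 batches of 8 images):

```python
import time


def time_batches(run_batch, n_batches, batch_size, clock=time.perf_counter):
    """Time n_batches calls of run_batch() and return (seconds, items_per_second)."""
    start = clock()
    for _ in range(n_batches):
        run_batch()
    elapsed = clock() - start
    rate = (n_batches * batch_size) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


# Figures from the run above: 10 batches of 8 images in about 2.78 s.
images_per_second = (10 * 8) / 2.783154249191284
print(f"{images_per_second:.1f} images/s")
```

In dl-vgg.py this could be called as `time_batches(lambda: model.predict(x=x_train, batch_size=batch_size), 10, batch_size)` after the warm-up batch.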

PC AI rig

Repurpose the gaming rig as an AI/ML/DL development and testing machine.

For a FreeBSD jail configuration check out - https://diyit.org/jails/mage.html

Install:
- Ubuntu 18.04 server from USB drive
- For Ubuntu 20.04 desktop check notes at end

Netplan:
ahlawat@game:/etc/netplan$ cat 01-netcfg.yaml

# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  wifis:
    wlo1:
      dhcp4: true
      dhcp6: true
      access-points:
        "sadmzas-a":
          password: "your key"


Missing wpasupplicant:
- download these two packages to a USB drive and mount it on the server
https://packages.ubuntu.com/bionic/wpasupplicant
https://packages.ubuntu.com/bionic/libpcsclite1
- sudo dpkg -i *.deb - install both packages
- sudo netplan apply - to connect wifi

ZFS Setup:
- apt install zfsutils-linux
- zpool create -f tank mirror /dev/sde /dev/sdf

Docker Setup:
- Create apt source-list file: /etc/apt/sources.list.d/docker.list

deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable

- sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 7EA0A9C3F273FCD8
- sudo apt update
- sudo apt install docker-ce
- sudo usermod -aG docker $USER
 
Docker with ZFS: 
- sudo service docker stop
- sudo cp -au /var/lib/docker /var/lib/docker.bk
- sudo rm -rf /var/lib/docker/*
- zfs create tank/docker
- zfs set mountpoint=/var/lib/docker tank/docker
- Edit /etc/docker/daemon.json

{
  "storage-driver": "zfs"
}

- service docker start
- docker info | grep zfs
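
If /etc/docker/daemon.json already holds other settings, overwriting it with the snippet above would drop them. A minimal sketch that merges the storage driver into whatever is there; `with_storage_driver` is a hypothetical helper, and writing the result back to the file (as root) is left out:

```python
import json


def with_storage_driver(daemon_json_text, driver="zfs"):
    """Return daemon.json text with "storage-driver" set, keeping other keys."""
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["storage-driver"] = driver
    return json.dumps(config, indent=2) + "\n"


# An empty or missing file yields exactly the snippet shown above.
print(with_storage_driver(""))
```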
 
 TurboVNC Setup:
- curl -L -o turbovnc_2.2.5_amd64.deb https://sourceforge.net/projects/turbovnc/files/2.2.5/turbovnc_2.2.5_amd64.deb/download
- sudo apt install gdebi-core
- sudo gdebi turbovnc_2.2.5_amd64.deb
- sudo apt install libsm6 x11-xkb-utils fluxbox xterm firefox
- /opt/TurboVNC/bin/vncserver -name game -geometry 1920x1080 :4
- Edit ~/.vnc/xstartup.turbovnc - replace references to "twm" with "fluxbox"
- sudo killall Xvnc; /opt/TurboVNC/bin/vncserver -name game -geometry 1920x1080 :4

Other Packages:
- sudo apt install tmux mc
- ahlawat@game:~$ cat .tmux.conf

unbind C-b
set -g prefix C-a
bind C-a send-prefix
setw -g mouse on
# Set the default terminal mode to 256color mode
set -g default-terminal "screen-256color"
# enable activity alerts
setw -g monitor-activity on
set -g visual-activity on

 
NVIDIA Docker:
- reference: https://github.com/NVIDIA/nvidia-docker#quickstart
- distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
- curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- sudo apt update && sudo apt install -y nvidia-container-toolkit
- sudo service docker restart
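
The `distribution=$(. /etc/os-release;echo $ID$VERSION_ID)` step builds the repo name (for example ubuntu18.04) from two fields of /etc/os-release. A Python sketch of the same lookup, handy for checking what string will be used; `distribution_id` is an illustrative name and only simple KEY=value lines are handled:

```python
import shlex


def distribution_id(os_release_text):
    """Mimic `. /etc/os-release; echo $ID$VERSION_ID` for simple KEY=value files."""
    values = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips shell-style quotes
        values[key] = parts[0] if parts else ""
    return values.get("ID", "") + values.get("VERSION_ID", "")


sample = 'NAME="Ubuntu"\nID=ubuntu\nID_LIKE=debian\nVERSION_ID="18.04"\n'
print(distribution_id(sample))  # ubuntu18.04
```

On the machine itself, `distribution_id(open("/etc/os-release").read())` would give the value substituted into the nvidia-docker list URL.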

NVIDIA CUDA: 
- reference: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
- wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
- sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
- wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
- sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
- sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
- sudo apt update
- sudo apt -y install cuda
- nvidia-container-cli info

NVRM version:   440.33.01
CUDA version:   10.2
Device Index:   0
Device Minor:   0
Model:          GeForce GTX 1080 Ti
Brand:          GeForce
GPU UUID:       GPU-0707abd3-94dc-f089-3977-6c23d8f3a540
Bus Location:   00000000:01:00.0
Architecture:   6.1

- docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
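
The `nvidia-container-cli info` report is plain "Key: value" text, so it is easy to check from a script. A sketch of a parser, fed a few lines from the output above; `parse_key_value_report` is a made-up helper, not an NVIDIA tool:

```python
def parse_key_value_report(text):
    """Parse 'Key:   value' lines into a dict, skipping lines without a colon."""
    report = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as bus IDs keep theirs.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            report[key.strip()] = value.strip()
    return report


sample = """\
NVRM version:   440.33.01
CUDA version:   10.2
Model:          GeForce GTX 1080 Ti
Bus Location:   00000000:01:00.0
Architecture:   6.1
"""
info = parse_key_value_report(sample)
print(info["Model"], "CUDA", info["CUDA version"])
```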
 
Python Setup: 
- reference: https://www.anaconda.com/distribution/
- wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
- chmod 711 Anaconda3-2019.10-Linux-x86_64.sh
- ./Anaconda3-2019.10-Linux-x86_64.sh
- conda update -n base -c defaults conda
- conda install ipykernel

TensorFlow environment:
- conda create -y --name tfgpu python=3
- conda activate tfgpu
- conda install -y -c jupyter nb_conda
- conda install tensorflow-gpu
- python -c "import tensorflow as tf; print(tf.config.experimental.list_physical_devices('GPU'))"
 
Pytorch environment:
- conda create -y --name pytorch python=3
- conda activate pytorch
- conda install -y -c jupyter nb_conda
- conda install -y -c pytorch torchvision cudatoolkit pytorch
- python -c "import torch; print(torch.cuda.get_device_name(0))"

Jupyter Notebook environment:
- conda create -y --name jnb python=3
- conda activate jnb
- conda install -y -c jupyter nb_conda
- conda install -y numpy matplotlib
- jupyter notebook --generate-config
- Edit config-file

c.NotebookApp.allow_remote_access = True
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False

- launch and connect to http://game:8888
- jupyter notebook &

Add environments to Jupyter
- python -m ipykernel install --user --name=tfgpu
- python -m ipykernel install --user --name=pytorch
- python -m ipykernel install --user --name=jnb

Visual Studio Code:
- wget "https://go.microsoft.com/fwlink/?LinkID=760868" -O vs.deb
- sudo gdebi vs.deb

Ubuntu 20.04 Desktop version
- Ubuntu 20.04 desktop from USB drive
- for desktop VNC from Mac/Windows, disable Vino's incompatible encryption: $ gsettings set org.gnome.Vino require-encryption false
- enable screen sharing and automatic user login under settings
- Nvidia CUDA, get commands from here - https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=2004&target_type=deblocal


Current: 7/2020
- Anaconda - wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

FreeBSD in grub

root@ahlawat-u20:/etc/grub.d# cat 40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.

menuentry "FreeBSD" {
set root=(hd1,2)
kfreebsd /boot/loader
}
root@ahlawat-u20:/etc/grub.d#

grub-mkconfig -o /boot/grub/grub.cfg


(base) ahlawat@ahlawat-u20:~/.vnc$ cat xstartup.turbovnc
#!/bin/sh

unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
XDG_SESSION_TYPE=x11; export XDG_SESSION_TYPE

OS=`uname -s`

which fluxbox >/dev/null && {
if [ -f $HOME/.Xresources ]; then xrdb $HOME/.Xresources; fi
xsetroot -solid grey
xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
fluxbox
} || {
echo "fluxbox not found. I give up."
exit 1
}
(base) ahlawat@ahlawat-u20:~/.vnc$


NVIDIA Jetson

to stop the “Desktop Sharing” panel from crashing
 
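$ sudo vi /usr/share/glib-2.0/schemas/org.gnome.Vino.gschema.xml

Add the missing "enabled" key to the schema:

    <key name='enabled' type='b'>
      <summary>Enable remote access to the desktop</summary>
      <description>
        If true, allows remote access to the desktop via the RFB
        protocol. Users on remote machines may then connect to the
        desktop using a VNC viewer.
      </description>
      <default>false</default>
    </key>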

 
$ sudo glib-compile-schemas /usr/share/glib-2.0/schemas
 
export DISPLAY=:0
gsettings set org.gnome.Vino enabled true
gsettings set org.gnome.Vino prompt-enabled false
gsettings set org.gnome.Vino require-encryption false
 
$ /usr/lib/vino/vino-server
 
OR
 
$ sudo apt-get install xrdp cmake git
 
https://github.com/JetsonHacksNano/CSI-Camera
sudo apt-get install v4l-utils
gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM),width=3280, height=2464, framerate=21/1, format=NV12' ! nvvidconv flip-method=0 ! 'video/x-raw,width=960, height=616' ! nvvidconv ! nvegltransform ! nveglglessink -e
 
 
https://developer.nvidia.com/embedded/twodaystoademo
$ git clone https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ git submodule update --init
$ mkdir build
$ cd build
$ cmake ../
$ make
$ sudo make install
$ cd aarch64/bin
https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-console-2.md
$ ./detectnet-console ~/dog-simba.jpeg ~/out.jpeg coco-dog
https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-camera-2.md
$ ./detectnet-camera coco-dog
 
 
https://www.jetsonhacks.com/2019/04/25/jetson-nano-run-on-usb-drive/
https://medium.com/@jerry_liang/deploy-gpu-enabled-kubernetes-pod-on-nvidia-jetson-nano-ce738e3bcda9
 
https://developer.nvidia.com/embedded/community/jetson-projects
 
 
 
VINO : Enable VNC Server Remotely
#!/bin/sh
systemctl disable xrdp.service
mkdir /root/.config/autostart
touch /root/.config/autostart/vino-server.desktop
echo "[Desktop Entry]" >> /root/.config/autostart/vino-server.desktop
echo "Type=Application" >> /root/.config/autostart/vino-server.desktop
echo "Name=Vino VNC server" >> /root/.config/autostart/vino-server.desktop
echo "Exec=/usr/lib/vino/vino-server" >> /root/.config/autostart/vino-server.desktop
echo "NoDisplay=true" >> /root/.config/autostart/vino-server.desktop
dbus-launch gsettings set org.gnome.Vino require-encryption false
dbus-launch gsettings set org.gnome.Vino prompt-enabled false
dbus-launch gsettings set org.gnome.Vino notify-on-connect false
dbus-launch gsettings set org.gnome.Vino authentication-methods "['vnc']"
dbus-launch gsettings set org.gnome.Vino vnc-password $(echo -n "password"|base64)
dbus-launch gsettings set org.gnome.desktop.lockdown disable-user-switching true
dbus-launch gsettings set org.gnome.desktop.lockdown disable-lock-screen true
dbus-launch gsettings set org.gnome.desktop.lockdown disable-log-out true
dbus-launch gsettings set org.gnome.desktop.interface enable-animations false
dbus-launch gsettings set org.gnome.desktop.session session-name gnome
sed -i 's/#  AutomaticLoginEnable = true/  AutomaticLoginEnable = true/g' /etc/gdm3/daemon.conf
sed -i 's/#  AutomaticLogin = root/  AutomaticLogin = root/g' /etc/gdm3/daemon.conf
reboot
exit 0
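
Two things in the script above are worth seeing in isolation: the vnc-password value is just the password run through base64 (an encoding, not encryption, so anyone who can read the setting can recover it), and the echo lines build a small .desktop file. A Python sketch of both; the function names are illustrative:

```python
import base64


def vino_password_value(password):
    """Base64-encode a VNC password the way `echo -n ... | base64` does."""
    return base64.b64encode(password.encode("utf-8")).decode("ascii")


def autostart_entry(exec_path="/usr/lib/vino/vino-server"):
    """Build the same .desktop text that the echo lines above append."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=Vino VNC server",
        f"Exec={exec_path}",
        "NoDisplay=true",
    ]
    return "\n".join(lines) + "\n"


print(vino_password_value("password"))  # cGFzc3dvcmQ=
print(autostart_entry(), end="")
```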
 



https://github.com/dusty-nv/jetson-inference/blob/master/README.md#two-days-to-a-demo-training--inference
 
https://github.com/dusty-nv/jetson-inference/blob/master/docs/digits-setup.md
 
ubuntu-18.04.2-server-amd64.iso
 
https://docs.nvidia.com/sdk-manager/download-run-sdkm/index.html
 
scp sdkmanager_0.9.11-3405_amd64.deb htpc:
 
$ sudo apt install ./sdkmanager_0.9.11-3405_amd64.deb
$ sudo apt install libxss1 libnss3

https://github.com/dusty-nv/jetson-inference/blob/master/docs/digits-setup.md
https://github.com/NVIDIA/nvidia-docker#quick-start
## https://ngc.nvidia.com/catalog/containers/nvidia:digits
## https://devblogs.nvidia.com/gpu-containers-runtime/
 
OR manual below
 
https://github.com/dusty-nv/jetson-inference/blob/master/docs/digits-native.md
 
sudo add-apt-repository ppa:graphics-drivers
$ sudo apt install nvidia-410
$ sudo reboot
 
$ lsmod | grep nvidia
 
$ cd /usr/local/cuda/samples
$ sudo make "-j$(nproc)"
$ cd bin/x86_64/linux/release/
$ ./deviceQuery
$ ./bandwidthTest --memory=pinned
 
https://developer.nvidia.com/cudnn
 
ahlawat@htpc:~$ ls -al libcudnn7*
-rw-r--r-- 1 ahlawat ahlawat 151792832 May 19 20:12 libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
-rw-r--r-- 1 ahlawat ahlawat 140165264 May 19 20:12 libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
-rw-r--r-- 1 ahlawat ahlawat   5173724 May 19 20:12 libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
ahlawat@htpc:~$
 
$ sudo apt install libcudnn*

 
Install required CMake version
https://github.com/clab/dynet/issues/1457#issuecomment-423931508
Needs CMake upgraded to 3.12.2
wget http://www.cmake.org/files/v3.12/cmake-3.12.2.tar.gz
tar -xvzf cmake-3.12.2.tar.gz
cd cmake-3.12.2
./configure
make "-j$(nproc)"
sudo make install
sudo update-alternatives --install /usr/bin/cmake cmake /usr/local/bin/cmake 1 --force
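
Before building from source it is worth checking whether the installed CMake already meets the 3.12.2 requirement noted above. A sketch that parses `cmake --version` output; the helper names are illustrative and the sample strings are stand-ins for real output:

```python
import re


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no CMake version found")
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(version_output, minimum=(3, 12, 2)):
    """True if the reported CMake version is at least `minimum`."""
    return parse_cmake_version(version_output) >= minimum


print(meets_minimum("cmake version 3.10.2"))  # False
print(meets_minimum("cmake version 3.12.2"))  # True
```

The real output can be captured with `subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout`.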
 
 
https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/BuildDigits.md
sudo apt install --no-install-recommends git graphviz python-dev python-flask python-flaskext.wtf python-gevent python-h5py python-numpy python-pil python-pip python-scipy python-tk
 

Protobuf
sudo apt install autoconf automake libtool curl make g++ git python-dev python-setuptools unzip
 
export PROTOBUF_ROOT=~/protobuf
git clone https://github.com/google/protobuf.git $PROTOBUF_ROOT -b '3.2.x'
cd $PROTOBUF_ROOT
./autogen.sh
./configure
make "-j$(nproc)"
sudo make install
sudo ldconfig
cd python
sudo python setup.py install --cpp_implementation
 
 
Caffe
sudo apt install --no-install-recommends build-essential cmake git gfortran libatlas-base-dev libboost-filesystem-dev libboost-python-dev libboost-system-dev libboost-thread-dev libgflags-dev libgoogle-glog-dev libhdf5-serial-dev libleveldb-dev liblmdb-dev libopencv-dev libsnappy-dev python-all-dev python-dev python-h5py python-matplotlib python-numpy python-opencv python-pil python-pip python-pydot python-scipy python-skimage python-sklearn
 
# example location - can be customized
export CAFFE_ROOT=~/caffe
git clone https://github.com/NVIDIA/caffe.git $CAFFE_ROOT -b 'caffe-0.15'
sudo pip install -r $CAFFE_ROOT/python/requirements.txt
cd $CAFFE_ROOT
mkdir build
cd build
cmake ..
 
https://devtalk.nvidia.com/default/topic/1037599/jetson-tx2/installation-of-caffe-error/
nano -w ../cmake/Dependencies.cmake
---  list(APPEND Caffe_LINKER_LIBS ${HDF5_LIBRARIES})
+++  list(APPEND Caffe_LINKER_LIBS ${HDF5_LIBRARIES} ${HDF5_HL_LIBRARIES})
 
make -j"$(nproc)"
sudo make install
 
$ nano -w  ~/.bashrc
export CAFFE_ROOT=/home/ahlawat/caffe
export PYTHONPATH=/home/ahlawat/caffe/python:$PYTHONPATH
 
Torch
https://github.com/nagadomi/waifu2x/issues/253#issuecomment-445448928
git clone https://github.com/nagadomi/distro.git ~/torch --recursive
cd ~/torch
./install-deps
./clean.sh
./update.sh
./install.sh -b
$ nano -w ~/.bashrc
export TORCH_ROOT=~/torch
export PATH=$PATH:~/torch/install/bin
$ source ~/.bashrc
 
Tensorflow
https://www.tensorflow.org/install/gpu
wget -O cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64
sudo apt install ./cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt update
sudo apt install cuda
sudo pip install tensorflow-gpu
 
 
Finally Digits
export DIGITS_ROOT=~/digits
git clone https://github.com/NVIDIA/DIGITS.git $DIGITS_ROOT
sudo pip install -r $DIGITS_ROOT/requirements.txt
sudo pip install -e $DIGITS_ROOT
### if ERROR sudo pip install numpy --upgrade
cd digits
./digits-devserver
 
 
 
-rw-r--r--  1 ahlawat ahlawat  65483980 May 19 15:18 sdkmanager_0.9.11-3405_amd64.deb
 
-rw-r--r--  1 ahlawat ahlawat 151792832 May 19 20:12 libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
-rw-r--r--  1 ahlawat ahlawat 140165264 May 19 20:12 libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
-rw-r--r--  1 ahlawat ahlawat   5173724 May 19 20:12 libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
-rw-rw-r--  1 ahlawat ahlawat 1660647860 Sep 12  2018 cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
 
-rw-rw-r--  1 ahlawat ahlawat   8388114 Sep  7  2018 cmake-3.12.2.tar.gz
 
 
https://docs.nvidia.com/deeplearning/digits/digits-release-notes/rel_19-01.html#rel_19-01
The container also includes the following:
• Ubuntu 16.04 including Python 2.7
• NVIDIA CUDA 10.0.130 including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 10.0.130
• NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.4.2
• NCCL 2.3.7 (optimized for NVLink™ )
• OpenMPI 3.1.3
• NVCaffe 0.17.2
• TensorFlow 1.12.0
• TensorRT 5.0.2
 

Gaming rig

CPU:
- Intel Core i5-9600K Coffee Lake 6-Core 3.7 GHz (4.6 GHz Turbo) (overclocked @ 5.0 GHz)

Memory:
- 4 x 16GB DDR4-2133 Virtium vl31a2g63f-n6sb (overclocked @ 2666)
- 4 x 4GB CORSAIR Dominator Platinum DDR4 3200MHz C16 ASUS ROG compatible (replaced in favor of above)

Graphics:
- ASUS ROG STRIX GeForce GTX 1080 TI, 11GB OC Edition
- EVGA GeForce GTX 750Ti, 2GB

5 Monitors:
- 3 x ASUS PA248Q
- 1 x ASUS PB278Q
- 1 x Dell P2314T TouchScreen

Motherboard:
- ASUS ROG STRIX Z390-E GAMING LGA 1151 Intel Motherboard

Power Supply:
- ASUS ROG Strix 750 Fully Modular 80 Plus Gold 750W ATX Power Supply

CPU Cooler:
- Noctua NH-U12A, NF-A12x25 PWM Fans (120mm)

Controllers:
- Razer BlackWidow Ultimate Keyboard
- Razer DeathAdder Chroma Mouse

Camera:
- Logitech HD Webcam C910

Gaming Controllers:
- CH Products Eclipse Yoke with 144 Programmable Functions with Control Manager Software
- CH Products Pro Pedals USB Flight Simulator Pedals (300-111)
- Razer Wildcat eSports Customizable Premium Controller for Xbox One W/ 4 Programmable Buttons

VR:
- Pimax 8K
- 2 x Vive 2.0 controllers
- 2 x Vive 1.0 lighthouses
- 1 x Vive deluxe audio strap

Cabinet:
- Rosewill ATX Mid Tower Case with Side Window, including 3 x 120mm Fans, 2 X USB 3.0 Ports (BRADLEY M)

Monitor Stand:
- Quad LCD Monitor Desk Stand Mount Free-Standing 3 + 1 = 4 / Holds Four Screens up to 27" (STAND-V004Z)

Windows 10:
- 1 x 240GB - Micron_M500_MTFDDAV240MAV, MU05, max UDMA/133
- 2 X 1TB - SanDisk Ultra 3D NAND SSD SATA III 6 Gb/s SDSSDH3-1T00-G25 (BIOS RAID stripe)

Ubuntu 18.04:
- 1 x 250GB - Samsung SSD 840PRO
- 2 x 750GB - Mediastor/Mediamax WL750GLSA854 (ZFS mirror)