TensorFlow on arm64

The VIM3L (Cortex A55 cores) I have has a NPU accelerator built-in. An interesting article about it is here, however before doing fancy NPU stuff, let’s get TensorFlow working first. Should be easy.

Famous last words. Turns out that “pip install tensorflow” does not work: on arm64 (AKA aarch64 AKA ARMv8) TensorFlow is not officially supported. So I had to compile it first.

Compiling TensorFlow

https://www.tensorflow.org/install/source described the compile process reasonable well. It is missing a lot of details though, so here is a more detailed walk-through. Start with a Ubuntu 20.xx image with an extra 70 GB disk for TensorFlow source code:

# One-time action: for the data disk, create a volume and a filesystem
# to mount under /data

sudo bash
pvcreate /dev/nvme1n1
vgcreate vg_data /dev/nvme1n1
lvcreate -L69G -n data vg_data
mke2fs -j /dev/vg_data/data
mkdir /data
echo -e '/dev/mapper/vg_data-data\t/data\text4\tdefaults\t0 1' >>/etc/fstab
mount /data
chown ubuntu:users /data
umount /data
exit

sudo apt update
sudo apt -y upgrade
sudo reboot

After a reboot, you now have a /data of about 70GB.

sudo apt -y install build-essential python3 python3-dev python3-venv pkg-config zip zlib1g-dev unzip curl tmux wget vim git htop liblapack3 libblas3 libhdf5-dev openjdk-11-jdk

# Get bazel

wget https://github.com/bazelbuild/bazel/releases/download/4.2.2/bazel-4.2.2-linux-arm64
chmod a+x bazel-4.2.2-linux-arm64
sudo cp bazel-4.2.2-linux-arm64 /usr/local/bin/bazel

# bazel uses ~/.cache/bazel

mkdir -p /data/.cache/bazel
ln -s /data/.cache/bazel ~/.cache/bazel

# Build a Python 3 virtual environment

python3 -m venv ~/venv
source ~/venv/bin/activate
pip install wheel packaging
pip install six mock numpy grpcio h5py
pip install keras_applications --no-deps
pip install keras_preprocessing --no-deps

# Get TensorFlow source

cd /data
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
git checkout r2.8
cd /data/tensorflow

./configure

# Build Python package:

bazel build -c opt \
--copt=-O3 \
--copt=-std=c++11 \
--copt=-funsafe-math-optimizations \
--copt=-ftree-vectorize \
--copt=-fomit-frame-pointer \
--copt=-DRASPBERRY_PI \
--host_copt=-DRASPBERRY_PI \
--verbose_failures \
--config=noaws \
--config=nogcp \
//tensorflow/tools/pip_package:build_pip_package

# Build Python whl:

BDIST_OPTS="--universal" bazel-bin/tensorflow/tools/pip_package/build_pip_package ~/tensorflow_pkg

# And for tfjs:
# (see https://github.com/tensorflow/tfjs/tree/master/tfjs-node)

bazel build --config=opt --config=monolithic //tensorflow/tools/lib_package:libtensorflow
# The result is at bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz

It does take a lot of time (about 2-3h for each the Python package and the tfjs-node library). When I tried 2 CPU and 8 GB RAM, some compiler runs were killed as they were running out of memory. 4 CPU and 16 GB RAM worked fine.Thus AWS m6g.xlarge recommended. m6g.large failed to build.

Using spot instances for the m6g.xlarge (regular $0.154/h, spot price $0.04/h) helped a bit to limit the financial impact.

Python and TensorFlow

It took me several tries:

  • When using Ubuntu 22.04 to compile TF, the resulting binary wanted GLIBC 2.35 which my VIM3L did not have. It had 2.31. It also used Python 3.10 to compile.
  • When using Ubuntu 20.04, it used Python 3.8 to compile. My VIM3L had Python 3.9. While GLIBC was fine, Python was not.
  • The created whl file could be loaded and used on the machine I compiled it on. No Python or GLIBC version problems here.

That covered all my Python needs. Now moving to the main target:

Node.js and TensorFlow

tfjs-node uses the libtensorflow.so library, so that should remove some of the CPython version problems I have seen. Compiling was easy now: https://github.com/tensorflow/tfjs/tree/master/tfjs-node#optional-build-optimal-tensorflow-from-source is spot on.

The biggest problem was to make Node.js understand to not use the non-existing arm64 pre-compiled library, but instead use the one I created. The instructions in the above link did not explain in enough details how to make this work. In hindsight it’s easy, but it took some tries to make me understand it. In short:

  • Do an “npm install –ignore script”
  • Add a file scripts/custom-binary.json into the modules directory for @tensorflow/tfjs-node (this gave me the hint)
  • Run “npm install” in the tfjs-node directory
  • That will download the tensorflow library archive
  • Now do the “npm install” where your application is (which is the only “npm install” you’d usually do)
❯ npm install --ignore-script
❯ pushd .
❯ cd node_modules/@tensorflow/tfjs-node/scripts
❯ cat >>custom-binary.json <<_EOF_
{
  "tf-lib": "https://MYSERVER.com/libtensorflow-2.8-arm64.tar.gz"
}
_EOF_
❯ cd ..
❯ npm install
[...]
> @tensorflow/tfjs-node@3.16.0 install
> node scripts/install.js

CPU-linux-3.16.0.tar.gz
* Downloading libtensorflow
https://MYSERVER/libtensorflow-2.8-arm64.tar.gz
[==============================] 3685756/bps 100% 0.0s
* Building TensorFlow Node.js bindings
[...]
❯ popd
❯ npm install

Benchmarks

As a benchmark I modified slightly server.js from the tfjs-examples/baseball-node to not listen to the port which means after the training it’ll exit. Then run this on the VIM3L (S905D3), my ThinkCentre m75q (Ryzen 5), and my HP T620 (GX-420CA) once with CPU backend (tfjs) and once with the C++ TF library (tfjs-node):

CPUBackendTime/s
Amlogic S905D3-N0N @ 1.9GHzcpu803
Amlogic S905D3-N0N @ 1.9GHztensorflow189
AMD Ryzen 5 PRO 3400GE @ 3.3GHzcpu122
AMD Ryzen 5 PRO 3400GE @ 3.3GHztensorflow36
AMD GX-420CA @ 2GHzcpu530
AMD GX-420CA @ 2 GHztensorflow119
All running Node.js 16.x

I did not expect Node.js to be just 4 times slower than C++. Really impressive. Still, using tfjs-node makes a lot of sense. While on x86_64 this was not an issue, with above instructions it’s doable on arm64 too.

TP-Link Kasa KC120 – Streaming without Kasa

The main problems I have with IoT devices are:

  • They might send data home without me knowing about it
    • But I can monitor their traffic pattern and if they send home way more data than expected, I could disconnect them
  • They might be vulnerable to exploits
    • But I can put them on a separate VLAN at home so they don’t see other devices unless I allow it (via firewall rules)
    • I can sometimes update firmware (definitely a problem after few years)
  • They stop to work when the company turns off their servers
    • I am able to use them without Internet connectivity

Most Kasa products I own (power switches) are supported by various projects like Home Assistant or python-kasa, so turning on my Kasa power switch on my own is a simple task. Same for my LIFX light bulbs there’s even an official API.

The TP-Link KC120 camera however does not have any supported local API and contrary to my expectation, it does not support a local stream mode via a web browser interface. I can watch a live (and local) video stream via the Kasa application on the phone, but that functionality is at the mercy of TP-Link. I don’t like that.

Following are the steps to have local streaming (resp. recording) for the KC120. And with that it’s possible to do whatever I’d like to do with the stream: publishing on the Internet, processing via OpenCV, local archiving etc.

python-kasa

python-kasa does not support the camera, so you won’t see it during a normal discovery:

❯ kasa
No host name given, trying discovery..
Discovering devices on 255.255.255.255 for 3 seconds
== Plug Three - HS105(JP) ==
        Host: 192.168.21.180
        Device state: OFF

        == Generic information ==
        Time:         2022-05-03 11:37:55 (tz: {'index': 90, 'err_code': 0}
        Hardware:     2.1
        Software:     1.0.3 Build 210506 Rel.161924
        MAC (rssi):   10:27:F5:XX:XX:XX (-62)
        Location:     {'latitude': XX.0, 'longitude': XX.0}

        == Device specific information ==
        LED state: True
        On since: None

        == Modules ==
        + <Module Schedule (schedule) for 192.168.21.130>
        + <Module Usage (schedule) for 192.168.21.130>
        + <Module Antitheft (anti_theft) for 192.168.21.130>
        + <Module Time (time) for 192.168.21.130>
        + <Module Cloud (cnCloud) for 192.168.21.130>

== Plug One - HS105(JP) ==
        Host: 192.168.21.182
        Device state: OFF

        == Generic information ==
        Time:         2022-05-03 11:37:55 (tz: {'index': 90, 'err_code': 0}
        Hardware:     1.0
        Software:     1.5.8 Build 191125 Rel.135255
        MAC (rssi):   B0:BE:76:XX:XX:XX (-54)
        Location:     {'latitude': XX.0, 'longitude': XX.0}

        == Device specific information ==
        LED state: True
        On since: None

        == Modules ==
        + <Module Schedule (schedule) for 192.168.21.182>
        + <Module Usage (schedule) for 192.168.21.182>
        + <Module Antitheft (anti_theft) for 192.168.21.182>
        + <Module Time (time) for 192.168.21.182>
        + <Module Cloud (cnCloud) for 192.168.21.182>

But the camera shows up with an additional -d switch, although it’s being ignored since the tool does not know how to handle it:

❯ kasa -d
No host name given, trying discovery..
Discovering devices on 255.255.255.255 for 3 seconds
DEBUG:kasa.discover:[DISCOVERY] ('255.255.255.255', 9999) >> {'system': {'get_sysinfo': None}}
DEBUG:kasa.discover:Waiting 3 seconds for responses...
[...]
DEBUG:kasa.discover:Unable to find device type from {'system': {'get_sysinfo': {'err_code': 0, 'system': {'sw_ver': '2.3.6 Build 20XXXXXX rel.XXXXX', 'hw_ver': '1.0', 'model': 'KC120(EU)', 'hwId': 'CBXXXXD5XXXXDEEFA98A18XXXXXX65CD', 'oemId': 'A2XXXX60XXXX108AD36597XXXXXX572D', 'deviceId': '80XXXX88XXXX76XXXX88XXXXX3AXXXXXXXXXXXB6', 'dev_name': 'Kasa Cam', 'c_opt': [0, 1], 'f_list': [], 'a_type': 2, 'type': 'IOT.IPCAMERA', 'alias': 'Camera', 'mic_mac': 'D80D17XXXXXX', 'mac': 'D8:0D:17:XX:XX:XX', 'longitude': XX, 'latitude': XX, 'rssi': -38, 'system_time': 1651545748, 'led_status': 'on', 'updating': False, 'status': 'configured', 'resolution': '720P', 'camera_switch': 'on', 'bind_status': True, 'last_activity_timestamp': 1651545210}}}}: Unable to find the device type field!
[...]

Important fields here are the deviceID and via the MAC address, you can find out what IP address the camera has (if you use DHCP). In my case 192.168.21.187 is the camera’s IP address.

nmap

nmap shows only port 9999 open which is the known TP-Link debug port. But there’s more ports:

❯ sudo nmap -p- 192.168.21.187
Starting Nmap 7.80 ( https://nmap.org ) at 2022-05-03 11:51 JST
Nmap scan report for kc120.lan (192.168.21.187)
Host is up (0.012s latency).
Not shown: 65531 closed ports
PORT      STATE SERVICE
9999/tcp  open  abyss
10443/tcp open  unknown
18443/tcp open  unknown
19443/tcp open  unknown
MAC Address: D8:0D:17:XX:XX:XX (Tp-link Technologies)

Nmap done: 1 IP address (1 host up) scanned in 9.28 seconds

And with that port information I found this article: https://medium.com/@hu3vjeen/reverse-engineering-tp-link-kc100-bac4641bf1cd. It’s about a slightly different camera model, but since the ports patch, maybe more does.

I followed it, however I could not get the authentication working: the Kasa account password as per article did not work. Time to do the ARP spoofing to see what the Android app uses to authenticate! Geistless did a great job explaining the steps he took.

My overall plan:

  1. Redirect the traffic from the Kasa app on the phone to my Linux machine (via arpspoof)
  2. Redirect the incoming HTTPS traffic to my HTTPS server (via iptables)
  3. Print the URL and headers for incoming HTTPS traffic which arrives at my HTTPS server

arpspoof

The dsniff package contains arpspoof:

❯ sudo apt install dsniff
[...]
❯ sudo setcap CAP_NET_RAW+ep /usr/sbin/arpspoof

My HTTPS Server

While the original author had a https server as part of his Rust learning, I created a NodeJS version. But first we’ll need keys. Self-signed is fine:

❯ openssl genrsa -out key.pem
❯ openssl req -new -key key.pem -out csr.pem
❯ openssl x509 -req -days 999 -in csr.pem -signkey key.pem -out cert.pem
❯ rm csr.pem

Now the simple HTTPS server listening on port 8080:

const https = require('https');
const fs = require('fs');

const options = {
  key: fs.readFileSync('key.pem'),
  cert: fs.readFileSync('cert.pem')
};

https.createServer(options, function (req, res) {
  console.log(req.url);
  console.log(req.headers);
  res.writeHead(200);
  res.end("");
}).listen(8080);

Some IP traffic routing rules to redirect all incoming TCP traffic on enp1s0 for ports 10443, 18443 and 19443 to port 8080:

❯ sudo iptables -t nat -A PREROUTING -i enp1s0 -p tcp --dport 10443 -j REDIRECT --to-port 8080
❯ sudo iptables -t nat -A PREROUTING -i enp1s0 -p tcp --dport 18443 -j REDIRECT --to-port 8080
❯ sudo iptables -t nat -A PREROUTING -i enp1s0 -p tcp --dport 19443 -j REDIRECT --to-port 8080
❯ sudo sysctl net.ipv4.ip_forward=1

Now run the https server and watch it display the URL and the headers for an incoming request on port 19443:

❯ node ./https.js

and to test, on another machine I ran:

$ curl -k -u admin:abc 'https://t621.lan:19443/test?a=3&b=5'

and this is the output of my https server:

/test?a=3&b=5
{
  host: 't621.lan:19443',
  authorization: 'Basic YWRtaW46YWJj',
  'user-agent': 'curl/7.68.0',
  accept: '*/*'
}

The basic authentication is base64 encoded. To decode:

❯ echo YWRtaW46YWJj | base64 -d
admin:abc

So that works. Now putting it all together.

  • Start the Kasa app on the phone. Make sure the KC120 is enabled and can display a live video stream. Stop the stream.
  • Have the iptables redirect rules in place. And IP forwarding in the kernel.
  • Start the HTTPS server.
  • Run arpspoof. 192.168.21.55 is the phone’s IP which runs the Kasa application. 192.168.21.187 is the IP of the KC120.
❯ arpspoof -i enp1s0 -t 192.168.21.55 192.168.21.187
7c:d3:a:xx:xx:xx 38:78:62:xx:xx:xx 0806 42: arp reply 192.168.21.187 is-at 7c:d3:a:xx:xx:xx
  • On the mobile app, try to connect to the video stream of the KC120 again
  • You should now see some output of the HTTPS server:
/https/stream/mixed?video=H264&audio=G711
{
  authorization: 'Basic aXXXXXXXXXXXXXXXXM=',
  connection: 'keep-alive',
  'user-agent': 'Dalvik/2.1.0 (Linux; U; Android 10; H8296 Build/52.1.A.3.49)',
  host: '192.168.21.187:19443',
  'accept-encoding': 'gzip'
}

And then I finally had the authentication string the camera wanted!

❯ echo 'aXXXXXXXXXXXXXXXXM=' | base64 -d
MY_KASA_ACCOUNT:THE_CAMERA_PASSWORD

Turns out that the password to use was not the Kasa password: it’s a longish string of hex digits. That might be a KC120 specialty or it might depend on the firmware version. I cannot say since I have no KC100, but whatever the password is, it’s possible to find out relatively easily using above approach.

The Result: Local Streaming!

I can connect to the video stream! And with very little CPU usage too.

❯ curl -k -u 'MY_KASA_ACCOUNT:THE_CAMERA_PASSWORD' \
--ignore-content-length \
"https://192.168.21.187:19443/https/stream/mixed?video=h264&audio=g711&resolution=hd&deviceId=80XXXX88XXXX76XXXX88XXXXX3AXXXXXXXXXXXB6" \
--output - | ffmpeg -hide_banner -y -i - -vcodec copy kc120stream.mp4

To change resolution, change it in the Kasa app. 1920×1080 (1.4Mb/s), 1280×720 (850kbit/s) and 640×360 (350kbit/s) are possible.

TODO

  • There is no audio coming from the camera. Audio works on the Kasa app.
  • It would also be nice to understand how to change the configuration of the camera (e.g. change resolution), but it’s ok to set them once via the Kasa app.
  • What options do the parameter video, audio and resolution support?

Moving Things

Servos are great to rotate/move things around, but they are limited in their capabilities. Steppers are more versatile and controlling them is not hard with the help of stepper driver modules. But since they do expect a fairly high rate of step pulses, a dedicated controller is needed. This is a solved problem though: GRBL takes care of that and it accepts G-Code which looks like this:

G0X100

To move the X axis to the position at 100mm. Generating movements are simply a stream of such strings. Sending a

G0X100.5

after 1 second will results in a moving speed of 0.5mm/s. A nice part of GRBL is that it also controls acceleration and deceleration. Important for moving heavy objects for long distances at high speed.

But traditional GRBL uses an Arduino which is not network connected. Luckily GRBL was ported to the ESP32 CPU with its WiFi interface. Even better: FluidNC was created improving on a lot of areas, like configuration (no need to recompile for a config change) and connectivity (IP or Bluetooth and of course serial).

Naturally that looked like an interesting thing to try out.

Hardware

  • A NEMA17 stepper (200 steps/rotation) with a timing belt moving a slider along an aluminium profile
  • A stepper driver (DRV8825 I think I use)
  • Makerbase MKS DLC32
  • A end-stop sensor (microswitch in my case)

Configuration

  • Get the FluidNC firmware from here
  • Erase the FLASH on the ESP32 with the included erase script (on Windows: run erase.bat)
  • Flash the WiFi version (on Windows: run install-wifi.bat)
  • You should now be able to connect via fluidterm.bat and for any debugging this is very helpful as you can see the boot process and early errors.
  • Configure WiFi according to this. That should be it as most default are sensible and thus not much to configure beside the SSID and the password:
$Sta/SSID=myssid
$Sta/Password=mypasswordforthessid
  • Reboot ($Bye) and check network parameters ($I):
$I
[VER:3.4 FluidNC v3.4.3:]
[OPT:PHS]
[MSG: Machine: Slider]
[MSG: Mode=STA:SSID=myssid:Status=Connected:IP=192.168.3.18:MAC=66-55-44-33-22-11]
ok
  • Connect to the Web UI at http://192.168.3.18 (the IP you get via $I obviously)
  • Upload a file for the configuration for the MKS DLC32 and the hardware setup you have. In my case: I only use the x-axis, so my config file looks like below. It’s almost 100% of the example config and the main changes are:
    • idle_ms=255 which keeps the stepper powered forever so it can hold things in place
    • steps_per_mm and max)travel_mm for the x-axis to match my hardware
    • turn off homing for y and z axis since I don’t use them
board: MKS-DLC32 V2.1
name: Slider
meta: (01.01.2022) by Skorpi

kinematics:
  Cartesian:

stepping:
  engine: I2S_STREAM
  idle_ms: 255
  pulse_us: 4
  dir_delay_us: 1
  disable_delay_us: 0
axes:
  shared_stepper_disable_pin: I2SO.0
  x:
    steps_per_mm: 40.7
    max_rate_mm_per_min: 15000.000
    acceleration_mm_per_sec2: 500.000
    max_travel_mm: 440.000
    soft_limits: true
    homing:
      cycle: 1
      positive_direction: false
      mpos_mm: 0.000
      feed_mm_per_min: 300.000
      seek_mm_per_min: 5000.000
      settle_ms: 500
      seek_scaler: 1.100
      feed_scaler: 1.100

    motor0:
      limit_neg_pin: gpio.36
      hard_limits: true
      pulloff_mm: 2.000
      stepstick:
        step_pin: I2SO.1
        direction_pin: I2SO.2

  y:
    steps_per_mm: 428.0
    max_rate_mm_per_min: 12000.000
    acceleration_mm_per_sec2: 300.000
    max_travel_mm: 440.000
    soft_limits: true
    homing:
      cycle: 0
      positive_direction: false
      mpos_mm: 0.000
      feed_mm_per_min: 300.000
      seek_mm_per_min: 5000.000
      settle_ms: 500
      seek_scaler: 1.100
      feed_scaler: 1.100

    motor0:
      limit_neg_pin: gpio.35
      hard_limits: false
      pulloff_mm: 2.000
      stepstick:
        step_pin: I2SO.5
        direction_pin: I2SO.6:low

  z:
    steps_per_mm: 157.750
    max_rate_mm_per_min: 12000.000
    acceleration_mm_per_sec2: 500.000
    max_travel_mm: 80.000
    soft_limits: true
    homing:
      cycle: 0
      positive_direction: false
      mpos_mm: 0.000
      feed_mm_per_min: 300.000
      seek_mm_per_min: 1000.000
      settle_ms: 500
      seek_scaler: 1.100
      feed_scaler: 1.100

    motor0:
      limit_neg_pin: gpio.34
      hard_limits: false
      pulloff_mm: 1.000
      stepstick:
        step_pin: I2SO.3
        direction_pin: I2SO.4

i2so:
  bck_pin: gpio.16
  data_pin: gpio.21
  ws_pin: gpio.17

spi:
  miso_pin: gpio.12
  mosi_pin: gpio.13
  sck_pin: gpio.14

sdcard:
  cs_pin: gpio.15
  card_detect_pin: NO_PIN

control:
  safety_door_pin: NO_PIN
  reset_pin: NO_PIN
  feed_hold_pin: NO_PIN
  cycle_start_pin: NO_PIN
  macro0_pin: gpio.33:low:pu
  macro1_pin: NO_PIN
  macro2_pin: NO_PIN
  macro3_pin: NO_PIN

macros:
  startup_line0:
  startup_line1:
  macro0: $SD/Run=lasertest.gcode
  macro1: $SD/Run=home.gcode
  macro2:
  macro3:

coolant:
  flood_pin: NO_PIN
  mist_pin: NO_PIN
  delay_ms: 0

probe:
  pin: gpio.22
  check_mode_start: true

Laser:
  pwm_hz: 5000
  #L on Beeper / IN on TTL
  output_pin: gpio.32
  enable_pin: I2SO.7
  disable_with_s0: false
  s0_with_disable: false
  tool_num: 0
  speed_map: 0=0.000% 0=12.500% 1700=100.000%
# 135=0mA 270=5mA 400=10mA 700=16mA
user_outputs:
  analog0_pin: NO_PIN
  analog1_pin: NO_PIN
  analog2_pin: NO_PIN
  analog3_pin: NO_PIN
  analog0_hz: 5000
  analog1_hz: 5000
  analog2_hz: 5000
  analog3_hz: 5000
  digital0_pin: NO_PIN
  digital1_pin: NO_PIN
  digital2_pin: NO_PIN
  digital3_pin: NO_PIN

start:
  must_home: false

  • When done, name the file you just uploaded:
$Config/Filename=config2.yaml

  • Then you have to “Home” once so the controller knows where everything is (using telnet for a change since network is up now):
❯ telnet 192.168.3.18 23
Trying 192.168.3.18...
Connected to 192.168.3.18.
Escape character is '^]'.

Grbl 3.4 [FluidNC v3.4.3 (wifi) '$' for help]
$H
ok
?
<Idle|MPos:0.000,0.000,0.000|FS:0,0|Pn:PYZ|Ov:100,100,100>
ok
  • and now you should be able to move the slider via very simple G-Code (x axis to 100mm position):
G0X100
ok
  • If you get an error for the $H command, it’s likely that you don’t have a working end-stop for the axis which are supposed to have one. A quick fix is to use $X to disable end-stop checks. It’ll allow axis movements, but it does no checks for movements.

Node.js sending commands

GRBL has no single command to do a slow controlled motion, so in order to do that, a program needs to send G-Code commands to it. Node.js to the rescue! Below test program moves the slider 2 times back and forth and when done, it closes the connection:

// Test to send commands to GRBL (FluidNC)

const net=require('net');

let stateIsIdle=false;
let statusLine='';

function gotALine(s) {
  console.log('Got a line: '+s);
  if (s.startsWith('<Idle|')) {
    if (stateIsIdle==true) {
      console.log('Idle detected again');
      client.end();
      process.exit(0);
    } else {
      stateIsIdle=true;
      console.log('Idle detected');
    }
  }
}

let client=new net.Socket();
client.connect(23, '192.168.21.118', () => { console.log('Got connected'); });
client.on('data', (data) => {
  let s=data.toString();
  if (s.indexOf('\n') < 0) {
    statusLine+=s;
  } else {
    statusLine+=s;
    gotALine(statusLine.trim());
    statusLine='';
  }
});

client.on('close', () => { console.log('Closed connection'); });

function sendStatusRequest() {
  if (client) client.write('?\n');
}

setInterval(sendStatusRequest, 1000);

for (let i=0; i<2; ++i) {
  client.write('G0X0\n');
  client.write('G0X400\n');
}

Problems

  • When requesting a status via ‘?’, it seems the stepper steps take a short break which causes a jerky movement. This is very reproducible. Issue created for this. Using I2S_STREAM helps a lot, but it’s not 100% fixed. I2S_STREAM has another problem though…
  • I2S_STREAM seems to be inaccurate: moving 4 times 100mm and then moving back to 0 leaves several mm missing. The same test with I2S_STATIC shows zero error.

U2F on the CLI

U2F works well and easily via a web browser, but you can also use it directly on the command line. You “just” have to implement the USB protocol part of U2F, namely talk to /dev/hidrawX.

u2fcli did that and it worked on my R2S (ARMv8):

harald@r2s2:~/git$ git clone git@github.com:mdp/u2fcli.git
Cloning into 'u2fcli'...
remote: Enumerating objects: 57, done.
remote: Total 57 (delta 0), reused 0 (delta 0), pack-reused 57
Receiving objects: 100% (57/57), 19.26 KiB | 1.20 MiB/s, done.
Resolving deltas: 100% (21/21), done.
harald@r2s2:~/git$ cd u2fcli
harald@r2s2:~/git/u2fcli$ go mod init u2fcli
go: creating new go.mod: module u2fcli
go: to add module requirements and sums:
        go mod tidy
harald@r2s2:~/git/u2fcli$ go mod tidy
go: finding module for package github.com/flynn/u2f/u2ftoken
go: finding module for package github.com/flynn/hid
go: finding module for package github.com/mdp/u2fcli/cmd
go: finding module for package github.com/flynn/u2f/u2fhid
go: finding module for package github.com/spf13/cobra
go: found github.com/mdp/u2fcli/cmd in github.com/mdp/u2fcli v0.0.0-20180327171945-2b7ae3bbca08
go: found github.com/flynn/hid in github.com/flynn/hid v0.0.0-20190502022136-f1b9b6cc019a
go: found github.com/flynn/u2f/u2fhid in github.com/flynn/u2f v0.0.0-20180613185708-15554eb68e5d
go: found github.com/flynn/u2f/u2ftoken in github.com/flynn/u2f v0.0.0-20180613185708-15554eb68e5d
go: found github.com/spf13/cobra in github.com/spf13/cobra v1.2.1
harald@r2s2:~/git/u2fcli$ go build
harald@r2s2:~/git/u2fcli$ ls
cmd  go.mod  go.sum  LICENSE  main.go  README.md  u2fcli

Permissions for /dev/hidrawX needs to be given:

harald@r2s2:~/git/u2fcli$ sudo chmod a+rw /dev/hidraw0

And now a full cycle of register (once), sign+verify (log in):

harald@r2s2:~/git/u2fcli$ ./u2fcli reg --challenge MyComplexChallenge --appid https://test.com
Registering, press the button on your U2F device #1 [Yubico Security Key by Yubico]{
  "KeyHandle": "-374aUcG7iWqVc5rsX8jE_8yr1iS-EEDdt106-CAKec90Gg1VVK9dv5E_JmZRIyKVaas9vhLVHb7zbbJ6rNltg",
  "PublicKey": "BHBwVKLRYZZKZGaL96FQtzis8i01M2DMw4IQwuMIKbWa2dZJSC1GlXlYiWhycig4R3DdlipdR675o_e4QfpI-UU",
  "RegisteredData": "-374aUcG7iWqVc5rsX8jE_8yr1iS-EEDdt106-CAKec90Gg1VVK9dv5E_JmZRIyKVaas9vhLVHb7zbbJ6rNltjCCAr4wggGmoAMCAQICBHSG_cIwDQYJKoZIhvcNAQELBQAwLjEsMCoGA1UEAxMjWXViaWNvIFUyRiBSb290IENBIFNlcmlhbCA0NTcyMDA2MzEwIBcNMTQwODAxMDAwMDAwWhgPMjA1MDA5MDQwMDAwMDBaMG8xCzAJBgNVBAYTAlNFMRIwEAYDVQQKDAlZdWJpY28gQUIxIjAgBgNVBAsMGUF1dGhlbnRpY2F0b3IgQXR0ZXN0YXRpb24xKDAmBgNVBAMMH1l1YmljbyBVMkYgRUUgU2VyaWFsIDE5NTUwMDM4NDIwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAASVXfOt9yR9MXXv_ZzE8xpOh4664YEJVmFQ-ziLLl9lJ79XQJqlgaUNCsUvGERcChNUihNTyKTlmnBOUjvATevto2wwajAiBgkrBgEEAYLECgIEFTEuMy42LjEuNC4xLjQxNDgyLjEuMTATBgsrBgEEAYLlHAIBAQQEAwIFIDAhBgsrBgEEAYLlHAEBBAQSBBD4oBHzjApNFYAGFxEfntx9MAwGA1UdEwEB_wQCMAAwDQYJKoZIhvcNAQELBQADggEBADFcSIDmmlJ-OGaJvWn9CqhvSeueToVFQVVvqtALOgCKHdwB-Wx29mg2GpHiMsgQp5xjB0ybbnpG6x212FxESJ-GinZD0ipchi7APwPlhIvjgH16zVX44a4e4hOsc6tLIOP71SaMsHuHgCcdH0vg5d2sc006WJe9TXO6fzV-ogjJnYpNKQLmCXoAXE3JBNwKGBIOCvfQDPyWmiiG5bGxYfPty8Z3pnjX-1MDnM2hhr40ulMxlSNDnX_ZSnDyMGIbk8TOQmjTF02UO8auP8k3wt5D1rROIRU9-FCSX5WQYi68RuDrGMZB8P5-byoJqbKQdxn2LmE1oZAyohPAmLcoPO4wRgIhANIZ7Q_cty_UkWigyQ7Ot0pC0egyI_eSUJ52Hge95vz1AiEAzf7hX_XvNQvoPQ2IvjgJjUkV3wvDPctkac2Z_8fRaik"
}

harald@r2s2:~/git/u2fcli$ ./u2fcli sig --appid https://test.com --challenge SomethingElse --keyhandle "-374aUcG7iWqVc5rsX8jE_8yr1iS-EEDdt106-CAKec90Gg1VVK9dv5E_JmZRIyKVaas9vhLVHb7zbbJ6rNltg"
Authenticating, press the button on your U2F device
{
  "Counter": 50,
  "Signature": "AQAAADIwRQIhALlZyMmormC2b9JCaOXYAdKq4wvpdKg4wMu68fLgXmclAiADDHbFxKrm5eYCoCvC-m1vEEegXzWHfwuPLpUh81qHoA"
}

harald@r2s2:~/git/u2fcli$ ./u2fcli ver --appid https://test.com --challenge SomethingElse --publickey "BHBwVKLRYZZKZGaL96FQtzis8i01M2DMw4IQwuMIKbWa2dZJSC1GlXlYiWhycig4R3DdlipdR675o_e4QfpI-UU" --signature "AQAAADIwRQIhALlZyMmormC2b9JCaOXYAdKq4wvpdKg4wMu68fLgXmclAiADDHbFxKrm5eYCoCvC-m1vEEegXzWHfwuPLpUh81qHoA"
Signature verified

Jest and ES6 Modules and userscripts

Greasemonkey (and thus userscripts) started when JavaScript was old and jQuery was well used. Since ES6 (ES2015) we have module imports, ES2017 brought us async/await and generally plenty useful features which should be used on modern browsers. Since Tampermonkey is only for newer browsers, we can use all those nice features instead of relying on old methods. Also testing…it’s a good thing once you get used to it. So developing userscripts nowadays should mean:

  • create testable code
  • use ES2017 code with module imports and async/await and fetch
  • no need for Babel and WebPack or jQuery

In the past I pretty much ignored the front-end, but now I have a use-case at work, so time to get the coding started: https://github.com/haraldkubota/userscripts-jest

GitHub Actions

For work stuff we use Jenkins as CI/CD solution. No choice. It works though, so no complains either.

On the Internet there’s many possible solutions:

Which one to use? While I did not dig too deep into each one, they are all basically very similar, simply because the problem is clear and a “solved problem”. And yes, I simplify things here a lot:

  • You do good CI if you merge often. The common solution is to use git, create short-lived branches and merge them (depending on your development strategy)
  • Run linters, formatters and tests upon a code commit (and verify during a push upstream)

The CD part is different and depends a lot on the back-end how and where your application runs. Once you have an application, a zip artifact, a container image etc., deploying those is not a technical problem. The strategy how to move from QA to PROD is an entirely different problem but it very much depends on your back-end. Kubernetes has its ArgoCD/FluxCD while in a non-container environment you have to use other solutions.

Back to GitHub Actions

I never needed it: as the only developer for my little software world, I use a locally hosted git repo. Some items I put on GitHub, but only when I think someone can benefit of it. This is a hackathon-starter template from here (quite nice) which is a NodeJS app with plenty dependencies and it served as my test sample to play with Git Hub Actions.

And this is the result:

name: NodeJS Steps
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node: [ '14' ]
    name: My Build
    steps:
      - uses: actions/checkout@v2
      - name: Setup node
        uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node }}
      - run: npm install
      - run: npm test
  docker:
    needs: build
    runs-on: ubuntu-latest
    name: Docker Build and Push
    steps:
      - uses: actions/checkout@v2
      - name: Build and push
        id: docker_build
        uses: mr-smithers-excellent/docker-build-push@v5
        with:
          image: hkubota/hackathon
          registry: docker.io
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

Certain things I like a lot:

  • The matrix option is great to test various versions of (in this case) NodeJS
  • Jobs are easy to define incl. dependencies
  • Accessing secrets and env variables from GitHub is straightforward

Basically it’s like Jenkins. Which is nice. I do prefer GitHub Actions though as it’s one thing less to maintain for me.

Protocol Buffers and Node.js

Kafka can encode a message via Avro or Protocol Buffers which are both binary protocols. They are comparable to each other (see here) but since gRPC uses Protocol Buffers and it seems it can do anything Avro can do (plus more), maybe a good time to dig into it a bit.

For debugging, here is a web page which can decode a ProtoBuf message. Nice for debugging.

Google docs for ProtoBuf for JavaScript is here. Here a quick working example:

const messages = require('./addressbook_pb');

let message = new messages.Person();

message.setName("Harald K");
message.setId(321);
message.setEmail("harald@some.email.com");

let phone1 = new messages.Person.PhoneNumber();
phone1.setNumber("03-1122-3355");
phone1.setType(messages.Person.PhoneType.WORK);

let phone2 = new messages.Person.PhoneNumber();
phone2.setNumber("090-5566-7788");
phone2.setType(messages.Person.PhoneType.HOME);

message.addPhones(phone1);
message.addPhones(phone2);

console.log("message object:");
console.log(JSON.stringify(message));

console.log("message as JSON:");
console.log(JSON.stringify(message.toObject()));

let binBuffer=Buffer.from(message.serializeBinary());
console.log("Binary serialized:");
console.log(JSON.stringify(binBuffer));
console.log("And in hex:");
let s=""
for (const i of binBuffer) {
    s+=i.toString(16).padStart(2, '0')+" ";
}
console.log(s);

// Now let's convert the binary ProtoBuf message into a proper object again

let message2 = messages.Person.deserializeBinary(binBuffer);
console.log("Converted from binary:");
console.log(JSON.stringify(message2));
console.log("...and as Object:");
console.log(JSON.stringify(message2.toObject()));

The addressbook.proto is from here.

Kafka & Schema

Once you understand how Kafka works, it’s so easy to find use-cases for it. To learn to understand it better, a local install helps a lot though, plus some interactive tools and libraries to produce and consume data.

  1. Follow the Confluent quick start Docker demo
  2. Configure zoe
  3. See users and pageviews data via zoe
  4. Consume the same data via KafkaJS
  5. Make KafkaJS use the schema registry via confluent-schema-registry

Zoe config file. Note the KafkaAvroDeserializer.

❯ cat ~/.zoe/config/default.yml
---
clusters:
  default:
    props:
      bootstrap.servers: "t620.lan:9092"
      key.deserializer: "org.apache.kafka.common.serialization.StringDeserializer"
      value.deserializer: "io.confluent.kafka.serializers.KafkaAvroDeserializer"
      key.serializer: "org.apache.kafka.common.serialization.StringSerializer"
      value.serializer: "org.apache.kafka.common.serialization.ByteArraySerializer"
    registry: ${SCHEMA_REGISTRY:-http://t620.lan:8081}
    groups:
      mygroup: my-group-id
    topics:
      users:
        name: "users"
        subject: "users-value"
runners:
  default: "local"

View users via zoe:

❯ zoe --silent --cluster default topics consume users
{"registertime":1501533472288,"userid":"User_8","regionid":"Region_1","gender":"FEMALE"}
{"registertime":1511144207405,"userid":"User_4","regionid":"Region_7","gender":"FEMALE"}
{"registertime":1500937323185,"userid":"User_8","regionid":"Region_9","gender":"FEMALE"}
{"registertime":1492141732118,"userid":"User_5","regionid":"Region_8","gender":"FEMALE"}
{"registertime":1509714843903,"userid":"User_3","regionid":"Region_4","gender":"OTHER"}

And now via Node.js:

// Use with https://docs.confluent.io/platform/current/quickstart/cos-docker-quickstart.html
// And its users producer

const { Kafka } = require('kafkajs')
const { SchemaRegistry } = require('@kafkajs/confluent-schema-registry')

const kafka = new Kafka({ clientId: 'my-app', brokers: ['t620.lan:9092'] })
const registry = new SchemaRegistry({ host: 'http://t620.lan:8081/' })
const consumer = kafka.consumer({ groupId: 'test14-group' })

const run = async () => {
  await consumer.connect()
  await consumer.subscribe({ topic: 'users', fromBeginning: true })

  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      // const decodedKey = await registry.decode(message.key)
      const decodedKey = message.key.toString();
      const decodedValue = await registry.decode(message.value)
      console.log({ decodedKey, decodedValue })
      // console.log(`message=${JSON.stringify(message)}`)
    },
  })
}

run().catch(console.error)

which prints out users like

{
  decodedKey: 'User_2',
  decodedValue: Users {
    registertime: 1516972272723,
    userid: 'User_2',
    regionid: 'Region_5',
    gender: 'FEMALE'
  }
}

Note for ARM64 users

While I usually use my ThinkCentre for development work, I have a small ARMv8 machine which I use for small stuff. Node.js runs very well on it and so does Python.

However to install the Python package confluent-kafka, I am supposed to install the latest librdkafka-dev from here. Except this is amd64 only. So no Python confluent-kafka on ARMv8 unfortunately.

Google Cloud Platform

My AWS Certified Solution Architect – Professional is expiring in June! Since renewing it is a bit boring, it’s a great reason to get to know GCP better. I generally like their way of thinking more and today I understood why:

  • AWS has DevOps as their focus point for many products
  • GCP has the developer as the focus point for many products

Of course there’s plenty overlap, but the philosophy is fundamentally different. But that might just be my opinion. It would explain why I am more comfortable with AWS with my Sysadmin background, but more curious with GCP (as a wanna-be small-scale developer).

Pub/Sub

Beside creating VMs, traditionally one of the easiest ways to interact with a cloud environment is message queues. In GCP this is Pub/Sub. And it’s easy.

  1. Create a Topic. With a schema (to keep yourself sane).

Schema (AVRO):

{
  "type": "record",
  "name": "Avro",
  "fields": [
    {
      "name": "Sensor",
      "type": "string"
    },
    {
      "name": "Temp",
      "type": "int"
    }
  ]
}

Then you can publish via gcloud (thanks to Pavan for providing a working example):

❯ gcloud pubsub topics publish Temp --message='{"Sensor":"Storage","Temp":9}'

And in Node.js:

const {PubSub} = require('@google-cloud/pubsub');

function main(
  topicName = 'Temp',
  data = JSON.stringify({Sensor: 'Living room', Temp: 22})
) {

  const pubSubClient = new PubSub();

  async function publishMessage() {
    const dataBuffer = Buffer.from(data);

    try {
      const messageId = await pubSubClient.topic(topicName).publish(dataBuffer);
      console.log(`Message ${messageId} published.`);
    } catch (error) {
      console.error(`Received error while publishing: ${error.message}`);
      process.exitCode = 1;
    }
  }

  publishMessage();
}

process.on('unhandledRejection', err => {
  console.error(err.message);
  process.exitCode = 1;
});

main(...process.argv.slice(2));

And with plumber:

# Subscribe
❯ plumber read gcp-pubsub --project-id=training-307604 --sub-id=Temp2-sub -f

# Publish
❯ plumber write gcp-pubsub --topic-id=Temp --project-id=training-376841 --input-data='{"Sensor":"Kitchen","Temp":19}'

Mijia LYWSD03MMC, BLE and Node.js

That hard-to-remember product name is a Bluetooth LE enabled small thermometer and hygrometer from Xiaomi. The special thing about it is that Aaron Christophel created a GitHub repo about how to re-flash it with firmware which makes it way more useful: it now advertises the temperature, humidity and battery level which makes it very easy to pick up.

So I got 4 of those and re-flashed them:

  1. Access https://atc1441.github.io/TelinkFlasher.html with a WebBluetooth capable browser
  2. Click on Connect and connect
  3. Click on Do Activation
  4. Click on Choose firmware and use the ATC_Thermometer.bin from above repo
  5. Click on Start Flashing
  6. Wait about 60s
  7. Wait until it reboot. The display will show the 3 last bytes of the MAC address for easier reference later on.

Connect to the newly flashed device. You can change some settings with the buttons at the end of the web page (e.g. disable the “Show battery in LCD” if you measure and record it anyway).

Scripts for finding them and reading data out of them is here.

Reading them out is very simple with above scripts:

❯ node read-thermometer.js -s ATC_C1CADA,ATC_D01337 -e 10 -t
environment,host=m75q,sensor=ATC_C1CADA temp=22.5,humidity=37,battery=100
environment,host=m75q,sensor=ATC_D01337 temp=20.7,humidity=44,battery=100

which you can feed via telegraf. Here the telegraf.conf snippet:

[[inputs.exec]]
  commands = ["node ~/git/LYWSD03MMC/read-thermometer/read-thermometer.js -s ATC_C1CADA,ATC_D01337 -t"]
  timeout = "12s"
  data_format = "influx"

And the result are very nice graphs in Grafana (3 sensors):

Temperatures recorded via BLE from 3 LYWSD03MMC sensors