Home Alarm Analytics With AWS Kinesis

Home security system projects are fun because everything about them screams "1980s legacy hardware design". Nowhere else in the modern tech landscape does one program by typing in a three digit memory address and then entering byte values on a numeric keypad. There's no enter-key -- you fill the memory address. There's no display -- just eight LEDs that will show you a byte at a time, and you hope it's the address you think it is. Arduinos and the like are great for hobby fun, but these are real working systems whose core configuration you enter byte by byte.

The feature set reveals 30 years of crazy product requirements. You can just picture the well-meaning sales person who sold a non-existent feature to a huge potential customer, resulting in the boolean setting that lives at address 017 bit 4 and whose description in the manual is:

ON: The double hit feature will be enabled. Two violations of the same zone within the Cross Zone Timer will be considered a valid Police Code or Cross Zone Event. The system will report the event and log it to the event buffer. OFF: Two alarms from the same zone is not a valid Police Code or Cross Zone Event

I've built out alarm systems for three different homes now, and while occasionally frustrating it's always a satisfying project. This most recent time I wanted an event log larger than the 512 events I can view a byte at a time. The central dispatch service I use will sell me back my event log in a horrid web interface, but I wanted something programmatically accessible and ideally including constant status.

The hardware side of the solution came in the form of the Alarm Decoder from Nu Tech. It translates alarm panel keypad bus events into events on an RS-232 serial bus. That I'm feeding into a Raspberry Pi. From there the alarmdecoder package on PyPI lets me get at decoded events as Python objects. But, I wanted those in a real datastore.


Raspberry Pi UPS

I'm starting to do more on a raspberry pi I've got in the house, and I wanted it to survive short power outages. I looked at buying an off the shelf Uninteruptable Power Supply (UPS), but it just struck me as silly that I'd be using my house's 120V AC to power to fill a 12V DC battery to be run through an inverter into 120V AC again to be run through a transformer into DC yet again. When the house is out of power that seemed like a lot of waste.

A little searching turned up the PicoUPS-100 UPS controller. It seems like it's mostly used in car applications, but it has two DC inputs and one DC output and handles the charging and fast switching. The non-battery input needs to be greater than the desired 12 volts, so I ebayed a 15v power supply from an old laptop. I added a voltage regulator and buck converter to get solid 12v (router) and 5v (rpi) outputs. Then it caught on fire:

Scorched UPS controller

But I re-bought the charred parts, and the second time it worked just fine:

Working UPS setup


Pylint To Github

I spent a few hours trying to get the Jenkins Git & Github plugins to:

  • run pylint on all remote branch heads that:
    • arent' too old
    • haven't already had pylint run on them
  • send the repo status back to GitHub

I'm sure it's possible, but the Jenkins Git plugin doesn't like a single build to operate on multiple revisions. The repo statuses weren't posting, the wrong branches were getting built, and it was easier to write a quick script.

Now whenever someone pushes code at DramaFever pylint does its thing, and their most recent commit gets a green checkmark or a red cross. If/when they open a PR the status is already ready on the PR and warns folks not to merge it if pylint is going to fail the build. They can keep heaping on commits until the PR goes green.

I run it from Jenkins triggerd by a GitHub push hook, but it's setup so that even running it from cron on the minute is safe for those without a CI server yet.

Branches with green checks

Bitcoin Conversion In Google Spreadsheets

I've been using Charlie Lee's excellent Google Spreadsheet Bitcoin tracker sheet for awhile but it pulls data from a single exchange at a time and relies on the ordering of those exchanges on the bitcoinwatch.com site, which vary with volume.

I figured out I could get better numbers more reliably from bitcoinaverage.com, which (predictably) averages multiple exchanges over various time periods. They offer a great JSON API, but unfortunately Google spreadsheets only export JSON -- they don't have a function for importing it.

None the less I was able to fake it using a regex. You can pull the 24 hour average price in with this forumla:

=regexextract(index(importdata("https://api.bitcoinaverage.com/ticker/USD"),2,1), ": (.*),")+0

If you want that to update live (not just when you open the spreadsheet) you need to use Charlie's hack to get the sheet to think the formula depends on live stock data:

    NOW()*1E3)&REPT(GoogleFinance("GOOG");0)),2,1), ": (.*),")+0

I've put together a sample spreadsheet based on Charlie's.

Occuped: Twine + Go + App Engine

In our NY office We've got 40 people working in a space with two bathrooms. Walking to the bathrooms, finding them both occupied, and grabbing a snack instead is a regular occurrence. For a lark I took a Twine with the breakout board and a few magnetic switches and connected them to the over taxed bathroom doors.

The good folks at Twine will invoke a web hook on state change, so I created a tiny webapp in Go that takes the GET from Twine and stashes it in the App Engine datastore. I wrote a cheesy web front end to show the current state based on the most recent change. It also exposes a JSON API, allowing my excellent coworkers to build a native OS X menulet and a much nicer web version.

Occupied light

Crossed Lamps

Last weekend we bought two Rodd lamps at Ikea for the guest room, and it struck me how amused I'd be if each one switched the other. Six hours and a few new parts later, and it came out pretty well:

The remote action is especially jarring because the switches are right next to the bulbs they would normally control:

Ikea Rodd lamp head


Amazon S3 as Append Only Datastore

As a hack, when I need an append-only datastore with no authentication or validation, I use Amazon S3. S3 is usually a read-only service from the unauthenticated web client's point of view, but if you enable access logging to a bucket you get full-query-parameter URLs recorded in a text file for GETs that can come from a form's action or via XHR.

There aren't a lot of internet-safe append-only datastores out there. All my favorite noSQL solutions divide permissions into read and/or write, where write includes delete. SQL databases let you grant an account insert without update or delete, but still none suggest letting them listen on a port that's open to the world.

This is a bummer because there are plenty of use cases when you want unauthenticated client-side code to add entries to a datastore, but not read or modify them: analytics gathering, polls, guest books, etc. Instead you end up with a bit of server side code that does little more than relay the insert to the datastore using over-privileged authentication credentials that you couldn't put in the client.

To play with this, first, create a file named vote.json in a bucket with contents like {"recorded": true}, make it world readable, set Cache-Control to max-age=0,no-cache and Content-Type to application/json. Now when a browser does a GET to that file's https URL, which looks like a real API endpoint, there's a record in the bucket's log that looks something like:

bucketname [31/Jan/2013:18:37:13 +0000]
- 289335FAF3AD11B1 REST.GET.OBJECT vote.json "GET
  /bucketname/vote.json?arg=val&arg2=val2 HTTP/1.1" 200 - 12 12
9 8 "-" "lwp-request/6.03 libwww-perl/6.03" -

The full format is described by Amazon, but with client IP and user agent you have enough data for basic ballot box stuffing detection, and you can parse and tally the query arguments with two lines of your favorite scripting language.

This scheme is especially great for analytics gathering because and one never has to worry about full log on disks, load balancers, backed up queues, or unresponsive data collection servers. When you're ready to process the data it's already on S3 near EC2, AWS Data Pipeline or Elastic MapReduce. Plus, S3 has better uptime than anything else Amazon offers, so even if your app is down you're probably recording the failed usage attempts.

Creating Burn Down Charts for GitHub Repositories Using Google Apps Script

At DramaFever I got folks to buy into to burn down charts as the daily display for our weekly sprints with the rotating release person being responsible for updating a Google spreadsheet with each day's end-of-day open and closed issue counts.

It works fine, and it's only a small time burden, but if one forgets there's no good way to get the previous day's counts. Naturally I looked to automation, and GitHub has an excellent API. My usual take on something like this would be to have cron trigger a script that:

  1. polls the GitHub API
  2. extracts the counts
  3. writes them to some data store
  4. builds today's chart from the historical data in the datastore

That could be easily done in bash, Python, or Perl, but it's the sort of thing that's inevitably brittle. It's too ad hoc to run on a prod server, so it'll run from a dev shell box, which will be down, or disk full, or rebuilt w/o cron migration or any of the million other ailments that befall non-prod servers.

Looking for something different I tried Google Apps Script for the first time, and it came out very well.


GitHub Jenkins Deploy Keys Config

GitHub doesn't let you use the same deploy key for multiple repositories within a single organziation, so you have to either (a) manage multiple keys, (b) create a non-human user (boo!), or (c) use their not-yet-ready for primetime HTTP OAUTH deploy access, which can't create read-only permissions.

In the past to managee the multiple keys I've either (a) used ssh-agent or (b) specified which private key to use for each request using -i on the command line, but neither of those are convenient with Jenkins.

Today I finally thought about it a little harder and figured out I could use fake hostnames that map to the right keys within the .ssh/config file for the Jenkins user. To make it work I put stanzas like this in the config file:

Host repo_name.github.com
    Hostname github.com
    HostKeyAlias github.com
    IdentityFile ~jenkins/keys/repo_name_deploy

Then in the Jenkins GitHub Plugin config I set the repository URL as:


There is no host repo_name.github.com and it wouldn't resolve in DNS, but that's okay because the .ssh/config tells ssh to actually go to github.com, but we do get the right key.

Maybe this is obvious and everyone's doing it, but I found it the least-hassle way to maintain the accounts-for-people-only rule along with the separate-keys-for-separate-repos rule and without running the ssh-agent daemon for a non-login shell.

spdyproxy on Ubuntu 12.4 LTS

I'm often on unencrypted wireless networks, and I don't always trust everyone on the encrypted ones, so I routinely run a SOCKS proxy to tunnel my web traffic through an encrypted SSH tunnel. This works great, but I have to start the SSH tunnel before I start browsing -- that's okay IRC before reader -- but when I sleep the laptop the SSH tunnel dies and requires a restart before I can browse again. In the past I've used autossh to automate that reconnect, but it still requires more attention than it deserves.

So I was excited when I saw Ilya Grigorik's writeup on spdyproxy. With SPDY multiple independent bi-directional connections are multiplexed through a single TCP connection, in the same way some of us tried to do with web-mux in the late 90s (I've got a Java 1.1 implementation somwhere). SPDY connections can be encrypted, so when making a SPDY connection to a HTTP proxy you're getting an encrypted tunnel though which your HTTP connections to anywhere can be tunneled, and probably getting a TCP setup/teardown speed boost as well. Ilya's excellent node-spdyproxy handles the server side of that setup admirably and the rest of this writeup covers my getting it running (with production accouterments) on an Ubuntu 12.4(.1) LTS system.

With the below setup in place I can use a proxy.pac to make sure my browser never sends an unencrypted byte through whatever network I'm not -- DNS looksups excluded, you still needs SOCKSv4a or SOCKSv5 to hide those.