Geeking Out

Version Nine

Welcome back. When I started blogging in 2001, there was no Facebook, Instagram, or Twitter. Now there are all those things and more, and yet the vast social media empire seems more and more insidious and evil with every passing scandal.

There is a place for personal media, privately owned, distinct and disconnected. So I’ve embarked on another upgrade and redesign of this venerable blog, featuring a fresh new look and the latest WordPress hypermedia magic behind the scenes.

For fun I whipped up a data visualization, and the change in posting habits over time is dramatic.

Graph of posts over time. A live version of this graph now lives at the bottom of the homepage.

First I took a job that heavily discouraged blogging, then I shifted more fully to Facebook and other social media platforms. I find myself using those platforms less and less, in line with the shifting habits of my friends and family.

Yet I have thoughts! So many thoughts, just asking to be written down and broadcast to the indifferent world! So my goal for 2019 is to reverse this trend. I will post more here, and less in those other places. With any luck this won’t be a repeat of 2014.

And so, welcome back.

Geeking Out

What does it mean to “lose” weight?

Apologies in advance to the chemists in the room, because I’m going to butcher the science on this.  But the lay explanation is fascinating.

Weight loss discussions typically focus on two pathways, used alone or in combination: caloric restriction (i.e. eating less) and exercise.  In both cases, the goal is to “burn” more calories than we take in and, thus, remove excess fat.  But what does this mean in practice?  Calories are a measure of heat energy, so the term “burn” seems to make intuitive sense.  But the law of conservation of mass tells us that mass cannot be created or destroyed.  We are not losing weight through heat.

If the common wisdom is a lie, the next idea is that we lose weight through digestive excretions, i.e. feces.  But this, also, is incorrect, for somewhat obvious reasons.  The digestive system is concerned with taking in fuel, breaking it down, using it, and getting rid of all the useless bits out the other side.  Nowhere in that system is there any “burning” or converting of stored energy.  In short, we don’t lose weight through our poop.

Losing weight actually comes down to metabolizing triglycerides, the primary component of fat.  Triglycerides are essentially a bunch of carbon and hydrogen with a bit of oxygen thrown in.  This is basic chemistry, and I have forgotten most of my chemistry.  But wait, carbon?  Hydrogen?

So, it turns out that the vast majority of “burned” calories are expelled through breathing.  Eighty-six percent, to be precise.  How?  Well, just as we were taught in elementary school — O2 in, CO2 out!  Most of the remainder, i.e. those hydrogen atoms, leaves as water: H2O coming out of all the various places that we get rid of water, such as sweat, spit, tears, and urine.
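The bookkeeping can be checked with the textbook oxidation equation for an “average” triglyceride, C55H104O6. This is a rough sketch; the exact molecule and the resulting percentages vary with the fat’s composition.

```ruby
# Textbook oxidation of an "average" triglyceride:
#   C55H104O6 + 78 O2 -> 55 CO2 + 52 H2O
# Approximate atomic masses in g/mol.
C_MASS, H_MASS, O_MASS = 12.011, 1.008, 15.999

fat = 55 * C_MASS + 104 * H_MASS + 6 * O_MASS  # one mole of fat (~861 g)
o2  = 78 * 2 * O_MASS                          # oxygen breathed in
co2 = 55 * (C_MASS + 2 * O_MASS)               # carbon dioxide breathed out
h2o = 52 * (2 * H_MASS + O_MASS)               # water leaving as sweat, urine, etc.

# Conservation of mass: everything that goes in comes back out.
puts format("in:  %.1f g (fat + O2)", fat + o2)
puts format("out: %.1f g (CO2 + H2O)", co2 + h2o)
```

Every atom is accounted for, and the only exits are the lungs and the various water outlets. (Pinning down exactly what fraction of the fat’s own mass leaves as CO2 requires tracing which atoms came from the fat versus the inhaled oxygen, which is where figures like the 86% above come from.)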

Hearing this for the first time, it seems utterly crazy.  But actually it makes a lot more sense than the idea that all that fat is being magically “burned” away.

Geeking Out

The current state of commercial vehicle autonomy, from the perspective of a casual driver

Selfishly, I would like to live as long as possible. The top causes of death among adults in the United States are heart disease, cancer, and automobile accidents. So I attempt to maintain a reasonably healthy diet, exercise regularly, and avoid smoking harmful substances. And I try to both limit my driving and drive in the safest available vehicles.

Living in LA for the last six months, daily long-distance driving has become unavoidable. Consequently, I have continued my studies of the latest in car safety, which, these days, primarily revolves around vehicle autonomy systems.

Continue reading “The current state of commercial vehicle autonomy, from the perspective of a casual driver”
Geeking Out

A month with the Touch Bar, and that’s enough

I purchased a new Apple laptop because I needed one, not because of any particular advertised feature. The Touch Bar models were better specced, so I grudgingly ended up with one. Today I finally turned off the Touch Bar’s “App Controls”, returning it to the standard function layout.

I don’t know how other people use computers, but I expect I’m in the majority as someone who keeps my eyes and attention focused on the display while typing. The term “touch typing” refers to the skill of being able to type by touch without needing to look at the keyboard. Thus, creating a “Touch Bar” — a flat capacitance screen with constantly shifting tap targets and no physical cues as to button location — is the exact opposite of a touch typing innovation.

The Touch Bar is very clever in the way that it dynamically updates with buttons relevant to each app. But we already have a mechanism for that functionality — a massive backlit screen that updates 60 times per second. I’m not sure that a touch screen laptop is useful, but being able to touch a target where I’m looking makes a lot more sense to me than having to change my focus away from the massive display screen I spend all my time working on in order to glance down at a tiny set of touch targets in a location where I have trained myself to never look.

Is the Touch Bar an innovation? Reviews are mixed, mostly taking a wait-and-see attitude. But I’m willing to call it now — the Touch Bar is a step backwards. Before, I had trained myself to know by touch how to change volume, brightness, and music. Now those buttons have no tactility. Just because something is new does not make it innovative. Just because you can create a whiz-bang bit of gadgetry does not mean you should.

Geeking Out

Update on 2016 tools and productivity enhancements

Last year this time I wrote about changes to the tools and processes I use for personal productivity. This is just a brief update on where things ended up.

Document storage

Switching from Neat to Doxie was a failure: the multi-step scanning process and poor software integration made it a non-starter. I am still stuck with a Neat scanner that works less and less reliably with each Mac OS update, and a software suite that is now officially unsupported and unmaintained. I still have not found a better solution for scanning and keeping track of the small quantity of critical paper documents that I receive.

Note taking

I have abandoned Evernote as bloated and unworkable, as planned, but found Ulysses to be overly focused on writing long-form documents, whereas I need a general note-taking application. I have been using Quiver, a notebook app aimed at programmers who want to store code snippets, and found it to work reasonably well for all types of notes. But I frequently get into trouble due to the lack of a full-featured iOS app.

I have been playing with Bear, a late entrant that is also a plain text/Markdown note taking app, and I’m generally pleased with it. But the import from Evernote is poor, and there are a few important features that are still missing.

Bookmarks and reading

Instapaper is still my favorite app for offline reading. Using Pinboard for shared/social bookmarking, however, was a bust — if the bookmark is not in my browser, I am not going to find it or see it. Instead I have switched to using Chrome on iOS so that my bookmarks and browsing history stay in sync between platforms.

Task management

Abandoning Things for 2Do was an overwhelming success. The features of 2Do work much better for me. But the lack of integration with other tools and/or a cloud component continues to hold it back from true excellence.

For more complex project management I have taken a look at a variety of tools including the venerable Basecamp (too opinionated, too wordy) as well as Asana (poor iOS app) and Flow, but I’ve fallen back to the trusty and flexible Trello.

Conclusion

Well, it’s good to try new things. With the plethora of tools and apps available, there should be something that fits everyone, but I still haven’t found the perfect set of apps for me. In particular, the Neat hardware/software is an (expensive) disaster, and there doesn’t seem to be a better tool for simply scanning, OCRing, and searching receipts and documents. But I will keep looking in 2017!

I have been moving more of my writing to Markdown format, and that makes it much easier to switch between apps. It would be easier still if every app supported the same set of Markdown formatting options.

Geeking Out

Running the numbers on backup generators

Whenever we have a power outage (which is not an infrequent occurrence in Hull) I ponder the utility of a backup generator.  Since I’m currently sitting in the dark, I decided to run some numbers.

For simplicity, I’ll assume a reasonably-sized whole-house generator kit with a transfer switch that uses natural gas or LP. A decent price on one of these units is about $5,500, and I’ll assume another $1,500 for installation (both electrical and plumbing).  Yearly maintenance contracts, which include an annual inspection and repairs, run around $300/year.  The useful life of the generator is estimated by various sources at around 20 years.

Adding this up, we get a total lifetime cost of $13,000 for the unit, not counting fuel costs (which are a very small component if you’re using natural gas since there is nothing to store).  That comes out to a yearly amortized cost of approximately $650.  On average we have two lengthy power outages a year.  So essentially, excluding fuel and unanticipated maintenance costs, the price is around $325 per outage.
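The arithmetic above is easy to check in a few lines of Ruby (all figures are the estimates from this post):

```ruby
unit_cost    = 5_500.0  # whole-house generator kit with transfer switch
install_cost = 1_500.0  # electrical and plumbing installation
maintenance  = 300.0    # yearly service contract
lifetime     = 20       # estimated useful life in years
outages      = 2        # lengthy outages per year, on average

lifetime_cost = unit_cost + install_cost + maintenance * lifetime
yearly_cost   = lifetime_cost / lifetime
per_outage    = yearly_cost / outages

puts "lifetime: $#{lifetime_cost.round}"  # $13000
puts "yearly:   $#{yearly_cost.round}"    # $650
puts "outage:   $#{per_outage.round}"     # $325
```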

That’s pretty significant.  Although it no doubt feels worth it on the day when it’s 5 degrees F and the boiler is shut off due to lack of electricity…

Update (29 Feb 2016): Discussion about this post on Facebook revealed a few friends and acquaintances who were able to get a functional generator setup for far less than the estimates here.  This is not surprising, because I was describing a “set it and forget it” approach.  For completeness (and cost practicality), I should mention how they achieved this.  Their setups typically included a smaller, manual-start gasoline generator hooked up to a manual transfer switch that protected only a few key circuits.  With no service contract and the willingness to go out in the storm to set up, start, and refuel the generator when needed, backup electrical capacity could be achieved for closer to $2,000-3,000.

Geeking Out

New tools and productivity enhancements for 2016

Around this time every year I re-evaluate my various tools and workflows and try to devote some time to productivity improvements. Last year, among other things, I spent a lot of time thinking about money management and banking. This year I’m mostly focused on knowledge management — document storage, note taking, task management, and the like.

Document storage

For years I have scanned paper contracts, records, and receipts using a NeatReceipts scanner and the Neat filing software. The app is sub-par but after extensive searching I still haven’t found anything better. I briefly flirted with Evernote, but I am just not comfortable storing more of my sensitive medical and financial records unencrypted in another cloud provider.

The scanner (from 2007) is no longer supported and stopped working with the latest OS X release, so I replaced it with the well-reviewed Doxie. Now instead of scanning and processing right in Neat, I scan, then import into Doxie, then export to Neat, then process in Neat — everything takes four times as long. As much as it pains me, I’m going to return the Doxie and pay Neat for a new scanner that looks just like my old one. I still think that keeping documents offline (and backed up) is worth the trade-off of not having access from my other devices, at least for now.

Note taking

I’ve been using Evernote for this, but inconsistently. I hate how bloated the app is, and its constant nagging to try new features and collaboration tools that I don’t need. In desperation I paid Evernote for their premium plan, but it didn’t make the problems go away. Instead I am switching to Ulysses, a Markdown note taking and writing app that is cleaner and simpler than Evernote but has all the features I want. Unfortunately the process for getting notes out of Evernote is not straightforward.

Bookmarks and reading

Instapaper is still my favorite app for offline reading. I send any interesting articles I see into Instapaper, where they are saved for later reading on all of my devices. I can also search the full text of articles I have saved in Instapaper, which is great for trying to find an article or fact months or years later.

This year I am adding Pinboard to the mix as well. I’m trying to bookmark and tag any interesting site or reference that I run across in Pinboard instead of relying on Google or my browser history to find it later. I’m also finally using IFTTT for the first time, to automate saving links to Instapapered articles as bookmarks in Pinboard. My goal is to have only one (or at least fewer) place to look when I want to find something, be it a code snippet, tutorial, recipe, or whatever else.

Task management

The biggest and so far best change has been abandoning Things, my task management app of several years, for 2Do. I’m finding 2Do more flexible, more pleasing to use, and just all around better than Things. The Things update cycle was very slow and new feature development almost non-existent. 2Do keeps getting better, and it really fits my workflow well. Task management is different for everyone — I use a methodology that is vaguely GTD but really just the system that works for me. 2Do is flexible and customizable but opinionated where it needs to be. It gives me all the features I need while maintaining an elegant and uncluttered user experience.

Parting thoughts

When it comes to productivity software — as in all things — I aim to be pragmatic. The tools and workflows I use all have trade-offs. I don’t like being tied too closely to any one cloud service or provider, and I like to maintain access to and backups of my own information. I choose to forego online access to more sensitive documents in favor of additional security and control, but I use Google, iCloud, and Dropbox for various aspects of my life due to their convenience and power. The choices and trade-offs are different for everyone. This is the system I am comfortable with for now, but it is likely to change dramatically as time goes on. 

Geeking Out

A week of disappointment with the Apple Watch

I got up at 3am to pre-order the Apple Watch and mine arrived on release day. My justification was simple — I have found the FitBit activity trackers to be useful but limited, and a more comprehensive device seemed like a great upgrade.

Unlike the many glowing reviews, I have found Apple’s much-hyped new gadget to be nothing but trouble. My litany of complaints is vast, so I will focus on a few major pain points that might dissuade others from purchasing this device until the next version is released.

Third-party app support
Third-party app support is universally poor. I have not yet found a single third-party app that works well: they are all slow to load, quick to crash, and often fall out of sync with their phone apps. Apparently when the watch goes to sleep the watch app loses its connection with the phone app, so things can’t just finish processing or loading in the background. You constantly have to stand there like a dope staring at your wrist waiting for something to happen.

Glances
The “glances” (cards) in the watch app for quick updates are also problematic for the same reason. Even Apple’s built-in glances, such as weather and stocks, do not update in the background, so I often find myself seeing yesterday’s weather or an out-of-date version of my todo list. Because I can’t trust the data to be accurate, I find myself not even bothering to use glances.

Watch faces
If you want a “digital” watch face with additional data (“complications”) rather than a pseudo-analog one, there is only one option. The level of customization is actually quite limited — you can’t put the time in the middle, for example, and while you can show your next calendar event or the phase of the moon, you can’t show your daily step count or any other third party app data. The calendar complication can only link out to Apple’s built in calendar app, the weather complication to Apple’s weather app, etc. There is no third-party integration possible, so if you like using Dark Sky to know when it is going to rain or Things for task management or basically any other of the thousands of third-party apps, there is no way to integrate them into your watch display.

Notifications
It is very difficult to tell the difference between a phone notification that is not actionable, a watch notification that can be tapped to get into an app, and the app displays themselves. The whole interface is confusing in that way — am I in a notification, an app, a glance? Will swiping work, or not? Tapping? It is completely inconsistent. And it is very easy to get lost or frustrated, tapping the screen repeatedly only to find nothing happening. Why tap repeatedly? Because sometimes in apps you need to tap multiple times to hit the tiny touch targets. And sometimes you hit the wrong one, end up somewhere else, and have no obvious way to get back.

General bugginess and unreliability
The built-in health tracking is extremely buggy. Sometimes it tells me it is “time to stand” while I am standing. Sometimes it tells me to stand when I’m in the car driving at 60 miles per hour. Sometimes it tells me I achieved a fitness goal while I’m in the middle of a run, knocking me out of my running app, which then crashes and will not reconnect with my phone, so I’m frantically navigating through the tiny app launcher while trying to keep up my pace. The scroll wheel (err, “digital crown”) gets mucked up and won’t turn until I run it under water. The maps app takes forever to update with my current location. Sometimes I get buzzes for notifications but then none display. Sometimes I send a text message reply and the whole watch freezes for 30 seconds. A couple times I’ve had to hold down both buttons to restart the watch because it got completely stuck.

If I’m going to wear a device on my wrist, I want it to integrate into my day. I want it to be effortless. I want it to show me the information I need when I need it. I don’t want to fiddle. I want the apps I already use to easily integrate and work well. I want to be able to hide the many apps that I don’t care about, making it easier to find the ones I do. I don’t want spurious notifications. I don’t want a watch that crashes.

The Apple Watch, in my experience, is a failure at its basic purpose. Even the buttons — there are two, one of which is dedicated to sending your friends drawings, which I will never do. There is no way to assign that button to something I might actually want to use, like a dedicated way to get to a single app, or back to the watch face.

Luckily, almost all of the problems I have run into are software related, so I can only hope that Apple will remedy them in software updates in the future. But will that be anytime soon? And will the updates work with this watch, or will I have to buy a newer model? In the case of the Apple Watch, it does not pay to be an early adopter.

Geeking Out

Introducing Hygroscope

Hygroscope is a command line tool for managing the launch of complex CloudFormation stacks in Amazon Web Services.

CloudFormation is a tool for creating and managing Amazon Web Services infrastructure using code. A JSON-formatted template describes the state of a “stack” including such resources as servers, S3 storage buckets, and load balancers. Utilizing the AWS Virtual Private Cloud service, entire software-defined networks can be described and repeatably created, updated, and destroyed using CloudFormation.
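For illustration, a stripped-down template describing a stack with a single S3 bucket looks something like this (the logical names here are made up, and real templates are typically far larger):

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal example stack",
  "Parameters": {
    "BucketName": { "Type": "String" }
  },
  "Resources": {
    "ArchiveBucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": { "BucketName": { "Ref": "BucketName" } }
    }
  },
  "Outputs": {
    "BucketArn": { "Value": { "Fn::GetAtt": ["ArchiveBucket", "Arn"] } }
  }
}
```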

CloudFormation is not without its pain points:

  • Templates must be written in JSON, which, in addition to being difficult for a human to read, does not support niceties such as inline comments and repeated blocks.
  • Launching CloudFormation stacks requires knowledge of the various parameters that need to be provided, and it is difficult to repeatably launch a stack since parameters are not saved in any convenient way.
  • There is no easy mechanism to send a payload of data to an instance during stack creation (for instance scripts and recipes to bootstrap an instance).
  • Finally, it is difficult to launch stacks that build upon already-existing stacks (i.e. an application stack within an existing VPC stack) because one must manually provide a variety of identifiers (subnets, IP addresses, security groups).

Hygroscope aims to solve each of these specific problems in an opinionated way:

  • CF templates are written in YAML and processed using cfoo, which provides a variety of convenience methods that increase readability.
  • Hygroscope can interactively prompt for each parameter and save inputted parameters to a file called a paramset. Additional stack launches can make use of existing paramsets, or can use paramsets as the basis and prompt for updated parameters.
  • A payload directory, if present, will be packaged and uploaded to S3. Hygroscope will generate and pass to CF a signed time-limited URL for accessing and downloading the payload, or the CloudFormation template can manage an instance profile granting indefinite access to the payload.
  • If an existing stack is specified, its outputs will be fetched and passed through as input parameters when launching a new stack.
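The paramset idea can be sketched in a few lines of Ruby. The file name, keys, and merge behavior below are illustrative, not Hygroscope’s exact format; Hygroscope fetches the existing stack’s outputs via the CloudFormation API, which is stubbed out here.

```ruby
require 'yaml'

# A hypothetical paramset: saved answers to a template's parameter prompts.
paramset = {
  "VpcCidr"      => "10.0.0.0/16",
  "InstanceType" => "m3.large",
  "KeyName"      => "agperson",
}

# Save the inputs so the same stack can be launched again later...
File.write("app-production.yaml", YAML.dump(paramset))

# ...then load them back, merging in outputs from an existing stack
# (stubbed; in reality these come from the CloudFormation API).
existing_stack_outputs = { "SubnetId" => "subnet-0abc1234" }
params = YAML.load_file("app-production.yaml").merge(existing_stack_outputs)

puts params.inspect
```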

The latest version of Hygroscope can be installed via RubyGems. The inline help documents each command and its options. The source code for Hygroscope and additional documentation is on GitHub, and a sample template that sets up a “bare VPC” is a good introduction to creating Hygroscopic templates.

Geeking Out

Running GitHub Enterprise in Amazon EC2

Update (2015-03-29): GitHub now supports an EC2 appliance and this information is no longer accurate. It is useful only for historic reasons or general background when confronting similar challenges from other vendors.

GitHub’s Enterprise offering allows companies to run their own private GitHub appliance behind the firewall.  It is distributed as an OVF container that runs under VMWare or VirtualBox.  But what if you want to run it, along with your other infrastructure, on AWS?  Here is the (completely unsupported) way to do it!

The goal is to get the base GHE virtual appliance running on AWS so that we can install the latest GHE software package on top of it.  This package takes care of updating and configuring everything.  Once the software package is installed, the appliance behaves just like its on-prem cousins.

Break into the virtual appliance

First we need the virtual appliance in a form that can be moved into AWS.  Download the current virtual appliance from the GHE dashboard and find a way to get at it.  You may be able to just launch it locally in VMWare or VirtualBox, if you are able to get root, but I did not do this.  Instead I extracted the archive (it is just a tar file) to get at the VMDK disk image inside, and attempted to import it into EC2 using the AWS VM Import/Export tool.

This requires some fiddling, because you have to install the old EC2 command line tools and get all the options right, with some plausible guesses about what is inside.  Here is the command I ended up running:

ec2-import-volume /var/tmp/github-enterprise-11-10-320-x86-64-disk1.vmdk \
 -f vmdk -z us-east-1a -b agperson-ghe -o $AWS_ACCESS_KEY -w $AWS_SECRET_KEY

Once the import is complete (you can check the status with ec2-describe-conversion-tasks) I attempted to launch it — and failed due to an unsupported kernel.  But never fear!

Figure out what’s under the hood

If you don’t want to do this yourself, skip to the end of this section where I tell you the secrets.

The VM import creates an EBS volume.  It may not be runnable, but it is mountable!  So start up a one-off Linux instance and attach the volume to it.  The data is stored in LVM, so you may need to install the lvm2 package and then run lvmdiskscan to see the volume group.

Run vgdisplay to get the name of the volume group (“enterprise”) and activate it by running vgchange -a y enterprise. Now you can mount the root volume:

mkdir /ghe
mount /dev/mapper/enterprise-root /ghe

Poke around in this volume a bit and you will establish that the virtual appliance comes with Ubuntu 11.10 Oneiric (wow!) and is 64-bit. With this information, we can launch an equivalent instance in EC2.

Setup an Amazon-happy instance

Launch a new EC2 instance using the publicly available community AMI from Ubuntu for 64-bit Oneiric (make sure you are using the released version — in us-east-1 I used ami-13ba2d7a). I chose an m3.large which is a good baseline based on GHE’s requirements. Make sure to attach a second volume for data or make the root volume large enough to hold all your repositories, and use SSD storage because it makes life better. Put your new instance in a security group that allows traffic on ports 22, 80, 443, and, if necessary, 9418 (the git:// port, which is non-authenticated so often not used on GHE installs).

When the instance launches, login as the “ubuntu” user and become root. Modify the /etc/apt/sources.list to point all archive stanzas at old-releases.ubuntu.com (including the security ones). Run an apt-get update && apt-get upgrade and wait a few minutes.

Now you need to copy over all of the files from the virtual appliance. You can either do this via SSH from the one-off instance you launched earlier, or detach the volume from that instance and repeat the steps to get LVM running and attach it to the new instance. Either way, use rsync to get everything important onto your new VM. Rackspace offers a helpful tutorial on doing this, including a good set of directory paths to exclude. I used their list and everything worked fine. The command I ran with the volume mounted locally was:

rsync --dry-run -azPx --exclude-from="exclude.txt" /ghe/ /

(and once I was satisfied, I ran it again without the “--dry-run” flag).

Bombs away!

Before rebooting, copy your SSH key into /root/.ssh/authorized_keys in case anything goes wrong (and take a moment to ponder who Ben is and why his “HacBook-Air.local” key is on our server!). Then restart the instance and, when it is done booting, visit it via HTTPS to see the beautiful GHE setup screen! Upload the latest software package and your license key and give it half an hour or so, and if everything goes well, you will have a fully-functional GitHub Enterprise instance in the cloud.

Note that after the software package installs you will no longer have root access to the server. A pity.

A few other important steps are left as an exercise to the reader — lock down access, set up SES or some other email sending capability, stay in compliance with your license, and take frequent backup snapshots! Good luck!

Geeking Out

Sending automated notifications to HipChat rooms

At work we have been piloting HipChat’s new self-hosted on-premises option for the last few months.  It has been great having a bunch of people who work in different buildings and on different schedules using shared chat rooms for communication.

I have also been experimenting with hooking HipChat into our toolchain. We now have a chat room where every Capistrano deployment is announced, and another where all of our high-priority Zabbix alerts are collected. HipChat makes this easy with their version 2 API’s room notifications feature. A room owner can simply generate a room-specific API token and plug it into a script to send notifications.
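A minimal notification sender can be sketched with Ruby’s standard library against the v2 room notification endpoint. The hostname, room name, and token below are placeholders, and the request is only built here, not sent.

```ruby
require 'json'
require 'net/http'

# Build (but don't send) a HipChat API v2 room notification request.
def build_notification(host, room, token, message, color: "red")
  uri = URI("https://#{host}/v2/room/#{room}/notification")
  req = Net::HTTP::Post.new(uri)
  req["Authorization"] = "Bearer #{token}"
  req["Content-Type"]  = "application/json"
  req.body = JSON.generate(
    "message"        => message,
    "color"          => color,   # one of red/yellow/green/purple/gray
    "notify"         => true,    # trigger an audible/visible notification
    "message_format" => "text"
  )
  req
end

req = build_notification("hipchat.example.com", "zabbix-alerts",
                         "ROOM_TOKEN", "PROBLEM: disk space low on web01")
puts req.uri
# Actually sending it is just:
#   Net::HTTP.start(req.uri.host, req.uri.port, use_ssl: true) { |h| h.request(req) }
```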

Here is an example: a screenshot of our Zabbix alerts room in HipChat.

And to make it easy for the next person who wants to do this, I’ve released the code on GitHub.

Instructions for setting it up are in the README. And 15 minutes later, you’re in business with pretty and useful Zabbix notifications in HipChat.

Geeking Out

Docking the hype of Docker (sort of)

Update 4/22/14: James Turnbull has covered similar territory and reached similar conclusions. In fact the more I look, the more I see this debate playing out and the first generation of solutions beginning to take form. In just two months I’m now much more optimistic about the immediate applicability and viability of Docker to real-world problems.

I originally posted this entry on the HUIT DevOps community blog

When I first heard about Docker I knew it was something to watch.  In a nutshell Docker is a mechanism on top of Linux Containers (LXC) that makes it easy to build, manage, and share containers.  A container is a very lightweight form of virtualization, and Docker allows for quickly creating and destroying containers with very little concern for the base OS environment they are running on top of.

Because Docker is based around the idea of running “just enough” OS to accomplish your goals, and because it is focused on applications rather than systems, there is a lot of power in this model.  Imagine a base server that runs absolutely nothing but a process manager and the Docker daemon, and then everything else is isolated and managed within its own lightweight Docker container.  Well imagine no longer, because it is being built!

But with power always comes responsibility, and Docker has a caveat you can drive a truck through — the ephemeral, process-oriented nature of Docker strongly favors moving back to the old “Golden Master Image” approach to software deployment.  That is to say, it’s great that you can easily distribute a completely isolated application environment that will run everywhere with no effort.  But in doing so, it is very easy to ignore all of the myriad problems that modern configuration management (CM) systems such as Puppet were built to address.

Continue reading “Docking the hype of Docker (sort of)”

Geeking Out

New to Mac or Linux? Try this basic shell configuration

I originally posted this entry on the HUIT DevOps community blog

I’ve recently worked with several folks who live in a Windows world and are either moving to a Mac laptop or have to do work on a Linux server.  In the DevOps world, developers are often pushed outside of their comfort zone.  Having to work in a UNIX shell can be quite disconcerting.

While I can’t give you a 5-minute primer that takes all the pain away, I can point you in the right direction.  I have created a Bash shell configuration that provides some sane and useful command line defaults, much better than what you get out of the box.

Continue reading “New to Mac or Linux? Try this basic shell configuration”

Geeking Out

Capistrano multistage deploy configuration stored in a YAML file with MultiYAML

I spend a lot of time working on deploying a variety of software applications smoothly to different environments. A tool central to my workflow is Capistrano, an SSH-based deployment framework written in Ruby.

In its Ruby-ish way, Capistrano’s multistage functionality requires stubbing out different Ruby files for each stage — staging, production, etc. In our environment, I decided it was better to instead store all of the per-stage configuration in one single configuration file, and I chose to do it in the simple YAML format.

There are several advantages to this approach:

  • The file format is straightforward and can be modified both by humans and scripts, including automatic updates from a central source of truth.
  • There are fewer configuration files, and within the single configuration file there is much less repetition of configuration, because we can use YAML’s built-in anchor/alias functionality.
  • It strongly encourages storing deployment logic in the deploy.rb file and hooking tasks using Capistrano’s before/after callback functionality, rather than building stage-specific tasks.
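The anchor/alias trick looks something like this. The stage file below is a hypothetical sketch, not the exact capistrano-multiyaml schema; common settings are defined once under an anchor and merged into each stage.

```ruby
require 'yaml'

# aliases: true is needed on newer Rubies (Psych 4+) to allow anchors/aliases.
config = YAML.load(<<~EOS, aliases: true)
  defaults: &defaults
    repository: git@example.com:myapp.git
    deploy_to: /var/www/myapp
    branch: master
  stages:
    staging:
      <<: *defaults
      branch: develop
      servers: [staging01.example.com]
    production:
      <<: *defaults
      servers: [www01.example.com, www02.example.com]
EOS

stage = config["stages"]["production"]
puts stage["branch"]          # "master", inherited from the defaults anchor
puts stage["servers"].length  # 2
```

Staging overrides only what differs (its branch and servers); everything else comes from the shared defaults block, so there is no per-stage repetition.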

The module I built is inspired by Jamis Buck’s original Capistrano multistage module, as well as Lee Hambley’s prototype YAML multistage extension, which was never packaged and is no longer maintained.

My capistrano-multiyaml module is available on GitHub along with documentation, and can be installed via RubyGems.

Geeking Out

Data privacy and security in 2013: Cloudy!

On Friday I attended a Data Privacy Day (a real thing!) panel co-sponsored by HUIT and Harvard’s School of Engineering and Applied Sciences called “The Intersection of Privacy and Security”. The panelists were noted Harvard technology graybeard Scott Bradner, always interesting professor Salil Vadhan, and SEAS computing director Steve King.

After some brief introductory remarks by the panelists about balancing privacy and security, the floor was opened. I seized the opportunity to ask about something that has been much on my mind lately: how to make sensible personal choices about data privacy (and security!) in an age of highly connected devices that depend heavily on third-party hosted services.

Or to boil that down a bit more: Let’s say I have a phone, a tablet, and a laptop, a pretty common set of devices these days. And let’s say I use them all constantly. And these devices are tracking what I read and listen to, who I talk to, where I go, what I buy, and every email, chat, and text I send and receive. They are syncing this data between each other and up to an amorphous “cloud” service, where my data is being collated, cross-referenced, sold to marketers, and stored forever.

Given this fact situation, how can I, as an individual, make sensible privacy and security trade-offs, when in order to get the maximal value out of these devices, I must cede control of my data — both the privacy of it and the security of it — to a third-party vendor such as Google or Apple?

A variety of answers were given, none of them entirely comforting. From Bradner, first, came the cynical view — pay in cash, forego loyalty programs, do not use cloud services, and assume everything you store online will be there forever. This is a valid answer, and rock-solid from a data privacy perspective, but I don’t consider it very practical.

His next suggestion was an interesting one, and that was to look for natural alignments — is the corporation I’m entrusting with my data looking out for the same things as I am? His example, backed up by King, was Google’s track record of fighting invalid data requests from governments and safeguarding customer information. They do this both because that information is valuable to Google, and because customer confidence in Google is also valuable to their bottom line. This raises some interesting and difficult questions — with a company as far-reaching and often secretive as Google, how can we know their actions and track their intentions? For how long will my interests align with Google’s, and when they inevitably stop aligning, how can I erase my digital life from Google’s clutches?

Professor Vadhan, I believe, was the one to bring up some of the regulatory remedies. Data privacy laws, when well crafted, could help protect individuals from corporate data misuse, and perhaps even some types of government data misuse. Europe has tried several approaches to this, with mixed success. But such regulation is not on the docket in the United States currently, so that solution doesn’t provide any immediate guidance. And, Professor Vadhan admitted, he clicks through every terms-of-service notice and privacy agreement without reading it, just as we all do.

In my view, and seemingly that of the panelists, there is no clear path forward at present for this problem. For now we must all work to inform ourselves about risks, balance the trade-offs, and make decisions that we are comfortable with. So maybe I will use the CVS loyalty card, but not link it to a credit card. Or I will use Google’s Gmail service, but not Google+. This is complicated, time-consuming, and frankly difficult — Facebook’s privacy settings, for instance, shift frequently in unexpected ways, often without notice. Opting out of online services’ choices about how to use our personal data is becoming more and more difficult — perhaps because they see it as their data.

With no easy answers on individual data privacy, we can only muddle on as we have been doing, and hope for clearer, easier choices in the future. Meanwhile, the data we share ends up in unexpected places. The only silver lining, in my view, is that I’m not convinced that putting something on the internet necessarily means it will be there “forever”. The internet does seem to forget, or, if not forget, at least the constant deluge of new data seems to moderate and bury the old, in ways that can only be good for our lasting well-being.

Geeking Out

Slow performance of Kerberos actions in RHEL 6 (and CentOS 6)

It took me several weeks to track down this problem, one that dramatically impacted the speed of specific actions requiring frequent Kerberos lookups. The symptom is slow Kerberos actions, such as doing a “kinit”. I don’t think the backend matters — we have both MIT Kerberos and Active Directory, and the service hits both. On a RHEL 5 machine with a similar configuration, such a lookup in our environment, which requires a few hops and DNS lookups and such, takes around 80ms. On the new RHEL 6 machines, the same lookup takes around 300ms. Most of the time this is barely noticeable, because Kerberos actions are infrequent and normally only need to occur once.

It so happens that an important service we run is Subversion for source code management. Our Subversion runs under Apache (using mod_dav_svn) with Kerberos for authentication. We allow both password-based authentication and ticket-based authentication. Apache handles these as negotiate requests using the mod_auth_kerb module.
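The Apache side of that setup looks roughly like this — the paths, realm, and keytab location are placeholders, though the mod_dav_svn and mod_auth_kerb directives themselves are real:

```apache
<Location /svn>
  # Serve repositories under a single parent path via mod_dav_svn
  DAV svn
  SVNParentPath /var/svn

  # Kerberos auth via mod_auth_kerb: accept both ticket-based
  # (negotiate) and password-based logins
  AuthType Kerberos
  AuthName "Subversion Repositories"
  KrbMethodNegotiate On
  KrbMethodK5Passwd On
  KrbAuthRealms EXAMPLE.COM
  Krb5KeyTab /etc/httpd/conf/httpd.keytab
  Require valid-user
</Location>
```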

When authenticating with the password dialog, you put in your password once, Apache takes it from there and performs the Kerberos lookup, and all further actions occur speedily. But when using a ticket, the preferred authentication method, actions are very slow. This is especially noticeable for large check-ins, but is annoying most of the time, even for small actions, because SVN has to perform several requests for even a simple update or small check-in.

I eventually tracked down the problem as being related to the newer version of Kerberos on RHEL 6. (For a while I was convinced the culprit was SSSD, but not so!) Specifically, newer Kerberos RPMs are patched to load in SELinux label configurations and use them when creating temp files. Unfortunately the label configurations are very large files full of regexes, all of which need to be churned through and memory mapped — on every request! In our case that portion of the operation takes about 120ms, and happens twice per request.

The solution is to disable SELinux completely (not just set it to permissive mode) and restart, or perhaps to recompile krb5 without the SELinux patch. Of course, once I finally figured out what was going on, I discovered a previously filed bug languishing in Red Hat’s Bugzilla that outlines this exact issue.
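Making that first fix stick means setting SELINUX to “disabled” in /etc/selinux/config and rebooting — permissive mode still loads the label configuration on every request. A sketch of the change, run here against a temporary copy of the file so it is safe to try anywhere (on a real box, edit the actual file as root):

```shell
# Stand-in for /etc/selinux/config; edit the real file as root on RHEL 6.
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"

# Flip SELINUX to disabled -- "permissive" is not enough, because the
# label configuration still gets memory-mapped on every Kerberos request.
sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"

grep '^SELINUX=' "$cfg"    # prints: SELINUX=disabled
```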

So for the next person who has this problem, I hope this pops up earlier in the Googles, and saves you some aggravation.

Geeking Out

The Terrible User Experiences of Modern Kitchen Appliances

About a year ago I was on a smoothie kick and was about ready to boot my blender out the window. It was your standard big-box-store model, with a dozen buttons bearing descriptors like “purée” and “Liquify”. An extra switch activated a sort of “turbo” mode, in case you need to go, as Nigel Tufnel would say, one higher. Unfortunately none of the buttons did what I wanted, namely take various solid and liquid ingredients and make them into a tasty smoothie.

So I threw it away, and bought this:

Osterizer Beehive Blender

I have never been happier with a blender. It has one mode, “on,” and when I turn it on, it blends things. No mix, no crush, no frappe, and no whip. Just blend.

A couple months ago we needed to purchase a new washing machine, and I went searching for one with good ratings. Sadly, Consumer Reports has no rating category for usability. I eventually settled on this nice Maytag, which offers 11 buttons, a dial with 10 wash modes, a customizable “my cycle,” and ten additional custom options and settings.

Maytag 2000 Washing Machine Control Panel
Maytag 2000 Washing Machine Control Panel

All I want is one button that says “make my clothes clean,” but sometimes I have to put normal clothes on “heavy duty” mode or switch from “high” spin speed to “extra high” to get what I want. I have no idea what prewash does — isn’t it just more washing?

This is my toaster oven.

Toaster oven from Costco
Toaster oven from Costco

It’s fine, I guess. I’ve pretty much figured it out, but guests are always confused. When I make a pizza, I put it on the temperature recommended by the recipe, not whatever mode is offered by the dedicated “pizza” button. I have never defrosted anything in my toaster oven. And the clock is never correct. I have, on multiple occasions, gotten everything set, walked away for the predetermined period of time, and returned to uncooked food — only to realize that I never pressed the “Start” button.

After about three minutes of thought, I came up with a better interface for this toaster. Here it is.

My Imaginary Toaster Control Dial
My Imaginary Toaster Control Dial

For toast you go to the left, and for bake you go to the right. Seems pretty straightforward to me; there are even words that say “bake” and “toast” to clarify. There is no keypad or selector arrows, just a simple dial. And there is no clock. Because I have an iPhone. And there is no cook timer, because, right, still have the iPhone. No pizza button, because not only can my iPhone tell me at what temperature to bake a pizza, but a pizza button is also not a real thing.

I would buy my toaster in a heartbeat; I’d pay double what a comparable toaster costs. But I guess no one else in the toaster market is like me, and I think that is sad.

At least coffee makers are still generally pretty straightforward. Maybe I should start drinking coffee.

Geeking Out

MacBook Air overheating with OS X Lion and FileVault

I noticed today that my Mid-2009 work-provided MacBook Air, which I recently freshly installed with Lion, was generating a ton of heat and operating really slowly. The kernel_task process was using between 130% and 150% CPU, and the CPU meter was completely full. Meanwhile, no other processes were using a significant amount of RAM or CPU (in fact, I have barely installed anything on this machine yet). I tried quitting all of my applications, unplugging all peripherals, and disabling wifi, with no change. I rebooted, with no change. Even researching the problem online was highly frustrating because things were going so slowly.
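For anyone retracing these steps, the per-process CPU numbers came from Activity Monitor, but the same view is available from Terminal with plain ps (generic Unix usage, nothing specific to this bug):

```shell
# Show the top five CPU consumers; during the episode, kernel_task sat
# at the top of this list at 130-150%.
ps -A -o %cpu,comm | sort -rn | head -5
```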

After reading a variety of message board postings, I came to the conclusion that the kernel_task high CPU usage was a symptom, not a cause. The machine was overheating due to a combination of very high ambient temperatures (high 90s Fahrenheit) and being seated on a surface that did not allow it to dissipate heat effectively. The kernel_task process was spinning, apparently, to keep other processes from using the CPU.

Strange as this seemed, there was an easy test — I plopped the laptop, now running nothing else, in front of an air conditioner and hit it full blast. Within a few minutes the kernel_task process dropped to 70-80% CPU usage and things became responsive again. But even though the laptop was cool to the touch, the process never went down from there.

My next thought was that perhaps the new FileVault full-disk encryption was playing a role, since it was one of the only things running once I had disabled all third-party processes and quit all apps. So I set the drive to decrypt, which took only 20-30 minutes on the SSD. Sure enough, as soon as it completed, the kernel_task processor usage dropped to almost nothing.

I’m not going to repeat the experiment to verify the results, so use this as one data point only. But if you are experiencing similar behavior, first cool the machine and move it to a hard, flat surface. That should alleviate most of the symptoms.