Version Zero

Pre-released Ideas.

HTTP Redirects Are Monadic

In Monadic i/o and UNIX shell programming, a subset of the UNIX shell commands is shown to be isomorphic to monadic I/O operations in Haskell.

While the conclusions may be somewhat academic–in that they mostly help us understand one system in terms of the other–the connection is nevertheless worth further consideration.

HTTP redirects are a mechanism used by HTTP servers to inform a client that further work needs to be done. In some cases, the resource the client seeks has moved, and the server is simply informing the client of the resource’s new location. More interestingly, redirects can be used to “string together”–or pipe, so to speak–a series of HTTP requests, where each step transforms the previous request’s result into the input of the next request.

In what follows, I will show that many contemporary uses of HTTP redirects behave analogously to UNIX shell commands. Furthermore, I will argue that this use demands a distinction between plain redirects and safe redirects, where a safe redirect is one that is not reflected visually in the browser’s UI, allowing possible improvements to the user experience.

<Insert shell/redirect behaviour comparison here>
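
As a rough sketch of the comparison I have in mind (the hosts and responses below are fabricated, standing in for any multi-step web flow), a shell pipeline threads data through a series of commands, while a redirect chain threads the client through a series of requests:

$ cat access.log | grep "GET /account" | sort | uniq -c

$ curl -sIL http://www.example.com/account | grep -i -e "^HTTP" -e "^Location"
HTTP/1.1 302 Found
Location: https://auth.example.com/challenge
HTTP/1.1 302 Found
Location: https://www.example.com/account
HTTP/1.1 200 OK

In both cases each stage consumes the previous stage’s result and produces the input for the next, which is exactly the behaviour the monadic reading captures.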

Before we continue, we require a few definitions. We need to distinguish between a single redirect and a transparent redirect: a transparent redirect is a series of n redirects, for n > 1. Since transparent redirects may traverse one or more domains, we require a further distinction between safe and unsafe transparent redirects. A safe redirect is a transparent redirect within a trust domain, forest or realm. In other words, if the sequence of redirects follows a path between trusted regions, then the sequence is considered safe.

How to implement safe redirects is beyond the scope of this post. For now, we assume that the infrastructure to perform them already exists. The implementation details will be addressed in a future post.

Having now defined safe redirects, we can concentrate on their use. Like existing sequences of redirects, safe redirects use one or more application servers to provide more complex client/server interaction, while reducing coupling between application modules as well as between separate applications. A typical example of these sequences can be seen in the authentication steps required by many modern web applications. For instance, some bank web applications require a complex sequence of redirects to properly authenticate a user.

While it may be comforting to know that such an institution is taking such extensive precautions to secure personal data, being reminded of it on every visit to the site provides little value. That is: if the only indication of the process is a series of redirects, which can be seen to provide a portion of the authentication process, then it may be of benefit to simply group the sequence of operations into a single unit. Herein lies the benefit of a safe redirect.

By using a safe redirect, web applications can hide the sequence of redirects from the end user, so as to reduce the amount of navigation noise experienced by the client software. Which is to say: if a user needs to check their account balance, the login process should take them from the username and password page and land them directly in their account, rather than having the client software, i.e. the browser, load and render the intermediate pages defined by the login redirect sequence. The net effect of this change would be to eliminate the need for any visual cues to the user about the nature of the authentication process.

This change could provide two key user-experience benefits. First, a sequence of redirects would appear as a single page navigation. And second, the client’s back functionality could be used to undo all redirects in the last sequence, rather than simply the last redirect of the sequence; meaning that if a user is delivered to an account page, clicking the back button in their browser would return them to the login page.

Mac OS Gets Cilk++ Support via Clang and LLVM

Part of my as-yet-unfinished graduate work involved using a combination of threading and cache-oblivious algorithms to improve the performance of various linear algebra computations on sparse data sets. During the course of my work, I used a few concurrency models, including Intel’s Cilk Plus data- and task-parallelism techniques.

Recently, the Cilk Plus language extensions were introduced to the Clang frontend for LLVM, which means they are now available on Darwin. Though the Cilk Plus extensions and runtime have been available in GCC for some time, building the GCC code on Darwin proved to be non-trivial. (For me, at least—the Linux build was far easier.) The introduction of Cilk Plus in Clang allows for easy compilation and use on the Darwin platform.

I followed the instructions given on the new Cilk Plus/LLVM GitHub site to get a copy of the compiler onto my local machine.

The current implementation only provides support for a subset of the Cilk Plus language extensions: cilk_spawn, cilk_sync, and hyperobjects. Features like cilk_for, array notation and SIMD work are left for future releases. Despite not providing the full feature set, the Cilk Plus/LLVM tools can already be used to write non-trivial parallel programs.
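
Compiling and running a program against the new toolchain looks much as it does with the GCC implementation. Something like the following should be close, though the exact flag names may differ between releases, and strassen.c, the input size and the worker count are only placeholders:

$ clang -fcilkplus -O3 strassen.c -o strassen -lcilkrts
$ CILK_NWORKERS=4 ./strassen 1024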

I dusted off some benchmarking code I used for previous experiments and gave it a go on my laptop. While not a high-performance big-iron machine, my laptop does weigh in well with 16GB of RAM and a quad-core Intel i7.

The code I ran uses a well-known matrix multiplication algorithm— Strassen’s algorithm—to create large parallel workloads. The code was based on example code shipped with the original 1996 MIT Cilk language and runtime. I modified the code slightly to include the new language extensions as well as to provide slightly more performance information. (I’ll post the changes shortly.)

I ran the code through a variety of benchmarks, which I’ve included below.

[Figure: speedup, plotted for several input matrix sizes]

The legend labels describe the size of the matrices being used. For instance, 1024 means that Strassen’s algorithm is being run using inputs of size 1024 by 1024. Notice we don’t achieve linear speedup, but we do get very close. I think that with a little extra fine-tuning (1) and the explicit use of some SIMD instructions, near-linear speedup could be achieved. If you want the data, you can get it here.

  1. Yes, I know: cache-oblivious algorithms aren’t supposed to require tuning. In practice, unfortunately, they do. Maybe not as much tuning as their cache-aware brethren, but they still require some. Even the original MIT code had some tuning code in it, so I’m not alone in thinking it necessary.

Towards a Working Raspberry Pi In-car Computer

Ever since I purchased my latest vehicle, I’ve been working on a little side project to interface directly with the vehicle’s on-board computer.

The task itself is pretty straightforward: buy a device capable of plugging into the vehicle’s OBD2 port, and run some free or inexpensive Android app to gather the information.

Unfortunately, the above process requires a great deal of user intervention. First, you must plug in and unplug the OBD2 device every time. Second, you must have either an Android device on your person or in the vehicle capable of running the required software, or a computer with a variety of command-line tools installed. The latter is impractical, and the former is error-prone (I’ll forget to start the app, take the readings, etc.) and tedious (the data must later be extracted from the app manually, or a web service must be made available to push the data to–which also requires a data plan, if the car is not always within range of a wifi access point).

Raspberry Pi to the rescue.

The Raspberry Pi is a small, inexpensive, low-power computing machine. It was originally developed with teaching in mind, but sparked an enormous hobby market. The Raspberry Pi, or raspi for short, is available in two models: the Model A is $25 and the Model B is $35. The differences between the models are nominal: the more expensive one has more USB ports, more RAM and an Ethernet port. As a developer, the Model B is preferable, since it provides easy extension through USB and trivial network access via Ethernet, not to mention more room to write sloppier prototype code in the roomy 512MB main memory. The Model A would be best suited to production or permanent projects, post-development. (Not true in all cases: having network access can be critical for some applications–though a USB wireless dongle may mitigate this issue.)

I purchased two of the $35 Model Bs for a separate project, but quickly found that they would be a good fit for a variety of computing projects. One of these projects ended up being the computer for my vehicle.

The original intention of the car computer was to keep track of details that were important to me, but slightly time-consuming and tedious to collect. One simple motivating example was tracking fuel economy. It’s a simple calculation to make, record and compare to previous trips. But why bother doing this work by hand, if it will eventually end up in a computer anyway and the computer itself can gather and tabulate the data automatically? It’s hardly an earth-shattering improvement, but it led to further ideas. For instance: it would be great to know where, when and why the vehicle experiences the worst fuel economy. By taking additional data samples, like location, speed, temperature, time, and a variety of engine performance metrics, a very clear picture of how a vehicle performs will emerge.

Towards these ends, I’ve been configuring a raspi to read from a Bluetooth OBD2 adapter and stash the information until it can be transferred to my home network for processing.

The idea is simple: poll the OBD2 adapter periodically for all the data it can return. Simultaneously poll a GPS unit and stash the correlated data locally on the raspi device. If and when my home network is within range, join the wireless network and transfer the collected data for processing.
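
As a sketch of that loop, assuming an ELM327-style adapter behind /dev/rfcomm0 (the PID being polled, the log location and the interval are placeholders, and real code would need proper serial-port setup and error handling):

#!/bin/bash
# Crude polling sketch: request engine RPM (mode 01, PID 0C) once a minute
# and log the raw reply with a timestamp for later processing.
while true; do
    printf '010C\r' > /dev/rfcomm0
    IFS= read -r -t 2 reply < /dev/rfcomm0
    # (GPS samples, e.g. from gpsd, would be read and correlated here.)
    echo "$(date +%s) ${reply}" >> /home/pi/obd2/raw.log
    sleep 60
done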

more on the way

Configuration

USB Hub Configuration

The first problem I ran into was that the cheap USB Bluetooth dongle I purchased was not well received by my raspi. Fortunately, as with many things, I was not the only one who had this issue. A quick web search turned up the solution: the USB hub’s speed had to be turned down to the USB 1.1 standard, as the 2.0 standard was known to not play well with some cheap devices.

The fix was simple, once I found it, and only involved changing a few of the boot parameters the kernel uses to initialize hardware. In particular, I changed:

  • dwc_otg.speed: 1 limits USB speed to full speed, 12Mbps (USB 1.1).
  • dwc_otg.microframe_schedule: 1 (now the default) fixes the error seen when too many periodic endpoints are present.
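
These are kernel parameters, so on Raspbian they belong on the single line in /boot/cmdline.txt, added alongside whatever options are already there; roughly:

dwc_otg.speed=1 dwc_otg.microframe_schedule=1 <existing options unchanged>

A reboot is needed for the change to take effect.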

Bluetooth Configuration

Having no documentation for the OBD2 sensor I purchased, I used sdptool to learn a little about the device.

$ sudo sdptool records 00:19:5D:27:12:6E
...
Service Name: SPP Dev
Service RecHandle: 0x10002
Service Class ID List:
  "Serial Port" (0x1101)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 1
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100

It’s not a hugely informative dump—in fact it’s downright cryptic—but it does provide us with the protocol and channel to connect with.

By configuring /etc/bluetooth/rfcomm.conf correctly, we get a device, /dev/rfcomm0:

rfcomm0 {
  bind yes;
  device 00:0D:3F:45:DF:8A;
  channel 1;
  comment "OBD2 Sensor";
}

Then we only need to restart the bluetooth service:

$ sudo systemctl stop  bluetooth.service && \
  sudo systemctl start bluetooth.service

Alternatively, one could use the command:

$ sudo rfcomm bind 0 00:15:A0:7A:90:F2 1

Here the 0 is the device suffix, the hardware address is that of the OBD2 sensor, and the 1 is the channel we want to connect on. We should now see the RFCOMM device available for use:

$ ls -l /dev/rfcomm0
crw-rw---- 1 root uucp 216, 0 Jan  1 01:14 /dev/rfcomm0
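
As a quick sanity check, assuming the sensor speaks the usual ELM327 AT command set, attaching a terminal to the device and typing ATZ should produce a version banner (the exact banner will vary; the baud rate is largely irrelevant over RFCOMM):

$ sudo screen /dev/rfcomm0
ATZ
ELM327 v1.5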

Using C++ Templates to Generate Compile-time Arrays

There has been a huge amount of attention paid to template meta-programming in C++ in recent years. The idea is to let the compiler do some additional work up front so that the runtime environment is forever relieved of the need to do it.

The go-to example in Computer Science for recursion is to list the first n Fibonacci numbers. It’s easy to see why: the code is very simple and highly readable:

int fib(int n)
{
  if (n <= 0) {
    return 0;
  } else if (n == 1 || n == 2) {
    return 1;
  } else {
    return fib(n-1) + fib(n-2);
  }
}

Of course, we would never actually do this in practice–since it is inefficient–but it does nicely illustrate the beauty and simplicity of recursion. It’s fitting, then, that we use the same example to illustrate compile-time array generation.

Note that gcc/clang users will not be able to take advantage of this, as it seems to only be a “feature” of the Microsoft C++ compiler. It’s also non-standard, so it’s unlikely to work at any point in the future.

template<class T, int n>
struct fibonacci_numbers {
    static const T end = fibonacci_numbers<T, n-1>::end
        + fibonacci_numbers<T, n-2>::end;
};

template<class T> struct fibonacci_numbers<T, 2> { static const T end = 1; };
template<class T> struct fibonacci_numbers<T, 1> { static const T end = 1; };
template<class T> struct fibonacci_numbers<T, 0> { static const T end = 0; };

The code uses a few tricks to accomplish its job: we use partial specialization to define the base cases of the recursion. This can be seen in the last three lines, where the numbers listed coincide with the conditions in the original code.

The more surprising property we take advantage of is that the MS C++ compiler allocates storage on the stack for the end member of fibonacci_numbers. Thus, since we are recursively defining instances of the fibonacci_numbers structure, the compiler allocates stack storage for each instantiation.

Once compiled, the data must be accessed in reverse, due to the nature of the template recursion. The following loop prints the first 12 Fibonacci numbers:

#include <iostream>
using namespace std;

const int n = 12;

int
main ()
{
  fibonacci_numbers<int, n> fibonacci;
  // Relies on the (non-standard) layout described above: the instantiation
  // for n-i is assumed to sit i slots below the outermost object on the stack.
  for (int i = n; i >= 0; i--) {
    cout << (n-i) << " " << (*((&fibonacci)-i)).end << "\n";
  }
  return 0;
}

It’s not pretty, but it is effective. It should be easy to write an iterator wrapper for the above code, but I’ll leave that for a future post or update.

Finding Balance Between Local and Remote Terminal Sessions

Introduction

Recently, a number of our externally visible school machines were compromised. The full extent of the attack’s effects is not clear, but from the details I did get, it sounds like the work of an amateur. Either way, the point of this article is not to speak to that issue directly, but rather to focus on the inconvenience of having some of the host names you are accustomed to having access to suddenly disappear. To be clear, this is not an article complaining about having to change the hostnames we type on the command line, but about the ones we have embedded in scripts and configuration files.

I know I am not the only one who was caught out by this, but it is not clear if others were as thoroughly irritated by the change. I do a great deal of work remotely and, as such, I am highly dependent on the stability of the set of host names I use. So, in order to protect myself from any similar future mishaps, I made a few changes to the way I work.

It is not clear if my approach is particularly novel, but I’ve not seen it listed elsewhere, so I thought I would document it for others to try. It occurred to me that I could easily bypass these issues by simply running a local load balancer that balances over all the remote school hosts. There are a number of issues that crop up using this approach, and I will try to address them as thoroughly as possible.

Finding Balance

First things first: the software. I used an appropriately named tool called Balance. As the name implies, it is a load balancer–and a very simple and effective one at that. I used my trusty old package manager to install it, but building it from source would be equally appropriate.

The syntax for Balance is very simple, so it can be covered quickly:

$ balance 2222 linux1.example.com:22 linux2.example.com:22 linux3.example.com:22

This tells Balance that connections to the local port 2222 should be forwarded in an alternating fashion to port 22 on linux1.example.com, linux2.example.com and linux3.example.com. Furthermore, it tells Balance implicitly to run in the background.

Balance can be told explicitly which local address to bind to, via the -b option, but I generally have only one network adapter active at any one time, so a second instance would simply fail: its call to bind() would be attempting to bind the same local port on the same IP address. There are ways to get around this limitation, such as binding multiple IP addresses to a single network adapter, but I chose not to take that route, as it seems needlessly complex for my needs. Anyway, moving on.

Automation

Now that we have established how to balance over a number of remote hosts, we would probably like to automate the whole thing. For my purposes, I added something similar to what follows to my shell profile. I have removed my comments to improve readability, but I will expand on the behaviour following the code.

We start by choosing a port above 1024. Doing this means we do not need to run our scripts with super-user privileges. The main benefit of this is simplicity. That being said, I can think of no good reason for choosing a privileged port, except that using port 22 would let us avoid explicitly stating which port to use when using ssh (more on this point later). Furthermore, using a privileged port would likely require additional security considerations that I don’t care to think about right now.

LOCAL=2222
REMOTE=22

Next we build a list of all the destination servers we will “load balance” over. Balance will just round-robin over them: if one is down, it will move on to the next. This makes writing scripts that depend on a host name simple: there is no need for ugly fail-over code. Note that I have listed the server names in decreasing order. I do this because it seems that most people who work remotely tend towards using the lower-numbered hosts first, so in some sense this is just a naive heuristic for finding unloaded systems, without foreknowledge of the systems’ load.

SERVERS=
for i in {1..3}; do
    SERVERS="linux${i}.example.com:${REMOTE} ${SERVERS}"
done

The final step is to start Balance. But before we try to start it, we should check whether it is already running. If we do not, Balance will attempt to bind to a port that is already bound, and die a very ungraceful and uninformative death:

bind(): Address already in use

We also need to check that the running instance of Balance is ours. If it is not, then we will need to spawn our own. The procedure below is not fool-proof, mind you, but it should work for our limited purposes. The solution can always be made more robust as the need arises.

# Find a balance process already running as this user (ps aux lists the
# owner in column 1 and the PID in column 2).
PID=`ps aux | grep "^${USER}" | grep "[b]alance" | awk '{ print $2 }'`
if [ "${PID}x" = "x" ]; then
    balance ${LOCAL} ${SERVERS}
fi

So that is it for getting Balance up and running. Once we have spawned Balance, we can check on it fairly simply using ps:

$ ps ax | grep [b]alance
 80989 s001  S      0:00.04 balance 2222 linux3.example.com:22 linux2.example.com:22 linux1.example.com:22

The [b]alance notation stops grep from matching itself. This trick works because the command line for grep no longer contains the literal pattern being matched.

Usage

Now we would like to actually make use of our new load balancer. We could use the new port and localhost name directly with ssh, but this can be a tad cumbersome, and easy to forget to do:

$ ssh localhost -p 2222

It also suffers from a far worse problem. This command line will work the first time, but will likely fail on subsequent invocations. The reason for this should be clear from the error message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
33:ab:39:23:6f:4a:d9:86:bb:de:fe:2d:1f:6a:35:4d.
Please contact your system administrator.
Add correct host key in /home/user/.ssh/known_hosts to get rid of this message.
Offending key in /home/user/.ssh/known_hosts:42
RSA host key for [localhost]:2222 has changed and you have requested strict checking.
Host key verification failed.

What has happened is that we have moved on to a different host in the list of candidate hosts we defined above. So the warning is effectively a red herring: in the general case, it is not that the host’s key has changed, but that we are looking at a new host while considering it the same host as the previous one.

We can fix this by telling ssh to disable strict host checking for the localhost:

$ ssh localhost -p 2222 -o NoHostAuthenticationForLocalhost=yes

This command will always work, but it is a rather unruly beast to type each time we want to log in remotely. We could of course just write a shell alias, but this limits, in some sense, the flexibility of our configuration. Thus, we turn to the ssh_config file for further inspiration.

SSH Configuration

I typically have a number of different user accounts on various different systems. I will not get into the virtues of working this way, as I am sure that would take up as much space as this article already has, and it is not entirely clear to me that many people would in fact agree that it is a good idea. We shall just settle for the fact that I do it, and I need my environment to allow me to work on many accounts transparently.

The ssh_config file allows a user to define a set of hosts, and the options to go with them. We could, for instance, have two users user1 and user2 configured to use the same set of hosts:

Host user1
    HostName localhost
    User user1
    Port 2222
    NoHostAuthenticationForLocalhost yes

Host user2
    HostName localhost
    User user2
    Port 2222
    NoHostAuthenticationForLocalhost yes

Note that we actually use the user names in place of the host names. This makes connecting to the load-balanced hosts very simple. For instance, to log in as user1 we would type the following:

$ ssh user1

This is a far cry from the command line we required just a few minutes before. We can do the same for user2, and we could also add more hosts and more options, but I will save all that for another time.

Making the Leap

None of my office machines are public-facing, so they require a second hop once I’ve logged in to the public machines. This can get a little tiresome to type if you end up working remotely a lot, like I do.

To save myself a few keystrokes, I add the following entries to my ~/.ssh/config file:

Host user3
    HostName localhost
    User user3
    Port 2222
    NoHostAuthenticationForLocalhost yes

Host office
    HostName private1.example.com
    User private1
    Port 22
    # I actually really just want to hop from the public facing linux
    # machines to my office machine
    ProxyCommand ssh -q user3 nc -p0 %h %p

Here nc is a tool by the name of netcat, a very handy network listener/sender/Swiss Army knife. Now, thanks to the simple configuration above, all I need to type is:

$ ssh office

and I’ll be connected directly to the workstation in my office.

Security

There is one last issue that I have been silently ignoring. The astute reader may have noted that we can no longer detect genuine person-in-the-middle attacks, because we have explicitly told ssh not to authenticate host identities. I purposely ignored this issue for various reasons; arguably the most important one, to me, is that accounting for it would greatly complicate the whole article. Time and cleverness permitting, I will try to address this shortcoming in a future article.

Code

As per usual, the code for all my work is freely available online. If you want to look at my implementation of these ideas in particular, point your browser at the following two pages:

Using Google as a Command-line Spell-checker

At the best of times, I can maintain a mediocre level of control over the English language. Unfortunately, this has not translated into an ability to spell.

For most applications this is fine: there’s a reasonable spellchecker built into just about everything. It’s a problem for me when I’m at the terminal all day, though. It’s true, there are plenty of command-line checkers, but it’s rare that any can handle the level of misspelling I feed them. To date, the system most capable of parsing my spelling is Google. But switching to a browser and feeding in one word at a time is a tad cumbersome, to say the least.

This was the original impetus for the code. It’s a glorified cURL call, but it does the trick. It also became quite clear that, for a little extra work, we could detect things like the Urban Dictionary (Dirionary) results as part of the page.

For simplicity, we assume the following phrase will remain stable:

Did you mean to search for:[ ]*([^ ]*).*

This keeps our code much simpler and easier to follow. It also seems that, in the long run, unless many people adopt this method of word-recommendation search, Google’s changes will only be for aesthetic reasons; and considering their overall minimalistic approach to data display, we should be able to update this quite easily in the future, even if it is a major word-order change.

The above line shows up when Google is merely guessing (a sort of “recall that that word is actually spelt like this”), while the following is what is likely shown when the “optimal” match is found, according to its rule-set:

Showing results for[ ]*([^ ]*)

Lastly, for now, if we don’t see anything from Google proper, then we may be looking at a piece of slang, so we will use the result from our trusty old friend, Urban Dictionary:

Urban Dictionary:[ ]*([^ ]*)

The link to the full code can be found below, but the critical code is here:

# Assumes the full script (linked below) has already opened INPUT on the
# fetched search-result page and set $cache_filename to a local cache file.
my $suggestion = "";
my @patterns = (
   "Did you mean to search for:[ ]*([^ ]*).*",
   "Showing results for[ ]*([^ ]*)",
   "Urban Dictionary:[ ]*([^ ]*)");
open (OUTPUT, "| tee $cache_filename");
while (<INPUT>) {
  chomp;
  s/<[^>]*>/ /g;   # strip HTML tags so the patterns match plain text

  # Check each of the patterns we defined above:
  for my $pattern (@patterns) {
    if (/$pattern/) {
      $suggestion = "$1";
      last;
    }
  }
}
print OUTPUT "$suggestion";
close (OUTPUT);

Notice we strip all the HTML from the document, so it is easier to parse. It is odd that web-scraping can come to this, where it makes more sense to strip away everything that is structured and work with a simpler structure. Is this what people call irony?
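
For quick interactive use, the same idea collapses into a pipeline that mirrors the Perl above (the query word is deliberately mangled, and the endpoint and surrounding markup are whatever Google happens to be serving at the time):

$ curl -s "http://www.google.com/search?q=definately" \
    | sed 's/<[^>]*>/ /g' \
    | grep -o "Did you mean to search for:[ ]*[^ ]*"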

The full code can be seen here.

A Note About Me

I’ve been using computers for as long as I can remember.

I broke and fixed my first computer at the age of 7, and I’ve been hooked ever since. It has been a while since I concentrated strictly on the internals of machines. I became fascinated with software construction at age 10, after joining a school computer club. We learnt how to use our trigonometry lessons to make something happen on the screen. It was never fast nor pretty, but it was fun.

I have long stopped keeping up with the latest hardware releases. Instead, I focus on the architecture I am currently targeting. Despite not attending to the latest trends in consumer hardware, I still keep myself informed about larger trends or shifts in hardware design. I am currently particularly interested in multi- and many-core systems: from how their cache and register memory is arranged, what new concurrency primitives the cores support, all the way to how they shepherd bits between cores and even other physical CPUs.

I’ve been developing software professionally for almost 17 years now. I’ve written critical systems-level code; database applications; elaborate, native and aesthetically pleasing graphical user interfaces; highly parallel and distributed systems; as well as basic utilities to help myself and others close to me perform everyday tasks more pleasantly. Many of these projects have been developed using a variety of languages and a diverse set of tools. The remaining ones were written in C.

I prefer to use stable languages and mature tools for “production” code. However, experimenting with new languages and novel tools for small projects is something I continue to do often–for better or worse. Ultimately though, I like writing: I like writing code, I like writing about code, I like writing documentation for code, and I adore writing tutorials on how to use code. The latter two tend to suffer the most when I am busy, but only because they are usually the parts that require the most care to get right. As far as I can tell, there are no automated tests for good explanations.