Planet Iron Blogger

October 27, 2010

Inside 245s

Blog name changed…

...because I don’t live in a room numbered 245s anymore. Yep. :-)

/img/cow.jpg

This is a cow. They munch grass next to the River Cam.

Pop quiz. What do matrix-chain multiplication, longest common subsequence, construction of optimal binary search trees, bitonic euclidean traveling-salesman, edit distance and the Viterbi algorithm have in common?

by Edward Z. Yang at October 27, 2010 01:00 PM

October 26, 2010

scripts.mit.edu

I'm seeing: "It is not safe to rely on the system's timezone settings."

If you are getting the PHP warning:

Warning: date() [function.date]: It is not safe to rely on the system’s timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected ‘America/New_York’ for ‘EDT/-4.0/DST’ instead in /afs/athena.mit.edu/… on line 0

You can resolve it by editing or creating a php.ini file in the same directory as the PHP script emitting this error and adding the line:

date.timezone = America/New_York

by ezyang at October 26, 2010 11:21 PM

How do I debug my scripts?

Unfortunately, scripts.mit.edu does not support per-user error logs (yet!), so debugging a recalcitrant CGI script may be a little difficult. While the Scripts team hopes to add error logging in the near future, here are some immediate ways you can convince your scripts to give you more information:

  • You can view error logs in real time using the ‘logview’ command on a Scripts server. Note that you need to be SSH’ed into the server that is serving your website, which is not necessarily the one you reach at scripts.mit.edu. You can determine which server you are being load-balanced to by checking the bottom of http://scripts.mit.edu and looking for “You are currently connected to XXX.mit.edu”. This will not display all errors: in particular, if an error message spans multiple lines, you will only see the first line, and error messages with no data associated with your locker will also be hidden.
  • If you have a PHP script that is failing with no error output, you likely have error reporting turned off. Open or create a php.ini file in the same directory as the script, and in it ensure that “display_errors = on” and “error_reporting = E_ALL ^ E_NOTICE”. This will cause errors to be displayed on the web page.
  • If you have a script that you know is writing information to stderr, you can often run it directly from the command line via SSH to scripts and see if it outputs anything interesting. Nota bene: the command-line configuration may differ subtly from the one used when the script is invoked from the web. For example, php on the command line will not load a php.ini file in the script’s directory; you’ll need to pass it manually with -c.

by ezyang at October 26, 2010 10:51 PM

October 25, 2010

Inside 245s

OCaml for Haskellers

I’ve started formally learning OCaml (I’ve been reading ML since Okasaki, but I’ve never written any of it), and here are some notes about differences from Haskell from Jason Hickey's Introduction to Objective Caml. The two most notable differences are that OCaml is impure and strict.


Features. Here are some features OCaml has that Haskell does not:

  • OCaml has named parameters (~x:i binds to i the value of named parameter x, ~x is a shorthand for ~x:x).
  • OCaml has optional parameters (?(x:i = default) binds i to an optional named parameter x with default default).
  • OCaml has open union types ([> `Integer of int | `Real of float] where the type holds the implementation; you can assign it to a type with type 'a number = [> `Integer of int | `Real of float] as 'a). Anonymous closed unions are also allowed ([< `Integer of int | `Real of float]). Note that the tags are written with a backtick, not an apostrophe.
  • OCaml has mutable records (preface record field in definition with mutable, and then use the <- operator to assign values).
  • OCaml has a module system (only briefly mentioned today).
  • OCaml has native objects (not covered in this post).

Syntax. Omission means the relevant language feature works the same way (for example, let f x y = x + y is the same)

Organization:

{- Haskell -}
(* OCaml *)

Types:

()   Int Float Char String Bool (capitalized)
unit int float char string bool (lower case)

Operators:

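==  /=  .&.  .|. xor  shiftL shiftR complement
=   <>  land lor lxor [la]sl [la]sr lnot

(OCaml's == and != test physical identity rather than structural equality, though they agree with = and <> on ints. Arithmetic versus logical shift in Haskell depends on the type of the Bits.)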

Float operators in OCaml: affix period (i.e. +.)

Float casting:

fromIntegral floor
float_of_int int_of_float (note, int_of_float truncates toward zero, like Haskell's truncate)

String operators:

++ !!i
^  .[i] (note, string != char list)

Composite types:

(Int, Int)  [Bool]
int * int   bool list

Lists:

x :  [1, 2, 3]
x :: [1; 2; 3]

Data types:

data Tree a = Node a (Tree a) (Tree a) | Leaf
type 'a tree = Node of 'a * 'a tree * 'a tree | Leaf;;

(note that in OCaml you'd need Node (v,l,r) to match, despite there not actually being a tuple)

Records:

data Rec = Rec { x :: Int, y :: Int }
type record = { x : int; y : int };; (rec is a keyword in OCaml, so it cannot name a type)
Field access:
    x r
    r.x
Functional update:
    r { x = 2 }
    { r with x = 2 }

(OCaml records also have destructive update.)

Maybe:

data Maybe a = Just a | Nothing
type 'a option = None | Some of 'a;;

Array:

         readArray a i  writeArray a i v
[|1; 3|] a.(i)          a.(i) <- v

References:

newIORef writeIORef readIORef
ref      :=         !

Top level definition:

x = 1
let x = 1;;

Lambda:

\x y -> f y x
fun x y -> f y x

Recursion:

let     f x = if x == 0 then 1 else x * f (x-1)
let rec f x = if x = 0 then 1 else x * f (x-1)

Mutual recursion (note that Haskell let is always recursive):

let f x = g x
    g x = f x
let rec f x = g x
and     g x = f x

Function pattern matching:

let f 0 = 1
    f 1 = 2
let f = function
    | 0 -> 1
    | 1 -> 2

(note: you can put pattern matches in the arguments for OCaml, but lack of an equational function definition style makes this not useful)

Case:

case f x of
    0 -> 1
    y | y > 5 -> 2
    y | y == 1 || y == 2 -> y
    _ -> -1
match f x with
    | 0 -> 1
    | y when y > 5 -> 2
    | (1 | 2) as y -> y
    | _ -> -1

Exceptions:

Definition
    data MyException = MyException String
    exception MyException of string;;
Throw exception
    throw (MyException "error")
    raise (MyException "error")
Catch exception
    catch expr $ \e -> case e of
        x -> result
    try expr with
        | x -> result
Assertion
    assert (f == 1) expr
    assert (f = 1); expr

Build:

ghc --make file.hs
ocamlopt -o file file.ml

Run:

runghc file.hs
ocaml file.ml

Type signatures. Haskell supports specifying a type signature for an expression using the double colon. OCaml has two ways of specifying types. They can be written inline:

let intEq (x : int) (y : int) : bool = ...

or they can be placed in an interface file (extension mli):

val intEq : int -> int -> bool

The latter method is preferred, and is analogous to an hs-boot file as supported by GHC.


Eta expansion. Polymorphic types in the form of '_a can be thought to behave like Haskell’s monomorphism restriction: they can only be instantiated to one concrete type. However, in Haskell the monomorphism restriction was intended to avoid extra recomputation for values that a user didn’t expect; in OCaml the value restriction is required to preserve the soundness of the type system in the face of side effects, and applies to functions too (just look for the tell-tale '_a in a signature). More fundamentally, 'a indicates a generalized type, while '_a indicates a concrete type which, at this point, is unknown—in Haskell, all type variables are implicitly universally quantified, so the former is always the case (except when the monomorphism restriction kicks in, and even then no type variables are ever shown to you. But OCaml requires monomorphic type variables to not escape from compilation units, so there is a bit of similarity. Did this make no sense? Don’t panic.)

In Haskell, we’d make our monomorphic value polymorphic again by specifying an explicit type signature. In OCaml, we generalize the type by eta expanding. The canonical example is the id function, which when applied to itself (id id) results in a function of type '_a -> '_a (that is, restricted.) We can recover 'a -> 'a by writing fun x -> id id x.

There is one more subtlety to deal with OCaml’s impurity and strictness: eta expansion acts like a thunk, so if the expression you eta expand has side effects, they will be delayed. You can of course write fun () -> expr to simulate a classic thunk.
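The delaying effect is not specific to OCaml. As a rough analogy only (Python has no value restriction, and make_counter and log are made-up names for illustration), wrapping an effectful expression in a lambda postpones its side effects until the wrapper is called, much as eta expansion or fun () -> expr does:

```python
log = []

def make_counter():
    """Hypothetical effectful expression: records that it ran, then returns a function."""
    log.append("make_counter ran")
    return lambda x: x + 1

# Evaluated eagerly: the side effect happens right here, exactly once.
eager = make_counter()

# "Eta expanded": the side effect is delayed until the wrapper is called,
# and it is repeated on every call.
delayed = lambda x: make_counter()(x)

# Classic thunk, the analogue of OCaml's fun () -> expr.
thunk = lambda: make_counter()
```

Calling eager never touches log again, while every call to delayed appends another entry, which is the behavior change to watch for when eta expanding an expression with side effects.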


Tail recursion. In Haskell, you do not have to worry about tail recursion when the computation is lazy; instead you work on putting the computation in a data structure so that the user doesn't force more of it than they need (guarded recursion), and “stack frames” are happily discarded as you pattern match deeper into the structure. However, if you are implementing something like foldl', which is strict, you’d want to pay attention to this (and not build up a really big thunk.)

Well, OCaml is strict by default, so you always should pay attention to making sure you have tail calls. One interesting place this comes up is in the implementation of map, the naive version of which cannot be tail-call optimized. In Haskell, this is not a problem because our map is lazy and the recursion is hidden away in our cons constructor; in OCaml, there is a trade off between copying the entire list to get TCO, or not copying and potentially exhausting stack space when you get big lists. (Note that a strict map function in Haskell would have the same problem; this is a difference between laziness and strictness, and not Haskell and OCaml.)
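To make the trade-off concrete, here is a small sketch in Python rather than OCaml. It is an analogy under stated assumptions: Python has no tail-call optimization, so the accumulator version is written as the loop a tail call would become, and cons cells are modeled as (head, tail) tuples. The function names are made up for illustration.

```python
import sys

def naive_map(f, xs):
    """Non-tail-recursive map: the cons happens after the recursive call returns."""
    if not xs:
        return []
    return [f(xs[0])] + naive_map(f, xs[1:])

def accumulating_map(f, xs):
    """Accumulator-style map; constant stack depth, at the cost of a second pass."""
    acc = None                    # empty cons list
    for x in xs:                  # the loop a tail-recursive helper would become
        acc = (f(x), acc)         # cons onto the accumulator (reversed order)
    out = []
    while acc is not None:        # second pass: walk the reversed list
        head, acc = acc
        out.append(head)
    out.reverse()                 # restore the original order
    return out

# A list deeper than the interpreter's recursion limit.
big = list(range(sys.getrecursionlimit() * 2))
try:
    naive_map(lambda x: x + 1, big)
    overflowed = False
except RecursionError:
    overflowed = True
```

The naive version runs out of stack on the big list, while the accumulating version handles it but walks the data twice.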


File organization. A single file OCaml script contains a list of statements which are executed in order. (There is no main function).

The moral equivalent of a Haskell module is called a compilation unit in OCaml. By convention, the file foo.ml (lower case!) defines the module Foo, and Foo.foo refers to the foo function in Foo.

It is considered good practice to write interface files, mli, as described above; these are like export lists. The interface file will also contain data definitions (with the constructors omitted to implement hiding).

By default all modules are automatically “imported” like import qualified Foo (no import list necessary). Traditional import Foo style imports (so that you can use names unqualified) can be done with open Foo in OCaml.


Module system. OCaml does not have type classes but it does have modules and you can achieve fairly similar effects with them. (Another classic way of getting type class style effects is to use objects, but I’m not covering them today.) I was going to talk about this today but this post is getting long so maybe I’ll save it for another day.


Open question. I’m not sure how much of this is OCaml specific, and how much generalizes to all ML languages.

by Edward Z. Yang at October 25, 2010 01:00 PM

cslink

Izakaya in Boston

I recently went to Basho, an izakaya near Fenway. The interior was very large and trendy looking, complete with a crazy light display behind the sushi bar and fake bamboo. The overall effect was somewhat disconcerting, as I generally expect izakayas to be smaller, more intimate spaces. As is typical for an izakaya, it had a long, confusing menu full of small plates and appetizer-size portions. However, this place tried really hard to make its menu less confusing by including a lot of explanatory text and a note assuring the reader that he could summon one of the wait staff to explain the options and help plan a meal. I imagine this was a feature for the largely non-Japanese clientele, but I personally found it rather disorienting, probably because it was such a reversal of expectations.

Confusion about the ambiance aside, the food was very good. We ordered a bunch of skewers of various meats and vegetables, and they were all delicious, though they could have benefited from a heat lamp to keep them warm before serving. We also got some special rolls and the foie gras teriyaki, which were quite tasty too. There was also a good selection of sake at what looked to be pretty reasonable prices. Overall: confusing atmosphere, tasty food.

by Cordelia at October 25, 2010 08:22 AM

Fireflies Sing

Mon of the Week: Enclosed Plover

At the other end of the spectrum in terms of realism from the Perching Hawk is this mon below, from the same collection of provincial samurai mon.(KJ:7) The highly stylized bird in the middle is a plover, and this depiction of plovers is still common through the present day. Plovers were a common motif in Japanese poetry, with connotations of longevity based on their cry “chiyo” (thousands of generations).(Komuso) More mysterious is the enclosure around the plover. It bears some similarity to a piece of horse equipment (aori/泥障) that would hang from the saddle and sit between the rider’s legs and the horse’s body; these could be decorated with mon the way the plover image appears here. This explanation isn’t entirely convincing, so there could be other possibilities.

Enclosed Plover

by Kihō at October 25, 2010 05:13 AM

Hyperextended Metaphor

Lean Startups and the Theory of the Firm

If you spend much time in the entrepreneurial corners of the blogosphere, you’re certain to have heard about lean startups. If you haven’t, check out Eric Ries and Steve Blank. The core of the lean startup is two related ideas: continuous validation and building the smallest company that can validate an idea. The result is dramatically reduced costs, reduced time-to-failure, and reduced risk. A lot has been written and can be written about validation. But what I’m concerned about now is how small the smallest possible company is, and specifically why it is usually more than one person.

In business school you are likely to encounter the Theory of the Firm. If you haven’t been to business school, but you grew up in the modern west, it may seem strange to think that you need a theory to explain the existence of big companies. But actually, big companies are a recent innovation, something that came about in the later part of the Industrial Revolution, the early 20th century. Adam Smith, when conceiving the famous Invisible Hand of capitalism, had no concept of the international megacorporation. His pin makers worked in small groups, with the free market guiding their interactions.

If the markets are efficient, there ought not be any need for corporations. People can freely associate to pursue their various goals, exchanging money for goods and services, each pursuing their own ends. In fact, a large corporation represents an imposition on the free market, where a group of people (employees) decide to transact with each other and the owners of the corporation under special rules. The question at hand is why they do that, and why some specializations are best kept inside the firm while others are commonly contracted out.

The large corporation may be a phenomenon of the 20th century, brought about by efficiencies of scale, inefficiencies of communication, concentrations of management and financial expertise. Or there may be fundamental value to the corporation. The theory of the firm enables us to reason about why companies exist, and whether they will persist.

The most widely understood theory of the firm is that of Ronald Coase, based on transaction costs. In Coase’s model, having a service provider within the firm is economically advantageous when the cost of transacting for a service or asset with an outside party exceeds the inefficiency of bringing the service or asset inside the firm. To answer whether a given function belongs inside the firm, from office cleaning services to recruiting to software development, examine the costs associated with contracting for the function, compared to the efficiency gained from getting the service on the free market. This theory is very attractive for the modern lean startup. In the 21st century more and more functions, from graphic design to office space, are being standardized, commoditized, and delivered on liquid markets like 99designs. As communications technology improves, transaction costs go down, and firms should get smaller. These are exciting times.

But there are other approaches to understanding the firm. The paper which precipitated this blog post, Eric Van den Steen’s Interpersonal Authority in a Theory of the Firm (via Marginal Revolution), finds substantial value in the firm to create goal-alignment. In his model, consider two parties with two business opportunities (for example, building a product and selling it) deciding how to pursue them. If their two opportunities are substantially interdependent, but their decisions are made independently, then each is in danger of being spoiled by the other. If instead one party takes a controlling role, offering the other party appropriate incentives, then the likelihood of being spoiled drops out and it is more likely that both business opportunities will be successful. And further, the optimal incentives in this case look more like salary than like partnership, because the goal is to get the employee to do what they are told, rather than what they think will be most successful.

The take-away for the lean startup is that you must include in your firm the people, skills, and assets from whom you require alignment to a common goal. You can outsource anything where the practitioner can pursue their own profit maximization and not impact your focus. The meta take-away is that the theory of the firm is still open to innovative interpretations. For anyone interested in studying entrepreneurship, it’s important to understand the economics underlying the organizations that are being created.

by tibbetts at October 25, 2010 03:28 AM

Made of Bugs

Configuring dnsmasq with VMware Workstation

I love VMware Workstation. I keep VMs around for basically every version of every major Linux distribution, and use them heavily for all kinds of kernel testing and development.

This post is a quick writeup of my networking setup with VMware Workstation, using dnsmasq to assign my VMs addresses and provide a DNS server to resolve VM addresses.

The objective

I want to be able to resolve my VMs’ hostnames so that I can ssh to them, or run other network services and access them from the host. I could just assign static addresses and put them in /etc/hosts, but that’s totally lame, and liable to be a source of error and frustration, because I have dozens of VMs, and add and remove them frequently.

We’re going to set things up so that when VMs get addresses from DHCP, their hostnames automatically become resolvable, using the .vmware domain. To do this, we’re going to set up a piece of software called dnsmasq, which is a flexible DNS and DHCP server, designed for basically exactly this purpose.

The setup

Because I use my VMs for local testing, I just keep most of them on a local NAT on my machine. I configure that virtual network inside VMware as follows (run vmware-netconfig, or follow the appropriate menus):

Note how I disable “Use local DHCP service to distribute IP addresses to VMs”: we’re going to set up dnsmasq to provide DHCP, so we don’t want it fighting with VMware’s.

Notice that the subnet I’m using here is 172.16.37.* — if you choose a different one, you’ll need to adjust accordingly later.

Configuring dnsmasq

Then, I install dnsmasq, and configure /etc/dnsmasq.conf as follows:

listen-address=172.16.37.1
listen-address=127.0.0.1
no-dhcp-interface=lo

server=192.168.1.1
local=/vmware/

no-hosts
no-resolv

domain=vmware
dhcp-fqdn

dhcp-range=172.16.37.3,172.16.37.200,12h
dhcp-authoritative
dhcp-option=option:router,172.16.37.2

Here’s what each of those lines means, in order:

listen-address=172.16.37.1
listen-address=127.0.0.1
no-dhcp-interface=lo

We don’t want dnsmasq serving DHCP or DNS to the outside world or other virtual networks, so we tell it to listen only on the local interface (so that we can talk to it from the host) and on the virtual network we set up in the previous step. We don’t want it serving DHCP to localhost, though, so we tell it not to.

server=192.168.1.1
local=/vmware/

Here we tell dnsmasq how to forward DNS requests to the outside world. We’re going to be using dnsmasq as our primary nameserver, and having it forward requests for things it doesn’t understand to a real DNS server. In my case, that’s my LAN’s router, at 192.168.1.1. The local line tells dnsmasq that the .vmware domain is local, and it should never forward requests to resolve things in that domain.

If I needed something more complicated, it might be possible to use the resolv-file option or similar, but I don’t, personally.

no-hosts
no-resolv

These options tell dnsmasq not to look at resolv.conf or /etc/hosts when resolving names — we want it only to resolve VMs itself, and to forward everything else.

domain=vmware
dhcp-fqdn

This tells dnsmasq to assign the .vmware domain to hosts it hands out DHCP to, so that we can resolve VMs in the .vmware domain.

dhcp-range=172.16.37.3,172.16.37.200,12h
dhcp-authoritative

And finally, we configure the DHCP server. We give it a range of addresses to assign on the subnet we created earlier. I stop at .200, so that I can leave the last few open for static IPs if I need them for some reason, and we start at .3 because .1 is the host and .2 is the address of VMware’s router. dhcp-authoritative enables some optimizations when dnsmasq knows it is the only DHCP server around.

dhcp-option=option:router,172.16.37.2

Finally, we need dhcp-option to tell DHCP clients to use the VMware-provided router at .2 as their gateway, instead of using the host, at .1. We could configure the host to be a NAT server using Linux’s NAT, but that’s outside the scope of this document.
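If you pick a different subnet, it is easy to get the address plan subtly wrong. The following is a minimal sanity check, not part of the setup itself, using Python's standard ipaddress module. The addresses are copied from the configuration above; the /24 prefix is an assumption based on the 172.16.37.* subnet, and check_layout is a made-up helper name.

```python
import ipaddress

# Values from the dnsmasq.conf above. The /24 prefix is assumed from 172.16.37.*.
subnet = ipaddress.ip_network("172.16.37.0/24")
host = ipaddress.ip_address("172.16.37.1")         # listen-address on the host
gateway = ipaddress.ip_address("172.16.37.2")      # VMware's router
range_start = ipaddress.ip_address("172.16.37.3")  # dhcp-range start
range_end = ipaddress.ip_address("172.16.37.200")  # dhcp-range end

def check_layout():
    """Return a list of problems with the address plan (empty if it looks sane)."""
    problems = []
    named = [("host", host), ("gateway", gateway),
             ("range start", range_start), ("range end", range_end)]
    for name, addr in named:
        if addr not in subnet:
            problems.append(f"{name} {addr} is outside {subnet}")
    if range_start > range_end:
        problems.append("DHCP range is empty")
    for name, addr in [("host", host), ("gateway", gateway)]:
        if range_start <= addr <= range_end:
            problems.append(f"{name} {addr} collides with the DHCP range")
    return problems

# Number of leases dnsmasq can hand out, and addresses left over for static use.
pool_size = int(range_end) - int(range_start) + 1
static_free = int(subnet.broadcast_address) - int(range_end) - 1

print(check_layout(), pool_size, static_free)
```

With the values above the check reports no problems, a pool of 198 addresses, and .201 through .254 left free for static assignment.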

Configuring the host

Now, we need to configure the host to use dnsmasq as our DNS server. This is a simple matter of telling the host to use 127.0.0.1 as our DNS server, and to add .vmware to our search path. If we’re editing resolv.conf directly, it would look like:

search vmware
nameserver 127.0.0.1

Configuring guests

We need to configure our guests to send a hostname along with their DHCP requests, so that dnsmasq can add them to its address table. How to do this varies by OS, but most modern OSes do it automatically. If they don’t, here are a few hints:

For RHEL-based distros, edit /etc/sysconfig/network-scripts/ifcfg-INTERFACE, and add a line like

 DHCP_HOSTNAME=centos-5-amd64

For most other Linux distributions, you can often edit dhclient.conf (usually in /etc/ or /etc/dhclient/) to include:

 send host-name "centos-5-amd64";

Or, with a recent dhclient,

 send host-name "<hostname>";

will make it look up the machine’s actual hostname.

Conclusions

That’s all there is to it. This is a pretty simple setup, but hopefully someone else will find this useful. If you need dnsmasq to do something more subtle, the documentation is mostly quite good.

by nelhage at October 25, 2010 03:15 AM

October 24, 2010

Free Dissociation

unofficial python audiobox.fm uploader

tl;dr audiobox-uploader-0.01.tar.gz git repository

Some time ago I was frustrated by my inability to access the music stored on my personal fileserver while at work -- something about Apple having locked iTunes sharing to the local subnet, the lack of decent DAAP clients for Mac, and so on and so forth. Moving all the many gigabytes of music I have to my work laptop over work's network connection is slow and anti-social, and at any rate then I have two places in which I need to manage my music and propagate new albums I buy. (Yes, if this were a Twitter post it would get the #firstworldproblems hashtag.) "Wouldn't it be great, in this much-ballyhooed age of Cloud Computing," says I to myself, "if my music could live in the cloud."

Some friends of mine have a startup, MixApp, which lets me (legally!) publish the music on my fileserver and listen to it and chat about it with friends online, which was sort of like what I wanted. It's actually a really neat service, and I like it and use it a decent bit, but I don't always want to listen to music with other people, and the interface is tuned to the social music listening model and not so much to being like iTunes. Additionally, at the time they were having server problems (since resolved!) so that avenue wasn't available to me.

I started looking around online, and the first service I ran across that seemed to fit the bill was AudioBox.fm. For a mere $10 a month, they'll host up to 151GB of music, and they've got a nice Flash-based, iTunes-like player, last.fm scrobble support, decent library management capability, and most of the other features I expect out of modern music player software. They've got support for a bunch of formats besides MP3 (FLAC, OGG, and M4A being the ones I care most about), though all the music gets transcoded to MP3 for streaming, so I've been mostly converting to MP3 locally before I upload, since there's no sense taking up the storage space for FLAC if I don't get any benefit from it. Since it's all my music, I can also get it back any time I want, so it's a convenient backup of my music collection.

The only problem was getting all my music into the service. There's currently a fairly nice Flash uploader, but it only takes 999 tracks at once and only MP3s, and there's now also a Java WebStart-based uploader (which there wasn't when I started), but most of my music lives on my Linux fileserver, not any of the client computers I use, so neither of those was going to do it. There's also a nice RESTful API, and so I set out to write a Linux upload script.

Along the way, I discovered that none of Python's built-in HTTP libraries deal with submitting multipart forms. I ended up stealing the multipart processing logic from Gabriel Falcao's bolacha library, of which portions were in turn borrowed from Django's test client, but I was disappointed that the support wasn't built into something more comprehensive. Claudio Poli at AudioBox pointed me towards bolacha, and has been excellent to work with on this script -- I'm pleased with AudioBox's attentiveness to developers. (Careful observers will note that AudioBox offers both an OAuth authentication API for web services and HTTP Basic authentication for desktop applications, and Claudio promises that they aren't going to pull a Twitter on desktop and open-source application developers.)
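For readers who have not had to do this by hand, here is a minimal sketch of what encoding a multipart/form-data body involves. It is not the code from bolacha, Django, or the uploader, and it is written for Python 3. The function name and the field names ("title", "media") are hypothetical and are not taken from the AudioBox API.

```python
import uuid

def encode_multipart(fields, files):
    """Encode a form as multipart/form-data.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, bytes, content_type)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields.items():
        parts += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                       # blank line separates headers from content
            value.encode("utf-8"),
        ]
    for name, (filename, data, content_type) in files.items():
        parts += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,                      # raw file bytes, sent unmodified
        ]
    parts += [delimiter + b"--", b""]  # closing delimiter, then a final CRLF
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(parts)

# Hypothetical usage: one text field and one (fake) MP3 payload.
content_type, body = encode_multipart(
    {"title": "Example Song"},
    {"media": ("song.mp3", b"\x00\x01fake-mp3-bytes", "audio/mpeg")},
)
```

The returned header value goes into the request's Content-Type header and the bytes become the request body. A real uploader would also want to stream large files instead of building the whole body in memory.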

None of Python's built-in or commonly-used HTTP libraries support bandwidth throttling, either, which turns out to be important when you're uploading tens of gigabytes of music. I thought about building native support into the upload script, but I wanted to get a release out, and the trickle utility turns out to work marvellously on Python to limit its upload bandwidth use, so I punted on that. Seriously, if you don't know about trickle already, you should make a note of it -- I can't remember the number of times I've wanted to throttle a program that didn't provide the option, so its existence falls into the "I wish I'd known about this years ago" category.
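For the curious, native throttling could look something like the sketch below. This is not what the uploader does (it relies on trickle), and throttled_chunks is a hypothetical helper: it reads a file in chunks and sleeps after each one so that the average rate stays under a cap. The clock and sleep parameters are injectable only so the behavior can be tested without waiting.

```python
import io
import time

def throttled_chunks(fileobj, max_bytes_per_sec, chunk_size=8192,
                     clock=time.monotonic, sleep=time.sleep):
    """Yield chunks from fileobj, pausing so the average rate stays under the cap."""
    start = clock()
    sent = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk                     # caller writes this chunk to the socket
        sent += len(chunk)
        # Earliest moment at which `sent` bytes are allowed to have gone out.
        earliest = start + sent / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)

# Tiny demonstration: 4 KiB at a 1 MB/s cap pauses only for a few milliseconds.
total = 0
for chunk in throttled_chunks(io.BytesIO(b"x" * 4096), max_bytes_per_sec=1_000_000):
    total += len(chunk)
```

This paces the producer rather than the socket, so it is cruder than trickle, but it needs no external tool.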

At any rate, the result of my labors is audiobox-uploader-0.01.tar.gz, released here for the first time. Source can be found on Github, and please feel free to contact me with any questions, comments, or patches you might have. :-)

by Kevin Riggle at October 24, 2010 03:13 AM

October 23, 2010

Somatic Hypermutation

“元素” (Elements) Lyrics and Translation

So recently there was an article on boingboing titled “The Elements Song (Tom Lehrer tune), Super Cute Japanese Version,” which featured two 13-year-old girls singing “元素” [genso]*, meaning “element” (Japanese doesn’t have articles or plural forms of nouns), a Japanese rendition of the well-known tune “The Elements” by Tom Lehrer. Because it is awesome, I’ve decided to transcribe the lyrics and translate them here (for most of the song, the translation is fairly obvious). First, I’ve made a table of the elements in the order that they are sung. Unlike Tom Lehrer’s version, there are no extra words like “also”, so these are the lyrics to most of the song. The left column has the chemical symbol, the middle column the Japanese name (as written in the youtube video; there are alternate ways to write some of the names, which I’ll talk about later), and the rightmost column the Hepburn romanization. The table is followed by the final sentence in the song….

Sb アンチモン anchimon
As ヒ素 hiso
Al アルミニウム aruminiumu
Se セレン seren
H 水素 suiso
O 酸素 sanso
N 窒素 chisso
Re レニウム reniumu
Ni ニッケル nikkeru
Nd ネオジム neojimu
Np ネプツニウム neputsuniumu
Ge ゲルマニウム gerumaniumu
Fe 鉄 tetsu
Am アメリシウム amerishiumu
Ru ルテニウム ruteniumu
U ウラン uran
Eu ユウロピウム yuuropiumu
Zr ジルコニウム jirukoniumu
Lu ルテチウム rutechiumu
V バナジウム banajiumu
La ランタン rantan
Os オスミウム osumiumu
At アスタチン asutachin
Ra ラジウム rajiumu
Au 金 kin
Pa プロトアクチニウム purotoakuchiniumu
In インジウム injiumu
Ga ガリウム gariumu
I ヨウ素 youso
Th トリウム toriumu
Tm ツリウム tsuriumu
Tl タリウム tariumu
Y イットリウム ittoriumu
Yb イッテルビウム itterubiumu
Ac アクチニウム akuchiniumu
Rb ルビジウム rubijiumu
B ホウ素 houso
Gd ガドリニウム gadoriniumu
Nb ニオブ niobu
Ir イリジウム irijiumu
Sr ストロンチウム sutoronchiumu
Si ケイ素 keiso
Ag 銀 gin
Sm サマリウム samariumu
Bi ビスマス bisumasu
Br 臭素 shuuso
Li リチウム richiumu
Be ベリリウム beririumu
Ba バリウム bariumu
Ho ホルミウム horumiumu
He ヘリウム heriumu
Hf ハフニウム hafuniumu
Er エルビウム erubiumu
P リン rin
Fr フランシウム furanshiumu
F フッ素 fusso
Tb テルビウム terubiumu
Mn マンガン mangan
Hg 水銀 suigin
Mo モリブデン moribuden
Mg マグネシウム maguneshiumu
Dy ジスプロシウム jisupuroshiumu
Sc スカンジウム sukanjiumu
Ce セリウム seriumu
Cs セシウム seshiumu
Pb 鉛 namari
Pr プラセオジウム puraseojiumu
Pt 白金 hakkin
Pu プルトニウム purutoniumu
Pd パラジウム parajiumu
Pm プロメチウム puromechiumu
K カリウム kariumu
Po ポロニウム poroniumu
Ta タンタル tantaru
Tc テクネチウム tekunechiumu
Ti チタン chitan
Te テルル teruru
Cd カドミウム kadomiumu
Ca カルシウム karushiumu
Cr クロム kuromu
Cm キュリウム kyuriumu
S 硫黄 iou
Cf カリホルニウム karihoruniumu
Fm フェルミウム ferumiumu
Bk バークリウム baakuriumu
Md メンデレビウム menderebiumu
Es アインスタイニウム ainsutainiumu
No ノーベリウム nooberiumu
Ar アルゴン arugon
Kr クリプトン kuriputon
Ne ネオン neon
Rn ラドン radon
Xe キセノン kisenon
Zn 亜鉛 aen
Rh ロジウム rojiumu
Cl 塩素 enso
C 炭素 tanso
Co コバルト kobaruto
Cu 銅 dou
W タングステン tangusuten
Sn スズ suzu
Na ナトリウム natoriumu
Lr ローレンシウム roorenshiumu
Rf ラザホージウム razahoojiumu
Db ドブニウム dobuniumu
Sg シーボーギウム shiiboogiumu
Bh ボーリウム booriumu
Hs ハッシウム hasshiumu
Mt マイトネリウム maitoneriumu
Ds ダームスタチウム daamusutachiumu
Rg レントゲニウム rentogeniumu
Cn コペルニシウム koperunishiumu

The final line of the song is “これが今まで派遣された全てな元素の集まりです” [kore ga ima made haken sareta subete na genso no atsumari desu], which roughly translates to “this is all of the elements collection that have been sent up ’til now.”

Here comes the random spew of notes about the song, transcription process, translation process, etc: the order of the elements is the same as in the original, but there are more of them, which have been tacked on to the end of the song. In fact, the last element in the song is new enough that when I was going through and checking my transcription/romanization using WWWJDIC, I found that the dictionary didn’t have it. And in checking the transcription/romanization, I ended up finding two mistakes. I’m also considering giving this mass of katakana to my Japanese-Learners students for practice. How about it, guys?

Anyway, to the last line: it’s hard to get the number of syllables exactly right for everything, so some of the vowels are stretched out when sung. The problem is, in Japanese, the length of the vowel is a differentiating trait between words. The final line is sung with extra elongated vowels (so that it sounds like “haaken saareta”), but given that we’re probably trying to approximate the last part of the original “The Elements” song, I settled on the word 派遣 [haken] (defined in WWWJDIC as dispatch/send) as the noun to form the compound verb “send” when combined with された [sareta], the perfect/past form of the passive form of する [suru], meaning “do.” Thus, the combination “派遣された” [hakensareta] roughly means “was sent,” and it modifies the noun phrase “全てな元素の集まり” [subete na genso no atsumari], where 全て [subete] means “all”, な [na] is a nominal-connecting particle, 元素 [genso] means “element” (or, as pointed out near the beginning of the post, could be interpreted as “the elements” because of the lack of articles and plural noun forms in Japanese), の [no] is the other nominal-connecting particle, and 集まり [atsumari], derived from the verb 集まる [atsumaru], which means “gather up” or “collect”, means “collection.” Thus, the noun phrase can be translated as “all of the elements collection,” and since it is modified by “派遣された” [hakensareta], I translated that chunk of the sentence as “all of the elements collection that have been sent.”

As for the rest of the sentence, これ [kore] is a demonstrative that means “this” (and here refers to the aforementioned elements, of course), が [ga] is a subject marker that indicates that “これ” [kore] is the subject of the imperfect, distal copula, です [desu], at the end of the sentence. And the “今まで” [ima made] component of the sentence can be broken into 今 [ima], meaning “now,” and まで [made], a particle meaning “until.”

So now, some comments about the names of the elements, because I find them somewhat intriguing: most of the element names are from English, German, or Chinese, as exemplified in アルミニウム [aruminiumu], アンチモン [anchimon] (from “Antimon”), and ヒ素 [hiso], respectively. Okay, so the completely katakana names have origins that are fairly obvious — they’re nipponizations of either English or German, mostly (I’d say all, but some are potentially ambiguous, and superlatives are difficult to support). A handful of elements share the exact same kanji as their Chinese counterparts: Fe/鉄 [tetsu], Au/金 [kin], Ag/銀 [gin], Pb/鉛 [namari], and Cu/銅 [dou]. Another handful of elements are derived from the Chinese: As/ヒ素/砒素 [hiso], I/ヨウ素/沃素 [youso], B/ホウ素/硼素 [houso], Si/ケイ素/珪素/硅素 [keiso], and F/フッ素/弗素 [fusso]. (All of the non-素 [so] 漢字 [kanji] are not considered common kanji, according to WWWJDIC.) Going down the list one at a time, then: 砒 is pronounced pī​ and means “arsenic” in Chinese, which is where the Japanese pronunciation derives from. The Chinese word 沃 is pronounced wò​ and means fertile/rich/irrigate; again, Japanese pronunciation derives from the kanji and is thus written with katakana because it’s a loan word of sorts. Oddly, the Chinese 硼, which does mean “boron,” is pronounced péng​, so this nipponization is beyond me…. Both 珪 and 硅 mean “silicon,” though the first character has a radical that is generally used with precious materials and can refer to a “jade tablet” (according to MDBG), while the second character refers specifically to the chemical element and is the character used in the Chinese periodic table; both characters are pronounced guī​, for which the nipponization makes sense again. Finally, we have 弗, pronounced fú​ and meaning “not” in Chinese (according to MDBG), though the meaning of the 漢字 [kanji] in Japanese is “dollar” (according to WWWJDIC); the pronunciation makes sense, but I’m unsure as to the rationale behind the meaning….

Of course, some of the names are original to Japanese: H/水素 [suiso] meaning “water element,” O/酸素 [sanso] meaning “sour/acidic element,” N/窒素 [chisso] meaning “plug-up/obstruct element,” Br/臭素 [shuuso] meaning “stinky element,” Hg/水銀 [suigin] meaning “liquid silver,” Pt/白金 [hakkin] meaning “white gold,” S/硫黄 [iou] meaning “yellow sulfur” (the first character is the same as the Chinese character for elemental sulfur, while the second character means “yellow”), Zn/亜鉛 [aen] meaning “come-after lead,” Cl/塩素 [enso] meaning “salt element,” and C/炭素 [tanso] meaning “charcoal/coal element.” The hypotheses for nitrogen and zinc that Ben came up with on zephyr follow:

[Nitrogen] blocks oxidation and/or breathing.
Traditional early experiments in such things involved burning metal in a confined volume of air, allowing one to measure that 30% of the air’s mass was added to the metal and 70% was left unreacted. Isolation and further study of that remaining portion shows that it obstructs breathing, etc..

I think it has to do with the refining process — if you have mixed zinc and lead ore, the lead will reduce out first, but if you go to higher temperature (???) then the zinc will come off.

So there you have it. “The Elements” song in Japanese!!

* Japanese in this post is followed by the Hepburn romanization in brackets, as it is through most of the blog.

by Zek at October 23, 2010 05:03 PM

October 22, 2010

Inside 245s

Don’t Repeat Yourself is context dependent

I am a member of a group called the Assassins’ Guild. No, we don’t kill people, and no, we don’t play the game Assassin. Instead, we write and run competitive live action role-playing games: you get some game rules describing the universe, a character sheet with goals, abilities and limitations, and we set you loose for anywhere from four hours to ten days. In this context, I’d like to describe a situation where applying the rule Don’t Repeat Yourself can be harmful.

The principle of Don’t Repeat Yourself comes up in a very interesting way when game writers construct game rules. The game rules are rather complex: we’d like players to be able to do things like perform melee attacks, stab each other in the back, conjure magic, break down doors, and we have to do this all without actually injuring anyone or harming any property, so, in a fashion typical of MIT students, we have “mechanics” for performing these in-game actions (for example, in one set of rules, a melee attack can be declared with “Wound 5”, where 5 is your combat rating, and if another individual has a CR of 5 or greater, they can declare “Resist”; otherwise, they have to role-play falling down unconscious and bleeding. It’s great fun.) Because there are so many rules necessary to construct a reasonable base universe, there is a vanilla, nine-page rules sheet that most gamewriters adapt for their games.

Of course, the rules aren’t always the same. One set of GMs (the people who write and run the game) may decide that a single CR rating is not enough, and that people should have separate attack and defense ratings. Another set of GMs might introduce robotic characters, who cannot die from bleeding out. And so forth.

So when we give rules out to players, we have two possibilities: we can repeat ourselves, and simply give them the full, amended set of rules. Or we can avoid repeating ourselves, and give out the standard rules and a list of errata—the specific changes made in our universe. We tend to repeat ourselves, since it’s easier to do with our game production tools. But an obvious question to ask is, which approach is better?

The answer is, of course, it depends.

  • Veteran players who are well acquainted with the standard set of rules don’t need the entire set of rules given to them every time they play a game; instead, it would be much easier and more efficient for them if they were just given the errata sheet, so they can go, “Oh, hm, that’s different, ok” and go and concoct strategies for this altered game universe. This is particularly important for ten-days, where altered universe rules can greatly influence plotting and strategy.
  • For new players who have never played a game before, being given a set of rules and then being told, “Oh, but disregard that and that and here is an extra condition for that case” would be very confusing! The full rules, repeated for the first few times they play a game, are helpful.

I think this same principle applies to Don’t Repeat Yourself as applied in software development. It’s good and useful to adopt a compact, unique representation for any particular piece of code or data, but don’t forget that a little bit of redundancy will greatly help out people learning your system for the first time! And to get the best of both worlds, you shouldn’t even have to repeat yourself: you should make the computer do it for you.
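One way to read “make the computer do it for you” is to keep the standard rules and the errata as separate sources and generate the full, amended sheet from them. The Haskell sketch below is only an illustration of that idea: the rule keys and rule texts are made up for the example, and it does not describe the Guild’s actual game production tools.

```haskell
import qualified Data.Map as Map

-- A rules sheet, keyed by a short (hypothetical) section name.
type Rules = Map.Map String String

-- The standard rules every game starts from (example text only).
baseRules :: Rules
baseRules = Map.fromList
    [ ("combat",   "Each character has a single combat rating (CR).")
    , ("bleeding", "An unconscious character bleeds out unless healed.")
    ]

-- The game-specific changes: the only thing a GM writes by hand.
errata :: Rules
errata = Map.fromList
    [ ("combat", "Characters have separate attack and defense ratings.") ]

-- Map.union is left-biased, so an errata entry replaces the base entry
-- with the same key and every other base entry is kept unchanged.
fullRules :: Rules
fullRules = Map.union errata baseRules

-- Render a sheet as plain text, one section per line.
renderRules :: Rules -> String
renderRules = unlines . map (\(k, v) -> k ++ ": " ++ v) . Map.toList
```

Veterans would get renderRules errata, new players renderRules fullRules, and nobody has to maintain the repetition by hand.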

Postscript. For the curious, here is a PDF of the game rules we used for a game I wrote in conjunction with Alex Gurany and Jonathan Chapman, The Murder of Jefferson Douglass (working name A Dangerous Game).

Postscript II. When has repeating yourself been considered good design?

  • Perl wants programmers to have to say as little as possible to get the job done, and this has given it a reputation as a “write only language.”
  • Not all code that looks the same should be refactored into a function; there should be some logical unity to what is factored out.
  • Java involves writing copious amounts of code: IDEs generate code for hashCode and equals, and you possibly tweak it after the fact. Those who like Java controversially claim that this prevents Java programmers from doing too much damage (though some might disagree.)
  • When you write essays, even if you’ve already defined a term fifty pages ago, it’s good to refresh a reader’s memory. This is especially true for math textbooks.
  • Haskell challenges you to abstract as much mathematically sound structure as possible. As a result, it makes people’s heads hurt and leads to combinator zoos up to the wazoo. But it’s also quite beneficial for even moderately advanced users.

Readers are encouraged to come up with more examples.

by Edward Z. Yang at October 22, 2010 01:00 PM

October 20, 2010

Inside 245s

Purpose of proof: semi-formal methods

In which the author muses that “semi-formal methods” (that is, non computer-assisted proof writing) should take a more active role in allowing software engineers to communicate with one another.


C++0x has a lot of new, whiz-bang features in it, one of which is the atomic operations library. This library has advanced features that enable compiler writers and concurrency library authors to take advantage of a relaxed memory model, resulting in blazingly fast concurrent code.

It’s also ridiculously bitchy to get right.

The Mathematizing C++ Concurrency project at Cambridge is an example of what happens when you throw formal methods at an exceedingly tricky specification: you find bugs. Lots of them, ranging from slight clarifications to substantive changes. As of a talk Mark Batty gave on Monday there are still open problems: for example, the sequential memory model isn’t actually sequential in all cases. You can consult the Pre-Rapperswil paper §4 for more details.

Which brings me to a piercing question:

When software engineers want to convince one another that their software is correct, what do they do?

This particular question is not about proving software “correct”—skeptics rightly point out that in many cases the concept of “correctness” is ill-defined. Instead, I am asking about communication, along the lines of “I have just written an exceptionally tricky piece of code, and I would now like to convince my coworker that I did it properly.” How do we do this?

We don’t.

Certainly there are times when the expense of explaining some particular piece of code is not useful. Maybe the vast majority of code we write is like this. And certainly we have mechanisms for “code review.” But the most widely practiced form of code review revolves around the patch and frequently is only productive when the original programmer is still around and still remembers how the code works. Having a reviewer read an entire program has been determined to be a frustrating and inordinately difficult thing to do, so instead we focus on style and local structure and hope no one writes immaculate evil code. Security researchers may review code and look for patterns of use that developers tend to “get wrong” and zero in on them. We do have holistic standards, but they tend towards “it seems to work,” or, if we’re lucky, “it doesn’t break any automated regression tests.”

What we have is a critical communication failure.


One place to draw inspiration from is that of proof in mathematics. The proof has proven to be a useful tool for communicating mathematical ideas from one person to another, with a certain level of rigor to avoid ambiguity and confusion, but not computer-level formality: unlike computer scientists, mathematicians have only recently begun to formalize proofs for computer consumption. Writing and reading proofs is tough business, but it is the key tool by which knowledge is passed down.

Is a program a proof? In short, yes. But it is a proof of the wrong thing: that is, it precisely specifies what the program will do, but subsequently fails to say anything beyond that (like correctness or performance or any number of other intangible qualities.) And furthermore, it is targeted at the computer, not another person. It is one of the reasons why “the specification of the language is the compiler itself” is such a highly unsatisfying answer.

Even worse, at some point in time you may have had in your head a mental model of how some dark magic worked, having meticulously worked it out and convinced yourself that it worked. And then you wrote // Black magic: don't touch unless you understand all of this! And then you moved on and the knowledge was lost forever, to be rediscovered by some intrepid soul who arduously reread your code and reconstructed your proof. Give them a bone! And if you haven’t even convinced yourself that the code for your critical section will do the right thing, shame on you! (If your code is simple, it should have been a simple proof. If your code is complicated, you probably got it wrong.)


You might argue that this is just the age-old adage “we need more documentation!” But there is a difference: proofs play a fundamentally different role than just documentation. Like programs, they must also be maintained, but their maintenance is not another chore to be done, inessential to the working of your program; rather, it should be considered a critical design exercise for assuring you and your colleagues that your new feature is theoretically sound. It is stated that good comments say “Why” not “What.” I want to demand rigor now.

Rigor does not mean that a proof needs to be in “Greek letters” (that is, written in formal notation); after all, such language is frequently off-putting to those who have not seen it before. But it’s often a good idea, because formal language can capture ideas much more precisely and succinctly than English can.

Because programs frequently evolve in their scope and requirements (unlike mathematical proofs), we need unusually good abstractions to make sure we can adjust our proofs. Our proofs about higher level protocols should be able to ignore the low level details of any operation. Instead, they should rely on whatever higher level representation each operation has (whether it’s pre- and post-conditions, denotational semantics, predicative semantics, etc.). We shouldn’t assume our abstractions work either (nor should we throw up our hands and say “all abstractions are leaky”): we should prove that they have the properties we think they should have (and also say what properties they don’t have too). Of course, they might end up being the wrong properties, as is often the case in evolutionary software, but often, proof can smoke these misconceptions out.
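As a small illustration of what a semi-formal proof attached to code might look like, here is a sketch in Haskell. The function and its names are my own toy example, not something from a real code base: the comment states a precondition and a postcondition and then argues, in plain English, why the code meets them.

```haskell
-- | Insert an element into a sorted list.
--
-- Precondition:  the input list is sorted in non-decreasing order.
-- Postcondition: the result is sorted and contains exactly x plus the
--                elements of the input list.
--
-- Proof sketch, by induction on the input list:
--   * []: the result [x] is trivially sorted.
--   * y:ys with x <= y: y is <= every element of ys (precondition),
--     so x <= y <= everything in ys, and x : y : ys is sorted.
--   * y:ys with x > y: ys is sorted, so by induction insertSorted x ys
--     is sorted and consists of x and the elements of ys. All of these
--     are >= y, so putting y in front keeps the list sorted.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted x [] = [x]
insertSorted x (y:ys)
    | x <= y    = x : y : ys
    | otherwise = y : insertSorted x ys

-- | The property the proof talks about, so it can also be spot-checked.
isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))
```

A caller who reasons about a higher level protocol only needs the two conditions, not the body, which is the kind of abstraction boundary argued for above.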

by Edward Z. Yang at October 20, 2010 01:00 PM

October 18, 2010

Clare Bayley

Freshmen Photobooth at the Aquarium

Freshmen Photobooth: Nick

I realized it’s been waaay too long since I’ve posted photos, since I’ve mostly been shooting for BadBoys and you don’t get to see those until you buy a calendar. ;) Anywho, here are some wonderful shots we got from the photobooth we did at the New England Aquarium for an MIT orientation event.

Freshmen Photobooth: Group 2

Freshmen Photobooth: Group 1

Freshmen Photobooth: Laura

Freshmen Photobooth: Group 3

Freshmen Photobooth: Captured

Freshmen Photobooth: Attack!

Freshmen Photobooth: Roadkill Buffet

by Clare Bayley at October 18, 2010 07:00 PM

Inside 245s

Rapidly prototyping scripts in Haskell

I’ve been having some vicious fun over the weekend hacking up a little tool called MMR Hammer in Haskell. I won’t bore you with the vagaries of multimaster replication with Fedora Directory Server; instead, I want to talk about rapidly prototyping scripts in Haskell—programs that are characterized by a low amount of computation and a high amount of IO. Using this script as a case study, I’ll describe how I approached the problem, what was easy to do and what took a little more coaxing. In particular, my main arguments are:

  1. In highly specialized scripts, you can get away with not specifying top-level type signatures,
  2. The IO monad is the only monad you need, and finally
  3. You can and should write hackish code in Haskell, and the language will impose just the right amount of rigor to ensure you can clean it up later.

I hope to convince you that Haskell can be a great language for prototyping scripts.

What are the characteristics of rapidly prototyping scripts? There are two primary goals of rapid prototyping: to get it working, and to get it working quickly. There is a confluence of factors that feed into these two basic goals:

  • Your requirements are immediately obvious—the problem is an exercise of getting your thoughts into working code. (You might decide later that your requirements are wrong.)
  • You have an existing API that you want to use, which lets you say “I want to set the X property to Y” instead of saying “I will transmit a binary message of this particular format with this data over TCP.” This should map onto your conception of what you want to do.
  • You are going to manually test by repeatedly executing the code path you care about. Code that you aren’t developing actively will in general not get run (and may fail to compile, if you have lots of helper functions). Furthermore, running your code should be fast and not involve a long compilation process.
  • You want to avoid shaving yaks: solving unrelated problems eats up time and prevents your software from working; better to hack around a problem now.
  • Specialization of your code for your specific use-case is good: it makes it easier to use, and gives a specific example of what a future generalization needs to support, if you decide to make your code more widely applicable in the future (which seems to happen to a lot of prototypes.)
  • You’re not doing very much computationally expensive work, but your logic is more complicated than is maintainable in a shell script.

What does a language that enables rapid prototyping look like?

  • It should be concise, and at the very least, not make you repeat yourself.
  • It should “come with batteries,” and at least have the important API you want to use.
  • It should be interpreted.
  • It should be well used; that is, what you are trying to do should exist somewhere in the union of what other people have already done with the language. This means you are less likely to run into bizarre error conditions in code that no one else runs.
  • It should have a fast write-test-debug cycle, at least for small programs.
  • The compiler should not get in your way.

General prototyping in Haskell. If we look at our list above, Haskell has several aspects that recommend it. GHC has a runghc command which allows you to interpret your script, which makes for quick prototyping. Functional programming encourages high amounts of code reuse, and can be extremely concise when you’re comfortable with using higher-order functions. And, increasingly, it’s growing a rather large set of batteries. In the case of LDAP MMR, I needed bindings for the OpenLDAP library, which John Goerzen had already written. A great start.

The compiler should not get in your way. This is perhaps the most obvious problem for any newcomer to Haskell: they try to write some pedestrian program and the compiler starts bleating at them with a complex type error, rather than the usual syntax error or runtime error. As they get more acquainted with Haskell, their mental model of Haskell’s type system improves and their ability to fix type errors improves.

The million dollar question, then, is how well do you have to know Haskell to be able to quickly resolve type errors? I argue, in the case of rapid prototyping in Haskell, not much at all!

One simplifying factor is the fact that the functions you write will usually not be polymorphic. Out of the 73 fully implemented functions in MMR Hammer, only six have inferred nontrivial polymorphic type signatures, and all but one of these are only used in a single type context.

For these signatures, a is always String:

Inferred type: lookupKey :: forall a.
                            [Char] -> [([Char], [a])] -> [a]

Inferred type: lookupKey1 :: forall a.
                             [Char] -> [([Char], [a])] -> Maybe a

m is always IO, t is always [String] but is polymorphic because it’s not used in the function body:

Inferred type: mungeAgreement :: forall (m :: * -> *).
                                 (Monad m) =>
                                 LDAPEntry -> m LDAPEntry

Inferred type: replicaConfigPredicate :: forall t (m :: * -> *).
                                         (Monad m) =>
                                         ([Char], t) -> m Bool

a here is always (String, String, String); however, this function is one of the few truly generic ones (it’s intended to be an implementation of msum for IO):

Inferred type: tryAll :: forall a. [IO a] -> IO a

And finally, our other truly generic function, a convenience debugging function:

Inferred type: debugIOVal :: forall b. [Char] -> IO b -> IO b
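The definitions behind these last two signatures are not shown, so the following is only a guess at what functions with those types could look like. The helper trySome is a hypothetical name, and the behaviour on an empty list is my own choice.

```haskell
import Control.Exception (SomeException, try)
import System.IO (hPutStrLn, stderr)

-- Hypothetical helper: 'try' pinned to catch every exception type.
trySome :: IO a -> IO (Either SomeException a)
trySome = try

-- Run the actions in order and return the first result that does not
-- throw. If the last action throws, its exception propagates.
tryAll :: [IO a] -> IO a
tryAll []       = ioError (userError "tryAll: no alternatives")
tryAll [x]      = x
tryAll (x : xs) = do
    r <- trySome x
    case r of
        Right v -> return v
        Left _  -> tryAll xs

-- Log to stderr before and after running an action. There is no Show
-- constraint on b, so the result itself cannot be printed here.
debugIOVal :: String -> IO b -> IO b
debugIOVal label action = do
    hPutStrLn stderr ("[debug] starting " ++ label)
    r <- action
    hPutStrLn stderr ("[debug] finished " ++ label)
    return r
```

The signatures are written out here for readability; leaving them off gives the same inferred types for these two functions.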

I claim that for highly specific, prototype code, GHC will usually infer fairly monomorphic types, and thus you don’t need to add very many explicit type signatures to get good errors. You may notice that MMR Hammer has almost no explicit type signatures—I argue that for monomorphic code, this is OK! Furthermore, this means that you only need to know how to use polymorphic functions, and not how to write them. (To say nothing of more advanced type trickery!)
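To see how a mostly monomorphic type falls out without any annotations, here is a hypothetical reconstruction of the two lookup functions. The bodies are my assumption (in particular the case-insensitive comparison of attribute names); only the names and the inferred types come from the listing above.

```haskell
import Data.Char (toLower)
import Data.Maybe (listToMaybe)

-- No type signature: comparing names with 'map toLower' forces the key
-- and the attribute names to be [Char], while the values stay generic.
-- GHC infers  lookupKey :: [Char] -> [([Char], [a])] -> [a]
lookupKey key attrs =
    concat [ vals | (name, vals) <- attrs
                  , map toLower name == map toLower key ]

-- First value for a key, if any.
-- GHC infers  lookupKey1 :: [Char] -> [([Char], [a])] -> Maybe a
lookupKey1 key attrs = listToMaybe (lookupKey key attrs)
```

Loading this in GHCi and asking :type lookupKey is a quick way to check what was inferred before deciding whether a signature is worth writing down.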

Monads, monads, monads. I suspect a highly simplifying assumption for scripts is to avoid using any monad besides IO. For example, the following code could have been implemented using the Reader transformer on top of IO:

ldapAddEntry ldap (LDAPEntry dn attrs) = ...
ldapDeleteEntry ldap (LDAPEntry dn _ ) = ...
printAgreements ldap = ...
suspendAgreements ldap statefile = ...
restoreAgreements ldap statefile = ...
reinitAgreements ldap statefile = ...

But there was only one argument being passed around, and it was required for essentially every call to the API (so I would have done a bit of ask calling anyway), so using the Reader transformer would probably have increased code size, as all of my LDAP code would then have needed to be lifted with liftIO.
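A rough side-by-side of the two styles, as a sketch: Conn and describeEntry are hypothetical stand-ins for the LDAP handle and a library call (they are not the real LDAP bindings), and the Reader version uses ReaderT from the transformers package.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)

-- Hypothetical stand-in for an LDAP connection handle.
newtype Conn = Conn String

-- Hypothetical stand-in for a library call that lives in plain IO.
describeEntry :: Conn -> String -> IO String
describeEntry (Conn host) dn = return (host ++ ": " ++ dn)

-- Style 1: pass the handle explicitly and stay in IO.
describeExplicit :: Conn -> String -> IO String
describeExplicit conn dn = describeEntry conn dn

-- Style 2: hide the handle in a Reader. Every library call now needs
-- an 'ask' to get the handle back and a 'liftIO' to run in the stack.
describeWithReader :: String -> ReaderT Conn IO String
describeWithReader dn = do
    conn <- ask
    liftIO (describeEntry conn dn)

-- The handle is supplied once, at the top.
runWithConn :: Conn -> ReaderT Conn IO a -> IO a
runWithConn conn action = runReaderT action conn
```

With a single handle that nearly every function touches, style 2 saves one parameter per function but pays for it with ask and liftIO at every call site.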

Fewer monads also means fewer things to worry about: you don’t have to worry about mixing up monads and you can freely use error as a shorthand for bailing out on a critical error. In IO these get converted into exceptions which are propagated the usual way. Because they are strings, you can’t write very robust error handling code, but hey, prototypes usually don’t have error handling. In particular, it’s good for a prototype to be brittle: to prefer to error out rather than to do some operation that may be correct but could also result in total nonsense.
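A minimal sketch of that behaviour, with made-up function names: a call to error inside IO surfaces as an ErrorCall exception carrying the message string, which can be caught like any other exception if you ever need to.

```haskell
import Control.Exception (ErrorCall (..), try)

-- Hypothetical prototype-style check: bail out with 'error' rather
-- than threading a proper error type through the program.
requireNonEmpty :: String -> String -> IO String
requireNonEmpty what value
    | null value = error ("missing " ++ what)
    | otherwise  = return value

-- Run an action and turn a call to 'error' into a Left holding the
-- message. The message string is all the information there is.
catchErrorCall :: IO a -> IO (Either String a)
catchErrorCall action = do
    r <- try action
    return $ case r of
        Left (ErrorCall msg) -> Left msg
        Right v              -> Right v
```

In a prototype you would usually not bother with catchErrorCall at all and just let the exception kill the script.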

Hanging lambda style also makes writing out code that uses bracketing functions very pleasant. Here are some examples:

withFile statefile WriteMode $ \h ->
    hPutStr h (serializeEntries replicas)

forM_ conflicts $ \(LDAPEntry dn attrs) ->
    putStrLn dn

Look, no parentheses!

Reaping the benefits. Sometimes, you might try writing a program in another language for purely pedagogical purposes. But otherwise, if you know a language, and it works well for you, you won’t really want to change unless there are compelling benefits. Here are the compelling benefits of writing your code in Haskell:

  • When you’re interacting with the outside world, you will fairly quickly find yourself wanting some sort of concurrent execution: maybe you want to submit a query but timeout if it doesn’t come back in ten seconds, or you’d like to do several HTTP requests in parallel, or you’d like to monitor a condition until it is fulfilled and then do something else. Haskell makes doing this sort of thing ridiculously easy, and this is a rarity among languages that can also be interpreted.

  • Because you don’t have automatic tests, once you’ve written some code and manually verified that it works, you want it to stay working even when you’re working on some other part of the program. This is hard to guarantee if you’ve built helper functions that need to evolve: if you change a helper function API and forget to update all of its call sites, your code will compile but when you go back and try running an older codepath you’ll find you’ll have a bunch of trivial errors to fix. Static types make this go away. Seriously.

  • Haskell gives you really, really cheap abstraction. Things you might have written out in full back in Python because the more general version would have required higher order functions and looked ugly are extremely natural and easy in Haskell, and you truly don’t have to say very much to get a lot done. A friend of mine once complained that Haskell encouraged you to spend too much time working on abstractions; this is true, but I also believe once you’ve waded into the fields of Oleg once, you’ll have a better feel in the future for when it is and isn’t appropriate.

  • Rigorous NULL handling with Maybe gets you thinking about error conditions earlier. Many times, you will want to abort because you don’t want to bother dealing with that error condition, but other times you’ll want to handle things a little more gracefully, and the explicit types will always remind you when that is possible:

    case mhost of
        (Just host) -> do
            let status = maybe "no status found" id mstatus
            printf ("%-" ++ show width ++ "s : %s\n") host status
        _ -> warnIO ("Malformed replication agreement at " ++ dn)
    
  • Slicing and dicing input in a completely ad hoc way is doable and concise:

    let section = takeWhile (not . isPrefixOf "profile") . tail
                . dropWhile (/= "profile default") $ contents
        getField name = let prefix = name ++ " "
                        in evaluate . fromJust . stripPrefix prefix
                                    . fromJust . find (isPrefixOf prefix)
                                    $ section
    

    But at the same time, it’s not too difficult to rip out this code for a real parsing library for not too many more lines of code. This is an instance of a more general pattern in Haskell, which is that moving from brittle hacks to robust code is quite easy to do (see also, static type system.)
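To make the concurrency bullet at the top of this list a bit more concrete, here is a sketch of the three situations it mentions, using only modules from base. The function names are my own, and the parallel helper is deliberately naive: if one of the actions throws, its result never arrives.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
import System.Timeout (timeout)

-- Run an action, giving up after the given number of seconds.
-- Nothing means the deadline passed first.
withDeadline :: Int -> IO a -> IO (Maybe a)
withDeadline secs = timeout (secs * 1000000)

-- Run several actions in parallel and collect the results in the
-- original order. Prototype quality: no exception handling.
inParallel :: [IO a] -> IO [a]
inParallel actions = do
    vars <- mapM spawn actions
    mapM takeMVar vars
  where
    spawn action = do
        var <- newEmptyMVar
        _ <- forkIO (action >>= putMVar var)
        return var

-- Poll a condition every 100 ms until it holds.
pollUntil :: IO Bool -> IO ()
pollUntil check = do
    ok <- check
    if ok then return () else threadDelay 100000 >> pollUntil check
```

For example, withDeadline 10 someQuery covers the “time out after ten seconds” case, where someQuery stands for whatever IO action you are waiting on.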

Some downsides. Adding option parsing to my script was unreasonably annoying, and after staring at cmdargs and cmdlib for a little bit, I decided to roll my own with getopt, which ended up being a nontrivial chunk of code in my script anyway. I’m not quite sure what went wrong here, but part of the issue was my really specialized taste in command line APIs (based off of Git, no less), and it wasn’t obvious how to use the higher level libraries to the effect I wanted. This is perhaps witnessed by the fact that most of the major Haskell command line applications also roll their own command parser. More on this in another post.

Using LDAP was also an interesting exercise: it was a fairly high quality library that worked, but it wasn’t comprehensive (I ended up submitting a patch to support ldap_initialize) and it wasn’t battle tested (it had no workaround for a longstanding bug between OpenLDAP and Fedora DS—more on that in another post too.) This is something that gets better with time, but until then expect to work closely with upstream for specialized libraries.

by Edward Z. Yang at October 18, 2010 01:00 PM

cslink

Little Q reloaded

There used to be this hot pot restaurant in Quincy called Little Q. Last year, they closed down because they lost their retail space. However, they’ve recently reopened as Q Restaurant in Chinatown. I went and checked it out today.

The space is extremely trendy, which I find very strange for a Chinese hot pot place. It’s probably because I’m used to Chinese hot pot being in hole in the wall locations or buffet places like the one I’ve written about previously. I was pretty curious about how Q Restaurant would be able to compete with Hot Pot Buffet, seeing as it’s at a significantly higher price point. The answer appears to be higher quality meats (the USDA prime beef practically melted in my mouth), serving sushi, and generally being a trendier place. For example, there was an extensive cocktail menu and a full bar. The cocktails I tried were extremely tasty and quite innovative. Also, the service is way more attentive.

As for the food, the meats were all very good. The seafood platter had some fish cuts that looked like they were secretly supposed to be for nigiri and I wonder if they use the same fish for it. The broths were varied and interesting. Overall, it was delicious and I’ll definitely be going back.

by Cordelia at October 18, 2010 07:30 AM