will fitzgerald

slightly less blurry in real life

Profile

Computer Software | San Francisco Bay Area, US

Posts

April 28, 10:22 PM

I won the Listserve lottery, and sent this story out to about 13k people today. It was a revision of something I’d written a while back.

Patrick of Ireland

Once upon a time (and listen, for although this sounds like a fable, I will try to make every word true), once upon a time there was a young boy named Patrick, a young Christian boy whose father was a deacon, and whose grandfather was a priest. Patrick, though he was raised in a Christian household, did not himself know God. I imagine he was like most people, just living his life out without much concern for spiritual things.

But then something really terrible happened. Raiders attacked his parents’ villa, and Patrick was taken off to be a slave in Ireland. I suspect that his parents had plans for him to go off to the city to become an educated gentleman. Instead, he was taken off at age sixteen to become a slave. Instead of going to school, he was forced to become a shepherd.

In his loneliness and emptiness, he began to remember what he knew about God. And he started to pray the prayers he had been taught; I don’t know what prayers, but I imagine he prayed the “Our Father.” He started praying more and more — in the fields at night and during the day, even waking up before daylight to pray, praying up to, he says, 100 prayers in a day. God started to burn in him.

He was a slave until he was about twenty, and then something very spooky but real happened. He was sleeping, but he heard a voice saying he was going home soon. Soon after he heard the voice again saying that his boat was ready. He immediately fled from his slave-owner, for he recognized that this was a message directly from God. He had to travel 200 miles to get to the harbor where the ship was, in a land he did not know, among people whom he did not know.

He found the ship, and they almost didn’t take him. After the visions and after walking 200 miles, I imagine he was disappointed. He headed back to the hut where he was staying, praying along the way. As he was walking and praying, one of the men shouted at him to come back; he could get a ride with them.

They traveled for three days before they landed, and then the whole group started walking. After twenty-eight days their food ran out, and the land was uninhabited. His companions were kind to have taken him in, but now they were hungry and cranky, and began to taunt him about God, asking that perennial question: If God is so great and powerful, why isn’t God helping us?

Patrick had good reason to trust that God had something other than starving in mind for them, though, and he told them boldly that they should become Christians. He also said that God was going to provide so much food that very day that they couldn’t eat any more. And so it happened: a herd of pigs came by, and I imagine they had a pretty good pork barbecue. Their attitude towards God and Patrick changed that day; in fact, they had fire and food enough for the rest of their journey.

Patrick eventually was able to return to his kinsfolk, and he was glad to return, and they were glad to have him. But Patrick had another dream: a man named Victoricus bringing letters from Ireland, and in his dream he read one of the letters. The letter said it was “The Voice of the Irish,” and he could hear the voices of people he knew in Ireland begging him to return to the land of his slavery.

So Patrick returned to Ireland to preach. He didn’t cast out the snakes; we don’t know if he used a shamrock to teach the Trinity; the hymns attributed to him are most likely not by him. But we know he continued to face hardship. In fact, he was kidnapped at least one more time for two months. Patrick’s life was full of ups and downs. He was ashamed of his poor education (poor man, he only knew three or so languages and wrote his confession in Latin that was perhaps not up to par). He called himself a stutterer, though he preached to thousands. He felt that “poverty and failure suited him better than wealth and delight.” He certainly never got wealthy ministering to the Irish. He remained homesick, I think, to the end of his days. But he recognized the work of God through him, and he felt bound by the Spirit to remain in Ireland.


March 24, 06:40 PM

The tr program converts a string of characters into another string of characters, using a very simple rule system. The standard example converts uppercase to lowercase: tr [A-Z] [a-z]. But this works for simple “ASCII” strings only. tr, at least on many systems, understands Unicode, yet the ranges in the standard example cover only the unaccented Latin letters, so it fails for converting, say, Russian or Czech. But tr also understands character classes, so the standard example should be written tr [:upper:] [:lower:].
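
A minimal sketch of the character-class form, assuming a POSIX shell. The helper name lowercase and the sample strings are made up for illustration. The arguments are quoted so that the shell does not try to expand the brackets as a filename pattern. Whether non-ASCII letters are actually converted depends on the tr implementation and the current locale, so only the ASCII result is stated in a comment.

```shell
#!/bin/sh
# lowercase: hypothetical helper that lowercases its standard input.
# The character classes are quoted so the shell passes them to tr untouched.
lowercase() {
  tr '[:upper:]' '[:lower:]'
}

# ASCII input: same result as the classic tr '[A-Z]' '[a-z]'.
printf 'Hello World\n' | lowercase    # prints "hello world"

# Non-ASCII input: the result depends on the tr implementation and locale.
printf 'ПРИВЕТ\n' | lowercase
```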

File under “boring post.”


March 17, 02:38 AM

This American Life released a retraction today about its story reporting on working conditions at Apple suppliers in China. Kudos to This American Life for spending an entire show on this.

I’m afraid that This American Life’s focus on its own errors might overshadow the truths about working conditions in China. To their credit, they spend some time trying to get at the facts.

In the final minutes, Ira Glass interviews Charles Duhigg of the New York Times, who has done his own investigation of working conditions at Apple. Duhigg makes the following claims, which I have little reason to doubt:

  • Actual labor costs are not a major component of the cost of creating Apple’s products; Apple products could be made in the US for roughly the same cost.
  • It is the ability to quickly and flexibly adjust its supply chain that is the real benefit of sourcing from China (actually, I’m a bit skeptical of Duhigg’s story here, and I wonder if he is exaggerating a bit, but I don’t doubt at all that China is very much more flexible than the US).
  • The biggest violations in working conditions are overwork (24 hour back-to-back shifts, 60+ hour work weeks, people pressured into working overtime) and unsafe conditions (for example, flammable industrial dust).
  • Apple lacks the will to insist on better working conditions in China.
  • If consumers put pressure on Apple, Apple would insist on better working conditions. (In an interview Duhigg did with Terry Gross, he compared this to the changes that occurred at Nike suppliers.)

Duhigg finished with this:

You’re not only the direct beneficiary; you are actually one of the reasons why it exists. If you made different choices, if you demanded different conditions, if you demanded that other people enjoy the same work protections that you yourself enjoy, then, then those conditions would be different overseas.

This is worth pondering. I think it might be time to stop buying from Apple until things are vastly improved. If Apple, the leading tech producer, corrects its course, I am sure most of the other hardware companies will follow suit.

Questions: what is Microsoft’s current record with respect to working conditions overseas? Are there hardware companies that are more ethical?


March 08, 06:19 PM

A random Scala note.

Today, I wanted to apply a list of filters to each item in a list, and return just those that pass each of the filters.

For example, given a range of integers, return just those that are divisible by 2 and by 3.

Let’s start by defining a boolean function divides:

def divides(d: Int, i: Int): Boolean = i % d == 0

Note that divides(2,_:Int) defines a partially applied function that tests divisibility by 2.

(divides(2,_:Int))(2) => true
(divides(2,_:Int))(3) => false

So we can create our filters so:

val filters = divides(2,_:Int) :: divides(3,_:Int) :: Nil

or

val filters = List(divides(2,_:Int),divides(3,_:Int))

Now, we can simply use Scala’s filter and forall functions to filter a range of integers:

scala> Range(1,50).filter(x => filters.forall(f => f(x)))
res45: scala.collection.immutable.IndexedSeq[Int] =
  Vector(6, 12, 18, 24, 30, 36, 42, 48)

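The filters could also be defined as a Set, but by creating them as a List, one can put the less expensive filters first. Because forall stops at the first filter that fails, the cheaper checks can rule an item out before the more expensive ones run.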


March 01, 12:35 AM

Last night, I watched the American Experience documentary on the Amish, which I thought very well done. They treated the Amish with respect, neither nostalgically nor scornfully, with beautiful camera work, well-crafted stories, and good expert commentary. Personally, I especially enjoyed hearing some Amish singing, particularly O Gott Vater, which I had never heard sung.

They told the story of the 2006 Nickel Mines tragedy, in which a man shot ten Amish girls in a schoolhouse, killing five of them, and of the amazing forgiveness the Amish showed the killer (who died in the attack) and the killer’s family. They also told stories of former Amish who were placed under the ban and shunned by their Amish communities. The documentarians did not draw any connections between these two, but the stories seem to prompt the following question: how can such peace-loving, forgiving people place others under a strict ban of total non-contact?

The Amish do not need me to defend them, but I have some thoughts. The first is a historical one. When the early Anabaptists began to practice the ban, it is important to remember what the typical punishment for those who disagreed with a community was in their day. Basically, in most communities, deep theological and civil disagreements would lead to a death sentence for the rebellious one. The Anabaptists themselves were hunted down, and thousands were cruelly murdered for their disagreement. Over against this, the Anabaptists (and the Amish in particular) came up with a non-capital punishment for disagreement: the ban. Compared to death, being shunned is a light sentence.

It is important to remember that the ban is applied only to Amish who have been baptized in the community. Adults who were never baptized as Amish Christians are not subject to the ban; only those who have committed their lives to God and to the Amish community through baptism are subject to shunning. Adults who grew up Amish, but decided against baptism, are not shunned; they are treated as any other member of the world.

The documentary makes a point of saying that, in the Amish view, the community is more important than the individual. God is obeyed as a community, not as a bunch of people who happen to believe the same thing and act in concert. One outworking of this is that the Amish act to preserve and protect and nourish the community more than their individual members. Shunning is but the most drastic example of this: it is more important to protect the community than it is to maintain family and friendship ties, no matter how tender.

Finally, the Amish also believe that shunning is the best thing for the person being shunned: a radical kind of tough love to bring the rebel back (and then, the famous Amish forgiveness should kick in). By severing all of a person’s contact with the community (where one has learned to speak, to love, where one’s friends and family are), the community hopes to instill the deep cost of rebellion. There are few half-way steps.

As a Mennonite, I admire the consistency and faithfulness of my Anabaptist cousins, and I think about this topic a lot, in fact: how to uphold standards as the world goes to hell in a handbasket. I don’t really have the power to place people under a ban, but it’s good to be reminded that people need to accept responsibility for their actions, that the community is important, and a commitment to following Jesus is a very serious one.

God grant us wisdom to know the differences between how to treat people who fail you and people who fail the church through their individual actions.


January 13, 12:01 AM

Dearest creature in creation,
Study English pronunciation.
It’s more regular in its core
Than pundits, who focus on its more
Erratic ways, would have you believe.
Perhaps they simply cannot conceive
Of any system not based in Latin—
They would choose, I suppose, to flatten
All writing to “one form, one sound”
But, really, regularities abound.
Consider, how we pronounce the plural
Form of words; Imagine the neural
Work of reading “dogs” and “cats.”
Would you prefer “dogz”? That’s
Not right—that single ess for each
Is easier to read, to sound out, and to teach.
Or consider “heir/inherit”
To write “air” would be a demerit,
A signature failure, and a sign
Of a spelling system’s worse design.
Seriously, it would simply astonish
Anyone to think that “ghoti” sounds like “fish.”
Besides, English spans such colossal ages
And latitudes, I doubt such cages
Desired by fans of regularization
Could withstand the normal mutation
Of how language really adapts.
“Wind” and “hind” have rhymed or not, perhaps,
As, over time and place, each has adopted
A short I, sometimes a long I, co-opted
By real human beings. So “after tea and cakes and ices,”
Let us “force the moment to its crisis”—
Haters, they say, are going to hate; let them snivel
I have had enough of drivel,
Go ahead, enjoy your whine,
But English spelling is basically fine.

—Will Fitzgerald, January 2012


January 07, 06:47 PM

% of English tweets by size (sample 50k)

I get a very different distribution of tweets than Isaac Hepworth — no spikes at 28. My provisional guess is that his data is a bit wonky. My data here is (only) 50k English tweets from one day in 2007.

Isaac Hepworth's distribution


December 31, 07:09 PM

The WordPress.com stats helper monkeys prepared a 2011 annual report for this blog.

Here’s an excerpt:

A New York City subway train holds 1,200 people. This blog was viewed about 6,000 times in 2011. If it were a NYC subway train, it would take about 5 trips to carry that many people.

Click here to see the complete report.


November 22, 07:37 PM

This is a followup to my post Computational lexicography on the cheap using Twitter, but more especially in response to Using off-the-shelf software for basic Twitter analysis.

The latter article shows how to use database software (MySQL and its implementation of the SQL language) to do basic Twitter analysis. The ‘basic analysis’ includes counts by hashtag, timelines, and word clouds. They analyse about 475k tweets.

But here’s the thing: all their analyses can be done more simply with simple text files and pipes of Unix commands (as most eloquently demonstrated in Unix for Poets, by Ken Church). In fact, several simple commands (commands I use every day) are powerful enough to do the kind of analyses they discuss.

Getting the data.

(You can skip over this if you have data already!)

Interestingly, they do not show how to get the tweets to begin with. My previous post discusses this, but it might be useful to show a simple Ruby program that collects tweet data, especially since the method has changed slightly since my post. The biggest hurdle is setting up authentication to access Twitter’s data (discussed in full here); the crucial thing is that you have to register as a Twitter developer, register a Twitter application, and get special tokens. You create an application at the Twitter apps page; from that same location you generate the special tokens.

Here’s the Ruby script (also listed here).

require 'rubygems'
require 'tweetstream'
require 'date'

TweetStream.configure do |config|
  config.consumer_key = ''
  config.consumer_secret = ''
  config.oauth_token = ''
  config.oauth_token_secret = ''
  config.auth_method = :oauth
  config.parser   = :json_gem
end

# Change the words you want to track
TweetStream::Client.new.track('football', 'baseball', 'soccer', 'cricket') do |status|
  begin
    # The Tweet id
    id = status.id
    # The text of the tweet, with tabs and new lines (returns) replaced by spaces,
    # so each tweet stays on one line with a fixed number of columns
    txt = status.text.gsub(/[\t\r\n]+/, " ")
    # The date of the tweet, printed out in a slightly more useful form 
    # for our purposes
    d = DateTime.parse(status.created_at).strftime("%Y-%m-%d\t%H:%M:%S")
    puts [id,txt,d].join("\t")
  rescue StandardError => e
    # Report errors on stderr so they do not end up in the tab-separated output
    warn "!!! Error: #{e}"
  end
end

With the proper keys and secrets, this gist will allow you to track keywords over time, and print out, in a tab-separated format, the tweet id, the text of the tweet, the date, and the time it was published (in UTC, or Greenwich, time). You could add additional columns, as described (by example) in the Twitter API.

The example here tracks mentions of football, baseball, soccer, and cricket, but obviously, these could be other keywords. Running this using this command:

ruby track_tweets.rb | tee nsports.tsv

will place tweets in the file ‘nsports.tsv’.

Basic statistics

Counting the number of football, baseball, etc. mentions is easy:

$ grep -i football nsports.tsv | wc -l
$ grep -i baseball nsports.tsv | wc -l
$ grep -i soccer nsports.tsv | wc -l
$ grep -i cricket nsports.tsv | wc -l

Getting the number of lines in the file is just as easy:

$ cat nsports.tsv | wc -l
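
The four grep commands can also be folded into a loop. This is only a sketch: the helper name count_mentions and the three sample rows standing in for nsports.tsv are invented for illustration, and grep -c is used in place of grep piped into wc -l (both count matching lines).

```shell
#!/bin/sh
# Build a tiny stand-in for nsports.tsv (id, text, date, time); the rows are invented.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '1\tI love football\t2011-11-21\t10:00:00\n' > "$sample"
printf '2\tFOOTBALL and cricket today\t2011-11-21\t11:00:00\n' >> "$sample"
printf '3\tsoccer is not baseball\t2011-11-22\t09:30:00\n' >> "$sample"

# count_mentions FILE WORD: number of lines in FILE that mention WORD, ignoring case.
count_mentions() {
  grep -ic -- "$2" "$1"
}

# Print one "keyword<TAB>count" line per tracked keyword.
for word in football baseball soccer cricket; do
  printf '%s\t%s\n' "$word" "$(count_mentions "$sample" "$word")"
done
```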

The second analysis was to count who is retweeted the most, done by counting the username after the standard Twitter “RT ” (e.g., “rt @willf good stuff!”). The following pipeline of commands accomplishes this simply enough:

egrep -io "rt +@\w+" nsports.tsv | perl -pe "s/ +/ /g" | cut -f2 -d\  | sort | uniq -c | sort -rn | head

(This may be easier to copy from here.) Each of these is a separate command, and the pipe symbol (|) indicates that the output from one command goes on to the next. Here’s what these commands do:

  1. egrep -io "rt +@\w+" nsports.tsv: searches through the tweets for the pattern RT space @ name, where there is one or more spaces, and one or more ‘word’ characters. It only prints the matching parts (-o), and ignores differences in case (-i).
  2. perl -pe "s/ +/ /g": I noticed that from time to time, there is more than one space after the ‘RT’, so this substitutes one or more spaces with exactly one space.
  3. cut -f2 -d\ : each line now looks like “RT @name”, and this command ‘cuts’ the second field out of each line, with a delimiter of a space. This results in each line looking like ‘@name’.
  4. sort | uniq -c | sort -rn: this is three commands, but I type them so frequently, it seems like one to me. It sorts the text, so identical lines can be counted with the uniq command, which produces two columns: the count and the name. We then reverse sort (-r) numerically (-n) on the count.
  5. head: this shows the first ten lines of its input.

This command pipeline should have no problem handling 475k lines.
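
The steps above can be tried on a few made-up rows. This is a sketch with two substitutions, not the pipeline as given: an explicit bracket expression replaces \w (which is an extension in grep), and tr -s squeezes repeated spaces in place of the Perl call. The helper name top_retweeted and the sample tweets are hypothetical, and the final awk step only trims the padding that uniq -c adds.

```shell
#!/bin/sh
# Invented sample rows in the same layout as nsports.tsv (id, text, date, time).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '1\tRT @willf good stuff!\t2011-11-21\t10:00:00\n' > "$sample"
printf '2\trt  @willf more good stuff\t2011-11-21\t11:00:00\n' >> "$sample"
printf '3\tRT @other football news\t2011-11-22\t09:30:00\n' >> "$sample"
printf '4\tno retweet here @willf\t2011-11-22\t09:45:00\n' >> "$sample"

# top_retweeted FILE: print "count @name" lines, most retweeted first.
top_retweeted() {
  grep -Eio 'rt +@[A-Za-z0-9_]+' "$1" |   # keep only the "RT @name" part
    tr -s ' ' |                           # squeeze runs of spaces to one space
    cut -d ' ' -f 2 |                     # keep the @name field
    sort | uniq -c | sort -rn |           # count names, largest count first
    awk '{ print $1, $2 }' |              # trim the padding added by uniq -c
    head
}

top_retweeted "$sample"
# prints:
# 2 @willf
# 1 @other
```

Names are counted case-sensitively here, so @willf and @WillF would be tallied separately.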

The third analysis was to put the data in a format that can be used by Excel to create a graph, with counts by day. Because we have printed the date and time in separate columns, with the date in column 3, we can simply do the cut, sort, uniq series:

cat nsports.tsv | cut -f3 | sort | uniq -c > for_excel.tsv

This will put the data into a format that Excel can read.
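
Note that uniq -c pads its counts with spaces instead of separating the columns with a tab. If a strictly tab-separated file is wanted, one possible variation (a sketch; the helper name counts_by_day and the sample rows are invented) reorders the columns with awk:

```shell
#!/bin/sh
# Invented sample rows in the same layout as nsports.tsv (id, text, date, time).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '1\tI love football\t2011-11-21\t10:00:00\n' > "$sample"
printf '2\tcricket today\t2011-11-21\t11:00:00\n' >> "$sample"
printf '3\tsoccer news\t2011-11-22\t09:30:00\n' >> "$sample"

# counts_by_day FILE: print "date<TAB>count" for each date found in column 3.
counts_by_day() {
  cut -f 3 "$1" | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

counts_by_day "$sample"
# prints (columns separated by a tab):
# 2011-11-21	2
# 2011-11-22	1
```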

Finally, the authors show how to create Wordle word graphs overall, and for the categories. I’m not a big fan of these as a data exploration tool, but notice you can use cut -f2 to get the text to paste into Wordle.

So, this is computational social science on the cheap using Twitter, using some basic Unix commands (cat, cut, sort, uniq, grep), with one tiny, tiny call to Perl. You can do this too–and it’s easier to learn than MySQL and SQL! Plus, you can easily read the text files that are created. All of this was done on a standard Mac, but any Unix machine, or Windows machine with the Cygwin tools installed, can do this as well.


Websites

My personal website is Entish.org. It contains a lot of things, including my cv, a tutorial for C++, a bunch of stuff about shape note singing, and the canonical list of sticky jokes. An old blog of mine, the Digital Car Journal, is archived there.

Some other sites and projects: The Trumpet is an indie "thrice annual journal of shape note music." HarmoniaSacra.org is an online version of the shape note tunebook, the Harmonia Sacra. I also designed and managed our church's website, KMenno.org.

Once, I helped NASA and the Army make a helicopter fly all by itself.

By the way, the splash page picture was made by Jessica Beer at the 22nd Midwest Sacred Harp Convention.
