
"For" vs. "as" continued

03-09-2009

Hypothesis: In philosophy, people care about analogy and metaphor, but not about iteration or quantification; in computer science, it's just the opposite. The word "as" can be used for analogies and metaphors ("she's as beautiful as a night sky"), while the word "for" can be used for iteration or quantification ("for i from 1 to n", "for all x"). So philosophers avoid "as" and computer scientists avoid "for" because each word is ambiguous in the respective discipline.

"For" vs. "as"

03-06-2009

I noticed something today. Back when I used to write philosophy papers, I frequently used the word "for" to connect justifications to claims, e.g. "Socrates is a man, for he is mortal." Now, I tend to use "as" instead, e.g. "Socrates is a man, as he is mortal."

I thought that the change might be just an accident of my personal development. But I was reading an essay by a philosopher tonight, and another one by a historian, and I noticed that they both used "for" in this sense! I never see this usage in ML papers, on the other hand, but I do see "as", e.g. I think I noticed it the other day in Vapnik. So perhaps there are systematic differences among fields. I wonder why ... .

Cumulative learning

02-18-2009

"No one has written a masterpiece in a language he hadn't forgotten learning." – Nabokov(?)

The first step in applying a machine learning algorithm to some dataset or task is typically to choose a suitable feature representation. For example, if our dataset is a collection of natural language documents, we may choose to represent each document by a "bag-of-words" count vector, a sequence of word tokens, a parse tree, a scalar that is 1 if the word "the" appears in the document and 0 otherwise, and so on. Making an appropriate choice requires knowledge of both the domain and the algorithm, and it's not always easy to get right.
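To make a few of these representations concrete, here is a minimal Python sketch. The naive whitespace tokenizer, the function names, the toy vocabulary, and the sample document are all assumptions made for illustration; they do not come from any particular library.

```python
from collections import Counter

def tokenize(document):
    """Split a document into lowercase word tokens (naive whitespace split)."""
    return document.lower().split()

def bag_of_words(document, vocabulary):
    """Return a count vector with one entry per vocabulary word, in order."""
    counts = Counter(tokenize(document))
    return [counts[word] for word in vocabulary]

def contains_the(document):
    """Scalar feature: 1 if the word "the" appears in the document, else 0."""
    return 1 if "the" in tokenize(document) else 0

# Toy document and vocabulary, assumed for illustration only.
doc = "the cat sat on the mat"
vocab = ["cat", "dog", "mat", "the"]

print(tokenize(doc))             # sequence-of-tokens representation
print(bag_of_words(doc, vocab))  # bag-of-words count vector
print(contains_the(doc))         # single scalar feature
```

The same document yields three quite different inputs for a learner, which is why the choice depends on both the domain and the algorithm.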

To guard against a poor choice, people often use feature selection as a second preprocessing step. Then you can put in a bunch of possibly irrelevant features and hope that the feature selection algorithm filters all but the best ones away.

But standard feature selection is limited in several ways. (a) First, feature selection can only remove features; it can't find new ones. This is undesirable both on philosophical grounds and on practical grounds. Philosophically, it's ugly. Practically, (i) as already mentioned, it can be difficult to find good features to add. For example, suppose we're representing each document as a fixed-length vector; once we've added, say, word counts, part-of-speech tag counts, and perhaps a few indicators for key phrases or constructions, it's not clear what else to add. (ii) If we add other features just by making them up as we go along, the problem is that the number of possibilities is unbounded, and grows exponentially with the size of the word combinations. For example, in addition to a dimension for each word, our vectors could also have a dimension for each pair of words, but with a vocabulary of $n$ words this requires on the order of $n^2$ additional dimensions; adding a dimension for each triple of words requires on the order of $n^3$ additional dimensions, etc. This approach is clearly infeasible.
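To put rough numbers on this growth, the sketch below counts how many dimensions are needed for one feature per unordered group of k distinct words. The vocabulary size of 10,000 is an assumed, illustrative figure, and the function name is hypothetical; counting ordered groups instead would only make the totals larger.

```python
from math import comb

def word_group_feature_count(vocab_size, k):
    """Number of unordered groups of k distinct words from the vocabulary."""
    return comb(vocab_size, k)

vocab_size = 10_000  # assumed vocabulary size, for illustration only
for k in (1, 2, 3):
    print(k, word_group_feature_count(vocab_size, k))
```

Under that assumption, pairs already need about 50 million dimensions and triples about 1.7 × 10^11, far more than the number of documents in most datasets.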

(b) Second, standard feature selection cannot change the fundamental representation of each datum. In the previous example, if it turns out that a lot of information is contained in the triples, perhaps we'd be better off representing each document as a stream rather than as a vector—but standard feature selection can't help us decide that. (In fact, we'd probably need an entirely different learning algorithm.)

(c) Third, standard feature selection only happens once.

My idea is to learn in stages, starting from the "original" representation of the data. We have some target concept that we want to learn. Somehow—I'm still not exactly sure what the best way to do this is; that's the research task—we build up a mental model of the training data gradually, and the hope is that at the end the target concept will be easy to learn.

A human analogue motivates this idea. We don't try to learn real analysis before learning basic arithmetic, then algebra, and then calculus. We don't try to play the Art of Fugue before learning how to play Anna Magdalena's Notebook and then the Well-Tempered Clavier. And, as Nabokov's quotation suggests, we don't try to write a novel before learning how to understand a language, speak a language, and write simple things in that language.

Schedule

02-18-2009

I have been discouraged lately by my lack of progress, both at learning what is known and at research. I've been working really hard (except for the last week), but it seems I have less to show for it than I'd like.

I believe that my current strategy has at least three weaknesses:

  1. I have not been studying broadly enough. Ph.D. students are supposed to drill deep holes; but when drilling a hole, if you start out too narrowly, your bit may stick. I think this has happened to me.
  2. I've been working toward goals that are too short term. This is necessary to a certain extent—I absolutely must publish something, preferably a couple of things, before next fall in order to avoid disaster, and there are a ton of deadlines right now for summer conferences. But I need to be realistic about what I can accomplish and then work steadily to achieve it.
  3. I have not been exercising enough, and this affects my mental acuity. This whole year the pool hours have been 6:30-9pm. Last year, and last summer, the pool was open 5-7, and I swam regularly. In January it was also open 5-7, and I again swam regularly. But 6:30 is too late for me, so I haven't been exercising at all.
A fourth weakness, that I am a bad writer, will (I hope!) be ameliorated by this blog.

I intend to address weaknesses (1) and (3) by going to the gym every day at 5pm. Instead of swimming I'll do one of the unpleasant aerobics machines (ad 3), but I'll read a book while I'm doing it (ad 1). A couple of days ago I started reading a new biography of Chagall, which is quite interesting. At Carleton I used this technique for my last 1.5 years with good success.

I'm still not sure how to address weakness (2). I feel a lot of pressure, and I do need to be ambitious. It's not easy to strike the right balance.

Minimax review

02-17-2009

Setting: Zero-sum, two-player, perfect-knowledge game.

Aim: Find best next move.

Key intuition: Suppose we have two moves available, $a$ and $b$, and that

  • if we take $a$, our opponent can choose between a terrible move (for him) and an excellent one, while
  • if we take $b$, our opponent can choose between two mediocre moves.
We might take move $a$ and hope that our opponent makes a mistake, but against a skilled opponent $a$ is no good, because it allows the opponent an excellent move; $b$ is better because the best our opponent can get is a mediocre move.

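Formulation: We want to find the optimal move from state $s_t$, where $t$ counts the moves made so far. We play when $t$ is even, and the opponent plays when $t$ is odd. The set of possible moves from $s_t$ is denoted $M(s_t)$, and $m(s_t)$ is the state reached by making move $m$. We want to assign a cost to each possible move. When there are no possible moves, i.e. $M(s_t)$ is empty, we assign a cost $c(s_t)$ to the current state based on the rules of the game. Otherwise, when moves from $s_t$ exist, the cost is defined using the minimax principle as follows:

\[
c(s_t) =
\begin{cases}
\min_{m \in M(s_t)} c(m(s_t)) & \text{if } t \text{ is even,} \\
\max_{m \in M(s_t)} c(m(s_t)) & \text{if } t \text{ is odd.}
\end{cases}
\]

Then the optimal move for us at $s_t$ (with $t$ even) is

\[
m^* = \arg\min_{m \in M(s_t)} c(m(s_t)).
\]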
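The recursion can be sketched in Python roughly as follows. This is an illustrative sketch, not code from Russell and Norvig: the function names, the callback-style interface (`moves`, `apply_move`, `terminal_cost`), and the toy game tree are all assumptions, as is the convention that we minimize cost on even plies while the opponent maximizes it on odd plies. The toy tree mirrors the key intuition above, with one risky move and one safe move.

```python
def minimax_cost(state, moves, apply_move, terminal_cost, ply=0):
    """Cost of `state` under the minimax principle.

    We move on even plies and minimize cost; the opponent moves on
    odd plies and maximizes it.
    """
    available = moves(state)
    if not available:
        # No moves left: the rules of the game fix the cost.
        return terminal_cost(state)
    child_costs = [
        minimax_cost(apply_move(state, m), moves, apply_move, terminal_cost, ply + 1)
        for m in available
    ]
    return min(child_costs) if ply % 2 == 0 else max(child_costs)

def best_move(state, moves, apply_move, terminal_cost):
    """Our best move from `state` (ply 0); assumes at least one move exists."""
    return min(
        moves(state),
        key=lambda m: minimax_cost(
            apply_move(state, m), moves, apply_move, terminal_cost, ply=1
        ),
    )

# Hypothetical toy game: a dict maps move names to successor states,
# and a bare number is a terminal state whose value is our cost.
tree = {
    "a": {"blunder": 0, "excellent": 10},    # opponent can hurt us badly
    "b": {"mediocre1": 5, "mediocre2": 6},   # opponent has only mediocre replies
}

def moves(state):
    return list(state) if isinstance(state, dict) else []

def apply_move(state, move):
    return state[move]

def terminal_cost(state):
    return state

print(best_move(tree, moves, apply_move, terminal_cost))
```

On this toy tree the search prefers the safe move, because the worst case after the risky move costs us more. The sketch explores the whole tree, so it is only practical for very small games.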

Reference: Russell and Norvig. (I read it some 12 hours ago, though.)

HW in the music library

02-16-2009

I spent about 9 hours yesterday in the music library working on my 809 homework. Three observations:

  1. I recognized a substantial majority of the people who came in yesterday, and I'm pretty sure most of them recognized me, too. But—and this is the weird thing—apart from the librarians, I've never talked to any of them! I guess this isn't particularly uncommon; it's happened to me a few other places, too, e.g. Thieves, the Memorial infolab, and the HSLC. The proportion in this case is striking.
  2. There's this one grad-student librarian whom I really like but, again, have barely talked to, who was working last night.
  3. Next time I need to finish the 809 homework earlier. Bumping up against deadlines is too stressful.

Happy

02-15-2009

Last night I played with the Hackberries for the first time in a long time. It was really fun. Nina and Patrick hosted, in the common area of their new cohousing community. Turnout was great, partly but not only because a lot of the cohousing people were there.

After music, (a) I talked to a college friend of Nina's who was visiting from Washington, D.C. (b) I talked to this new Hackberries guy who is also a programmer at a local company; I became extremely enthusiastic talking about Scheme and Haskell, neither of which he had much experience with. (c) I played a new (to me) game called Saboteur. (d) I went on a tour of some of the rest of the building.

I didn't go home until almost midnight. Afterward I felt extremely happy. I think this was largely due to the presence of Nina's extremely appealing friend. It also makes me feel good to play music, especially with so many other people. I need to do more of that.

First post

02-14-2009

I decided to start a blog again! In an act of hubris, I wrote my own blogging software this afternoon. My program satisfies three criteria that I haven't found, together, in any other package:

  • The html and feed are generated statically from a single text file that I can edit by hand. I like vim much better than web forms, especially ones that use TinyMCE or the like, and that don't automatically save every few seconds.
  • Each entry can display math formulae using LaTeX. It can even include displayed equations. The blog's non-math markup is also LaTeX-like; for example each post is inside a
    \begin{post}
    and
    \end{post}
    pair, and this list is an itemize. I guess I like backslashes better than angle brackets.
  • Because the blog is just a text file, I can use my favorite version control system, Mercurial, to track (and back up) the history of the blog.
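As a rough illustration of the single-text-file idea, here is a minimal Python sketch that pulls the posts out of such a file. It is not the actual blog software: the regular expression, the function name `extract_posts`, and the sample text are assumptions, and only the `\begin{post}` and `\end{post}` delimiters come from the description above.

```python
import re

# Matches everything between \begin{post} and \end{post}, across lines.
POST_PATTERN = re.compile(r"\\begin\{post\}(.*?)\\end\{post\}", re.DOTALL)

def extract_posts(source):
    """Return the body of each post in the source text, in file order."""
    return [body.strip() for body in POST_PATTERN.findall(source)]

# Hypothetical two-post source file, for illustration only.
sample = r"""
\begin{post}
First post
\end{post}
\begin{post}
Happy
\end{post}
"""

print(extract_posts(sample))
```

A static generator could then turn each extracted body into HTML and a feed entry, leaving the text file as the only thing that needs editing or versioning.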

I intend to use this blog for both personal posts and research posts, with three aims:

  1. To explore and record ideas.
    • Research posts: Right now I have lots of paper lying everywhere in my desk and room, and I have other notes in a variety of random files. It'd be nicer to have something that's legible and grepable and central.
    • Personal posts: I also frequently think things that I at least find amusing or cute or clever, and I'd like to share them.
  2. By writing down these things, I hope I'll also clear and exercise my mind, and therefore have more and better ideas.
  3. Finally, right now I frequently find it difficult to write, and what I produce often makes me cringe. I'd like to become a better writer, and practice will help.