2007-11-18

Symmetry

So there is this surfer dude (as the popular media like to call him) who has a preprint on how the Theory of Everything can be based on the E8 Lie algebra. I have little clue about theoretical physics, so I have nothing to say about the actual physics idea, but in the comment section of a blog post discussing the preprint, someone says:
"Beautiful" An off-topic serious question: One sees this this a lot in the scientific literature and I would like to ask why. Evolution built this admiration for symmetric features into us for it's purposes. Why do scientists persist in assuming that a criteria for animals to choose mates is appropriate for choosing theories.
Which I think is an excellent question! I have no answer, but I have the impression that beauty as a criterion for science is much more common in theoretical physics than in something more down to earth, such as biology. Indeed, one of the most beautiful ideas in molecular biology, the comma-free code for DNA, happened to be wrong.

2007-11-05

MT helping real people in the real world, now!

When the Computational Linguistics department at one of the German universities needs to bring over some candidates for PhD positions for interviews, apparently they get accommodation in a hotel which displays the bilingual information depicted above. This hotel is a very appropriate choice as it leads by example in embracing language technologies to enhance their guests' experience while keeping costs low. Great motivation for future researchers!

2007-08-22

ESSLLI 2007

So ESSLLI this year was at home, in Dublin. Hardly the best destination - rainy, cold, expensive and totally devoid of tourist attractions. But maybe it makes people concentrate more on the scientific content ;-) I had more fun last year in Málaga, but it was still nice to go to ESSLLI once more, and Trinity is definitely a nice central location (although a bit overrun by tourists).

Most of the courses I attended were decent, though none was spectacular. The best one was definitely the one by Joakim Nivre and Ryan McDonald on data-driven dependency parsing: easy to follow, clearly structured, and taught by people who really know what they are talking about.

I hoped the machine-learning course would be interesting but the first day was quite disappointing and I didn't go during the rest of the week.

Other than that, one of the invited talks was on modeling language acquisition. It was a strange affair: the speaker (Ronan Reilly) showed how a supervised learning algorithm, namely a neural network with error backpropagation, can accurately learn a toy grammar derived from a corpus of child-directed speech, using as training material sentences generated with this grammar. Not terribly surprising, is it? The puzzling fact was that the speaker proposed this as the model of first language acquisition. I had to leave early so I didn't get to ask the obvious question: where do kids get their error backpropagation? I thought it was more or less generally acknowledged that there is little feedback on errors, and that children mostly ignore whatever feedback there is. So a strongly supervised learning model doesn't say anything about first language acquisition. Or is it me who is really confused?

I don't know if I'll be going to another ESSLLI: next summer I won't be a student anymore (hopefully!) so it might have been my last one. But who knows; it's definitely quite an addictive event...

2007-06-29

Chomsky the experimentalist

I just came across this Chomsky quote:
Corpus linguistics doesn't mean anything. It’s like saying suppose a physicist decides, suppose physics and chemistry decide that instead of relying on experiments, what they’re going to do is take videotapes of things happening in the world and they’ll collect huge videotapes of everything that’s happening and from that maybe they’ll come up with some generalizations or insights.

I'm all for experimental science, but I did find the way Chomsky appeals to its authority a bit amusing. From what I've seen, a typical "experiment" in linguistics means the linguist coming up with a more or less convoluted sentence and introspecting to decide whether it's grammatical or not. You can easily guess what proper experimentalists in natural or social sciences would think of such a methodology. (By this I don't mean to say that grammaticality judgments are totally useless per se, just that the methods commonly used to obtain them don't meet the standards of proper experimental research.)

As to the use of corpus data in research on language, Chomsky's dismissal is hard to make sense of. So OK, physicists don't typically record videos of things happening out there, fair enough. But even someone afflicted with acute physics envy should be able to see that there are respectable and highly successful branches of science which cannot and do not rely on experiments to obtain their data. In effect, scientists in those areas do analyze huge "videotapes" of stuff that happened in order to make generalizations and come up with theories about their domain of study. Obvious examples are paleontology, evolutionary biology or cosmology. You can't rerun the Big Bang to see what happens when you fiddle with such and such parameter. All we can ever hope to do is to watch those videos in the form of red-shifted light from receding galaxies, background radiation etc. And yet, we know way more about the history of the universe than about the workings of human language.

2007-04-13

Norvig's spelling corrector in Haskell

A pretty literal translation of http://www.norvig.com/spell-correct.html into Haskell.
*Main> sc <- getCorrector 
*Main> sc "speling"
spelling
I didn't bother to use ByteString so it's slow.
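import Prelude hiding (words)
import Data.Char
import Data.Ord
import Data.Maybe
import qualified Data.Map.Strict as Map  -- strict map: the old insertWith' is not usable in current containers
import qualified Data.Set as Set
import qualified Data.List as List

-- lowercase the text and split it on anything that is not a letter
words :: String -> [String]
words = List.words . map (toLower . (\c -> if isAlpha c then c else ' '))

-- count how often each word occurs in the training text
train :: [String] -> Map.Map String Int
train = List.foldl' (\dict f -> Map.insertWith (+) f 1 dict) Map.empty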

edits1 word =
    let n = length word in
    Set.fromList $    [take i word ++ drop (i+1) word | i <- [0..n-1]]                                  -- deletion
                   ++ [take i word ++ [word!!(i+1)] ++ [word!!i] ++ drop (i+2) word | i <- [0..n-2]]    -- transposition
                   ++ [take i word ++ [c] ++ drop (i+1) word | i <- [0..n-1] , c <- ['a'..'z'] ]          -- alteration
                   ++ [take i word ++ [c] ++ drop i word | i <- [0..n] , c <- ['a'..'z'] ]                -- insertion (n+1 positions, including the end)

known_edits2 nwords word  = Set.fromList [e2 | e1 <- Set.elems (edits1 word)
                                             , e2 <- Set.elems (edits1 e1) 
                                             , e2 `Map.member` nwords ]

known nwords = Set.intersection (Map.keysSet nwords)

correct nwords word = let candidates = fromJust $ List.find (not . Set.null) [ known nwords (Set.singleton word)
                                                                             , known nwords (edits1 word) 
                                                                             , known_edits2 nwords word
                                                                             , Set.singleton word ]
               in List.maximumBy (comparing (\c -> Map.findWithDefault 1 c nwords)) (Set.elems candidates)

getCorrector = do
  nWORDS <- fmap (train . words) (readFile "big.txt")
  return  (putStrLn . correct nWORDS)
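
If you want to poke at the idea without downloading big.txt, something like the following might do. It is only a sketch under my own assumptions, not the code above: the names trainText and correctPure and the one-sentence corpus in the usage note are made up for illustration, and the edit generation is written with splitAt instead of take/drop. It returns the correction as a plain String rather than printing it.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (foldl', maximumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Word-frequency table built straight from raw text
-- (hypothetical helper, same idea as `train . words`).
trainText :: String -> Map.Map String Int
trainText = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty . tokens
  where tokens = words . map (\c -> if isAlpha c then toLower c else ' ')

-- All strings one edit away: deletion, transposition, alteration, insertion.
edits1 :: String -> Set.Set String
edits1 w = Set.fromList $
     [a ++ drop 1 b     | (a, b) <- splits, not (null b)]
  ++ [a ++ y : x : rest | (a, x : y : rest) <- splits]
  ++ [a ++ c : drop 1 b | (a, b) <- splits, not (null b), c <- ['a' .. 'z']]
  ++ [a ++ c : b        | (a, b) <- splits, c <- ['a' .. 'z']]
  where splits = [splitAt i w | i <- [0 .. length w]]

-- Pure corrector (hypothetical name): prefer the word itself if known,
-- then known words at edit distance 1, then at distance 2; otherwise
-- give the input back unchanged.
correctPure :: Map.Map String Int -> String -> String
correctPure freq w =
  case filter (not . Set.null) [known (Set.singleton w), known e1, known e2] of
    (cands : _) -> maximumBy (comparing (\c -> Map.findWithDefault 0 c freq))
                             (Set.toList cands)
    []          -> w
  where
    known = Set.filter (`Map.member` freq)
    e1    = edits1 w
    e2    = Set.unions (map edits1 (Set.toList e1))
```

With a toy corpus it behaves like this:
*Main> let freq = trainText "Spelling is hard. Spelling matters, and the spell checker helps."
*Main> correctPure freq "speling"
"spelling"
*Main> correctPure freq "teh"
"the"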

2006-10-24

Why smart people believe crazy stuff

It'd always been my impression that intelligence is rather strongly correlated with disbelief in the supernatural (whether in the form of religious belief or plain old superstition). I still think it is correlated but maybe not as strongly as I thought: I've recently met some smart people who believe pretty crazy stories.

And I don't just mean the odd mystically-inclined physicist who thinks the laws of nature are God or some such metaphor. I mean intelligent, educated, somewhat scientifically literate people believing pretty literally in a soul separate from the brain which survives the body's death and goes on to live in some alternative reality. Or some similar off-the-wall story: you get the idea.

There are probably many factors which make this particular mental setup possible, but let me speculate about one that may be among the most important. I actually hinted at the culprit above: "somewhat scientifically literate". I think basic secondary-school scientific literacy may not be sufficient to realize how incompatible ideas such as astrology or an immortal soul are with the scientific outlook.

It doesn't much matter if you know the details of meiosis or are familiar with Heisenberg's uncertainty principle as disconnected facts. You need to be aware of the scientific method as a means to build successively more adequate approximations of reality. You need to know that the whole of science hangs together, that you are not allowed to just pick and choose: accept medicine because it seems to work, but reject evolution by natural selection because it makes you feel like the world is a cold and unpleasant place.

There is also the fact that you have to be willing and able to ask the right questions and follow the answers to their conclusions. If you think science is mostly right but there is something more out there, you have to be willing to check whether this something is actually a possible extension of already established scientific facts. For example, you may think physics and biology are correct as such, but they just fail to mention the immortal soul, so you are free to believe in it. And you want to, because it gives you a warm and fuzzy feeling and makes you less afraid of death. Well, not quite.

There are all the well-known problems with dualism. To dumb it down: if the soul is supposed to control the body then it has to interact with it somehow. If it is material then it's just some part of the brain which science can study and explain, and it's unlikely to survive the death of the rest of the brain. If it's immaterial, then how can it have a causal effect on a material object like the brain? Energy doesn't come out of nowhere, and so on.

There are more prosaic implausibilities and inconsistencies, too. Do only humans have a soul? Which of the extinct hominins, if any, had it too? At which point in the evolution of our species did we acquire it? At which point during our embryonic development do we get it? As soon as conception? If so, then most soulful beings don't even get to be born, as most pregnancies end in an unnoticed early spontaneous abortion. You can probably try to speculate about answers to these issues, but I bet when you're finished you'd be left with a kind of soul that you'd no longer want quite so much to believe in.

And some smart people who don't know much outside high-school science plus their own narrow area of expertise might just not have come across enough ideas that would make them want to ask, and be able to answer, those kinds of questions. Some combination of philosophy of science, cognitive science, neuroscience, evolutionary biology and embryology is the ingredient that might be missing. My bet is that many people, smart or otherwise, have only a very vague idea or none at all about those areas. If they knew more, I am willing to risk a guess that they would be a bit more picky about the crazy stuff they choose to believe in.

I guess some soul believers still wouldn't go all the way: probably some would be tempted by the Kurzweil-style singularity stuff as a last recourse -- I've only just noticed that a relatively smart acquaintance of mine is into it. But no matter, it's progress anyway...

2006-05-04

Linguistics and Ruby Programming

Apparently customers who bought this item (Far from the Madding Gerund and Other Dispatches from Language Log) also bought Agile Web Development with Rails by Dave Thomas, for some strange reason. The correlation must be pretty strong, I guess, as Amazon actually suggests buying the two together.

I'll probably buy the first one. And maybe the second too, if for some reason I get kicked out of the PhD program and have to take on some crappy job as a script monkey web developer.