Another Five-Finger Exercise in Haskell
This one’s a command-line utility that reads a text file on stdin and outputs a file listing every word in the input along with the frequency with which it appears.
My favourite line in the code is this:
wordTotals = toList . foldr update Nothing . nwords
nwords takes a string and returns a list of normalized words (lowercase, with no punctuation apart from apostrophes). The foldr, which works more or less like Python’s reduce except that it folds from the right, builds the binary tree that holds the word dictionary, starting from Nothing as its initial value. This structure is then passed to toList, which traverses it depth-first to create a list.
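The post doesn't show update or toList, so here is a minimal sketch of what they might look like. The Node and Tree types, and the bodies of update and toList, are assumptions made for illustration; the only thing taken from the original line is that the fold starts from Nothing, which suggests the empty tree is represented that way.

```haskell
import Data.Char (isAlphaNum, toLower)

-- Hypothetical tree type (not from the original code): Nothing is the
-- empty tree, Just a node holding a word, its count, and two subtrees.
data Node = Node Tree String Int Tree
type Tree = Maybe Node

-- Insert a word into the tree, or bump its count if it is already there.
update :: String -> Tree -> Tree
update w Nothing = Just (Node Nothing w 1 Nothing)
update w (Just (Node l k n r)) =
  case compare w k of
    LT -> Just (Node (update w l) k n r)
    GT -> Just (Node l k n (update w r))
    EQ -> Just (Node l k (n + 1) r)

-- Depth-first, in-order traversal, so the words come out sorted.
toList :: Tree -> [(String, Int)]
toList Nothing = []
toList (Just (Node l k n r)) = toList l ++ [(k, n)] ++ toList r

-- nwords as described in the post, repeated so the sketch stands alone.
nwords :: String -> [String]
nwords = filter (not . null) . map normalize . words
  where normalize = map toLower . filter (\c -> isAlphaNum c || c == '\'')

wordTotals :: String -> [(String, Int)]
wordTotals = toList . foldr update Nothing . nwords

main :: IO ()
main = mapM_ print (wordTotals "the cat and the hat")
```

With this particular tree, the in-order traversal prints the pairs alphabetically: and, cat and hat with a count of 1, then the with a count of 2.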
It may help to look at the definition of nwords, which illustrates this a little:
import Data.Char (isAlphaNum, toLower)

isLetter c = isAlphaNum c || c == '\''
normalize = map toLower . filter isLetter
isWord [] = False
isWord _ = True
nwords = filter isWord . map normalize . words
normalize takes each character in a word, drops it if it isn’t alphanumeric or an apostrophe, and lowercases whatever remains. map normalize does this for every word returned by words. A token like !!! is turned into an empty string, which filter isWord then removes.
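To see each stage at work, here is a small runnable version that prints the intermediate results. The sample input is made up, and isWord is written as not . null, which behaves the same as the two-clause definition.

```haskell
import Data.Char (isAlphaNum, toLower)

isLetter :: Char -> Bool
isLetter c = isAlphaNum c || c == '\''

normalize :: String -> String
normalize = map toLower . filter isLetter

isWord :: String -> Bool
isWord = not . null

nwords :: String -> [String]
nwords = filter isWord . map normalize . words

main :: IO ()
main = do
  let input = "Don't panic !!! It's FINE."  -- made-up sample text
  print (words input)                  -- ["Don't","panic","!!!","It's","FINE."]
  print (map normalize (words input))  -- ["don't","panic","","it's","fine"]
  print (nwords input)                 -- ["don't","panic","it's","fine"]
```

The empty string in the second line of output is what is left of !!! after normalize, and it is gone by the third.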
It’s really very similar to stringing together Unix commands in bash.
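One difference from a shell pipeline is that . composes right to left, so the data flows in the opposite direction from the way the line reads. As an illustration only (this is not how the original code is written), the >>> operator from Control.Arrow composes left to right, which makes the resemblance to a pipe more literal. The name nwordsPipe is hypothetical.

```haskell
import Control.Arrow ((>>>))
import Data.Char (isAlphaNum, toLower)

-- Keep alphanumerics and apostrophes, then lowercase what is left.
normalize :: String -> String
normalize = filter (\c -> isAlphaNum c || c == '\'') >>> map toLower

-- The same stages as nwords, written in the order the data flows,
-- much like: words | map normalize | filter non-empty
nwordsPipe :: String -> [String]
nwordsPipe = words >>> map normalize >>> filter (not . null)

main :: IO ()
main = print (nwordsPipe "Stringing, together; COMMANDS !!!")
-- prints ["stringing","together","commands"]
```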
