[Corpora-List] What is best for text processing?

Grzegorz Chrupała grzegorz at pithekos.net
Mon Mar 17 17:30:19 UTC 2008


On Mon, Mar 17, 2008 at 2:50 PM, Eric Atwell <eric at comp.leeds.ac.uk> wrote:
>
>  My colleague David Duke suggested an even simpler HASKELL solution:
>
>  import Data.List
>  main = return . unlines . filter ("ing" `isSuffixOf`) . words =<<
>  getContents

This won't print the words to the console, though ;-)  "return" only
wraps the resulting string in the IO action without writing it
anywhere, so you need "putStr" there instead of "return".

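For reference, a minimal sketch of the corrected program, assuming the
text arrives on standard input. The helper name ingWords is introduced
here for illustration only and is not part of David's version:

```haskell
import Data.List (isSuffixOf)

-- Keep only the whitespace-separated words that end in "ing".
-- (ingWords is a hypothetical helper name, not from the original post.)
ingWords :: String -> [String]
ingWords = filter ("ing" `isSuffixOf`) . words

-- Read all of stdin, filter it, and write one matching word per line.
-- putStr performs the output; return would only wrap the string in IO.
main :: IO ()
main = putStr . unlines . ingWords =<< getContents
```

The same thing could probably be written with "interact (unlines .
ingWords)", which hides the getContents/putStr pair.
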
>
>  Others could probably improve on the Prolog, Java, C etc versions
>  in terms of number of lines of code ... but I think the more important point
>  is the Python version is easier for beginners to pick up and understand,
>  and also to return to later and still understand.
>

I used the slightly more verbose list-comprehension version precisely
to support your point about being clear and easy to understand, not
just compact.

Note how similar [ w | w <- words text , "ing" `isSuffixOf` w ] is to
the standard mathematical set notation that we all know and love from
high school math:

{ w | w ∈ words(text) ∧ "ing" isSuffixOf w }

Actually that particular feature of Haskell, i.e. list comprehension
notation, was borrowed into Python (and many other programming
languages), probably because of its expressiveness and readability:
http://en.wikipedia.org/wiki/List_comprehension#In_Python
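
To make the comparison concrete, here is a sketch of the
list-comprehension form as a complete program. It assumes input on
standard input, and the function name ingWords is a hypothetical name
chosen for this example:

```haskell
import Data.List (isSuffixOf)

-- Read it like the set notation: every w drawn from the words of the
-- text such that "ing" is a suffix of w.
-- (ingWords is a hypothetical name chosen for this example.)
ingWords :: String -> [String]
ingWords text = [ w | w <- words text, "ing" `isSuffixOf` w ]

-- Read stdin and print each matching word on its own line.
main :: IO ()
main = do
  text <- getContents
  putStr (unlines (ingWords text))
```

Unlike a set, the list keeps the words in their original order and
keeps duplicates, so a word that occurs twice in the text is printed
twice.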

Best,
--
Grzegorz