[Corpora-List] Seeking feedback on a concordancer for Mac OS X
Yasuhiro Imao
yimao at humnet.ucla.edu
Mon Sep 28 17:51:02 UTC 2009
Dear list members,
Sorry, this is a long message.
Last year, I posted here that I was developing CasualConc, a
concordancer for Mac OS X (Leopard or later). I've improved it based
on some feedback and bug reports, and now I'm wrapping it up for
version 1.0 and seeking feedback on it. It is a simple Mac-native
(Cocoa interface) application for not-so-serious use (including ESL
studying/teaching, preliminary study), though I'm planning to use it
for my own study. CasualConc is a free beta and will remain free
after version 1.0.
What it does:
- KWIC concordancing (sort by L5-R5 words, with limited context word
search)
- word clusters (2-5; search word can be more than one word)
- collocation (L5-R5)
- word/n-gram (2-5) list (creating n-gram list can be slow)
- basic collocation/keyness stats calculation (including a simple 2x2
contingency table calculator)
- wildcard/regular expression search
- read txt, rtf, doc, pdf, odt
- handle multiple text encodings (default is UTF-8) (it might be
possible to add more upon request)
- file mode for thorough analysis and database mode for faster
repeated searches
- East Asian language support (the labels for the modes say
'Japanese'; support is somewhat limited and not fully tested, but it
handles texts with or without spaces between words)
- export results in CSV format
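For anyone new to concordancers, the KWIC feature in the list above can be illustrated with a minimal Python sketch. This is only an illustration and not CasualConc's actual code (CasualConc is a Cocoa application); the function names `kwic` and `sort_by_r1`, the simple `\w+` tokenization, and lowercasing are all assumptions made for the example.

```python
import re

def kwic(text, keyword, span=5):
    """Return (left context, node word, right context) for each hit.

    Assumption: tokens are runs of word characters, compared in lowercase.
    """
    tokens = re.findall(r"\w+", text.lower())
    node = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]      # up to `span` words to the left (L5..L1)
            right = tokens[i + 1:i + 1 + span]     # up to `span` words to the right (R1..R5)
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits

def sort_by_r1(hits):
    """Sort concordance lines by the first word to the right of the node (R1)."""
    return sorted(hits, key=lambda h: h[2].split()[:1])

if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog sat under the table."
    for left, node, right in sort_by_r1(kwic(sample, "sat", span=3)):
        print(f"{left:>20}  [{node}]  {right}")
```

Sorting on other positions (L1, R2, and so on) would work the same way, just with a different sort key.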
What it does not do:
- handle tagged corpora (CasualConc can simply ignore certain formats
of tags)
- handle XML (though it has a very very limited XML mode as an
experiment)
- and more...
The first two require more fundamental changes, so they will be on the
to-do list for future versions.
More information can be found at
http://sites.google.com/site/casualconc/
(or http://sites.google.com/site/casualconcj/ -- Japanese site)
and CasualConc can be downloaded from the site.
Now, I'd like to ask for help from Mac users on this list. I mainly
use/test CasualConc with a plain-text English corpus, so I'd
like to hear from people who can test this with languages other than
English and Japanese (more feedback on these two languages is, of
course, welcome). I got some feedback from people who use this with
Spanish, Italian, and Greek, but more is better.
So what I'd like to hear about is:
- how well CasualConc works with languages other than English (esp.
Korean and Chinese)
- any suggestion about implementation of tagged text/XML handling
- usefulness/accuracy of the stats calculations (all of them have been
tested except for Fisher's exact test)
- usability of the application (easy enough for non-tech savvy people?)
- any bugs?
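For those who would like to cross-check the stats output, here is a minimal Python sketch of one common statistic computed from a 2x2 contingency table, the log-likelihood ratio (G2). It is not necessarily the exact formula CasualConc uses; the function name `log_likelihood_2x2` and the table layout (rows: target word vs. all other words; columns: corpus 1 vs. corpus 2) are assumptions made for the example.

```python
import math

def log_likelihood_2x2(a, b, c, d):
    """Log-likelihood ratio (G2) for a 2x2 contingency table.

    Assumed layout:
        a = target word in corpus 1    b = target word in corpus 2
        c = other words in corpus 1    d = other words in corpus 2
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("contingency table is empty")
    observed = [a, b, c, d]
    # Expected count for each cell = row total * column total / grand total
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    total = 0.0
    for o, e in zip(observed, expected):
        if o > 0:  # a cell with zero observations contributes nothing
            total += o * math.log(o / e)
    return 2 * total

if __name__ == "__main__":
    # A word occurring 50 times in 1,000 tokens vs. 10 times in 1,000 tokens
    print(log_likelihood_2x2(50, 10, 950, 990))
```

A value above 3.84 corresponds to p < 0.05 with one degree of freedom, which is the usual threshold for reading such a score.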
In addition to this, I'd like to seek feedback on my other
applications. Last year, when I posted my message here, a couple of
people asked for a parallel concordancer. So I tried to answer their
requests. I now have two parallel concordancers, also for Mac OS X 10.5
or later.
CasualPConc
http://sites.google.com/site/casualconc/utility-programs/casualpconc
CasualMultiPConc
http://sites.google.com/site/casualconc/utility-programs/casualmultipconc
The latter is based on the former. CasualPConc can only handle two
parallel corpora and CasualMultiPConc can handle 2-5 parallel corpora.
I might make these into one application later, though. I've focused
on the KWIC search process and file management, so both applications
have only limited features at the moment. These are also freeware.
What I'd like to hear about is:
- type of text format (tagged/xml) it should be able to handle
- format of exported file (information to be included, how text should
be formatted, etc.)
- any other tools to include?
- any other necessary features?
- easy to use?
- any bugs?
I'd also welcome any other feedback or suggestions on these applications.
Because I have no experience in handling parallel corpora and no plan
to handle them in the foreseeable future, the development of these
applications depends entirely on your feedback and suggestions.
I have a few other language-related applications on the site. If you
are interested, please take a look at them.
Thank you for reading this long message. I hope to hear from you soon
and I hope these applications are useful to Mac users. Please
send your feedback, suggestions, and bug reports to the email address on
this message or the address on the site (under Contact).
Best,
Yasu Imao
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora