[Corpora-List] Seeking feedback on a concordancer for Mac OS X

Yasuhiro Imao yimao at humnet.ucla.edu
Mon Sep 28 17:51:02 UTC 2009


Dear list members,

Sorry, this is a long message.

Last year, I posted here that I was developing CasualConc, a  
concordancer for Mac OS X (Leopard or later).  I've improved it based  
on some feedback and bug reports, and now I'm wrapping it up for  
version 1.0 and seeking feedback on it.  It is a simple Mac-native  
(Cocoa interface) application for not-so-serious use (including ESL  
studying/teaching, preliminary study), though I'm planning to use it  
for my own study.  CasualConc is currently a free beta and will remain
free after version 1.0.

What it does:
- kwic concordancing (sort by L5-R5 words, with limited context word  
search)
- word clusters (2-5; search word can be more than one word)
- collocation (L5-R5)
- word/n-gram (2-5) list (creating n-gram list can be slow)
- basic collocation/keyness stats calculation (including a simple 2x2  
contingency table calculator)
- wildcard/regular expression search
- read txt, rtf, doc, pdf, odt
- handle multiple text encodings (default is UTF-8) (it might be  
possible to add more upon request)
- file mode for thorough analysis and database mode for faster  
repeated searches
- East Asian language support (the mode labels say 'Japanese'; support
is somewhat limited and not fully tested, but it handles texts with or
without spaces between words)
- export results in CSV format
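
For anyone who has not used a concordancer, the KWIC search and L/R
sorting listed above can be pictured with a minimal sketch in Python.
This is not CasualConc's code (CasualConc is a Cocoa application); the
function names kwic and sort_hits, the simple \w+ tokenizer, and the
span parameter are all assumptions made for illustration.

```python
import re


def kwic(text, pattern, span=5):
    """Return (left, keyword, right) tuples for every token matching pattern.

    pattern is a regular expression matched against whole tokens,
    ignoring case. span is the number of context words kept per side.
    """
    # Assumption: a naive tokenizer that keeps runs of word characters.
    tokens = re.findall(r"\w+", text)
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for i, token in enumerate(tokens):
        if regex.fullmatch(token):
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            hits.append((left, token, right))
    return hits


def sort_hits(hits, position=1, side="R"):
    """Sort hits by the word at the given position (1 = nearest) on one side."""
    def key(hit):
        left, _, right = hit
        # Reverse the left context so index 0 is the word nearest the keyword.
        context = right if side == "R" else left[::-1]
        return context[position - 1].lower() if len(context) >= position else ""
    return sorted(hits, key=key)


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog sat on the log."
    for left, word, right in sort_hits(kwic(sample, "s.t", span=3), 1, "L"):
        print(" ".join(left).rjust(20), f"[{word}]", " ".join(right))
```

A tokenizer this simple only works for texts with spaces between words;
texts without spaces would need a proper word segmenter.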

What it does not do:
- handle tagged corpora (CasualConc can simply ignore certain formats
of tags)
- handle XML (though it has a very limited XML mode as an
experiment)
- and more...

These two require more fundamental changes, so they will be on the
to-do list for future versions.

More information can be found at

http://sites.google.com/site/casualconc/
(or http://sites.google.com/site/casualconcj/ -- Japanese site)

and CasualConc can be downloaded from the site.


Now, I'd like to ask for help from Mac users on this list.  I mainly
use/test CasualConc with an English corpus of plain-text files, so I'd
like to hear from people who can test it with languages other than
English and Japanese (more feedback on these two languages is, of
course, welcome).  I have received some feedback from people who use it
with Spanish, Italian, and Greek, but more is better.

So what I'd like to hear about is:
- how well CasualConc works with languages other than English (esp.  
Korean and Chinese)
- any suggestions about implementing tagged text/XML handling
- usefulness/accuracy of the stats calculations (all of them have been
tested except Fisher's Exact test)
- usability of the application (easy enough for non-tech-savvy people?)
- any bugs?
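
For testers who want an independent figure to compare against the
untested Fisher's Exact test result, a minimal sketch in Python follows.
It is not CasualConc's code; the function name and the [[a, b], [c, d]]
table layout are assumptions made for illustration, and it uses the
common two-sided definition that sums the probabilities of all tables no
more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) as the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

This loops over every possible table, so it suits small counts; with
corpus-sized frequencies it would be slow.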


In addition to this, I'd like to seek feedback on my other  
applications.  Last year, when I posted my message here, a couple of  
people asked for a parallel concordancer.  So I tried to answer their  
requests.  I now have two parallel concordancers, also for Mac OS X
10.5 or later.

CasualPConc
http://sites.google.com/site/casualconc/utility-programs/casualpconc

CasualMultiPConc
http://sites.google.com/site/casualconc/utility-programs/casualmultipconc

The latter is based on the former.  CasualPConc can handle only two
parallel corpora, while CasualMultiPConc can handle 2-5 parallel
corpora.  I might merge these into one application later, though.  I've
focused on the kwic search process and file management, so both
applications have only limited features at the moment.  These are also
freeware.

What I'd like to hear about:
- types of text format (tagged/XML) they should be able to handle
- format of the exported file (information to be included, how text
should be formatted, etc.)
- any other tools to include?
- any other necessary features?
- easy to use?
- any bugs?

I'd also welcome any other feedback or suggestions on these
applications.  Because I have no experience in handling parallel
corpora and no plan to handle them in the foreseeable future, the
development of these applications depends entirely on the feedback and
suggestions I receive.

I have a few other language-related applications on the site.  If you  
are interested, please take a look at them.

Thank you for reading this long message.  I hope to hear from you soon,
and I hope these applications are useful to Mac users.  Please send
your feedback, suggestions, and bug reports to the email address on
this message or the address on the site (under Contact).

Best,
Yasu Imao


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


