[Corpora-List] WordSmith

Klaus Guenther klaus at capitalfocus.org
Sat May 28 20:54:37 UTC 2005


Hi Li-chin,

As you probably know, WordSmith accepts UTF-8 encoded texts, so would
that be an option for you? Then you shouldn't have the trouble with odd
characters. Otherwise, you could open your texts in Microsoft Word or
a text editor (e.g., vim or emacs) and convert them to UTF-8 or ASCII.
It would also be possible to write a Perl or PHP script that converts
your texts quickly. (Note that PHP's utf8_decode($string), provided by
the xml extension, converts UTF-8 to ISO-8859-1 rather than to ASCII,
and any character outside that range comes out as "?".)
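
For illustration, here is a minimal sketch of such a conversion script,
written in Python rather than Perl or PHP. It assumes the files were
saved by Notepad in Windows-1252 (a guess; the actual encoding has not
been stated in this thread), and the function names are made up for this
example. It re-encodes a file as UTF-8 and, optionally, replaces curly
quotes and dashes with plain ASCII ones, since those are the characters
that typically turn into "?".

```python
from pathlib import Path

# Typographic characters that often show up as "?" in ASCII-only tools,
# mapped to their closest plain ASCII equivalents.
PUNCTUATION_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (curly apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
})


def decode_bytes(raw, source_encoding="cp1252"):
    """Decode raw bytes; the default encoding is an assumption, not a fact."""
    # errors="replace" turns undecodable bytes into U+FFFD instead of failing.
    return raw.decode(source_encoding, errors="replace")


def straighten_punctuation(text):
    """Replace curly quotes and dashes with plain ASCII characters."""
    return text.translate(PUNCTUATION_MAP)


def convert_file_to_utf8(src, dst, source_encoding="cp1252", ascii_punctuation=False):
    """Read src in source_encoding and write it to dst as UTF-8."""
    text = decode_bytes(Path(src).read_bytes(), source_encoding)
    if ascii_punctuation:
        text = straighten_punctuation(text)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling convert_file_to_utf8("in.txt", "out.txt") on each file would be
enough for a first try (the file names are placeholders); if the result
still looks wrong, the assumed source encoding is the first thing to
check.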

Hth,

Klaus

Klaus Guenther
University of Freiburg, Germany

On 5/28/2005 10:23 PM, sara chen spake the following words:

> Hi Klaus,
>
> Thank you very much for your prompt reply!
>
> Unfortunately, my data are not tagged, and I'm not planning to tag them
> unless it's necessary. If it is, is there any way I can tag my data
> automatically in order to locate those " ? "?
>
> OK, some concordance lines actually don't include any " ? ": the " '
> " in the original data becomes " ? ", so those lines appear in my
> search results. Of course, there are accurate concordance lines too,
> but I need to pick them out manually. Weird!
>
> Li-chin
>
>
>
> Klaus Guenther <klaus at capitalfocus.org> wrote:
>
>     Hi Li-chin,
>
>     What would the character encoding be? And what special characters?
>
>     If your original data is tagged, you should be able to search for
>     the question mark tag. I've not had that much trouble doing what I
>     needed to do. (The only exception was that it doesn't offer
>     regular expression support...)
>
>     Klaus
>
>     On 5/28/2005 8:59 PM, sara chen spake the following words:
>
>>     Hi everyone,
>>
>>     I wonder if any WordSmith expert can help me with a few
>>     questions.
>>
>>     1) How can I avoid producing strange codes when transferring
>>     my original text data into the .txt files that WordSmith
>>     recognizes? I copied the original text into Notepad and then
>>     saved it as a .txt file.
>>
>>     2) How can I search for the question mark "?" in my data with
>>     WordSmith? I used concordance to search for it, but some
>>     strange codes came out.
>>
>>     Many thanks
>>
>>     Li-chin


