[Corpora-List] Linguistic Tree Constructor
Ulrik Petersen
ulrikp at hum.aau.dk
Wed Aug 29 15:08:57 UTC 2007
maxwell at umiacs.umd.edu wrote:
> Hanane wrote:
>
>> My data is a Word file, but I didn't succeed in opening it through ltc
>> ...
>> how can I change the extension of a .doc file to .txt or .gen file? and
>> does it help if I put my file in a format other than Word?
>>
>
> I don't know anything about ltc, but I can't imagine any program other
> than Word being able to read a Word doc file. (Or any comp ling program
> being able to read any other word processing file, for that matter.)
>
> As for making this Word doc file usable, it's not (just) the file
> extension that you want to change, it's the contents of the file.
> Probably you want to do something like 'File | Save As...' to save it in
> some kind of text format. The particular text format you want to use will
> depend on your application; Word can save-as text files where new lines
> happen at each paragraph, or it can break paragraphs into lines at the
> points where you would get an apparent line break on-screen (or in a
> printed version of the document).
>
> Word will also let you choose whether to use LF or CR-LF as your line
> break characters (I would suggest LF, assuming you'll be working with
> Linux programs).
>
> And finally, unless the file is vanilla English, you'll probably need to
> choose the encoding. Again, the correct choice depends on your
> application program, but for most purposes today, Unicode in the UTF-8
> encoding would be appropriate (and if it gives you a choice, don't save it
> with a BOM).
>
> If this doesn't give you a file your program can work with, you may want
> to sit down with someone who understands more about the nature of
> application data and file formats.
>
> Mike Maxwell
> CASL/ U MD
>
Thanks, Dr. Maxwell. As the author of Linguistic Tree Constructor, I
had already sent Hanane a reply off-list, saying much the same thing as
you did, only less detailed. Thanks again.
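
For anyone who ends up with a text file that still has the wrong line
breaks or a BOM after 'Save As...', the clean-up described above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions, not part of LTC: the function name normalize_text
and the default input encoding are hypothetical, and the input must
already be a plain-text export, not the original .doc file.

```python
def normalize_text(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Return the text as UTF-8 bytes with LF line breaks and no BOM.

    normalize_text and source_encoding are illustrative names; pass the
    encoding that was actually chosen when saving from Word.
    """
    text = raw.decode(source_encoding)
    # A leading U+FEFF is what a UTF-8 BOM decodes to; drop it if present.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Turn CR-LF (Windows) and lone CR (old Mac) line breaks into LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")
```

Reading and writing the files in binary mode (open(path, "rb") and
open(path, "wb")) keeps Python from translating the line breaks a
second time.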
Ulrik Petersen
--
Ulrik Petersen, PhD candidate
University of Aalborg, Denmark
http://ulrikp.org -- Homepage
http://emdros.org -- Emdros is a corpus query system
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora