<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0cm;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Tahoma","sans-serif";}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Tahoma","sans-serif";}
span.EmailStyle19
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page Section1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>Dear
all,<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>This
is a general query about comparing language variety corpora <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>following
Asim’s questions (see below).<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>I
am looking for any automated corpus studies and tools <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>for
comparing the varieties of a language, <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>so
that I can use them as a basis for further research<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>on
the development of tools for the systematic and automated<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>comparison
of linguistic varieties on the basis of text corpora.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>So
far I have contacted researchers from several variety corpus projects,<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>e.g.
the ‘International Corpus of English’ (ICE), <o:p></o:p></span></p>
<p class=MsoNormal><span lang=IT style='font-size:10.0pt;font-family:"Arial","sans-serif"'>the
‘Trésor de la Langue Française informatisé’ (TLFi), or<o:p></o:p></span></p>
<p class=MsoNormal><span lang=IT style='font-size:10.0pt;font-family:"Arial","sans-serif"'>the
‘Proyecto para el Estudio Sociolingüístico del Español de España y de
América’ (PRESEEA).<o:p></o:p></span></p>
<p class=MsoNormal><span lang=IT style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>I
was pointed to semi-automatic studies at the lexical level, <o:p></o:p></span></p>
<p class=MsoNormal><span lang=IT style='font-size:10.0pt;font-family:"Arial","sans-serif"'>e.g.
at the Centro de Linguística da Universidade de Lisboa (CLUL).<o:p></o:p></span></p>
<p class=MsoNormal><span lang=IT style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>As
far as I can see now, there have not been any publications <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>on
automated comparison tools for higher levels of linguistic description, <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>e.g.
on collocations, syntactic differences or even on the textual level.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>So
I’d appreciate references to such studies, starting from the lexical
level.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>In
addition, I’d be grateful for any other ideas on contrasting
‘similar’ corpora / data sets,<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>which
might also come from quite different research fields.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>I
will post a summary with the replies I get.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>Thank
you for any hints,<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>Stefanie<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:9.0pt;
font-family:"Arial","sans-serif"'>--<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:9.0pt;
font-family:"Arial","sans-serif"'>Stefanie Anstein<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:9.0pt;
font-family:"Arial","sans-serif"'>Institute for Specialised Communication and
Multilingualism<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:9.0pt;
font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span lang=IT style='font-size:
9.0pt;font-family:"Arial","sans-serif"'>EURAC research<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span lang=IT style='font-size:
9.0pt;font-family:"Arial","sans-serif"'>Viale Druso 1, I-39100 Bolzano<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span lang=DE style='font-size:
9.0pt;font-family:"Arial","sans-serif"'>t +39 0471 055 135<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span lang=DE style='font-size:
9.0pt;font-family:"Arial","sans-serif"'>f +39 0471 055 199<o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span lang=DE style='font-size:
9.0pt;font-family:"Arial","sans-serif"'><a
href="mailto:stefanie.anstein@eurac.edu">stefanie.anstein@eurac.edu</a><o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:9.0pt;
font-family:"Arial","sans-serif"'><a href="http://www.eurac.edu">www.eurac.edu</a> <o:p></o:p></span></p>
<p class=MsoNormal style='text-autospace:none'><span style='font-size:10.0pt;
font-family:"Arial","sans-serif"'><o:p> </o:p></span></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal style='margin-left:36.0pt'><b><span style='font-size:10.0pt;
font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;
font-family:"Tahoma","sans-serif"'> corpora-bounces@uib.no
[mailto:corpora-bounces@uib.no] <b>On Behalf Of </b>Asim<br>
<b>Sent:</b> Tuesday, 27 May, 2008 19:41<br>
<b>To:</b> corpora@uib.no<br>
<b>Subject:</b> [Corpora-List] request for parsing and making the data in a
form tobe used by wordsmith<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:36.0pt'><o:p> </o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Hello<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>I am working on Pakistani
English. I have compiled a 2.1-million-word corpus of written Pakistani
English. It is the first ever corpus of Pakistani English.<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>I want to study the features of
the Pakistani variety of English. Could anyone tell me how to locate them? Any
suggestions would be welcome.<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>I have tagged it and am now trying
to analyse it using both top-down and bottom-up approaches.<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>I want to study verb
particles, and for this I want to parse the data, as I think parsing is the only
way to confirm whether a given word is a preposition or a
particle. If there is any way other than manual study, please tell me and I will
be obliged.<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'><o:p> </o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Another issue: when I use
a demo parser available online, such as an LFG parser, how can I store the results
so that they can be used with WordSmith 4 to locate all the particles in my
data?<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Is there any solution?<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Wish to hear from you soon.<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Regards<o:p></o:p></p>
<p class=MsoNormal style='margin-left:36.0pt'>Asim<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>