<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=ISO-8859-1">
<META content="MSHTML 6.00.6001.18385" name=GENERATOR></HEAD>
<BODY id=role_body style="FONT-SIZE: 10pt; COLOR: #000000; FONT-FAMILY: Arial" bottomMargin=7 leftMargin=7 topMargin=7 rightMargin=7><FONT id=role_document face=Arial color=#000000 size=2>
<DIV>It is interesting to see how a serious question sparks a funny discussion,
which is somewhat carnivalesque even, featuring soldiers, rabbis, schmucks,
dead dogs and what have you. On a more sober note, one might return
to the original question (Can corpora help distinguish a dialect from a language?)
and say yes, to an extent, provided they have been carefully marked up for
social categories. One such well-designed and annotated corpus is the BNC. In
pursuit of the above question, it can be used in two basic ways. In
a top-down approach, one may start with a set of pre-defined dialect
features and run queries for them using relevant restrictions such as speaker
location or speaker dialect. Such a query would show, for example, that the
quotative phrase *I says* is found predominantly in northern dialect areas of
Great Britain (and Ireland). A bottom-up approach would be to start without a
set of pre-defined dialect features and instead examine a set of randomly
selected features using the same set of restrictions as above. Such a query type
might, perhaps, help discover dialect features that researchers have not
yet recognized as such.</DIV>
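<DIV>&nbsp;</DIV>
<DIV>For what it is worth, the two approaches described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the utterances and the region labels are invented toy data, not BNC material, and the function names rate_per_region and skewed_bigrams, as well as the thresholds, are hypothetical. A real study would read the speaker metadata from the corpus mark-up and normalise frequencies more carefully.</DIV>
<PRE>
```python
from collections import Counter, defaultdict

# Hypothetical toy data: (speaker_region, utterance) pairs invented for
# illustration only; they are NOT taken from the BNC.
UTTERANCES = [
    ("north", "so I says to him you what"),
    ("north", "and I says no way"),
    ("north", "I said I would come"),
    ("south", "so I said to him pardon"),
    ("south", "and I said no thanks"),
    ("south", "I says hello"),
]


def rate_per_region(utterances, phrase):
    """Top-down: share of utterances per region that contain a known feature."""
    hits = Counter()
    totals = Counter()
    # Pad with spaces so the phrase only matches on whole-word boundaries.
    needle = " " + phrase.lower() + " "
    for region, text in utterances:
        totals[region] += 1
        if needle in " " + text.lower() + " ":
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}


def skewed_bigrams(utterances, min_share=0.6, min_count=2):
    """Bottom-up: bigrams whose occurrences cluster in a single region."""
    by_bigram = defaultdict(Counter)
    for region, text in utterances:
        words = text.lower().split()
        for pair in zip(words, words[1:]):
            by_bigram[" ".join(pair)][region] += 1
    found = {}
    for bigram, counts in by_bigram.items():
        total = sum(counts.values())
        region, top = counts.most_common(1)[0]
        # Keep bigrams that are frequent enough and concentrated in one region.
        if total >= min_count and top / total >= min_share:
            found[bigram] = region
    return found


if __name__ == "__main__":
    print(rate_per_region(UTTERANCES, "I says"))
    print(skewed_bigrams(UTTERANCES))
```
</PRE>
<DIV>On this toy data the top-down query gives *I says* a higher share in the "north" utterances than in the "south" ones, and the bottom-up pass flags "i says" and "i said" as regionally skewed without having been told to look for them. With real corpus data one would of course want a proper significance test rather than a bare threshold.</DIV>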
<DIV> </DIV>
<DIV>Apologies for this totally unMardi-Graslike note. </DIV>
<DIV> </DIV>
<DIV>Cheers</DIV>
<DIV> </DIV>
<DIV>Chris</DIV>
<DIV> </DIV>
<DIV>--------------------------------------------------------</DIV>
<DIV>Dr. Christoph Rühlemann, Munich</DIV></FONT></BODY></HTML>