<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=ISO-8859-1">

<META content="MSHTML 6.00.6001.18385" name=GENERATOR></HEAD>

<BODY id=role_body style="FONT-SIZE: 10pt; COLOR: #000000; FONT-FAMILY: Arial"   bottomMargin=7 leftMargin=7 topMargin=7 rightMargin=7><FONT id=role_document   face=Arial color=#000000 size=2>

<DIV>It is interesting to see how a serious question sparks a funny discussion, 

which is even somewhat carnivalesque, featuring soldiers, rabbis, schmucks, 

dead dogs and what have you. On a more sober note, one might return 

to the original question (Can corpora help distinguish a dialect and a language?) 

and say yes, to an extent, provided they have been carefully marked up for 

social categories. One such well-designed and annotated corpus is the BNC. In 

pursuit of the above question, it can be used in two basic ways. In 

a top-down approach, one may start with a set of pre-defined dialect 

features and run queries for them using relevant restrictions such as speaker 

location or speaker dialect. Such a query would show, for example, that the 

quotative phrase *I says* is found predominantly in northern dialect areas of 

Great Britain (and Ireland). A bottom-up approach would be to start without a 

set of pre-defined dialect features and instead examine a set of randomly 

selected features using the same set of restrictions as above. This type of query 

might, perhaps, help discover dialect features that researchers have not 

yet recognized as such.</DIV>

<DIV> </DIV>

<DIV>Apologies for this totally unMardi-Graslike note.  </DIV>

<DIV> </DIV>

<DIV>Cheers</DIV>

<DIV> </DIV>

<DIV>Chris</DIV>

<DIV> </DIV>

<DIV>--------------------------------------------------------</DIV>

<DIV>Dr. Christoph Rühlemann, Munich</DIV></FONT></BODY></HTML>