[Corpora-List] The genre of the Web

santinim at inwind.it santinim at inwind.it
Thu Sep 22 13:32:17 UTC 2005


Hi Mark,

I've been following the thread on your question about the genre(!) of the Web, and I feel quite confused by most of the comments. I work intensively with web genres and have thought a lot about this issue. I will not give you my personal view because I may not have understood your question fully, and the answer would be too long or misleading for you. But I am wondering: are you interested in "the language used on the Web", in "web registers", in "web genres", or ...? These terms are not strict synonyms, even if they are sometimes used interchangeably (for a clarification, have a look at Lee, David (2001), "Genres, Registers, Text types, Domains, and Styles: Clarifying the concepts and navigating a path through the BNC Jungle", Language Learning and Technology, Vol. 5, Num. 3, pp. 37-72). I am sure you know the BNC Web Indexer (http://www.comp.lancs.ac.uk/computing/research/ucrel/bncindex/form.html), where BNC documents can be selected according to many categories.

If you are interested in genres on the Web, have a look at:
Crowston K., Williams M. (1997), "Reproduced and Emergent Genres of Communication on the World-Wide Web", Proceedings of the 30th Hawaii International Conference on System Sciences (HICSS-30).
Shepherd M. and Watters C. (1998), "The Evolution of Cybergenre", Proceedings of the 31st Hawaii International Conference on System Sciences (HICSS-31).
Shepherd M. and Watters C. (1999), "The Functionality Attribute of Cybergenres", Proceedings of the 32nd Hawaii International Conference on System Sciences (HICSS-32).
These papers will give you an idea of the kinds of changes taking place in the genre system and genre repertoire on the Web.

If you are interested in web registers, send me an email and I will send you a preliminary study that D. Biber carried out on Yahoo categories last year.

If you are interested in web language, maybe you should start with something like "Language and the Internet" by Crystal (I am sure you are familiar with this book) and related references, such as “The Language of Websites” by M. Boardman, Routledge 2004, etc. From a quantitative point of view, I don't know whether a project such as WebCorp (http://www.webcorp.org.uk/) would be useful to you.

I don’t know if this helps...

Good luck

Marina Santini
PhD student at NLTG, Brighton University, UK
santinim at inwind.it
Marina.Santini at itri.brighton.ac.uk



---------- Initial Header -----------

From      : owner-corpora at lists.uib.no
To          : corpora at uib.no
Cc          : 
Date      : Sun, 18 Sep 2005 10:16:30 -0600
Subject : [Corpora-List] The genre of the Web







> I'm looking for publications or URLs that look at the genre of the web in quantitative terms.
>  
> In other words, if one looks at the four major genres/registers SPOKEN, FICTION, NEWSPAPER, ACADEMIC, most would probably agree that the web is more like NEWSPAPER and ACADEMIC than it is like SPOKEN or FICTION, although there are certainly bits and pieces of all of these genres/registers on the web.
>  
> I imagine that something like the following has already been done, but it would seem that a person could look at the frequency of 50-60 words or phrases in the major genres/registers of the BNC, for example, and then compare this to the frequency of the same words and phrases on the Web.  In quantitative terms, the web would be "most like" the register with the highest correlation coefficient. 
>  
> Three notes: 
> 1) A BNC-based site like VIEW [http://view.byu.edu] allows users to quickly compare the frequency in different registers [use "Charts" on the VIEW site]. 
> 2) This assumes we can abstract away from the basic methodological problem of calculating frequencies from the web -- an issue that has been discussed in a number of threads here on CORPORA.
> 3) This is a very simplistic lexically-oriented comparison, with no attempt to look at syntactic features, etc.
>  
> On the other hand, does it even make sense to try and relate the overall genre orientation of the web to one of these four or five discrete genres?  Would it be better to simply refer to it as a mix of GENRE1 + GENRE2?  Going even further, does it make sense to even try and relate the web to pre-defined genres, rather than perhaps just referring to it as its own "Web" register?
>  
> Thanks in advance,
>  
> Mark Davies
>  
> =================================================
> Mark Davies
> Assoc. Prof., Linguistics
> Brigham Young University
> (phone) 801-422-9168 / (fax) 801-422-0906
> http://davies-linguistics.byu.edu
> 
> ** Corpus design and use // Linguistic databases **
> ** Historical linguistics // Language variation **
> ** English, Spanish, and Portuguese **
> ================================================= 
>  
> 
> 
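
The comparison proposed in the quoted message (take a fixed list of words, get their frequencies in each register and on the Web, and call the Web "most like" the register with the highest correlation coefficient) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the choice of Pearson's r is one possible reading of "correlation coefficient", and every frequency figure below is invented, not taken from the BNC, VIEW, or any web count.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def closest_register(register_freqs, web_freqs, words):
    """Correlate each register's frequencies with the web's over `words`.

    Returns (best_register, {register: correlation}).
    """
    web = [web_freqs[w] for w in words]
    scores = {
        register: pearson([freqs[w] for w in words], web)
        for register, freqs in register_freqs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Invented per-million frequencies, for illustration only.
words = ["I", "you", "however", "said", "data"]
register_freqs = {
    "SPOKEN":    {"I": 30000, "you": 26000, "however": 100, "said": 2000, "data": 50},
    "FICTION":   {"I": 15000, "you": 11000, "however": 200, "said": 6000, "data": 10},
    "NEWSPAPER": {"I": 3000,  "you": 2500,  "however": 500, "said": 4000, "data": 150},
    "ACADEMIC":  {"I": 1500,  "you": 600,   "however": 1200, "said": 300, "data": 900},
}
web_freqs = {"I": 4000, "you": 5000, "however": 450, "said": 1500, "data": 400}

best, scores = closest_register(register_freqs, web_freqs, words)
for register, r in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{register:10s} r = {r:.3f}")
print("closest register:", best)
```

With only a handful of words the coefficients are dominated by the most frequent items, so a real list of 50-60 words (and possibly log-transformed frequencies or a rank correlation) would be a more reasonable starting point. The caveats in the three notes of the quoted message still apply.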


