[Corpora-List] World of Warcraft Corpus

liling tan alvations at gmail.com
Wed Sep 11 21:56:55 UTC 2013


Dear all,

Thank you for all your input on how to go about compiling a WoW corpus.
I shall try to summarize the discussion so far:

*Are there any existing WoW corpora?*

   - (for now) there is no openly available WoW corpus, but
   - given studies such as
   http://dl.acm.org/citation.cfm?id=1920331.1920490 or
   http://www.pitt.edu/~lbc8/FriedlineCollister-constructingpowerfulidentityinWoW.pdf,
   some closed, in-group WoW corpora already exist

*How could one go about collecting a WoW corpus?*

   - as a *field linguist*, join and embrace the WoW community, and collect
   data using the game's built-in chat logging and an ethnographic journal.
      - see Friedline, B., & Collister, L. (2012) “Constructing a Powerful
      Identity in World of Warcraft: A Sociolinguistic Approach to MMORPGs.” In
      Call, Voorhees, and Whitlock (eds.), Dungeons, Dragons, and Digital
      Denizens: The Digital Role-Playing Game. New York: Continuum.
   - as an *out-group* observer, join with a free account and stay in the
   locations open to free accounts to log the chats
      - the problem is that you will end up logging mostly auction-related
      chat, because the locations available to free accounts are usually
      used as marketplaces
   - from *second-hand* data, run OCR on openly available gameplay videos
   to collect text
      - foreseeable problems include:
         - no speaker metadata
         - trouble converting video to frame images for OCR
         - low-quality videos leading to noisy OCR input
         - noisy OCR output
   - *ask the game developer for data.*
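
On the noisy OCR output mentioned above: text read from consecutive video frames tends to contain the same chat line many times, each with slightly different recognition errors. Below is a minimal sketch of one way to collapse such near-duplicates, using only Python's standard `difflib`. The function name `collapse_ocr_lines`, the `threshold` and `window` values, and the sample strings are illustrative assumptions, not something proposed in the discussion; the OCR step itself is assumed to have been run already.

```python
from difflib import SequenceMatcher


def collapse_ocr_lines(lines, threshold=0.85, window=20):
    """Drop OCR lines that are near-duplicates of a recently kept line.

    `threshold` and `window` are illustrative values (assumptions) and
    would need tuning on real data.
    """
    kept = []
    for line in lines:
        text = " ".join(line.split())  # normalise whitespace
        if not text:
            continue
        recent = kept[-window:]  # only compare against the last few kept lines
        is_dup = any(
            SequenceMatcher(None, text.lower(), prev.lower()).ratio() >= threshold
            for prev in recent
        )
        if not is_dup:
            kept.append(text)
    return kept


# Hypothetical OCR output from consecutive frames (made-up names and text).
frames = [
    "[Guild] Anna: anyone up for a dungeon run?",
    "[Guild] Anna: any0ne up for a dungeon run?",
    "[Guild] Anna: anyone up for a dungeon  run?",
    "[Guild] Bren: sure, give me five minutes",
]
cleaned = collapse_ocr_lines(frames)
```

Note that this keeps the first variant seen, which is not necessarily the cleanest one, and that it would also collapse a message a player genuinely repeated within the window.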

*Issues that might be raised:*

   - *copyright issues*: one needs to read through the TOS or consult
   Blizzard's staff
   - *ethical issues*: is there a need to ask for consent before or after
   data collection?
   - *data quality issues*: "If you want chat that includes meaningful
   interactions, I think you have to actually be recording someone truly
   participating in the game, preferably someone in a functional guild."
   (Mary Elaine Califf)
   - *corpus representation issues*: given a collection of chat logs from
   different users in the community, language use would differ between
   users, as it does among people with different levels of
   prestige/power/solidarity, and with the function/domain of the utterance
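
One rough way to inspect how balanced a set of chat logs is across channels and speakers is to tally utterances per channel and per speaker. The following is a minimal sketch, assuming each logged line looks like `9/11 21:56:55.000  [Guild] Anna: message`. That line layout, the regular expression, and the sample names are assumptions for illustration only and should be checked against real log files.

```python
import re
from collections import Counter

# Assumed line layout (hypothetical, verify against real logs):
#   <month>/<day> <HH:MM:SS.mmm>  [<channel>] <speaker>: <message>
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\[(?P<channel>[^\]]+)\] (?P<speaker>[^:]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def tally(lines):
    """Count utterances per channel and per speaker."""
    by_channel, by_speaker = Counter(), Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip system messages and anything unparsed
        by_channel[record["channel"]] += 1
        by_speaker[record["speaker"]] += 1
    return by_channel, by_speaker


# Made-up sample lines in the assumed layout.
sample = [
    "9/11 21:56:55.000  [Guild] Anna: anyone up for a dungeon run?",
    "9/11 21:57:02.120  [2. Trade] Cato: WTS ore, cheap",
    "9/11 21:57:10.456  [2. Trade] Cato: WTS ore, still cheap",
    "9/11 21:57:30.000  Anna has come online.",
]
by_channel, by_speaker = tally(sample)
```

Counts that are heavily skewed towards one channel or a handful of speakers would be a first sign of the representation problem described above.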

Regards,
liling