[Corpora-List] World of Warcraft Corpus
liling tan
alvations at gmail.com
Wed Sep 11 21:56:55 UTC 2013
Dear all,
Thank you for all your input on how to go about a WoW corpus compilation.
I shall try to summarize the discussion to this point:
*Are there any existing compilations of a WoW corpus?*
- (for now) there is no openly available WoW corpus, but
- given studies such as http://dl.acm.org/citation.cfm?id=1920331.1920490
or
http://www.pitt.edu/~lbc8/FriedlineCollister-constructingpowerfulidentityinWoW.pdf
, some closed, in-group WoW corpora already exist
*How could one go about collecting a WoW corpus?*
- as a *field linguist*, join and embrace the WoW community, and collect
data using the built-in chat logging and an ethnographic journal.
  - see Friedline, B., & Collister, L. (2012) “Constructing a Powerful
  Identity in World of Warcraft: A Sociolinguistic Approach to MMORPGs.” In
  Call, Voorhees, and Whitlock (eds.), Dungeons, Dragons, and Digital
  Denizens: The Digital Role-Playing Game. New York: Continuum.
- as an *out-group* observer, join with a free account and stay in the
free locations to log the chats
  - The problem is that you will end up logging mostly auction-related
  chats, because the locations available to free accounts are usually
  used as a marketplace
- from *second-hand* data, take openly available gameplay videos and run
OCR on them to collect the texts
  - foreseeable problems include:
    - no speaker metadata
    - trouble converting videos to frames and frames to images for OCR
    - low-quality videos leading to noisy OCR input
    - noisy OCR output
- *ask the game developer for the data.*
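
On the OCR route, one foreseeable wrinkle is that a chat line stays on
screen over many consecutive frames, so the OCR output for each frame
repeats earlier lines with slightly different noise. Below is a minimal
sketch of collapsing those near-duplicates with Python's standard difflib.
It assumes the OCR text of each sampled frame is already available as a
list of strings (the OCR step itself is not shown). The function name, the
0.85 similarity threshold and the window size are illustrative choices of
mine, not something proposed in the discussion.

```python
from difflib import SequenceMatcher


def merge_frame_lines(frames, threshold=0.85, window=50):
    """Collapse near-duplicate chat lines OCR'd from consecutive frames.

    frames:    list of lists of strings, one inner list per sampled frame
               (hypothetical input; producing it needs a separate OCR step).
    threshold: similarity ratio at or above which two lines are treated as
               the same utterance (assumed value, tune on real data).
    window:    how many of the most recently kept lines to compare against.
    """
    kept = []
    for frame in frames:
        for raw in frame:
            line = " ".join(raw.split())  # normalise whitespace
            if not line:
                continue
            recent = kept[-window:]
            is_duplicate = any(
                SequenceMatcher(None, line.lower(), old.lower()).ratio()
                >= threshold
                for old in recent
            )
            if not is_duplicate:
                kept.append(line)
    return kept


if __name__ == "__main__":
    # Made-up frames; the third one contains an OCR error ("Thral1").
    frames = [
        ["[Thrall]: anyone up for a dungeon run?"],
        ["[Thrall]: anyone up for a dungeon run?", "[Jaina]: sure, invite me"],
        ["[Thral1]: anyone up for a dungeon run?", "[Jaina]: sure, invite me"],
    ]
    for line in merge_frame_lines(frames):
        print(line)
```

Note that this keeps the first reading of each line, which is not
necessarily the cleanest one, and that a player who genuinely repeats a
message within the window would be merged as well.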
*Issues that might be raised:*
- *copyright issues*: one needs to read through the TOS or consult
Blizzard's staff
- *ethical issues*: is there a need to ask for consent before or after
data collection?
- *data quality issues*: "*If you want chat that includes meaningful
interactions, I think you have to actually be recording someone truly
participating in the game, preferably someone in a functional guild.*"
(Mary Elaine Califf)
- *corpus representation issues*: given a collection of chat logs from
different users in the community, the users' language use will differ, as
it does for any speakers, with their level of prestige/power/solidarity
and with the function/domain of the utterance
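
To get a first feel for the representation and auction-chat issues, one
could profile a collected chat log by channel and by speaker. Below is a
minimal sketch under loud assumptions: the line layout in the regular
expression is a simplified guess and must be checked against real logs,
the WTS/WTB/WTT markers are only a crude proxy for auction-related chat,
and all names in the code are hypothetical.

```python
import re
from collections import Counter

# Assumed, simplified line layout (verify against real logs before use):
#   9/11 21:56:55.000  [2. Trade] Thrall: WTS [Copper Ore] x20
#   9/11 21:57:10.000  Jaina says: hello there
LINE_RE = re.compile(
    r"^(?P<stamp>\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+"
    r"(?:\[(?P<channel>[^\]]+)\]\s+)?"
    r"(?P<speaker>[^\s:\[\]]+)(?: says| yells| whispers)?:\s*(?P<text>.*)$"
)

# Crude proxy for auction-related chat: want to sell / buy / trade.
TRADE_MARKERS = re.compile(r"\b(WTS|WTB|WTT)\b", re.IGNORECASE)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["channel"] = record["channel"] or "unlabelled"
    record["is_trade"] = bool(TRADE_MARKERS.search(record["text"]))
    return record


def profile(lines):
    """Count utterances per channel and per speaker, plus the trade share."""
    by_channel, by_speaker = Counter(), Counter()
    total = trade = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip system messages and unparseable lines
        total += 1
        trade += record["is_trade"]
        by_channel[record["channel"]] += 1
        by_speaker[record["speaker"]] += 1
    return {
        "total": total,
        "by_channel": by_channel,
        "by_speaker": by_speaker,
        "trade_share": trade / total if total else 0.0,
    }


if __name__ == "__main__":
    sample = [
        "9/11 21:56:55.000  [2. Trade] Thrall: WTS [Copper Ore] x20",
        "9/11 21:57:10.000  Jaina says: hello there",
    ]
    print(profile(sample))
```

A log dominated by one channel or by a handful of speakers would be a
sign that the collection does not yet cover the community's range of
functions/domains.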
Regards,
liling