Dear all,<br><br> I am working on the extraction of maximal repeated patterns <br>from text, together with the computation of the frequency distribution of these patterns over time (the "pattern history").<br><br><br>

A web site for browsing pattern histories is available at <br><a href="http://120.108.115.115/TM/Search_PubMed_Simple.php" target="_blank">http://120.108.115.115/TM/Search_PubMed_Simple.php</a>.<br>

The pattern histories were extracted from medical articles in <br>"PubMed" (from 1990 to 2009),<br>

comprising 3,225,549 articles and 677,728,269 words (more than<br>

600 million words).<br>

Note that the extracted patterns include not only<br>

single words but also multi-word phrases,<br>

e.g. "patients with squamous cell carcinoma of the head and neck".<br>

To be more specific, any segment (a sequence of words) within a sentence<br>

in the corpus is extracted if that segment appears at least twice;<br>

the corresponding frequency distribution of that segment<br>

over time, defined as its "pattern history", is then computed.<br>
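
As a rough illustration only, the idea can be sketched in a few lines of Python. This is a minimal sketch and not the actual system: the function name `pattern_histories`, the `max_len` and `min_count` parameters, the naive sentence splitting and the brute-force n-gram counting are all assumptions made for this example; a corpus of this size would need a far more scalable method.<br>

```python
from collections import Counter, defaultdict
import re


def pattern_histories(docs, max_len=5, min_count=2):
    """docs: iterable of (year, text). Returns {segment: {year: count}}.

    Hypothetical helper for illustration; brute force, small inputs only.
    """
    history = defaultdict(Counter)
    for year, text in docs:
        # Naive sentence split on '.', '!' or '?' (an assumption).
        for sentence in re.split(r"[.!?]+", text.lower()):
            words = re.findall(r"[a-z0-9]+", sentence)
            # Count every word segment of up to max_len words per year.
            for n in range(1, max_len + 1):
                for i in range(len(words) - n + 1):
                    history[tuple(words[i:i + n])][year] += 1

    total = {seg: sum(counts.values()) for seg, counts in history.items()}
    repeated = {seg for seg, t in total.items() if t >= min_count}

    # A segment is treated as non-maximal if extending it by one word
    # (to the left or right) gives a segment with the same total count.
    non_maximal = set()
    for seg in repeated:
        if len(seg) > 1:
            for sub in (seg[:-1], seg[1:]):
                if total[sub] == total[seg]:
                    non_maximal.add(sub)

    return {" ".join(seg): dict(history[seg])
            for seg in repeated - non_maximal}


docs = [
    (1990, "Squamous cell carcinoma was found. Squamous cell carcinoma is rare."),
    (1991, "Squamous cell carcinoma again."),
]
print(pattern_histories(docs))
# {'squamous cell carcinoma': {1990: 2, 1991: 1}}
```

In this toy example the shorter segments such as "squamous cell" are dropped because they never occur outside the longer repeated phrase, so only the maximal segment and its per-year counts remain.<br>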

<br>

I am looking for more retrospective<br>

(historical, chronological) corpora, publications or literature<br>

to make my experiments more robust, and I am seeking linguistic experts<br>

to cooperate with, if they can provide text with timestamps. <br>

<br>In return, I will provide them with the pattern histories extracted from these corpora.<br>

Please let me know if you have textual data (a corpus) with timestamps.<br>

<br>

Yours faithfully,<br>

<br>

P.S. An abstract describing my work is attached.<br>

<br>

--<br>

<br>

Jing-Doo Wang<br>

<br>

Assistant Professor<br>

Department of Computer Science and Information Engineering<br>

Asia University, Taiwan.<br>

<br>

886-4-23323456-ext 1847<br>

<a href="http://asia.edu.tw/%7Ejdwang" target="_blank">http://asia.edu.tw/~jdwang</a><br>

<a href="mailto:jdwang@asia.edu.tw">jdwang@asia.edu.tw</a><br>

<a href="mailto:wangjingdoo@gmail.com">wangjingdoo@gmail.com</a><br><br>