[Corpora-List] Anonymization tools for patient record research methods

Aberdeen, John S. aberdeen at mitre.org
Tue May 31 14:21:33 UTC 2011


Hi Eric,

We've developed a tool for this called MIST (MITRE Identification Scrubber
Toolkit), which combines an annotation interface and CRF-based learner to
develop de-identification systems. With it you can iteratively build up a
corpus of de-identified notes, and successively better models for
de-identification via a tag-a-little-learn-a-little cycle. It is
research-ware, but well documented.

http://mist-deid.sourceforge.net/
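
To give a rough flavour of the tag-a-little-learn-a-little cycle, here is a
minimal Python sketch. It is not MIST's API or code: a toy token lexicon
stands in for the CRF-based learner, and the names train, pretag, bootstrap
and simulated_annotator are all hypothetical, made up for illustration. The
point is only the loop itself: the current model pre-tags a batch, a human
corrects the tags, and the model is retrained on everything annotated so far.

```python
from collections import Counter

def train(annotated):
    """Toy learner (a stand-in for a CRF): keep tokens tagged PHI more often than not."""
    phi, other = Counter(), Counter()
    for tokens, labels in annotated:
        for tok, lab in zip(tokens, labels):
            (phi if lab == "PHI" else other)[tok.lower()] += 1
    return {tok for tok, n in phi.items() if n > other[tok]}

def pretag(model, tokens):
    """Pre-annotate a note with the current model so the human only fixes errors."""
    return ["PHI" if tok.lower() in model else "O" for tok in tokens]

def bootstrap(batches, correct):
    """Run the cycle: pre-tag a batch, collect corrections, retrain, repeat.

    `correct(tokens, guess)` plays the role of the human annotator and
    returns the final labels. Returns the last model and the number of
    corrections needed per batch.
    """
    annotated, model, history = [], set(), []
    for batch in batches:
        fixes = 0
        for tokens in batch:
            guess = pretag(model, tokens)
            gold = correct(tokens, guess)
            fixes += sum(g != p for g, p in zip(gold, guess))
            annotated.append((tokens, gold))
        history.append(fixes)
        model = train(annotated)  # retrain on everything tagged so far
    return model, history

# Hypothetical data: a simulated annotator who knows which tokens are names.
NAMES = {"smith", "jones"}

def simulated_annotator(tokens, guess):
    return ["PHI" if t.lower() in NAMES else "O" for t in tokens]

batches = [
    [["Mr", "Smith", "was", "admitted"], ["Dr", "Jones", "saw", "him"]],
    [["Mr", "Smith", "was", "discharged"], ["Dr", "Jones", "agreed"]],
]
model, history = bootstrap(batches, simulated_annotator)
print(history)  # corrections needed in each batch
```

In this toy run the second batch needs fewer corrections than the first,
which is the effect the cycle is after. A real system would use a sequence
model with contextual features rather than a word list.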

Best,
John

John Aberdeen
Lead Scientist - Human Language Technology
The MITRE Corporation
202 Burlington Rd.
Bedford, MA 01730
+1.781.271.2840
aberdeen at mitre.org



-----Original Message-----
From: Eric Atwell <csc6ea at leeds.ac.uk>
Date: Fri, 27 May 2011 18:12:41 -0400
To: "corpora at uib.no" <corpora at uib.no>
Subject: [Corpora-List] Anonymization tools for patient record research
methods

>We are investigating research methods for patient records.
>To be available for Corpus Linguistics analysis, patient records
>have to be anonymised, so individual patients cannot be identified.
>Can anyone point us at tools to (semi-)automate anonymization or
>deidentification of health text data (or any other text data)?
>
>I managed to find "deid" in Physionet
>http://www.physionet.org/physiotools/deid/
>Neamatullah I, Douglass M, Lehman LH, Reisner A, Villarroel M, Long WJ,
>Szolovits P, Moody GB, Mark RG, Clifford GD. Automated De-Identification
>of Free-Text Medical Records. BMC Medical Informatics and Decision
>Making, 2008, 8:32.
>
>and a survey:
>Ozlem Uzuner, Yuan Luo, Peter Szolovits. Evaluating the State-of-the-Art
>in Automatic De-identification. JAMIA Journal of the American Medical
>Informatics Association, 2007,14:550-563
>
>thanks for any other recommendations
>
>Eric Atwell, Senior Lecturer, Language research group,
>  I-AIBS Institute for Artificial Intelligence and Biological Systems
>  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
>  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
>  WWW: http://www.comp.leeds.ac.uk/arabic
>       http://www.comp.leeds.ac.uk/nlp
>
>_______________________________________________
>UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>Corpora mailing list
>Corpora at uib.no
>http://mailman.uib.no/listinfo/corpora
>


