[Corpora-List] Anonymization tools for patient record research methods
Aberdeen, John S.
aberdeen at mitre.org
Tue May 31 14:21:33 UTC 2011
Hi Eric,
We've developed a tool for this called MIST (MITRE Identification Scrubber
Toolkit), which combines an annotation interface and CRF-based learner to
develop de-identification systems. With it you can iteratively build up a
corpus of de-identified notes, and successively better models for
de-identification via a tag-a-little-learn-a-little cycle. It is
research-ware, but well documented.
http://mist-deid.sourceforge.net/
Best,
John
John Aberdeen
Lead Scientist - Human Language Technology
The MITRE Corporation
202 Burlington Rd.
Bedford, MA 01730
+1.781.271.2840
aberdeen at mitre.org
-----Original Message-----
From: Eric Atwell <csc6ea at leeds.ac.uk>
Date: Fri, 27 May 2011 18:12:41 -0400
To: "corpora at uib.no" <corpora at uib.no>
Subject: [Corpora-List] Anonymization tools for patient record research methods
>We are investigating research methods for patient records.
>To be available for Corpus Linguistics analysis, patient records
>have to be anonymised, so individual patients cannot be identified.
>Can anyone point us at tools to (semi-)automate anonymization or
>deidentification of health text data (or any other text data)?
>
>I managed to find "deid" in Physionet
>http://www.physionet.org/physiotools/deid/
>Neamatullah I, Douglass M, Lehman LH, Reisner A, Villarroel M, Long WJ,
>Szolovits P, Moody GB, Mark RG, Clifford GD. Automated De-Identification
>of Free-Text Medical Records. BMC Medical Informatics and Decision
>Making, 2008, 8:32.
>
>and a survey:
>Ozlem Uzuner, Yuan Luo, Peter Szolovits. Evaluating the State-of-the-Art
>in Automatic De-identification. Journal of the American Medical
>Informatics Association (JAMIA), 2007, 14:550-563
>
>thanks for any other recommendations
>
>Eric Atwell, Senior Lecturer, Language research group,
> I-AIBS Institute for Artificial Intelligence and Biological Systems
> School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
> Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468
> WWW: http://www.comp.leeds.ac.uk/arabic
> http://www.comp.leeds.ac.uk/nlp
>
>_______________________________________________
>UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>Corpora mailing list
>Corpora at uib.no
>http://mailman.uib.no/listinfo/corpora
>