Motivation
Health care information is increasingly distributed across many independent databases and systems, both within and among organizations, forming separate islands that each use different patient identifiers. Data about the same patient may be collected at different health care institutions, pharmacy systems, payers, and so on. This fragmentation makes it hard to aggregate information about individuals across databases, which many health care use cases require: public health reporting, clinical research, outcomes management, and administrative reporting. Aggregation is important not only for determining a patient's health care status, but also for population-based studies.
Why is a patient matching module needed for OpenMRS?
As mentioned above, health care data is scattered across disparate systems. As OpenMRS implementations grow, they will encounter multiple records for the same patient, both across and within implementations. One OpenMRS implementation is known to contain more than 700,000 patients! Duplicate patient registrations also accumulate over time within a single system. Processes to link entities (e.g., patients) across and within OpenMRS implementations will therefore become increasingly important.
What functionality will patient identity management add to OpenMRS?
The patient matching module will initially provide two core types of functionality. First, it provides a stand-alone application that implements sophisticated probabilistic matching algorithms for both a) identifying duplicates in a single data source, and b) identifying matches between two generic data sources. The current output from the stand-alone application is a delimited file containing matches with associated match scores: the higher the score, the more likely the match.
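To make the idea of a match score concrete, here is a minimal sketch of one common style of probabilistic matching, in which each field contributes a log-likelihood weight depending on whether it agrees. This is an illustration only, not the module's actual code: the field names, the m/u probabilities, the threshold, the `|` delimiter, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the module):
#   m = P(field agrees | the two records are the same patient)
#   u = P(field agrees | the two records are different patients)
FIELD_WEIGHTS = {
    "family_name": (0.95, 0.01),
    "given_name": (0.90, 0.05),
    "birth_date": (0.97, 0.003),
    "gender": (0.99, 0.50),
}


def match_score(rec_a, rec_b, weights=FIELD_WEIGHTS):
    """Sum agreement/disagreement weights over all fields; higher means more likely a match."""
    score = 0.0
    for field, (m, u) in weights.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            score += math.log2(m / u)              # agreement weight (positive)
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return score


def find_candidate_pairs(records, threshold=0.0):
    """Compare every pair within one data source; keep pairs scoring above the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            s = match_score(records[i], records[j])
            if s > threshold:
                pairs.append((records[i]["id"], records[j]["id"], s))
    # Most likely matches first
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def to_delimited(pairs, sep="|"):
    """Format scored pairs as delimited lines: id_a, id_b, score."""
    return [f"{a}{sep}{b}{sep}{s:.2f}" for a, b, s in pairs]
```

Comparing every pair is only practical for small data sets. With hundreds of thousands of records, a real matcher would need some way to limit which pairs are compared.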
The second core function addresses the issue of duplicate patient registrations in an instance of OpenMRS. Many OpenMRS implementations have tens if not hundreds of thousands of patient records. Over time, duplicate patient records creep in. The patient matching module will identify likely duplicates and provide a list of them to OpenMRS administrators. Because patient identifiers vary across countries and cultures, we've designed the patient matching module to adapt to widely varying patient identifiers.
Screen-shot of the OpenMRS de-duplication Admin Screen
When will this functionality be available to the OpenMRS community?
The source code for the patient matching module is currently available in the OpenMRS Subversion repository (http://svn.openmrs.org/openmrs-modules/patientmatching). The stand-alone application's functionality continues to expand. The OpenMRS de-duplication module is under active development, with great support provided through the Google Summer of Code 2008 initiative! We anticipate delivering the de-duplication functionality to the community by the end of the summer.
Join In the Fun!
We welcome those with an interest in this area (you know who you are)! To become further acquainted and involved, we encourage you to:
- read the OpenMRS developers' "Where to Get Started" page
- check out the blog of our excellent GSoC intern, Nyoman Ribeka
- peruse the source code
- review outstanding developer tickets
- find us on the OpenMRS IRC Channel (sgrannis, james_regen, nribeka)
- email Shaun: s g r a n n i s { a t } r e g e n s t r i e f { d o t } o r g
- check back here from time to time