Dropping patient_details
We made a change to the Generalized Data Model (GDM). Many times, when a data model undergoes a change, some new tables or columns are added. We’re doing the opposite. We’re dropping a table from the GDM, namely, the patient_details table.
Originally, we intended the patient_details table to store facts that were more demographic than clinical. For example, we used the patient_details table to store information from the SEER-Medicare linked data, such as marital status, urban/rural status, and census tract-based socio-economic data.
As we began to consider how to create algorithms based on information in the patient_details table using ConceptQL, we found ourselves questioning the purpose of the table. When we looked at the clinical_codes table and associated tables like the measurement_details table, we realized that the clinical_codes and other existing tables were well-suited to store all observations about a patient, not just those made in a clinical setting.
At that point, we were faced with either 1) expanding the patient_details table along the lines of our existing clinical_codes and measurement_details tables, or 2) moving the information from patient_details into the clinical_codes and associated tables.
We went with the latter for two reasons:
- We could not identify a compelling use case for distinguishing between “patient” details and “clinical” details in the context of extracting data to create an analysis-ready dataset.
  - To the extent that this distinction is important, it could be captured in ways that do not require separate tables.
  - As a side note, the use of tables to partition data into separate semantic groups is an important distinction among various data models. This will be the subject of a future blog post.
- Storing all observations for a patient in a single table facilitates more powerful algorithms and simpler queries.
  - Creating algorithms that combine personal and clinical constructs is easier to do and to explain.
  - Although this was not a requirement, it turns out that, because ConceptQL is already designed to query the clinical_codes table, no additional work was required for implementation.
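To make the second reason concrete, here is a minimal sketch in Python with SQLite. It is not ConceptQL and it is not the actual GDM schema: the column names (patient_id, vocabulary, code, observed_on), the MARITAL_STATUS vocabulary label, and the sample rows are all assumptions made up for illustration. It only shows how one query against a single table can combine a demographic fact with a clinical one once both are stored as observations.

```python
import sqlite3

# Hypothetical, simplified stand-in for a clinical_codes-style table.
# Column names and vocabulary labels are illustrative assumptions,
# not the real GDM definitions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE clinical_codes (
        id          INTEGER PRIMARY KEY,
        patient_id  INTEGER NOT NULL,
        vocabulary  TEXT NOT NULL,
        code        TEXT NOT NULL,
        observed_on TEXT NOT NULL
    )
    """
)

# Demographic and clinical observations live side by side (made-up data).
rows = [
    (1, "ICD10CM", "C50.9", "2015-03-01"),
    (1, "MARITAL_STATUS", "married", "2015-01-01"),
    (2, "ICD10CM", "C50.9", "2015-04-10"),
    (2, "MARITAL_STATUS", "single", "2015-01-01"),
    (3, "ICD10CM", "I10", "2015-05-20"),
    (3, "MARITAL_STATUS", "married", "2015-01-01"),
]
conn.executemany(
    "INSERT INTO clinical_codes (patient_id, vocabulary, code, observed_on) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# One self-join on the same table finds patients who have both the
# diagnosis code and the marital status of interest.
QUERY = """
    SELECT DISTINCT dx.patient_id
    FROM clinical_codes AS dx
    JOIN clinical_codes AS ms ON ms.patient_id = dx.patient_id
    WHERE dx.vocabulary = 'ICD10CM' AND dx.code = 'C50.9'
      AND ms.vocabulary = 'MARITAL_STATUS' AND ms.code = 'married'
    ORDER BY dx.patient_id
"""
matching = [row[0] for row in conn.execute(QUERY)]
print(matching)  # [1]
```

With a separate patient_details table, the same question would need a join across two differently shaped tables; here both criteria are expressed the same way against the same table.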
It would be appropriate to rename clinical_codes to observations or something similar, but we have a lot of software that is hard-coded to use the name clinical_codes, so we’ll continue to use clinical_codes for the time being.
Interestingly, using a single table for all observations is not a new idea. After we decided on this direction for clinical_codes, we realized this is similar to how I2B2 structures its data as a star schema.
We’re very excited about this change to the GDM. It isn’t often one gets to remove something from a system and gain features as a result.
You can read the original paper on GDM here: