One Value Set Reference to Rule Them All?

By Michael Buck, Navidence SVP of Knowledge Engineering

One of the most recognized pieces of a computable operation definition (CODef) is the value set of standard terminology codes (e.g., ICD, SNOMED, RxNorm, ATC). When building an indication library for a disease condition, Navidence looks for reputable literature references that demonstrate how a particular CODef was built and, in rare, ideal cases, report strong validation metrics for its use with a particular data source.

After years of experience and thousands of CODefs built, it may surprise some people to learn that it is rare, if not impossible, to find a single reference whose list of codes for a particular concept is comprehensive enough to use as a value set.

The real-world difficulties can be illustrated with a few examples of different content types like diagnoses and medications.

For each diagnosis concept, Navidence creates a basic set of value sets covering ICD-10-WHO, ICD-10-CM, ICD-9-CM, and SNOMED CT US. Even when an organization requires only a single terminology, such as ICD-10-CM for US-based research, a published paper very often includes study-specific exclusions that are unlikely to apply in a different research setting, so it is often desirable for a gold-standard reference to be more comprehensive. The paper is also often out of date: codes have since been added or deprecated under that concept. Furthermore, as more research is conducted across multiple institutions and countries, we almost never see a reference with an aligned set of all the terminologies required for conducting research across all relevant geographic regions. This is the value of a CODef library provided by Navidence.

For medications, it is even more difficult. As an example, Navidence was creating a grouping value set for antidiabetic medications, which is composed of 10 drug classes used to treat diabetes. We had a number of reputable references from the NIH NIDDK, the NHS, UK diabetes foundation, Mayo Clinic, and Cleveland Clinic. No two of them agreed on a comprehensive list of medications for each drug class. In fact, none of them even tried to provide one. They were all very consistent in which drug classes are included as "antidiabetic agents", but all of them provided only examples of medications in each class, not a comprehensive list. That required us to go to other references to assemble the list of medications for each drug class. This is particularly true when we look at medications found in the United States versus those available in Europe.

In another example, we developed a comprehensive list of immunosuppressive therapies, as described in detail here (https://doi.org/10.1016/j.jval.2023.09.2747). Five different reputable sources each contained between 71 and 110 value sets; the final comprehensive set contained 157 distinct medications, highlighting the need to search multiple sources.
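The idea of combining several references into one value set can be illustrated with a minimal sketch. It is not Navidence's actual process: the function names (`normalize`, `merge_sources`, `coverage`), the reference labels, and the drug names are all hypothetical placeholders, and real curation would also involve mapping names to terminology codes and clinical review. The sketch only shows how a union across sources can exceed what any single source lists.

```python
from typing import Dict, Set


def normalize(name: str) -> str:
    """Lower-case and trim a medication name so spelling variants compare equal."""
    return name.strip().lower()


def merge_sources(sources: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of all normalized names across every reference."""
    merged: Set[str] = set()
    for names in sources.values():
        merged.update(normalize(n) for n in names)
    return merged


def coverage(sources: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the merged set that each individual reference covers."""
    merged = merge_sources(sources)
    if not merged:
        return {ref: 0.0 for ref in sources}
    return {
        ref: len({normalize(n) for n in names}) / len(merged)
        for ref, names in sources.items()
    }


# Hypothetical placeholder data, not real drug lists or real references.
example_sources = {
    "reference_1": {"Drug A", "Drug B", "Drug C"},
    "reference_2": {"drug b", "Drug D"},
    "reference_3": {"Drug A", "Drug E ", "Drug F"},
}

merged_set = merge_sources(example_sources)
per_reference = coverage(example_sources)
```

With this placeholder data, the merged set is larger than any individual list, and every reference covers only part of it, which mirrors the pattern described above.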

From these and other experiences, we regularly educate our researchers using Sherpa that, in most cases, it is neither desirable nor accurate to limit a gold-standard value set to one reference.