At this month’s meeting of the Portland R User’s Group, I learned about the Athena system, a downloadable and keyword-searchable repository of every medical term appearing in every medical coding scheme out there. Why is this important? Over the years, I have been involved in many projects where the goal was to identify, say, every Hodgkin lymphoma patient, or every patient prescribed tamoxifen, or every patient who should have been prescribed tamoxifen. If everything were coded properly, this would be easy. And indeed, if you only need patients with a Hodgkin lymphoma diagnosis code, there’s not much to it. But researchers usually want every Hodgkin lymphoma patient, not just those with a diagnosis code indicating such. To find them, you look up all of the treatments, drugs, tests, and procedures you can think of that might be related to the disease, both online and in paper manuals. You end up with a lot of false positives, but some true positives also. It’s all very time-consuming. What Athena can do is immediately narrow the search space to something reasonable. I can’t believe I did not know about it before.
The site is intuitive. At the top there is a button for downloading all the codes. While I couldn’t get this to work, the “download results” button just below let me download the results of a specific query. I don’t really want or need to store every code locally, so that’s fine. When you “download results”, you get a tab-delimited file with a .csv extension. Excel doesn’t like this combination, but changing the extension to .xls (not .xlsx) did the job.
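If you would rather skip Excel altogether, a tab-delimited file with a .csv extension is easy to read in code once you tell the parser what the delimiter is. Here is a minimal Python sketch. The column names and rows are made-up placeholders, not Athena's actual export layout, and the text is held in memory so the example runs on its own.

```python
import csv
import io

# Made-up stand-in for a "download results" file: tab-separated despite the
# .csv name. Column names and rows are illustrative assumptions only.
SAMPLE = (
    "Id\tCode\tName\tVocab\n"
    "1\tX001\tScreening mammogram, bilateral\tVOCAB_A\n"
    "2\tX002\tMammogram, diagnostic\tVOCAB_B\n"
)


def read_tab_delimited(handle):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tab_delimited(io.StringIO(SAMPLE))
print(len(rows), rows[0]["Name"])
```

For a real download, the same function should work on an open file, for example `with open("results.csv", newline="", encoding="utf-8") as f: rows = read_tab_delimited(f)`, where `results.csv` is a hypothetical filename. Setting the delimiter explicitly matters because term names often contain commas, which a comma-based parser would split into extra columns.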
To test Athena, I typed in “mammogram” and got 209 results. Thirty of these were in coding systems relevant to the data I use, and they included all 6 of the ones used in this paper. That’s hardly a rigorous test, but it appears that Athena will err on the side of inclusion. No longer will I have to scan multiple web sites and thumb through old paper copies of code books, just to be sure I haven’t missed anything. I can start with Athena and filter out what I don’t need. Bravo!
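The "start broad, then filter" step can be sketched in a few lines of Python. This is only an illustration: the `Vocab` key, the vocabulary names, and the rows are placeholders I made up, not real Athena output.

```python
from collections import Counter

# Made-up query results; the "Vocab" key and vocabulary names are placeholders.
results = [
    {"Code": "X001", "Name": "Screening mammogram", "Vocab": "VOCAB_A"},
    {"Code": "X002", "Name": "Diagnostic mammogram", "Vocab": "VOCAB_B"},
    {"Code": "X003", "Name": "Mammogram abnormal", "Vocab": "VOCAB_C"},
    {"Code": "X004", "Name": "Mammogram, unilateral", "Vocab": "VOCAB_A"},
]

# Coding systems that actually appear in the data being analyzed (assumed).
RELEVANT_VOCABS = {"VOCAB_A", "VOCAB_B"}


def keep_relevant(rows, vocabs):
    """Return only the rows whose vocabulary is in the given set."""
    return [row for row in rows if row["Vocab"] in vocabs]


kept = keep_relevant(results, RELEVANT_VOCABS)
per_vocab = Counter(row["Vocab"] for row in kept)
print(len(kept), dict(per_vocab))
```

Counting the survivors per vocabulary gives a quick sanity check that each coding system you care about is represented before you review the individual codes by hand.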