Has anyone explored a tool to extract unstructured data out of “locked-up” EHR repositories, e.g. RCP?
I came across this journal paper [1] which discusses the deployment of CogStack [2] at UCL Hospitals. So far, it’s processed 30 million records at UCLH.
I wonder if there’s an appetite to pilot something similar in NZ? I ask as we have several use cases on our backlog that this would benefit, particularly if such a tool can be integrated as part of an end-to-end RPA solution.
We’ve built a PoC RPA bot that can search clinic letters in RCP using keywords (e.g. a diagnosis), but a platform like CogStack would be significantly more efficient, both in cost and throughput. NB: this bot has not been productionised and was only tested against a small batch in an RCP Acceptance environment.
Not endorsing it, but CogStack is a free-text analytics platform that also claims to offer native data ingestion, harmonisation (ETL) and processing (NLP). It would probably benefit Data Science more than Automation.
Hi Parag, Precision Driven Health have supported some research in this area and have created a tool that can extract unstructured data from text and link it to standardised international medical dictionaries (e.g. SNOMED, ICD-10). I’d be interested in discussing further with you and/or connecting you with some of the researchers to see if there is a way we can collaborate on a pilot that could help your backlog. Alternatively, we have a network of Data Scientists so I could also put you in touch with an NLP specialist who may know more about CogStack.
We have started doing a little of this. It has stalled while I recruit, but we have tried some off-the-shelf solutions and found they miss local idioms and acronyms.
Something else @kevinrossnz and his team have been researching a lot is reliable negation interpretation, which is really important (e.g. "CVD" is not the same as "no hx CVD").
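To make the "CVD" vs "no hx CVD" point concrete, here is a minimal rule-based sketch in Python. It is not how @kevinrossnz's team or any vendor does it; the trigger list, the `MAX_GAP_TOKENS` window and the function name `classify_mentions` are all assumptions made up for illustration, and a real list of triggers would need local curation.

```python
import re

# Hypothetical, deliberately tiny trigger list (assumption, not a clinical standard).
NEGATION_TRIGGERS = ["no hx of", "no hx", "no history of", "no evidence of", "denies", "no"]
# Assumed maximum number of words allowed between a trigger and the term.
MAX_GAP_TOKENS = 3


def classify_mentions(text, term):
    """Return (sentence, is_negated) for every sentence that mentions `term`."""
    # Longest triggers first so "no hx of" is tried before plain "no".
    triggers = sorted(NEGATION_TRIGGERS, key=len, reverse=True)
    trigger_alt = "|".join(re.escape(t) for t in triggers)
    negated = re.compile(
        rf"\b(?:{trigger_alt})\b(?:\s+\w+){{0,{MAX_GAP_TOKENS}}}\s+{re.escape(term)}\b",
        re.IGNORECASE,
    )
    mention = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)

    results = []
    # Naive sentence split on full stops, semicolons and newlines.
    for sentence in re.split(r"[.;\n]+", text):
        sentence = sentence.strip()
        if mention.search(sentence):
            results.append((sentence, bool(negated.search(sentence))))
    return results


note = "Pt has CVD, on statin. Brother: no hx CVD."
for sentence, is_negated in classify_mentions(note, "CVD"):
    print(f"{'NEGATED ' if is_negated else 'AFFIRMED'} | {sentence}")
```

A rule like this only looks for a trigger shortly before the term within one sentence, so it will miss things like "CVD ruled out" and cannot tell patient history from family history. That gap is part of why the research matters.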
John Snow Labs is worth looking at and PDH/Orion have smart coder.
I think there is a fair bit of regex for different document structures that will need to be managed and then standardised, which is the hard bit. I was talking to @alikhannz about this and they were looking at how to create a templated regex approach.
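I don't know what @alikhannz's design looks like, but one way a templated regex approach could be sketched is a table of patterns per document layout, all mapping onto the same standard field names. The template names, field names and letter layouts below are invented for illustration.

```python
import re

# Hypothetical templates: one set of patterns per document layout.
# Every template maps onto the same standard field names.
TEMPLATES = {
    "clinic_letter_v1": {
        "diagnosis": r"^Diagnosis:\s*(?P<value>.+)$",
        "medications": r"^Medications:\s*(?P<value>.+)$",
    },
    "discharge_summary": {
        "diagnosis": r"^Primary Dx\s*-\s*(?P<value>.+)$",
        "medications": r"^Discharge meds\s*-\s*(?P<value>.+)$",
    },
}


def extract_fields(text, template_name):
    """Apply one layout's patterns and return a dict of standard fields.

    Fields whose pattern does not match are returned as None so that
    gaps are visible downstream rather than silently dropped.
    """
    fields = {}
    for field, pattern in TEMPLATES[template_name].items():
        match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
        fields[field] = match.group("value").strip() if match else None
    return fields


letter = "Dear Dr Smith\nDiagnosis: Type 2 diabetes\nMedications: metformin 500 mg\n"
print(extract_fields(letter, "clinic_letter_v1"))
```

Keeping the patterns in data rather than code means a new letter layout is a new entry in `TEMPLATES`, not a new parser. Working out which template applies to a given document is still a separate problem.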
If there’s been any movement on this or, even better, an established design pattern, we’d be keen. Our immediate interest would be in helping automate data entry for registers (e.g. Breast Cancer) that are hosted with third-party providers and where back-end integration is hard. Here, an RPA bot would read via API from a service that has already parsed the unstructured data (e.g. PDH or JSL) and emulate the front-end data entry routines staff currently perform.
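As a rough sketch of the hand-off between the parsing service and the bot: the step in the middle is turning a parsed record into an ordered list of form-entry actions, and refusing to proceed when something is missing. Everything here is hypothetical, including the parsed-record keys, the form field IDs and the `plan_entry` name. It does not reflect any actual PDH or JSL API, and the record is hard-coded dummy data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EntryStep:
    """One front-end action for the bot: type `value` into `form_field`."""
    form_field: str
    value: str


# Hypothetical mapping from parsed-output keys to register form field IDs.
FIELD_MAP = {
    "patient_id": "reg_patient_id",
    "diagnosis_code": "reg_dx_code",
    "diagnosis_date": "reg_dx_date",
}


def plan_entry(parsed: Dict[str, str], field_map: Dict[str, str] = FIELD_MAP) -> List[EntryStep]:
    """Turn a parsed record into ordered form-entry steps.

    Raises ValueError if any mapped field is missing or empty, so the
    bot never submits a partially filled register entry.
    """
    missing = [key for key in field_map if not parsed.get(key)]
    if missing:
        raise ValueError(f"Parsed record is missing required fields: {missing}")
    return [EntryStep(form_field=target, value=str(parsed[source]))
            for source, target in field_map.items()]


# Dummy record standing in for whatever the parsing service would return.
parsed_record = {
    "patient_id": "TEST001",
    "diagnosis_code": "C50.9",
    "diagnosis_date": "2021-03-01",
}
for step in plan_entry(parsed_record):
    print(f"type {step.value!r} into {step.form_field}")
```

The bot itself would then replay these steps against the register's front end. Keeping the plan separate from the replay makes it easier to review or log what will be entered before anything is typed.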