I have hard copies of clinical trial CSRs (clinical study reports) scanned into PDFs. I would like to pool data from several CSRs and analyse the combined data to answer various research questions.
That is tricky - you actually need some pretty solid Optical Character Recognition (OCR) to achieve that. And for complex information like that it will take some serious horsepower.
There are several ad-supported online tools that can do this, but their accuracy varies, and so does how far you can trust them with the data. Apparently, OneNote can do OCR, and you might be able to do this within your organisational environment, as you really don’t want to put private / sensitive info like that somewhere unsafe.
Any premium generative AI (i.e. paid account) should be able to help with assembling stuff like that once you’ve got the text extracted - or maybe even extract the text as well.
I recently used a couple of online generative AI tools to do just this with non-patient-related information (PDFs of protocol tables). With the right prompt, Gemini in particular did a stellar job and created some very clean JSON.
But… if any of the content is sensitive you would need to use a privately hosted model, and validating the output could take a lot of time…
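To give a rough idea of what a first validation pass on that kind of JSON could look like, here is a minimal Python sketch. It assumes each extracted report arrives as a JSON array of records; the field names (`study_id`, `arm`, `n_enrolled`) and the `pool_records` helper are hypothetical placeholders, not anything the tools above actually produce. Records with missing fields are set aside for a person to check rather than silently pooled.

```python
import json

# Hypothetical field names for illustration only; replace with whatever
# your own extraction prompt is asked to produce.
REQUIRED_FIELDS = {"study_id", "arm", "n_enrolled"}


def pool_records(json_documents):
    """Combine records from several extracted reports.

    json_documents: iterable of JSON strings, each holding a list of records.
    Returns (pooled, rejected). Rejected entries keep the index of the source
    document and the names of the missing fields so they can be reviewed by hand.
    """
    pooled, rejected = [], []
    for doc_index, text in enumerate(json_documents):
        for record in json.loads(text):
            missing = REQUIRED_FIELDS - set(record)
            if missing:
                rejected.append((doc_index, sorted(missing), record))
            else:
                pooled.append(record)
    return pooled, rejected
```

This only catches structural gaps (a field that is absent); it says nothing about whether a value was read correctly from the page, which still needs checking against the source.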
That’s very insightful, thank you. Yes, it will need to be a privately hosted approach, even though any personally identifying information will be redacted. I am anxious about the validation though; I wonder if random sampling will suffice or some other approach?
Still, the process is likely to be less onerous than a person sitting in a windowless room transcribing the hard copy data onto an evaluable digital platform.
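On the random sampling question, one possible way to frame it is sketched below in Python: draw a reproducible simple random sample of extracted records, check those by hand against the scans, then put an upper bound on the true error rate with a Wilson score interval. The function names and the 95% level (`z=1.96`) are my own assumptions for illustration; whether the resulting bound is acceptable is a judgement for your own study, not something the code decides.

```python
import math
import random


def sample_for_review(record_ids, sample_size, seed=0):
    """Pick a reproducible simple random sample of records to check by hand."""
    ids = list(record_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(sample_size, len(ids)))


def wilson_upper_bound(errors, checked, z=1.96):
    """Upper limit of the Wilson score interval for the true error rate.

    errors:  number of checked records found to be wrong
    checked: number of records checked by hand
    z:       normal quantile (1.96 corresponds to a two-sided 95% interval)
    """
    if checked == 0:
        raise ValueError("no records were checked")
    p = errors / checked
    denom = 1 + z**2 / checked
    centre = p + z**2 / (2 * checked)
    margin = z * math.sqrt(p * (1 - p) / checked + z**2 / (4 * checked**2))
    return (centre + margin) / denom
```

For example, `sample_for_review(range(5000), 300)` picks 300 record numbers to check, and `wilson_upper_bound(0, 300)` then gives the bound if none of them turn out to be wrong. Simple random sampling treats every record as equally risky; if some tables or reports scan worse than others, sampling within each group separately may be the safer choice.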
Hi, there are tools already in use in HNZ that should be able to do this. The Waikato team are using TotalAgility software, which should have this capability, and the software is also in use in other regions. Let me know if you want me to connect you to the Waikato team.
I don’t think so - it is Copilot (i.e. OpenAI) all the way in Health NZ unless there is a special arrangement (such as the one alluded to by @NualaFitz above).
My advice: if you are looking at a solution, make sure you understand not just how accurate it is but, perhaps more importantly, what the user interface is like for managing the ‘exceptions’. In my experience this is critical to making the solution as efficient as possible to use. Reach out for a chat if you think it would help; I’m happy to talk you through my learnings and experience.