Natural language processing (NLP) refers to the interpretation of natural language with the help of machine learning algorithms. Object and pattern recognition mechanisms are used to extract information from written text. Because human language is complex and ambiguous, the extracted information, especially from medical documents, is difficult to interpret. Data may appear as numbers (laboratory values), as standardized result text (such as ECG reports), or as complex free text (such as psychiatric evaluations). Until now, the available technology has been insufficient for reliable analysis.
In a first proof of concept, we evaluated 600 surgical reports of pancreatic resections for correct classification of the reconstruction (either pancreaticojejunostomy or pancreaticogastrostomy). We split the dataset 3:1 into a training set and a test set. A naive Bayes classifier yielded the best results, with an accuracy of 92%.
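The workflow described here (a 3:1 split, a naive Bayes classifier, and accuracy as the metric) can be illustrated with a minimal sketch. The abstract does not state which features, naive Bayes variant, or software were used, so the sketch assumes a bag-of-words representation and a multinomial model with add-one smoothing. All function and class names are hypothetical, and the toy sentences are invented for illustration; they are not taken from the study data.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens; real reports need richer preprocessing.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            # Log prior plus summed log likelihoods avoids numeric underflow.
            score = math.log(n_docs / self.total_docs)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def split_3_to_1(texts, labels, seed=0):
    # Shuffle reproducibly, then keep three quarters for training.
    pairs = list(zip(texts, labels))
    random.Random(seed).shuffle(pairs)
    cut = (3 * len(pairs)) // 4
    return pairs[:cut], pairs[cut:]


def accuracy(model, pairs):
    # Fraction of (text, label) pairs the model classifies correctly.
    return sum(model.predict(t) == y for t, y in pairs) / len(pairs)


# Invented toy sentences, NOT from the study. PJ = pancreaticojejunostomy,
# PG = pancreaticogastrostomy.
TOY_TEXTS = [
    "end to side pancreaticojejunostomy to the jejunal loop",
    "duct to mucosa anastomosis with the jejunum",
    "pancreaticogastrostomy to the posterior gastric wall",
    "pancreatic stump invaginated into the stomach",
]
TOY_LABELS = ["PJ", "PJ", "PG", "PG"]
```

With a real corpus, one would call `split_3_to_1` on the report texts and labels, fit the classifier on the training pairs, and report `accuracy` on the held-out pairs only. For 600 reports, this split gives 450 training and 150 test reports.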
We show that automated classification of surgical reports via natural language processing is possible with acceptable accuracy.