Evaluating ChatGPT for structured data extraction from clinical notes

In a recent study published in npj Digital Medicine, researchers measured ChatGPT's ability to extract structured data from unstructured clinical notes.

Study: A critical assessment of using ChatGPT for extracting structured data from clinical notes. Image Credit: TippaPatt / Shutterstock.com

AI in medicine

Large language models (LLMs), including Generative Pre-trained Transformer (GPT) artificial intelligence (AI) models like ChatGPT, are used in healthcare to improve patient-clinician communication.

Traditional natural language processing (NLP) approaches like deep learning require problem-specific annotations and model training. However, the lack of human-annotated data, combined with the expenses associated with these models, makes building these algorithms difficult.

Thus, LLMs like ChatGPT provide a viable alternative by relying on logical reasoning and knowledge to assist language processing.

About the study

In the present study, researchers developed an LLM-based method for extracting structured data from clinical notes, thereby converting unstructured text into structured and analyzable data. To this end, the ChatGPT 3.5-turbo model was used, as it is associated with specific Artificial General Intelligence (AGI) capabilities.

An overview of the process and framework of using ChatGPT for structured data extraction from pathology reports. a Illustration of the use of OpenAI API for batch queries of ChatGPT service, applied to a substantial volume of clinical notes (pathology reports in our study). b A general framework for integrating ChatGPT into real-world applications.

A total of 1,026 lung tumor pathology reports and 191 pediatric osteosarcoma reports from the Cancer Digital Slide Archive (CDSA), which served as the training set, as well as the Cancer Genome Atlas (TCGA), which served as the testing set, were converted to text using the R program. Text data were subsequently analyzed using the OpenAI API, which extracted structured data based on specific prompts.

The ChatGPT API was used to perform batch queries, followed by prompt engineering to call the GPT service. Post-processing involved parsing and cleaning GPT output, evaluating GPT outcomes against reference data, and obtaining feedback from domain experts. These processes aimed to extract information on TNM staging and histology type as structured attributes from unstructured pathology reports. Tasks assigned to ChatGPT included estimating targeted attributes, evaluating certainty levels, identifying key evidence, and generating a summary.
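The prompt-building and post-processing steps described above can be sketched roughly as follows. This is a minimal illustration and not the study's code: the field names in `FIELDS`, the prompt wording, and the helpers `build_prompt` and `parse_response` are all assumptions, and the actual API call is left out so the sketch runs offline.

```python
import json

# Hypothetical attribute names; the study's real output schema is not given here.
FIELDS = ("pT", "pN", "tumor_stage", "histology")


def build_prompt(report_text: str) -> str:
    """Assemble one prompt asking for the target attributes as a JSON object."""
    keys = ", ".join(f'"{k}"' for k in FIELDS)
    return (
        "You are reading a pathology report. Using the AJCC Cancer Staging "
        "Manual, 7th edition, estimate the following attributes and reply "
        f"with a single JSON object with keys {keys}, "
        '"certainty", "evidence", and "summary". '
        'Use "Unknown" when the report does not support an answer.\n\n'
        f"Report:\n{report_text}"
    )


def parse_response(raw: str) -> dict:
    """Pull the JSON object out of a model reply and clean its fields."""
    unknown = {k: "Unknown" for k in FIELDS}
    # Models often wrap JSON in extra prose, so cut from the first '{' to the last '}'.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return unknown
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return unknown
    cleaned = {}
    for key in FIELDS:
        value = data.get(key)
        text = "" if value is None else str(value).strip()
        cleaned[key] = text or "Unknown"
    return cleaned
```

In a batch setting, `build_prompt` would be applied to each report, the resulting prompt sent to the model, and `parse_response` applied to each reply before comparison with reference data.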

Of the 99 reports acquired from the CDSA database, 21 were excluded due to low scanning quality, near-empty data content, or missing reports. This left a total of 78 genuine pathology reports that were used to develop the prompts. To assess model performance, 1,024 pathology reports were obtained from cBioPortal, 97 of which were eliminated due to overlap with the training data.

ChatGPT was instructed to use the seventh edition of the American Joint Committee on Cancer (AJCC) Cancer Staging Manual for reference. Data analyzed included primary tumor (pT) and lymph node (pN) staging, histological type, and tumor stage. The performance of ChatGPT was compared to that of a keyword search algorithm and a deep learning-based Named Entity Recognition (NER) approach.
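To give a sense of what a keyword search baseline of this kind might look like, here is a small sketch. The keyword lists, the regular expression, and the function names are illustrative assumptions; the study's actual search terms and rules are not described here.

```python
import re

# Illustrative keyword lists only, not the study's actual search terms.
HISTOLOGY_KEYWORDS = {
    "Adenocarcinoma": ["adenocarcinoma"],
    "Squamous cell carcinoma": ["squamous cell carcinoma", "squamous carcinoma"],
}

# Matches staging strings such as "pT2a" or "pT1bN0"; assumed notation.
PT_PATTERN = re.compile(r"\bpT([0-4][ab]?)", re.IGNORECASE)


def keyword_histology(report_text: str) -> str:
    """Return a histology label only when exactly one label's keywords appear."""
    text = report_text.lower()
    hits = [
        label
        for label, terms in HISTOLOGY_KEYWORDS.items()
        if any(term in text for term in terms)
    ]
    return hits[0] if len(hits) == 1 else "Unknown"


def keyword_pt(report_text: str) -> str:
    """Return the first explicitly written pT category, if any."""
    match = PT_PATTERN.search(report_text)
    return "T" + match.group(1).lower() if match else "Unknown"
```

A baseline like this only succeeds when the report states the attribute explicitly, which is one plausible reason a keyword approach would trail a model that can reason from tumor size and other findings.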

A detailed error analysis was conducted to identify the types of misclassifications and their potential causes. The performance of GPT-3.5-Turbo and GPT-4 was also compared.

Study findings

ChatGPT version 3.5 achieved 89% accuracy in extracting pathological classifications from the lung tumor dataset, thus outperforming the keyword algorithm and the NER classifier (accuracies of 0.9, 0.5, and 0.8 for ChatGPT, the keyword algorithm, and the NER classifier, respectively). ChatGPT also accurately classified grades and margin status in osteosarcoma reports, with an accuracy rate of 98.6%.

Model performance was affected by the instructional prompt design, with most misclassifications due to a lack of specific pathology terminologies and improper interpretations of the TNM staging guidelines. ChatGPT accurately extracted tumor information and used the AJCC staging guidelines to estimate tumor stage; however, it often used incorrect rules to distinguish pT categories, such as interpreting a maximum tumor dimension of 2 centimeters as T2.
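The 2-centimeter example can be made concrete with a short size-only rule check. The cutoffs below reflect the commonly cited AJCC 7th edition size thresholds for lung tumors as an assumption to be verified against the manual itself, and the function name `pt_from_size_cm` is hypothetical. Real pT assignment also depends on invasion, location, and other findings, so this sketch covers size alone.

```python
def pt_from_size_cm(size_cm: float) -> str:
    """Map maximum tumor dimension (cm) to a size-only pT category.

    Assumed AJCC 7th edition lung cutoffs: T1a <= 2, T1b <= 3,
    T2a <= 5, T2b <= 7, T3 > 7. Invasion and location are ignored here.
    """
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    if size_cm <= 2:
        return "T1a"
    if size_cm <= 3:
        return "T1b"
    if size_cm <= 5:
        return "T2a"
    if size_cm <= 7:
        return "T2b"
    return "T3"
```

Under these assumed cutoffs, a 2 cm tumor falls in T1a on size alone, which is why labeling it T2 counts as an incorrect rule. A deterministic check like this could be used to flag model outputs that disagree with the size stated in the report.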

In the osteosarcoma dataset, ChatGPT version 3.5 classified margin status and grades with an accuracy of 100% and 98.6%, respectively. ChatGPT-3.5 also performed consistently over time on the pediatric osteosarcoma datasets; however, it frequently misclassified pT, pN, histological type, and tumor stage.

Tumor stage classification performance was assessed using 744 instances with accurate reports and reference data. Among the misclassifications, 22 were due to error propagation, whereas 34 were due to improper rules. Assessing the classification performance of histological diagnosis using 762 instances showed that 17 cases were unknown or had no output, thereby yielding a coverage rate of 0.96.
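Coverage and accuracy of the kind reported here can be computed along the following lines. This is a sketch under assumptions: the `"Unknown"` sentinel, the function name, and the choice to compute accuracy only over answered cases are illustrative, and the study may define these metrics differently.

```python
def coverage_and_accuracy(predictions, references, unknown="Unknown"):
    """Return (coverage, accuracy) for paired predictions and reference labels.

    Coverage is the share of cases with a usable answer; accuracy is
    computed over answered cases only (an assumption of this sketch).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be the same length")
    answered = [
        (pred, ref)
        for pred, ref in zip(predictions, references)
        if pred not in (None, "", unknown)
    ]
    coverage = len(answered) / len(predictions) if predictions else 0.0
    correct = sum(pred == ref for pred, ref in answered)
    accuracy = correct / len(answered) if answered else 0.0
    return coverage, accuracy
```

For example, with four toy cases where one prediction is `"Unknown"` and two of the remaining three match the reference, coverage is 0.75 and accuracy is two thirds.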

The initial model evaluation and prompt-response review identified several problematic instances, such as blank, improperly scanned, or missing reports, which ChatGPT failed to detect in most cases. GPT-4-turbo outperformed the previous model in almost every category, improving performance by over 5%.


ChatGPT appears to be capable of handling massive volumes of clinical notes to extract structured data without requiring considerable task-based human annotation or model training. Taken together, the study findings highlight the potential of LLMs to convert unstructured healthcare information into organized representations, which can ultimately facilitate research and clinical decisions in the future.

Journal reference:

  • Huang, J., Yang, D.M., Rong, R., et al. (2024). A critical assessment of using ChatGPT for extracting structured data from clinical notes. npj Digital Medicine 7(106). doi:10.1038/s41746-024-01079-8