LI-EMRSQL: Linking Information Enhanced Text2SQL Parsing on Complex Electronic Medical Records

Qing Li, Tao You, Jinchao Chen, Ying Zhang, Chenglie Du

Research output: Contribution to journal › Article › peer-review

50 Scopus citations

Abstract

Converting natural language text into executable SQL queries has a significant impact in the healthcare domain, particularly when applied to electronic medical records (EMRs). Because EMRs store extensive patient information in relational multi-table databases, a Text-to-SQL parser would enable the correlation of intricate medical terminology through semantic parsing. A major challenge is designing a versatile Text2SQL parser that generalizes to new databases. A critical step towards this goal is schema linking: accurately identifying references to previously unseen columns or tables during SQL generation. To address these challenges, we propose a novel framework, Linking Information Enhanced Text2SQL Parsing on Complex Electronic Medical Records (LI-EMRSQL). The model applies a detection procedure based on the Poincaré distance metric and uses the induced relations to enhance the performance of existing graph-based parsers and improve schema linking. To enhance the generalizability of LI-EMRSQL, the detection process is completely unsupervised and requires no additional parameters. On two conventional Text2SQL datasets and two EMR Text2SQL datasets, the system delivers state-of-the-art (SOTA) performance. Furthermore, notable improvements in the model's comprehension and alignment of schemas are observed.
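The abstract names the Poincaré distance metric but does not describe how the detection procedure uses it. As a rough illustration only, the sketch below computes the standard Poincaré-ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))), and pairs question tokens with schema items whose embeddings fall within a threshold. The function `link_schema`, the threshold, and the toy embeddings are hypothetical and are not taken from the paper; this is not the authors' procedure.

```python
import math


def poincare_distance(u, v):
    """Standard distance between two points strictly inside the unit ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def link_schema(question_vecs, schema_vecs, threshold):
    """Hypothetical helper (an assumption, not the paper's method):
    pair each question token with every schema item whose embedding
    lies within `threshold` in Poincaré distance."""
    links = []
    for q_name, q_vec in question_vecs.items():
        for s_name, s_vec in schema_vecs.items():
            if poincare_distance(q_vec, s_vec) <= threshold:
                links.append((q_name, s_name))
    return links


# Toy 2-D embeddings, invented purely for illustration.
question_vecs = {"patient": [0.1, 0.0]}
schema_vecs = {"patients.id": [0.12, 0.0], "labs.value": [0.0, 0.9]}
links = link_schema(question_vecs, schema_vecs, threshold=0.5)
```

In this toy setup only the nearby pair is linked. Because distances grow rapidly near the boundary of the ball, points close to the edge are far from almost everything else, which is one reason hyperbolic embeddings are often used for hierarchical data.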

Original language: English
Pages (from-to): 1280-1290
Number of pages: 11
Journal: IEEE Transactions on Reliability
Volume: 73
Issue number: 2
State: Published - 1 Jun 2024

Keywords

  • Electronic medical records
  • Text2SQL
  • health informatics
  • natural language processing
  • semantic parser
