Abstract
Converting natural language text into executable SQL queries has a significant impact on the healthcare domain, particularly when applied to electronic medical records (EMRs). Because EMRs store extensive patient information in relational multi-table databases, a Text2SQL parser would make it possible to correlate intricate medical terminology through semantic parsing. A major challenge is designing a versatile Text2SQL parser that generalizes to new databases. A critical step towards this goal is schema linking: accurately identifying references to previously unseen columns or tables during SQL generation. To address these challenges, we propose a novel framework, Linking Information Enhanced Text2SQL Parsing on Complex Electronic Medical Records (LI-EMRSQL). The model leverages a Poincaré distance metric detection procedure, using the induced relations to enhance the performance of existing graph-based parsers and improve schema linking. To improve the generalizability of LI-EMRSQL, the detection process is completely unsupervised and requires no additional parameters. On two conventional Text2SQL datasets and two EMR Text2SQL datasets, the system delivers state-of-the-art performance. Notable improvements in the model's comprehension and alignment of schemas are also observed.
Original language | English |
---|---|
Pages (from-to) | 1280-1290 |
Number of pages | 11 |
Journal | IEEE Transactions on Reliability |
Volume | 73 |
Issue | 2 |
DOI | |
Publication status | Published - 1 Jun 2024 |