Large language models, such as the GPT series of models used in ChatGPT, are trained on large amounts of text and can predict the probabilities of word sequences in a given language. This capability can be used for a variety of applications, e.g., generating a probable text output based on a user input. The chemical literature also contains vast amounts of text, and quickly performing a comprehensive literature review and extracting useful data and insights for a specific application can be challenging. Large language models could help with this task.
Omar M. Yaghi, University of California, Berkeley, USA, and King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia, and colleagues have used ChatGPT to automate text mining and quickly create datasets from difficult-to-aggregate research on metal-organic frameworks (MOFs). The team curated 228 relevant peer-reviewed research papers and then used ChatGPT to process the relevant sections of the papers and to extract, clean up, and organize the data. ChatGPT successfully extracted 26,257 distinct synthesis parameters for ca. 800 MOFs reported in the selected research articles. It mined the synthetic conditions of the MOFs quickly and with high accuracy.
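The extract-and-organize step described above can be illustrated with a minimal Python sketch. It is not the team's actual code: the field names, the prompt wording, and the pipe-delimited answer format are assumptions made for illustration, and the call to the language model itself is deliberately left out. The sketch only shows how a paragraph could be wrapped in an extraction prompt and how a tabular answer could be turned into structured records.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical field names chosen for illustration; the parameter schema
# used in the study is not described in this article.
FIELDS = ("compound", "metal_source", "linker", "solvent", "temperature_c", "time_h")


@dataclass
class SynthesisRecord:
    compound: str
    metal_source: str
    linker: str
    solvent: str
    temperature_c: Optional[float]
    time_h: Optional[float]


def build_prompt(paragraph: str) -> str:
    """Wrap a synthesis paragraph in an (assumed) extraction instruction."""
    header = " | ".join(FIELDS)
    return (
        "Extract MOF synthesis conditions from the text below.\n"
        f"Answer with one row per compound in the form: {header}\n"
        "Write N/A for anything the text does not state.\n\n"
        f"Text: {paragraph}"
    )


def _to_float(cell: str) -> Optional[float]:
    """Convert a numeric cell to float; non-numeric cells such as N/A become None."""
    try:
        return float(cell)
    except ValueError:
        return None


def parse_response(response: str) -> List[SynthesisRecord]:
    """Turn a pipe-delimited model answer into records, skipping bad rows."""
    records = []
    for line in response.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header row and any line with the wrong number of columns.
        if len(cells) != len(FIELDS) or cells[0].lower() == FIELDS[0]:
            continue
        records.append(
            SynthesisRecord(
                compound=cells[0],
                metal_source=cells[1],
                linker=cells[2],
                solvent=cells[3],
                temperature_c=_to_float(cells[4]),
                time_h=_to_float(cells[5]),
            )
        )
    return records


# Invented placeholder answer, not data from the paper.
example_answer = (
    "compound | metal_source | linker | solvent | temperature_c | time_h\n"
    "MOF-X | salt A | linker B | DMF | 120 | N/A\n"
    "this line is not a table row"
)
records = parse_response(example_answer)
```

Asking the model to write N/A instead of guessing, and discarding rows that do not match the expected shape, are simple ways to keep uncertain or malformed output out of the resulting dataset.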
The extracted datasets can then be used to build predictive models, which might help chemists to develop new MOFs. Using the data gathered by text mining, the team created a machine-learning model that achieved 87 % accuracy in predicting the outcomes of MOF crystallization experiments.
According to the researchers, the text-mining approach requires only minimal coding knowledge and can be easily transferred to other contexts. Further exploration of large language models for AI-assisted chemistry could thus be useful for accelerating research.
- ChatGPT Chemistry Assistant for Text Mining and the Prediction of MOF Synthesis,
Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi,
J. Am. Chem. Soc. 2023.
https://doi.org/10.1021/jacs.3c05819