- Address
- 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan
Research
- Keywords
Generative AI, Active Learning, Bayesian Optimisation, Knowledge Retrieval, Crystal Structure Generation, Molecular Structure Generation, Materials Design, Language Models, Deep Learning, Machine Learning, Natural Language Processing, Organic/Inorganic Chemistry
Generative AI, Active Learning, Materials Optimisation, Knowledge Retrieval, Artificial Intelligence, Materials Science, Theoretical Physics and Chemistry
Publications
Research results and publications produced under NIMS affiliation since 2004 are shown.
Papers
- Guillaume Lambard, Ekaterina Gracheva. SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors. Machine Learning: Science and Technology. 1 [2] (2020) 025004 10.1088/2632-2153/ab57f3 Open Access
- Sirawit Pruksawan, Guillaume Lambard, Sadaki Samitsu, Keitaro Sodeyama, Masanobu Naito. Prediction and optimization of epoxy adhesive strength from a small dataset through active learning. Science and Technology of Advanced Materials. 20 [1] (2019) 1010-1021 10.1080/14686996.2019.1673670 Open Access
- Asep Sugih Nugraha, Guillaume Lambard, Jongbeom Na, Md Shahriar A. Hossain, Toru Asahi, Watcharop Chaikittisilp, Yusuke Yamauchi. Mesoporous trimetallic PtPdAu alloy films toward enhanced electrocatalytic activity in methanol oxidation: unexpected chemical compositions discovered by Bayesian optimization. Journal of Materials Chemistry A. 8 [27] (2020) 13532-13540 10.1039/d0ta04096g
Proceedings
- MOREAU, Louis Etienne, LAMBARD, Guillaume, Stephane Gorsse, MURAKAMI, Hideyuki. Study of oxidation resistance of Co-Cr-Ta ternary alloy system for ultra-high temperature applications. International Symposium on High temperature Oxidation and Corrosion 2022 Abstracts. 1 (2022) 31-34
Oral presentations
- Adroit Fajar, LAMBARD, Guillaume. Molecular Design with Generative AI for CO2 Capture. International Symposium on Green Transformation Initiative and Innovative Zero-Carbon Energy Systems (GXI-ZES). 2025
- Abdullah Al Abdulghani, Nobutaka Maeda, Adroit Fajar, LAMBARD, Guillaume. Towards Low-Energy Input CO2 Capture: Lowering the Regeneration Temperature by Functionalisation of Polyethylenimine. WPI-SKCM2 and WPI-I2CNER Joint Symposium. 2024
- Adroit Fajar, LAMBARD, Guillaume. Fine-tuning LLM for Ionic Liquid Design in CO2 Capture. WPI-SKCM2 and WPI-I2CNER Joint Symposium. 2024
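Published patent applications
Society memberships
The Materials Research Society of Japan (MRS-J)
Center for Basic Research on Materials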
SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors
Cheminformatics, Small molecules, SMILES, Natural language processing, Machine learning, Attention mechanism, Small datasets
Overview
Key Takeaways:
• The SMILES-X is an autonomous pipeline that uses machine learning to predict physicochemical properties of molecular compounds, overcoming the challenges of small datasets and the need for task-specific descriptors.
• The SMILES-X achieves state-of-the-art results in predicting aqueous solubility, hydration free energy, and octanol/water distribution coefficient of molecular compounds.
• The SMILES-X is a valuable tool for materials scientists and chemists, providing interpretable predictions and improving the accuracy of physicochemical property inference.
Novelty and originality
• The SMILES-X is an autonomous pipeline for molecular compounds characterization
• The SMILES-X is based on a neural architecture with data-specific Bayesian hyper-parameter optimization
• The attention mechanism in the SMILES-X enables the interpretation of output predictions
• The SMILES-X shows state-of-the-art results in the inference of aqueous solubility, hydration free energy, and octanol/water distribution coefficient of molecular compounds
• The source code for the SMILES-X is available at https://github.com/Lambard-ML-Team/SMILES-X
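As a rough illustration of how an attention layer can make predictions interpretable, the sketch below scores each SMILES token, normalises the scores with a softmax to obtain attention weights, and pools the token vectors with those weights. It is a minimal sketch and not the SMILES-X implementation: the function names, the toy hidden states, and the weight vector are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, weights, bias=0.0):
    """Soft attention over per-token vectors (hypothetical helper).

    hidden_states: list of per-token vectors (lists of floats)
    weights: scoring vector with the same dimension as each token vector
    Returns the attention weights and the attention-weighted context vector.
    """
    scores = [
        math.tanh(sum(h_i * w_i for h_i, w_i in zip(h, weights)) + bias)
        for h in hidden_states
    ]
    alpha = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h[d] for a, h in zip(alpha, hidden_states)) for d in range(dim)
    ]
    return alpha, context

# Toy example: hand-made 2-dimensional "hidden states" for Cc1ccc(O)cc1C.
# In a trained model these would come from the recurrent layers.
tokens = ["C", "c", "1", "c", "c", "c", "(", "O", ")", "c", "c", "1", "C"]
hidden = [[0.1 * i, 1.0 if t == "O" else 0.0] for i, t in enumerate(tokens)]
alpha, context = attention_pool(hidden, weights=[0.0, 2.0])

for token, weight in zip(tokens, alpha):
    print(f"{token}: {weight:.3f}")
```

With these toy inputs the oxygen token receives the largest weight, which is the kind of per-token signal an attention map visualises.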
Details
• The SMILES-X is a novel approach that tackles both the issue of small datasets and the difficulty of developing task-specific descriptors, making it a valuable asset in the toolkit of materials scientists and chemists.
• The SMILES-X can be used in various applications such as drug discovery, material design, and chemical synthesis. It can help researchers predict the physicochemical properties of molecular compounds accurately and efficiently, which can save time and resources.
• The marketability of the SMILES-X depends on its ability to accurately predict the physicochemical properties of molecular compounds and its ease of use. If the method proves accurate and efficient, it could be a valuable tool for researchers in materials science and related fields.
• A potential exit strategy for the SMILES-X is to license the technology to companies in materials science and related fields, or to develop a software tool that researchers can use to predict the physicochemical properties of molecular compounds.
F1: The SMILES-X pipeline
F2: Fixed skeleton of the neural architecture in the SMILES-X
F3: Visualisation of the importance of each token within the SMILES towards the final prediction of the property of interest. The illustration uses the structure Cc1ccc(O)cc1C from the FreeSolv [31] dataset, with hydration free energy as the corresponding property. The 1D (a) and 2D (b) attention maps show the projections of the attention vector α on the SMILES string and molecular graph, respectively. The redder and darker the colour, the stronger the attention on a given token. The temporal relative distance is shown in (c). The closer the distance value is to zero, the closer the temporary prediction on the SMILES fragment is to the prediction for the whole SMILES.
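The temporal relative distance in the caption above can be sketched as follows: split the SMILES into tokens, predict the property on each growing prefix, and compare it with the prediction on the full string. This is only a sketch under stated assumptions. The tokenizer is a simplified regular expression rather than the SMILES-X tokenizer, `toy_predict` is a hypothetical stand-in for a trained model, and the distance formula (prefix prediction minus full prediction, divided by the absolute full prediction) is an assumed definition that may differ from the one used in the paper.

```python
import re

# Simplified SMILES tokenizer (assumption): bracket atoms, two-letter
# halogens, two-digit ring closures, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

def tokenize(smiles):
    return TOKEN_PATTERN.findall(smiles)

def relative_distances(tokens, predict):
    """Relative distance between each prefix prediction and the full one.

    Assumes the full prediction is non-zero, since it is used as the scale.
    """
    full = predict(tokens)
    return [
        (predict(tokens[:n]) - full) / abs(full)
        for n in range(1, len(tokens) + 1)
    ]

def toy_predict(tokens):
    # Hypothetical stand-in for a trained model: a fixed contribution
    # per atom type, with no chemical meaning.
    return (
        -1.0 * tokens.count("C")
        - 0.5 * tokens.count("c")
        - 3.0 * tokens.count("O")
    )

tokens = tokenize("Cc1ccc(O)cc1C")
distances = relative_distances(tokens, toy_predict)

for n, d in enumerate(distances, start=1):
    print("".join(tokens[:n]), round(d, 3))
```

The last value is zero by construction, because the final prefix is the whole SMILES.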
Summary
• The SMILES-X can be applied in various areas such as drug discovery, material design, and chemical synthesis to predict the physicochemical properties of molecular compounds accurately and efficiently.
• The SMILES-X can be integrated with other materials informatics methods to create a more comprehensive toolkit for researchers in materials science and related fields.