

View Position
Position Title: Learning from History: Using past genetic algorithm searches to seed future runs with untested, chemically relevant structures
Summary: Genetic algorithms are used to search for the global minimum structure of a chemical system with a given stoichiometry. The goal of this project is to deliberately create chemically intuitive starting populations in unexplored regions of the potential energy surface. Doing so both broadens and targets the search for the global minimum structure, so that computational resources are used efficiently over the course of many runs.
Job Description: A typical genetic algorithm runs in a loop in which trial structures are selected from an exclusive population and are paired and/or mutated. Better molecular configurations replace weaker candidates in the population. We have created an intelligent molecule creator that uses standard atomic orbital hybridization geometries to build randomly generated systems that are chemically relevant. Having demonstrated the benefit of improved starting populations, we now need a systematic search scheme that leverages these improved starting conditions.
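The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's code: `toy_energy` is a hypothetical stand-in for the quantum-mechanical energy, structures are flat lists of numbers rather than real molecules, and the function names and parameters are invented for the example.

```python
import random


def toy_energy(coords):
    """Hypothetical stand-in for a real energy calculation: a quadratic bowl."""
    return sum(x * x for x in coords)


def mutate(coords, rng, step=0.3):
    """Randomly displace every coordinate by at most `step`."""
    return [x + rng.uniform(-step, step) for x in coords]


def crossover(parent_a, parent_b, rng):
    """Pair two parents by splicing them at a random cut point."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def run_ga(population, energy, n_iterations=200, seed=0):
    """Minimal genetic algorithm loop: select, pair, mutate, replace."""
    rng = random.Random(seed)
    pop = sorted(population, key=energy)  # best (lowest energy) first
    for _ in range(n_iterations):
        parent_a, parent_b = rng.sample(pop, 2)
        child = mutate(crossover(parent_a, parent_b, rng), rng)
        # A better child replaces the weakest candidate in the population.
        if energy(child) < energy(pop[-1]):
            pop[-1] = child
            pop.sort(key=energy)
    return pop
```

Because a child only ever replaces the weakest member, the best structure in the population can never get worse from one iteration to the next.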

The student will use existing protocols in Python to make unique groups of intelligently created candidate structures based on the results of previous runs. The successful intern will apply machine learning algorithms to previously conducted genetic algorithm runs in order to train an initial population generator that seeds the starting population of the current run. Each successive run of the genetic algorithm, having learned what has already been searched, can then examine novel regions, which reduces duplication of candidate structures. The result is an overall search strategy that is continually focused on unexplored regions. During the test phase, we will use a stoichiometry with a known global minimum. Typically, the success of a genetic algorithm run is measured by the number of iterations required to find the global minimum in that run. With this holistic approach, the decision to terminate a search will depend more on the extent of the potential energy surface that has been explored and on the likelihood of finding anything more favorable than the current best structure. This procedure is expected to reduce the number of candidate structures tested over all genetic algorithm runs by limiting the duplicate trial structures typically seen across multiple runs. The new strategy will then be tested on a variety of trial systems in order to determine a generic procedure that works on a broad set of chemical systems.
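One simple way to picture seeding a run away from what has already been searched is a novelty filter. The sketch below is an assumption made for illustration and is not the machine learning approach the project will develop: it describes each structure by its sorted interatomic distances and accepts a new candidate only if that description is far enough from every structure already explored. The names `fingerprint`, `seed_population`, and `min_distance` are hypothetical.

```python
import math


def fingerprint(coords):
    """Sorted interatomic distances of a structure given as (x, y, z) tuples.

    Unchanged by rotation, translation, and atom ordering. It ignores
    element types, which is a simplification made for this sketch.
    """
    n = len(coords)
    return sorted(
        math.dist(coords[i], coords[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)))


def seed_population(generate, explored, size, min_distance, max_tries=10000):
    """Collect up to `size` candidates that differ from everything in `explored`.

    `generate` is any zero-argument callable that returns a new structure.
    """
    seen = [fingerprint(s) for s in explored]
    seeds = []
    for _ in range(max_tries):
        if len(seeds) == size:
            break
        candidate = generate()
        fp = fingerprint(candidate)
        # Keep the candidate only if it is novel relative to past runs
        # and to the seeds already accepted for this run.
        if all(fingerprint_distance(fp, old) >= min_distance for old in seen):
            seeds.append(candidate)
            seen.append(fp)
    return seeds
```

In practice `generate` would be the intelligent molecule creator and `explored` the structures recorded from earlier runs; a trained generator could replace the fixed distance threshold used here.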
Use of Blue Waters: To test the effectiveness of global optimisation search schemes, hundreds of runs comprising thousands of trial structures must be calculated to obtain the statistics needed to quantify improvements. Local resources here at CSUF will be used to prototype initial population generation protocols. For promising candidates, the resources at Blue Waters will be used to run all the necessary calculations in parallel and so reach the statistical accuracy needed to benchmark improvements. The parallelization available at Blue Waters will also enhance the chemical accuracy of the quantum mechanics based calculations, as well as the clustering algorithm's ability to categorize structures. This will be useful when the improved code is applied to new, catalytically relevant systems to determine important chemical properties.
Conditions/Qualifications: Must be an undergraduate at CSUF
Must have Python programming experience
Start Date: 05/31/2018
End Date: 05/31/2019
Location: Groves Research Group
Department of Chemistry and Biochemistry
California State University, Fullerton
Fullerton, CA
Carlos Barragan