Proteins are crucial for the function of cells in living organisms. Now, researchers have developed a machine-learning-led process capable of designing proteins that rival the functionality of those found in nature.
Proteins consist of hundreds or thousands of amino acids, whose sequence specifies both structure and function. Although past methods have been able to design proteins with the proper structure, designing for function has been more difficult.
To address this problem, Rama Ranganathan, a professor of biochemistry and molecular biology at the University of Chicago, and his colleagues realized that genome databases have grown exponentially over the last 15 years and now contain massive amounts of data about both protein structure and function. By building mathematical models from that data, they were able to use machine-learning methods to uncover new information about the rules of protein design.
"We generally assume that to build something, you have to first deeply understand how it works," says Ranganathan. "But if you have enough data examples, you can use deep learning methods to learn the rules of design, even as you are understanding how it works or why it's built that way."
In the research, the team studied the chorismate mutase family of metabolic enzymes, a kind of protein essential for bacteria, fungi, and plants. With their machine-learning models, they were then able to decode the design rules behind the proteins.
To test the designs, the researchers created synthetic genes encoding the proteins, cloned them into bacteria, and watched the bacteria proliferate. They found that the artificial proteins had the same catalytic function as natural chorismate mutase proteins.
The researchers will now work to understand how the models came to their conclusions. They also hope to use their new platform to develop proteins that may be able to address societal problems like climate change. Ranganathan and Associate Professor Andrew Ferguson have already created a company called Evozyne to commercialize their technology in domains spanning energy, environment, catalysis, and agriculture.
"This system gives us a platform for rationally engineering protein molecules in a way that we always dreamed we could," says Ranganathan. "Not only can it teach us the physics of how proteins work and how they evolve, it can help us find solutions for issues like carbon capture and energy harvesting. Even more generally, the studies in proteins might even help teach us how the deep neural networks behind modern machine learning actually work."