How can large language models (LLMs), which are often designed to handle complex tasks, be improved at the simple tasks they still stumble on? This is what a recently submitted study hopes to address, as a team of researchers led by the Massachusetts Institute of Technology (MIT) introduced CodeSteer, which is designed to improve LLM performance on simple tasks by steering the model toward better problem-solving strategies. This study has the potential to help researchers and engineers develop more efficient machine learning models capable of seamlessly performing tasks of all kinds.
For the study, the researchers tested CodeSteer as an LLM “trainer,” providing guidance that enables an LLM to perform simple tasks it wasn’t initially designed to perform. Paired with the larger LLM, CodeSteer reviews its answers and recommends an appropriate response strategy, specifically code generation or revised text prompts, whenever the model is presented with a problem it’s never faced before. If an answer falls short, CodeSteer will even encourage the LLM to try different approaches.
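To make that “trainer” idea a little more concrete, here is a minimal Python sketch of what such a steering loop could look like. To be clear, this is an illustration under assumptions, not CodeSteer’s actual code: the names `steer`, `suggest_strategy`, `solve`, and `check_answer` are all hypothetical stand-ins for the roles the article describes.

```python
# Hypothetical sketch of a steering loop: a small "steerer" reviews a
# larger model's answers and nudges it between code-based and text-based
# reasoning. None of these names come from the CodeSteer paper.

from typing import Callable, List, Tuple

Guidance = str                   # e.g. "reason in text" or "write and run code"
Attempt = Tuple[Guidance, str]   # (strategy used, answer produced)

def steer(
    suggest_strategy: Callable[[str, List[Attempt]], Guidance],
    solve: Callable[[str, Guidance], str],
    check_answer: Callable[[str, str], bool],
    question: str,
    max_rounds: int = 5,
) -> str:
    """Query the solver repeatedly, letting the steerer pick each strategy."""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        guidance = suggest_strategy(question, history)  # steerer's advice
        answer = solve(question, guidance)              # larger LLM's attempt
        history.append((guidance, answer))
        if check_answer(question, answer):              # steerer verifies
            return answer
    return history[-1][1]  # best effort after max_rounds

# Toy usage: a steerer that tries text reasoning first, then switches to code.
if __name__ == "__main__":
    strategies = ["reason in text", "write and run code"]
    answer = steer(
        suggest_strategy=lambda q, h: strategies[min(len(h), 1)],
        solve=lambda q, g: "4" if "code" in g else "5",  # stand-in for an LLM
        check_answer=lambda q, a: a == "4",
        question="What is 2 + 2?",
    )
    print(answer)  # -> "4", found on the second (code-based) attempt
```

The point of the toy run is the shape of the interaction: the first attempt fails, the steerer switches strategies, and the second attempt succeeds, which mirrors how CodeSteer reportedly nudges a model to try different options.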
“There is a race to develop better and better models that are capable of doing everything, but we’ve taken a complementary approach,” said Dr. Chuchu Fan, an associate professor of aeronautics and astronautics (AeroAstro), principal investigator in the MIT Laboratory for Information and Decision Systems (LIDS), and a co-author on the study. “Researchers have spent years developing effective technologies and tools to tackle problems in many domains. We want to enable LLMs to select the right tools and methods and make use of others’ expertise to enhance their own capabilities.”
Going forward, the researchers aim to make CodeSteer both faster and better at reasoning, which should further improve its accuracy and efficiency.
How will MIT’s CodeSteer help improve language models in the coming years and decades? Only time will tell, and this is why we science!
As always, keep doing science & keep looking up!
Sources: arXiv, EurekAlert!