Researchers at Google Research and Google DeepMind have developed a method for augmenting large language models (LLMs) without retraining or expensive fine-tuning. The approach lets developers extend existing language models with new abilities, improving performance on existing tasks and enabling tasks that neither model could perform on its own. The research used Google's PaLM2-S LLM, described as comparable to GPT-4, and demonstrated significant improvements in translation and coding tasks when the model was augmented with smaller, specialized language models. The development could also help address legal challenges related to copyright concerns in the AI sector.


Key Points:

  • Augmenting LLMs: Google researchers have developed a method to augment large language models (LLMs) with other language models without the need for retraining or costly fine-tuning sessions.

  • Performance Improvement: The approach improved performance on existing tasks and enabled new tasks that the models couldn't perform individually.

  • Benchmarking with PaLM2-S: The research used Google's PaLM2-S LLM, described as comparable to GPT-4, for benchmarking. The experiments included tasks such as translation and coding.

  • Translation Tasks: In translation tasks, the augmented model showed up to a 13% improvement over the baseline, particularly when translating low-resource languages into English.

  • Coding Tasks: The hybrid model showed a 40% relative improvement over the base model on code generation and explanation tasks, performing on par with fully fine-tuned counterparts.

  • Potential Legal Implications: The research addresses challenges in the AI sector related to legal issues, including copyright concerns. Large language model developers have faced lawsuits alleging the use of copyrighted data for training.

  • Sustainability of AI Services: The development has potential implications for addressing the scalability and cost challenges of training large language models, making them more sustainable in a regulated AI landscape.
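The article doesn't detail the mechanism, but in the underlying research the two frozen models are reportedly bridged by a small set of trainable cross-attention parameters, so neither base model's weights change during augmentation. Below is a minimal, hypothetical NumPy sketch of that wiring, with fixed embedding tables standing in for the frozen models; all names, sizes, and weights are illustrative, not Google's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, D_ANCHOR, D_AUG = 100, 8, 4

# Frozen models, represented here by fixed embedding tables.
# In the research these would be full LLMs (e.g. a large anchor
# model and a small specialist); their weights never change.
ANCHOR_EMB = rng.standard_normal((VOCAB, D_ANCHOR))
AUG_EMB = rng.standard_normal((VOCAB, D_AUG))

# The only trainable pieces: cross-attention projections that let
# the anchor model's states attend to the specialist's states.
W_q = rng.standard_normal((D_ANCHOR, D_ANCHOR)) * 0.1
W_k = rng.standard_normal((D_AUG, D_ANCHOR)) * 0.1
W_v = rng.standard_normal((D_AUG, D_ANCHOR)) * 0.1

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compose(tokens):
    """Combine frozen anchor and specialist representations via a
    learned cross-attention connector; the base models stay frozen."""
    h_a = ANCHOR_EMB[tokens]   # (seq, D_ANCHOR) anchor states
    h_b = AUG_EMB[tokens]      # (seq, D_AUG) specialist states
    q = h_a @ W_q
    k = h_b @ W_k
    v = h_b @ W_v
    attn = softmax(q @ k.T / np.sqrt(D_ANCHOR))
    # Residual add: anchor states enriched with attended specialist info.
    return h_a + attn @ v

out = compose(np.array([1, 5, 7]))
print(out.shape)  # (3, 8)
```

Because only the small connector would be trained, this composition avoids retraining either base model, which is what makes the approach cheap relative to full fine-tuning.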

Conclusion: Google's method for augmenting large language models without retraining or fine-tuning represents a significant advancement in the AI sector. It lets developers enhance the capabilities of existing models, improving performance on existing tasks and enabling new ones. The development could help address copyright-related legal challenges in the AI sector and contribute to the sustainability of AI services in regulated environments.


(TRISTAN GREENE, COINTELEGRAPH, 2023)