Microsoft Research Asia and Peking University develop COLE, an AI tool that generates graphic designs with editable text and visual elements.
Graphic designers who rely on traditional design methods may soon face disruption from COLE, an AI tool developed by Microsoft Research Asia and Peking University. Named after Henry Cole, the creator of the first graphical Christmas card, COLE uses a combination of AI models and an open-source graphics renderer to generate graphic designs from text prompts. This article explores COLE’s capabilities and its potential impact on the graphic design industry.
A Framework for Graphic Design Generation
COLE is currently more of a framework than a finished product. It combines various AI models, including fine-tuned versions of Meta’s Llama2-13B, DeepFloyd IF, LLaVA1.5-13B, and GPT-4V, along with the open-source graphics renderer Skia. The researchers chose this combination to address the complexity of graphic design and the lack of training data for SVG files. Instead of working with SVG files directly, COLE consolidates SVG elements and embellishments into a unified image layer, allowing the AI to extract the background layer and describe it in text.
Impressive Results and Editable Elements
COLE produces high-quality graphic designs by generating crisp, organized visuals combined with stylized text. Unlike other AI art generators, COLE successfully incorporates editable blocks for text and objects within the image. This feature allows users to make changes directly within the COLE framework, eliminating the need to export the design to other programs for revision. Users can modify the text displayed, change fonts, and introduce new prompts for different visual elements, providing a flexible editing space.
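To make the idea of editable blocks concrete, here is a minimal sketch of how a design with separate text and visual layers could be represented. It is an illustration only, not COLE’s actual data model or API: the names `TextBlock`, `VisualLayer`, `Design`, `with_text`, and `with_object_prompt` are all hypothetical, as are the sample prompts and font names.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class TextBlock:
    """An editable typography block (hypothetical structure)."""
    text: str
    font: str
    color: str


@dataclass(frozen=True)
class VisualLayer:
    """An image layer, identified by the prompt that describes it."""
    name: str
    prompt: str


@dataclass(frozen=True)
class Design:
    """A design kept as separate layers so each one can be edited alone."""
    background: VisualLayer
    objects: VisualLayer
    text_blocks: Tuple[TextBlock, ...]

    def with_text(self, index: int, **changes) -> "Design":
        # Return a copy with one text block changed; image layers are untouched.
        blocks = list(self.text_blocks)
        blocks[index] = replace(blocks[index], **changes)
        return replace(self, text_blocks=tuple(blocks))

    def with_object_prompt(self, prompt: str) -> "Design":
        # Return a copy whose object layer carries a new prompt.
        return replace(self, objects=replace(self.objects, prompt=prompt))


# Example values are invented for illustration.
poster = Design(
    background=VisualLayer("background", "soft winter landscape"),
    objects=VisualLayer("objects", "a decorated pine tree"),
    text_blocks=(TextBlock("Season's Greetings", "Serif", "#FFFFFF"),),
)

# Change the wording and font without regenerating any image layer.
edited = poster.with_text(0, text="Happy Holidays", font="Sans")

# Ask for a different visual element while keeping the text as edited.
restyled = edited.with_object_prompt("a snow-covered cabin")
```

Keeping text and imagery in separate layers is what would let a wording or font change leave the artwork untouched, which matches the kind of in-place revision described above.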
Competitive Quality and Potential for Improvement
The results generated by COLE are highly competitive in terms of quality, even compared to the latest version of OpenAI’s DALL-E. The researchers tested COLE on 200 different graphic design projects, ranging from advertisements to event promotions and marketing materials. However, COLE currently has limitations, such as the inability to change the arrangement or placement of typography blocks, limited color options for typography, and the absence of multiple typography block placements. The researchers acknowledge these issues and plan to address them in future work.
Threat or Complement to Graphic Designers?
COLE’s capabilities raise questions about the future of graphic designers. While the tool allows users to refine the output and integrate human expertise when necessary, it also empowers those without graphic design training to generate high-quality designs. COLE’s ease of use and its ability to produce strong graphic designs from a simple statement of user intent may challenge the need for extensive professional expertise in the field. However, the researchers emphasize that graphic design training is still valuable for achieving the best results with the AI framework.
Conclusion
COLE, the AI tool developed by Microsoft Research Asia and Peking University, has the potential to disrupt the graphic design industry. By combining various AI models and an open-source graphics renderer, COLE generates high-quality graphic designs based on text prompts. The tool’s ability to incorporate editable text and visual elements within the image provides users with a flexible editing space. While COLE may raise concerns about the future of graphic designers, it also offers opportunities for those without design training to create professional-level designs. As COLE continues to evolve, it may serve as a complement or a threat to traditional graphic design methods, depending on how it is utilized.
