Although Apple is not yet among the top players in the field of artificial intelligence, a new open source AI model for image editing shows what the company can contribute to the field. The model, called MLLM-Guided Image Editing (MGIE), uses multimodal large language models (MLLMs) to interpret text commands when manipulating images. In other words, the tool can edit photos based on text commands that the user enters.
The company developed MGIE together with researchers from the University of California, Santa Barbara. MLLMs can transform simple or ambiguous text instructions into more detailed and precise ones, which the photo editor itself can then follow. For example, if a user wants to edit a photo of a pepperoni pizza to make it look healthier, the MLLM can interpret that as "add a vegetable garnish" and edit the photo accordingly.
In addition to making major changes to images, MGIE can also crop, resize, and rotate photos, as well as adjust their brightness, contrast, and color balance, all via text instructions. It can also edit specific areas of a photo, for example adjusting the hair, eyes, or clothing of a person in it, or removing elements in the background.
Apple released the model via GitHub, but those interested can also try out a demo version, which is currently hosted on Hugging Face Spaces. Unfortunately, Apple has not yet said whether it will incorporate anything from this project into any of its products.