What I learnt this week
Apple Intelligence architecture: At WWDC, Apple announced new on-device and server foundation models as part of Apple Intelligence, a personal intelligence system integrated into iOS 18, iPadOS 18, and macOS Sequoia. These foundation models are designed to handle a range of tasks, including writing and refining text, summarizing notifications, creating images, and taking actions in apps. In the demo, the AI features seem seamlessly integrated across multiple tasks, and Siri appears to have received a much-needed upgrade, powered by language models. I was curious about the high-level architecture used here, and this blog post by the folks at Predibase does a great job of explaining it. Key points:
Starting Small: Apple started with a relatively small language model (3B parameters) and then used a parameter-efficient fine-tuning technique called LoRA (Low-Rank Adaptation) to create multiple adapters tailored for each of the individual tasks the model needs to handle.
LoRA Fine-Tuning: Instead of updating the base model's parameters for each task, LoRA trains a small set of new low-rank weight matrices (e.g., alongside the self-attention projections) that are tuned for a specific task. The original weights of the “main” model stay intact (see the sketch after this list).
Adapters: Each adapter is very small (usually less than 1% of the base model's parameters) but encodes how to adapt the model's responses to a particular task.
Dynamic Serving: Once Apple creates all of these fine-tuned task-specific adapters, a dynamic serving system hot-swaps them in and out on top of the same base Apple model (a sketch of this follows below).
Benefits: This approach enables a single small language model (SLM) to handle many tasks with GPT-4-like performance at a fraction of the size, and it is significantly less expensive to train than an LLM from scratch.
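To make the LoRA mechanics concrete, here is a minimal PyTorch sketch of a low-rank adapter wrapped around a frozen linear layer (e.g., a self-attention projection). The class name, rank, and scaling values are illustrative assumptions, not Apple's actual implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update.

    The forward pass computes W x + (alpha / r) * B A x, where W stays
    frozen and only A (r x in_features) and B (out_features x r) train.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the "main" model weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        # B starts at zero, so training begins from the unmodified base model.
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

For a 4096x4096 projection, the frozen weight has ~16.8M parameters while a rank-8 adapter adds only 2 × 8 × 4096 ≈ 65K trainable ones, which is where the "less than 1%" figure comes from.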
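And a sketch of the dynamic-serving side: because each adapter checkpoint holds only the tiny lora_A/lora_B tensors, serving a different task is a cheap partial state-dict load on top of the resident base model. The registry, paths, and task names below are hypothetical, not Apple's API:

```python
import torch
import torch.nn as nn

# Hypothetical adapter registry: each checkpoint holds only the lora_A / lora_B
# tensors for one task (paths and task names are illustrative).
ADAPTER_PATHS = {
    "summarize_notifications": "adapters/summarize.pt",
    "rewrite_text": "adapters/rewrite.pt",
}

def swap_adapter(model: nn.Module, task: str) -> None:
    """Hot-swap the per-task adapter onto the shared base model."""
    adapter_state = torch.load(ADAPTER_PATHS[task])
    # strict=False: only the adapter tensors present in the checkpoint are
    # overwritten; the frozen base weights stay in memory untouched.
    model.load_state_dict(adapter_state, strict=False)
```

Since the adapters are so small, switching tasks is closer to a dictionary lookup than a model reload, which is what makes serving many tasks from one on-device base model practical.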
ARM announces new chips/software for smartphones: Arm introduced a new CPU core, the Cortex-X4, designed specifically for AI workloads. Arm says the Cortex-X4 is the fastest CPU it has made so far and delivers 15% more performance than its predecessor, the Cortex-X3, with a focus on enabling artificial intelligence and machine learning-based apps. Arm also announced a new platform for mobile computing called Arm Total Compute Solutions 2023 (TCS23), which includes IP like the Immortalis GPU, Armv9 CPUs, and software enhancements. Overall, Arm's goal is to provide a more complete package for smartphone makers, allowing them to develop next-generation devices with efficient and powerful AI capabilities. This could translate to features like improved facial recognition, enhanced camera functionality, better virtual assistants, and overall smoother user experiences on smartphones.