2024-06-28-LLM As Compiler


Here’s today at a glance:

LLM As Compiler

Meta has released what is likely the most useful open-source language model of the week, and it deserves a little explanation.

Paper Title: Meta Large Language Model Compiler: Foundation Models of Compiler Optimization

Compilers convert source code (your C++, Java, etc.) into assembly language. Along the way they first produce an Intermediate Representation (IR) and then optimize it. Optimization involves choosing which flags and passes to apply, and in what order; the end product is the assembly code.
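To make the pipeline concrete, here is a minimal sketch of the three stages as LLVM command lines, built (but not executed) in Python. It assumes the standard `clang`, `opt`, and `llc` tools; the file names and the two-pass list are placeholders chosen for illustration, not anything taken from the paper.

```python
from typing import List


def build_pipeline(source: str, passes: List[str]) -> List[List[str]]:
    """Return command lines for source -> IR -> optimized IR -> assembly.

    The commands are only constructed, never run. Output file names are
    illustrative placeholders derived from the source file name.
    """
    stem = source.rsplit(".", 1)[0]
    unopt_ir = f"{stem}.ll"
    opt_ir = f"{stem}.opt.ll"
    asm = f"{stem}.s"
    return [
        # 1. Front end: emit textual LLVM-IR without optimizing it.
        #    -disable-O0-optnone keeps functions eligible for later passes.
        ["clang", "-O0", "-Xclang", "-disable-O0-optnone",
         "-S", "-emit-llvm", source, "-o", unopt_ir],
        # 2. Optimizer: apply the chosen pass list to the IR.
        ["opt", "-S", f"-passes={','.join(passes)}", unopt_ir, "-o", opt_ir],
        # 3. Back end: lower the optimized IR to assembly.
        ["llc", opt_ir, "-o", asm],
    ]


commands = build_pipeline("example.c", ["instcombine", "simplifycfg"])
for cmd in commands:
    print(" ".join(cmd))
```

The pass list in step 2 is the part the model is asked to choose; everything else is ordinary compiler plumbing.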

Because the universe of possible programs is large, automating flag selection has been a hard problem. Meta's approach is to build an LLM that can predict what a compiler would do without actually running it.

Specifically, the LLM has been trained to

a) emulate the compiler: predicting how code will change after specific optimizations

b) tune flags: suggesting optimization flags for the best result, usually the smallest binary size in this case (reaching 77% of the optimizing potential of an autotuning search)

c) disassemble: converting assembly language back to the IR (45% round-trip success rate)

Figure (caption from the paper): Overview of our approach, showing the model input (Prompt) and output (Label) during training (1) and inference (2). The prompt contains unoptimized code. The label contains an optimization pass list, binary size, and the optimized code. To generate the label for the training prompt, the unoptimized code is compiled against multiple random pass lists. The pass list achieving the minimum binary size is selected, minimized, and checked for correctness with PassListEval. The final pass list together with its corresponding optimized IR is used as a label during training. In a last step, the top 100 most often selected pass lists are broadcast among all programs. For deployment we generate only the optimization pass list which we feed into the compiler, ensuring that the optimized code is correct.
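The label-generation step described in that caption (compile against many random pass lists, keep the smallest result) can be sketched roughly as below. This is a toy under loud assumptions: `toy_binary_size` is a synthetic stand-in for "compile and measure", the `SAVINGS` numbers are invented, and the five pass names are just a small illustrative subset of LLVM passes. It is not the paper's pipeline.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Illustrative subset of pass names with made-up size savings in bytes.
SAVINGS = {"instcombine": 120, "simplifycfg": 80, "dce": 40, "gvn": 60, "sroa": 100}
PASSES = list(SAVINGS)


def toy_binary_size(pass_list: Sequence[str]) -> int:
    """Synthetic stand-in for compiling with a pass list and measuring size.

    Each distinct pass helps once; repeating a pass has no further effect.
    """
    return 1000 - sum(SAVINGS[p] for p in set(pass_list))


def random_search(
    measure: Callable[[Sequence[str]], int],
    passes: Sequence[str],
    trials: int,
    max_len: int,
    rng: random.Random,
) -> Tuple[List[str], int]:
    """Try `trials` random pass lists and return the smallest-size one."""
    best_list: List[str] = []
    best_size = measure(best_list)  # baseline: no passes applied
    for _ in range(trials):
        length = rng.randint(1, max_len)
        candidate = [rng.choice(passes) for _ in range(length)]
        size = measure(candidate)
        if size < best_size:
            best_list, best_size = candidate, size
    return best_list, best_size


rng = random.Random(0)  # fixed seed so the sketch is reproducible
best_list, best_size = random_search(toy_binary_size, PASSES, 50, 6, rng)
print(best_list, best_size)
```

In the real setting `measure` would invoke the compiler, which is exactly the cost the model is meant to avoid at inference time.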

This is a pretty amazing first pass and indicates the potential of transformers as differentiable general-purpose computers. It brings us closer to a time when the LLM will handle the entire intermediate stack between human and bare metal.

What products does this enable?

Near Term:

  • Advanced Code Optimization Tools: Develop software that can automatically suggest optimal compiler flag configurations for improved performance and reduced binary size.

  • Intelligent Disassembly Tools: Create applications that can accurately convert assembly code back into higher-level intermediate representations.

  • Compiler Behavior Prediction Systems: Build tools that can predict the effects of different optimization passes without actually running the compiler.

Long Term:

  • AI-Driven Compiler Design: Develop AI systems that can propose novel compiler optimizations or even design entire compiler architectures.

  • Intelligent Code Migration Tools: Create systems that can automatically port legacy code to new architectures by understanding both low-level assembly and high-level representations.

  • AI-Assisted Compiler Debugging: Develop tools that can identify and suggest fixes for compiler bugs by understanding complex interactions between optimization passes.


Who did this

The research was conducted by a team from Meta AI, including Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Rozière, Jonas Gehring, Gabriel Synnaeve, and Hugh Leather. The collaboration appears to be internal to Meta AI, leveraging their expertise in large language models and compiler optimization.


Why did they do it

  • The main reason for this research was to address the gap in applying Large Language Models (LLMs) to the domain of code and compiler optimization.

  • The study is important because it provides a foundation for further research and development in compiler optimization, making it more accessible to both academic researchers and industry practitioners.

  • They aimed to address the lack of robust, openly available, pre-trained models specifically designed for code optimization tasks.


How did they do it

  • The researchers developed a suite of models called LLM Compiler, built on the foundation of Code Llama.

  • They trained the models on a vast corpus of 546 billion tokens of LLVM-IR and assembly code.

  • The team used instruction fine-tuning to enhance the models' understanding of compiler behavior.

  • They created two versions of the model: 7 billion and 13 billion parameters.

  • The researchers developed fine-tuned versions for specific tasks like optimizing code size and disassembling assembly back into LLVM-IR.

  • They used datasets from MiBench benchmark suite for evaluation.

  • The team employed techniques such as random search, pass list minimization, and PassListEval for generating and validating optimization pass lists.
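Pass list minimization is the easiest of these techniques to picture. One plausible reading, sketched below, is a greedy loop that drops any pass whose removal does not make the result larger. The `toy_size` function and its savings numbers are invented stand-ins for a real compile-and-measure step, and the paper's actual minimizer may work differently.

```python
from typing import Callable, List, Sequence

# Made-up savings per pass; repeating a pass gives no extra benefit.
TOY_SAVINGS = {"instcombine": 120, "simplifycfg": 80, "dce": 40}


def toy_size(pass_list: Sequence[str]) -> int:
    """Synthetic stand-in for the binary size produced by a pass list."""
    return 1000 - sum(TOY_SAVINGS[p] for p in set(pass_list))


def minimize_pass_list(
    pass_list: Sequence[str],
    measure: Callable[[Sequence[str]], int],
) -> List[str]:
    """Greedily remove passes whose removal does not increase the size."""
    kept = list(pass_list)
    target = measure(kept)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        size = measure(trial)
        if size <= target:
            # The pass at position i was not needed: drop it and retry
            # the same index, which now holds the next pass.
            kept, target = trial, size
        else:
            i += 1
    return kept


original = ["dce", "instcombine", "dce", "simplifycfg", "instcombine"]
minimized = minimize_pass_list(original, toy_size)
print(minimized)
```

Shorter pass lists make cleaner training labels: the model learns which passes matter rather than memorizing noise from the random search.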

What did they find

  • LLM Compiler FTD models outperformed existing models like GPT-4 Turbo and Code Llama in compiler optimization tasks.

  • The 13B parameter model of LLM Compiler FTD achieved a 5.26% improvement over the -Oz optimization flag in reducing binary size.

  • In disassembly tasks, LLM Compiler FTD models achieved significantly higher success rates and accuracy compared to other models.

  • The models showed good performance in predicting binary sizes before and after optimization.

  • Each stage of compiler-centric training caused a slight regression in general Python programming ability, but the models still outperformed base Llama 2 on these tasks.
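For readers wondering what an "improvement over -Oz" figure means in practice, here is a minimal sketch of one way to compute it: compare total binary size under -Oz with total size under the model's pass lists. The aggregation (summing over all programs) is an assumption, and the sizes in the example are invented for illustration; the paper may aggregate differently.

```python
from typing import Sequence


def improvement_over_oz(oz_sizes: Sequence[int], model_sizes: Sequence[int]) -> float:
    """Percentage reduction in total binary size relative to -Oz.

    Positive means the model's pass lists produced smaller binaries overall.
    """
    if len(oz_sizes) != len(model_sizes):
        raise ValueError("need one -Oz size and one model size per program")
    total_oz = sum(oz_sizes)
    total_model = sum(model_sizes)
    return 100.0 * (total_oz - total_model) / total_oz


# Invented sizes in bytes for three hypothetical programs.
oz = [1000, 2000, 1500]
model = [950, 1900, 1500]
result = improvement_over_oz(oz, model)
print(f"{result:.2f}% smaller than -Oz")
```

Since -Oz is already a hand-tuned size-focused pipeline, single-digit percentage gains on top of it are meaningful.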

What are the limitations and what's next

  • The main limitation is the finite sequence length of inputs (context window), which can be insufficient for large programs.

  • The accuracy of model outputs remains a concern, as with all LLMs, requiring rigorous testing of suggested compiler optimizations.

  • Future research could focus on expanding the context window to handle larger programs.

  • Developing methods to constrain LLM generations to regular expressions or combine them with automatic verification could improve accuracy and reliability.
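As a rough illustration of that last point, the sketch below checks a model-generated pass list against a regular expression and an allowlist before it would ever reach the compiler. Note that this is post-hoc validation, a simpler cousin of true constrained decoding. The allowlist is a small illustrative subset, and the function name is hypothetical.

```python
import re
from typing import List, Optional

# Comma-separated lowercase identifiers, e.g. "instcombine,simplifycfg".
PASS_LIST_RE = re.compile(r"[a-z][a-z0-9-]*(,[a-z][a-z0-9-]*)*")

# Illustrative allowlist; a real tool would list every pass it accepts.
ALLOWED_PASSES = frozenset({"instcombine", "simplifycfg", "dce", "gvn", "sroa"})


def parse_pass_list(text: str) -> Optional[List[str]]:
    """Return the pass names if `text` is a well-formed, allowed pass list.

    Returns None for anything malformed or containing an unknown pass, so
    the caller can fall back to a default such as -Oz.
    """
    text = text.strip()
    if not PASS_LIST_RE.fullmatch(text):
        return None
    names = text.split(",")
    if any(name not in ALLOWED_PASSES for name in names):
        return None
    return names


print(parse_pass_list("instcombine,simplifycfg"))
print(parse_pass_list("instcombine; rm -rf /"))
print(parse_pass_list("instcombine,notapass"))
```

A format check like this only rules out malformed output. Whether the accepted pass list actually helps still has to be measured by running the compiler.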

Why it matters

  • This research provides a scalable, cost-effective foundation for further research and development in compiler optimization.

  • It demonstrates the potential of LLMs in understanding and manipulating low-level code representations like assembly and compiler IRs.

  • The findings could lead to more efficient and automated compiler optimization processes, potentially improving software performance across various domains.

🖼️ AI Artwork Of The Day

The Simpsons | Game of Thrones Edition - u/lilsussy88 from r/midjourney

That’s it for today!