Logo

dev-resources.site

for different kinds of informations.

GCC: Automatic Function Multi-Versioning Pt.1

Published at
12/6/2024
Categories
gcc
compiling
afmv
Author
arilloid
Categories
3 categories in total
gcc
open
compiling
open
afmv
open
Author
8 person written this
arilloid
open
GCC: Automatic Function Multi-Versioning Pt.1

Before we start...

If you have just stumbled upon my SPO600 series of blog posts, it has been created to document and share my learnings as I progress through my Software Portability and Optimization college course.

In this blog post, I’ll start sharing the progress I made after transitioning to Stage 2 of our class project: implementing a new experimental feature in the GNU Compiler Collection (GCC) — Automatic Function Multi-Versioning.

What is Automatic Function Multi-Versioning?

As promised in my 1st project-related blog post, before diving into the details of Project Stage 2, I’d like to give you a better idea of the project's specifics.

Automatic Function Multi-Versioning should enable the compiler to automatically identify functions that could benefit from being cloned and optimized for specific architectural variants without requiring developers to modify the source code. By using a compiler option to specify the target architectures, AFMV will clone, optimize, and prune redundant function versions, enhancing software performance without requiring any manual efforts from the devs.

However, this 2024 Fall term, our focus is determining how to detect when two function versions are "fundamentally the same," which is essential for implementing function pruning.

Project Stage 2 Objective

In Stage 2 of the Project, we are aiming to add a compilation pass to the GCC compiler that determines whether functions should be pruned. This pass should iterate through the program's code, identify cloned functions, and analyze their gimple representations to determine if their implementations are identical. If duplicates are detected, the pass will output a decision indicating that the redundant clones can be pruned (by returning true or false).

To limit the scope, we will be making the following assumptions:

  • There is only one cloned function in a program

  • There are only two versions of that function.

Initial Setup

For the initial setup, I created a folder called Stage-2 in my home directory, along with the following sub-folders:

Initial-setup

  • /src - contains the source code of the GCC compiler

  • /build - used to configure and build the compiler

  • /install - serves as the destination directory for installing the GCC binaries

The server where I initially built the compiler was put down, so I rebuilt it on our class x86_64 server. The clean build took about 22 minutes.

gcc --version

You can read about building GCC from source in my previous blog post: Building GCC from Source

Adding a New Pass

Before conducting any code experiments, I made sure that I could successfully integrate new passes into GCC.

Our professor, Chris Tyler, gave us his demo pass as a starter code for Project Stage 2. The pass was located in the public folder (/public/spo600-gcc-pass-demo.tgz). I moved it to my home directory and extracted it using the following command:

tar -xvzf spo600-gcc-pass-demo.tgz
Enter fullscreen mode Exit fullscreen mode

Folder structure

The extracted files included three modified files from the /gcc subdirectory of the source tree (Makefile.in, passes.def, tree-pass.h) and one new file, tree-ctyler.cc, containing the pass logic.

Let's Take a Look at the Files, What Changes Were Made to Add a Pass?

  • Makefile.in: on line 1712, tree-ctyler.o was added. tree-ctyler.o is a language-independent object file generated from the corresponding source file tree-ctyler.cc.

    tree-ctyler.o \
    
  • passes.def: this file defines the sequence of passes that transform and optimize code during compilation. The demo pass, pass_ctyler, is added on line 444, ensuring it is executed late in the compilation process, after all significant optimizations have been completed.

    NEXT_PASS (pass_ctyler)
    
  • tree-pass.h: this file contains the structures for defining optimization passes in GCC. It includes a new line that declares the entry point for our custom pass. The declaration introduces the external function make_pass_ctyler, which initializes the pass in the GCC compiler framework. This function takes a pointer to a gcc::context object as its argument and returns a pointer to a gimple_opt_pass object, representing the configuration of our custom pass.

    //Line: 491
    extern gimple_opt_pass *make_pass_ctyler (gcc::context *ctxt);
    
  • tree-ctyler.cc: this file contains the logic of the demo pass. The pass iterates over each basic block and statement within a function, printing diagnostic information about the number of basic blocks and statements and detailed GIMPLE statements if a dump file is enabled.

    // Omitted the License and the include statements
    // ============================================================= vvv
    // Test pass
    namespace {
    
    const pass_data pass_data_ctyler =
    {
      GIMPLE_PASS, /* type */
      "ctyler", /* name */
      OPTGROUP_NONE, /* optinfo_flags */
      TV_NONE, /* tv_id */
      PROP_cfg, /* properties_required */
      0, /* properties_provided */
      0, /* properties_destroyed */
      0, /* todo_flags_start */
      0, /* todo_flags_finish */
    };
    
    class pass_ctyler : public gimple_opt_pass
    { 
    public:
      pass_ctyler (gcc::context *ctxt)
        : gimple_opt_pass (pass_data_ctyler, ctxt)
      {}
    
      /* opt_pass methods: */
      bool gate (function *)  final override { 
        return 1; // always execute pass
      }
      unsigned int execute (function *) final override;
    
    }; // class pass_ctyler
    
    unsigned int
    pass_ctyler::execute (function *fun)
      {
        basic_block bb;
    
        int bb_cnt = 0, stmt_cnt = 0;
    
        FOR_EACH_BB_FN (bb, fun)
          {
            bb_cnt++;
            if (dump_file)
              {
                fprintf (dump_file, "===== Basic block count: %d =====\n", bb_cnt);
              }
    
            for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p 
        (gsi); gsi_next (&gsi))
              {
                gimple *g = gsi_stmt (gsi);
                stmt_cnt++;
                if (dump_file)
                  {
                    fprintf (dump_file, "----- Statement count: %d -----\n", stmt_cnt);
                    print_gimple_stmt (dump_file, g, 0, 
    TDF_VOPS|TDF_MEMSYMS);
                  }
             }
           }
    
        return 0;
    
        if (dump_file)
          {
            fprintf (dump_file, "\n\n##### End ctyler diagnostics, start regular dump of current gimple #####\n\n\n");
          }
    
      }
    
    } // anon namespace
    
    gimple_opt_pass *
    make_pass_ctyler (gcc::context *ctxt)
    {
      return new pass_ctyler (ctxt);
    }
    
    // ============================================================= ^^^
    

While the GCC documentation is rather sparse, you can find some helpful info on adding a pass here:

Rebuilding GCC with a New Pass

After wrapping my head around the changes required to add the sample pass, I integrated it into the source code and rebuilt the project.

I started by copying the necessary source code files using the following commands:

[akolodeznikova@x86-001 gcc]$ pwd
/home/akolodeznikova/Stage-2/src/gcc
cp ../../../test/gcc/Makefile.in .
cp ../../../test/gcc/passes.def .
cp ../../../test/gcc/tree-pass.h .
cp ../../../test/gcc/tree-ctyler.cc .
Enter fullscreen mode Exit fullscreen mode

Then, I went into the build folder and ran the make command to rebuild the project with applied changes.

The build process took approximately five minutes. Once completed, I ran the make install command to install the newly built GCC compiler into the /install folder.

Note: the make utility efficiently rebuilds a codebase by comparing the timestamps of dependencies (inputs) and targets (outputs). It determines which source files have been modified and rebuilds only the affected targets (Chris Tyler’s Wiki).

With the build and installation complete, I tested the setup to verify that the sample pass was working.

To do this, I quickly created a sample file in the test folder:

hello.c

#include<stdio.h>

int main(){
    printf("Hello");
    return 0;
}
Enter fullscreen mode Exit fullscreen mode

And then compiled it, triggering the newly added pass using the following command:

~/Stage-2/install/bin/gcc hello.c -fdump-tree-ctyler
Enter fullscreen mode Exit fullscreen mode

After running the command, a file named a-hello.c.263t.ctyler was generated. The file contained the following output:

//a-hello.c.263t.ctyler
;; Function main (main, funcdef_no=0, decl_uid=3267, cgraph_uid=1, symbol_order=0)

===== Basic block count: 1 =====
----- Statement count: 1 -----
# .MEM_2 = VDEF <.MEM_1(D)>
printf ("Hello");
----- Statement count: 2 -----
_3 = 0;
===== Basic block count: 2 =====
----- Statement count: 3 -----
<L0>:
----- Statement count: 4 -----
# VUSE <.MEM_2>
return _3;
int main ()
{
  int D.3270;
  int _3;

  <bb 2> :
  printf ("Hello");
  _3 = 0;

  <bb 3> :
<L0>:
  return _3;

}
Enter fullscreen mode Exit fullscreen mode

The output confirmed the added pass was working as expected. It successfully iterated through the code, producing diagnostic information about basic blocks and statements.

🎉 🎉 🎉

Stay tuned...

In my next post, I’ll share the code experiments I conducted to implement the proposed functionality. So, stay tuned for more updates on my progress!

compiling Article's
30 articles in total
Favicon
TypeScript type predicate generator
Favicon
Bug of the week #2
Favicon
🌥️ Exploring Cloud Computing: A Simple Guide
Favicon
what is compiler ?
Favicon
GCC: Automatic Function Multi-Versioning Wrap Up
Favicon
GCC: Automatic Function Multi-Versioning Pt.1
Favicon
Top 5 Compiler Flags to Prevent Stack-Based Attacks
Favicon
Why implement custom copy constructor in C++?
Favicon
Why do compilers have different components?
Favicon
Java Compilation Process : From Source Code to Bytecode Execution
Favicon
GCC: Debug Dumps
Favicon
GCC: Automatic Function Multi-Versioning Pt.2
Favicon
Goodbye useCallback and useMemo: How React Compiler Takes Over
Favicon
Parallelizing Sorting Algorithms using OpenMP
Favicon
Running a Compiled Python Script from C# Applications
Favicon
Building a Pawn to Python Compiler in PHP
Favicon
Part I: Implement an expression interpreter for building DSL - Introduce the PEG parser
Favicon
Exploring Graal: Next-Generation JIT Compilation for Java
Favicon
Teste de fogo: fatorial recursivo
Favicon
React 19 Compiler
Favicon
Intro: Implement an expression interpreter for building DSL
Favicon
Lendo dados da entrada padrĂŁo (stdin)
Favicon
Building a simple word counting parser using pari.
Favicon
Sentenças aninhadas
Favicon
Resolução de alguns bugs
Favicon
Learn the smart way of building TypeScript in VS Code!
Favicon
Suporte às funções do Pascal
Favicon
Captura de erros, operadores relacionais para string e procedures
Favicon
How Compilers Work: A step by step guide
Favicon
Estruturas de repetição: repeat, while e for

Featured ones: