Version 18 (modified by Bart, 10 years ago)


Gitification of Indymedia linksunten


Until May 2012, Indymedia linksunten used the Good Old Fashioned Way™ to keep track of upstream changes to its Drupal (in fact: Pressflow) core and modules: drush dl mymodule. At least in theory. In reality, the core has been patched twice, many modules even more often, and some self-written modules do not even exist in a public Drupal repository. linksunten has some 80 modules installed, and keeping track of updates is wearisome for the non-patched modules and troublesome for the patched ones. We learned that a version control system could ease the error-prone update procedure, and as Drupal itself has migrated to git, we decided to do the same. Beforehand, we evaluated mercurial and bazaar, and from a technical point of view we could have chosen any of the three.

Since a Drupal website is not a monolithic block and nearly every module is maintained by different developers, we needed a way to update the core and each module separately from one another. The traditional way git offers for this is a concept called git-submodule. It is complicated, unintuitive and detested by many for good reasons. But as git follows the TMTOWTDI paradigm, we could avoid git-submodule and settled for git-subtree instead, which has recently been merged into git core. Besides being able to update the core and each module separately and to replay our patches onto the updated version automatically, we want the linksunten code to be one git repository which simply "works" after cloning it. After our move to git we will use the features and ctools modules to version control as much of our configuration data as possible.


Drupal core

We create a new directory, initialise git, tell git which name and email to use, create a temporary file which we commit, delete and commit again to create a master branch:

mkdir liu_d6
cd liu_d6
git init
git config user.name "Indymedia linksunten"
git config user.email "línksunten@índymedí"
touch liu_d6
git add liu_d6
git commit -m "Initial commit Indymedia linksunten Drupal 6."
git rm liu_d6
git commit -m "Created master branch."
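The temporary file only exists to give git something to commit. As a possible shortcut, assuming a git version that supports the --allow-empty option of git commit, the same starting point can be sketched in a throwaway directory. The path, name and email below are placeholders, not the values linksunten uses:

```shell
# Sketch: create the first commit without a temporary file.
# Directory, name and email are placeholders for illustration only.
set -e
repo=$(mktemp -d)/liu_d6
mkdir "$repo"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"
# --allow-empty records a commit that changes no files.
git commit -q --allow-empty -m "Created master branch."
# Print the number of commits reachable from HEAD.
git rev-list --count HEAD
```

This leaves one commit instead of two, so the history differs slightly from the steps above.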

We don't want the settings.php file in our repository as it contains database login credentials so we tell git to exclude it:

echo settings.php >> .git/info/exclude
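To check that the exclusion does what we expect, a small sketch in a throwaway repository may help. It assumes a git version that ships git check-ignore; the directory and the file content are placeholders:

```shell
# Sketch: confirm that a file listed in .git/info/exclude is not reported by git.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# .git/info is normally created by git init; make sure it exists anyway.
mkdir -p .git/info
echo settings.php >> .git/info/exclude
mkdir -p core/sites/default
echo "placeholder for database credentials" > core/sites/default/settings.php
# A pattern without a slash matches at any depth, so status prints nothing here.
git status --porcelain
# check-ignore exits with status 0 when the path is ignored.
git check-ignore -q core/sites/default/settings.php && echo "settings.php is excluded"
```

Note that .git/info/exclude is local to one clone and is not transferred by git clone, so each working copy needs this line.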

We add Pressflow 6 as a new remote and fetch it:

git remote add pressflow-6.x git:// master
git fetch pressflow-6.x

Now we can add Drupal core from our pressflow-6.x remote via git-subtree in a subdirectory core. We use the --squash parameter as we do not need the whole commit history of Pressflow in our master branch:

git subtree add --squash --prefix=core pressflow-6.x/drupal-6.26
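The effect of git subtree add --squash can be tried without network access. The following sketch uses a tiny local repository as a stand-in for Pressflow; the paths, the tag name v1 and the identity are assumptions made for the example, and the subtree step is skipped when git-subtree is not installed:

```shell
# Sketch: vendor a local stand-in "upstream" into core/ with git subtree add --squash.
set -e
demo=$(mktemp -d)

# A tiny stand-in for the upstream project, tagged v1.
git init -q "$demo/upstream"
cd "$demo/upstream"
git config user.name "Example User"
git config user.email "user@example.org"
echo "upstream readme" > README.txt
git add README.txt
git commit -q -m "Upstream release"
git tag v1

# The site repository that will carry the upstream code under core/.
git init -q "$demo/site"
cd "$demo/site"
git config user.name "Example User"
git config user.email "user@example.org"
git commit -q --allow-empty -m "Created master branch."
git remote add upstream "$demo/upstream"
git fetch -q upstream

if [ -x "$(git --exec-path)/git-subtree" ]; then
  # Only one squashed commit plus a merge commit end up in our history.
  git subtree add --squash --prefix=core v1
  ls core
else
  echo "git-subtree is not installed; skipping the subtree step"
fi
```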

As we are creating a production environment, we'll delete some of the unnecessary files:

git rm core/install.php core/*.txt
git commit -m "Delete install.php and text files from core directory."

We copy our settings.php to core/sites/default and delete the default.settings.php:

cp ~/settings.php core/sites/default
git rm core/sites/default/default.settings.php
git commit -m "Deleted default.settings.php."

There are some more modifications to do, but they are specific to linksunten (like applying the two core patches via git am, which were created beforehand via git format-patch), so we leave them out.
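For readers who have not used this pair of commands, here is a minimal sketch of the round trip in throwaway repositories. It is not the linksunten patch set; file names, the tag name base and the commit messages are invented for the example:

```shell
# Sketch: carry a local change as a patch file and re-apply it with git am.
set -e
demo=$(mktemp -d)

# Old repository: an unpatched import followed by one local patch.
git init -q "$demo/old"
cd "$demo/old"
git config user.name "Example User"
git config user.email "user@example.org"
echo "original line" > example.inc
git add example.inc
git commit -q -m "Import unpatched file"
git tag base
echo "patched line" > example.inc
git commit -q -am "Example core patch"
# Write the last commit as a mailbox-style patch file into $demo/patches.
git format-patch -q -1 -o "$demo/patches" HEAD

# New repository: start again from the unpatched state and apply the patch.
git clone -q "$demo/old" "$demo/new"
cd "$demo/new"
git config user.name "Example User"
git config user.email "user@example.org"
git checkout -q -b patched base
git am -q "$demo/patches"/*.patch
git log -1 --format=%s
```

git am keeps the original author, date and commit message, which is why it is preferable to copying patched files by hand.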

Drupal modules

Normally, Drupal modules are installed under sites/default/modules. This would work with our approach, but it would create unnecessarily huge merges when updating the core, and keeping all parts separately accessible from the root directory of our installation is much more clearly arranged. So we create a modules directory (and perhaps also files, libraries and themes directories), create a symbolic link to it, and commit the link:

mkdir modules
cd core/sites/default
ln -s ../../../modules
cd ../../../
git add core/sites/default/modules
git commit -m "Add symbolic link to modules directory."
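A quick way to convince yourself that the relative link resolves correctly, and to see how git records it, is the following sketch in a throwaway directory (the directory itself is a placeholder):

```shell
# Sketch: check that the relative link resolves and how git stores it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p modules core/sites/default
(cd core/sites/default && ln -s ../../../modules modules)
# The link target is stored as plain text relative to the link's own directory.
readlink core/sites/default/modules
git add core/sites/default/modules
# Mode 120000 in the index marks a symbolic link; git tracks the link, not its target's contents.
git ls-files -s core/sites/default/modules
```

Because the target is relative, the link keeps working wherever the repository is cloned.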

Now we install a module in it. As an example we choose the i18n module. On the Drupal project page we click on the green "Version Control" tab and choose "Version to work from: 6.x-1.x". There we get the URL which we need to add the repository as a remote. The parameter -f triggers an immediate fetch:

git remote add -f i18n-6.x-1.x git:// 6.x-1.x

We do not install the latest version 6.x-1.10 yet, but version 6.x-1.9. The reason is that we have patched that version and want to use git-subtree and git-rebase to reapply our patches to the newest version. First, we install 6.x-1.9:

git subtree add --squash --prefix="modules/i18n" 6.x-1.9

Then we overwrite the newly imported files with our patched version and commit the patches. At this point, interactive staging might be a good idea.

cp ~/ modules/i18n
cp ~/i18nsync.module modules/i18n/i18nsync
git add modules/i18n/
git commit -m "i18n: Exchange title with nid in translation box."
git add modules/i18n/i18nsync/i18nsync.module
git commit -m "i18n: Inherit path when syncing."


We are going to update a patched Drupal module using git-subtree and git-rebase. We create a new branch containing the version of the module we want to update to. As we have already added i18n as a remote, we can check out the desired version in a separate branch using the corresponding tag. As long as git has no separate namespaces for remote tags, we definitely want to start with a git fetch to be sure we refer to the tag of the right module:

git fetch i18n-6.x-1.x
git branch i18n-6.x-1.10 6.x-1.10

Then we extract the patched version of our module into a branch along with its history:

git subtree split --rejoin --prefix=modules/i18n --branch=i18n-linksunten

Now we can rebase our branch on top of the new version of the module:

git rebase i18n-6.x-1.10 i18n-linksunten
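What the rebase step does can be seen in isolation, without git-subtree. The sketch below replays one local patch onto a newer release in a throwaway repository; the tags v1 and v2, the branch name patched and the file names are stand-ins for 6.x-1.9, 6.x-1.10 and i18n-linksunten, and the patch is chosen so that it does not conflict:

```shell
# Sketch: replay a local patch onto a newer upstream version with git rebase.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

# Two upstream releases, tagged v1 and v2.
echo "version 1" > CHANGELOG.txt
echo "original" > module.inc
git add CHANGELOG.txt module.inc
git commit -q -m "Release 1"
git tag v1
echo "version 2" > CHANGELOG.txt
git commit -q -am "Release 2"
git tag v2

# Our patched copy is based on the old release.
git checkout -q -b patched v1
echo "patched" > module.inc
git commit -q -am "Local patch"

# Replay every commit of "patched" that is not in v2 on top of v2.
git rebase -q v2 patched
git log --format=%s v2..patched
```

If upstream and the patch touch the same lines, the rebase stops and asks for the conflict to be resolved by hand.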

Finally, we have to merge the patched new version into our master branch:

git checkout master
git subtree merge --squash --prefix=modules/i18n i18n-linksunten

After that, you can delete the two branches:

git branch -D i18n-6.x-1.10 i18n-linksunten

The same process can be applied to update the core. Hopefully, you did not need to patch the core (as we did), so a git fetch will result in fast-forward merges.

Literally controlling versions with git_deploy

In the past, Drupal used CVS as its version control system and switched to git quite recently. Unfortunately, not all module maintainers have adapted their code base to the new revision control system manually. Instead, the Drupal team migrated lots of projects automatically. So at least at the moment, you'll discover that the releases of such modules are not tagged at all.

Another problem is the available updates page at /admin/reports/updates. When a module is checked out via git, the version information normally added by the package manager is missing. Enter git_deploy. More precisely, commit 68bd1a8219cbe59e7fbe56b317a600321116ddfa from Thu Apr 26 10:15:16 2012 -0700.

git_deploy 6.x-2.x


/**
 * @file
 * This module add versioning information to projects checked out of git.
 */

/**
 * Implement hook_system_info_alter() to provide metadata to drupal from git.
 *
 * We support populating $info['version'] and $info['project'].
 *
 * @param $info
 *   The module/theme info array we're altering.
 * @param $file
 *   An object describing the filesystem location of the module/theme.
 */
function git_deploy_system_info_alter(&$info, $file) {
  $type = isset($info['engine']) ? 'theme' : 'module';
  if (empty($info['version'])) {
    $directory = dirname($file->filename);
    // Check whether this belongs to core. Speed optimization.
    if (substr($directory, 0, strlen($type)) != $type) {
      while ($directory && !is_dir("$directory/.git")) {
        $directory = substr($directory, 0,  strrpos($directory, '/'));
      }
      $git_dir = "$directory/.git";
      // Theoretically /.git could exist.
      if ($directory && is_dir($git_dir)) {
        $git = "git --git-dir $git_dir";
        // Find first the project name based on fetch URL.
        // Eat error messages. >& is valid on Windows, too. Also, $output does
        // not need initialization because it's taken by reference.
        exec("$git remote show -n origin 2>&1", $output);
        if ($fetch_url = preg_grep('/^\s*Fetch URL:/', $output)) {
          $fetch_url = current($fetch_url);
          $project_name = substr($fetch_url, strrpos($fetch_url, '/') + 1);
          if (substr($project_name, -4) == '.git') {
            $project_name = substr($project_name, 0, -4);
          }
          $info['project'] = $project_name;
        }
        // Try to fill in branch and tag.
        exec("$git rev-parse --abbrev-ref HEAD 2>&1", $branch);
        $tag_found = FALSE;
        if ($branch) {
          $branch = $branch[0];
          // Any Drupal-formatted branch.
          $branch_preg =  '\d+\.x-\d+\.';
          if (preg_match('/^' . $branch_preg . 'x$/', $branch)) {
            $info['version'] = $branch . '-dev';
            // Nail down the core and the major version now that we know
            // what they are.
            $branch_preg = preg_quote(substr($branch, 0, -1));
          }
          // Now try to find a tag.
          exec("$git rev-list --topo-order --max-count=1 HEAD 2>&1", $last_tag_hash);
          if ($last_tag_hash) {
            exec("$git describe  --tags $last_tag_hash[0] 2>&1", $last_tag);
            if ($last_tag) {
              $last_tag = $last_tag[0];
              // Make sure the tag starts as Drupal formatted (for eg.
              // 7.x-1.0-alpha1) and if we are on a proper branch (ie. not
              // master) then it's on that branch.
              if (preg_match('/^(' . $branch_preg . '\d+(?:-[^-]+)?)(-(\d+-)g[0-9a-f]{7})?$/', $last_tag, $matches)) {
                $tag_found = TRUE;
                $info['version'] = isset($matches[2]) ? $matches[1] . '.' . $matches[3] . 'dev' : $last_tag;
              }
            }
          }
        }
        if (!$tag_found) {
          $last_tag = '';
        }
        // The git log -1 command always succeeds and if we are not on a
        // tag this will happen to return the time of the last commit which
        // is exactly what we wanted.
        exec("$git log -1 --pretty=format:%at $last_tag 2>&1", $datestamp);
        if ($datestamp && is_numeric($datestamp[0])) {
          $info['datestamp'] = $datestamp[0];
        }

        // However, the '_info_file_ctime' should always get the latest value.
        if (empty($info['_info_file_ctime'])) {
          $info['_info_file_ctime'] = $datestamp[0];
        }
        else {
          $info['_info_file_ctime'] = max($info['_info_file_ctime'], $datestamp[0]);
        }
      }
    }
  }
}
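The least obvious part of the module is how it turns the output of git describe --tags into a Drupal version string. The following shell sketch mirrors that mapping for the default pattern only; it ignores the narrowing to the current branch, and the function name describe_to_version is made up for this example:

```shell
# Sketch: map "git describe --tags" output to a Drupal-style version string.
# Simplified: the pattern is not narrowed to the current branch as in the module.
describe_to_version() {
  # A Drupal-formatted tag such as 6.x-1.9 or 7.x-1.0-alpha1.
  tag_re='[0-9]+\.x-[0-9]+\.[0-9]+(-[^-]+)?'
  # Accept either an exact tag or tag-<commits>-g<7 hex digits>; print nothing otherwise.
  if printf '%s\n' "$1" | grep -Eq "^${tag_re}(-[0-9]+-g[0-9a-f]{7})?\$"; then
    # For commits after a tag, append ".<commits>-dev"; an exact tag passes through unchanged.
    printf '%s\n' "$1" | sed -E "s/^(${tag_re})-([0-9]+)-g[0-9a-f]{7}\$/\1.\3-dev/"
  fi
}

describe_to_version "6.x-1.9"
describe_to_version "6.x-1.9-3-gabcdef1"
```

The first call prints the tag unchanged; the second stands for a checkout three commits after the tag, where abcdef1 is an invented abbreviated hash.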

Analysis of git_deploy

Version 1.x of git_deploy was based on glip, a Git Library In PHP. Version 2.x of git_deploy calls the git executable directly and parses the output instead. There might be issues in a shared hosting environment but many people report that the 2.x version works far better than the 1.x version, so we'll adapt git_deploy 2.x to git-subtree.

Let's analyse what git_deploy does. The module implements only one hook: hook_system_info_alter. With this hook, the module info obtained through git can be injected. But the module searches for a .git directory, so in the current version it only works for git-submodules.

      while ($directory && !is_dir("$directory/.git")) {
        $directory = substr($directory, 0,  strrpos($directory, '/'));
      }
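To see why this lookup is a problem for us, the same upward search can be sketched in shell. The function name find_git_root and the directory layout are assumptions for illustration. With git-subtree there is only one .git directory at the top of the site, so every module resolves to the same repository:

```shell
# Sketch: walk up from a directory until a .git directory is found.
find_git_root() {
  dir=$1
  while [ -n "$dir" ] && [ ! -d "$dir/.git" ]; do
    case $dir in
      */*) dir=${dir%/*} ;;  # strip the last path component
      *)   dir='' ;;         # no slash left: give up
    esac
  done
  # Print the repository root, or nothing when none was found.
  if [ -n "$dir" ]; then
    printf '%s\n' "$dir"
  fi
}

# Throwaway layout resembling a subtree-based site: one .git at the top.
site=$(mktemp -d)
mkdir -p "$site/.git" "$site/modules/i18n/i18nsync"
find_git_root "$site/modules/i18n/i18nsync"
```

The call prints the site root rather than modules/i18n, so any information read from that repository describes the whole site and not the individual module.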

It uses the Fetch URL obtained by git remote show -n origin to determine the project name. This won't work with git-subtree, as remotes are usually not exported, so a clone won't know the original fetch URLs used for git subtree add. Fortunately, Drupal uses well-defined fetch URLs, so we can reconstruct the information. But the process will be much more time-consuming with git-subtree than it is with git-submodule.

        exec("$git remote show -n origin 2>&1", $output);
        if ($fetch_url = preg_grep('/^\s*Fetch URL:/', $output)) {
          $fetch_url = current($fetch_url);
          $project_name = substr($fetch_url, strrpos($fetch_url, '/') + 1);
          if (substr($project_name, -4) == '.git') {
            $project_name = substr($project_name, 0, -4);
          }
          $info['project'] = $project_name;
        }
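The string handling in this excerpt is simple enough to restate in shell. The sketch below derives a project name from a fetch URL in the same way; the function name project_name_from_url and the example.org URLs are placeholders, not real repository addresses:

```shell
# Sketch: derive the project name from a fetch URL, as the excerpt above does.
project_name_from_url() {
  name=${1##*/}      # keep everything after the last slash
  name=${name%.git}  # drop a trailing ".git", if present
  printf '%s\n' "$name"
}

project_name_from_url "git://example.org/project/i18n.git"
project_name_from_url "git://example.org/project/views"
```

With git-subtree the URL would have to come from somewhere other than the origin remote, for example from a list kept alongside the code, but the parsing itself stays the same.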