Converting from Subversion to Git

Note: This tutorial is about completely replacing a server-side Subversion repository by a Git repository, for a workflow with a central Git repository. If you just want to use Git as a frontend to a Subversion repository, you are probably better off with the standard Git SVN documentation.

For a personal project I wanted to convert from Subversion to Git. There are a lot of tutorials out there. Why add another one? Well, I haven’t found one that is generic and tells me everything I want to do. The most notable omission from most tutorials is what happens to your Subversion branches and tags. I want them to be properly converted, and I want to replace my server-side SVN repo with a Git repo, so that I have a central place that stores my work.

So, I’m recording the steps I need to take, as much for my own future reference as for the “public benefit”. I’m by no means a Git expert (I hope to become one once I have converted a few of my projects to Git), so please bear with me and forgive any mistakes I might make (or rather, please point them out in the comments).

Importing the Subversion history

We’ll assume you have some version of Git on your system. I’m using msysgit on Windows, but this should work equally well on any install. Older versions of Git may not accept the single git entry point followed by a subcommand and expect the hyphenated form instead, so “git svn” becomes “git-svn”. I do believe that indicates your Git install is pretty old, though (correct me if I’m wrong).

I started out with these commands to create a home for the imported SVN repo. This directory is temporary: we’re going to turn it into a bare repository, move that to the server and clone it for our working repo later on. (The project I’m converting is for Mini Seven Club Nederland, which is where the “mscn” in later names comes from.)

$ mkdir project_git
$ cd project_git

Now it’s time to initialize this repo as a Subversion-enabled repository (the URL is the location of the project on my local network):

$ git svn init svn://example.com/project --stdlayout --no-metadata
Initialized empty Git repository in c:/data/Project/project_git/.git/

The --stdlayout is there to tell Git that our Subversion repository has the standard layout, i.e. trunk, branches and tags directories at the root level of the repository. This will make sure that these are imported correctly. If omitted, you’ll get a Git repository with a single branch, containing your entire SVN structure. You don’t want that.

The --no-metadata tells Git not to record the original SVN URL locations, which we won’t need and will only be legacy clutter, since this is a one-way conversion.

Before we continue with the actual import, there’s one more step. We need to prepare a file that tells the import process how to translate usernames in the SVN repository (a simple username like “jdoe”) to the format Git uses (a more elaborate name of the form “John Doe <j.doe@example.com>”). Now, this repo has been through a lot already; it actually started out as a CVS repository and has been hosted at different locations, so even though I am the only person who’s ever worked on it, several usernames were used to check in stuff.

Update 30/11/2012: When Googling for this again to see whether there are any more good procedures, I found a good description by John Albin Wilkins, who displays some strong command-line fu to create some scaffolding for this file. Execute the following from your terminal (Mac/Linux/…) to get a file listing all usernames associated with revisions in the SVN repo:

$ svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > users.txt

It looks like this command only looks at the history for the current location in your working copy (the current branch, if you are at its root), so you may want to check whether there are any authors exclusive to a branch you did not run this command on.
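
To see what that pipeline produces without touching a real repository, here is a small sketch that feeds it a made-up log. The function name authors_skeleton and the sample revisions are assumptions for illustration; only the awk program comes from the command above. I’ve pinned the sort to the C locale so the order is predictable.

```shell
# Turn `svn log -q` output (read from stdin) into a users.txt skeleton.
authors_skeleton() {
    awk -F '|' '/^r/ {
        sub("^ ", "", $2); sub(" $", "", $2)   # trim the padding around the username
        print $2 " = " $2 " <" $2 ">"
    }' | LC_ALL=C sort -u
}

# Made-up sample log, standing in for: svn log -q | authors_skeleton
authors_skeleton <<'EOF'
------------------------------------------------------------------------
r4 | jdoe | 2009-11-30 21:27:14 +0100 (Mon, 30 Nov 2009)
------------------------------------------------------------------------
r3 | jdoe | 2009-11-16 21:14:02 +0100 (Mon, 16 Nov 2009)
------------------------------------------------------------------------
r2 | (no author) | 2005-12-24 23:11:40 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------
r1 | eelke | 2005-12-24 22:59:01 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------
EOF
# Prints:
# (no author) = (no author) <(no author)>
# eelke = eelke <eelke>
# jdoe = jdoe <jdoe>
```

Each username appears once on the left, with a placeholder on the right that you then replace by hand.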

Edit the file to add actual names and email addresses, like so:

eelke = Eelke Blok <eelke@example.com>
eblok = Eelke Blok <eelke@example.com>
Eelke = Eelke Blok <eelke@example.com>
Administrator = Eelke Blok <eelke@example.com>
cvsowner = Eelke Blok <eelke@example.com>
eelkecvs = Eelke Blok <eelke@example.com>
(no author) = Eelke Blok <eelke@example.com>

The last line I’ve learned the hard way; apparently, the CVS to SVN conversion process (way back when) has resulted in commits without an author, resulting in Git telling me (after issuing a fetch, see further):

Author: (no author) not defined in users.txt file

After a bit of fiddling, I found out that adding this line to the users.txt file let the process run through successfully (it was my second guess, after leaving the space before the = empty :) ).
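
Rather than discovering missing names one failed fetch at a time, the authors file can be checked up front. This is a minimal sketch, assuming `svn log -q` output arrives on stdin and the file uses the “name = Full Name <email>” format shown above; the function name missing_authors is made up for this example.

```shell
# Print every SVN username seen in `svn log -q` output (stdin)
# that has no "name = ..." entry in the authors file given as $1.
missing_authors() {
    awk -F '|' '/^r/ { sub("^ ", "", $2); sub(" $", "", $2); print $2 }' |
    LC_ALL=C sort -u |
    awk 'FILENAME == ARGV[1] { sub(/ *=.*/, ""); known[$0] = 1; next }  # keys from the authors file
         !($0 in known)                                                 # usernames without a key
        ' "$1" -
}

# Possible usage (not run here, needs the real repository):
#   svn log -q svn://example.com/project | missing_authors users.txt
```

No output means every username is covered. Running the log against the repository root URL rather than a working copy should also pick up authors that only ever committed to a branch.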

I’ve stored the file as users.txt in the directory where we’re creating the git repo.

Next, let’s tell Git about this file:

$ git config svn.authorsfile users.txt

All set, let’s see how this works out.

$ git svn fetch

Git will start importing your SVN revisions one by one. Should you also run into the problem that your SVN repo contains an author you didn’t specify in the authors file, don’t worry. Simply add the appropriate line and rerun the fetch; it will continue where it left off.

You might notice that Git is importing your SVN branches into a namespace called refs/remotes. We don’t like that, because we plan on using this repository as a remote repository, where these should simply be ordinary branches. We’ll fix this later.

Anyway, get yourself a coffee/tea/soda/beer/cocktail, depending on the time of day and your preference. Heck, the time of day may become appropriate for whatever you like, this might take a while.

Cleaning the SVN stuff out

Ready? Didn’t have too many cocktails so your thoughts are still reasonably coherent? OK, so now we have a plain old SVN-enabled Git repository. You can have a look at the branches that were created as such:

$ git branch -a
* master
 remotes/Pre-version
 remotes/banner
 remotes/bannermove
 remotes/dev
 remotes/devel
 remotes/eelkeblok
 remotes/ledenvoordeel
 remotes/mscn4
 remotes/register
 remotes/registrations
 remotes/remove_shop
 remotes/tags/demo
 remotes/tags/merge_dev_to_stable
 remotes/tags/online-20051224-2259
 remotes/tags/online-20051224-2311
 remotes/tags/online-20051224-2329
 remotes/tags/online-20051225-0001
[...]
 remotes/tags/online-20091116-2114
 remotes/tags/online-20091130-2127
 remotes/tags/start
 remotes/templateupdate-2.0.18
 remotes/trunk
 remotes/upgrade_forum_3_0
 remotes/wiki

You’ll notice that the Subversion “tags” (which in Subversion aren’t really tags at all, they’re just branches without any subsequent revisions) were converted into Git branches within their own namespace. This is what you might use to work with a project that has its main repository still on SVN; you can use Git on your workstation and have most of its benefits, while the project itself remains on SVN. Neat. But not what we came here to do.

We basically want to do two things:

  • Move the generated branches out of the remotes space
  • Turn the SVN “tags” into proper Git tags

(The following steps were taken directly from reference 5).

First, let’s convert the tags. Note: It has been reported that these steps may remove extra commits made in a tag on SVN (in SVN, tags really are no different from branches, except that we agreed to not commit to them, by convention). Proceed with caution.

Create a script with the following contents (Windows users should be able to execute this through Git Bash, which was installed along with msysgit):

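# Turn every remote branch under tags/ into a Git tag, then delete the branch.
# The trailing ^ tags the parent of the branch tip, which normally skips the
# commit SVN made to create the tag (and any later commit made on the tag).
for t in $(git branch -r | grep 'tags/' | sed 's_tags/__'); do
     git tag "$t" "tags/$t^"
     git branch -d -r "tags/$t"
done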

Save the script e.g. as converttags.sh and execute it.

$ sh converttags.sh
Deleted remote branch tags/demo (was 5cfe8f1).
Deleted remote branch tags/merge_dev_to_stable (was 2d47421).
Deleted remote branch tags/online-20051224-2259 (was 5a50f9d).
Deleted remote branch tags/online-20051224-2311 (was c6aeda6).
Deleted remote branch tags/online-20051224-2329 (was 92c1ad6).
Deleted remote branch tags/online-20051225-0001 (was 3432051).
Deleted remote branch tags/online-20051225-0008 (was f574ea8).
[...]
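
If you are not sure whether anybody ever committed to a tag, a more careful variant is possible. The following is a sketch, not what reference 5 does. It assumes the tag branches live under refs/remotes/tags/ (as in the listing above) and that the commit SVN made to create a tag leaves the tree unchanged. Only in that case does it tag the parent; otherwise it tags the tip, so a real commit on the tag is kept. The function name convert_svn_tags is made up here.

```shell
# Convert refs/remotes/tags/* into lightweight Git tags without losing
# commits that changed files on the tag.
convert_svn_tags() {
    git for-each-ref --format='%(refname)' refs/remotes/tags |
    while read -r ref; do
        name=${ref#refs/remotes/tags/}
        tip_tree=$(git rev-parse "$ref^{tree}")
        # Empty when the tip has no parent, which makes the comparison fail.
        parent_tree=$(git rev-parse --verify -q "$ref^^{tree}")
        if [ "$tip_tree" = "$parent_tree" ]; then
            target="$ref^"     # tip changed nothing: treat it as the tag-creation commit
        else
            target="$ref"      # tip changed files: keep it
        fi
        git tag "$name" "$target" && git update-ref -d "$ref"
    done
}
```

Run convert_svn_tags from inside the imported repository in place of the script above; try it on a copy first.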

That’s our tags converted. Let’s also remove SVN compatibility references:

$ git branch -d -r trunk
Deleted remote branch trunk (was 6ade13f).

$ git config --remove-section svn-remote.svn

$ rm -rf .git/svn .git/{logs/,}refs/remotes/svn/

And let’s convert the remaining remote branches to local ones:

$ git config remote.origin.url .

$ git config --add remote.origin.fetch +refs/remotes/*:refs/heads/*

$ git fetch
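
The refspec maps every ref under refs/remotes/ onto a ref of the same name under refs/heads/. One way to confirm the fetch did that is to compare the two namespaces. This is only a sketch, and the function name check_branches_copied is made up for this example.

```shell
# Print the name of every ref under refs/remotes/ that has no local branch
# of the same name pointing at the same commit. No output means all is well.
check_branches_copied() {
    git for-each-ref --format='%(refname) %(objectname)' refs/remotes |
    while read -r ref sha; do
        name=${ref#refs/remotes/}
        local_sha=$(git rev-parse --verify -q "refs/heads/$name") || local_sha=
        if [ "$local_sha" != "$sha" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Calling check_branches_copied inside the repository after the fetch should print nothing; any name it does print is a branch that did not make it across.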

All set. We now have a local Git repository with all the contents from our old SVN repository.

Getting a bare repository onto the server

We’re only a few steps away from victory. We need to turn our repository into a bare repository, which means just the Git data, not the accompanying working copy. We’ll do this by cloning our current repository:

$ cd ..
$ git clone --bare project_git mscn.git

This will go superfast: when cloning from a local path, Git hard-links the object files instead of copying them where it can.
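
Before uploading, it may be worth confirming that the bare clone really carries every branch and tag, since a bare clone only takes over refs/heads and refs/tags. This is a sketch using the directory names from this tutorial; the function name refs_match is made up for this example.

```shell
# Succeed when a working repository ($1) and a bare clone of it ($2)
# list exactly the same branches and tags at the same commits.
refs_match() {
    src_refs=$(git --git-dir="$1/.git" show-ref --heads --tags)
    bare_refs=$(git --git-dir="$2" show-ref --heads --tags)
    [ "$src_refs" = "$bare_refs" ]
}

# Possible usage, from the directory that holds both repositories:
#   refs_match project_git mscn.git && echo "bare clone is complete"
```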

Upload your bare repository to wherever you would like it on your server, using whatever tool you normally use to upload files. For the next steps, it is important that you have SSH access to your server. If you don’t, there are other ways to reach your Git repository, although chances are slim that you’ll be able to set up any of them if your provider doesn’t even give you SSH access. Make sure that the user you plan to connect with has read-write access to the repository.

Cloning the remote repository to create a working repository

Now we’re ready to test the remote repository by cloning it to create our final working repository on our local system:

$ git clone eelke@myserver.net:/data/git/mscn.git
Initialized empty Git repository in c:/Data/Project/mscn/.git/
remote: Counting objects: 24568, done.
remote: Compressing objects: 100% (13012/13012), done.
remote: Total 24568 (delta 11682), reused 23571 (delta 10964)
Receiving objects: 100% (24568/24568), 52.27 MiB | 1022 KiB/s, done.
Resolving deltas: 100% (11682/11682), done.
Checking out files: 100% (1675/1675), done.

So, there you are. You have all existing history of your project on your server in a central Git repository and a local working repository to start Gittin’ with it.

References

  1. Cleanly Migrate Your Subversion Repository To a GIT Repository, John Madox
  2. Converting Subversion repositories to Git, Redline’s Weblog
  3. git-svn(1) Manual Page
  4. How to convert from Subversion to Git, Paul Dowman
  5. Convert a SVN Alioth repository to Git
  6. Converting a Subversion repository to Git, John Albin Wilkins

8 Comments

  1. Posted 4 April 2012 at 23:48 | Permalink

    Very complete manual.

    Thanks a lot !

  2. dmitry
    Posted 12 May 2012 at 17:59 | Permalink
  3. Posted 5 October 2012 at 09:40 | Permalink

    Better depends on your criteria. Also, this (closed source) product is targeted at mirroring (and thus keeping the SVN repository around). The above tutorial is about leaving SVN behind.

  4. Simon
    Posted 2 December 2012 at 01:31 | Permalink

    @Eelke Well, git-svn is open source. But did you see the code? I did. As for me, closed-source product with good support is better than open-source tool with no support at all (I doubt someone can provide any reasonable support for git-svn).

    You don’t need to keep SVN repository around after subgit converted it. Just drop it and you’re good to go with git.

    Regards, Simon.

  5. Jörgen Persson
    Posted 20 March 2013 at 20:59 | Permalink

    Thanks. It really helped me out.
    I made a script out of your instructions and executed it on a per project basis

  6. Peter van Dijk
    Posted 25 April 2013 at 16:15 | Permalink

    If you have ever committed to a tag, the “tags/$t^” advice in this post WILL throw that last commit away!

  7. Posted 2 May 2013 at 15:56 | Permalink

    Hmm… Have you tried? IIRC, the Git tag is placed on the tip of the tag branch and I would expect that to also hold true when there are more than one commit in the tag “branch” (you really shouldn’t be committing to tags, though ;) ). Anyway, I’ll update the post with a warning.

  8. Posted 3 May 2013 at 05:26 | Permalink

    What’s up friends, pleasant post and good arguments commented at this place, I am in fact enjoying by these.
