Converting from Subversion to Git

Note: This tutorial is about completely replacing a server-side Subversion repository with a Git repository, for a workflow with a central Git repository. If you just want to use Git as a frontend to a Subversion repository, you are probably better off with the standard Git SVN documentation.

For a personal project I wanted to convert from Subversion to Git. There are a lot of tutorials out there. Why add another one? Well, I haven’t found one that is generic and tells me everything I want to do. The most notable omission from most tutorials is what happens to your Subversion branches and tags. I want them to be properly converted, and I want to replace my server-side SVN repo with a Git repo, so that I have a central place that stores my work.

So, I’m recording the steps I need to take, as much for my own future reference as for the “public benefit”. I’m by no means a Git expert (I hope to become one once I have converted a few of my projects to Git), so please bear with me and forgive any mistakes I might make (or rather, please point them out in the comments).

Importing the Subversion history

We’ll assume you have some version of Git on your system. I’m using msysgit on Windows, but this should work equally well on any install. Older versions of Git may not accept subcommands through the single git entry point and require the hyphenated form instead, so “git svn” becomes “git-svn”. I do believe that is an indication your Git install is pretty old, though (correct me if I’m wrong).

I started out with these commands to create a home for the imported SVN repo. The directory is called mscn_temp: mscn for Mini Seven Club Nederland (that’s the project I’m converting, remember) and temp because we’re going to throw it away later; we’ll move the result to the server and clone that for our working repo.

$ mkdir mscn_temp
$ cd mscn_temp

Now it’s time to initialize this repo as a Subversion-enabled repository (the URL is the location of the project on my local network):

$ git svn init svn://example.com/project --stdlayout --no-metadata
Initialized empty Git repository in c:/data/Project/mscn_temp/.git/

The --stdlayout option tells Git that our Subversion repository has the standard layout, i.e. trunk, branches and tags directories at the root level of the repository. This makes sure that these are imported correctly. If omitted, you’ll get a Git repository with a single branch, containing your entire SVN structure. You don’t want that.

The --no-metadata option tells Git not to record the original SVN URL locations, which we won’t need and which would only be legacy clutter, since this is a one-way conversion.

Before we continue with the actual import, there’s one more step. We need to prepare a file that tells the import process how to translate usernames in the SVN repository (a simple username like “jdoe”) to the format Git uses (a more elaborate name of the form “John Doe <j.doe@example.com>”). Now, this repo has been through a lot already; it actually started out as a CVS repository and has been hosted at different locations, so even though I am the only person who has ever worked on it, several usernames were used to check in stuff.

Update 30/11/2012: When Googling for this again to see whether there are any more good procedures, I found a good description by John Albin Wilkins, who displays some strong command-line fu to create some scaffolding for this file. Execute the following from your terminal (Mac/Linux/…) to get a file listing all usernames associated with revisions in the SVN repo:

$ svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > users.txt

It looks like this command only looks at the history for the current location in your working copy (the current branch, or everything if you are at the root), so you may want to check whether there are any authors exclusive to some branch you did not run this command on.
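To see what the awk program does without touching a real repository, here is a small sketch that runs it over a few lines of made-up `svn log -q` output. The function name authors_from_log and the sample revisions are invented for illustration only.

```shell
# Same awk program as the one-liner above, wrapped in a function that reads
# `svn log -q` output on stdin and prints one "user = user <user>" line per author.
authors_from_log() {
    awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

# Made-up sample in the format `svn log -q` prints (not from a real repository).
sample_log='------------------------------------------------------------------------
r3 | eelke | 2010-09-01 12:00:00 +0200 (Wed, 01 Sep 2010)
------------------------------------------------------------------------
r2 | (no author) | 2005-12-24 22:59:00 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------
r1 | eelke | 2005-12-24 22:00:00 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------'

# Two distinct authors come out, each as a placeholder line to edit by hand.
printf '%s\n' "$sample_log" | authors_from_log
```

The separator lines are skipped because only lines starting with "r" are processed, and the two sub() calls strip the spaces around the username field.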

Edit the file to add actual names and email addresses, like so:

eelke = Eelke Blok <eelke@example.com>
eblok = Eelke Blok <eelke@example.com>
Eelke = Eelke Blok <eelke@example.com>
Administrator = Eelke Blok <eelke@example.com>
cvsowner = Eelke Blok <eelke@example.com>
eelkecvs = Eelke Blok <eelke@example.com>
(no author) = Eelke Blok <eelke@example.com>

The last line I learned the hard way; apparently, the CVS to SVN conversion (way back when) resulted in commits without an author, which made Git tell me (after issuing a fetch, see below):

Author: (no author) not defined in users.txt file

After a bit of fiddling, I found out that adding this line to the users.txt file let the process run through successfully (it was my second guess, after leaving the space before the = empty :)).

I’ve stored the file as users.txt in the directory where we’re creating the git repo.
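Before starting a long fetch, it may save a restart or two to check that every SVN author has an entry in users.txt. The following is a rough sketch of such a check, assuming you have saved the output of svn log -q to a file; missing_authors is a helper name I made up, and the demo data is invented.

```shell
# List SVN authors that have no entry in the authors file.
# $1: saved output of `svn log -q`, $2: authors file (users.txt).
# missing_authors is a hypothetical helper, not a Git or Subversion command.
missing_authors() {
    awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2}' "$1" | sort -u |
    while IFS= read -r author; do
        # Compare against the name left of " = " as an exact fixed string,
        # so that "(no author)" is not interpreted as a regular expression.
        sed 's/ = .*//' "$2" | grep -qxF -- "$author" || printf '%s\n' "$author"
    done
}

# Demo with made-up data; for real use: svn log -q > svn.log
demo=$(mktemp -d)
cat > "$demo/svn.log" <<'EOF'
------------------------------------------------------------------------
r3 | eelke | 2010-09-01 12:00:00 +0200 (Wed, 01 Sep 2010)
------------------------------------------------------------------------
r2 | (no author) | 2005-12-24 22:59:00 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------
r1 | cvsowner | 2005-12-24 22:00:00 +0100 (Sat, 24 Dec 2005)
------------------------------------------------------------------------
EOF
cat > "$demo/users.txt" <<'EOF'
eelke = Eelke Blok <eelke@example.com>
cvsowner = Eelke Blok <eelke@example.com>
EOF
missing_authors "$demo/svn.log" "$demo/users.txt"   # prints: (no author)
rm -rf "$demo"
```

An empty result means every author in the log has a mapping.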

Next, let’s tell Git about this file:

$ git config svn.authorsfile users.txt

All set, let’s see how this works out.

$ git svn fetch

Git will start importing your SVN revisions one by one. Should you also run into the problem that your SVN repo contains an author you didn’t specify in the authors file, don’t worry. Simply add the appropriate line and rerun the fetch; it will continue where it left off.

You might notice that Git is importing your SVN branches into a namespace called refs/remotes. We don’t like that, because we plan on using this repository as a remote repository itself, where these should just be ordinary branches. We’ll fix this later.

Anyway, get yourself a coffee/tea/soda/beer/cocktail, depending on the time of day and your preference. Heck, the time of day may become appropriate for whatever you like, this might take a while.

Cleaning the SVN stuff out

Ready? Didn’t have too many cocktails so your thoughts are still reasonably coherent? OK, so now we have a plain old SVN-enabled Git repository. You can have a look at the branches that were created as such:

$ git branch -a
* master
 remotes/Pre-version
 remotes/banner
 remotes/bannermove
 remotes/dev
 remotes/devel
 remotes/eelkeblok
 remotes/ledenvoordeel
 remotes/mscn4
 remotes/register
 remotes/registrations
 remotes/remove_shop
 remotes/tags/demo
 remotes/tags/merge_dev_to_stable
 remotes/tags/online-20051224-2259
 remotes/tags/online-20051224-2311
 remotes/tags/online-20051224-2329
 remotes/tags/online-20051225-0001
[...]
 remotes/tags/online-20091116-2114
 remotes/tags/online-20091130-2127
 remotes/tags/start
 remotes/templateupdate-2.0.18
 remotes/trunk
 remotes/upgrade_forum_3_0
 remotes/wiki

You’ll notice that the Subversion “tags” (which in Subversion aren’t really tags at all, they’re just branches without any subsequent revisions) were converted into Git branches within their own namespace. This is what you might use to work with a project that has its main repository still on SVN; you can use Git on your workstation and have most of its benefits, while the project itself remains on SVN. Neat. But not what we came here to do.
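If you only want to see which of those remote branches are really SVN tags, a filter along these lines will do. The sketch below runs a grep/sed pipeline over a made-up excerpt of git branch -r output; in a real repository you would pipe git branch -r into it instead. The function name tag_names is just for illustration.

```shell
# Reduce `git branch -r` style output (on stdin) to bare SVN tag names.
tag_names() {
    # Keep only lines whose branch name starts with tags/, then strip
    # the leading indentation and the tags/ prefix.
    grep '^ *tags/' | sed 's_^ *tags/__'
}

# Made-up excerpt in the format `git branch -r` prints (two leading spaces).
printf '  banner\n  tags/demo\n  tags/start\n  trunk\n' | tag_names
# prints:
# demo
# start
```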

We basically want to do two things:

  • Move the generated branches out of the remotes space
  • Turn the SVN “tags” into proper Git tags

(The following steps were taken directly from reference 5).

First, let’s convert the tags. Note: It has been reported that these steps may remove extra commits made in a tag on SVN (in SVN, tags really are no different from branches, except that we agreed to not commit to them, by convention). Proceed with caution.

Create a script with the following contents (Windows users should be able to execute this through Git Bash, which was installed along with msysgit):

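# Turn every remote branch under tags/ into a real Git tag.
for t in $(git branch -r | grep 'tags/' | sed 's_tags/__'); do
     # The ^ tags the parent of the tag commit (see the note above)
     git tag "$t" "tags/$t^"
     git branch -d -r "tags/$t"
done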

Save the script, e.g. as converttags.sh, and execute it:

$ sh converttags.sh
Deleted remote branch tags/demo (was 5cfe8f1).
Deleted remote branch tags/merge_dev_to_stable (was 2d47421).
Deleted remote branch tags/online-20051224-2259 (was 5a50f9d).
Deleted remote branch tags/online-20051224-2311 (was c6aeda6).
Deleted remote branch tags/online-20051224-2329 (was 92c1ad6).
Deleted remote branch tags/online-20051225-0001 (was 3432051).
Deleted remote branch tags/online-20051225-0008 (was f574ea8).
[...]
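As noted above, the caret discards a commit if you ever committed to a tag in SVN. One possible way around this, sketched below, is to step back to the parent only when the tag commit did not change any files (its tree is identical to its parent’s tree), and to tag the tip otherwise. This is my own variation on the script, not something from reference 5, and I have not run it against a real converted repository; the function name convert_tags is made up.

```shell
# Hypothetical variation on converttags.sh: tag the parent commit only when the
# tag commit itself changed no files; otherwise tag the tip so nothing is lost.
convert_tags() {
    git for-each-ref --format='%(refname)' refs/remotes/tags |
    while IFS= read -r ref; do
        name=${ref#refs/remotes/tags/}
        target=$ref
        # If a parent exists and has the same tree, the tag commit is an empty
        # "svn copy" commit and the parent is the commit we really want to tag.
        if parent_tree=$(git rev-parse --quiet --verify "$ref^^{tree}" 2>/dev/null) &&
           [ "$parent_tree" = "$(git rev-parse "$ref^{tree}")" ]; then
            target="$ref^"
        fi
        # Only delete the remote branch once the tag was created successfully.
        git tag "$name" "$target" && git branch -d -r "tags/$name"
    done
}
```

Run convert_tags from inside the imported repository instead of the loop above. A tag branch without any parent commit is also tagged at its tip, which avoids the “ambiguous argument” error that the caret produces in that case.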

That’s our tags converted. Let’s also remove SVN compatibility references:

$ git branch -d -r trunk
Deleted remote branch trunk (was 6ade13f).

$ git config --remove-section svn-remote.svn

$ rm -rf .git/svn .git/{logs/,}refs/remotes/svn/

And let’s convert the remaining remote branches to local ones:

$ git config remote.origin.url .

$ git config --add remote.origin.fetch +refs/remotes/*:refs/heads/*

$ git fetch

All set. We now have a local Git repository with all the contents from our old SVN repository.
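A quick sanity check at this point might look like the sketch below: it counts what ended up under each ref namespace, so you can see at a glance whether the number of branches and tags is what you expect. count_refs is just a throwaway helper name of mine.

```shell
# Count refs per namespace in the current repository (hypothetical helper).
count_refs() {
    for ns in refs/heads refs/tags refs/remotes; do
        # for-each-ref prints one line per ref below the given prefix.
        n=$(git for-each-ref "$ns" | wc -l | tr -d ' ')
        printf '%s: %s\n' "$ns" "$n"
    done
}
```

Run count_refs inside the converted repository. refs/heads should show one entry per former SVN branch (plus master) and refs/tags one per converted tag; the entries under refs/remotes are the leftovers the fetch copied from.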

Getting a bare repository onto the server

We’re only a few steps away from victory. We need to turn our repository into a bare repository, which means just the Git data, not the accompanying working copy. We’ll do this by cloning our current repository:

$ cd ..
$ git clone --bare mscn_temp mscn.git

This will go superfast, presumably because, for a local clone on the same filesystem, Git hard-links the object files rather than copying them.

Upload your bare repository to wherever you would like it on your server, using whatever tool you normally use to upload files. For the next steps, it is important that you have SSH access to your server. If you don’t, there are other ways to reach your Git repository, although chances are slim that you’ll be able to set any of them up if your provider doesn’t even give you SSH access. Make sure that the user you plan to connect with has read-write access to the repository.
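As an alternative to copying the directory by hand, one commenter reports that pushing all branches and all tags to the server works as well. Here is a sketch of that route, assuming you have first created an empty bare repository on the server; push_everything is a made-up helper name, and the server address in the comments is simply the one used in the clone example of this post.

```shell
# Push all local branches and all tags from the current repository to the
# repository at $1 (a local path or an SSH location like user@host:/path/repo.git).
# push_everything is a hypothetical helper, not a Git command.
push_everything() {
    git push --all "$1" && git push --tags "$1"
}

# Possible usage (placeholders, adjust to your own server):
#   ssh eelke@myserver.net git init --bare /data/git/mscn.git
#   cd mscn_temp
#   push_everything eelke@myserver.net:/data/git/mscn.git
```

Two pushes are used because --all only covers branches; tags have to be pushed separately.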

Cloning the remote repository to create a working repository

Now we’re ready to test the remote repository by cloning it to create our final working repository on our local system:

$ git clone eelke@myserver.net:/data/git/mscn.git
Initialized empty Git repository in c:/Data/Project/mscn/.git/
remote: Counting objects: 24568, done.
remote: Compressing objects: 100% (13012/13012), done.
remote: Total 24568 (delta 11682), reused 23571 (delta 10964)
Receiving objects: 100% (24568/24568), 52.27 MiB | 1022 KiB/s, done.
Resolving deltas: 100% (11682/11682), done.
Checking out files: 100% (1675/1675), done.

So, there you are. You have all existing history of your project on your server in a central Git repository and a local working repository to start Gittin’ with it.

References

  1. Cleanly Migrate Your Subversion Repository To a GIT Repository, John Madox
  2. Converting Subversion repositories to Git, Redline’s Weblog
  3. git-svn(1) Manual Page
  4. How to convert from Subversion to Git, Paul Dowman
  5. Convert a SVN Alioth repository to Git
  6. Converting a Subversion repository to Git, John Albin Wilkins

17 Comments

  1. Posted 4 April 2012 at 23:48 | Permalink

    Very complete manual.

    Thanks a lot !

  2. dmitry
    Posted 12 May 2012 at 17:59 | Permalink
  3. Posted 5 October 2012 at 09:40 | Permalink

    Better depends on your criteria. Also, this (closed source) product is targeted at mirroring (and thus keeping the SVN repository around). The above tutorial is about leaving SVN behind.

  4. Simon
    Posted 2 December 2012 at 01:31 | Permalink

    @Eelke Well, git-svn is open source. But did you see the code? I did. As for me, closed-source product with good support is better than open-source tool with no support at all (I doubt someone can provide any reasonable support for git-svn).

    You don’t need to keep SVN repository around after subgit converted it. Just drop it and you’re good to go with git.

    Regards, Simon.

  5. Jörgen Persson
    Posted 20 March 2013 at 20:59 | Permalink

    Thanks. It really helped me out.
    I made a script out of your instructions and executed it on a per project basis

  6. Peter van Dijk
    Posted 25 April 2013 at 16:15 | Permalink

    If you have ever committed to a tag, the “tags/$t^” advice in this post WILL throw that last commit away!

  7. Posted 2 May 2013 at 15:56 | Permalink

    Hmm… Have you tried? IIRC, the Git tag is placed on the tip of the tag branch and I would expect that to also hold true when there are more than one commit in the tag “branch” (you really shouldn’t be committing to tags, though ;)). Anyway, I’ll update the post with a warning.

  8. SDd
    Posted 24 May 2013 at 04:54 | Permalink

    Quick question: why the ^ at the end of:
    git tag $t tags/$t^

    ^ at the end of the ref usually means “the commit before”… so wouldn’t this tag the commit before the one we want to tag?

    I’m probably missing something obvious here, but all the similar scripts, including the ones on other websites you reference, don’t use the caret.

    I only happened to notice it because a particular tag I had did not have a previous ref, and thus I got:
    fatal: ambiguous argument ‘tags/foo^’: unknown revision or path not in the working tree.
    yet referencing ‘tags/foo’ works just fine.

    Thanks.

  9. Posted 24 May 2013 at 11:11 | Permalink

    I got the script from reference 5, so really my guess is as good as yours, in a way :) However, IIRC, at least when I last went through this process, svn tags would effectively become branches with a single commit branching off whatever branch the tag was placed on. This makes sense, because tagging in subversion effectively is creating a copy of part of the tree in a different location of the tree, creating a new tree revision. However, if we want to get a “proper” Git tag, we’d have to place the tag on the commit the tag-branch branches off from, i.e. one commit up in the tag-branch.

  10. Posted 11 September 2013 at 15:56 | Permalink

    Hi,

    Totally, this blog is too nice,

    I have started to migrate from my SVN repo into GIT

    http://192.168.0.58:8888/svn/IT
    tags/Archive
    tags/Baseline
    trunk/HSC_Apps
    trunk/Compliance
    After the migration I am able to see IT/HSC_Apps and IT/Compliance in my Git repo.

    However, I am not able to see my tags in my Git repo.

    Can you please tell me which steps I need to take to see my tags as well?

    [gituser@ggns1git01 IT]$ git branch -r
    tags/Archive
    tags/Baseline
    tags/Baselined
    tags/LATEST
    trunk
    Could anyone please help me with this?

    Appreciate your support.

    Regards,
    Justin

  11. Posted 16 December 2013 at 00:15 | Permalink

    Re:

    “Quick question: why the ^ at the end of:
    git tag $t tags/$t^”

    I had to remove the trailing caret as half my tags failed when executing converttags.sh. My question is: What’s the worst-case scenario with this removed?

    git tag -l

    shows all tags correctly, so I’m curious if I’m missing anything important. Otherwise, an excellent guide that allowed me to migrate our WikkaWiki codebase to github. Thanks!

  12. Posted 28 December 2013 at 18:39 | Permalink

    @Brian Koontz: Also refer to comment 9. I believe the caret is intended to put the Git tag on the second to last commit in the Git branch that was created for the SVN tag (still with me?). The assumption is that the branch is only one commit long and the contents of that commit do not differ from its parent (because if you properly tagged in SVN, that would have been the situation there as well; you can, but should not, commit to tags in SVN).

    I’m not sure why this fails for you, but leaving it off should not have too many problems, except that all your converted tags will be at the tip of a separate branch that has no other purpose than the tag.

  13. Posted 28 December 2013 at 18:52 | Permalink

    @Justin Raja Kumar: Sorry for the late response. Unfortunately, I am not sure I understand your problem. Do take into account the things that have already been noted about migrating SVN tags to Git tags. When following this guide to the letter, it is extremely important that you never committed to tags in SVN; you can, but you shouldn’t and the assumption of this guide is that you haven’t.

    If you have, that is not a huge problem; you can modify the converttags.sh script so that it puts the Git tag on the tip of the tag-branch. Just change:

    git tag $t tags/$t^

    to:

    git tag $t tags/$t

    (Untested, but from the other commenters I get that some people have done this with success – I am actually not sure if it is possible to delete the branch when there is a tag on it). This would result in a less clean repository history, but it will preserve any history that may have been in your tags-that-were-really-branches ;)

  14. Daniel
    Posted 22 January 2014 at 14:39 | Permalink

    Hi Eelke,
    thanks for this great post! I have tried several ways now to migrate from svn to git including yours, but I always seem to lose all svn commit messages. Only when I don’t use the --no-metadata option in the beginning do I get commit messages at all. But this is not what I want, I need the ‘real’ commit messages from the old svn repository. Does anybody else have this problem, too? I usually use ‘qgit --all’ to check the new git repo and to read the commit messages.

  15. Posted 2 February 2014 at 15:14 | Permalink

    @Daniel: Sorry, no, this doesn’t sound at all familiar. The --no-metadata switch will, as far as I know, only prevent Git from recording information that is required for it to be able to write back to the SVN repository, i.e. in case you use Git as an SVN client. Other data, like author and commit message, should be preserved (conceivably, you could call that metadata as well, but I could not imagine any scenario where you would want to leave *that* out).

  16. Rob
    Posted 15 May 2014 at 15:34 | Permalink

    Thanks for the instructions! After following these steps, it seemed like i couldn’t delete my temp repository, since the bare repo was hard linked to it.. so I created a bundle and cloned it into a bare repo.. So far so good.

    So instead of “Getting a bare repository onto the server”, i did this:

    git bundle create mybundle --all &&\
    cd .. &&\
    git clone --bare mscn_temp/mybundle -b master mscn.git

  17. Mike Bruins
    Posted 20 May 2014 at 15:12 | Permalink

    Regarding http://blokspeed.net/blog/2010/09/converting-from-subversion-to-git/

    Thanks for all the great work.
    For me, the instructions pushed trunk to the git server, but not branches nor tags.

    Doing the following pushed the branches and tags to the server.

    git push --all
    git push --tags

    Kind regards,

    Mike
