Migrating a SVN repo to Git, part deux: SubGit to the rescue

  1. Migrating a SVN repo to Git: a tale of hacking my way through
  2. ➤ Migrating a SVN repo to Git, part deux: SubGit to the rescue

To improve is to change; to be perfect is to change often.
— Winston Churchill

In my previous post about SVN→Git conversion, I described the steps to convert a nested SVN repo to Git using svn2git, svndumpfilterIN, SVN::DumpReloc and some manual editing of SVN dump files.

This process worked fine for smaller repos, but past a certain size I hit a wall: the final svn2git conversion of one of the larger repos ran for 5 days and never finished, because the Windows version of Git kept crashing midway. Those crashes were related to Cygwin’s implementation of fork, which requires some address space to be reserved for the Cygwin heap; a 5-day run was exhausting those addresses.

After a couple of attempts to convert the repo (which took about 2 weeks!), I realized that I needed a more robust and preferably faster solution. And that’s when I finally found SubGit!

SubGit is a tool for a smooth, stress-free SVN to Git migration. Create writable Git mirror of a local or remote Subversion repository and use both Subversion and Git as long as you like. You may also do a fast one-time import from Subversion to Git or use SubGit within Atlassian Bitbucket Server.

SubGit is a commercial closed-source Java application. Fortunately, it’s free for one-time conversions and for mirroring repos with up to 10 Git and SVN users. It also has a time-limited trial version that will mirror a repo with any number of users for one month. If you’re daring enough, you can also use EAP or interim builds. Note that interim builds don’t seem to have any time/user limits whatsoever.

With SubGit, I was able to convert the above-mentioned SVN repo to Git overnight without any extra steps, using this simple command:

subgit import --svn-url http://server/svn/my/nested/repo --authors-file .\authors.txt .\repo.git


SubGit can also filter/map SVN branches to Git ones, which saves you from filtering the SVN repo when you need to sync/migrate only parts of it. To do so, however, you need to install SubGit into the target SVN repository and configure it by modifying the subgit.conf file.

Since SubGit requires a JRE and I’m reluctant to install one system-wide, I’ve created a set of scripts to use SubGit with a portable JRE. You can find them in this GitHub repository: SubGit-Portable.

Here is a step-by-step walkthrough of installing and configuring portable SubGit with a JRE. Please note that SubGit can work in two modes: local and remote. This guide assumes local mode, where the SVN server, SubGit and Git (repo and executables) are on the same server.

Create portable SubGit installation
  • Clone or download and unpack the SubGit-Portable repository
  • Get the latest SubGit release and unpack it to the SubGit directory inside the repo
  • Get the latest Oracle JRE/Server JRE and put it in the JRE_Installer directory inside the repo. You can use either the exe installer or the tar.gz archive.
  • Double-click unpack_jre.cmd in the JRE_Installer folder. It will unpack all exe installers/tar.gz archives to the JRE_Portable folder.
  • Edit subgit.cmd to set the name of the portable JRE folder to use, e.g.: set jre_dir=jre-8u40-windows-x64

Now you can use subgit.cmd as if you were using the original subgit.bat:

C:\Path\To\SubGit-Portable> subgit --version
SubGit version 3.1.1 ('Bobique') build #3448
  (c) TMate Software 2012-2016 (http://subgit.com/)

Install SubGit into the SVN repo
  • Run this command to install SubGit binaries into the target SVN repo: subgit configure X:\Path\To\SVN\Repo. You should see something like this:
C:\Path\To\SubGit-Portable> subgit configure X:\Path\To\SVN\Repo
SubGit version 3.1.1 ('Bobique') build #3448

Detecting paths eligible for translation... done.
Subversion to Git mapping has been configured:
     <root> : X:\Path\To\SVN\Repo\.git

CONFIGURATION SUCCESSFUL

To complete SubGit installation do the following:

1) adjust SubGit configuration file, if necessary:
    X:\Path\To\SVN\Repo\conf\subgit.conf
2) add custom authors mapping to the authors.txt file(s) at:
    X:\Path\To\SVN\Repo\authors.txt
3) run SubGit 'install' command:
    subgit install "X:\Path\To\SVN\Repo\" 

  • Create an authors.txt file to map SVN committers to Git authors. You can use my New-GitSvnAuthorsFile script for this. See my previous post for details.

  • Set up a mapping of projects in the SVN repo to the corresponding Git repositories. In my case, the SVN repo held multiple projects and I needed to create Git repositories only for some of them. SubGit can detect nested repositories and by default will add all of them to the configuration file. So I had to edit conf\subgit.conf in the SVN repo folder to remove the unneeded ones. Here is an example of the configuration section for a single nested project:

[git "my/nested/repo"]

    # Path within Subversion repository to the root of trunk/branches/tags structure.
    translationRoot = my/nested/repo

    # Path to the Git repository.
    repository = X:/Git/repo
    pathEncoding = UTF-8

    # Options below (trunk, branches, tags, shelves) define correspondence between Subversion
    # directories and Git references. Depending on the actual Subversion project layout and whether
    # all or only some of the branches have to be mirrored, these options might need to be adjusted.

    trunk = trunk:refs/heads/master
    branches = branches/*:refs/heads/*
    shelves = shelves/*:refs/shelves/*
    tags = tags/*:refs/tags/*

  • Start SVN→Git translation process: subgit install X:\Path\To\SVN\Repo

C:\Path\To\SubGit-Portable> subgit install X:\Path\To\SVN\Repo
SubGit version 3.1.1 ('Bobique') build #3448

Subversion to Git mapping has been found:
     /my/nested/repo: X:\Git\repo

Translating Subversion revisions to Git commits...

    Subversion revisions translated: 54321.
    Total time: 6543 seconds.

INSTALLATION SUCCESSFUL

If something goes wrong, the files named subgit-COMMAND-DATE-TIME.zip in the SubGit-Portable folder and the X:\Path\To\SVN\Repo\subgit\logs\ folder contain detailed operation logs and error messages.

Basically that’s all you have to do to get SubGit up and running!
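
For readers wondering how the trunk/branches/tags lines in subgit.conf turn SVN paths into Git refs, here is a minimal Python sketch of my reading of those rules. It is not SubGit's implementation: the function name map_svn_path is made up for illustration, and it assumes that a single * stands for exactly one path component.

```python
def map_svn_path(svn_path, rules):
    """Return the Git ref for svn_path, or None if no rule matches.

    Each rule looks like 'branches/*:refs/heads/*'. Assumption: '*'
    matches exactly one path component (no slashes).
    """
    for rule in rules:
        svn_pattern, git_pattern = rule.split(":", 1)
        if "*" not in svn_pattern:
            # Literal rule, e.g. trunk:refs/heads/master
            if svn_path == svn_pattern:
                return git_pattern
            continue
        prefix, suffix = svn_pattern.split("*", 1)
        if not (svn_path.startswith(prefix) and svn_path.endswith(suffix)):
            continue
        name = svn_path[len(prefix):len(svn_path) - len(suffix)]
        if name and "/" not in name:
            return git_pattern.replace("*", name, 1)
    return None


# The same four rules as in the configuration section above.
RULES = [
    "trunk:refs/heads/master",
    "branches/*:refs/heads/*",
    "shelves/*:refs/shelves/*",
    "tags/*:refs/tags/*",
]

print(map_svn_path("trunk", RULES))               # refs/heads/master
print(map_svn_path("branches/feature-x", RULES))  # refs/heads/feature-x
print(map_svn_path("docs/readme", RULES))         # None
```

Removing a rule from the list is the sketch's equivalent of leaving a directory out of the mirror: paths that match nothing are simply not translated.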

Updating SubGit

SubGit is actively developed, so from time to time you’ll need to update your existing SubGit installation to the latest build.

  • Get the latest SubGit build
  • Delete all files in the SubGit-Portable\SubGit folder and unpack the new build there
  • Reinstall SubGit to the target SVN repo: subgit install X:\Path\To\SVN\Repo

For example, this is what upgrading from an EAP build looks like:

SubGit version 3.0.0-EAP ('Bobique') build #3262
This is an EAP build, which you may not like to use in production environment.

Subversion to Git mapping has been found:
    /my/nested/repo: X:\Git\repo

About to shut down background translation process.
Background translation process is not running.

SubGit binaries have been upgraded (3.0.0-EAP#3141 > 3.1.1#3448).
Information on previously encountered errors is cleared.

Processing '/my/nested/repo'
    Translating Subversion revisions to Git commits...

    Subversion revisions translated: 54321.
    Total time: 6543 seconds.

INSTALLATION SUCCESSFUL

I’ve been using SubGit with custom hooks to enable one-way Git mirroring to Visual Studio Online for quite a long time, and it works perfectly. So if you ever find yourself in need of painless SVN→Git translation, give it a try!

Migrating a SVN repo to Git: a tale of hacking my way through

  1. ➤ Migrating a SVN repo to Git: a tale of hacking my way through
  2. Migrating a SVN repo to Git, part deux: SubGit to the rescue

If you’re just looking for an easy way to do an SVN-to-Git migration, skip this post and go directly to part two.


We become what we behold. We shape our tools, and thereafter our tools shape us.
― Marshall McLuhan

Lately I orchestrated an SVN to Visual Studio Online migration for one of our projects. Our developers opted to use Git as the version control solution instead of Team Foundation Version Control (TFVC). Also, we have a pure Windows environment running VisualSVN Server, so I’ll provide Windows-specific tips along the way.

Git and SVN are quite different beasts, especially when it comes to access control and branching strategies. Because of that, simply using git svn, Git’s bidirectional bridge to Subversion, will produce suboptimal results. You will end up with all branches and tags as remote SVN branches, whereas what you really want is Git-native local branches and Git tag objects.

To alleviate this issue, a number of solutions are available:

reposurgeon
A tool for editing version-control repository history. reposurgeon enables risky operations that version-control systems don’t want to let you do, such as editing past comments and metadata and removing commits. It works with any version control system that can export and import git fast-import streams, including git, hg, fossil, bzr, CVS, and RCS. It can also read Subversion dump files directly and can thus be used to script production of very high-quality conversions from Subversion to any supported DVCS.
agito
Agito is (yet another) Subversion to Git conversion script. It is designed to do a better job of translating history than git-svn, which has some subtleties in the way it works that cause it to construct branch histories that are suboptimal in certain corner case scenarios.
svn2git
svn2git is a tiny utility for migrating projects from Subversion to Git while keeping the trunk, branches and tags where they should be. It uses git-svn to clone an svn repository and does some clean-up to make sure branches and tags are imported in a meaningful way, and that the code checked into master ends up being what’s currently in your svn trunk rather than whichever svn branch your last commit was in.

We are all wonderful, beautiful wrecks. That’s what connects us ― that we’re all broken, all beautifully imperfect.
― Emilio Estevez

Initially I planned to use reposurgeon, because it clearly wins over the other solutions:

There are many tools for converting repositories between version-control systems out there. This file explains why reposurgeon is the best of breed by comparing it to the competition.

The problems other repository-translation tools have come from ontological mismatches between their source and target systems – models of changesets, branching and tagging can differ in complicated ways. While these gaps can often be bridged by careful analysis, the techniques for doing so are algorithmically complex, difficult to test, and have ugly edge cases.

Furthermore, doing a really high-quality translation often requires human judgment about how to move artifacts – and what to discard. But most lifting tools are, unlike reposurgeon, designed as run-it-once batch processors that can only implement simple and mechanical rules.

Consequently, most repository-translation tools evade the harder problems. They produce a sort of pidgin rendering that crudely and partially copies the history from the source system to the target without fully translating it into native idioms, leaving behind metadata that would take more effort to move over or leaving it in the native format for the source system.

But pidgin repository translations are a kind of friction drag on future development, and are just plain unpleasant to use. So instead of evading the hard problems, reposurgeon tackles them head-on.

Reposurgeon is written in Python, and its author recommends running it with PyPy, which provides a substantial speed increase (for Windows, get the latest Python 2.7 compatible PyPy binary). Unfortunately, I wasn’t able to do much with it, because reposurgeon failed to read the Subversion dump of my repo:

reposurgeon% read repo.svn
reposurgeon: from repo.svn......(0.03 sec) aborted by error.
reposurgeon: EOL not seen where expected, Content-Length incorrect at line 187

This was a bit unexpected, so I decided to put reposurgeon aside for the time being and try something else. Choosing between agito and svn2git, I chose the latter, mostly because it seemed to be actively maintained, whereas agito’s last update was about a year ago. Also, svn2git usage is more straightforward (no config file needed).

To set up svn2git on Windows, follow these steps:

  • Install your favorite Git flavour (Git for Windows or plain Git)
  • Get Ruby v1.9.x via RubyInstaller
  • Start command prompt with Ruby
  • cd c:\path\to\svn2git
  • gem install jeweler
  • gem install svn2git

My repo has a standard layout with branches and trunk (no tags), but it’s nested. According to the documentation, converting it with svn2git should’ve been as easy as this:

svn2git http://server/svn/my/nested/repo --notags --authors authors.txt --no-minimize-url --verbose

But after some processing, svn2git just gave up:

error: pathspec 'master' did not match any file(s) known to git.

Browsing issues on GitHub led me to this one: error: pathspec ‘master’ did not match any file(s) known to git. Common solutions are to delete the .git folder and start the conversion anew, and to explicitly specify --trunk, --branches and --tags (or --notags in my case). Needless to say, none of that worked for me. After some meddling with svn2git options, I concluded that problems with nested repos are common and that I’d better do something about it. Digging further led me to the svndumpfilter command and a way to move repo contents to the root folder:

If you want your trunk, tags, and branches directories to live in the root of your repository, you might wish to edit your dump files, tweaking the Node-path and Node-copyfrom-path headers so that they no longer have that first calc/ path component. Also, you’ll want to remove the section of dump data that creates the calc directory. It will look something like the following:

Node-path: calc
Node-action: add
Node-kind: dir
Content-length: 0

So, the first step would be to filter my nested repo from the dump:

svndumpfilter include "/nested/project" --drop-empty-revs < repo.svn > repo_filtered.svn

If svndumpfilter fails to process your dump (and that happens a lot), you might try the svndumpfilterIN Python script. Beware that on Windows this script produces broken dumps due to CR+LF issues. To fix this, you have to tell Python to open files in binary mode. Replacing these two lines in the script:

with open(input_dump) as input_file:
with open(output_dump, 'a+') as output_file:

with

with open(input_dump, 'rb') as input_file:
with open(output_dump, 'ab+') as output_file:

will take care of this.
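
To see why the mode matters, here is a small self-contained Python 3 sketch (not part of svndumpfilterIN; the sample bytes and file name are made up). It writes a few bytes containing a CR+LF pair, then reads them back in binary and in text mode. In Python 3 the default text mode translates line endings on read on every platform, which is the same kind of silent change that breaks the Content-length bookkeeping of a dump file.

```python
import os
import tempfile

# Hypothetical sample: a dump-like fragment that contains a CR+LF pair.
data = b"Node-path: trunk\r\nContent-length: 0\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.svn")
    with open(path, "wb") as f:
        f.write(data)

    # Binary mode: bytes come back exactly as written.
    with open(path, "rb") as f:
        binary_read = f.read()

    # Text mode (default newline handling): "\r\n" is translated to "\n".
    with open(path, "r", encoding="ascii") as f:
        text_read = f.read()

print(binary_read == data)  # True
print("\r" in text_read)    # False
```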

Update (02.01.2015): the issue above is fixed in the latest version of svndumpfilterIN (see this pull request). But I’ve faced another one: when filtering heavily tangled repos, svndumpfilterIN will crash while pulling a large number of tangled files from the source repo. I was able to conjure up a temporary workaround; see my issue on GitHub: Crash when untangling large amount of files. Or just use my fork of svndumpfilterIN, which has this and some other issues fixed and a few features added.

Example:

svndumpfilter.py repo.svn --repo=x:\svnpath\repo --output-dump=repo_filtered.svn include "nested/project" --stop-renumber-revs

Next, I have to search and replace all occurrences of /nested/project with /. There are a lot of sed one-liners available, but I opted for the SVN::DumpReloc Perl script. I used Strawberry Perl to run it on Windows.

svn-dump-reloc "nested/project" "/" < repo_filtered.svn > repo_filtered_relocated.svn
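
Conceptually, the relocation rewrites the Node-path and Node-copyfrom-path headers mentioned in the quote above. The Python sketch below illustrates that idea only; it is not how SVN::DumpReloc is implemented, and the names relocate_line and relocate_dump are hypothetical. It is deliberately naive: it does not track Content-length, so a file whose contents happen to contain a line starting with one of these headers would be altered too. Unlike the command above, which maps the prefix to /, it drops the prefix entirely.

```python
# Path headers that carry the repository-relative location of a node.
NODE_HEADERS = (b"Node-path: ", b"Node-copyfrom-path: ")


def relocate_line(line, old_prefix):
    """Strip old_prefix from a Node-path / Node-copyfrom-path header line."""
    for header in NODE_HEADERS:
        if not line.startswith(header):
            continue
        path = line[len(header):].rstrip(b"\n")
        if path == old_prefix:
            # The prefix directory itself becomes the (empty) root path.
            return header + b"\n"
        if path.startswith(old_prefix + b"/"):
            return header + path[len(old_prefix) + 1:] + b"\n"
    return line


def relocate_dump(src_path, dst_path, old_prefix):
    """Copy a dump file, rewriting path headers. Binary mode on both ends."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(relocate_line(line, old_prefix))


print(relocate_line(b"Node-path: nested/project/trunk\n", b"nested/project"))
# b'Node-path: trunk\n'
```

Note the second case in relocate_line: the node that used to create nested/project ends up with an empty path.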

But the relocated dump can’t be imported into SVN directly: due to the relocation, the first commit will try to create the root directory (an empty Node-path: entry), which is not allowed.

Revision-number: 123456
Prop-content-length: 111
Content-length: 111

K 7
svn:log
V 13
Start project
K 10
svn:author
V 3
John Doe
K 8
svn:date
V 27
2000-01-01T00:00:00.000000Z
PROPS-END

Node-path: 
Node-kind: dir
Node-action: add
Prop-content-length: 10
Content-length: 10

PROPS-END


Node-path: /subfolder
Node-kind: dir
Node-action: add
Prop-content-length: 10
Content-length: 10

PROPS-END

The section with the empty Node-path: entry (from the Node-path: line through its PROPS-END) should be removed. Make sure to use an editor that can handle big files and won’t change anything else (like line endings). If the revision contains only that one entry, the whole revision should be removed. This can be done either by editing the dump manually or by using svndumpfilter’s --revision parameter to skip this commit altogether. In my case, I had to remove only one section in the revision.

Revision-number: 123456
Prop-content-length: 111
Content-length: 111

K 7
svn:log
V 13
Start project
K 10
svn:author
V 3
John Doe
K 8
svn:date
V 27
2000-01-01T00:00:00.000000Z
PROPS-END

Node-path: /subfolder
Node-kind: dir
Node-action: add
Prop-content-length: 10
Content-length: 10

PROPS-END
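
If you would rather not hand-edit a large file, the same removal can be sketched in Python. This is a rough illustration under several assumptions, not a tested tool: the function name drop_root_node is made up, the whole dump is read into memory, and only the first record with an empty Node-path is removed. It works on bytes and uses the record's Content-length header to find where the record ends.

```python
import re


def drop_root_node(dump):
    """Remove the first node record whose Node-path is empty (bytes in/out)."""
    marker = b"\nNode-path: \n"
    pos = dump.find(marker)
    if pos == -1:
        return dump
    start = pos + 1  # keep the newline that precedes the record

    # Headers end at the first blank line after the record start.
    header_end = dump.index(b"\n\n", start) + 2
    headers = dump[start:header_end]

    # Content-length covers the property block plus any file content.
    match = re.search(rb"^Content-length: (\d+)$", headers, re.MULTILINE)
    length = int(match.group(1)) if match else 0
    end = header_end + length

    # Skip the blank lines that separate this record from the next one.
    while dump[end:end + 1] == b"\n":
        end += 1
    return dump[:start] + dump[end:]
```

Usage would be along the lines of reading repo_filtered_relocated.svn with open(..., 'rb'), passing the bytes through drop_root_node, and writing the result with open(..., 'wb').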

Then, I need to create a new SVN repo and load the filtered and relocated dump:

svnadmin create x:\svnpath\newrepo
svnadmin load x:\svnpath\newrepo < repo_filtered_relocated.svn

Finally, let’s see if I’m able to run svn2git against the new repo successfully:

svn2git http://server/svn/newrepo --notags --authors authors.txt --verbose

And this time it works right and proper, so I can push my shiny new Git repo to Visual Studio Online (don’t forget to set up alternate credentials):

git remote add origin https://project.visualstudio.com/DefaultCollection/_git/Project
git push -u origin --all

You can get much farther with a kind word and a PowerShell than you can with a kind word alone.

But that’s not all, folks! This story wouldn’t be complete without some PowerShell lifesaver, and I wouldn’t dream of disappointing you. Some of you may have noticed that svn2git requires an authors file to map SVN committers to Git authors. There are plenty of *nix solutions out there, but I needed a PowerShell one. Since we use VisualSVN Server, the SVN committers’ names are actually Windows domain accounts, so it would also be great to completely automate authors file creation using the authors’ full names and emails from Active Directory.

First, I need to get the list of SVN committers for my repo. To do this, I’ve wrapped the svn.exe log command in the PowerShell function Get-SvnAuthor. It returns the list of unique commit authors in one or more SVN repositories. I’m listing it here for your convenience, but if you intend to use it, grab the latest version from my GitHub repo instead.

<#
.Synopsis
	Get list of unique commit authors in SVN repository.

.Description
	Get list of unique commit authors in one or more SVN repositories.
	Requires Subversion binaries.

.Parameter Url
	This parameter is required.
	An array of strings representing URLs to the SVN repositories.

.Parameter User
	This parameter is optional.
	A string specifying username for SVN repository.

.Parameter Password
	This parameter is optional.
	A string specifying password for SVN repository.

.Parameter SvnPath
	This parameter is optional.
	A string specifying path to the svn.exe. Use it if Subversion binaries are not in your path variable, or you wish to use specific version.

.Example
	Get-SvnAuthor -Url 'http://svnserver/svn/project'

	Description
	-----------
	Get list of unique commit authors for SVN repository http://svnserver/svn/project

.Example
	Get-SvnAuthor -Url 'http://svnserver/svn/project' -User john -Password doe

	Description
	-----------
	Get list of unique commit authors for SVN repository http://svnserver/svn/project using username and password.

.Example
	Get-SvnAuthor -Url 'http://svnserver/svn/project' -SvnPath 'C:\Program Files (x86)\VisualSVN Server\bin\svn.exe'

	Description
	-----------
	Get list of unique commit authors for SVN repository http://svnserver/svn/project using custom svn.exe binary.

.Example
	Get-SvnAuthor -Url 'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2'

	Description
	-----------
	Get list of unique commit authors for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.

.Example
	'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2' | Get-SvnAuthor

	Description
	-----------
	Get list of unique commit authors for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.
#>
function Get-SvnAuthor
{
	[CmdletBinding()]
	Param
	(
		[Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
		[ValidateNotNullOrEmpty()]
		[string[]]$Url,

		[Parameter(ValueFromPipelineByPropertyName = $true)]
		[ValidateNotNullOrEmpty()]
		[string]$User,

		[Parameter(ValueFromPipelineByPropertyName = $true)]
		[ValidateNotNullOrEmpty()]
		[string]$Password,

		[ValidateScript({
			if(Test-Path -LiteralPath $_ -PathType Leaf)
			{
				$true
			}
			else
			{
				throw "$_ not found!"
			}
		})]
		[ValidateNotNullOrEmpty()]
		[string]$SvnPath = 'svn.exe'
	)

	Begin
	{
		if(!(Get-Command -Name $SvnPath -CommandType Application -ErrorAction SilentlyContinue))
		{
			throw "$SvnPath not found!"
		}
		$ret = @()
	}

	Process
	{
		$Url | ForEach-Object {
			$SvnCmd = @('log', $_, '--xml', '--quiet', '--non-interactive') + $(if($User){@('--username', $User)}) + $(if($Password){@('--password', $Password)})
			$SvnLog = &$SvnPath $SvnCmd *>&1

			if($LastExitCode)
			{
				Write-Error ($SvnLog | Out-String)
			}
			else
			{
				$ret += [xml]$SvnLog | ForEach-Object {$_.log.logentry.author}
			}
		}
	}

	End
	{
		$ret | Sort-Object -Unique
	}
}
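
For readers who don’t speak PowerShell, the core of the function (parse the XML log, collect the author elements, sort and deduplicate) can be sketched in a few lines of Python. The XML below is a made-up sample in the shape that svn log --xml --quiet produces; in practice the bytes would come from running that command. The name unique_authors is hypothetical, and unlike Sort-Object this sketch sorts case-sensitively.

```python
import xml.etree.ElementTree as ET

# Made-up sample shaped like the output of: svn log --xml --quiet
SAMPLE_LOG = b"""<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="3">
<author>domain\\jane</author>
<date>2000-01-03T00:00:00.000000Z</date>
</logentry>
<logentry revision="2">
<author>john@domain</author>
<date>2000-01-02T00:00:00.000000Z</date>
</logentry>
<logentry revision="1">
<author>john@domain</author>
<date>2000-01-01T00:00:00.000000Z</date>
</logentry>
<logentry revision="0">
<date>2000-01-01T00:00:00.000000Z</date>
</logentry>
</log>
"""


def unique_authors(log_xml):
    """Return the sorted list of distinct authors found in an SVN XML log."""
    root = ET.fromstring(log_xml)
    authors = {entry.findtext("author") for entry in root.iter("logentry")}
    authors.discard(None)  # revisions without an author element
    return sorted(authors)


print(unique_authors(SAMPLE_LOG))
```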

Second, I need to actually grab the authors’ info from Active Directory and save the resulting file. This is a job for another script of mine: New-GitSvnAuthorsFile. It uses the Get-SvnAuthor function, so place the two scripts side by side.

<#
.Synopsis
	Generate authors file for SVN to Git migration. Can map SVN authors to domain accounts and get full names and emails from Active Directory.

.Description
	Generate authors file for one or more SVN repositories. Can map SVN authors to domain accounts and get full names and emails from Active Directory.
	Requires Subversion binaries and Get-SvnAuthor function:
	https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1

.Notes
	Author: beatcracker (https://beatcracker.wordpress.com, https://github.com/beatcracker)
	License: Microsoft Public License (http://opensource.org/licenses/MS-PL)

.Component
	Requires Subversion binaries and Get-SvnAuthor function:
	https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1

.Parameter Url
	This parameter is required.
	An array of strings representing URLs to the SVN repositories.

.Parameter Path
	This parameter is optional.
	A string representing path, where to create authors file.
	If not specified, new authors file will be created in the script directory.

.Parameter ShowOnly
	This parameter is optional.
	If this switch is specified, no file will be created and script will output collection of author names and emails.

.Parameter QueryActiveDirectory
	This parameter is optional.
	A switch indicating whether or not to query Active Directory for author full name and email.
	Supports the following formats for SVN author name: john, domain\john, john@domain

.Parameter User
	This parameter is optional.
	A string specifying username for SVN repository.

.Parameter Password
	This parameter is optional.
	A string specifying password for SVN repository.

.Parameter SvnPath
	This parameter is optional.
	A string specifying path to the svn.exe. Use it if Subversion binaries are not in your path variable, or you wish to use specific version.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project'

	Description
	-----------
	Create authors file for SVN repository http://svnserver/svn/project.
	New authors file will be created in the script directory.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -QueryActiveDirectory

	Description
	-----------
	Create authors file for SVN repository http://svnserver/svn/project.
	Map SVN authors to domain accounts and get full names and emails from Active Directory.
	New authors file will be created in the script directory.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -ShowOnly

	Description
	-----------
	Create authors list for SVN repository http://svnserver/svn/project.
	Map SVN authors to domain accounts and get full names and emails from Active Directory.
	No authors file will be created, instead script will return collection of objects.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -Path c:\authors.txt

	Description
	-----------
	Create authors file for SVN repository http://svnserver/svn/project.
	New authors file will be created as c:\authors.txt

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -User john -Password doe

	Description
	-----------
	Create authors file for SVN repository http://svnserver/svn/project using username and password.
	New authors file will be created in the script directory.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -SvnPath 'C:\Program Files (x86)\VisualSVN Server\bin\svn.exe'

	Description
	-----------
	Create authors file for SVN repository http://svnserver/svn/project using custom svn.exe binary.
	New authors file will be created in the script directory.

.Example
	New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2'

	Description
	-----------
	Create authors file for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.
	New authors file will be created in the script directory.

.Example
	'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2' | New-GitSvnAuthorsFile

	Description
	-----------
	Create authors file for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.
	New authors file will be created in the script directory.
#>
[CmdletBinding()]
Param
(
	[Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[string[]]$Url,

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[ValidateScript({
		$ParentFolder = Split-Path -LiteralPath $_
		if(!(Test-Path -LiteralPath $ParentFolder  -PathType Container))
		{
			throw "Folder doesn't exist: $ParentFolder"
		}
		else
		{
			$true
		}
	})]
	[ValidateNotNullOrEmpty()]
	[string]$Path = (Join-Path -Path (Split-Path -Path $script:MyInvocation.MyCommand.Path) -ChildPath 'authors'),

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[switch]$ShowOnly,

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[switch]$QueryActiveDirectory,

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[string]$User,

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[string]$Password,

	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
	[Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
	[string]$SvnPath
)

# Dotsource 'Get-SvnAuthor' function:
# https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1
$ScriptDir = Split-Path $script:MyInvocation.MyCommand.Path
. (Join-Path -Path $ScriptDir -ChildPath 'Get-SvnAuthor.ps1')

# Strip extra parameters or splatting will fail
$Param = @{} + $PSBoundParameters
'ShowOnly', 'QueryActiveDirectory', 'Path' | ForEach-Object {$Param.Remove($_)}

# Get authors in SVN repo
$Names = Get-SvnAuthor @Param
[System.Collections.SortedList]$ret = @{}

# Exit, if no authors found
if(!$Names)
{
	Exit
}

# Find full name and email for every author
foreach($name in $Names)
{
	$Email = ''

	if($QueryActiveDirectory)
	{
		# Get account name from commit author name in any of the following formats:
		# john, domain\john, john@domain
		$Local:tmp = $name -split '(@|\\)'
		switch ($Local:tmp.Count)
		{
			1 { $SamAccountName = $Local:tmp[0] ; break }
			3 {
				if($Local:tmp[1] -eq '\')
				{
					[array]::Reverse($Local:tmp)
				}

				$SamAccountName = $Local:tmp[0]
				break
			}
			default {$SamAccountName = $null}
		}

		# Lookup account details
		if($SamAccountName)
		{
			$UserProps = ([adsisearcher]"(samaccountname=$SamAccountName)").FindOne().Properties

			if($UserProps)
			{
				Try
				{
					$Email = '{0} <{1}>' -f $UserProps.displayname[0], $UserProps.mail[0]
				}
				Catch{}
			}
		}
	}

	$ret += @{$name = $Email}
}

if($ShowOnly)
{
	$ret
}
else
{
	# Use System.IO.StreamWriter to write a file with Unix newlines.
	# It's also significantly faster than the Add\Set-Content cmdlets.
	Try
	{
		#StreamWriter Constructor (String, Boolean, Encoding): http://msdn.microsoft.com/en-us/library/f5f5x7kt.aspx
		$StreamWriter = New-Object -TypeName System.IO.StreamWriter -ArgumentList $Path, $false,  ([System.Text.Encoding]::ASCII)
	}
	Catch
	{
		throw "Can't create file: $Path"
	}
	$StreamWriter.NewLine = "`n"

	foreach($item in $ret.GetEnumerator())
	{
		$Local:tmp = '{0} = {1}' -f $item.Key, $item.Value
		$StreamWriter.WriteLine($Local:tmp)
	}

	$StreamWriter.Flush()
	$StreamWriter.Close()
}

And that’s all I need to create a fully functional authors file for my SVN repository:

.\New-GitSvnAuthorsFile.ps1 -Url 'http://server/svn/newrepo' -Path 'c:\svn2git\authors.txt' -QueryActiveDirectory

Here is the sample authors file, created by the command above:

john@domain = John Doe <john.doe@mycompany.com>
domain\jane = Jane Doe <jane.doe@mycompany.com>
doe = Doe <doe@mycompany.com>
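
The name-splitting logic from the script is easy to miss among the parameter plumbing, so here it is once more as a short Python sketch. The function names sam_account_name and authors_line are made up for illustration, and the Active Directory lookup itself is left out: the full name and email are passed in by hand.

```python
import re


def sam_account_name(svn_author):
    """Extract the account name from 'john', 'domain\\john' or 'john@domain'.

    Returns None for anything that does not fit one of these three shapes.
    """
    parts = re.split(r"(@|\\)", svn_author)
    if len(parts) == 1:
        return parts[0]
    if len(parts) == 3:
        # domain\john keeps the account last, john@domain keeps it first.
        return parts[2] if parts[1] == "\\" else parts[0]
    return None


def authors_line(svn_author, full_name, email):
    """Format one line of the authors file: svn_name = Full Name <email>."""
    return "{0} = {1} <{2}>".format(svn_author, full_name, email)


print(sam_account_name("john@domain"))   # john
print(sam_account_name("domain\\jane"))  # jane
print(authors_line("doe", "Doe", "doe@mycompany.com"))
```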

And now that’s all for today, enjoy your winter holidays and stay tuned for more!