# Building PowerShell modules with Swagger Codegen

A word of warning. This was written some time ago and I didn't have time to actually publish it till now. It's probably rendered obsolete by the release of PSSwagger, but I decided to post it anyway.

Web APIs are everywhere. They provide a cross-platform interface for applications to communicate with each other, enabling dev/ops people to create highly automated, interconnected systems. But there is a catch: to use an API, you need to write an API client in the language of your choice.

For PowerShell this means that you need to read the API spec, write code that sends the correct POST/GET requests, and transform raw XML/JSON responses into .NET/PowerShell-friendly objects. And don't forget tests!

As it happens, this problem was solved long ago for other languages by Swagger:

Swagger is the world’s largest framework of API developer tools for the OpenAPI Specification(OAS), enabling development across the entire API lifecycle, from design and documentation, to test and deployment.

Basically, Swagger allows you to design and document a web API and share it with the world using the OpenAPI Specification:

The OpenAPI Specification creates the RESTful contract for your API, detailing all of its resources and operations in a human and machine readable format for easy development, discovery, and integration.

Moreover, once you get your hands on someone's API spec, you can automatically build a fully featured client for it in your programming language, using Swagger Codegen:

Build APIs quicker and improve consumption of your Swagger-defined APIs in every popular language with Swagger Codegen. Swagger Codegen can simplify your build process by generating server stubs and client SDKs from your Swagger specification, so your team can focus better on your API’s implementation and adoption.

Sounds good, huh?

If a deal looks too good to be true, it probably is.
— Michael Douglas

As you might have already guessed, while Swagger Codegen does a tremendous job of generating fully spec-compatible code for a ton of programming languages, PowerShell is not one of them. Well, not until recently.

It all began when I tried to find a way to avoid writing yet another web-API module for PowerShell. The service I wished to interact with had an OpenAPI spec, so while I knew that Swagger Codegen didn't have PowerShell support, I thought that maybe I could generate a C# client and use that from PS. Sure, I'd have to wrap the .NET assemblies with some POSH functions, but that beats writing a bunch of Invoke-WebRequest calls. After creating a first WIP wrapper, the whole idea looked pretty solid, so I dropped a comment in this GitHub issue in the Swagger Codegen repo: PowerShell module.

Just FYI: since you already have a C# generator, you could just make a thin PowerShell wrapper around generated C# code. This could save some time.
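To give a rough idea of what such a thin wrapper looks like, here is a minimal sketch. The assembly path, API class name (IO.Swagger.Api.DefaultApi) and method (GetComic) are hypothetical placeholders, not the actual names Codegen would generate for your spec:

```powershell
# Hypothetical sketch: wrapping a Codegen-generated C# client in a POSH function.
# All names below are placeholders for whatever your spec actually produces.
Add-Type -Path '.\IO.Swagger.dll'

function Get-Comic
{
    param([int]$Id)

    # Instantiate the generated C# API client and call one of its methods
    $Api = New-Object IO.Swagger.Api.DefaultApi
    $Api.GetComic($Id)
}
```

The wrapper function adds PowerShell ergonomics (pipeline support, parameter binding) on top of the generated client, which does all the HTTP and deserialization work.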

One thing led to another, and I ended up providing a module template for top Swagger Codegen contributor William Cheng to build the PowerShell generator upon.

William then did all the hard work by converting this template to the actual PowerShell generator, which has been merged to the master branch of the Swagger Codegen repo: [PowerShell] Add PowerShell API client generator.

Knowledge is of no value unless you put it into practice.
— Anton Chekhov

The PowerShell generator is still beta quality and has some bugs and room for improvement. It's scheduled to be released with v2.2.3 of Swagger Codegen. But it will benefit tremendously from actual users trying to put it to work, and that's why I urge you to test it right now and leave your feedback and ideas in the Swagger Codegen repo.

Swagger Codegen is itself a Java application, so to get a working version of the current master branch with PowerShell support, you need to build it from source. Fortunately for you, I've made a set of PowerShell scripts that will do all this for you and more:

This repository contains a set of scripts that will:

The best thing is that you don't even have to hunt for the OpenAPI spec of your favorite service yourself! The Build.ps1 script supports the APIs.guru OpenAPI directory, which has API specs for almost every web service you can think of. The only thing you have to do is specify a spec name, like xkcd.com, and the script will fetch it and build a PowerShell module based on it for you. You can view the full list of APIs.guru API specs here: https://apis.guru/browse-apis/

If run without arguments, the Build.ps1 script will install all prerequisites, build Swagger Codegen, and generate the XKCD module using this spec.

All you need to do is:

• Run PowerShell/PowerShell ISE as Administrator
• Run Build.ps1 script

The generated module will be available in the xkcd.com\PowerShell\src\IO.Swagger directory. To import it use: Import-Module -Name '.\xkcd.com\PowerShell\src\IO.Swagger' -Verbose

#### Build custom module

If you have already run the Build.ps1 script and have all the prerequisites, you can build custom PowerShell modules.

##### By API name

Browse the APIs.guru API collection GitHub repo, pick the API you'd like to build a module for, and use its folder name as the ApiName parameter of the Build.ps1 script.

.\Build.ps1 -ApiName instagram.com

##### From custom file

If you already have a custom swagger.json/swagger.yml file, you can use it.

.\Build.ps1 -ApiName instagram.com -InFile .\path\to\spec\swagger.yml


#### Update Swagger Codegen

To pull the latest changes and rebuild Swagger Codegen, add the -UpdateCodegen switch.

.\Build.ps1 -UpdateCodegen


# Visualizing PowerShell pipeline

A picture1 is worth a thousand words.

Occasionally, I see people having issues trying to understand how the PowerShell pipeline is executed. Most of them have no problem when the Begin/Process/End blocks are in a single function. And if in doubt, I can always point them to Don Jones' The Advanced Function Lifecycle article. But when multiple cmdlets are chained into a single pipeline, things become a little less clear.

Consider this example.

function Use-Begin {
    Begin {
        Write-Host 'Begin'
    }
}

function Use-End {
    End {
        Write-Host 'End'
    }
}


Let’s try to pipe one function into another:

PS C:\Users\beatcracker> Use-Begin | Use-End

Begin
End


So far, so good, nothing unexpected. The Begin block of the Use-Begin function executes first, and the End block of the Use-End function executes last.

But what happens if we swap the functions in our pipeline?

PS C:\Users\beatcracker>  Use-End | Use-Begin

Begin
End


Gotcha! As you can see, nothing changed. Regardless of a cmdlet's position in the pipeline, Begin blocks are always executed first and End blocks last. This can be a bit counterintuitive, because it's easy to imagine the pipeline like this:

Begin-1 {} -> Process-1 {} -> End-1 {} | Begin-2 {} -> Process-2 {} -> End-2 {}


When in fact, the pipeline goes this way:

Begin-1 {}
Begin-2 {}
Process-1 {} | Process-2 {}
End-1 {}
End-2 {}


Which is more logical when you think about it for a second: for every function in the pipeline, the Begin block has to be executed once before the Process block, and when the Process blocks have finished iterating over every pipeline element, it's time to finally run the End block. This gives us the picture above.
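The order above is easy to verify with two throwaway functions (Step-One and Step-Two are made-up names for this demo):

```powershell
function Step-One {
    Begin   { Write-Host 'Begin-1' }
    Process { Write-Host 'Process-1'; $_ }   # pass the current item downstream
    End     { Write-Host 'End-1' }
}

function Step-Two {
    Begin   { Write-Host 'Begin-2' }
    Process { Write-Host 'Process-2' }
    End     { Write-Host 'End-2' }
}

1, 2 | Step-One | Step-Two

# Prints: Begin-1, Begin-2,
#         Process-1, Process-2 (first item), Process-1, Process-2 (second item),
#         End-1, End-2
```

Both Begin blocks fire before any item is processed, the Process blocks interleave per item, and both End blocks fire last.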

To illustrate my point, I've created a View-Pipeline function that generates a chained pipeline of advanced functions with Begin/Process/End blocks and displays their execution order. It makes it easy to visualize pipeline processing and get a solid understanding of the pipeline lifecycle.

Here are some visualization examples made with this function:

• One advanced function with Begin/Process/End blocks
PS C:\Users\beatcracker> View-Pipeline

View-Pipeline-1

[View-Pipeline-1]::Begin
[View-Pipeline-1]::Process
In : ""
Out: "View-Pipeline-1"
[View-Pipeline-1]::End

• Three advanced functions, each with its own Begin/Process/End blocks, passing one item through the pipeline
PS C:\Users\beatcracker> View-Pipeline -Pipes 3

View-Pipeline-1 | View-Pipeline-2 | View-Pipeline-3

[View-Pipeline-1]::Begin
[View-Pipeline-2]::Begin
[View-Pipeline-3]::Begin
[View-Pipeline-1]::Process
In : ""
Out: "View-Pipeline-1"
[View-Pipeline-2]::Process
In : "View-Pipeline-1"
Out: "View-Pipeline-2"
[View-Pipeline-3]::Process
In : "View-Pipeline-2"
Out: "View-Pipeline-3"
[View-Pipeline-1]::End
[View-Pipeline-2]::End
[View-Pipeline-3]::End

• Two advanced functions, with only Process/End blocks, passing two items through the pipeline
PS C:\Users\beatcracker> View-Pipeline -Pipes 2 -Items 2 -NoBegin

View-Pipeline-1 | View-Pipeline-2

[View-Pipeline-1]::Process
In : ""
Out: "View-Pipeline-1"
[View-Pipeline-2]::Process
In : "View-Pipeline-1"
Out: "View-Pipeline-2"
[View-Pipeline-2]::Process
In : "View-Pipeline-1"
Out: "View-Pipeline-2"
[View-Pipeline-1]::End
[View-Pipeline-2]::End


That was easy, wasn't it?

I’m hoping that anyone, when shown this post, can get the gist of the PowerShell’s pipeline lifecycle in no time. If not – just let me know and I’ll do my best to improve it.

1. There are actually no pictures here. Sorry.

# Using Group Managed Service Accounts without Active Directory module

Hello and, again, welcome to the Aperture Science computer-aided enrichment center.
We hope your brief detention in the relaxation vault has been a pleasant one.

Managed Service Accounts (MSA) first appeared in Windows Server 2008 R2 and received a major overhaul (gMSA) in Windows Server 2012. These accounts have automatically managed passwords and are tied to a specific computer (WS 2008 R2) or group of computers (WS 2012). They cannot be locked out and cannot perform interactive logons, which makes them ideal for running services. Under the hood, MSAs are user accounts that inherit from the parent object class "Computer", and the only supported way to manage them is PowerShell.

To do so, you have to use cmdlets from the Active Directory module. This is a two-step process: first you create the gMSA in AD, and then you "install" this account on the target computer(s).
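For example, the two steps with the standard Active Directory module cmdlets look roughly like this (the account and group names are made up, and a KDS root key must already exist in the domain):

```powershell
# Step 1: on a management host with the Active Directory module,
# create the gMSA in AD (run once per account):
New-ADServiceAccount -Name 'MyGmsa' -DNSHostName 'MyGmsa.contoso.com' `
    -PrincipalsAllowedToRetrieveManagedPassword 'GmsaHostsGroup'

# Step 2: on each target computer allowed to use the account,
# "install" it locally:
Install-ADServiceAccount -Identity 'MyGmsa'

# Verify that the account is usable from this host:
Test-ADServiceAccount -Identity 'MyGmsa'
```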

Here are the cmdlets used for the AD part of the process:

And here are the ones used to manage gMSAs on the target computers:

When life gives you lemons, don’t make lemonade. Make life take the lemons back!
Get mad! I don’t want your damn lemons, what the hell am I supposed to do with these?
Demand to see life’s manager! Make life rue the day it thought it could give Cave Johnson lemons!
— Cave Johnson

While I don't often create gMSAs in AD, I do need to be able to install them en masse on servers, preferably via a remote PowerShell session. And here comes the pain.

For starters, the Active Directory module is not installed by default unless the server is a domain controller. While this is easily solved by running

Import-Module ServerManager


You still have to do it for every machine you’re trying to use gMSA on. Even with automation, this is another thing to bear in mind.

But this is only a minor issue compared to what comes next. This is what you get when you try to import the Active Directory module in a remote session on a machine where you need to install a gMSA:

Import-Module ActiveDirectory
WARNING: Error initializing default drive: 'Unable to contact the server.
This may be because this server does not exist, it is currently down,
or it does not have the Active Directory Web Services running.'.


This is the classic double-hop issue, which can be fixed using several methods, some better than others. Here are the most used ones:

#### Pass fresh credentials inside the Invoke-Command scriptblock

Pros: It works.
Cons: Makes unattended automation ungainly.

#### Configure CredSSP

Pros: This is the most used and well-documented approach.
Cons: Has security issues.

When using CredSSP, PowerShell will perform a “Network Clear-text Logon” instead of a “Network Logon”. Network Clear-text Logon works by sending the user’s clear-text password to the remote server

I definitely do not want to send plaintext passwords over the network, even if the channel itself is encrypted. Because guess what? When the password arrives at the destination server, it will be stored in memory in plain sight. So if the target server is compromised, an attacker can grab my domain admin credentials using Mimikatz with no effort at all. And even if you're extra careful and use a non-privileged account for remoting, it's still a valid AD account.

#### Use Resource-Based Kerberos Constrained Delegation

With Windows Server 2012, Microsoft made improvements to the Kerberos delegation mechanism, which means that with some simple configuration you can solve double-hop issues in an easy and secure manner: PowerShell Remoting Kerberos Double Hop Solved Securely.

Pros: Works, has no security issues.
Cons: Requires Windows Server 2012 and above for most servers involved.

Do you know who I am? I’m the man who’s gonna burn your house down! With the lemons!
I’m gonna get my engineers to invent a combustible lemon that burns your house down!
— Cave Johnson

All of my servers are Windows Server 2012 R2 and have resource-based Kerberos constrained delegation configured. This is what I use daily for my remoting tasks. Unfortunately, I couldn't make the Active Directory module work over PSRemoting with it. No matter what I tried, I still got the error message above. So with this module, it's CredSSP or nothing.

I'm OK with occasionally logging on to the DC via RDP to create gMSAs, but repeating the same steps over and over for every server where a gMSA has to be installed? Hell no!

So the question is: do we really need the Active Directory module to install gMSAs on the target server? I did some digging and found that the Active Directory module uses a set of Win32 API functions to manage gMSAs locally:

After that it was fairly easy to write my own implementation of the cmdlets that are used to install a gMSA locally. For the sake of simplicity, I decided not to create a full-blown module and instead made a self-contained function which can even be copy-pasted into a remote session. It doesn't use the Active Directory module and works fine with resource-based Kerberos constrained delegation. You can grab it from my GitHub repo:

Here is a quick breakdown of the original Active Directory gMSA cmdlets and their Use-ServiceAccount counterparts:

'GMSA_Account' | Use-ServiceAccount -Add


This will install Group Managed Service Account with SAM account name GMSA_Account on the computer on which the cmdlet is run.

'GMSA_Account' | Use-ServiceAccount -Query


Used to query the status of the service account on the local computer. Using the -Detailed switch, you can even get an MSA_INFO_STATE enumeration containing detailed information on the (g)MSA state:

'GMSA_Account' | Use-ServiceAccount -Query -Detailed

'GMSA_Account' | Use-ServiceAccount -Remove


Removes a gMSA from the local computer or a standalone managed service account (sMSA) from Active Directory. If the local computer is disconnected from the domain, you can remove the service account object and the data stored in the LSA using the -ForceRemoveLocal switch:

'GMSA_Account' | Use-ServiceAccount -Remove -ForceRemoveLocal


As a bonus, you can also test whether the specified standalone managed service account (sMSA) or group managed service account (gMSA) exists in the Netlogon store on this server. This action has no Active Directory module counterpart:

'GMSA_Account' | Use-ServiceAccount -Test


Thank you for participating in this Aperture Science computer-aided enrichment activity.

Debugging this issue and writing the Use-ServiceAccount function was fun, but I'd really like to avoid all this altogether. If any of you managed to get the Active Directory module working with resource-based Kerberos constrained delegation, don't hesitate to let me know! And be sure to drop a comment or create an issue on GitHub if something doesn't work for you.

# Try PowerShell on Linux in your browser

Try the latest release of PowerShell 6.0 on Ubuntu 16.04 on a free cloud server from Dply:

### How-to

1. Click the button, log in with your GitHub account and start your server.
2. When the server is up (~3 minutes), navigate to the server's IP address in your browser.
3. Log in with root as the username and the hostname you set in the server configuration as the password.
4. Type powershell, hit Enter and start hacking around!

# Get/set XML Schema and Content Types for SharePoint list directly in the content database

There is a charm about the forbidden that makes it unspeakably desirable.
— Mark Twain

### Why would you do it?

Sometimes, despite all the warnings, you need to modify the XML Schema and/or Content Types of a SharePoint list directly in the content database. This could be caused by moving stuff around, a failed upgrade, or a removed SharePoint feature that resulted in broken lists.

In SharePoint 2007 and earlier that was fairly easy: you could just fire up SQL Management Studio, dig into the content database and fix it there. A list's fields, which are part of its XML Schema, are stored in the tp_Fields column, and Content Types are stored in the tp_ContentTypes column of the AllLists table in the content database.
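Peeking at those columns boils down to a query like this (sketched here via the SqlServer module's Invoke-Sqlcmd; the list GUID is a placeholder, and I'm assuming the list is identified by the tp_ID column):

```powershell
# Inspect the stored schema and content types for one list.
# Server, database and list GUID below are placeholders.
Invoke-Sqlcmd -ServerInstance 'SQLSRV' -Database 'SP_CONTENT' -Query @"
SELECT tp_Fields, tp_ContentTypes
FROM dbo.AllLists
WHERE tp_ID = 'CFF8AE4B-A78D-444C-8EFD-5FE290821CB9'
"@
```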

### So, what’s changed?

In SharePoint 2010 and later, these columns no longer hold plain XML: the data is stored as compressed binary objects. With luck and some googling around, I found that the compressed object format is documented in [MS-WSSFO3]: Windows SharePoint Services (WSS): File Operations Database Communications Version 3 Protocol. These objects are called WSS Compressed Structures and consist of a simple header followed by a zlib-compressed string.

Zlib streams can be extracted using the System.IO.Compression.DeflateStream class. To do so, you have to skip the first 14 bytes: 12 bytes for the WSS Compressed Structure header and 2 bytes for the zlib stream header, since DeflateStream doesn't understand them. Here is what the decompressed data looks like (beautified; it's actually stored as one long string):

#### Fields:

15.0.0.4701.0.0
<FieldRef
Name="ContentTypeId" />
<FieldRef
Name="Title"
ColName="nvarchar1" />
<FieldRef
ColName="ntext1" />
<FieldRef
Name="File_x0020_Type"
ColName="nvarchar2" />
<Field
ID="{246d0907-637c-46b7-9aa0-0bb914daa832}"
Name="_Author"
Group="$Resources:core,Document_Columns;"
Type="Text"
DisplayName="$Resources:core,Author;"
SourceID="http://schemas.microsoft.com/sharepoint/v3/fields"
StaticName="_Author"
Description="$Resources:core,_AuthorDesc;"
Sealed="TRUE"
AllowDeletion="TRUE"
ShowInFileDlg="FALSE"
ColName="nvarchar3"
RowOrdinal="0" />
<Field
ID="{875fab27-6e95-463b-a4a6-82544f1027fb}"
Name="RelatedIssues"
Group="$Resources:core,Extended_Columns;"
Type="LookupMulti"
Mult="TRUE"
DisplayName="$Resources:core,Related_Issues;"
SourceID="http://schemas.microsoft.com/sharepoint/v3"
StaticName="RelatedIssues"
PrependId="TRUE"
List="Self"
ShowField="Title"
ColName="int1"
RowOrdinal="0" />

#### Content Types:

<ContentType
ID="0x01005144F19DD8291D42BAAA922235A381BD"
Name="$Resources:core,Item;"
Group="$Resources:core,List_Content_Types;"
Description="$Resources:core,ItemCTDesc;"
Version="4"
FeatureId="{695b6570-a48b-4a8e-8ea5-26ea7fc1d162}">
<FieldRefs>
<FieldRef
ID="{c042a256-787d-4a6f-8a8a-cf6ab767f12d}"
Name="ContentType" />
<FieldRef
ID="{fa564e0f-0c70-4ab9-b863-0177e6ddd247}"
Name="Title"
Required="TRUE"
ShowInNewForm="TRUE"
ShowInEditForm="TRUE" />
<FieldRef
ID="{246d0907-637c-46b7-9aa0-0bb914daa832}"
Name="_Author" />
</FieldRefs>
<XmlDocuments>
<XmlDocument
NamespaceURI="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms">
<FormTemplates xmlns="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms">
<Display>ListForm</Display>
<Edit>ListForm</Edit>
<New>ListForm</New>
</FormTemplates>
</XmlDocument>
</XmlDocuments>
<Folder
TargetName="Item" />
</ContentType>
<ContentTypeRef
ID="0x01200066684BCED23D0D4CAEE3EB61649D788E" />
<ContentType
ID="0x01" />
<ContentType
ID="0x0120" />
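
The extraction step described above (skipping the 14-byte header before inflating with DeflateStream) can be sketched like this. The file path and the UTF-8 text encoding are assumptions for illustration:

```powershell
# Read a WSS Compressed Structure blob saved to disk (path is illustrative)
$Bytes = [System.IO.File]::ReadAllBytes('X:\Wss\blob.bin')

# Skip 12 bytes of WSS header + 2 bytes of zlib header,
# since DeflateStream expects a raw deflate stream
$MemoryStream = New-Object System.IO.MemoryStream(,$Bytes)
$null = $MemoryStream.Seek(14, [System.IO.SeekOrigin]::Begin)

$Deflate = New-Object System.IO.Compression.DeflateStream($MemoryStream, [System.IO.Compression.CompressionMode]::Decompress)
$Reader = New-Object System.IO.StreamReader($Deflate, [System.Text.Encoding]::UTF8)
$Xml = $Reader.ReadToEnd()
$Reader.Close()
```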


But if you want to modify the resulting data and compress it back, DeflateStream isn't the best option, since you'd have to manually add the zlib header and ADLER32 checksum.

Fortunately, the DotNetZip library provides easy static methods to compress/expand zlib streams: CompressBuffer/UncompressBuffer. I've tested them, and SharePoint accepts zlib data generated by CompressBuffer if it's paired with a correct WSS Compressed Structure header.
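Assuming the DotNetZip assembly is available on disk (the path and variable names below are illustrative), compressing the modified XML back is a one-liner; producing the 12-byte WSS header is still up to you:

```powershell
# Load the DotNetZip assembly (path is illustrative)
Add-Type -Path 'X:\Libs\Ionic.Zip.dll'

# $ModifiedXml holds the edited schema as a string
$Data = [System.Text.Encoding]::UTF8.GetBytes($ModifiedXml)

# CompressBuffer produces a complete zlib stream:
# zlib header + deflate data + ADLER32 checksum
$Compressed = [Ionic.Zlib.ZlibStream]::CompressBuffer($Data)

# $Compressed must then be prefixed with a correct
# 12-byte WSS Compressed Structure header before writing it back
```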

### I wish there was an easier way to mess up my database!

Me too, so I've made a PowerShell module to get/set the XML Schema and Content Types of a SharePoint list directly in the content database. It uses Warren F's Invoke-SqlCmd2 function, so you can grab/modify data in the SharePoint content database without messing with SQL queries:

### Any tips on using it?

Sure. This example shows how to modify the XML Schema for the list with ID cff8ae4b-a78d-444c-8efd-5fe290821cb9, stored in the SharePoint content database SP_CONTENT on server SQLSRV.

#### Using module

• Download module as Zip (unblock zip file before unpacking) or clone this repo using Git
• Import module:
Import-Module -Path 'X:\Path\To\WssCompressedStructure\Module'

• Backup XML Schema blob for SharePoint list to file:
Get-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9' | Export-WssCompressedStructureBinary -DestinationPath 'X:\Wss\'

• Export XML Schema for SharePoint list to file:
Get-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9' | Expand-WssCompressedStructure -DestinationPath 'X:\Wss\'

• Modify file cff8ae4b-a78d-444c-8efd-5fe290821cb9.xml to your needs

• Update XML Schema in database for this list:

New-WssCompressedStructure -Path 'X:\Wss\cff8ae4b-a78d-444c-8efd-5fe290821cb9.xml' | Set-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9'

• If something goes wrong, restore XML Schema from blob:
'X:\Wss\cff8ae4b-a78d-444c-8efd-5fe290821cb9.bin' | Import-WssCompressedStructureBinary | Set-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9'


#### Small note

If you've upgraded your SharePoint installation (2007 → 2010), some of the lists in the database can still contain uncompressed XML data in the tp_Fields and tp_ContentTypes columns. My module checks that the data returned from SQL is a valid WSS Compressed Structure and will ignore such lists. Keep that in mind if Get-SpListWssCompressedStructure returns nothing.

### Last warning

If you want a guarantee, buy a toaster.
— Clint Eastwood

I hope I've stressed it enough: by directly modifying the SharePoint database, you're voiding any chance of getting official support from Microsoft. So make sure you have backups, backups of backups, and a plan for rebuilding your SharePoint farm. Happy hacking!

# Migrating a SVN repo to Git, part deux: SubGit to the rescue

To improve is to change; to be perfect is to change often.
— Winston Churchill

In my previous post about SVN→Git conversion, I described the steps to convert a nested SVN repo to Git using svn2git, svndumpfilterIN, SVN::DumpReloc and some manual editing of SVN dump files.

This process worked fine for smaller repos, but after some threshold I hit a wall: the final conversion with svn2git for one of the larger repos was taking 5 days and never quite finished because the Windows version of Git kept crashing in the middle of the process. Those crashes were related to Cygwin's implementation of fork, which requires some address space to be reserved for the Cygwin heap, and a 5-day-long run was exhausting those addresses.

After a couple of attempts to convert the repo (which took about 2 weeks!), I realized that I needed a more robust and preferably faster solution. And that's when I finally found SubGit!

SubGit is a tool for a smooth, stress-free SVN to Git migration. Create writable Git mirror of a local or remote Subversion repository and use both Subversion and Git as long as you like. You may also do a fast one-time import from Subversion to Git or use SubGit within Atlassian Bitbucket Server.

SubGit is a commercial closed-source Java application. Fortunately, it's free for one-time conversions and for mirroring repos with up to 10 Git and SVN users. It also has a time-trial version that will mirror a repo with any number of users for one month. If you're daring enough, you can also use EAP or interim builds. Note that interim builds don't seem to have any time/user limits whatsoever.

With SubGit, I was able to convert the abovementioned SVN repo to Git overnight, without any extra steps, using this simple command:

subgit import --svn-url http://server/svn/my/nested/repo --authors-file .\authors.txt .\repo.git

SubGit can also filter/map SVN branches to Git ones, which saves you from filtering the SVN repo when you need to sync/migrate only parts of it. To do so, however, you need to install SubGit into the target SVN repository and configure it by modifying the subgit.conf file.

Since SubGit requires a JRE and I'm reluctant to install one system-wide, I've created a set of scripts to use SubGit with a portable JRE. You can find them in this GitHub repository: SubGit-Portable.

Here is a step-by-step walkthrough of installing and configuring portable SubGit with a JRE. Please note that SubGit can work in two modes: local and remote. This guide assumes local mode, where the SVN server, SubGit and Git (repo and executables) are on the same server.

##### Create portable SubGit installation
• Get the latest SubGit release and unpack it to the SubGit directory inside the repo
• Get the latest Oracle JRE/Server JRE and put it in the JRE_Installer directory inside the repo. You can use either the exe installer or the tar.gz archive.
• Double-click unpack_jre.cmd in the JRE_Installer folder. It will unpack all exe installers/tar.gz archives to the JRE_Portable folder.
• Edit subgit.cmd to set the name of the folder with the portable JRE to be used, e.g. set jre_dir=jre-8u40-windows-x64

Now you can use subgit.cmd as if it were the original subgit.bat:

C:\Path\To\SubGit-Portable> subgit --version
SubGit version 3.1.1 ('Bobique') build #3448
(c) TMate Software 2012-2016 (http://subgit.com/)


##### Install SubGit into the SVN repo
• Run this command to install the SubGit binaries into the target SVN repo: subgit configure X:\Path\To\SVN\Repo. You should see something like this:
C:\Path\To\SubGit-Portable> subgit configure X:\Path\To\SVN\Repo
SubGit version 3.1.1 ('Bobique') build #3448

Detecting paths eligible for translation... done.
Subversion to Git mapping has been configured:
<root> : X:\Path\To\SVN\Repo\.git

CONFIGURATION SUCCESSFUL

To complete SubGit installation do the following:

1) adjust SubGit configuration file, if necessary:
X:\Path\To\SVN\Repo\conf\subgit.conf
2) add custom authors mapping to the authors.txt file(s) at:
X:\Path\To\SVN\Repo\authors.txt
3) run SubGit 'install' command:
subgit install "X:\Path\To\SVN\Repo\"


• Create an authors.txt file to map SVN committers to Git authors. You can use my New-GitSvnAuthorsFile script for this. See my previous post for details.

• Set up a mapping of projects in the SVN repo to the corresponding Git repositories. In my case, the SVN repo held multiple projects and I needed to create Git repositories only for some of them. SubGit can detect nested repositories and by default will add all of them to the configuration file, so I had to edit conf\subgit.conf in the SVN repo folder to remove the unneeded ones. Here is an example of the configuration section for a single nested project:

[git "my/nested/repo"]

# Path within Subversion repository to the root of trunk/branches/tags structure.
translationRoot = my/nested/repo

# Path to the Git repository.
repository = X:/Git/repo
pathEncoding = UTF-8

# Options below (trunk, branches, tags, shelves) define the correspondence between Subversion
# directories and Git references. Depending on the actual Subversion project layout, and whether
# all or only some of the branches have to be mirrored, these options might need to be adjusted.

shelves = shelves/*:refs/shelves/*
tags = tags/*:refs/tags/*


• Start SVN→Git translation process: subgit install X:\Path\To\SVN\Repo

C:\Path\To\SubGit-Portable> subgit install X:\Path\To\SVN\Repo
SubGit version 3.1.1 ('Bobique') build #3448

Subversion to Git mapping has been found:
    /my/nested/repo : X:\Git\repo

Translating Subversion revisions to Git commits...
    Subversion revisions translated: 54321.
    Total time: 6543 seconds.

INSTALLATION SUCCESSFUL

If something goes wrong, files named subgit-COMMAND-DATE-TIME.zip in the SubGit-Portable folder and in X:\Path\To\SVN\Repo\subgit\logs\ contain a detailed operations log and error messages.

Basically that’s all you have to do to get SubGit up and running!

##### Updating SubGit

SubGit is actively developed and from time to time you’d need to update your existing SubGit installation to the latest build.

• Get latest SubGit build
• Delete all files in Subgit-Portable\SubGit folder and unpack new build here
• Reinstall SubGit to the target SVN repo: subgit install X:\Path\To\SVN\Repo

For example, this is what upgrading from an EAP build looks like:

SubGit version 3.0.0-EAP ('Bobique') build #3262

This is an EAP build, which you may not like to use in production environment.

Subversion to Git mapping has been found:
    /my/nested/repo : X:\Git\repo

About to shut down background translation process.
Background translation process is not running.
SubGit binaries have been upgraded (3.0.0-EAP#3141 > 3.1.1#3448).
Information on previously encountered errors is cleared.
Processing '/my/nested/repo'
Translating Subversion revisions to Git commits...
    Subversion revisions translated: 54321.
    Total time: 6543 seconds.

INSTALLATION SUCCESSFUL

I've been using SubGit with custom hooks to enable one-way Git mirroring to Visual Studio Online for quite a long time and it works perfectly. So if you ever find yourself in need of painless SVN→Git migration, give it a try!

# Writing stealth code in PowerShell

What happens in module, stays in module.

Most of my scripts use the Import-Component function to bulk-import dependencies (PS1 files with functions, modules, source code, .NET assemblies).

To import PS1 files with functions, they have to be dot-sourced, and that presented me with a challenge: if a PS1 file is dot-sourced inside a function, its contents will be available only in that function's scope. To overcome this, I could scope each contained function, alias, and variable as global (nasty!) or call the Import-Component function itself using dot-sourcing (yes, you can dot-source more than just files).

For a while, dot-sourcing Import-Component seemed to work fine, until one day I realized that this effectively pollutes the caller's scope with all of Import-Component's internal variables. Consider this example:

function DotSource-Me
{
    $MyString = 'Internal variable'
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"


If we run this script, the output will be:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: Internal variable

As you can see, when the DotSource-Me function is called as usual, its internal variable is restricted to the function's scope and doesn't affect the caller's scope. But when it's dot-sourced, the variable in the caller's scope is overwritten.

To remedy this, we can take advantage of the fact that creating a new module creates an entirely new SessionState. It means that everything that happens inside the module is completely isolated. So if we place all the code inside the function in a dynamically generated module, it won't affect anything outside, even if dot-sourced. We also don't want to pollute the caller's scope with the newly created module object. Luckily for us, the New-Module cmdlet has a -ReturnResult parameter that runs the script block and returns its results instead of a module object. So let's modify our example:

function DotSource-Me
{
New-Module -ReturnResult -ScriptBlock {
$MyString = 'Internal variable'
}
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"


And then run it and observe the results:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: External variable

That’s so much better!

But what if our function that has to be dot-sourced has parameters? Unfortunately, PowerShell will create a variable for each parameter, and because the function is dot-sourced, those variables will be created in the caller's scope:

function DotSource-Me
{
Param
(
$MyString
)
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"


And they will pollute and/or overwrite variables in the caller's scope:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains:

To mitigate this issue, we can exploit the fact that PowerShell doesn't create corresponding variables for dynamic parameters. Note that the code in the DynamicParam block has to be wrapped in New-Module too, otherwise it will be executed in the caller's scope:

function DotSource-Me
{
[CmdletBinding()]
Param()
DynamicParam
{
New-Module -ReturnResult -ScriptBlock {
# Set the dynamic parameter's name
$ParameterName = 'MyString'

# Create the dictionary
$RuntimeParameterDictionary = New-Object System.Management.Automation.RuntimeDefinedParameterDictionary

# Create the collection of attributes
$AttributeCollection = New-Object System.Collections.ObjectModel.Collection[System.Attribute]

# Create and set the parameter's attributes
$ParameterAttribute = New-Object System.Management.Automation.ParameterAttribute

# Add the attributes to the attributes collection
$AttributeCollection.Add($ParameterAttribute)

# Create and return the dynamic parameter
$RuntimeParameter = New-Object System.Management.Automation.RuntimeDefinedParameter($ParameterName, [string], $AttributeCollection)
$RuntimeParameterDictionary.Add($ParameterName, $RuntimeParameter)
$RuntimeParameterDictionary
}
}
}

$MyString = "External variable"

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"


And the result is:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: External variable

To make things easier, you can put my New-DynamicParameter function inside the New-Module scriptblock and use it like this:

function DotSource-Me
{
[CmdletBinding()]
Param()
DynamicParam
{
New-Module -ReturnResult -ScriptBlock {
Function New-DynamicParameter
{
# function body here...
}

New-DynamicParameter -Name MyString -Type ([string])
}
}
}

$MyString = "External variable"

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

### Bonus chapter

What if we really need to execute something in the caller's scope from the New-Module scriptblock? In the Import-Component function, the dot-sourcing command itself has to be executed in the caller's scope, while all other code should be well-hidden in New-Module. To achieve the desired result, I'm using the not-so-well-known fact that scriptblocks are bound to the session state:

Any script block that's defined in a script or script module (in literal form, not dynamically created with something like [scriptblock]::Create()) is bound to the session state of that module (or to the "main" session state, if not executing inside a script module.) There is also information specific to the file that the script block came from, so things like breakpoints will work when the script block is invoked. When you pass in such a script block as a parameter across script module boundaries, it is still bound to its original scope, even if you invoke it from inside the module.

Here is the final example:

function DotSource-Me
{
[CmdletBinding()]
Param()
DynamicParam
{
New-Module -ReturnResult -ScriptBlock {
# Set the dynamic parameter's name
$ParameterName = 'ScriptBlock'

# Create the dictionary
$RuntimeParameterDictionary = New-Object System.Management.Automation.RuntimeDefinedParameterDictionary

# Create the collection of attributes
$AttributeCollection = New-Object System.Collections.ObjectModel.Collection[System.Attribute]

# Create and set the parameter's attributes
$ParameterAttribute = New-Object System.Management.Automation.ParameterAttribute

# Add the attributes to the attributes collection
$AttributeCollection.Add($ParameterAttribute)

# Create and return the dynamic parameter
$RuntimeParameter = New-Object System.Management.Automation.RuntimeDefinedParameter($ParameterName, [scriptblock], $AttributeCollection)
$RuntimeParameterDictionary.Add($ParameterName, $RuntimeParameter)
$RuntimeParameterDictionary
}
}

Process
{
New-Module -ReturnResult -ScriptBlock {
$MyString = "Internal variable"

# Execute scriptblock
& $PSBoundParameters.ScriptBlock
}
}
}

$MyString = "External variable"
$MyScriptBlock = {Write-Host "Scriptblock, 'MyString' contains: $MyString"}

Write-Host "Script, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me -ScriptBlock $MyScriptblock

Note that although the $MyString variable is defined inside the New-Module scriptblock, the code in the MyScriptBlock parameter's scriptblock is executed in the caller's scope and accesses the $MyString variable from there:

Script, 'MyString' contains: External variable
Scriptblock, 'MyString' contains: External variable

# Dynamic parameters, ValidateSet and Enums

Good intentions often get muddled with very complex execution. The last time the government tried to make taxes easier, it created a 1040 EZ form with a 52-page help booklet.
— Brad D. Smith

I suppose that many of you have heard about Dynamic Parameters, but thought of them as too complicated to implement in real-life scenarios. Just look at the amount of code you have to write to add one simple parameter with a dynamic ValidateSet argument.

Recently I had to write a fair amount of functions which use enums' values as parameters (Special Folders, Access Rights, etc.). Naturally, I'd like to have these parameters validated with ValidateSet and have tab-completion as a bonus. But this means hardcoding every enum member's name in the ValidateSet argument.

Today's example is a function that returns a Special Folder path. It accepts one parameter, Name, validates its values against all known folder names and returns filesystem paths. Here is how it looks with a hardcoded ValidateSet:

function Get-SpecialFolderPath
{
[CmdletBinding()]
Param
(
[Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, Position = 0)]
[ValidateNotNullOrEmpty()]
[ValidateSet(
'Desktop', 'Programs', 'MyDocuments', 'Personal', 'Favorites', 'Startup', 'Recent', 'SendTo',
'StartMenu', 'MyMusic', 'MyVideos', 'DesktopDirectory', 'MyComputer', 'NetworkShortcuts', 'Fonts',
'ApplicationData', 'PrinterShortcuts', 'LocalApplicationData', 'InternetCache', 'Cookies', 'History',
'CommonApplicationData', 'Windows', 'System', 'ProgramFiles', 'MyPictures', 'UserProfile', 'SystemX86',
'ProgramFilesX86', 'CommonProgramFiles', 'CommonProgramFilesX86', 'CommonTemplates', 'CommonDocuments'
)]
[array]$Name
)

Process
{
$Name | ForEach-Object { [Environment]::GetFolderPath($_) }
}
}

Not fancy, to say the least.

Sidenote: if you wonder whether I typed all of this ValidateSet argument by hand, the answer is no. Here is the trick that I used to get all enum members' strings enclosed in single quotes and comma-separated. Just copy and paste this snippet into the PowerShell console and get a formatted enum list in your clipboard:

PS C:\Users\beatcracker> "'$([Enum]::GetNames('System.Environment+SpecialFolder') -join "', '")'" | clip

As you see, the ValidateSet above is as bad as it gets: it's large, it's easy to make a typo, and it's hardcoded. Whenever a new special folder is added to Windows, or one doesn't exist in previous versions of the OS, this code will fail.

Let's try to remedy this by using Dynamic Parameters. The following example is based on the aforementioned TechNet article Dynamic ValidateSet in a Dynamic Parameter.

function Get-SpecialFolderPath
{
[CmdletBinding()]
Param()
DynamicParam
{
# Set the dynamic parameter's name
$ParameterName = 'Name'

# Create the dictionary
$RuntimeParameterDictionary = New-Object System.Management.Automation.RuntimeDefinedParameterDictionary

# Create the collection of attributes
$AttributeCollection = New-Object System.Collections.ObjectModel.Collection[System.Attribute]

# Create and set the parameter's attributes
$ParameterAttribute = New-Object System.Management.Automation.ParameterAttribute
$ParameterAttribute.ValueFromPipeline = $true
$ParameterAttribute.ValueFromPipelineByPropertyName = $true
$ParameterAttribute.Mandatory = $true
$ParameterAttribute.Position = 0

# Add the attributes to the attributes collection
$AttributeCollection.Add($ParameterAttribute)

# Generate and set the ValidateSet
$arrSet = [Enum]::GetNames('System.Environment+SpecialFolder')
$ValidateSetAttribute = New-Object System.Management.Automation.ValidateSetAttribute($arrSet)

# Add the ValidateSet to the attributes collection
$AttributeCollection.Add($ValidateSetAttribute)

# Create and return the dynamic parameter
$RuntimeParameter = New-Object System.Management.Automation.RuntimeDefinedParameter($ParameterName, [string], $AttributeCollection)
$RuntimeParameterDictionary.Add($ParameterName, $RuntimeParameter)
return $RuntimeParameterDictionary
}

Begin
{
# Bind the parameter to a friendly variable
$Name = $PsBoundParameters[$ParameterName]
}

Process
{
$Name | ForEach-Object { [Environment]::GetFolderPath($_) }
}
}


This version of the function is definitely better than the first one: no hardcoded values, and it won't break on different versions of Windows. But it still feels clumsy to me and my inner perfectionist. The whole DynamicParam block is dedicated to a single parameter, and modifying it to suit your needs may be a day's work.

What if you'd like to define multiple parameters dynamically, with different arguments (Mandatory, ValueFromPipeline, etc.), belonging to different parameter sets and so on? Moreover, you'd like to do it in a fast and efficient way.

The solution? Dynamically create dynamic parameters! Luckily, I was not the first person to think about this, and there are folks who have done tremendous work already: Justin Rich (blog, GitHub) and Warren F. (blog, GitHub):

So all I had to do was improve their work a little. I took the liberty of extending Warren's New-DynamicParam function to support the full range of attributes and made recreating variables from the bound parameters a bit easier:

It drastically reduces the number of hoops you'd have to jump through and makes your code clean and crisp. Let's see how we can use it to create dynamic parameters from enum values in our Get-SpecialFolderPath function:

function Get-SpecialFolderPath
{
[CmdletBinding()]
Param()
DynamicParam
{
# Get special folder names for the ValidateSet attribute
$SpecialFolders = [Enum]::GetNames('System.Environment+SpecialFolder')

# Create new dynamic parameter
New-DynamicParameter -Name Name -ValidateSet $SpecialFolders -Type ([array]) `
-Position 0 -Mandatory -ValueFromPipeline -ValueFromPipelineByPropertyName -ValidateNotNull
}

Process
{
# Bind dynamic parameter to a friendly variable
New-DynamicParameter -CreateVariables -BoundParameters $PSBoundParameters

$Name | ForEach-Object { [Environment]::GetFolderPath($_) }
}
}

As you see, it takes only three lines of code to create a new dynamic parameter. But the more the merrier, so how about several dynamic parameters? Here is an example taken directly from the help of the New-DynamicParameter function. It will create several dynamic parameters with multiple Parameter Sets.

In this example three dynamic parameters are created. Two of the parameters belong to different parameter sets, so they are mutually exclusive. One of the parameters belongs to both parameter sets.

• The Drive parameter's ValidateSet is populated with all available volumes on the computer.
• The DriveType parameter's ValidateSet is populated with all available drive types.
• The Precision parameter controls the number of digits after the decimal separator for the Free Space percentage.

Usage:

PS C:\Users\beatcracker> Get-FreeSpace -Drive <tab> -Precision 2
PS C:\Users\beatcracker> Get-FreeSpace -DriveType <tab> -Precision 2

Parameters are defined in an array of hashtables, which is then piped through New-Object to create PSObjects and pass them to the New-DynamicParameter function. If a parameter with the same name already exists in the RuntimeDefinedParameterDictionary, a new Parameter Set is added to it. Because of piping, the New-DynamicParameter function is able to create all parameters at once, thus eliminating the need for you to create and pass an external RuntimeDefinedParameterDictionary to it.

function Get-FreeSpace
{
[CmdletBinding()]
Param()
DynamicParam
{
# Array of hashtables that hold values for dynamic parameters
$DynamicParameters = @(
@{
Name = 'Drive'
Type = [array]
Position = 0
Mandatory = $true
ValidateSet = ([System.IO.DriveInfo]::GetDrives()).Name
ParameterSetName = 'Drive'
},
@{
Name = 'DriveType'
Type = [array]
Position = 0
Mandatory = $true
ValidateSet = [System.Enum]::GetNames('System.IO.DriveType')
ParameterSetName = 'DriveType'
},
@{
Name = 'Precision'
Type = [int]
# This will add a Drive parameter set to the parameter
Position = 1
ParameterSetName = 'Drive'
},
@{
Name = 'Precision'
# Because the parameter already exists in the RuntimeDefinedParameterDictionary,
# this will add a DriveType parameter set to the parameter.
Position = 1
ParameterSetName = 'DriveType'
}
)

# Convert hashtables to PSObjects and pipe them to the New-DynamicParameter,
# to create all dynamic parameters in one function call.
$DynamicParameters | ForEach-Object {New-Object PSObject -Property $_} | New-DynamicParameter
}
Process
{
# Dynamic parameters don't have corresponding variables created,
# you need to call New-DynamicParameter with CreateVariables switch to fix that.
New-DynamicParameter -CreateVariables -BoundParameters $PSBoundParameters

if($Drive)
{
$Filter = {$Drive -contains $_.Name}
}
elseif($DriveType)
{
$Filter = {$DriveType -contains $_.DriveType}
}

if(!$Precision)
{
$Precision = 2
}

$DriveInfo = [System.IO.DriveInfo]::GetDrives() | Where-Object $Filter

$DriveInfo |
ForEach-Object {
if(!$_.TotalFreeSpace)
{
$FreePct = 0
}
else
{
$FreePct = [System.Math]::Round(($_.TotalFreeSpace / $_.TotalSize * 100), $Precision)
}
New-Object -TypeName psobject -Property @{
Drive = $_.Name
DriveType = $_.DriveType
'Free(%)' = $FreePct
}
}
}
}

There are more examples in the help that should get you started with Dynamic Parameters in no time. If something doesn't work for you, feel free to drop me a note, I'll be happy to fix it.

# Parameter validation gotchas

I didn't fail the test, I just found 100 ways to do it wrong.
— Benjamin Franklin

PowerShell's parameter validation is a blessing. Validate parameters properly and you'll never have to write code that deals with erroneous user input. But sometimes dealing with Validation Attributes requires a bit more knowledge than the built-in help can provide. Here is what I've learned so far and want to share with you.

• #### You can have more than one Validation Attribute

This may seem trivial, but PowerShell's help and various online tutorials do not mention this fact (they just imply it). You can have as many Validation Attributes as you like for your parameter. For example, this function requires the parameter Number to be even and to fall in the range from 1 to 256:

function Test-MultipleValidationAttributes
{
[CmdLetBinding()]
Param
(
[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
[ValidateScript({
if($_ % 2)
{
throw 'Supply an even number!'
}
$true
})]
[ValidateRange(1,256)]
[int]$Number
)

Process
{
Write-Host "Congratulations, $Number is even and it's between 1 and 256!"
}
}

• #### You can have more than one instance of a specific Validation Attribute

While not especially useful, this still could come in handy in some practical cases. Imagine that you have some complex logic behind parameter validation with the ValidateScript attribute. It could be split between several ValidateScript attributes to decrease script complexity. Here is the function from above, modified to validate its parameter using two ValidateScript attributes. As a bonus, you can throw more meaningful error messages:

function Test-MultipleSimilarValidationAttributes
{
[CmdLetBinding()]
Param
(
[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
[ValidateScript({
if($_ % 2)
{
throw 'Supply an even number!'
}
$true
})]
[ValidateScript({
if($_ -lt 1 -or $_ -gt 256)
{
throw 'Supply a number between 1 and 256!'
}
$true
})]
[int]$Number
)

Process
{
Write-Host "Congratulations, $Number is even and it's between 1 and 256!"
}
}

• #### Once a parameter is validated, it stays validated

Once you’ve set the rules, they apply to everyone, even to you. Want proof? Check this out:

function Test-PersistentValidation
{
[CmdLetBinding()]
Param
(
[Parameter(Mandatory = $true, ValueFromPipeline =$true)]
[ValidateRange(1,256)]
[int]$Number
)

Process
{
$Number = 0
}
}


Let’s see, where our rebellious spirit will lead us:

PS C:\Users\beatcracker> Test-PersistentValidation -Number 1
The variable cannot be validated because the value 0 is not a valid value for the Number variable.
At C:\Scripts\Test-Validation.ps1:13 char:9
+         $Number = 0
+         ~~~~~~~~~~~
+ CategoryInfo          : MetadataError: (:) [], ValidationMetadataException
+ FullyQualifiedErrorId : ValidateSetFailure

Not actually damnation of the soul, but close enough. This happens because PowerShell assigns Validation Attributes to the variable, and they will stay there until the variable is destroyed. You can view them, and I'll show you how. Place a breakpoint on the line $Number = 0, call the function and wait for the debugger to pop up:

PS C:\Users\beatcracker> Test-PersistentValidation -Number 1
Hit Line breakpoint on 'C:\Scripts\Test-Validation.ps1:13'
[DBG]: PS C:\Scripts>> $tmp = Get-Variable Number

[DBG]: PS C:\Scripts>> $tmp.Attributes

TypeId
------
System.Management.Automation.ArgumentTypeConverterAttribute
System.Management.Automation.ValidateRangeAttribute
System.Management.Automation.ParameterAttribute

[DBG]: PS C:\Scripts>> $tmp.Attributes[1]

MinRange MaxRange TypeId
-------- -------- ------
       1      256 System.Management.Automation.ValidateRangeAttribute

And there is more to it! Starting with PowerShell 3.0, you can place an attribute on any variable:

PS C:\Users\beatcracker> [ValidateRange(1,256)][int]$Number = 1

PS C:\Users\beatcracker> $Number = 0
The variable cannot be validated because the value 0 is not a valid value for the Number variable.
At line:1 char:1
+ $Number = 0
+ ~~~~~~~~~~~
+ FullyQualifiedErrorId : ValidateSetFailure

• #### Validation Attributes order matters

This is one of those things you learn the hard way. As far as I know, it's not actually documented anywhere. Below is a function that accepts an optional parameter Path, which has to be an existing folder on disk. Since there is no sense in running the validation script if the supplied parameter is empty, a ValidateNotNullOrEmpty attribute is added.

function Test-ValidationAttributesOrder
{
[CmdLetBinding()]
Param
(
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateNotNullOrEmpty()]
[ValidateScript({
if(!(Test-Path -LiteralPath $_ -PathType Container))
{
throw "Folder not found: $_"
}
$true
})]
[string]$Path
)

Process
{
if($Path)
{
Write-Host "Optional parameter Path is a valid folder!"
}
}
}


Hence, when the function is called with an empty parameter, I expect it to throw a "The argument is null or empty" error message. But instead, validation fails where I do not expect it to:

PS C:\Users\beatcracker> Test-ValidationAttributesOrder -Path ''
Test-ValidationAttributesOrder : Cannot validate argument on parameter 'Path'. Cannot bind argument to parameter 'LiteralPath' because it is an empty string.
At line:1 char:38
+ Test-ValidationAttributesOrder -Path ''
+                                      ~~
+ CategoryInfo          : InvalidData: (:) [Test-ValidationAttributesOrder], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Test-ValidationAttributesOrder

As you see, the empty string passed to the Path parameter completely missed the ValidateNotNullOrEmpty attribute and landed directly in the validation script, where PowerShell rightly failed to bind it to the LiteralPath parameter of the Test-Path cmdlet. Makes you wonder, doesn't it? After some trial and error I was finally able to make sense of it:

#### PowerShell evaluates Validation Attributes in a Bottom to Top order

So, armed with this knowledge, let’s fix the function above and make it validate parameters in the correct order:

function Test-ValidationAttributesOrder
{
[CmdLetBinding()]
Param
(
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateScript({
if(!(Test-Path -LiteralPath $_ -PathType Container))
{
throw "Folder not found: $_"
}
$true
})]
[ValidateNotNullOrEmpty()]
[string]$Path
)

Process
{
if($Path)
{
Write-Host "Optional parameter Path is a valid folder!"
}
}
}


And test it, passing the empty string to the Path parameter:

PS C:\Users\beatcracker> Test-ValidationAttributesOrder -Path ''
Test-ValidationAttributesOrder : Cannot validate argument on parameter 'Path'. The argument is null or empty. Provide an argument that is not null or empty, and then try the command again.
At line:1 char:38
+ Test-ValidationAttributesOrder -Path ''
+                                      ~~
+ CategoryInfo          : InvalidData: (:) [Test-ValidationAttributesOrder], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Test-ValidationAttributesOrder

Voilà! Now it validates that the parameter is not null or empty before passing its value to the validation script.

# Migrating a SVN repo to Git: a tale of hacking my way through

If you’re just looking for an easy way to do SVN-Git migration, skip this post and go directly to the part two instead.

We become what we behold. We shape our tools, and thereafter our tools shape us.
― Marshall McLuhan

Lately I orchestrated an SVN to Visual Studio Online migration for one of our projects. Our developers opted to use Git as the version control solution instead of Team Foundation Version Control (TFVC). Also, we have a pure Windows environment running VisualSVN Server, so I'll provide Windows-specific tips along the way.

Git and SVN are quite different beasts, especially when it comes to access control and branching strategies. Because of that, simply using Git's bidirectional bridge to Subversion, called git svn, will produce suboptimal results. You will end up with all branches and tags as remote svn branches, whereas what you really want is git-native local branches and git tag objects.
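To make the problem concrete, here is a sketch of the cleanup a plain git svn clone leaves you with. The repository URL and the branch/tag names are made up, the exact remote ref layout depends on your git-svn version, and since there is no real SVN server to clone from here, the snippet fabricates one remote-tracking ref locally to stand in for git-svn's output:

```powershell
# After `git svn clone http://server/svn/repo --stdlayout repo` (URL is a
# placeholder), branches and tags exist only as remote-tracking refs.
# To simulate that state locally for this sketch, fabricate such refs:
git init -q repo
Set-Location repo
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m 'init'
git update-ref refs/remotes/origin/my-feature HEAD
git update-ref refs/remotes/origin/tags/v1.0 HEAD

# The cleanup git-svn leaves for you: promote each remote svn branch to a
# local branch, and each svn "tag branch" to a real git tag
git branch my-feature refs/remotes/origin/my-feature
git tag v1.0 refs/remotes/origin/tags/v1.0
```

Doing this by hand for every branch and tag of a large repository is exactly the busywork the tools below automate.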

To alleviate this issue, a number of solutions are available:

reposurgeon
A tool for editing version-control repository history. reposurgeon enables risky operations that version-control systems don't want to let you do, such as editing past comments and metadata and removing commits. It works with any version control system that can export and import git fast-import streams, including git, hg, fossil, bzr, CVS, and RCS. It can also read Subversion dump files directly and can thus be used to script production of very high-quality conversions from Subversion to any supported DVCS.
agito
Agito is (yet another) Subversion to Git conversion script. It is designed to do a better job of translating history than git-svn, which has some subtleties in the way it works that cause it to construct branch histories that are suboptimal in certain corner case scenarios.
svn2git
svn2git is a tiny utility for migrating projects from Subversion to Git while keeping the trunk, branches and tags where they should be. It uses git-svn to clone an svn repository and does some clean-up to make sure branches and tags are imported in a meaningful way, and that the code checked into master ends up being what’s currently in your svn trunk rather than whichever svn branch your last commit was in.

We are all wonderful, beautiful wrecks. That’s what connects us ― that we’re all broken, all beautifully imperfect.
― Emilio Estevez

Initially I planned to use reposurgeon, because it clearly wins over the other solutions:

There are many tools for converting repositories between version-control systems out there. This file explains why reposurgeon is the best of breed by comparing it to the competition.

The problems other repository-translation tools have come from ontological mismatches between their source and target systems – models of changesets, branching and tagging can differ in complicated ways. While these gaps can often be bridged by careful analysis, the techniques for doing so are algorithmically complex, difficult to test, and have ugly edge cases.

Furthermore, doing a really high-quality translation often requires human judgment about how to move artifacts – and what to discard. But most lifting tools are, unlike reposurgeon, designed as run-it-once batch processors that can only implement simple and mechanical rules.

Consequently, most repository-translation tools evade the harder problems. They produce a sort of pidgin rendering that crudely and partially copies the history from the source system to the target without fully translating it into native idioms, leaving behind metadata that would take more effort to move over or leaving it in the native format for the source system.

But pidgin repository translations are a kind of friction drag on future development, and are just plain unpleasant to use. So instead of evading the hard problems, reposurgeon tackles them head-on.

Reposurgeon is written in Python and the author recommends running it under PyPy, as that provides a substantial speed increase (for Windows, get the latest Python 2.7 compatible PyPy binary). Unfortunately, I wasn't able to do much with it, because reposurgeon failed to read the Subversion dump of my repo:

reposurgeon% read repo.svn
reposurgeon: from repo.svn......(0.03 sec) aborted by error.
reposurgeon: EOL not seen where expected, Content-Length incorrect at line 187

This was a bit unexpected, so I decided to put reposurgeon aside for the time being and try something else. Choosing between agito and svn2git, I chose the latter, mostly because it seemed to be actively maintained, whereas agito's last update was about a year ago. Also, svn2git usage is more straightforward (no config file needed).

To set up svn2git on Windows, follow these steps:

• Install your favorite Git flavour (Git for Windows or plain Git)
• Get Ruby v1.9.x via RubyInstaller
• Start command prompt with Ruby
• cd c:\path\to\svn2git
• gem install jeweler
• gem install svn2git

My repo has a standard layout with branches and trunk (no tags), but it's nested. According to the documentation, converting it with svn2git should've been as easy as this:

svn2git http://server/svn/my/nested/repo --notags --authors authors.txt --no-minimize-url --verbose

But after some processing, svn2git just gave up:

error: pathspec 'master' did not match any file(s) known to git.

Browsing issues on GitHub led me to this: error: pathspec 'master' did not match any file(s) known to git. Common solutions are to delete the .git folder and start the conversion anew, and to explicitly specify --trunk, --branches and --tags (or --notags in my case). Needless to say, none of that worked for me. After some meddling with svn2git options, I concluded that problems with nested repos are common and I'd better do something about it. Digging further led me to the svndumpfilter command and a way to move repo contents to the root folder:

If you want your trunk, tags, and branches directories to live in the root of your repository, you might wish to edit your dump files, tweaking the Node-path and Node-copyfrom-path headers so that they no longer have that first calc/ path component. Also, you’ll want to remove the section of dump data that creates the calc directory. It will look something like the following:

Node-path: calc
Node-kind: dir
Content-length: 0

So, the first step would be to filter my nested repo from the dump:

svndumpfilter include "/nested/project" --drop-empty-revs < repo.svn > repo_filtered.svn

If svndumpfilter fails to process your dump (and that happens a lot), you might try the svndumpfilterIN Python script. Beware that on Windows this script produces broken dumps due to CR+LF issues. To fix this, you have to tell Python to open the files in binary mode. Replacing these two lines in the script:

with open(input_dump) as input_file:
with open(output_dump, 'a+') as output_file:

with

with open(input_dump, 'rb') as input_file:
with open(output_dump, 'ab+') as output_file:

will take care of this.

Update (02.01.2015): the issue above is fixed in the latest version of svndumpfilterIN (see this pull request). But I've faced another: when trying to filter heavily tangled repos, svndumpfilterIN will crash while pulling a large amount of tangled files from the source repo. I was able to conjure a temporary workaround, see my issue on GitHub: Crash when untangling large amount of files. Or just use my fork of svndumpfilterIN, which has this and some other issues fixed and features added.

Example:

svndumpfilter.py repo.svn --repo=x:\svnpath\repo --output-dump=repo_filtered.svn include "nested/project" --stop-renumber-revs

Next, I had to search and replace all occurrences of /nested/project with /. There are a lot of sed one-liners available, but I opted for the SVN::DumpReloc Perl script. I used Strawberry Perl to run it on Windows.

svn-dump-reloc "nested/project" "/" < repo_filtered.svn > repo_filtered_relocated.svn

But I can't just directly import this dump into SVN, because due to the relocation, the first commit will try to create a root directory (an empty Node-path: entry), which is not allowed.

Revision-number: 123456
Prop-content-length: 111
Content-length: 111

K 7
svn:log
V 13
Start project
K 10
svn:author
V 3
John Doe
K 8
svn:date
V 27
2000-01-01T00:00:00.000000Z
PROPS-END

Node-path:
Node-kind: dir
Prop-content-length: 10
Content-length: 10

PROPS-END

Node-path: /subfolder
Node-kind: dir
Prop-content-length: 10
Content-length: 10

PROPS-END

The section creating the root directory (the record with the empty Node-path above) should be removed. Make sure to use an editor that can handle big files and won't change anything else (like line endings). If a revision contains only one entry, the whole revision should be removed. This can be done either by editing the dump manually, or by using svndumpfilter's --revision parameter to skip this commit altogether. In my case, I had to remove only one section in one revision.

Revision-number: 123456
Prop-content-length: 111
Content-length: 111

K 7
svn:log
V 13
Start project
K 10
svn:author
V 3
John Doe
K 8
svn:date
V 27
2000-01-01T00:00:00.000000Z
PROPS-END

Node-path: /subfolder
Node-kind: dir
Prop-content-length: 10
Content-length: 10

PROPS-END
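The removal can also be scripted instead of done by hand. Below is a hedged PowerShell sketch: it assumes the record matches the excerpt above byte-for-byte (including LF line endings), and it operates on a miniature in-memory stand-in for the dump; a real run would use [System.IO.File]::ReadAllText/WriteAllText on the actual dump file, which is only safe for dumps without binary file content:

```powershell
# A miniature stand-in for the relocated dump; a real run would read
# repo_filtered_relocated.svn instead
$Dump = "Revision-number: 123456`n`nNode-path: `nNode-kind: dir`nProp-content-length: 10`nContent-length: 10`n`nPROPS-END`n`nNode-path: /subfolder`n"

# The record that creates the root directory, assumed to match exactly
$Record = "Node-path: `nNode-kind: dir`nProp-content-length: 10`nContent-length: 10`n`nPROPS-END`n`n"

# Drop the record; everything else is left untouched
$Fixed = $Dump.Replace($Record, '')
```

For big or binary-laden dumps, a byte-safe tool (like the svndumpfilter variants above) is the better choice.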

Then, I need to create a new SVN repo and load filtered and relocated dump:

svnadmin create x:\svnpath\newrepo
svnadmin load x:\svnpath\newrepo < repo_filtered_relocated.svn

Finally, let’s see if I’m able to run svn2git against new repo with success:

svn2git http://server/svn/newrepo --notags --authors authors.txt --verbose

And this time it works right and proper, so I can push my shiny new Git repo to Visual Studio Online (don’t forget to set up alternate credentials):

git remote add origin https://project.visualstudio.com/DefaultCollection/_git/Project
git push -u origin --all


You can get much farther with a kind word and a PowerShell than you can with a kind word alone.

But that’s not all, folks! This story wouldn’t be complete without a PowerShell lifesaver, and I wouldn’t dream of disappointing you. Some of you may have noticed that svn2git requires an authors file to map SVN committers to Git authors. There are plenty of *nix solutions out there, but I needed a PowerShell one. Since we use VisualSVN Server, the SVN committers’ names are actually Windows domain accounts, so it would also be great to completely automate authors file creation using the authors’ full names and emails from Active Directory.

First, I need to get the list of SVN committers for my repo. To do this, I’ve wrapped the svn.exe log command in the PowerShell function Get-SvnAuthor. It returns the list of unique commit authors in one or more SVN repositories. I’m listing it here for your convenience, but if you intend to use it, grab the latest version from my GitHub repo instead.

<#
.Synopsis
    Get list of unique commit authors in SVN repository.
.Description
    Get list of unique commit authors in one or more SVN repositories. Requires Subversion binaries.
.Parameter Url
    This parameter is required. An array of strings representing URLs to the SVN repositories.
.Parameter User
    This parameter is optional. A string specifying username for SVN repository.
.Parameter Password
    This parameter is optional. A string specifying password for SVN repository.
.Parameter SvnPath
    This parameter is optional. A string specifying path to the svn.exe. Use it if the Subversion binaries are not in your path variable, or you wish to use a specific version.
.Example
    Get-SvnAuthor -Url 'http://svnserver/svn/project'

    Description
    -----------
    Get list of unique commit authors for SVN repository http://svnserver/svn/project
.Example
    Get-SvnAuthor -Url 'http://svnserver/svn/project' -User john -Password doe

    Description
    -----------
    Get list of unique commit authors for SVN repository http://svnserver/svn/project using username and password.
.Example
    Get-SvnAuthor -Url 'http://svnserver/svn/project' -SvnPath 'C:\Program Files (x86)\VisualSVN Server\bin\svn.exe'

    Description
    -----------
    Get list of unique commit authors for SVN repository http://svnserver/svn/project using custom svn.exe binary.
.Example
    Get-SvnAuthor -Url 'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2'

    Description
    -----------
    Get list of unique commit authors for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.
.Example
    'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2' | Get-SvnAuthor

    Description
    -----------
    Get list of unique commit authors for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2.
#>
function Get-SvnAuthor
{
    [CmdletBinding()]
    Param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
        [ValidateNotNullOrEmpty()]
        [string[]]$Url,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [ValidateNotNullOrEmpty()]
        [string]$User,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [ValidateNotNullOrEmpty()]
        [string]$Password,

        [ValidateScript({
            if(Test-Path -LiteralPath $_ -PathType Leaf)
            {
                $true
            }
            else
            {
                throw "$_ not found!"
            }
        })]
        [ValidateNotNullOrEmpty()]
        [string]$SvnPath = 'svn.exe'
    )

    Begin
    {
        if(!(Get-Command -Name $SvnPath -CommandType Application -ErrorAction SilentlyContinue))
        {
            throw "$SvnPath not found!"
        }
        $ret = @()
    }

    Process
    {
        $Url | ForEach-Object {
            $SvnCmd = @('log', $_, '--xml', '--quiet', '--non-interactive') + $(if($User){@('--username', $User)}) + $(if($Password){@('--password', $Password)})
            $SvnLog = &$SvnPath $SvnCmd *>&1

            if($LastExitCode)
            {
                Write-Error ($SvnLog | Out-String)
            }
            else
            {
                $ret += [xml]$SvnLog | ForEach-Object {$_.log.logentry.author}
            }
        }
    }

    End
    {
        $ret | Sort-Object -Unique
    }
}

Second, I need to actually grab the authors’ info from Active Directory and save the resulting file. This is the job for another script of mine ― New-GitSvnAuthorsFile. It uses the Get-SvnAuthor function, so place it in the same folder.

<#
.Synopsis
    Generate authors file for SVN to Git migration. Can map SVN authors to domain accounts and get full names and emails from Active Directory.
.Description
    Generate authors file for one or more SVN repositories. Can map SVN authors to domain accounts and get full names and emails from Active Directory. Requires Subversion binaries and the Get-SvnAuthor function: https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1
.Notes
    Author: beatcracker (https://beatcracker.wordpress.com, https://github.com/beatcracker)
    License: Microsoft Public License (http://opensource.org/licenses/MS-PL)
.Component
    Requires Subversion binaries and Get-SvnAuthor function: https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1
.Parameter Url
    This parameter is required. An array of strings representing URLs to the SVN repositories.
.Parameter Path
    This parameter is optional. A string representing the path where to create the authors file. If not specified, a new authors file will be created in the script directory.
.Parameter ShowOnly
    This parameter is optional. If this switch is specified, no file will be created and the script will output a collection of author names and emails.
.Parameter QueryActiveDirectory
    This parameter is optional. A switch indicating whether or not to query Active Directory for author full name and email. Supports the following formats for SVN author name: john, domain\john, john@domain
.Parameter User
    This parameter is optional. A string specifying username for SVN repository.
.Parameter Password
    This parameter is optional. A string specifying password for SVN repository.
.Parameter SvnPath
    This parameter is optional. A string specifying path to the svn.exe. Use it if the Subversion binaries are not in your path variable, or you wish to use a specific version.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project'

    Description
    -----------
    Create authors file for SVN repository http://svnserver/svn/project. New authors file will be created in the script directory.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -QueryActiveDirectory

    Description
    -----------
    Create authors file for SVN repository http://svnserver/svn/project. Map SVN authors to domain accounts and get full names and emails from Active Directory. New authors file will be created in the script directory.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -ShowOnly

    Description
    -----------
    Create authors list for SVN repository http://svnserver/svn/project. No authors file will be created, instead the script will return a collection of objects.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -Path c:\authors.txt

    Description
    -----------
    Create authors file for SVN repository http://svnserver/svn/project. New authors file will be created as c:\authors.txt
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -User john -Password doe

    Description
    -----------
    Create authors file for SVN repository http://svnserver/svn/project using username and password. New authors file will be created in the script directory.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project' -SvnPath 'C:\Program Files (x86)\VisualSVN Server\bin\svn.exe'

    Description
    -----------
    Create authors file for SVN repository http://svnserver/svn/project using custom svn.exe binary. New authors file will be created in the script directory.
.Example
    New-GitSvnAuthorsFile -Url 'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2'

    Description
    -----------
    Create authors file for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2. New authors file will be created in the script directory.
.Example
    'http://svnserver/svn/project_1', 'http://svnserver/svn/project_2' | New-GitSvnAuthorsFile

    Description
    -----------
    Create authors file for two SVN repositories: http://svnserver/svn/project_1 and http://svnserver/svn/project_2. New authors file will be created in the script directory.
#>

[CmdletBinding()]
Param
(
    [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [string[]]$Url,

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [ValidateScript({
        $ParentFolder = Split-Path -LiteralPath $_
        if(!(Test-Path -LiteralPath $ParentFolder -PathType Container))
        {
            throw "Folder doesn't exist: $ParentFolder"
        }
        else
        {
            $true
        }
    })]
    [ValidateNotNullOrEmpty()]
    [string]$Path = (Join-Path -Path (Split-Path -Path $script:MyInvocation.MyCommand.Path) -ChildPath 'authors'),

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [switch]$ShowOnly,

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [switch]$QueryActiveDirectory,

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [string]$User,

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [string]$Password,

    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Save')]
    [Parameter(ValueFromPipelineByPropertyName = $true, ParameterSetName = 'Show')]
    [string]$SvnPath
)

# Dotsource 'Get-SvnAuthor' function:
# https://github.com/beatcracker/Powershell-Misc/blob/master/Get-SvnAuthor.ps1
$ScriptDir = Split-Path $script:MyInvocation.MyCommand.Path
. (Join-Path -Path $ScriptDir -ChildPath 'Get-SvnAuthor.ps1')

# Strip extra parameters or splatting will fail
$Param = @{} + $PSBoundParameters
'ShowOnly', 'QueryActiveDirectory', 'Path' | ForEach-Object {$Param.Remove($_)}

# Get authors in SVN repo
$Names = Get-SvnAuthor @Param
[System.Collections.SortedList]$ret = @{}

# Exit, if no authors found
if(!$Names)
{
    Exit
}

# Find full name and email for every author
foreach($name in $Names)
{
    $Email = ''

    if($QueryActiveDirectory)
    {
        # Get account name from commit author name in any of the following formats:
        # john, domain\john, john@domain
        $Local:tmp = $name -split '(@|\\)'
        switch ($Local:tmp.Count)
        {
            1 {$SamAccountName = $Local:tmp[0] ; break}
            3 {
                if($Local:tmp[1] -eq '\')
                {
                    [array]::Reverse($Local:tmp)
                }
                $SamAccountName = $Local:tmp[0]
                break
            }
            default {$SamAccountName = $null}
        }

        # Lookup account details
        if($SamAccountName)
        {
            $UserProps = ([adsisearcher]"(samaccountname=$SamAccountName)").FindOne().Properties

            if($UserProps)
            {
                Try
                {
                    $Email = '{0} <{1}>' -f $UserProps.displayname[0], $UserProps.mail[0]
                }
                Catch {}
            }
        }
    }

    $ret += @{$name = $Email}
}

if($ShowOnly)
{
    $ret
}
else
{
    # Use System.IO.StreamWriter to write a file with Unix newlines.
    # It's also significantly faster than the Add\Set-Content cmdlets.
    Try
    {
        # StreamWriter Constructor (String, Boolean, Encoding): http://msdn.microsoft.com/en-us/library/f5f5x7kt.aspx
        $StreamWriter = New-Object -TypeName System.IO.StreamWriter -ArgumentList $Path, $false, ([System.Text.Encoding]::ASCII)
    }
    Catch
    {
        throw "Can't create file: $Path"
    }

    $StreamWriter.NewLine = "`n"

    foreach($item in $ret.GetEnumerator())
    {
        $Local:tmp = '{0} = {1}' -f $item.Key, $item.Value
        $StreamWriter.WriteLine($Local:tmp)
    }

    $StreamWriter.Flush()
    $StreamWriter.Close()
}


And that’s all I need to create a fully functional authors file for my SVN repository:

.\New-GitSvnAuthorsFile.ps1 -Url 'http://server/svn/newrepo' -Path 'c:\svn2git\authors.txt' -QueryActiveDirectory

Here is the sample authors file, created by the command above:

john@domain = John Doe <john.doe@mycompany.com>
domain\jane = Jane Doe <jane.doe@mycompany.com>
doe = Doe <doe@mycompany.com>
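One gotcha worth guarding against: git-svn (which svn2git drives under the hood) aborts mid-migration when it meets a committer the authors file can’t resolve, or a line it can’t parse. Here is a small hedged sanity check of my own (Python, not part of the scripts above) that every non-empty line fits the `name = Display Name <email>` shape:

```python
import re

# One authors-file entry per line: svn_author = Display Name <email>
AUTHOR_LINE = re.compile(r'^\S+ = [^<>]* <[^<>]*>$')

def bad_author_lines(lines):
    """Return (line_number, line) pairs that don't fit the authors-file format."""
    return [(n, l.rstrip('\n')) for n, l in enumerate(lines, 1)
            if l.strip() and not AUTHOR_LINE.match(l.rstrip('\n'))]
```

Run it over the generated file (`bad_author_lines(open('authors.txt'))`); an empty result means every entry is well-formed. Entries with a blank email (produced when -QueryActiveDirectory is omitted or the lookup fails) are flagged too, which is exactly what you want to catch before a long migration run.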


And that’s all for today. Enjoy your winter holidays and stay tuned for more!