Get/set XML Schema and Content Types for SharePoint list directly in the content database

There is a charm about the forbidden that makes it unspeakably desirable.
— Mark Twain

Why would you do it?

Sometimes, despite all the warnings, you need to modify XML Schema and/or Content Types for SharePoint list directly in the content database. This could be caused by moving stuff, failed upgrade or removed SharePoint feature that resulted in broken lists.

In SharePoint 2007 and earlier that was fairly easy: you could just fire up SQL Management Studio, dig into the content database and fix it there: list’s Fields, that are part of list’s XML Schema are stored in tp_Fields column and Content Types are stored in the tp_ContentTypes column of the AllLists table in the content database.

So, what’s changed?

Starting with SharePoint 2010 most of the columns which contained plain XML (content type definitions, views, etc…) or BLOBs are now compressed in the database.

With luck and some googling around I’ve found that compressed objects format is documented in [MS-WSSFO3]: Windows SharePoint Services (WSS): File Operations Database Communications Version 3 Protocol. Those objects are called WSS Compressed Structures and consist of simple header followed by zlib comressed string.

Zlib streams can be extracted using System.IO.Compression.DeflateStream class. To do so, you have to skip first 14 bytes: 12 bytes for WSS Compressed Structure header and 2 bytes for zlib stream header, since DeflateStream doesn’t understand these. Here is what decompressed data (beautified, it actually stored as one long string) looks like:

Fileds:

15.0.0.4701.0.0
<FieldRef
    Name="ContentTypeId" />
<FieldRef
    Name="Title"
    ColName="nvarchar1" />
<FieldRef
    Name="_ModerationComments"
    ColName="ntext1" />
<FieldRef
    Name="File_x0020_Type"
    ColName="nvarchar2" />
<Field
    ID="{246d0907-637c-46b7-9aa0-0bb914daa832}"
    Name="_Author"
    Group="$Resources:core,Document_Columns;"
    Type="Text"
    DisplayName="$Resources:core,Author;"
    SourceID="http://schemas.microsoft.com/sharepoint/v3/fields"
    StaticName="_Author"
    Description="$Resources:core,_AuthorDesc;"
    Sealed="TRUE"
    AllowDeletion="TRUE"
    ShowInFileDlg="FALSE"
    ColName="nvarchar3"
    RowOrdinal="0" />
<Field
    ID="{875fab27-6e95-463b-a4a6-82544f1027fb}"
    Name="RelatedIssues"
    Group="$Resources:core,Extended_Columns;"
    Type="LookupMulti"
    Mult="TRUE"
    DisplayName="$Resources:core,Related_Issues;"
    SourceID="http://schemas.microsoft.com/sharepoint/v3"
    StaticName="RelatedIssues"
    PrependId="TRUE"
    List="Self"
    ShowField="Title"
    ColName="int1"
    RowOrdinal="0" />

Content Types:

<ContentType
    ID="0x01005144F19DD8291D42BAAA922235A381BD"
    Name="$Resources:core,Item;"
    Group="$Resources:core,List_Content_Types;"
    Description="$Resources:core,ItemCTDesc;"
    Version="4"
    FeatureId="{695b6570-a48b-4a8e-8ea5-26ea7fc1d162}">
    <FieldRefs>
        <FieldRef
            ID="{c042a256-787d-4a6f-8a8a-cf6ab767f12d}"
            Name="ContentType" />
        <FieldRef
            ID="{fa564e0f-0c70-4ab9-b863-0177e6ddd247}"
            Name="Title"
            Required="TRUE"
            ShowInNewForm="TRUE"
            ShowInEditForm="TRUE" />
        <FieldRef
            ID="{246d0907-637c-46b7-9aa0-0bb914daa832}"
            Name="_Author" />
    </FieldRefs>
    <XmlDocuments>
        <XmlDocument
            NamespaceURI="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms">
            <FormTemplates xmlns="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms">
                <Display>ListForm</Display>
                <Edit>ListForm</Edit>
                <New>ListForm</New>
            </FormTemplates>
        </XmlDocument>
    </XmlDocuments>
    <Folder
        TargetName="Item" />
</ContentType>
<ContentTypeRef
    ID="0x01200066684BCED23D0D4CAEE3EB61649D788E" />
<ContentType
    ID="0x01" />
<ContentType
    ID="0x0120" />

But if you want to modify resulting data and compress it back, DeflateStream isn’t the best option, since you’d have to manually add zlib header and ADLER32 checksum.

Fortunately, DotNetZip Library provides easy static methods to compress/expand zlib streams: CompressBuffer/UncompressBuffer. I’ve tested them and SharePoint accepts zlib data generated by CompressBuffer if it’s paired with correct WSS Compressed Structure header.

I wish there was an easier way to mess up my database!

Me too, so I’ve made a PowerShell module to get/set XML Schema and Content Types for SharePoint list directly in the content database. It’s using Warren F‘s Invoke-SqlCmd2 function, so you can grab/modify data from SharePoint content database without messing with SQL queries:

Any tips on using it?

Sure. This example shows how to modify XML Schema for list with ID cff8ae4b-a78d-444c-8efd-5fe290821cb9, which is stored in SharePoint content database SP_CONTENT on server SQLSRV.

Finding list GUID

Using module

  • Download module as Zip (unblock zip file before unpacking) or clone this repo using Git
  • Import module:
    Import-Module -Path 'X:\Path\To\WssCompressedStructure\Module'
    
  • Backup XML Schema blob for SharePoint list to file:
    Get-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9' | Export-WssCompressedStructureBinary -DestinationPath 'X:\Wss\'
    
  • Export XML Schema for SharePoint list to file:
    Get-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9' | Expand-WssCompressedStructure -DestinationPath 'X:\Wss\'
    
  • Modify file cff8ae4b-a78d-444c-8efd-5fe290821cb9.xml to your needs

  • Update XML Schema in database for this list:

    New-WssCompressedStructure -Path 'X:\Wss\cff8ae4b-a78d-444c-8efd-5fe290821cb9.xml' | Set-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9'
    
  • If something goes wrong, restore XML Schema from blob:
    'X:\Wss\cff8ae4b-a78d-444c-8efd-5fe290821cb9.bin' | Import-WssCompressedStructureBinary | Set-SpListWssCompressedStructure -ServerInstance SQLSRV -Database SP_CONTENT -Fields -ListId 'cff8ae4b-a78d-444c-8efd-5fe290821cb9'
    

Small note

If you’ve upgarded you SharePoint installation (2007 → 2010), some of the lists in database can still contain uncompressed XML data in tp_Fileds and tp_ContentTypes columns. My module checks returned data from SQL to be valid WSS Compressed Structures, and will ignore such lists. Keep that in mind, if Get-SpListWssCompressedStructure returns nothing.

Last warning

If you want a guarantee, buy a toaster.
— Clint Eastwood

I hope I’ve stressed it enough, that by directly modifying SharePoint database, you’re voiding any chance of getting official support from Microsoft. So make sure you have backups, backups of backups and a plan of rebuilding your SharePoint farm. Happy hacking!

Writing stealth code in PowerShell

What happens in module, stays in module.

Most of my scripts are using Import-Component function to bulk-import dependencies (PS1 files with functions, modules, source code, .Net assemblies).

To import PS1 files with functions, they have to be dot-sourced and that provided me with some challenge: if PS1 is dot-sourced inside the function, it will be available only in that function’s scope. To overcome this, I could scope each contained function, alias, and variable as global (nasty!) or call Import-Component function itself using dot-sourcing (yes, you can dot-source more than just files).

For a while, dot-sourcing Import-Component seemed to work fine, until one day, I realized, that this effectively pollutes caller’s scope with all Import-Component‘s internal variables. Consider this example:

function DotSource-Me
{
    $MyString = 'Internal variable'
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

If we run this script, the output will be:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: Internal variable

As you can see, when the DotSource-Me function is called as usual, it’s internal variable is restricted to the function’s scope and doesn’t affect the caller’s scope. But when it’s dot-sourced, variable in the caller’s scope is overwritten.

To remedy this, we could take advantage of the fact, that creating a new module creates an entirely new SessionState. It means that everything that happens inside the module is completely isolated. So if we place all code inside the function in the dynamically generated module, it wouldn’t affect anything outside, even if dot-sourced. Also we don’t want to actually pollute caller’s scope with newly created module object. Luckily for us, the New-Module cmdlet has ReturnResult parameter, that runs the script block and returns the results instead of returning a module object. So lets modify our example:

function DotSource-Me
{
    New-Module -ReturnResult -ScriptBlock {
        $MyString = 'Internal variable'
    }
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

And then run it and observe the results:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: External variable

That’s so much better!

But what if our function that has to be dot-sourced has parameters? Unfortunately, PowerShell will create variable for each parameter, and because function is dot-sourced, those variables will be created in the callers scope:

function DotSource-Me
{
    Param
    (
        $MyString
    )
}

$MyString = 'External variable'

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

And they will pollute and\or overwrite variables in callers scope:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains:

To mitigate this issue, we can exploit the fact that PowerShell doesn’t create corresponding variables for DynamicParameters. Note, that code in the DynamicParam block has to be wrapped in the New-Module too, otherwise it will be executed in caller’s scope

function DotSource-Me
{
    [CmdletBinding()]
    Param()
    DynamicParam
    {
        New-Module -ReturnResult -ScriptBlock {
            # Set the dynamic parameters name
            $ParameterName = 'MyString'

            # Create the dictionary
            $RuntimeParameterDictionary = New-Object System.Management.Automation.RuntimeDefinedParameterDictionary

            # Create the collection of attributes
            $AttributeCollection = New-Object System.Collections.ObjectModel.Collection[System.Attribute]

            # Create and set the parameters' attributes
            $ParameterAttribute = New-Object System.Management.Automation.ParameterAttribute

            # Add the attributes to the attributes collection
            $AttributeCollection.Add($ParameterAttribute)

            # Create and return the dynamic parameter
            $RuntimeParameter = New-Object System.Management.Automation.RuntimeDefinedParameter(
                        $ParameterName,
                        [string],
                        $AttributeCollection)
            $RuntimeParameterDictionary.Add($ParameterName, $RuntimeParameter)
            $RuntimeParameterDictionary
        }
    }
}

$MyString = "External variable"

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

And the result is:

Function was called, 'MyString' contains: External variable
Function was dot-sourced, 'MyString' contains: External variable

To make things easier, you can put my New-DynamicParameter function inside the New-Module‘ scriptblock and use it like this:

function DotSource-Me
{
    [CmdletBinding()]
    Param()
    DynamicParam
    {
        New-Module -ReturnResult -ScriptBlock {
            Function New-DynamicParameter
            {
                # function body here...
            }

            New-DynamicParameter -Name MyString -Type ([string])
        }
    }
}

$MyString = "External variable"

# Calling function as usual
DotSource-Me
Write-Host "Function was called, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me
Write-Host "Function was dot-sourced, 'MyString' contains: $MyString"

Bonus chapter

What if we really need to execute something in caller’s scope from the New-Module‘s scriptblock? In Import-Component function, dot-sourcing command itself has to be executed in the caller’s scope, while all other code should be well-hidden in New-Module. To achieve a desired result I’m using a not-so-well-know fact, that scriptblocks are bound to the session state:

Any script block that’s defined in a script or script module (in literal form, not dynamically created with something like [scriptblock]::Create()) is bound to the session state of that module (or to the “main” session state, if not executing inside a script module.) There is also information specific to the file that the script block came from, so things like breakpoints will work when the script block is invoked.

When you pass in such a script block as a parameter across script module boundaries, it is still bound to its original scope, even if you invoke it from inside the module.

Here is the final example:

function DotSource-Me
{
    [CmdletBinding()]
    Param()
    DynamicParam
    {
        New-Module -ReturnResult -ScriptBlock {
            # Set the dynamic parameters name
            $ParameterName = 'ScriptBlock'

            # Create the dictionary
            $RuntimeParameterDictionary = New-Object System.Management.Automation.RuntimeDefinedParameterDictionary

            # Create the collection of attributes
            $AttributeCollection = New-Object System.Collections.ObjectModel.Collection[System.Attribute]

            # Create and set the parameters' attributes
            $ParameterAttribute = New-Object System.Management.Automation.ParameterAttribute

            # Add the attributes to the attributes collection
            $AttributeCollection.Add($ParameterAttribute)

            # Create and return the dynamic parameter
            $RuntimeParameter = New-Object System.Management.Automation.RuntimeDefinedParameter(
                        $ParameterName,
                        [scriptblock],
                        $AttributeCollection)
            $RuntimeParameterDictionary.Add($ParameterName, $RuntimeParameter)
            $RuntimeParameterDictionary
        }
    }

    Process
    {
        New-Module -ReturnResult -ScriptBlock {
            # Assign internal variable
            $MyString = "Internal variable"

            # Execute scriptblock
            & $PSBoundParameters.ScriptBlock
        }
    }
}

$MyString = "External variable"
$MyScriptBlock = {Write-Host "Scriptblock, 'MyString' contains: $MyString"}

Write-Host "Script, 'MyString' contains: $MyString"

# Dot-sourcing function
. DotSource-Me -ScriptBlock $MyScriptblock

Note that although $MyString variable is defined inside the New-Module‘s scriptblock, the code in the MyScriptBlock‘s parameter’s scriptblock is executed in the caller’s scope and accesses $MyString variable from there:

Script. 'MyString' contains: External variable
Scriptblock. 'MyString' contains: External variable