In my last post I talked about maintaining the workflow history lists throughout SharePoint. This post is about maintaining Nintex workflow progress data. This data is found in the WorkflowProgress table in a Nintex content database.
When a Nintex Workflow is executed, each action is recorded in the WorkflowProgress table. This gives the system the opportunity to show a graphical representation of the workflow history for a specific instance. You can imagine that this table can contain a lot of information. Nintex recommends to keep this table below 15 million rows to avoid performance issues.
To maintain this table, you can use the PurgeWorkflowData operation of nwadmin.
The PurgeWorkflowData operation has a lot of parameters to specify which records it needs to purge. These parameters might not be enough to limit the records you want to purge. There’s a little catch in purging workflow progress data. To allow proper maintenance of the workflow history data, you need to make sure that you only purge workflow progress data for workflow instances where NO workflow history exists anymore. This means that there’s an order in how to maintain both the history lists and the workflowprogress table:
- Workflow History Lists
- WorkflowProgress table
If you purge the WorkflowProgress table before you clean the workflow history, you will end up with history list items which cannot be purged anymore in a selective way (using the ‘PurgeHistoryListData’ operation of nwadmin). You can only purge them by clearing the list completely.
If you read my previous post, I offered an alternative way for maintaining the history lists and one of the options was to clear the list based on the type of event. This basically means that for a lot of workflow instances, some sort of history will be retained. So, you need to take this into account when maintaining the WorkflowProgress table and make sure you don’t remove the workflow progress data for those instances.
To be able to do this, I created a script which does the following:
- Query the WorkflowProgress table of a Nintex content database to get all workflowProgress data which is created at least 6 months ago.
- It tries to resolve the Site, Web and HistoryList for each workflow instance it found. When the resolution of one of these fail, the instance is obsolete and can be removed from the table.
- If the resolution succeeds, a CAML query is performed to see if the instance has any history data in the history list. When nothing is found, the instance is obsolete and can be removed from the table.
- For each obsolete instance, a SQL query is executed to delete all workflow progress for that instance
If you specify a -LogOnly switch, nothing is deleted. All obsolete instances are written to a file.
Be aware that this script can run for a long time, depending on the amount of instances it finds and needs to check.
Script
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 |
<# .SYNOPSIS Delete obsolete WorkflowProgress data. .DESCRIPTION Delete obsolete workflowProgress data. .PARAMETER DBServer Specifies the name of the database server where the Nintex database exists. If omitted, the local connection will be used. .PARAMETER DBName Specifies the name of the Nintex content database to target. If omitted, all content databases will be in scope. .PARAMETER LogOnly Include this switch to run in log-only mode to simulate the actual cancellations. .NOTES File Name: Delete-WorkflowProgressData.ps1 Author : Bart Kuppens Version : 1.0 .EXAMPLE PS C:\> .\Delete-WorkflowProgressData.ps1 -DBServer SHPDB -DBName NTX_PRD_Content Description ----------- This script will delete all rows in the WorflowProgress table of the specified database for those instances for which no Workflow History was found. #> [CmdletBinding()] param( [Parameter(Position=0,Mandatory=$false,ValueFromPipeline=$false)] [string]$DBServer, [Parameter(Position=1,Mandatory=$false,ValueFromPipeline=$false)] [string]$DBName, [Parameter(Position=2,Mandatory=$false,ValueFromPipeline=$false)] [switch]$LogOnly ) Add-Type -TypeDefinition @" public enum LogType { SUCCESS, FAILURE, INFO, ERROR } "@ ############################### # Functions ############################### function Run-SQLQuery($server, $database, $query, $parameters=@{}) { try { $connection = New-Object System.Data.SqlClient.SqlConnection $connection.ConnectionString = "Server=$server;Database=$database;Integrated Security=True" $command = New-Object System.Data.SqlClient.SqlCommand($query,$connection) $parameters.Keys | % { $command.Parameters.AddWithValue("@$_",$parameters[$_]) } $adapter = New-Object System.Data.SqlClient.SqlDataAdapter $adapter.SelectCommand = $command $adapter.SelectCommand.CommandTimeout = 300 $dataset = New-Object System.Data.DataSet $adapter.Fill($dataset) >> $null $connection.Close() $dataset.Tables[0] } catch { Write-Log -message "Error getting data from SQL ($($_.Exception.InnerException.Message))" -MessageType ERROR } } function Run-SQLNonQuery($server, $database, $query, $parameters=@{}) { try { $connection = New-Object System.Data.SqlClient.SqlConnection $connection.ConnectionString = "Server=$server;Database=$database;Integrated Security=True" $command = New-Object System.Data.SqlClient.SqlCommand($query,$connection) $parameters.Keys | % { $command.Parameters.AddWithValue("@$_",$parameters[$_]) } $connection.Open() $rowsDeleted = $command.ExecuteNonQuery() $connection.Close() $rowsDeleted } catch { Write-Log -message "Error deleting data from SQL ($($_.Exception.InnerException.Message))" -MessageType ERROR } } function Write-Log([string]$message, [LogType]$MessageType) { $TimeStamp = $([DateTime]::Now.ToString('yyyy/MM/dd HH:mm:ss')) $message = "$TimeStamp - $MessageType : $message" Out-File -InputObject $message -FilePath $global:LogFile -Append } function Process-Database($server, $database) { $global:LogFile = "$ScriptLogLocation\ProgressDataCleanup_$($database)_" + $([DateTime]::Now.ToString('yyyyMMdd_HHmmss')) + ".log" if ($LogOnly) { Write-Log -message "!!!!!!!! LOG-ONLY Mode activated !!!!!!!!" -MessageType INFO } Write-Log -message "Cleanup progress data in database '$database'" -MessageType INFO Write-Log -message "Last Action Date : $LastActivityDateUTC" -MessageType INFO Write-Progress -Activity "Processing Data in $database" -Status "Querying database... (might take a while)" -PercentComplete 95 $data = Run-SQLQuery -server $server -database $database -query $selectQuery -parameters @{LastActivityDateUTC="'$LastActivityDateUTC'"} if ($data.Count -gt 0) { Write-Log -Message "# Instances in WorkflowProgress table : $($data.Count)" -MessageType INFO if ((Get-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue) -eq $null) { Add-PSSnapin Microsoft.SharePoint.PowerShell } $instancesToDelete = 0 $instanceRowsToDelete = 0 $instanceIndx = 0 foreach ($row in $data) { $instanceIndex++ Write-Progress -Activity "Processing Data in $database" -Status "Processing instance $instanceIndex/$($data.Count) - Obsolete Data Collected : $($instancesToDelete):$($instanceRowsToDelete)" -PercentComplete 95 $instanceIsObsolete = $false if ($oldHistoryListID -ne $row.HistoryListID) { # dispose old web object if needed if ($web -ne $null) {$web.Dispose()} $oldHistoryListID = $row.HistoryListID $output = NWAdmin.exe -o ResolveSharePointIds -siteId $row.SiteID -webId $row.WebID -listId $row.HistoryListID if (($output.Count -eq 1) -or ($output.Count -gt 1 -and $output[1].Contains("Command line execution error"))) { # The site, web or list doesn't exist. The current instance is therefore obsolete. $instanceIsObsolete = $true } else { $web = Get-SPWeb $output[0] $historyList = $web.Lists.TryGetList($output[1]) } } if (!($instanceIsObsolete) -and $historyList -ne $null) { $spquery = New-Object Microsoft.SharePoint.SPQuery $spquery.ViewFields = "<FieldRef Name='ID' />" $spquery.ViewFieldsOnly = $true $spquery.Query = [string]::Format($queryString,"{$($row.WorkflowInstanceID)}") $listItems = $historyList.GetItems($spquery) if ($listItems.Count -eq 0) { # No history items found for this instance. Instance data is obsolete $instanceIsObsolete = $true } } if ($instanceIsObsolete) { # we can remove all Workflow Progress for the instance $instancesToDelete++ $instanceRowsToDelete += $row.ProgressData if ($LogOnly.IsPresent) { Write-Log -message "$($row.ProgressData) obsolete rows found for instance $($row.WorkflowInstanceID)" -MessageType SUCCESS } else { $deletedInstances = Run-SQLNonQuery -server $server -database $database -query $deleteQuery -parameters @{WorkflowInstanceID=$row.WorkflowInstanceID} Write-Log -message "Deleted $($deletedInstances[1]) rows for workflow instance $($row.WorkflowInstanceID)" -MessageType SUCCESS } } } Write-Log -message "Instances to clean : $instancesToDelete" -MessageType INFO Write-Log -message "ProgressData Rows to remove : $instanceRowsToDelete" -MessageType INFO } } ############################### # Parameters ############################### $global:ScriptLogLocation = "D:\Logs\NintexCleanup" # The location where the output of the script is saved $global:LogFile = "" $MonthsToKeep = 6 # The amount of months events are kept ############################### # Main Script ############################### # Initialize the cutoff date. if ($MonthsToKeep -eq 0) { $cutOffDate = Get-Date } else { $cutOffDate = (Get-Date).AddMonths(-$MonthsToKeep) } $LastActivityDateUTC = $cutOffDate.ToUniversalTime().ToString("yyyy-MM-dd HH:MM:ss") $query = @" SELECT wi.SiteID, wi.WebID, wi.HistoryListID, wi.WorkflowInstanceID, count(wp.InstanceID) as ProgressData FROM WorkflowProgress wp LEFT JOIN WorkflowInstance wi on wi.instanceID = wp.InstanceID WHERE wi.state = 4 AND CONVERT(VARCHAR(19),wp.TimeStamp,120) <= @LastActivityDateUTC GROUP BY wi.SiteID, wi.WebID, wi.HistoryListID, wi.WorkflowInstanceID ORDER BY wi.SiteID, wi.WebID, wi.HistoryListID, wi.WorkflowInstanceID "@ $deleteQuery = @" DELETE wp FROM WorkflowProgress wp INNER JOIN WorkflowInstance wi on wi.instanceID = wp.InstanceID WHERE wi.WorkflowInstanceID = @WorkflowInstanceID "@ # Query to get items for the cleanup $queryString = @" <Where> <Eq> <FieldRef Name='WorkflowInstance'/> <Value Type='Text'>{0}</Value> </Eq> </Where>" "@ if ([string]::IsNullOrEmpty($DBServer) -or [string]::IsNullOrEmpty($DBName)) { # No Database or DBServer was specified in the startup params. Process all Nintex content databases [System.Reflection.Assembly]::LoadWithPartialName("Nintex.Workflow") | Out-Null $configdb = [Nintex.Workflow.Administration.ConfigurationDatabase]::GetConfigurationDatabase() if ($configdb -ne $null) { foreach ($contentdb in $configdb.ContentDatabases) { Process-Database -server $contentdb.SqlConnectionString["Data Source"] -database $contentdb.SqlConnectionString["Initial Catalog"] } } } else { # Database and server have been specified at startup. Process only that database. Process-Database -server $DBServer -database $DBName } |