PowerShell – Performance Tips

Now, we all know that PowerShell is a scripting language, and scripting languages are typically built for accessibility, flexibility and ease of use, not performance. That's true, but some of the choices you make in PowerShell can considerably slow down your code. Here are some tips for getting the most performance out of your PowerShell scripts, and the common mistakes to avoid.

List Types

It's so simple to declare and use arrays in PowerShell, but did you know the default array type can be quite slow compared to its relatives? Let's illustrate this with the following comparison:

$File = 'D:\blog\1000K-passwordlist-unique.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $([Int]($Passwords.Count/1000))K strings"

Write-Host 'Initialize ArrayList with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $ArrayList = [System.Collections.ArrayList]@($Passwords)
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Initialize Generic List with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $List = [System.Collections.Generic.List[string]]::new()
    foreach ($Password in $Passwords) {
        $List.Add($Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Initialize hashtable with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $HashTable = @{}
    foreach ($Password in $Passwords) {
        $HashTable.Add($Password, $Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Initialize ordered dictionary with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Dictionary = [Ordered]@{}
    foreach ($Password in $Passwords) {
        $Dictionary.Add($Password, $Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Initialize hashset with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $HashSet = [System.Collections.Generic.HashSet[string]]$Passwords
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Initialize stack with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Stack = [System.Collections.Stack]::new($Passwords)
} | Select-Object -ExpandProperty TotalMilliseconds
Performance on populating different collection types

And now let’s look at the time to find the partial string *cake* in each of those:

$File = 'D:\blog\10K-passwordlist.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $([Int]($Passwords.Count/1000))K strings"

$ArrayList = [System.Collections.ArrayList]@($Passwords)

$List = [System.Collections.Generic.List[string]]::new()
foreach ($Password in $Passwords) {
    [Void]$List.Add($Password)
}

$HashTable = @{}
foreach ($Password in $Passwords) {
    [Void]$HashTable.Add($Password, $Password)
}

$Dictionary = [Ordered]@{}
foreach ($Password in $Passwords) {
    [Void]$Dictionary.Add($Password, $Password)
}

$HashSet = [System.Collections.Generic.HashSet[string]]$Passwords

$Stack = [System.Collections.Stack]::new($Passwords)

Write-Host 'Find *cake* in ArrayList with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $ArrayList.Where({ $_ -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Find *cake* in Generic List with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $List.Where({ $_ -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Find *cake* in hashtable with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $HashTable.GetEnumerator().Where({ $_.Value -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Find *cake* in ordered dictionary with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Dictionary.GetEnumerator().Where({ $_.Value -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Find *cake* in hashset with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $HashSet.Where({ $_ -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Find *cake* in stack with values (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Stack.Where({ $_ -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds
Performance on searching for values in different collection types

Conclusion: For adding items to a collection, the most performant options are: ArrayList, Stack, and HashSet. For searching through those collections, there were only marginal differences, although the HashSet showed consistently good performance.
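
Side note: the searches above are substring matches, which force a scan of every element no matter which collection you use. Where the HashSet (and the hashtable) really shines is exact-match lookup, since it hashes the key instead of scanning. A quick sketch ('chocolatecake1' is just a made-up example value):

$HashSet = [System.Collections.Generic.HashSet[string]]$Passwords
$HashSet.Contains('chocolatecake1')    # hash lookup - near-instant
$Passwords -contains 'chocolatecake1'  # linear scan of the whole array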

Adding items to an array vs. ArrayList

A very common task is to create an empty array and add items to it in a loop, using the += operator.

$Numbers = @()
foreach ($Number in (1..100)) {
    $Numbers += $Number
}

This is very slow, because PowerShell arrays are fixed-size: every += allocates a brand-new array and copies all existing elements into it. The following comparison against ArrayList and its Add() method shows the difference.

$File = 'D:\blog\50K-passwordlist.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $([Int]($Passwords.Count/1000))K strings"

Write-Host 'Add items to array using += (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Array = @()
    $Passwords.Foreach({ $Array += $_ })
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Add items to ArrayList using .Add() (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $ArrayList = New-Object -TypeName System.Collections.ArrayList
    $Passwords.Foreach({ $ArrayList.Add($_) })
} | Select-Object -ExpandProperty TotalMilliseconds
Performance on adding items to an array
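
Side note: Microsoft's .NET documentation recommends the generic List[T] over ArrayList for new code. Its Add() is just as cheap, and unlike ArrayList.Add() it returns nothing, so there is no output to suppress. A minimal sketch:

$List = [System.Collections.Generic.List[string]]::new()
foreach ($Password in $Passwords) {
    $List.Add($Password)   # List[T].Add returns void - nothing to suppress
}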

Suppressing output

Sometimes you don't care about the output of a statement and just need to suppress it. There are several different ways to go about this, but some are considerably slower than others. The following example tests performance for different ways to suppress $Array.Add() (which returns the index of the item that was added):

$File = '1000K-passwordlist.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $($Passwords.Count/1000)K strings"

Write-Host 'Out-Null (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Array = [System.Collections.ArrayList]@()
    foreach ($Password in $Passwords) {
        $Array.Add($Password) | Out-Null
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host '$null = (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Array = [System.Collections.ArrayList]@()
    foreach ($Password in $Passwords) {
        $null = $Array.Add($Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host '[Void] (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Array = [System.Collections.ArrayList]@()
    foreach ($Password in $Passwords) {
        [Void]$Array.Add($Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host '> $null (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Array = [System.Collections.ArrayList]@()
    foreach ($Password in $Passwords) {
        $Array.Add($Password) > $null
    }
} | Select-Object -ExpandProperty TotalMilliseconds
Performance for suppressing output

Here we can clearly see that as long as you avoid the Out-Null cmdlet, you're good. I should mention that Out-Null was reworked in PowerShell 7 to be faster, but it is still not as fast as the other methods:

Performance for suppressing output in PowerShell 7

Conclusion: Don’t use Out-Null.
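
For quick reference, the three fast ways to suppress output side by side:

$null = $Array.Add($Password)    # assign to $null
[Void]$Array.Add($Password)      # cast to void
$Array.Add($Password) > $null    # redirect to $null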

String Building

It's easy to concatenate strings in PowerShell, but the ease comes at a cost: .NET strings are immutable, so every += allocates a new string and copies both halves, and for large numbers of strings the performance is horrible. Let's look at the following experiment (you can find the password lists used here):

$Passwords = Get-Content -Path 100K-passwordlist.txt

# Concatenate using the + operator
Write-Host 'Concatenate using the + operator (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $OutString = ''
    foreach ($Password in $Passwords){
        $OutString += 'Pwd: ' + $Password;
    }
} | Select-Object -ExpandProperty TotalMilliseconds

# Concatenate using the StringBuilder
Write-Host 'Concatenate using StringBuilder (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $OutString = New-Object -TypeName System.Text.StringBuilder
    foreach ($Password in $Passwords){
        [void]$OutString.Append('Pwd: ' + $Password)
    }
} | Select-Object -ExpandProperty TotalMilliseconds
Performance of adding 10 000 strings
Performance of adding 100 000 strings

Conclusion: In this experiment, the StringBuilder was 98 times faster when concatenating 100 000 strings. The StringBuilder has some initial overhead, so if you are only working with a few strings it might actually be slower, but for big data sets StringBuilder is your go-to method for building strings.
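
Side note: if all you need is to glue an existing collection together, the -join operator hands the whole job to .NET in a single call and avoids the loop entirely. A sketch (not benchmarked here):

$OutString = $Passwords.ForEach({ "Pwd: $_" }) -join ''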

.Where() vs. Where-Object

Collections in PowerShell have a built-in Where-method (added in PowerShell 4.0). Using this method is considerably faster than piping the collection to Where-Object, and the syntax is pretty much the same. Let's take a look at the following experiment:

$Passwords = Get-Content -Path 100K-passwordlist.txt
Write-Host "Testing performance with $($Passwords.Count/1000)K strings (look for string 'cake')"

# Finding strings in an array with the Where-Object cmdlet
Write-Host 'Where-Object cmdlet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords | Where-Object { $_ -like "*cake*" }
} | Select-Object -ExpandProperty TotalMilliseconds

# Finding strings in an array with the Where method
Write-Host 'Built-in Where-method (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords.Where({ $_ -like "*cake*" })
} | Select-Object -ExpandProperty TotalMilliseconds
Performance of Where-Object cmdlet vs. Where-method

Conclusion: The built-in Where-method is about twice as fast as the Where-Object cmdlet, although the difference is most notable when working with lots of objects/strings.

Other built-in methods

The Where-method is actually not the only built-in one. Let's take a look at the performance of some other methods, and compare them to the normal way we'd do it.

ForEach
$File = '1000K-passwordlist.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $($Passwords.Count/1000)K strings"

Write-Host 'Foreach-Object cmdlet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords | ForEach-Object { "$_ hmm.." }
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Foreach method (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords.ForEach({ "$_ hmm.." })
} | Select-Object -ExpandProperty TotalMilliseconds

Conclusion: The built-in ForEach-method is about twice as fast as the ForEach-Object cmdlet, although the difference is most notable when working with lots of objects/strings.
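
For completeness: the plain foreach language statement is usually faster still than both, since it avoids invoking a script block per item. A sketch in the same style:

Write-Host 'foreach statement (milliseconds): ' -NoNewline
Measure-Command -Expression {
    foreach ($Password in $Passwords) { "$Password hmm.." }
} | Select-Object -ExpandProperty TotalMilliseconds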

Split
$File = '1000K-passwordlist.txt'
$Passwords = (Get-Content -Path $File) -join " "

Write-Host "Testing performance string length $($Passwords.Length)"

Write-Host 'Split operator (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords -split " "
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Split method (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords.split(" ")
} | Select-Object -ExpandProperty TotalMilliseconds

Conclusion: The built-in Split-method is much faster than the Split operator.
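
Be aware that the two are not always interchangeable: the -split operator treats its argument as a regular expression, while the Split-method splits on literal characters, and part of the speed difference comes from skipping the regex engine. For example:

'a.b.c' -split '.'     # regex: '.' matches any character, yields empty strings
'a.b.c'.Split('.')     # literal: yields a, b, c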

Reading large files

For reading files we usually use Get-Content, but it turns out .NET has a faster option available to us.

$File = 'D:\blog\1000K-passwordlist.txt'
$SizeInKb = [Int]((Get-ChildItem -Path $File).Length / 1KB)

Write-Host "Testing performance reading a $SizeInKb KB file"

Write-Host 'Get-Content (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Content = Get-Content -Path $File
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'System.IO.StreamReader (milliseconds): ' -NoNewline
Measure-Command -Expression {
    try
    {
        $Sb = New-Object -TypeName System.Text.StringBuilder
        $stream = [System.IO.StreamReader]::new($File)
        # ReadLine returns $null at end of file; the explicit check also keeps
        # blank lines from ending the loop early
        while ($null -ne ($line = $stream.ReadLine()))
        {
            [Void]$Sb.AppendLine($line)
        }
        $Sb.ToString()
    }
    finally
    {
        $stream.Dispose()
    }
} | Select-Object -ExpandProperty TotalMilliseconds

Conclusion: When working with large files, it is considerably faster to use a StreamReader than Get-Content.
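
If you just need all the lines in memory as an array, [System.IO.File]::ReadAllLines is a one-line .NET alternative that should land in the same ballpark as the StreamReader loop (a sketch, not benchmarked here):

$Content = [System.IO.File]::ReadAllLines($File)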

Selecting unique values from an array

Let’s compare different ways to select unique values from an array.

$File = 'D:\blog\100K-passwordlist.txt'
$Passwords = Get-Content -Path $File

Write-Host "Testing performance with $([Int]($Passwords.Count/1000))K strings"

Write-Host 'Select-Object cmdlet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords | Select-Object -Unique
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Sort-Object cmdlet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    $Passwords | Sort-Object -Unique
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'Get-Unique cmdlet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    # Get-Unique only works with sorted arrays
    $Passwords | Sort-Object | Get-Unique
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'HashSet (milliseconds): ' -NoNewline
Measure-Command -Expression {
    [System.Collections.Generic.HashSet[string]]$Passwords
} | Select-Object -ExpandProperty TotalMilliseconds

Write-Host 'LINQ (milliseconds): ' -NoNewline
Measure-Command -Expression {
    [Linq.Enumerable]::Distinct([string[]]$Passwords)
} | Select-Object -ExpandProperty TotalMilliseconds
Performance for selecting unique values in an array

Conclusion: Do not use the Select-Object cmdlet to find unique values in large arrays. For best performance, use a HashSet or LINQ.
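
One detail to be aware of: the HashSet cast above compares strings case-sensitively, while PowerShell comparisons (and Sort-Object -Unique) are case-insensitive by default, so the results can differ. To get case-insensitive uniqueness from a HashSet, pass a comparer explicitly:

$Unique = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$Passwords,
    [System.StringComparer]::OrdinalIgnoreCase
)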

Nested Loops

It is general knowledge within programming communities that the more layers of nested loops you add, the worse performance you can expect. The reason is simple: iteration counts multiply. Looping over n items and, for each one, scanning another m items costs n × m operations. A common fix is to replace the inner scan with a hashtable lookup, which cuts the work down to roughly n + m operations, as shown in the sketch below.
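
A minimal sketch of that pattern ($Users and $Managers are made-up collections for illustration, assumed to be objects with Id, Name and ManagerId properties):

# Slow: for every user, scan the entire manager list (n * m comparisons)
foreach ($User in $Users) {
    foreach ($Manager in $Managers) {
        if ($Manager.Id -eq $User.ManagerId) {
            $User.ManagerName = $Manager.Name
        }
    }
}

# Faster: index the managers by Id once, then each lookup is O(1) (n + m)
$ManagerById = @{}
foreach ($Manager in $Managers) {
    $ManagerById[$Manager.Id] = $Manager
}
foreach ($User in $Users) {
    $User.ManagerName = $ManagerById[$User.ManagerId].Name
}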

Sequential vs. Parallel

PowerShell executes code sequentially by default. All the loops, and even the pipeline, run sequentially on the same thread. So if you were to execute commands on lots of remote computers, it would take a really long time, because the script waits for computer1 to finish before moving on to computer2, and so on.

Parallel processing enables you to work on multiple tasks at the same time. For example, you could execute code against 30-40 computers at a time and feed them from a queue; in the ideal case that is roughly 30-40 times faster than running them one by one.

As a rule, whenever performing remote work against multiple systems, try to incorporate some parallel processing in your script.

Jobs

The easiest way to accomplish this is through PowerShell jobs. Jobs do not offer the best performance, but they are still way faster than sequential processing. The following code shows how to run up to 30 jobs in parallel, and displays a progress bar with the current status.

$ComputerList = 'Computer1', 'Computer2', 'Computer3'
$ComputerStack = [System.Collections.Stack]::new($ComputerList)
$JobLimit = 30
$ProgressId = Get-Random
$JobCode = {
    Param ($Computer)
    Invoke-Command -ComputerName $Computer -ScriptBlock {
        # Code to execute on the remote system
        Start-Sleep -Seconds (Get-Random -Minimum 2 -Maximum 4)
        Get-Service -Name Spooler
    }
}

Try
{
    Do
    {
        # If queue is not full, start one more job if needed
        if ($Running -lt $JobLimit -and $ComputerStack.Count -gt 0)
        {
            $Computer = $ComputerStack.Pop()
            [Void](Start-Job -Name "remote_$Computer" -ScriptBlock $JobCode -ArgumentList $Computer)
        }

        # Check status
        $Jobs = Get-Job -Name "remote_*"
        $Running = ($Jobs | Where-Object { $_.State -eq "Running" }).Count
        $Completed = ($Jobs | Where-Object { $_.State -eq "Completed" }).Count
        $Failed = ($Jobs | Where-Object { $_.State -eq "Failed" }).Count

        Write-Progress -Activity "Remote code execution" -Status "Total $($ComputerList.Count) | Queued $($ComputerList.Count-$Jobs.Count) | Running $($Running) | Completed $($Completed) | Failed $($Failed)" -Id $ProgressId -PercentComplete ([Math]::Floor((($Completed + $Failed) / $ComputerList.Count) * 100))

    } Until ($Running -eq 0)
}
Catch
{
    Write-Error -Message "$_ $(($_.ScriptStackTrace -split "`n")[0])"
}
Finally
{
    # Get outdata from the jobs
    $Result = Get-Job -Name "remote_*" | Where-Object { $_.State -eq "Failed" -or $_.State -eq "Completed" } | Receive-Job

    # Cleanup
    [Void](Get-Job -Name "remote_*" | Stop-Job -PassThru | Remove-Job)
}

Now, if you don't care about the progress bar and error handling, you could shorten it down to the following:

$ComputerStack = [System.Collections.Stack]::new(@('Computer1', 'Computer2', 'Computer3'))
$JobLimit = 30
$JobCode = {
    Param ($Computer)
    Invoke-Command -ComputerName $Computer -ScriptBlock {
        # Code to execute on the remote system
        Start-Sleep -Seconds (Get-Random -Minimum 2 -Maximum 8)
        Get-Service -Name Spooler
    }
}

Do
{
    # If queue is not full, start one more job if needed
    if ($Running -lt $JobLimit -and $ComputerStack.Count -gt 0)
    {
        $Computer = $ComputerStack.Pop()
        [Void](Start-Job -Name "remote_$Computer" -ScriptBlock $JobCode -ArgumentList $Computer)
    }

    # Check status
    $Running = (Get-Job -Name "remote_*" | Where-Object { $_.State -eq "Running" }).Count
} Until ($Running -eq 0)

# Get outdata from the jobs
$Result = Get-Job -Name "remote_*" | Where-Object { $_.State -eq "Failed" -or $_.State -eq "Completed" } | Receive-Job

# Cleanup
[Void](Get-Job -Name "remote_*" | Stop-Job -PassThru | Remove-Job)

Runspaces

Runspaces are like jobs, but without all the extra overhead: each job spawns a whole new PowerShell process, while a runspace runs inside the current one, so it's much faster to spin up a runspace than a job. The caveat is that runspaces are more complex to grasp and work with.
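
To give a flavor of the pattern, here is a minimal sketch using a runspace pool (the script block is a stand-in for real remote work, and $ComputerList is the same list as in the jobs example above):

# Create a pool that runs at most 5 pipelines at a time
$Pool = [RunspaceFactory]::CreateRunspacePool(1, 5)
$Pool.Open()

$Handles = foreach ($Computer in $ComputerList) {
    $Ps = [PowerShell]::Create()
    $Ps.RunspacePool = $Pool
    [Void]$Ps.AddScript({ Param ($Computer) "Result from $Computer" }).AddArgument($Computer)
    # BeginInvoke starts the pipeline without blocking the script
    [PSCustomObject]@{ PowerShell = $Ps; Handle = $Ps.BeginInvoke() }
}

# EndInvoke blocks until each pipeline finishes and returns its output
$Result = foreach ($Item in $Handles) {
    $Item.PowerShell.EndInvoke($Item.Handle)
    $Item.PowerShell.Dispose()
}
$Pool.Close()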

I have another blog post on how to use runspaces for this kind of work:
PowerShell – Parallel Remote Execution with Runspaces

Microsoft also offers a great general tutorial on runspaces here.

Newer PowerShell Versions

Last but not least, you’ll get an overall performance upgrade by upgrading to PowerShell 7, as it’s built on .NET Core and has been optimized with performance in mind.

Get PowerShell 7 here.

Summary

  • For adding items to a collection, the most performant options are ArrayList, Stack, and HashSet
  • Never grow an array with the += operator; use the .Add() method of an ArrayList (or a generic List) instead
  • Never use Out-Null to suppress output; cast to [Void] or assign to $null instead
  • Prefer the built-in collection methods over the corresponding cmdlets
  • Choose StringBuilder when building large strings
  • Use a StreamReader for reading large files
  • Do not use Select-Object to get unique values from a large array; use a HashSet or LINQ
  • Use parallel execution to increase performance when remoting against multiple computers
  • Upgrade to PowerShell 7

Hope this post helped you realize that the default way of doing things may not be the most performant one.
