Category Archives: RoboCopy

Some tips & tricks I picked up over the years

Get-RCLogSummary take II

Trying to figure out this regular expression thing… I just can’t stand the idea of not getting it!


Joakim Svendsen’s Get-FolderSize uses regular expressions, so I did some investigating. Why reinvent the wheel, eh? ;-) Turns out there’s a lot going on under the hood here. First I had to figure out how the regular expression worked, so I isolated just one…

[Regex]$regex_Dirs = 'Dirs\s:\s+(?<TotalDirs>\d+)\s+(?<CopiedDirs>\d+)(?:\s+\d+){2}\s+(?<FailedDirs>\d+)\s+\d+'

Turns out you can give part of a match a label by using a named capture group, (?<Name>…). Running the regular expression with the -match operator populates the $Matches variable with each label and its value, which allows you to retrieve that value later on.

if ($_ -match $regex_Dirs){
  $rcLog.TotalDirs = [int]$Matches['TotalDirs']
  $rcLog.CopiedDirs = [int]$Matches['CopiedDirs']
  $rcLog.FailedDirs = [int]$Matches['FailedDirs']
} 
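To see the labels in action, here’s a minimal sketch run against a made-up summary line. The sample line and the $dirs variable are my own assumptions for illustration; only the pattern comes from the script. The column order (Total, Copied, Skipped, Mismatch, FAILED, Extras) is what robocopy prints in its job summary.

```powershell
# Made-up summary line: Total Copied Skipped Mismatch FAILED Extras
$line = '    Dirs :        12         3         9         0         1         0'

$regexDirs = 'Dirs\s:\s+(?<TotalDirs>\d+)\s+(?<CopiedDirs>\d+)(?:\s+\d+){2}\s+(?<FailedDirs>\d+)\s+\d+'

if ($line -match $regexDirs) {
  # $Matches now holds one entry per named group
  $dirs = [PSCustomObject]@{
    TotalDirs  = [int]$Matches['TotalDirs']
    CopiedDirs = [int]$Matches['CopiedDirs']
    FailedDirs = [int]$Matches['FailedDirs']
  }
}
$dirs
```

The (?:\s+\d+){2} part is a non-capturing group; it steps over the Skipped and Mismatch columns so the next label lands on FAILED.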

Quick side note: This script works with the /Bytes option. The plus side of using /Bytes is that you can do some fun stuff, like calculating the size in GB or MB:

TotalBytes  = $Matches['ByteCount']
TotalMBytes = ([int64] $Matches['ByteCount'] / 1MB).ToString('N')
TotalGBytes = ([int64] $Matches['ByteCount'] / 1GB).ToString('N')

Nice!
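Here’s a small sketch of that conversion, assuming the byte count was captured as text. The value is made up and the property names just mirror the snippet above. The [int64] cast matters because a byte count of 2 GB or more doesn’t fit in a plain [int].

```powershell
# Made-up byte count, as it would come out of $Matches (a string)
$byteCount = '5368709120'
$bytes = [int64]$byteCount

$size = [PSCustomObject]@{
  TotalBytes  = $bytes
  TotalMBytes = [math]::Round([double]$bytes / 1MB, 2)
  TotalGBytes = [math]::Round([double]$bytes / 1GB, 2)
}
$size
```

I kept the values numeric here instead of calling ToString('N'), so a grid or CSV can still sort on them; formatting can always be done at the very end.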

There’s yet another option: ConvertFrom-String. An excellent blog post about it by Bartek Bielawski was brought to my attention by Dexter Dhami, and I’m working on that as well. That one will only work with PowerShell version 5, so there’s that. Actually, I saw this feature in ISESteroids first, at the PowerShell Summit. What I did notice is that performance took quite a hit when using ConvertFrom-String…

I have mixed feelings when it comes to ConvertFrom-String. At times it just doesn’t get it! I love the template idea, but you need to be specific, otherwise some data might fall through the cracks… You never know for sure… As with all tools, you should be skilled at several. Think of regular expressions as driving a stick and ConvertFrom-String as an automatic. Here in the Netherlands you can get your driver’s license on an automatic, but then you’re not allowed to drive stick ever! But if you pass driving a stick you’re allowed to drive automatic. Best bet, go with stick!

Here’s the code using regular expression, with a lil’ help from my friends… ;-)

#region Hash with the Robocopy Log properties
$rcLogProperties = [Ordered]@{
  rcLogFile = ''
  Source = ''
  Target = ''
  TotalDirs = ''
  CopiedDirs = ''
  FailedDirs = ''
  TotalFiles = ''
  CopiedFiles = ''
  FailedFiles = ''
  TotalBytes = ''
  CopiedBytes = ''
  FailedBytes = ''
  StartTime = ''
  EndTime = ''
  Speed = ''
}
#endregion

#region Main
#Get Logfiles from folder
Get-ChildItem '.\temp\23-06-2015' -File |
ForEach-Object {
  #Get Logfile Header & Footer (read the file only once)
  $content     = Get-Content $_.FullName
  $arrSummary  = $content[5..8]
  $arrSummary += $content[-10..-1]

  $rcLog = New-Object -TypeName psobject -Property $rcLogProperties
  $rcLog.rcLogFile = $_.Name

  Foreach($line in $arrSummary) {
    switch ($line){
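      {$_ -like  '*Source :*'}
      {
        #Strip only the leading 'Source :' label so paths containing spaces stay intact
        $rcLog.Source = ($_ -replace '^\s*Source\s:\s*','').Trim()
      }
      {$_ -like  '*Dest :*'}
      {
        #Same for the 'Dest :' label
        $rcLog.Target = ($_ -replace '^\s*Dest\s:\s*','').Trim()
      }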
      {$_ -like  '*Dirs :*'}
      {
        [regex]$regex_Dirs = 'Dirs\s:\s+(?<TotalDirs>\d+)\s+(?<CopiedDirs>\d+)(?:\s+\d+){2}\s+(?<FailedDirs>\d+)\s+\d+'
        if ($_ -match $regex_Dirs){
          $rcLog.TotalDirs = [int]$Matches['TotalDirs']
          $rcLog.CopiedDirs = [int]$Matches['CopiedDirs']
          $rcLog.FailedDirs = [int]$Matches['FailedDirs']
        }    
      }
      {$_ -like  '*Files :*'}
      {
        [regex]$regex_Files = 'Files\s:\s+(?<TotalFiles>\d+)\s+(?<CopiedFiles>\d+)(?:\s+\d+){2}\s+(?<FailedFiles>\d+)\s+\d+'
        if ($_ -match $regex_Files){
          $rcLog.TotalFiles = [int]$Matches['TotalFiles']
          $rcLog.CopiedFiles = [int]$Matches['CopiedFiles']
          $rcLog.FailedFiles = [int]$Matches['FailedFiles']
        }    
      }
      {$_ -like  '*Bytes :*'}
      {
        [regex]$regex_Bytes = 'Bytes\s:\s+(?<TotalBytes>\d+)\s+(?<CopiedBytes>\d+)(?:\s+\d+){2}\s+(?<FailedBytes>\d+)\s+\d+'
        if ($_ -match $regex_Bytes){
          $rcLog.TotalBytes  = $Matches['TotalBytes']
          $rcLog.CopiedBytes = $Matches['CopiedBytes']
          $rcLog.FailedBytes = $Matches['FailedBytes']
        }    
      }
      {$_ -like  '*Ended :*'}
      {
        [regex]$regex_End = 'Ended\s:\s+(?<EndTime>.+)'
        if ($_ -match $regex_End){
          $rcLog.EndTime = $Matches['EndTime']
        }  
      }
      {$_ -like  '*Started :*'}
      {
        [regex]$regex_Start = 'Started\s:\s+(?<StartTime>.+)'
        if ($_ -match $regex_Start){
          $rcLog.StartTime = $Matches['StartTime']
        } 
      }
      {$_ -like  '*Speed :*'}
      {
        [regex]$regex_Speed = 'Speed\s:\s+(?<Speed>.+\/min)'
        if ($_ -match $regex_Speed){
          $rcLog.Speed = $Matches['Speed']
        } 
      }
    }
  } 
  $rclog
}|
Out-GridView
#endregion

Hope it’s worth something to you!

Ttyl,

Urv

‘Sup PSHomies?

Got a lil’ somethin’ for ya… Get-RCLogSummary! As you know I’m a big fan of RoboCopy! I thought I’d share one of the perks of using RoboCopy: the LogFile.

Here’s a list of RoboCopy Logging options, courtesy of ss64.com

   Logging options
                /L : List only - don’t copy, timestamp or delete any files.
               /NP : No Progress - don’t display % copied.
          /unicode : Display the status output as Unicode text.
         /LOG:file : Output status to LOG file (overwrite existing log).
      /UNILOG:file : Output status to Unicode Log file (overwrite)
        /LOG+:file : Output status to LOG file (append to existing log).
     /UNILOG+:file : Output status to Unicode Log file (append)
               /TS : Include Source file Time Stamps in the output.
               /FP : Include Full Pathname of files in the output.
               /NS : No Size - don’t log file sizes.
               /NC : No Class - don’t log file classes.
              /NFL : No File List - don’t log file names.
              /NDL : No Directory List - don’t log directory names.
              /TEE : Output to console window, as well as the log file.
              /NJH : No Job Header.
              /NJS : No Job Summary.

My preference when it comes to logging is to have separate logfiles instead of appending to one big file. The option /NP is a no-brainer: displaying ‘%’ gives you an indication of how long each specific file/folder took, but who wants that, right? It only increases your logfile size, so it takes more time to parse down the line. I recently used /NDL and I must say it keeps your logfile footprint small. I did include /FP to still have an idea where each file is being copied from. I’d go with /NDL in combination with /FP when doing a delta-sync. A delta-sync is a robocopy job that copies the differences once a full-sync has taken place. If a file hasn’t changed, robocopy will skip it; only new and newer files are copied… Ok, enough background, let’s get scripting, shall we? :-P
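As a rough sketch, the logging options I just mentioned could be combined like this. Every path and the job name are hypothetical placeholders, and /E (copy subfolders, including empty ones) is just my pick for the example; swap in whatever copy options your job needs. The robocopy call itself is commented out, so the snippet only builds the argument list.

```powershell
# Hypothetical source, target and log locations: replace with your own
$source  = '\\sourceserver\share\DEPT-001'
$target  = '\\targetserver\share\DEPT-001'
$jobName = 'DEPT-001'
$logDir  = '.\log\rc'

# One logfile per job instead of appending to one big file
$logFile = '{0}\{1}.log' -f $logDir, $jobName

$rcArgs = @(
  $source
  $target
  '/E'     # include subfolders, even empty ones
  '/NP'    # no percentage progress in the log
  '/NDL'   # no directory names in the log
  '/FP'    # full path for every file that is logged
  "/LOG:$logFile"
)

# Uncomment to actually run the job:
# & robocopy.exe @rcArgs
$rcArgs -join ' '
```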

Function Get-RCLogSummary{
  param(
    [String]$LogFileName,
    [String[]]$LogSummary
  )

  $objLogSummary = @{
    rcLogFile = $LogFileName
    Speed = ''
  }

  Foreach($line in $logSummary) {
    switch ($line){
      #Header
      {$_ | select-string '   Source :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('Source',$_.Substring(11).Trim())
        }
      {$_ | select-string '     Dest :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('Target',$_.Substring(11).Trim())
        }
      {$_ | select-string '  Started :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('Start',$($_.Substring(11).Trim()))
        }
      #Footer
      {$_ | select-string '    Dirs :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('TotalDirs',$_.Substring(11,10).Trim())
          $objLogSummary.Add('FailedDirs',$_.Substring(51,10).Trim())
          $objLogSummary.Add('CopiedDirs',$_.Substring(21,10).Trim())
        }
      {$_ | select-string '   Files :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('TotalFiles',$_.Substring(11,10).Trim())
          $objLogSummary.Add('FailedFiles',$_.Substring(51,10).Trim())
          $objLogSummary.Add('CopiedFiles',$_.Substring(21,10).Trim())
        }
      {$_ | select-string '   Bytes :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('TotalBytes',$_.Substring(11,10).Trim())
          $objLogSummary.Add('FailedBytes',$_.Substring(51,10).Trim())
          $objLogSummary.Add('CopiedBytes',$_.Substring(21,10).Trim())
        }
      {$_ | select-string '   Ended :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('End',$($_.Substring(11).Trim()))
        }
      {$_ | select-string '   Speed :'}
        {
          $_= $_.ToString()
          $objLogSummary.Speed = $($_.Substring(11).Trim())
        }
      {$_ | select-string '   Times :'}
        {
          $_= $_.ToString()
          $objLogSummary.Add('Time Total',$($_.Substring(11,10).Trim()))
        }
      }
    }

  #return $objLogSummary
  [PSCustomObject]$objLogSummary
}

#region:array with all LogSummary Object Properties
$arrRCProperties = @(
  'rcLogFile',
  'Source',
  'Target',
  'TotalDirs',
  'TotalFiles',
  'TotalBytes',
  'FailedDirs',
  'FailedFiles',
  'FailedBytes',
  'CopiedDirs',
  'CopiedFiles',
  'CopiedBytes',
  'Start',
  'End',
  'Time Total',
  'Speed'
)
#endregion

#region: Get all robocopy LogFiles in specified folder and get Summary
get-childitem '.\log\rc\home\22-06-2015' -File |
ForEach-Object {
  #region: Get File Header & Footer
  $content     = Get-Content $_.FullName #Read the file only once
  $arrSummary  = $content[5..8] #Header
  $arrSummary += $content[-11..-1] #Footer
  #endregion

  Get-RCLogSummary -LogFileName $_.Name -LogSummary $arrSummary
}|
Select-Object $arrRCProperties |
Out-GridView
#endregion

First I get a list of logfiles and retrieve the lines at index 5 through 8 (the header) and the last 11 lines (the footer) of each file for processing. The LogFileName and the summary array are then passed as parameters to Get-RCLogSummary. I did a Select-Object to get the properties in a certain order. It was a toss-up between using an [Ordered] hash or defining a [PSCustomObject] beforehand. I figured you could trim the list down to the properties you want by tweaking $arrRCProperties yourself. Last but not least, use Out-GridView or Export-Csv to see the end result.
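If the slicing and the Select-Object trick feel abstract, here’s a tiny sketch with stand-in data. The 30 fake lines and the three-property hashtable are assumptions for illustration only; they are not real robocopy output.

```powershell
# Stand-in for Get-Content: 30 fake lines, 'line 0' .. 'line 29'
$content = 0..29 | ForEach-Object { "line $_" }

$header = $content[5..8]      # 4 lines: index 5 through 8
$footer = $content[-11..-1]   # the last 11 lines, counted from the end

# A plain hashtable does not promise any property order...
$hash = @{ Speed = '123 MB/min'; Source = 'S:\'; rcLogFile = 'job.log' }
$unordered = [PSCustomObject]$hash

# ...so Select-Object with a list of names puts them in the order you want
$ordered = $unordered | Select-Object 'rcLogFile', 'Source', 'Speed'
$ordered.psobject.Properties.Name -join ','
```

Dropping a name from the list passed to Select-Object is all it takes to leave that column out of the grid or the CSV.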

I’m working on my pipeline skills, trust me my previous version was more ‘elaborate’, and by elaborate I mean over engineered…

So I guess you’ve noticed that regular expression is missing? Robocopy labels are fixed which is a good thing for me. I’m looking into it…


This regular expression thing isn’t as easy as it seems… The script above works, just don’t include /Bytes in your robocopy parameter list. If you do, you’ll definitely need regular expressions. Version 2.0, I guess…

Hope it’s worth something to you

Ttyl,

Urv

SDDL gives more NTFS insight

I’ve been doing migrations for, oh, say the past 10 years (hmmm, that’s long if I do say so myself). Data migrations can be complex, depending on what needs to be achieved. I remember using ScriptLogic to map drives depending on which subnet a user was on; that was way before DFS was available… Good times…

I’ve had my share of headaches when it comes to Data migrations. The biggest challenge is interoperability: Target Resources keep on using Source Resources until all Source Resources have been migrated. Sometimes it’s just not possible to migrate all Source Resources at once (what we affectionately call ‘big bang’). If data is being mutated by different departments/projects that aren’t migrated at the same time, then interoperability is your only choice… Still tricky though…

Ok so here’s the scenario: Migrate Resources from one AD Forest to another (with a trust in place). I’ll take you through the Data part :-)

The key component is to use SIDHistory. SIDHistory will help resolve whether you have access or not to a Source Resource. My favorite replication tool has to be robocopy! It wasn’t love at first sight, but once I figured out all the parameters, then there isn’t much you can’t accomplish with it!

For interoperability we usually redirect Target Resources to the Source. This way Data mutation can still be achieved without disturbing Production. In the meantime data is being synced to the Target Domain with ACLs intact! Why? We’ll get to that later… Or might as well get into it now… :-)

Ok, so the ACL (Access Control List) is the list you get when you open the Security tab of a file or folder. Each entry in it is an ACE (Access Control Entry). That’s where you’d grant or remove an account’s read/write/full/etc. access to said file or folder. When using SIDHistory your access token will resolve correctly, but here’s where it gets tricky…

I’ve copied Data with robocopy, keeping security intact. When I opened a folder’s Security tab I noticed the Target account name being displayed. That threw me off because I hadn’t ReACLed the Target Resource yet.

Quick sidestep: ReACL is a term I came across using Quest Active Directory Manager (now DELL). A ReACL can be done by adding the Target Account (doubling the number of ACEs), or as a cleanup by first adding the Target Account and then removing the Source Account. You can also roll back if needed, but that one is tricky, especially if SIDHistory has more than one entry.

But you wouldn’t know that by looking at the folder Security tab.

If you really want to find out who has access, SDDL will let you know. SDDL uses an object’s SID to grant or deny access. Thing is, SDDL is hard to read, hence the Security tab. So the first time I ReACLed a folder by adding the Target Account I saw that the ACEs did double, but I only saw the Target Account. I expected to see SOURCE\ACCOUNT;TARGET\ACCOUNT; instead I was seeing TARGET\ACCOUNT twice. Here’s where looking at SDDL will give you more insight… Suffice to say we’ll be doing this the PowerShell way… Oh come on! Don’t act so surprised! :-P

So first let’s get the ACL of the folder you want to inspect (try this on your folder):

$acl = Get-Acl '\\162.198.1.129\g$\GRP\DATA\DEPT-001-XYZ'

To find out who has access, type $acl.Access. This gives you a list of all ACEs in the ACL. This is the list you’d also see in the Explorer Security tab (the Advanced view, mind you, I noticed that). Now for the fun part: $acl.Sddl… Tada!!!

$acl.Sddl

O:S-1-5-21-103234515-1370883554-928726630-1008G:S-1-5-21-103234515-1370883554-928726630-513D:P(A;OICI;FA;;;SY)(A;OICI;FA;;;BA)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4307)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4308)(A;OICI;0x1200a9;;;S-1-5-21-103234515-1370883554-928726630-4309)

Seems complicated, well yes it is, still it’s worth figuring out… Have a look at MSDN for more information.
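To make the string a bit less scary, here’s a rough sketch that splits it into its ACEs with, you guessed it, a regular expression. The variable names are mine; the field order (type; flags; rights; object GUID; inherited object GUID; trustee) follows the documented ACE string format. It is plain string handling, so it doesn’t touch AD or the file system.

```powershell
# The SDDL string shown above
$sddl = 'O:S-1-5-21-103234515-1370883554-928726630-1008G:S-1-5-21-103234515-1370883554-928726630-513D:P(A;OICI;FA;;;SY)(A;OICI;FA;;;BA)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4307)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4308)(A;OICI;0x1200a9;;;S-1-5-21-103234515-1370883554-928726630-4309)'

# Everything between one pair of parentheses is one ACE
$aces = @([regex]::Matches($sddl, '\(([^)]*)\)') | ForEach-Object {
  $field = $_.Groups[1].Value -split ';'
  [PSCustomObject]@{
    Type    = $field[0]   # A = allow
    Flags   = $field[1]   # OICI = object inherit + container inherit
    Rights  = $field[2]   # FA = full access, or a hex access mask
    Trustee = $field[5]   # a SID, or an alias such as SY (SYSTEM) or BA (Administrators)
  }
})
$aces
```

The first two ACEs use well-known aliases; the other three carry full SIDs, and those are the ones worth comparing against your domains.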

The telltale is the Domain SID: every account SID in a domain begins with it. Looking at the Domain SID part tells you who actually has access (or not) to said resource and which Domain that account belongs to.

The Domain SID for the current domain I’m inspecting is:
DomainSID : S-1-5-21-602145358-1453371165-789345543
You can get the Domain SID using Get-ADDomain cmdlet… ;-)
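A minimal sketch of that check: strip the last block (the RID) from an account SID and compare what is left with the Domain SID. Get-DomainSidPart is a hypothetical helper name I made up for this post; the SIDs are the Domain SID above and one trustee from the SDDL string.

```powershell
# Hypothetical helper: drop the trailing RID to get the domain part of a SID
function Get-DomainSidPart {
  param([string]$Sid)
  $Sid -replace '-\d+$', ''
}

$domainSid  = 'S-1-5-21-602145358-1453371165-789345543'      # current domain
$aceTrustee = 'S-1-5-21-103234515-1370883554-928726630-4307' # from the SDDL string

$aceDomain       = Get-DomainSidPart -Sid $aceTrustee
$isCurrentDomain = ($aceDomain -eq $domainSid)

"ACE domain part : $aceDomain"
"Current domain? : $isCurrentDomain"
```

Here the domain parts differ, so this ACE was written with a SID from another domain than the one I’m inspecting.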

I picked an ACE from the $acl.access list:

FileSystemRights : Modify, Synchronize
AccessControlType : Allow
IdentityReference : SOURCE\DEPT-001-XYZ-RXWR
IsInherited : False
InheritanceFlags : ContainerInherit, ObjectInherit
PropagationFlags : None

Let’s get some AD properties from this account:

Get-ADGroup -Identity DEPT-001-XYZ-RXWR -Server source.nl -Properties SID,SIDHistory
..
SamAccountName : DEPT-001-XYZ-RXWR
SID : S-1-5-21-602145358-1453371165-789345543-35829
SIDHistory : S-1-5-21-103234515-1370883554-928726630-4307

Here’s the sddl string once more:

O:S-1-5-21-103234515-1370883554-928726630-1008G:S-1-5-21-103234515-1370883554-928726630-513D:P(A;OICI;FA;;;SY)(A;OICI;FA;;;BA)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4307)(A;OICI;0x1301bf;;;S-1-5-21-103234515-1370883554-928726630-4308)(A;OICI;0x1200a9;;;S-1-5-21-103234515-1370883554-928726630-4309)

This group has access using SIDHistory!!!
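The comparison I just did by eye can be sketched in a few lines. The SID and SIDHistory values are the ones Get-ADGroup returned above and the trustees come from the SDDL string; the variable names and the Via labels are assumptions of mine. With real data you would fill $group from Get-ADGroup instead of typing it in.

```powershell
# Values copied from the Get-ADGroup output above
$group = [PSCustomObject]@{
  SamAccountName = 'DEPT-001-XYZ-RXWR'
  SID            = 'S-1-5-21-602145358-1453371165-789345543-35829'
  SIDHistory     = @('S-1-5-21-103234515-1370883554-928726630-4307')
}

# The three SID trustees found in the SDDL string
$trustees = @(
  'S-1-5-21-103234515-1370883554-928726630-4307'
  'S-1-5-21-103234515-1370883554-928726630-4308'
  'S-1-5-21-103234515-1370883554-928726630-4309'
)

$result = foreach ($trustee in $trustees) {
  # Does this ACE point at the group's current SID, its SIDHistory, or neither?
  $via = if ($trustee -eq $group.SID) {
    'SID'
  } elseif ($group.SIDHistory -contains $trustee) {
    'SIDHistory'
  } else {
    'no match'
  }
  [PSCustomObject]@{ Trustee = $trustee; Via = $via }
}
$result
```

Only the first trustee matches, and it matches on SIDHistory, not on the group’s current SID.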

Ok, now what? Well, in an ideal situation the data would have been ReACLed using the current SID instead of the SIDHistory. The reason for that is so you can clean up your SIDHistory and avoid token bloat. Here’s an excellent blog by the dirteam discussing the perils of token bloat.

This only scratches the surface of what you could investigate! There aren’t many free tools that can help. Ashley McGlone has an excellent series on the matter, definitely worth reading.

I’m currently doing a Data migration (surprise!) so I’ll be adding more tips/tricks/gotchas as the Data migration progresses so stay tuned!

Hope this will steer you in the right direction when it comes to figuring out who has access…
The rabbit hole goes deep…

Ttyl,

Urv