Remove Invalid Characters from File Names

This script strips a potential file name of characters that are invalid in Windows file names, such as *, :, /, and \.
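The gallery script itself is not reproduced on this page, so the following is only a minimal sketch of the idea, not the author's implementation. The function name `Remove-InvalidCharsSketch` and its parameters are hypothetical. The character set is hard-coded to the Windows-invalid characters named in the discussion below plus the ASCII control characters; on Windows, `[System.IO.Path]::GetInvalidFileNameChars()` returns a comparable set.

```powershell
# Hypothetical helper for illustration; not the gallery script itself.
function Remove-InvalidCharsSketch {
    param(
        [Parameter(Mandatory)][string]$Name,
        [string]$Replacement = ''
    )

    # Assumed invalid set: the printable characters Windows rejects,
    # plus the ASCII control characters 0-31.
    $invalid = [char[]]'"<>|:*?/\' + [char[]](0..31)

    # Escape each character so regex metacharacters such as * and \
    # are matched literally inside the character class.
    $escaped = $invalid | ForEach-Object { [regex]::Escape([string]$_) }
    $pattern = '[' + ($escaped -join '') + ']'

    # Replace every invalid character with the replacement string
    # (an empty string by default, which simply removes it).
    [regex]::Replace($Name, $pattern, $Replacement)
}

Remove-InvalidCharsSketch -Name 'a*b:c/d\e.txt'                       # abcde.txt
Remove-InvalidCharsSketch -Name 'report: draft?.txt' -Replacement '_' # report_ draft_.txt
```

Spaces are left alone in this sketch because they are valid in Windows file names.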

 
 
 
 
 
8/16/2016
  • Another novice
    2 Posts | Last post May 31, 2018
    • Hey, Chris.  This script looks awesome.  Unfortunately, it doesn't do anything when I try to run it.  I added a transcript so I could see what happens, but there is nothing there either.  Any ideas?
    • Thank you for the compliment. Could you give me an example of what you were trying to run it against?
  • How to run this from another script
    3 Posts | Last post March 09, 2017
    • Forgive my ignorance; I am a real novice at PowerShell. I am trying to run your script against each file in a directory structure, using a recursive directory scan. When I do, it does not clean up the file name; it just returns the same name. I put a test file in the directory I was testing it on and nothing happened.
      
      I used your install command and then ran this:
      
        $items = Get-ChildItem -Path $FolderPath -Recurse
        foreach ($item in $items)
        {
            Remove-InvalidFileNameChars $item.Name
        }
      
      This is what is returned: &.joe({}%$#2@dd.txt
      
      Any help would be appreciated.
      
      Thanks,
      Joe    
    • Joe,
      You're giving it the name, but not storing the value, or actually telling it to rename the file. This is what you would want for your loop:
      foreach ($item in $items) {
          $NewName = Remove-InvalidFileNameChars $item.Name
          $item | Rename-Item -NewName $NewName
      }
      
      I hope that clarifies things.
    • Joe,
      I apologize. I wrote that code without really thinking about its implications, and I did not consider what you had said about the test file. Nothing changed on your test file because Windows wouldn't have let you create it if it had contained invalid characters to begin with. Very few common keyboard characters are considered invalid by Windows: "<>|:*?/\ and that's all. So your test file name, while ugly, is valid. The purpose of this script is to strip out invalid characters before you try to use a string as a file name, which prevents an error. Checking existing file names is not needed, since Windows would never have allowed invalid characters in them in the first place. Again, I apologize for the terrible first answer.
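To illustrate the point above, here is a small sketch that checks whether a candidate name contains any Windows-invalid character before it is used. The helper name `Test-NameHasInvalidChars` is hypothetical, and the character set is an assumption based on the list in the reply plus the ASCII control characters.

```powershell
# Hypothetical helper for illustration only.
function Test-NameHasInvalidChars {
    param([Parameter(Mandatory)][string]$Name)

    # Assumed Windows-invalid set: "<>|:*?/\ plus control characters 0-31.
    [char[]]$invalid = [char[]]'"<>|:*?/\' + [char[]](0..31)

    # IndexOfAny returns -1 when none of the characters occur in the string.
    $Name.IndexOfAny($invalid) -ge 0
}

Test-NameHasInvalidChars -Name '&.joe({}%$#2@dd.txt'      # False: ugly, but valid
Test-NameHasInvalidChars -Name '2016-08-05T14:22:47.log'  # True: contains colons
```

A name that returns True here is the kind of string worth passing through the script before handing it to a cmdlet such as New-Item or Rename-Item.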
  • Bug with Control Characters
    4 Posts | Last post August 16, 2016
    • Great script!  I did run into a weird RegEx-related bug when using the Replacement parameter if my Name parameter was an ISO 8601-formatted date string such as '2016-08-05T14:22:47'.  An additional replacement character gets appended to the end of the string.  After debugging, I determined this is caused by the replacement of the control character called Data Link Escape (ASCII/Unicode decimal value 16).  I don't know what this character does, nor do I care.  It's not an issue with your code, just some quirk with the .NET RegEx engine.  It's probably doing the right thing, but again I don't care :D
      
      I avoided this by adding a parameter to the script to list any allowable characters, then adding a line to remove them from the array of invalid characters:
      
      #The decimal Unicode value of any normally-invalid characters that are acceptable in the end result
      [int[]]$ValidCharsDecimalUnicodeValues = @(16)
      
      #Ensure we don't remove/replace any characters that we want to consider valid
      $arrInvalidChars = $arrInvalidChars | Where-Object -FilterScript {$ValidCharsDecimalUnicodeValues -notcontains [byte][char]$_}
    • After playing with it some more, I realize this is not so much of a bug as a niche feature request for the ability to have some characters get removed instead of replaced.  Here's how I did it in the end:
      
      Added to the begin block:
          #The decimal Unicode value of any characters to remove instead of replacing
          [int[]]$CharsToRemoveNotReplaceDecimalUnicodeValues = @(16)
      
          #Ensure we don't replace any characters that we want to remove
          $arrInvalidChars = $arrInvalidChars | Where-Object -FilterScript {$CharsToRemoveNotReplaceDecimalUnicodeValues -notcontains [byte][char]$_}
      
      Added to the beginning of the Remove-Chars function:
      
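              ForEach ($Character in $CharsToRemoveNotReplaceDecimalUnicodeValues){
                  #Escape the character so regex metacharacters (e.g. 42 = '*') are matched literally
                  $String = [RegEx]::Replace($String, [RegEx]::Escape([string][char]$Character), '')
              }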
      
    • That is an interesting discovery. If you don't mind my asking, what is generating the ISO 8601 string? I can certainly use your code to add this as a feature; I like adding new features. 
    • J-LA, I added the feature. It is called the RemoveOnly parameter, and it takes an array of one-character strings, integers, or hexadecimal values identifying characters that should be removed outright rather than replaced by the value of the Replacement parameter.
      
      Thank you for the idea to add this as a feature. I appreciate it.
      
      Let me know what you think.
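The remove-versus-replace behavior discussed in this thread can be sketched roughly as follows. This is not the script's actual RemoveOnly implementation: the function name is hypothetical, the sketch accepts only integer code points (the real parameter is described as also accepting one-character strings and hexadecimal values), and it uses `String.Replace` instead of regular expressions so that no escaping is needed.

```powershell
# Hypothetical sketch of a remove-only option; not the gallery script's code.
function Remove-InvalidCharsWithRemoveOnly {
    param(
        [Parameter(Mandatory)][string]$Name,
        [string]$Replacement = '',
        [int[]]$RemoveOnly = @()   # assumed: decimal code points to delete outright
    )

    # Assumed Windows-invalid set: "<>|:*?/\ plus control characters 0-31.
    $invalid = [char[]]'"<>|:*?/\' + [char[]](0..31)

    # Step 1: delete the remove-only characters so they never get a replacement.
    foreach ($code in $RemoveOnly) {
        $Name = $Name.Replace([string][char]$code, '')
    }

    # Step 2: replace whatever invalid characters are still present.
    foreach ($c in $invalid) {
        $Name = $Name.Replace([string]$c, $Replacement)
    }

    $Name
}

# 58 is the decimal code point of ':'
Remove-InvalidCharsWithRemoveOnly -Name 'a:b*c' -Replacement '_' -RemoveOnly 58  # ab_c
```

Doing the removal first means the second loop never sees the remove-only characters, so the invalid-character list does not need to be filtered separately.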
  • Keeping a space
    2 Posts | Last post August 28, 2014
    • What if I did not want to replace spaces? What would be an easy way to modify your code so that it leaves spaces unchanged?
    • I apologize for never answering this question.  I finally did update the script to leave the space character intact by default.