Removing Duplicate Items from a Mailbox

NOTICE: The TechNet gallery will be retired in June, 2020. For future updates to these scripts, please visit my GitHub profile at https://github.com/michelderooij. There you can also watch for updates, report bugs, ask questions, or contribute yourself. Official announcemen

 
 
 
 
 
Exchange
5/6/2020
  • Errors in performing search for Local Exchange 2010
    2 Posts | Last post February 25, 2020
    • Hi, I've been trying to use this script to remove duplicates that were created in a user's Inbox because I did not select the "do not import duplicates" option.
      
      I have attempted to run this script using many different examples but they all lead to the following error: "WARNING: Error performing operation FindFolders with Search options in . Error: You cannot call a method on a null-valued expression."
      
      Here is an example of the line I'm using:
      
      $UserCredential = Get-Credential
      
      ./Remove-DuplicateItems1.ps1 -Mailbox User1 -Type Mail -Credentials $UserCredential -DeleteMode HardDelete -Mode Full -Verbose
      
      The script then errors out completely with the following:
      
      C:\Users\admin.DOMAIN\Documents\Remove-DuplicateItems1.ps1 : Cannot access mailbox information store, error: The script 
      failed due to call depth overflow.
      At C:\Users\admin.DOMAIN\Documents\User1.ps1:3 char:1
      + ./Remove-DuplicateItems1.ps1 -Mailbox User1 -Type Mail -Credentials $UserCredent ...
      + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
          + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Remove-DuplicateItems1.ps1
      
      Note that I have a second version of the main script, Remove-DuplicateItems.ps1, which is why this one has a 1 in its name. The other one is from a different site and appears to be a stripped-down edit of this one.
      
      Any help with this is appreciated.
    • If you're passing a set of mailboxes, the script will call itself per entry; possibly it goes wrong somewhere there. 
  • Cannot bind to MsgFolderRoot & ArchiveMsgFolderRoot
    1 Posts | Last post February 25, 2020
    • What am I missing to run this against my Office 365 mailbox?  I'm only targeting a folder (called "Keep") with its subfolders.
      
      I'm using the script with cmd: .\Remove-DuplicateItems.ps1 -Identity jdoe@johndoe.com -Type Mail -Server outlook.office365.com -IncludeFolders Keep\* -DeleteMode MoveToDeletedItems
      
      Results with: Processing mailbox jdoe@johndoe.com (jdoe@johndoe.com)
      WARNING: Cannot bind to MsgFolderRoot - skipping. Error: Exception calling "Bind" with "2" argument(s): "The request
      failed. The remote server returned an error: (401) Unauthorized."
      WARNING: Cannot bind to ArchiveMsgFolderRoot - skipping. Error: Exception calling "Bind" with "2" argument(s): "The
      request failed. The remote server returned an error: (401) Unauthorized."
  • Place Duplicates in a different Folder
    2 Posts | Last post February 12, 2020
    • Is there something we can add or change so that the items are placed in, or moved to, another folder? I love the code and it is working perfectly, but it was decided that the items should go to a Backup folder instead of Deleted Items.
    • It's not in the code, but it's not too difficult to create that Backup folder and use its folderId to move items to instead of deleting them. Adding this functionality is on the list, but there is no ETA.
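The reply above describes creating a Backup folder and moving duplicates to it by folder ID. A minimal sketch of that idea with the EWS Managed API follows. It is not part of the published script: the function names `Get-BackupFolderId` and `Move-DuplicateItem` are hypothetical, `$Service` is assumed to be an already connected `ExchangeService` object with the EWS assembly loaded, and `$Duplicates` is assumed to hold the EWS items that were identified as duplicates.

```powershell
# Hypothetical helper: return the FolderId of a top-level folder named $Name,
# creating the folder under the mailbox root when it does not exist yet.
# Assumes the EWS Managed API assembly is loaded and $Service is connected.
function Get-BackupFolderId {
    param(
        [Parameter(Mandatory = $true)] $Service,
        [string] $Name = 'Backup'
    )
    $root = [Microsoft.Exchange.WebServices.Data.WellKnownFolderName]::MsgFolderRoot
    $view = New-Object Microsoft.Exchange.WebServices.Data.FolderView(1)
    $filter = New-Object 'Microsoft.Exchange.WebServices.Data.SearchFilter+IsEqualTo' -ArgumentList @(
        [Microsoft.Exchange.WebServices.Data.FolderSchema]::DisplayName, $Name)

    # Look for an existing folder with that display name directly under the root.
    $found = $Service.FindFolders($root, $filter, $view)
    if ($found.TotalCount -gt 0) { return $found.Folders[0].Id }

    # Not found: create it.
    $folder = New-Object Microsoft.Exchange.WebServices.Data.Folder($Service)
    $folder.DisplayName = $Name
    $folder.Save($root)
    return $folder.Id
}

# Hypothetical helper: move each item to the destination folder instead of
# deleting it. Each item is expected to expose a Move(<FolderId>) method,
# as EWS Item objects do. Returns the number of items moved.
function Move-DuplicateItem {
    param(
        [object[]] $Items,
        $DestinationFolderId
    )
    $moved = 0
    foreach ($item in $Items) {
        [void]$item.Move($DestinationFolderId)
        $moved++
    }
    return $moved
}

# Usage (assumed variables, not run here):
#   $backupId = Get-BackupFolderId -Service $Service -Name 'Backup'
#   Move-DuplicateItem -Items $Duplicates -DestinationFolderId $backupId
```

Where exactly such a call would replace the delete step in the script is not shown in this thread, so treat the integration point as something to verify against the script source.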
  • Remove Duplicates except if has attachments
    5 Posts | Last post December 14, 2019
    • I have been working with this code to remove duplicates in a mailbox that receives a large amount of email. The mailbox also gets emails that are identical except for their attachments, so I want the script to ignore all messages with attachments. I have had trouble adjusting the script to do that. Is there something simple I am overlooking, or could I get some help with what I need to add?
    • When using specific fields to compare (e.g. mode Full), it doesn't specifically check for the presence of attachments. The size, however, has likely changed, so you might want to use the -NoSize switch. If the process that changes attachments also touches things like time fields (e.g. Received), you may want to explicitly remove those from the equation: comment out (#) the line "if ($Item.DateTimeReceived) { $key += $Item.DateTimeReceived.ToString()}" in the 'IPM.Note' (= e-mail items) comparison.
    • I did think of the -NoSize switch, but the script is correctly not moving the emails with attachments while also failing to detect a lot of the duplicates, even though I have commented out date sent, date received and Internet message ID. To test this, I made email templates that I just open and send, so the test messages are exactly identical.
      
      I have also got it to sort duplicate emails with attachments with this line:
      if ($Item.hasAttachments) { $key += $Item.hasAttachments.ToString()}
      Is there something I can add to this to make it look only at emails without attachments?
    • You added a line to include attachment names in the comparison, so how would that make it not look at attachments? (or am I missing something here)
    • The duplicate emails we are trying to sort are emails without attachments. The issue is that some emails sent to the mailbox use templates, so they are exactly the same except that the attachments change. The script detects these as duplicates and deletes them even though they are not duplicates, so we just want it to ignore all emails with attachments.
      
      I have basically achieved this, though not in the way I wanted, since I would rather ignore these messages altogether. I set it to run in a way that makes it basically impossible to detect them as duplicates unless they are true duplicates caused by a system error. Sorry if I confused you; it took me longer than it should have to realize this way of doing it.
      
      if ($Item.hasAttachments -eq $True) {
          if ($Item.DateTimeReceived) { $key += $Item.DateTimeReceived.ToString() }
          if ($Item.Subject) { $key += $Item.Subject }
          if ($Item.InternetMessageId) { $key += $Item.InternetMessageId }
          if ($Item.DateTimeSent) { $key += $Item.DateTimeSent.ToString() }
          if ($Item.Sender) { $key += $Item.Sender }
          if ($Item.Body) { $key += $Item.Body }
          if (!$NoSize) { if ($Item.Size) { $key += $Item.Size.ToString() } }
      }
      else {
          #if ($Item.DateTimeReceived) { $key += $Item.DateTimeReceived.ToString() }
          if ($Item.Subject) { $key += $Item.Subject }
          #if ($Item.InternetMessageId) { $key += $Item.InternetMessageId }
          #if ($Item.DateTimeSent) { $key += $Item.DateTimeSent.ToString() }
          if ($Item.Sender) { $key += $Item.Sender }
          if ($Item.Body) { $key += $Item.Body }
          #if (!$NoSize) { if ($Item.Size) { $key += $Item.Size.ToString() } }
      }
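The goal stated in this thread, ignoring messages with attachments altogether, can also be sketched directly rather than by making their keys nearly unique. The following is an illustration only, not the script's actual code: `Get-DuplicateKey` and `Find-DuplicateItem` are hypothetical names, and the items are plain objects with string properties standing in for EWS items (real EWS `Sender` and `Body` values are objects, so their string form would need checking).

```powershell
# Hypothetical: build a comparison key, or return nothing for items that
# must never be treated as duplicates (here: anything with attachments).
function Get-DuplicateKey {
    param(
        $Item,
        [switch] $NoSize
    )
    if ($Item.HasAttachments) { return $null }

    $key = ''
    if ($Item.Subject) { $key += $Item.Subject }
    if ($Item.Sender)  { $key += $Item.Sender }
    if ($Item.Body)    { $key += $Item.Body }
    if (-not $NoSize -and $Item.Size) { $key += $Item.Size.ToString() }
    return $key
}

# Hypothetical: emit every item whose key was already seen earlier in the list.
# The first occurrence is kept; items without a key are skipped entirely.
function Find-DuplicateItem {
    param(
        [object[]] $Items,
        [switch] $NoSize
    )
    # Note: a PowerShell @{} hashtable compares string keys case-insensitively.
    $seen = @{}
    foreach ($item in $Items) {
        $key = Get-DuplicateKey -Item $item -NoSize:$NoSize
        if ([string]::IsNullOrEmpty($key)) { continue }
        if ($seen.ContainsKey($key)) { $item } else { $seen[$key] = $true }
    }
}

# Example with stand-in objects: two identical plain messages and two
# identical messages that carry attachments.
$sample = @(
    [pscustomobject]@{ Subject = 'Report'; Sender = 'alice'; Body = 'text'; Size = 10; HasAttachments = $false },
    [pscustomobject]@{ Subject = 'Report'; Sender = 'alice'; Body = 'text'; Size = 10; HasAttachments = $false },
    [pscustomobject]@{ Subject = 'Report'; Sender = 'alice'; Body = 'text'; Size = 90; HasAttachments = $true },
    [pscustomobject]@{ Subject = 'Report'; Sender = 'alice'; Body = 'text'; Size = 90; HasAttachments = $true }
)
$duplicates = @(Find-DuplicateItem -Items $sample)
# Only the second plain message is reported; both attachment messages are ignored.
```

Returning no key for attachment-bearing items keeps them out of the comparison completely, which avoids relying on timestamps or message IDs to tell them apart.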
  • Works for all types of items?
    2 Posts | Last post December 13, 2019
    • Hi  Michel,
      is this script also for calendars, contacts and tasks?
      Thanks!
    • Any item
  • Doesn't find non-standard folders
    2 Posts | Last post December 02, 2019
    • Hi there, I've managed to get the script working, but it only seems to find 12 folders; it just doesn't see any other ones (I have quite a few more). Any tips or ideas? Thanks!
      
      .\Remove-DuplicateItems.ps1 -Identity me@mydomain.com -Server outlook.office365.com -Credentials $Credentials -MailboxOnly -MailboxWide -Type Mail -Mode Full -Report -Verbose -WhatIf
      
      VERBOSE: Loading C:\Program Files\Microsoft\Exchange\Web Services\2.2\\Microsoft.Exchange.WebServices.dll
      VERBOSE: Loaded EWS Managed API v15.00.0913.015
      VERBOSE: Set to trust all certificates
      VERBOSE: Using credentials me@mydomain.com
      VERBOSE: Processing items of type Mail, delete mode is SoftDelete
      Processing mailbox me@mydomain.com (me@mydomain.com)
      VERBOSE: Using Exchange Web Services URL https://outlook.office365.com/EWS/Exchange.asmx
      VERBOSE: Constructing folder matching rules
      VERBOSE: Processing primary mailbox me@mydomain.com
      VERBOSE: Collecting folders to process, type Mail
      VERBOSE: Adding folder \Archive (priority 0)
      VERBOSE: Adding folder \Deleted Items (priority 0)
      VERBOSE: Adding folder \Drafts (priority 0)
      VERBOSE: Adding folder \GitHub (priority 0)
      VERBOSE: Adding folder \Inbox (priority 0)
      VERBOSE: Adding folder \Junk Email (priority 0)
      VERBOSE: Adding folder \Outbox (priority 0)
      VERBOSE: Adding folder \Sent Items (priority 0)
      VERBOSE: Adding folder \Sync Issues (priority 0)
      VERBOSE: Adding folder \Sync Issues\Conflicts (priority 0)
      VERBOSE: Adding folder \Sync Issues\Local Failures (priority 0)
      VERBOSE: Adding folder \Sync Issues\Server Failures (priority 0)
      VERBOSE: Found 12 folders that match search criteria
      Processing folder \Sync Issues
      Processing folder \Sent Items
    • When leveraging FullAccess permission (i.e. not impersonation), make sure you granted FullAccess properly, otherwise it will skip non-default folders, e.g.
      Add-MailboxPermission -Identity userA -User userB -AccessRights FullAccess -InheritanceType All
  • doesn't work for me
    3 Posts | Last post November 12, 2019
    • hi, 
      in order to clean up duplicates on an exchange online archive, I tried this command: 
      
      ".\Remove-DuplicateItems.ps1 -Identity xxx -DeleteMode HardDelete -ArchiveOnly -Force"
      
      but it doesn't work:
      
      "WARNING: Cannot bind to ArchiveMsgFolderRoot - skipping. Error: Exception calling "Bind" with "2" argument(s): "Exchange Server doesn't support the requested version."
      
      I read all the q&a but I cannot find a solution useful for me. I used two different computers (Win7 x86 and Win10 x64), Updated as well, with two different EWS (2.0 x86 and 2.2 x64) but it still doesn't work.
      
      I suspect an Exchange Server incompatibility.
      
      Thanks,
      Simone Frigieri. 
    • hi,
      this command finally worked:
      
      ".\Remove-DuplicateItems.ps1 -Server outlook.office365.com -Identity xxx -DeleteMode HardDelete -ArchiveOnly -Force -Credentials (Get-Credential)"
      
      Simone Frigieri.
    • When you process a mailbox in 365, you need to provide credentials as you found out.
  • Archive mailbox only estimates items without option to remove them
    4 Posts | Last post November 12, 2019
    • Hi Michel,
      I ran the script as below:
      
      .\Remove-DuplicateItems.ps1 -Mailbox usermail@domain.com -Server outlook.office365.com  -IncludeFolders '#SentItems#' -ExcludeFolders '#Inbox#*','#Draft#*','#Deleted Items#*','#Junk#*'-MailboxWide -DeleteMode MoveToDeletedItems -Impersonation -Mode Full -Verbose -Credentials (Get-Credential) -Report -ArchiveOnly
      
      From the progress bar and the report, I could see "Finding Duplicate Items, checked 46,200, found 7458". At the end, I did not get the option to confirm whether I want to perform the action.
      
      The credential used has the following rights: Global Admin, Impersonation, and Full Access and Send As on the target mailbox.
      
      What could I be missing?
    • Hi Michel,
      
      I figured the mailbox was too large to process, with about 50,000 messages, mostly with attachments.
       
      So I used good old mfcmapi to split the items into two folders of roughly 25,000 items each and then ran the command below:
      
      .\Remove-DuplicateItems.ps1 -Mailbox usermailbox@domain.com -Server outlook.office365.com  -IncludeFolders '#SentItems#*' -ExcludeFolders '#Inbox#*','#Draft#*','#Deleted Items#*','#Junk#*','\Sent Items 2\*'-MailboxWide -DeleteMode MoveToDeletedItems -Impersonation -Mode Full -Verbose -Credentials (Get-Credential) -Report -ArchiveOnly
      
      This successfully moved the duplicate items to the DeletedItems folder.
      
      It would be great if the script could be extended to handle this type of scenario.
      
      Thanks
    • Hi Michel,
      
      Following up on my last post: with further testing, I realized the script is able to handle large mailboxes. However, when there is a network outage on the client system on which it runs, the script fails. This also happens with barely noticeable network outages or drops in quality of service from the ISP.
      
      Hence, the recommendation to split the contents of large mailboxes into smaller sub/primary folders only applies when the quality of the internet connection is uncertain.
      
      In my case, I had to modify the "Full" mode to match only on InternetMessageID, because analysing the messages with MFCMAPI showed it was the only attribute the duplicates had in common. The cause of the duplicates was not ascertained, but it was linked to a backup solution used on a MacBook from 2011 to date, which altered message attributes including timestamps and time zones.
      
      Thanks for putting this script together. This seems to be the only available implementation out in the public that works excellently well.
    • Good that you got it sorted. It can handle throttling, but unfortunately outages are another matter.
  • Public Folder
    4 Posts | Last post November 10, 2019
    • Hi Michel,
      
      There was mention of this running against a public folder.  
      How can I do this?
      Does the folder need to be mail enabled?
      Can you do recursive folders, or do public folders behave differently here?
      
      Thanks in advance
    • Development is on Modern Public Folders, true. No version yet for public release, unfortunately.
    • I have a version that works for me against EX2016 public folders.
      I added some changes, but it is still 99.9% Michel's solution.
      I can't upload to TechNet with my low rep, so it's on my OneDrive: https://1drv.ms/u/s!Agi7_K1y3R6SpQ3LUEnRQ6pC7nsY?e=gfx6Rj (use at your own risk).
      
      Processing folder \19036\Approvals
      18612 items processed and 161 removed in 00:08:32 - average 2177 items/min
      
      ######### END-- Mon 26 Aug 2019 213044 ######
    • Hi Rob,
      
      I tried to download your Public Folder version but it wasn't available anymore.  Can you please add it to OneDrive again and post the address?
      
      Thanks
      
      Colin Ferrington
  • Combination of modes
    3 Posts | Last post November 08, 2019
    • Hi Michel,
      
      Thanks for the script, it looks really great, but due to the way my duplicates were created it does not work 100%. The Body mode works, but it clears a lot of emails that are not really duplicates: they are automated messages that always have the same body. Is it possible to compare on a custom combination of parameters like Body+Subject+DateTime or something like that?
      
      Thanks,
      Martin
    • Got it myself by editing the Full mode in the script to suit my needs. Again thanks for the script!
    • Could you share how you did this?
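For the custom combination asked about in this thread (for example Body+Subject+DateTime), one possible approach is to build the comparison key from a configurable list of property names. This is a sketch under assumptions, not the edit the poster actually made: `Get-ItemKey` is a hypothetical function, and the property names are taken from the key-building lines quoted elsewhere on this page.

```powershell
# Hypothetical: join the selected properties of an item into one comparison key.
# A separator is used so that 'ab' + 'c' and 'a' + 'bc' do not collide.
function Get-ItemKey {
    param(
        $Item,
        [string[]] $Fields = @('Subject', 'Body', 'DateTimeReceived')
    )
    $parts = foreach ($name in $Fields) {
        $value = $Item.$name
        if ($null -ne $value) { [string]$value } else { '' }
    }
    return ($parts -join '|')
}

# Example with stand-in objects instead of EWS items.
$first  = [pscustomobject]@{ Subject = 'Alert'; Body = 'Disk full'; DateTimeReceived = [datetime]'2019-11-08T10:00:00' }
$second = [pscustomobject]@{ Subject = 'Alert'; Body = 'Disk full'; DateTimeReceived = [datetime]'2019-11-08T10:00:00' }
$later  = [pscustomobject]@{ Subject = 'Alert'; Body = 'Disk full'; DateTimeReceived = [datetime]'2019-11-09T10:00:00' }

$sameKey      = (Get-ItemKey -Item $first) -eq (Get-ItemKey -Item $second)   # identical fields
$differentKey = (Get-ItemKey -Item $first) -ne (Get-ItemKey -Item $later)    # timestamp differs
```

Passing `-Fields 'InternetMessageId'` would give the single-attribute matching described in the archive mailbox thread above, while the default list keeps automated messages with the same body apart as long as their timestamps differ.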