Wednesday, March 21, 2018

SharePoint 2013 - Execute Search Query using Powershell for large number of results

I recently came across the need to provide a report of specific Content Types documents using SharePoint 2013 Search. Content Search Web Parts does this, however, there is no out-of-the-box method to export your search results to a spreadsheet. you can definitely copy from page to .CSV file but this will not help if there are more items (in thousand's). If your search yields anything above just a couple of pages you’re looking at a lengthy, time consuming, and very boring, task.

Of course there is way, PowerShell to the rescue!. PowerShell saves my lot of time. I will rather spent some time to create powershell so that it will be useful in other senarios too. However, with my latest engagement being primarily focused on the administration side of things, I had finally sat down and really dug into PowerShell, something I’ve been meaning to do for quite some time. PowerShell makes short work of tasks such as the one demonstrated here.

Remember you wil get most of the example on the internet which has powershell to export the search results into csv file. but believe me guys these examples always assumes that your results are less than 10-20 rows. what if you have thousands of rows which need to be exported? the answer is still powershell but there is smart way to use it.

After doing some research I came up with the following reusable script:

Step 1 : Scripts Parameter

#input parameters
param
(
[string] $siteUrl = "https://sharepointdev.company.com", # Web Application URL
[string] $outputPath = ".", # Log File
[string] $queryText = 'ContentTypeId:0x0101009E3C6E67A2CF4261807753FA8CF19B5F*'

Step 2 : Add Snapin

if ((Get-PSSnapin -Name "Microsoft.SharePoint.PowerShell" -ErrorAction SilentlyContinue) -eq $null )
{  
    Add-PsSnapin "Microsoft.SharePoint.PowerShell"
}

Step 3 : Set File Path

$csvFilePathTo = "$outputPath\Documents.csv"
$logFilePathTo = "$outputPath\Documents_Log.txt"

set-variable -option constant -name filename -value $csvFilePathTo
set-variable -option constant -name outputFile -value $logFilePathTo # Log File

Step 4 : Set keywordquery object

# new keywordquery object
$site = New-Object Microsoft.SharePoint.SPSite $siteUrl
$query = New-Object Microsoft.office.Server.Search.Query.KeywordQuery $site

# set ResultTypes
$query.ResultTypes = [Microsoft.Office.Server.Search.Query.ResultType]::RelevantResults

# set number of items to return
$currentIndex = 1

#$query.StartRow = $currentIndex
$query.RowLimit = 500
$query.TrimDuplicates = $true;
$query.Timeout = 600000; #10 Minutes

write-host 'Timeout: ' $query.Timeout

# actual string you are searching for

$query.QueryText = $queryText

Step 5 : Set the Properties you want to retrieve

# get all the extended properties for query
$selectProperties = $query.SelectProperties;

$selectProperties.Add(“SiteName”)
$selectProperties.Add(“Path”)
$selectProperties.Add(“ListName”)
$selectProperties.Add(“Title”)
$selectProperties.Add(“Author”)
$selectProperties.Add(“AuthorOWSUser”)
$selectProperties.Add(“DisplayAuthor”)
$selectProperties.Add(“PostAuthor”)
$selectProperties.Add(“Created”)
$selectProperties.Add(“CreatedOWSDate”)
$selectProperties.Add(“ModifiedBy”)
$selectProperties.Add(“LastModifiedTime”)
$selectProperties.Add(“ModifiedOWSDate”)

$selectProperties.Add(“ContentTypeId”)

Step 6 : Get results first time in temporary table

# execute the query
try
{
    $resultTableColl = $query.Execute()
}
catch
{
    # update log files
      "timeout Error Occured `r`n" | Out-File $outputFile -Append
    $resultTableColl = $query.Execute()
}

# get the results back
$resultTable = $resultTableColl.Item([Microsoft.Office.Server.Search.Query.ResultType]::RelevantResults)
write-host $resultTable.TotalRowsIncludingDuplicates
# make a DataTable from the results

$resDataTable = $resultTable.Table 

Step 7 : Loop through the results till all results traversed

###########################################################################
# Loop through the results and append all results into $resDataTable
###########################################################################
do
{
    #write-host $resultTable.TotalRows
    $currentIndex += $resultTable.Table.Rows.Count
    write-host $currentIndex
    $query.StartRow = $currentIndex
    # execute the query
    $resultTableColl = $query.Execute()
    # get the results back
    $resultTable = $resultTableColl.Item([Microsoft.Office.Server.Search.Query.ResultType]::RelevantResults)
   
    if($resultTable.Table.Rows.Count -le 0)
    {
        break;
    }
    else
    {
        # make a DataTable from the results
        $resDataTable.Merge($resultTable.Table)
    }   

    # use below block is for testing purpose only
    # if results are more than 500 then this loop will run only one time 
    if($resDataTable.Rows.Count -gt 500)
    {
        
         # Un-comment the next line to get only <=1000 results
         #break;
    }

}while($resultTable.TotalRowsIncludingDuplicates -gt $resDataTable.Rows.Count)

write-host "Total Rows Found" $resDataTable.Rows.Count

Step 8 : Traverse the table object to creates rows for csv to export

# since we have data lets format as per o/p we needed.
$RArray = New-Object System.Collections.ArrayList
$urlArray = $null
foreach($row in $resDataTable.Rows)
{   
    $urlArray = $row["Path"] -split '/'
    $DocURL = $row["Path"]
    $listTitle = $row["ListName"]
    $website = $row["SiteName"]
    $itemName = $row["Title"]
    $varcreatedBy = $row["AuthorOWSUser"].tostring().Split("|");   
     
    # Add our data to $CTDBObject as attributes using the add-member commandlet
    # Create a new custom object to hold our result.                   
    $CTDBObject = new-object PSObject -Property @{           
                                        DocumentId       = [int]($row["DocId"].tostring() )              
                                        SiteUrl          = $website
                                        Site             = $siteName             
                                        DocumentUrl      = $DocURL
                                        ListName         = $listTitle                                                                           
                                        DocumentName     = $itemName                                        
                                    }  
   
      $RecordNumber = $RArray.Add($CTDBObject
     
    $CTDBObject = $null
}

Step 9 : Export the data to CSV file


#############################################################################
# Export data in excel
$RArray | Select DocumentId, SiteUrl, Site, DocumentUrl, ListName, DocumentName | Export-csv $filename  -NoTypeInformation


Save step 1 to 9 in one file name as "SearchDocuments.ps1". you can either execute ps1 file and specifiy parameter on the fly or else create  a batch file and specify the parameter in that file.

e.g.
Batchfile name: SearchDocuments.bat
Batchfile contents:

SET ThisScriptsDirectory=%~dp0
SET SiteURL=https://sharepointdev.company.com
SET Enivironment=Prod
SET outputDirectory=\\sharepointdev.company.com\reports


PowerShell -NoProfile -ExecutionPolicy Bypass -Command "& '%ThisScriptsDirectory%SearchDocuments.ps1' '%SiteURL%' '%outputDirectory%' 'ContentTypeId:0x0101009E3C6E67A2CF4261807753FA8CF19B5F*' "