How to Export Tabluar Data in Captiva 7

Posted by sroth

Armedia has a customer using Captiva 7 to automatically capture tabular information from scanned documents. They wanted to export the tabular data to a CSV file to be analyzed in Excel. Capturing the tabular data in Captiva Desktop proved to be simple enough, the challenge was in exporting it in the desired format.  Our customer wanted each batch to create its own CSV file, and that file needed to contain a combination of fielded and tabular data expressed as comma delimited rows.

Here is an example of one of the scanned documents with the desired data elements highlighted.

Timecard-Scan-Metadata-Highlighted

Here is an example of the desired output.

EMPLOYEE,EID,DATE,REG HRS,OT HRS,TOT HRS
ANDREW MARSH,084224,4/22/2013,7,0,7
ANDREW MARSH,084224,4/23/2013,7.5,1,7.5
ANDREW MARSH,084224,4/24/2013,4,0,9
ANDREW MARSH,084224,4/25/2013,8.5,0,8.5
ANDREW MARSH,084224,4/26/2013,12,0,12
BARB ACKEW,084220,4/22/2013,7,0,7
BARB ACKEW,084220,4/23/2013,9.5,0,9.5
BARB ACKEW,084220,4/24/2013,9.5,0,9.5
BARB ACKEW,084220,4/25/2013,2.5,0,2.5
BARB ACKEW,084220,4/26/2013,8,.5,8

As you can see, the single fields of Employee Name and Employee Number are repeated on each row of the output.  However, because Employee Name and Employee Number were not captured as part of the tabular data on the document, this export format proved to be a challenge.

Here’s what I did:

  1. In the Document Type definition, I created fields for the values I wanted to capture and export (Name, EmployeeNbr, Date, RegHrs, OTHrs, TotHrs).  Here’s how it looks in the Document Type editor:

CC7-doctype

  1. In the Desktop configuration, I configured:
    • Output IA Values Destination: Desktop
    • Output dynamic Values: checked
    • Output Array Fields: Value Per Array Field
  2. Finally, I created a Standard Export profile that output the captured fields as a text file, not a CSV file. I named the file with a “CSV” extension so Excel could easily open it, but to create the required output format, the file had to be written as a text file.  Here is what the Text File export profile looks like:

CC7-export

The content of the Text file export profile is:

EMPLOYEE,EID,DATA,DATE,REG HRS, OT HRS, TOT HRS
---- Start repeat for each level 1 node ----
---- Start repeat for each row of table: Desktop:1.UimData.Hours ----
{S|Desktop:1.UimData.Name},{S|Desktop:1.UimData.EmployeeNbr},{S|Desktop:1.UimData.Date},{S|Desktop:1.UimData.RegHrs},{S|Desktop:1.UimData.OTHrs},{S|Desktop:1.UimData.TotHrs}
---- End repeat ----
---- End repeat ----

By using two nested loops I was able to access the non-tabular fields, Name and EmployeeNbr, as well as the tabular fields in the same output statement.  This looping feature of the Text File export profile saved having to write a CaptureFlow script to iterate through all the table variables and concatenate Strings for export.  A nice feature, but not well documented.

Leave a Reply

Upcoming Webinar: Modernize your legal case management system (06/26/2018)