Enterprise Content Management

Armedia Blog

CMIS Integration – Integrating FileNet with SharePoint 2013

October 17th, 2014 by Ben Chevallereau

Recently, our team has been working on a series of CMIS integrations. This video demonstrates the CMIS components that we developed and used to integrate FileNet with SharePoint 2013. This integration has been packaged into SharePoint. During the video, you’ll see how to connect to FileNet, browse the repository, create folders, create documents, and preview and download documents.

Predictive Analytics and The Most Important Thing

October 15th, 2014 by Jim Nasr

It turns out I was wrong…which happens at an alarmingly increasing rate these days—though I chalk that up to a thirst to challenge myself…errr, my story!

So, for a while now, I had convinced myself that I knew what the most important thing was about successfully doing predictive analytics: accuracy of data and the model (separating the noise). Veracity, as they say. In working with a few clients lately though, I no longer think that’s the case. Seems the most important thing is actually the first thing: What is the thing you want to know? The Question.

As technologists, we often tend to over-complicate and possibly over-engineer. And it’s easy to make predictive analytics focus on the how: the myriad ways to integrate large volumes and exotic varieties of data, the many statistical models to evaluate for fit, the integration of the technology components, the visualization techniques used to best surface results, etc. All of that has its place. But ultimately, first, and most importantly, we need to articulate the business problem and the question we want answers for.

What do we want to achieve from the analytics? How will the results help us make a decision?

Easy as that sounds, in practice it is not particularly easy to articulate the business question. It requires a real understanding of the business, its underlying operations, data and analytics and what would really move the meter. There is a need to marry the subject matter expert (say, the line of business owner) with a quant or a data scientist and facilitate the conversation. This is where we figure out the general shape and size of the result and why it would matter; also, what data (internal and external) feeds into it.

Articulating The Question engages the rest of the machinery. Answers are the outcome we care about. The process and the machinery (see below for how we do it) give us repeatability and ways to experiment with both asking questions and getting answers.

Armedia Predictive Analytics Process

Armedia Predictive Analytics process for getting from The Question to answers

VIDEO- Alfresco CMIS Integration- A Sneak Peek

September 19th, 2014 by Ben Chevallereau

For a few months now, my fellow team members and I have been working on a CMIS integration that allows Alfresco to be accessed seamlessly from other platforms. This video demonstrates the components that we built on top of the standard CMIS 1.1 and packaged for different platforms like SharePoint 2013, SharePoint 2010, and Drupal. Using these components, you can browse your repository, find your documents, upload documents by drag and drop, edit online, or use full-text or advanced search. This video focuses essentially on the integration with Alfresco, but the components can be used with any CMIS 1.1 compliant repository. Complementary to these components, we also created a filtered search component. This one is compatible only with Alfresco, but with any version. Using this component, you can run a full-text search and filter the results using metadata like file type, creator, creation date, or file size.

These components have been built with only JS, HTML, and CSS files, which is why they are so easy to repackage in other web platforms. Moreover, we built them to be highly customizable. Depending on your use case, you can customize these components to display relevant metadata, focus on a specific folder, add new filters, and a lot more.



For more information about our CMIS integration with Alfresco, join us next week in San Francisco for Alfresco Summit 2014!

CLICK HERE to register for this event.

Spring Managed Alfresco Custom Activiti Java Delegates

September 17th, 2014 by Judy Hsu

I recently needed to change Alfresco 4’s Activiti to call an object managed by Spring instead of a class instantiated on each execution.  A couple of reasons for this:

  1. A new enhancement required access to a custom database table, so I needed to inject a DAO bean into the Activiti serviceTask.
  2. Refactoring of the code base was needed.  Having Spring manage the Java delegate service task, rather than instantiating new objects for each process execution, is always a better way to go if the application is already Spring managed (which Alfresco is).
    • i.e., I needed access to the DAO bean and the Spring beans Alfresco already exposes.
    • NOTE:  You now have to make sure your class is thread safe though!

For a tutorial on Alfresco’s advanced workflows with Activiti, take a look at Jeff Potts’ tutorial here.  This blog will only discuss what was refactored to have Spring manage the Activiti engine Java delegates.

I wanted to piggy-back off of the Activiti workflow engine that is already embedded in Alfresco 4, so I decided not to define our own Activiti engine manually.  The Alfresco Summit 2013 had a great video tutorial, which helped immensely in refactoring the “Old Method” to the “New Method,” described below.


For our example, we’ll use a simple Activiti workflow that defines two service tasks, CherryJavaDelegate and ShoeJavaDelegate (the abstract AbstractCherryShoeDelegate is the parent).  The “Old Method” does NOT have Spring managing the Activiti service task Java delegates.  The “New Method” has Spring manage and inject the Activiti service task Java delegates, and also adds an enhancement for both service tasks to write to a database table.

Old Method

1. Notice that the cherryshoebpmn.xml example below defines the serviceTasks with the “activiti:class” attribute; this has Activiti instantiate a new object for each process execution:

<process id="cherryshoeProcess" name="Cherry Shoe Process" isExecutable="true">
    <serviceTask id="cherryTask" name="Insert Cherry Task" activiti:class="com.cherryshoe.activiti.delegate.CherryJavaDelegate"></serviceTask>
    <serviceTask id="shoeTask" name="Insert Shoe Task" activiti:class="com.cherryshoe.activiti.delegate.ShoeJavaDelegate"></serviceTask>
</process>

2. Since we have multiple service tasks that need access to the same Activiti engine java delegate, we defined an abstract class that defined some of the functionality.  The specific concrete classes would provide / override any functionality not defined in the abstract class. 

import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.JavaDelegate;

public abstract class AbstractCherryShoeDelegate implements JavaDelegate {
    @Override
    public void execute(DelegateExecution execution) throws Exception {
        // common logic shared by all concrete delegates
    }
}

public class CherryJavaDelegate extends AbstractCherryShoeDelegate {
    // cherry-specific behavior goes here
}

New Method

Here’s a summary of all that had to happen to have Spring inject the custom Activiti service task Java delegates into Alfresco 4 (tested with Alfresco 4.1.5) and to write to database tables via injected DAO beans.

  1. The abstract AbstractCherryShoeDelegate class extends Alfresco’s BaseJavaDelegate
  2. There are class load order issues where custom Spring beans will not get registered.  Set up a depends-on relationship with the activitiBeanRegistry for the abstract parent AbstractCherryShoeDelegate
  3. The following must be kept intact:
    • In the Spring configuration file:
      • The abstract AbstractCherryShoeDelegate bean defines parent="baseJavaDelegate", abstract="true", and depends-on="activitiBeanRegistry"
      • For each concrete Java delegate:
        • The concrete bean id MUST match the class name, which in turn matches the activiti:delegateExpression in the bpmn20 configuration XML file
          • NOTE: Reading this Alfresco forum, it looks like the activitiBeanRegistry registers the bean by class name, not by bean id, so this is likely not a requirement
        • The parent attribute MUST be defined as an attribute

Details Below:

1. Define Spring beans for the abstract parent class AbstractCherryShoeDelegate and for each concrete class that extends it (i.e., CherryJavaDelegate and ShoeJavaDelegate), so that Spring manages the custom Activiti Java delegates.  The abstract parent must define its own parent as "baseJavaDelegate", abstract="true", and depends-on="activitiBeanRegistry".

<bean id="AbstractCherryShoeDelegate" parent="baseJavaDelegate" abstract="true" depends-on="activitiBeanRegistry"></bean>

<bean id="CherryJavaDelegate"
      class="com.cherryshoe.activiti.delegate.CherryJavaDelegate" parent="AbstractCherryShoeDelegate">
    <property name="cherryDao" ref="com.cherryshoe.database.dao.CherryDao"/>
</bean>

<bean id="ShoeJavaDelegate"
      class="com.cherryshoe.activiti.delegate.ShoeJavaDelegate" parent="AbstractCherryShoeDelegate">
    <property name="shoeDao" ref="com.cherryshoe.database.dao.ShoeDao"/>
</bean>


- Do NOT put any periods to denote package structure in the bean id!  Alfresco/Activiti gets confused by the package “.”, whereas Spring normally works fine with this construct.

- Also, just having the concrete class extend the parent abstract class is not enough to make it work.  The following does NOT work:

<bean id="com.cherryshoe.activiti.delegate.CherryJavaDelegate"
      class="com.cherryshoe.activiti.delegate.CherryJavaDelegate">
    <property name="cherryDao" ref="com.cherryshoe.database.dao.CherryDao"/>
</bean>

<bean id="com.cherryshoe.activiti.delegate.ShoeJavaDelegate"
      class="com.cherryshoe.activiti.delegate.ShoeJavaDelegate">
    <property name="shoeDao" ref="com.cherryshoe.database.dao.ShoeDao"/>
</bean>

2. Notice that the cherryshoebpmn.xml example below uses the “activiti:delegateExpression” attribute and references the Spring bean.  This means only one instance of that Java class is created for the serviceTask it is defined on, so the class must be implemented with thread safety in mind:

<process id="cherryshoeProcess" name="Cherry Shoe Process" isExecutable="true">
    <serviceTask id="cherryTask" name="Insert Cherry Task" activiti:delegateExpression="${CherryJavaDelegate}"></serviceTask>
    <serviceTask id="shoeTask" name="Insert Shoe Task" activiti:delegateExpression="${ShoeJavaDelegate}"></serviceTask>
</process>

3. The abstract class now extends BaseJavaDelegate.  The concrete classes provide or override any functionality not defined in the abstract class.

import org.activiti.engine.delegate.DelegateExecution;
import org.alfresco.repo.workflow.activiti.BaseJavaDelegate;

public abstract class AbstractCherryShoeDelegate extends BaseJavaDelegate {
    @Override
    public void execute(DelegateExecution execution) throws Exception {
        // common logic; concrete subclasses call their injected DAOs here
    }
}

public class CherryJavaDelegate extends AbstractCherryShoeDelegate {
    // cherry-specific behavior and the injected CherryDao setter go here
}
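Because a delegate referenced by delegateExpression is a shared Spring singleton, per-execution data must flow through the execution, never through instance fields. Here is a minimal, self-contained sketch of that style; the Execution and MapExecution types below are stand-ins invented for this example, not the real Activiti interfaces:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for the Activiti types, only to keep this sketch self-contained;
// the real JavaDelegate/DelegateExecution interfaces live in the Activiti engine.
interface Execution {
    Object getVariable(String name);
    void setVariable(String name, Object value);
}

class MapExecution implements Execution {
    private final Map<String, Object> vars = new HashMap<>();
    public Object getVariable(String name) { return vars.get(name); }
    public void setVariable(String name, Object value) { vars.put(name, value); }
}

// Thread-safe delegate style: no mutable instance fields, so one shared
// singleton instance can serve concurrent process executions. All per-process
// state is read from and written to the execution.
class CherryDelegateSketch {
    public void execute(Execution execution) {
        String cherry = (String) execution.getVariable("cherryName");
        execution.setVariable("inserted", "CHERRY:" + cherry);
    }
}
```

Injected DAO beans are fine as fields because they are set once at startup and never mutated afterward; it is per-execution state that must stay out of the instance.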

For more examples and ideas, I encourage you to explore the links provided throughout this blog. Also take a look at Activiti’s user guide, particularly the Java Service Task Implementation section. What questions do you have about this post? Let me know in the comments section below, and I will answer each one.

The blog Spring Managed Alfresco Custom Activiti Java Delegates was originally posted on cherryshoe.blogspot.com.

U. S. Government Digital Acquisition Policy Gets an Update

September 11th, 2014 by Scott Roth

You may have seen the news that the U. S. Government has established the U.S. Digital Service, a small team designed “to improve and simplify the digital experience that people and businesses have with their government.” On the heels of that announcement came the news that Michael Dickerson, former Google engineer, has been selected to head up the U. S. Digital Service. And, in conjunction with these announcements, came some initial updates to the U. S. Government’s acquisition policies as they relate to software and computing solutions. It is these updates I would like to highlight in this post.

These initial updates come in the form of two documents, the Digital Services Playbook and the TechFAR, which really go hand in hand. The Playbook lays out best practices for creating digital services in the government, and the TechFAR describes how these services can be acquired within the confines of existing acquisition policy (i.e., the FAR). The Playbook discusses 13 “plays,” or best practices, that should be implemented to ensure delivery of quality applications, websites, mobile apps, etc., that meet the needs of the people and government agencies.  Advocating and implementing these plays will be the Digital Service’s mission.  As a long-time provider of software development services, I wasn’t too surprised by any of these best practices – and neither will you be. However, it was refreshing to see the government finally embrace and advocate them.  Here are the Digital Services Playbook plays.

  1. Understand what people need
  2. Address the whole experience, from start to finish
  3. Make it simple and intuitive
  4. Build the service using agile and iterative practices
  5. Structure budgets and contracts to support delivery
  6. Assign one leader and hold that person accountable
  7. Bring in experienced teams
  8. Choose a modern technology stack
  9. Deploy in a flexible hosting environment
  10.  Automate testing and deployments
  11. Manage security and privacy through reusable processes
  12. Use data to drive decisions
  13. Default to open

Like I said, you probably weren’t surprised by these practices; in fact, if you are a successful software services company, you probably already implement them. But remember, these practices are now being embraced by the U. S. Government, whose acquisition policy has traditionally been geared more toward building battleships than software solutions.

Speaking of acquisition, the TechFAR is a handbook that supplements the Federal Acquisition Regulations (FAR). The FAR is a strict and lengthy body of regulations that all executive branch agencies must follow to acquire goods and services. The Handbook is a series of questions, answers, and examples designed to help the U. S. Government produce solicitations for digital services that embrace the 13 plays in the Digital Services Playbook. At first glance, you may not think that implementing these practices would require a supplement like the Handbook, but if you have any experience with the FAR, or agencies who follow it, you will understand that interpretation and implementation of the regulations varies from agency to agency, and they usually err on the side of caution (i.e., strict interpretation of the policy).

In my experience, the single most difficult thing for a U. S. Government agency to accomplish under the FAR is play #4, the use of agile methodologies to develop software solutions. If you can accomplish this, many of the other plays will happen naturally (e.g., #1, #2, #3, #6, #7, #10). However, the nature of agile development – user stories vs. full system requirements, heavy customer participation vs. just follow the project plan, etc. – seems contrary to the “big design” methodology implied by the FAR. This notion couldn’t be more wrong. The TechFAR encourages the use of agile methodologies and illustrates how solicitations and contracts can be structured to be more agile.

Personally, I think the Digital Services Playbook and the TechFAR are a great starting point for improving the quality and success of government software solutions.  And official guidance like this now brings the U. S. Government’s acquisition process in line with how Armedia has always developed software solutions, i.e., using agile methodology.  No longer will we have to map our methodology and deliverables to an archaic waterfall methodology to satisfy FAR requirements.

I think the questions/answers/examples in the TechFAR are good, and provide terrific insight for both the government writing solicitations, and industry responding to them. If you sell digital services to the U. S. Government, I encourage you to read these two documents, the Digital Services Playbook and the TechFAR  — they’re not long. And even if you don’t contract with the U. S. Government, the best practices in the Playbook and the advice in the Handbook are still probably applicable to your business.

WordPress Contributors Upload Plugins

August 7th, 2014 by Paul Combs

My previous post, “Allow WordPress Contributors to Upload Images,” discussed using the functions.php file to add capabilities to the contributor role that aren’t there by original design. Because the functions.php file is part of a WordPress theme, if an alternate theme is selected, the functions will no longer be accessible unless the functions.php file is edited in that theme as well. With plugins, however, the functions remain.

This function is slightly different from many others in that it makes a persistent change. Even if the function is not enabled as a plugin or is removed from functions.php, the change will remain until it is explicitly revoked. Two plugins are needed.

This first plugin adds the capability for the contributor role to upload content along with their posts. Once enabled, the action takes effect and persists even if the plugin is then disabled; hence the need for the next plugin.

<?php
/*
 Plugin Name: Armedia: Contributor Role Upload Enabler
 Description: Adds the capability to the contributor role to upload content. This change is persistent until it is explicitly revoked. Based on the source by Hardeep Asrani.
 Author: Paul Combs
 Version: 1.0
 Author URI: http://www.armedia.com
*/

function allow_contributor_uploads() {
    if ( current_user_can( 'contributor' ) && ! current_user_can( 'upload_files' ) ) {
        $contributor = get_role( 'contributor' );
        $contributor->add_cap( 'upload_files' );
    }
}
add_action( 'admin_init', 'allow_contributor_uploads' );

This plugin removes the capability of the contributor role to upload content. Once enabled, the action takes effect and persists even if the plugin is then disabled.

<?php
/*
 Plugin Name: Armedia: Contributor Role Upload Disabler
 Description: Removes the capability from the contributor role to upload content. This change is persistent until it is explicitly revoked.
 Author: Paul Combs
 Version: 1.0
 Author URI: http://www.armedia.com
*/

function remove_contributor_uploads() {
    if ( current_user_can( 'contributor' ) && current_user_can( 'upload_files' ) ) {
        $contributor = get_role( 'contributor' );
        $contributor->remove_cap( 'upload_files' );
    }
}
add_action( 'admin_init', 'remove_contributor_uploads' );


There are no checks and balances here, so note that if both plugins are enabled, the results will not be as expected. A quick test of refreshing a contributor’s screen with both plugins enabled reveals that the capability is available only every other refresh. For the expected result, enable one or the other.

Allow WordPress Contributors to Upload Images

August 5th, 2014 by Paul Combs

WordPress offers six different roles ranging from Super Admin to Subscriber. One role permits a user to write and manage their own posts but not publish them: the contributor. Writing a post and submitting it for approval without images is as easy as it gets, but posts of that nature are rare and can get a little boring. Images help make a post more interesting. However, as a contributor, images cannot be uploaded with the post. A number of workarounds may be put in place to remedy this, but each can be time consuming and often a repeated effort. This may not be so bad for a short post, but one with many images can be more challenging than the effort is worth.

Allowing contributors to upload images with their posts would greatly simplify this. One site offers a snippet of code to add to the theme’s functions.php file.

function allow_contributor_uploads() {
    $contributor = get_role( 'contributor' );
    $contributor->add_cap( 'upload_files' );
}

if ( current_user_can( 'contributor' ) && ! current_user_can( 'upload_files' ) ) {
    add_action( 'admin_init', 'allow_contributor_uploads' );
}
This has been tested as working using WordPress 3.9.1. Here are after and before screenshots of a contributor’s admin board. Notice the image on the left has a Media option.


A contributor may now upload their post with images, ready for someone else to publish. After a successful upload, from the Media option, note a couple of differences between images submitted by the contributor and images submitted by others.

This is an image submitted by anyone other than the contributor. Notice that the contributor may only view the image; no other action may be taken.


This image has a box next to it to allow for bulk actions. The contributor may also Edit or Delete Permanently their own image as well as view it.


A second contributor account was created to verify that another contributor may only view other contributors’ images (as well as any other image) and may perform further actions only on their own images. The results were as expected.

It is important to note that even if the code is removed from the functions.php file, the contributor role will still have the capability to upload content. This capability is persistent until explicitly revoked, because the setting is saved to the database. To explicitly revoke this capability, simply reverse the action by editing the code above and appending it to the functions.php file.

function remove_contributor_uploads() {
    $contributor = get_role( 'contributor' );
    $contributor->remove_cap( 'upload_files' );
}

if ( current_user_can( 'contributor' ) && current_user_can( 'upload_files' ) ) {
    add_action( 'admin_init', 'remove_contributor_uploads' );
}

Although functions.php may be modified with either piece of code provided above, a cleaner and more portable method is to use custom plugins: one plugin to enable uploads and another to disable them. That could be the topic of my next article…


How to Export Tabular Data in Captiva 7

July 29th, 2014 by Scott Roth

Armedia has a customer using Captiva 7 to automatically capture tabular information from scanned documents. They wanted to export the tabular data to a CSV file to be analyzed in Excel. Capturing the tabular data in Captiva Desktop proved simple enough; the challenge was exporting it in the desired format.  Our customer wanted each batch to create its own CSV file, and that file needed to contain a combination of fielded and tabular data expressed as comma-delimited rows.

Here is an example of one of the scanned documents with the desired data elements highlighted.


Here is an example of the desired output.

ANDREW MARSH,084224,4/22/2013,7,0,7
ANDREW MARSH,084224,4/23/2013,7.5,1,7.5
ANDREW MARSH,084224,4/24/2013,4,0,9
ANDREW MARSH,084224,4/25/2013,8.5,0,8.5
ANDREW MARSH,084224,4/26/2013,12,0,12
BARB ACKEW,084220,4/22/2013,7,0,7
BARB ACKEW,084220,4/23/2013,9.5,0,9.5
BARB ACKEW,084220,4/24/2013,9.5,0,9.5
BARB ACKEW,084220,4/25/2013,2.5,0,2.5
BARB ACKEW,084220,4/26/2013,8,.5,8

As you can see, the single fields of Employee Name and Employee Number are repeated on each row of the output.  However, because Employee Name and Employee Number were not captured as part of the tabular data on the document, this export format proved to be a challenge.

Here’s what I did:

  1. In the Document Type definition, I created fields for the values I wanted to capture and export (Name, EmployeeNbr, Date, RegHrs, OTHrs, TotHrs).  Here’s how it looks in the Document Type editor:


  2. In the Desktop configuration, I configured:
    • Output IA Values Destination: Desktop
    • Output dynamic Values: checked
    • Output Array Fields: Value Per Array Field
  3. Finally, I created a Standard Export profile that output the captured fields as a text file, not a CSV file. I named the file with a “CSV” extension so Excel could easily open it, but to create the required output format, the file had to be written as a text file.  Here is what the Text File export profile looks like:


The content of the Text file export profile is:

---- Start repeat for each level 1 node ----
---- Start repeat for each row of table: Desktop:1.UimData.Hours ----
---- End repeat ----
---- End repeat ----

By using two nested loops, I was able to access the non-tabular fields, Name and EmployeeNbr, as well as the tabular fields in the same output statement.  This looping feature of the Text File export profile saved me from having to write a CaptureFlow script to iterate through all the table variables and concatenate strings for export.  A nice feature, but not well documented.
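The nested-repeat logic is easy to picture in code. This hedged Java sketch (the class, method, and parameter names are illustrative, not a Captiva API) shows how re-emitting the document-level fields on every table row yields the CSV shown above:

```java
import java.util.List;

// Illustrative sketch of the nested repeats in the Text File export profile:
// the inner repeat walks the table rows, and the non-tabular (document-level)
// fields are re-emitted at the start of every row.
class TabularExportSketch {
    static String exportRows(String name, String empNbr, List<String[]> tableRows) {
        StringBuilder csv = new StringBuilder();
        for (String[] row : tableRows) {                  // repeat for each table row
            csv.append(name).append(',').append(empNbr);  // non-tabular fields
            for (String cell : row) {
                csv.append(',').append(cell);             // tabular fields
            }
            csv.append('\n');
        }
        return csv.toString();
    }
}
```

Called with "ANDREW MARSH", "084224", and the five captured table rows, this would produce the five ANDREW MARSH lines in the sample output.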

Good Times With VirtualBox Networking

July 24th, 2014 by David Miller

TL;DR version: if you run multiple VirtualBox VMs on the same desktop, set up three network interfaces on each such VM (one NAT network, one host-only, one bridged).

Now for the long, more entertaining (hopefully!) version:

Recently I switched from VMware Workstation to Oracle VirtualBox for my personal virtualization needs.   I’m very happy overall. VirtualBox seems faster to me: when I minimize a VM, do lots of other work, then restore the VM, it is responsive right away, whereas VMware would page for a minute or two.  And each VirtualBox VM is in a separate host window, which I like more than VMware’s single tabbed window.

Still, I must say VMware’s networking was easier to deal with.  Here’s how I ended up with 3 IP addresses in each of my local VMs…

I have a CentOS VM running Alfresco and Oracle; a Fedora VM running Apache SOLR and IntelliJ IDEA; and a Windows 2012 Server VM running Active Directory.  I need connectivity to each of them from my host desktop (Windows 8.1), and they need connectivity to each other, and they need to be able to connect to Armedia’s corporate VMs.  Plus,  I’d rather not update my hosts file or IP settings every time  I move between the office and home!

1st VirtualBox network: a NAT Network, which allows each VM to talk to the other VMs, but not to any other machine, and does not allow connections from the host desktop.  This meets Goal #2 (connectivity to each other), but Goals #1 and #3 are not met yet.

2nd VirtualBox network: a VirtualBox Host-Only network which allows connectivity from the host desktop.  Now Goals #1 (connectivity from the host) and #2 (connectivity to each other) are just fine.

Also, both the NAT and the host-only network offer stable IP addresses; whether at home or at work, my VM’s get the same address each time, so I don’t spend 10 minutes updating IP references every time I switch location.

Danger!  Here is where VirtualBox tricks you!  It seems like Goal #3 (access to corporate VMs) is met too!  With the NAT and host-only IP addresses, I can see our internal websites and copy smaller files to and from the data center VMs.  But if I transfer a larger file, I get a Connection Reset error!  Twice in the last month, I’ve spent hours tracking down the “defect” in the corporate network settings.  (You’d think I’d remember the problem the second time around, but in my defense the error manifested in different ways.)

Solution?  Add the 3rd VirtualBox network: a bridged network (i.e. bridged to your physical network adapter, so this network causes each VM to have an IP address just like the host gets, from the corporate/home DHCP server): Now the 3rd goal is really met!  I can transfer files all day long, no worries.
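For reference, the three interfaces can also be wired up from the command line instead of the GUI. A hedged sketch using VBoxManage (the VM name, NAT network name, and adapter names are examples from my setup, not requirements):

```shell
# NIC 1: NAT Network - VM-to-VM connectivity (Goal #2)
VBoxManage modifyvm "CentOS-Alfresco" --nic1 natnetwork --nat-network1 "LocalNat"

# NIC 2: host-only - connectivity from the host desktop (Goal #1)
VBoxManage modifyvm "CentOS-Alfresco" --nic2 hostonly \
    --hostonlyadapter2 "VirtualBox Host-Only Ethernet Adapter"

# NIC 3: bridged to the physical adapter - corporate network access (Goal #3)
VBoxManage modifyvm "CentOS-Alfresco" --nic3 bridged \
    --bridgeadapter3 "Intel(R) Ethernet Connection"
```

Run these while the VM is powered off, once per VM.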

Something to watch out for: when you disconnect a wired Ethernet cable, VirtualBox automatically changes the bridged network to bind to your wireless interface.  This is nice since your VMs automatically get new addresses.  BUT! When you plug in the Ethernet again (which in my case deactivates the wireless), VirtualBox does NOT switch back to the wired interface!  That happened to me this morning.  I spent a few hours trying to figure out why my file uploads failed.  Finally I saw where VirtualBox had re-bound my bridged network.  I changed it back to the wired interface, and all was well.

Alfresco Records Management: An Approach to Implementation Part II

July 22nd, 2014 by Deja Nichols

In the first part of this blog, “Alfresco Records Management: An Approach to Implementation – Part 1,” I went over the business case and planning phase for a medium-sized agency that wanted a seamless records management configuration, leveraging Alfresco’s Enterprise Content Management (ECM) system and Records Management (RM) module.

To figure out how we wanted to approach design and implementation, and how to configure the system properly, we needed to get an idea of the basic lifecycle of our documents and records. We needed to see where we were going. To build a castle, you need to know how much total space and land you need, what materials you need, and what it is going to cost.  Even if it’s just a general idea, it’s best to map out what you want and what the whole project requires first. You can’t just start out with one room of a castle and “see where it takes you.” I have personally seen that it is the same with building ECM and RM systems. Different documents can have different life cycles, but here is a general example of a possible lifecycle for an HR complaint:




In this blog, Part 2, I’m going to go over our last two general aspects, how we set up and implemented Alfresco in order to accomplish our ideal records management configuration:

  • Configuration
  • Implementation

Steps for Creating a Records Management Program


In order to best describe our configuration and implementation phases, I want to go over some very basic aspects of how things were set up in Alfresco. Although we had an older version of Alfresco, most of this was out of the box with little configuration. Here are the basic aspects we created in Alfresco that were important to the layout of the system:

  1. Group sites
  2. Document library within each group site
  3. Document types
  4. Metadata
  5. Record Declaration
  6. Seamless User Interaction
  7. Records Management (RM) Module  (aka RM repository or File Plan)
  8. Workflows for certain documents and records

Alfresco Group Sites Example



Alfresco Group sites:

To break it down, let’s start with the basic structure of our company. Like most companies, we have a hierarchical structure: about seven different departments and about 200 employees. Every employee belongs to one department, so we set up each department with a “group site” in Alfresco (Human Resources site, Finance site, Legal site, etc.; this is an out-of-the-box feature in Alfresco).

Alfresco Document Library:

Each department group site has its own file repository. In Alfresco it is called a “Document Library,” which, per our records policies, was deemed the single-source repository for all of that department’s electronic documents.

Document types:

Each document library can be set up with a unique set of “Document Types” to categorize documents into your file taxonomy. They can also be unique per group sites’ document library. (For example, the Human Resources document library may have “Employee Contracts” and “Resumes” as two possible document types, but Finance may have “Vendor Contracts” and “Invoices” etc.)

The idea was that, upon uploading a document to their department’s document library, the employee was prompted to select a document type. You can also set up a sub-document type if that is necessary per the retention schedule or file taxonomy.


Then we configured the system to require the user to enter any applicable metadata for the document they are uploading (as required by our document matrix in Part 1). Some of our documents needed extra properties (metadata) to help map them to the correct location for retention purposes. For example, for the document type “Resume” we added the metadata “Name of Employee” so that the system knew which records management folder to put it in (which I will go over in more detail later in this blog). Each record uploaded typically needed only one extra piece of information to correctly categorize it for records management purposes.
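That type-plus-metadata filing rule can be pictured as a simple lookup. A hypothetical Java sketch (the folder layout and names are invented for illustration; the real mapping lived in our Alfresco configuration):

```java
// Hypothetical mapping from a document's type and its one extra piece of
// metadata to a records-management folder path. The File Plan layout shown
// here is invented for illustration only.
class RmFolderMappingSketch {
    static String rmFolderFor(String docType, String keyMetadata) {
        return "/File Plan/" + docType + "/" + keyMetadata;
    }
}
```

So a “Resume” uploaded with “Name of Employee” set to “Jane Doe” would, under this invented layout, file to /File Plan/Resume/Jane Doe.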

Alfresco Records Management Home Page


Record Declaration

The last step we configured the system to handle when uploading a document was declaring the document an official record, an option that only appeared for document types with an associated retention policy. If the user chose a document type that was predetermined to be a record (as opposed to a non-record), they were given the option to declare the document they were uploading an official record. If the document was “complete” and ready to be an “official record,” checking that box declared it an official record immediately upon ingestion. If the document was still a work in progress, the user could simply leave the box unchecked and declare the document an official record later, once it was fully complete. (Example: if the document type is “Contract” and the contract is still being worked on when uploaded, the user would not check the “declare record” box upon ingestion. Later, when that contract gets officially signed, they can declare it a record.)

For us, the word “complete” was defined per document by the documents matrix: essentially, “when a document is considered complete” and/or “at which point it becomes an official record.” One example of our declaration criteria was: “After the document (in this case, a contract) has been officially approved AND all stakeholders have signed off on it, it can be declared a record.” For some records this was not applicable, such as articles of incorporation, bills, financial statements, etc. Since these are un-editable documents, they were automatically official records upon ingestion and were immediately sent to the RM module for retention. Anything of those document types was therefore not given the “declare it a record” option; it was declared automatically upon ingestion.
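The declaration rules above boil down to a small decision function. This is a hedged sketch assuming three hypothetical buckets of document types; in practice the buckets come from your retention schedule and documents matrix:

```python
# Hypothetical document-type buckets; a real system derives these from
# the retention schedule / documents matrix.
AUTO_DECLARED = {"Articles of Incorporation", "Bill", "Financial Statement"}
NO_RETENTION = {"Working Note"}  # non-records: no retention policy attached

def declaration_state(doc_type, box_checked):
    """Classify a new upload as 'record', 'in_progress', or 'non_record'."""
    if doc_type in NO_RETENTION:
        return "non_record"      # user is never shown the checkbox
    if doc_type in AUTO_DECLARED:
        return "record"          # un-editable: declared upon ingestion
    # Retention-bearing, editable type: the user's checkbox decides.
    return "record" if box_checked else "in_progress"
```

An “in_progress” document stays editable in the group site until someone declares it, at which point it transitions to “record” and is filed for retention.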

Process for Declaring a Record


In most cases, we found the user knows what they are uploading. They usually upload their own work into the system, and they know whether that work is still “in progress” or “complete.” We also found it was not necessary to teach users the document matrix, because most of them knew what they were working on like the back of their hand. This method worked for us, and we did not have to turn our end users into Records Managers! They only needed to know 3 basic things:

1. What the document type was (invoice, contract, financial report)

2. What the metadata was (date of document or name of employee, etc., usually only one piece of extra information was needed)

3. Is it still being edited or otherwise worked on, or can it be declared a record now?

Seamless User Interaction:

We wanted our users to be able to see the records in their own context. What I mean is, we didn’t want them having to look for their documents in two places, and we didn’t want them to have to worry about, or even know about, the RM module in Alfresco. All they needed to know was that they upload documents to their group site and the RM works behind the scenes. So we set it up so that when they search for documents on their group site, documents that were sent to the RM module from that site also show up in the search results. They can open, view, and collaborate on a record without ever leaving their group site. (You can also set up a visual indicator marking each document that is an “official record,” so you can tell which ones have been sent.)

Alfresco Records Management In Place Records Declaration


Alfresco Records Management Module:

When the Alfresco RM module is set up, it allows the Records Manager to create folders in what is called a “File Plan” and to set retention rules on those folders that coincide with the retention schedule. From there, documents are mapped to File Plan folders (using the documents matrix as a guide) based on the document type and metadata combination placed on each document when it is declared a record. The File Plan folder then applies its retention rules to the documents filed in it.
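A minimal sketch of that mapping step, assuming two hypothetical document types. In Alfresco this routing would typically be done with folder rules or the documents matrix configuration rather than hand-written code, so treat the paths and keys below as illustrations only:

```python
def file_plan_path(doc_type, metadata):
    """Map a (document type, metadata) combination to a File Plan folder.

    The paths and metadata keys are illustrative, mirroring the kind of
    routing a documents matrix defines.
    """
    if doc_type == "Invoice":
        return "RM Site/File Plan/Finance/Invoices/" + metadata["year"]
    if doc_type == "Resume":
        return "RM Site/File Plan/HR/Resumes/" + metadata["employeeName"]
    raise KeyError("no File Plan mapping for " + doc_type)
```

The key point is that the destination folder, and therefore the retention rule, is fully determined by information the user already supplied at upload time.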

Alfresco Records Management Module File Structure Example


When the user selects “Yes” to the question “declare official record?” (whether upon ingestion or later declared), it tells the system that this file can now be sent directly to the File Plan in the Alfresco RM Module.


Now let’s take a look at a practical example of how a file gets uploaded into the system, ends up in the File Plan, and has retention policies applied to it. (The quoted values below are our input variables.)

Actor: wants to upload an old invoice they found into the Finance group site. They first enter the document type: “Invoice.” Because “Invoice” was selected, the system prompts for that document type’s required metadata, which was configured to be “Year.” The actor enters a year: “2007.” Since the “Invoice” document type has a retention period connected to it, per configuration, the actor also sees the “Official Record?” checkbox, and checks it (= true, or Yes). From this input, the system knows exactly:

  • Where to put this file (configured as: RM Site/File Plan/Finance/Invoices/2007)
  • When to put it there (configured as: “Declare Official Record?” = Yes, meaning it is immediately an official record and sent to the File Plan right away)
  • How long it stays there (the document was placed in the Invoices/2007 folder; the Records Manager’s rule on the Invoices folder keeps records for the current year + 6 years, and because the document sits in the “2007” folder the system knows when the retention clock started: 2007 + 6 years runs through the end of 2013, so this document is discarded in January 2014)
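The retention arithmetic in the example above can be sketched in a couple of lines, assuming a “current year + N years” rule with disposal the following January:

```python
def disposal_year(folder_year, retention_years=6):
    """A 'current year + N years' rule keeps a record through the end of
    folder_year + retention_years; disposal happens the next January."""
    return folder_year + retention_years + 1

# The 2007 invoice under a "current year + 6" rule is discarded in Jan 2014.
assert disposal_year(2007) == 2014
```

This is why placing the document in the right year folder matters: the folder supplies the start of the retention clock.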


Alfresco Records Management- Uploading a Document


If a document is not ready to be declared an official record upon ingestion (for example, a contract you are still editing), you can keep it in the system and declare it a record later, when it is ready/complete/approved.



This diagram (above) shows the flow of a typical record lifecycle: upload; assign document type and metadata; if not yet a record, edit and collaborate; later declare it a record; retain and discard (if applicable). Upon upload, the document has its entire life already mapped out for it, depending on the configuration of the document types, metadata, and File Plan. (Please note, some official records are never discarded and have a “Permanent” retention. The File Plan can accommodate these files as well, and the above model would need to be slightly modified to account for them. As discussed earlier in this blog, you don’t even need to ask whether a permanent record is an official record.)


Another popular option when declaring a record is to put the document through a workflow that takes it through an approval process, and once the document gets approved via the workflow, it automatically declares itself a record. The system, from that point on, knows all the information it needs in order to retain it, and if applicable, dispose of it per policy.
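The approval-workflow pattern can be sketched as a small state machine. This is a toy illustration of the idea only; Alfresco’s actual workflow engine is configured, not hand-coded like this:

```python
class ApprovalWorkflow:
    """Toy model of 'declare a record once every approver signs off'."""

    def __init__(self, approvers):
        self.pending = set(approvers)  # approvers who have not signed off
        self.declared = False

    def approve(self, approver):
        self.pending.discard(approver)
        if not self.pending:
            self.declared = True  # fully approved: auto-declare the record
```

Once `declared` flips to true, the document carries everything the system needs (type, metadata, File Plan folder) to retain and eventually dispose of it per policy.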

Knowing what kinds of workflows your company needs for records, documents, and files is an important step. Workflows may be intimately connected to the lifecycle and management of your records, so I suggest keeping them in mind when mapping out your system. For more information from Armedia about workflows in Alfresco, see our blog subsection on Alfresco Workflows.

In closing:

This approach is primarily a “day forward” solution. Migration may require a different approach so that existing files can be ingested into the new system and arrive at the correct location within Alfresco.

I would also note that this approach might work best with a customized user interface, which allows more flexibility.

There are many different ways to go about implementing Records Management, and companies need flexible customization that will work for their business processes and records management needs. This method can help you get started on your own configuration and implementation.

For more information on Records management, check out our white paper: “Records Management: An Approach to Getting Started”

To read more Armedia Blogs about Alfresco, See these links: Alfresco Records Management, Alfresco ECM
