Enterprise Content Management

Armedia Blog

Archive for the ‘Documentum’ Category

Modifying SharePoint Document Content Types and Libraries Using the Client Model

September 22nd, 2011 by Tim Lisko

In a current eRoom to SharePoint migration project I wanted to preserve the “Date Created”, “Date Modified”, “Edited By”, and “Created By” fields in the eRoom documents. To do this I created a custom content type (in SharePoint 2010) based on the standard “Document” content type with four new fields to accept the migrated information. I also created a library template that uses this content type as well as the standard “Document” content type. I’ll explain why later in this article.

Once the documents are migrated I need to update the migrated list/library items. What I don’t want is to keep one set of “preserved” fields and another set of SharePoint fields. The SharePoint fields, as you probably guessed, record the author and editor as the individual doing the migration (or the impersonated member) and the created and modified dates as the date of migration.

Updating list items is a pretty easy thing to do in SharePoint’s Client Object Model. The following code accomplishes this task.

// Excerpt: row, title, sEditor, sAuthor, editor and author are declared earlier.
foreach (SP.ListItem item in lstItems)
{
    //skip items not assigned to the migratedDoc document type
    //(compare by Name; ToString() on a ContentType returns only the class name,
    // and Name must have been loaded along with the items)
    if (item.ContentType.Name == migType.Name)
    {
        row = listItems_dtbl.NewRow();
        row["List"] = list.Title;
        //the client model returns null for empty fields; Convert.ToString maps null to ""
        title = Convert.ToString(item["Title"]);
        if (title == "" && Convert.ToString(item["FileLeafRef"]) != "")
        {   //set title if one does not exist: the file name without its extension
            title = System.IO.Path.GetFileNameWithoutExtension(item["FileLeafRef"].ToString());
        }

        sEditor = Convert.ToString(item["eEditor"]);
        sAuthor = Convert.ToString(item["eAuthor"]);

        try
        {
            editor = ctx.Web.EnsureUser(sEditor);
            ctx.ExecuteQuery();
        }
        catch
        {   //user not found: fall back to a known account
            editor = ctx.Web.EnsureUser("tlisko");
        }
        try
        {
            author = ctx.Web.EnsureUser(sAuthor);
            ctx.ExecuteQuery();
        }
        catch
        {   //user not found: fall back to a known account
            author = ctx.Web.EnsureUser("tlisko");
        }
        item["Editor"] = editor;
        item["Author"] = author;
        item["Modified"] = item["eModified"];
        item["Created"] = item["eCreated"];

        //remove values
        item["eEditor"] = null;
        item["eAuthor"] = null;
        item["eCreated"] = null;
        item["eModified"] = null;

        //reassign the item to the standard "Document" content type
        item["ContentTypeId"] = docType.Id;
        item.Update();
    }
}
try
{
    //send any updates still queued
    ctx.ExecuteQuery();
}
catch (Exception ex)
{
    //the original excerpt ends after the try block; report the failure as suits your tool
    Console.WriteLine(ex.Message);
}

“ctx” is my ClientContext, though that is probably self-evident. You may also note that I only process items whose content type is my custom content type “migratedDoc” which is instantiated as “migType.” Only the migrated documents will have this content type.

eRoom documents do not have Titles, so I set a title based on the file name. You can leave Title blank – it isn’t a required field. Continuing through the code you will notice that I check that the names of the editor and author exist in the LDAP (ctx.Web.EnsureUser(sAuthor)). The update of the list/library item will fail if you try to use an editor or author that doesn’t exist.
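The title rule described here is easy to try outside SharePoint. The sketch below is a minimal Python illustration, not part of the migration code; the function name title_from_file_name is made up for the example, and it assumes the file name carries an ordinary dotted extension.

```python
import os


def title_from_file_name(title, file_leaf_ref):
    """Return the existing title, or fall back to the file name without its extension."""
    if title:                      # keep a title that is already set
        return title
    if not file_leaf_ref:          # nothing to derive a title from
        return ""
    # splitext drops only the final ".ext", so dots earlier in the name survive
    stem, _ext = os.path.splitext(file_leaf_ref)
    return stem


print(title_from_file_name("", "Q3 budget.v2.xlsx"))        # Q3 budget.v2
print(title_from_file_name("Kickoff notes", "notes.docx"))  # Kickoff notes
```

Stripping only the trailing extension avoids a pitfall of plain string replacement, which would also remove the same letters wherever else they appear in the name.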

I don’t really want to have a custom document library – the SharePoint default document library is just fine. So to get the result I want I need to reassign all the migrated items to the standard “Document” type. A list item’s content type has a Name property. Unfortunately you can’t change it – it is read only. What you can change is the ContentTypeId, and changing that value changes the content type of the list item. In the code above “docType” is a local instance of the “Document” content type.

I said at the beginning of this article that I would explain why I wanted to keep the “Document” content type in addition to creating a custom content type. The reason is that you can’t assign a list item to a content type if that content type isn’t in the list. That’s also why you can’t move a document from one SharePoint library to another unless its content type is in the target library as well.

Next I want to remove my custom content type from the list altogether. But to do that, I have to make sure that all those extra fields that are in the custom content type are set to null – which I do before changing the content type. The order of resetting these fields and reassigning the content type doesn’t matter. What does matter is that if these extra fields are not set to null, you will be unable to remove the content type from the list.

After processing all the list items I can now remove the custom content type from the list. Easy, two lines of code:

migType.DeleteObject();
ctx.ExecuteQuery();

But that still doesn’t remove the custom fields from the list. More code is needed. Combined with the deletion of the custom content type, it looks like this.

migType.DeleteObject();
list.Fields.GetByTitle("eCreated").DeleteObject();
list.Fields.GetByTitle("eModified").DeleteObject();
list.Fields.GetByTitle("eAuthor").DeleteObject();
list.Fields.GetByTitle("eEditor").DeleteObject();
ctx.ExecuteQuery();

And that’s it! The library looks like the default document library with only the “Document” content type as desired.

I'm in eRoom, Get Me Out of Here!

July 27th, 2011 by cstephenson

[turns up amplifier, cranks up volume to 11, puts Huey Lewis and the News vinyl on record player....]
I start up Visual Studio 6.0 and settle down to some nostalgic programming with Visual Basic (VB) 6.  Ah, college memories come flooding back and strangely not many related to actual study.  I feel like I have stepped back in time – the processor has worked its speed up to 88 mph and now I am back using VB6.  I also remember learning Delphi Version 1.0 so that I could *help* a friend do their final year project. Good times.

Think Alfresco from Documentum perspective –Take 1

July 14th, 2010 by bsampath

Open Source ...

When you work in software for a while, you become numb to the way technologies come and go. Occasionally, though, some become commodities and others trendsetters. We have seen that with products such as Apache, Tomcat, Lucene, and Drupal, which have stabilized and matured over the past years with the help of increased open source development. Wait! Did I just say “open source” while about to talk about enterprise content management?

So, without further ado, we have Alfresco: an enterprise content management platform built on open source technologies such as Spring, Hibernate, and Lucene, catering to a rapidly increasing demand for ECM solutions. Having done years of Documentum development and several Alfresco projects of late, I think there are some interesting overlaps and differences in approach that, once understood, help developers adapt more quickly.

The wiki already covers introduction, APIs, development, and deployment in overwhelming detail, and the forums answer questions about issues faced during a project. I am here to talk purely from the developer’s perspective about the key areas where I noticed differences from the Documentum space.

The road map for my next series of blog posts covers each of the areas mentioned below with a more detailed, code-heavy, developer-centric approach that answers these questions:

  • Does this feature exist in Documentum or Alfresco or both?
  • If yes, how different is the approach?

So with that preamble, and in no particular order, I give you my list of the key areas where I got hands-on and learned how different Alfresco is:

A custom data model is the core of any enterprise content management solution. “Aspects” are the fundamental concept for content modeling in Alfresco. Although aspects were introduced in Documentum with D6, how their use and approach differ in Alfresco is something I will take a deep dive into in my next blog post.

Alfresco Web Scripts bring together the worlds of the content repository and the web. As a Documentum developer, my earlier ways of interacting with the repository were either the DFC APIs or DQL. In Alfresco, Web Scripts provide RESTful access to content within the repository, and we can build our own interface using JavaScript. A custom move operation implemented with Web Scripts, compared with the equivalent Documentum implementation, is one example worth a closer look.

On my last project, we had requirements for the customers to be able to permanently redact Personally Identifiable Information (PII) from existing documents stored in the repository and version the original document upon save. For various reasons, we decided to integrate the 3rd party tool Daeja ViewOne module to provide this capability. I will discuss the topic as part of this blog series.

I started this series based on my experience implementing Alfresco projects, and I invite you to share your own experiences with any part of the road map. Did you run into interesting twists and turns? Did you drive off the road to get some help? I welcome your feedback as the blog takes shape. See you all soon in Take 2.

Documentum-Composer

July 6th, 2010 by athimmavajjala

It was 2004 when I was first introduced to EMC Documentum. When I first fired up the DAB (Documentum Application Builder) IDE, my reaction was: man, this IDE is unintuitive, slow, and cryptic. Apart from learning the DFC APIs in Documentum, getting accustomed to DAB itself was an excruciating experience. Finally, after three years, Documentum Composer came as a savior.

I liked Documentum Composer for 3 main reasons:

1. It has a better interface than DAB and is easy to use, with support for keeping multiple projects open at the same time. You need not check artifacts out and in every time you make a change; the only thing you do is install the project.

2. It combines the features of both DAB and DAI (Documentum Application Installer): you create the DAR and install it into a repository from the same IDE.

3. Composer is built on the Eclipse platform, and the installation includes a bundled Eclipse environment. One of the benefits of the Eclipse platform is that it offers a number of paradigms that are familiar to developers, allowing users to identify issues at the DocApp development level instead of at the installation level.

Composer helps to develop applications faster and easier by reducing the learning curve through user interface consistency and the familiar standards-based Eclipse tooling framework. Composer can be run in “headless Eclipse mode”, which enables command-line scripting and automation of project installation. This feature provides self-contained, fully automated deployment of projects into a repository, without the need for a previous DFC installation or DMCL client.

One thing that I like about Composer is the ability to install into multiple repositories without restarting, assuming they share a connection broker, as well as the ability to work offline. Previously, to install the application or migrate the data model, a user would use DAI to install the DocApp archive into one or more production Docbases, whereas with Composer you can do both tasks in one place without needing DAI to deploy.

A Composer project is both an artifact project, supporting artifact development, and a DFS project, supporting consumer and service DFS development.

Another feature that I like about Composer is the ability to deploy Java methods as a BOF module, without the user having to copy the class files under the /dba/java_methods folder in the Content Server location, and to deploy to the Java method server without having to bring down the Content Server. Also, when the Business Object Framework (BOF) came out in 6.0, DAB was extended to support the creation of modules, which served as “containers” for custom BOF code. However, you still had to have a Java environment to build the JAR files that contained the BOF code. In addition, customizations to WDK/Webtop were not supported in DAB, and the WAR build and deployment process was entirely separate from the DocApp installation process. Composer is an effort by EMC to come up with a single tool that helps developers configure Documentum artifacts, create business objects, build DFS services, and deploy Documentum applications across multiple repositories.

Once you begin using Documentum Composer the benefits are easily evident. To name a few:

  • Developing applications quickly and easily
  • Learning curve is reduced due to the consistency of user interface and familiar standards of Eclipse tooling framework

Personally, I consider Composer a great gift to the Documentum developer community.

So long, farewell, auf Wiedersehen, eRoom

June 11th, 2010 by cstephenson

A simple goal – “export, transform, load” – the destination is a matter of choice.

EMC eRoom is going away. It has been marked as End of Life (EOL), so what next? EMC Documentum offers two options: EMC Documentum Collaboration Services and EMC Documentum CenterStage. Armedia’s immediate goal is to support Collaboration Services, then CenterStage, but why stop there? Why limit a client’s choice?

Armedia’s eRoom migration story is in 3 acts (and yes, I have been listening to some test pieces that I used to play in my brass banding days – check out Year of the Dragon by Philip Sparke).

Act I – The Export

Getting the content out of eRoom into an understandable format. Of course, it’s not just the content; there is a large quantity of metadata in eRoom as well. Act I – The Export deals solely with interrogating eRoom and generating a document detailing everything about eRoom. From communities to files. From eRoom setup to databases – we mean everything. The result: a well-formed XML document.

Act II – The Transformation

As with any classic performance, after the captivating opening, Act II deals with getting to know the characters. In this case, the transformation gets to know the XML document and gains a deep understanding of the objects held within. The transformation is also responsible for generating a secondary XML document, formed to support ingestion into a new Content Management System (CMS) and/or collaboration system. Currently the supported transformation is for EMC Documentum Collaboration Services. Thanks to the flexible architecture of this utility, this can easily be extended; it is simply a case of transforming XML.
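To make “simply a case of transforming XML” concrete, here is a minimal Python sketch of the idea. It is not the migration utility itself: the export format, the element and attribute names, and the target folder are all invented for illustration, since the post does not show the real schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format: element and attribute names are invented for this sketch.
EXPORT_XML = """
<eroom name="Project X">
  <file name="spec.doc" createdBy="jdoe" created="2009-03-01"/>
  <file name="plan.xls" createdBy="asmith" created="2009-04-15"/>
</eroom>
"""


def transform(export_xml, target_folder):
    """Map each exported <file> to a <document> element for a (hypothetical) loader."""
    source = ET.fromstring(export_xml)
    load = ET.Element("load", {"target": target_folder})
    for f in source.findall("file"):
        doc = ET.SubElement(load, "document")
        doc.set("object_name", f.get("name"))
        doc.set("author", f.get("createdBy"))
        doc.set("creation_date", f.get("created"))
        doc.set("folder", target_folder + "/" + source.get("name"))
    return load


load_doc = transform(EXPORT_XML, "/Collaboration")
print(ET.tostring(load_doc, encoding="unicode"))
```

Supporting another destination would then mean writing another transform function over the same export document, which is the kind of extension the paragraph above has in mind.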

Act III – The Load

The closing act is the build up to the dramatic climax which leaves the audience going “WOW!”.  eRoom Migration aims to achieve the same “WOW!”.  Now that the XML has been transformed you can sit back and let the load run automatically.  That’s it.  By using the ingestion engine of Caliente! loading all the content and metadata is simple.  Just let eRoom Migration take care of everything for you.  The only thing it does not do is say “WOW!” – we leave that to you.

Over the next few weeks I plan to talk in more detail about the approach taken and dig deeper into the 3 different pieces of the migration effort.  For those eRoom users, what do you see yourselves using in the near future?

Let there be guitar!

May 27th, 2010 by cstephenson

What happens when you combine Ligero and EMC Documentum and developers have a spare moment? You get:

Rockumentum!

The question was thrown out in casual conversation – “Hey, do you think we could turn Documentum into a Jukebox?”. So what else would you do with your million dollar investment? It turns out this was surprisingly easy to do when Ligero was put in front of Documentum.

A small number of mp3 files were added to our test repository, the URL was put into the browser and in the immortal words of AC/DC – LET THERE BE ROCK!

The audio player associated with the .mp3 extension was launched and within mere milliseconds rock music filled the room. Of course this was coupled with a nice pair of Logitech laptop speakers which provided a surprisingly good sound.

Being the nerds we are, we added a small playlist (.pls) to the repository. This was kept simple and contained purely a list of absolute references to the tunes in the repository. First we gave the playlist url to our audio player and once again we were treated to the sounds of rock music. Awesome dude.
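For the curious, a playlist like this is only a few lines of text. The Python sketch below builds one in the common .pls layout from a list of absolute URLs; the host name and paths are placeholders, since the real URLs depend on how Ligero exposes the repository.

```python
def build_pls(track_urls):
    """Return the text of a .pls playlist listing the given absolute URLs."""
    lines = ["[playlist]"]
    for number, url in enumerate(track_urls, start=1):
        lines.append(f"File{number}={url}")
    lines.append(f"NumberOfEntries={len(track_urls)}")
    lines.append("Version=2")
    return "\n".join(lines) + "\n"


# Hypothetical repository URLs; the real ones depend on how Ligero is configured.
tracks = [
    "http://ligero.example.com/music/let-there-be-rock.mp3",
    "http://ligero.example.com/music/back-in-black.mp3",
]
playlist_text = build_pls(tracks)
print(playlist_text)
```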

So what next? Well, it would be simple enough to develop a few Liglets* for Ligero to build playlists for you. These could return the metadata associated with the tracks. Heck, we could even throw in the Album Art for kicks. There are lots of things we could do.

For more fun, throw in video and take advantage of HTML5. This could lead to .avi and possibly video training, but where is the fun in that?

*Liglets: a fun and novel idea around adding scriptlets to Ligero to assist with rapid development of webpages being served out of Documentum.

Migrating CAD Drawings into Documentum

December 17th, 2009 by rrana

Anyone who has ever attempted a data migration knows that there is no such thing as ‘a smooth transfer of power’, as it were. From learning both the legacy system and the new system, business processes, data clean-up, data mapping, deciding between existing tools and creating new ones, there is a lot to keep track of to ensure that what you have at the end matches what you started with.

Migrating CAD engineering drawings poses its own set of unique challenges. AutoCAD and MicroStation drawings can internally have references to other drawing files that exist within the content management system (or file system). When moving these files over to a new system, care must be taken to ensure these references are maintained.

Sword CADtop is a tool that provides users access to CAD drawings that are stored in a Documentum repository directly from within AutoCAD or MicroStation. It gives the user the ability to check in/out drawings, browse, search, view documents and attach reference drawings that exist in the docbase. CADtop maintains reference information by updating the links within the drawing and storing this information in a registered table.

CADtop also provides an import tool that can be used to migrate documents into the Documentum system. It handles importing of documents of multiple types, such as .pdfs, .docs, .tifs, etc. along with the drawing formats, .dwg and .dgn. It is a very simple to use command line tool, but it has its limitations. It can only handle importing up to around 5000 files at a time. However, this can be overcome by creating a batch script to automate the copying of sets of files to a temporary directory and running the import tool.
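The batching workaround can be sketched in a few lines of Python. This is an illustration under assumptions, not a CADtop script: how the import tool is invoked, and how it is pointed at the staging directory, is not shown in this post, so import_command below is a placeholder argument list with the staging directory appended to it.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

BATCH_SIZE = 5000  # the approximate per-run limit mentioned above


def batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_in_batches(source_dir, import_command, size=BATCH_SIZE):
    """Copy each batch to a fresh temp directory and run the import tool on it.

    `import_command` is a hypothetical argument list; the staging directory
    is appended as its last argument.
    """
    files = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    for batch in batches(files, size):
        with tempfile.TemporaryDirectory() as staging:
            for path in batch:
                shutil.copy2(path, staging)
            subprocess.run(list(import_command) + [staging], check=True)


print([len(b) for b in batches(list(range(12001)))])  # [5000, 5000, 2001]
```

Using a fresh staging directory per batch keeps each run of the tool under the limit and leaves the source directory untouched.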

The import tool uses an XML configuration file to determine the object type, the folder path to import from, and the cabinet/folder to import to. However, those are the only properties you can set. The import tool has no way to attach metadata from a database to the content files being imported. A way around this is to make sure each file has a unique name, import the files, and then use a DFC script to update the properties of all the objects in the docbase from an XML file (for example). This works well as long as all the filenames are unique in the first place, or the files are not drawings with references. If filenames are not unique, you would have to rename the files, which means the filenames will no longer match the drawing’s internal reference links.

The import tool provides you with a switch that lets you run it in attach reference mode. This allows you to pass an Excel file which consists of a table of parent object IDs, child object IDs and reference filenames, so that you can resolve references within drawing files after they’ve been imported into the docbase. That in turn allows you to use a tool such as Caliente (Shameless Plug™) to import the drawings into Documentum and update the properties, then create the Excel file with all the references and run the CADtop import tool to attach them.
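Building that table of references is mostly a lookup exercise. The Python sketch below shows one way to assemble the rows once the drawings are in the docbase; the object IDs, filenames and column headings are made up, and it writes CSV text rather than a real Excel file, so treat the exact layout CADtop expects as an open assumption.

```python
import csv
import io


def build_reference_rows(object_ids, references):
    """Return (parent_id, child_id, reference_filename) rows.

    object_ids: filename -> repository object ID (known after import)
    references: parent filename -> list of referenced filenames
    References to files that were not imported are skipped.
    """
    rows = []
    for parent_name, child_names in references.items():
        for child_name in child_names:
            if parent_name in object_ids and child_name in object_ids:
                rows.append((object_ids[parent_name], object_ids[child_name], child_name))
    return rows


# Made-up object IDs and filenames, for illustration only.
object_ids = {
    "site-plan.dwg": "0900000180001a01",
    "grid.dwg": "0900000180001a02",
    "title-block.dwg": "0900000180001a03",
}
references = {"site-plan.dwg": ["grid.dwg", "title-block.dwg", "missing.dwg"]}

rows = build_reference_rows(object_ids, references)
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["parent_id", "child_id", "reference_filename"])
writer.writerows(rows)
print(out.getvalue())
```

In a real migration you would probably want to log the skipped references instead of dropping them silently, since each one is a link that will be missing when the drawing is opened.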

The end result will be that when you open one of these drawings through AutoCAD, CADtop will automatically pull out all the references from the Documentum repository and display them, just as in the legacy system.

High Volume Server (Part 2)

November 16th, 2009 by dselvakumar

HVS (Part 2) Data Partitioning

Before we delve into Data Partitioning, here’s a review of some fundamental database concepts. In terms of this article the focus will be on Oracle as the database.

Fundamental Database (ORACLE) concepts:
A Table:
(i) Stores structured data
(ii) is a database object housed in a segment.
A normal table (i.e. non-partitioned) is exactly one segment. A partitioned table will be made up of as many segments as it has partitions.

A Tablespace:
(i) is a logical container for segments
(ii) may be empty but it will most likely hold one or more segments
(iii) may be made up of one or more data files.

A Data file:
(i) is the physical operating system (OS) disk file that stores data. All data in an Oracle database ends up in a data file that is part of a file system configured by the OS or as a raw device managed by Oracle.

A Data Block: is the unit in which Oracle manages the storage space in the data files of a database.

A Segment: is made up of extents and is the logical container for an object in an Oracle database

An Extent: is a set of contiguous data blocks

Data Partitioning (as it relates to the HVS)
Data Partitioning, in a nutshell, decomposes a database object (an index or a table) into smaller, more manageable pieces called partitions. The goal of data partitioning is to reduce the amount of data read for a particular SQL operation so that the overall response time is reduced. The data is organized using a partition key, which is essentially a set of columns that determines in which partition a given row will reside. It is important to note that the underlying database must also be partition enabled (which Oracle has been for quite a while). HVS uses i_partition as the partition key and does range partitioning, i.e., each partition holds the rows whose partition key (i_partition) falls within that partition’s range of values. When using HVS, a SysObject, all its associated content objects, any local ACLs referenced by the SysObject, and so on, are assigned to the partition designated for the SysObject. For a lightweight SysObject (LWSO), if i_partition is NOT explicitly set, it will by default be in the same partition as its parent, thus sharing the same i_partition value. Partition pruning is essentially directing a query to a subset of partitions rather than the entire table.

Data Partitioning leads to improved manageability (storage of data files on different physical drives), improved availability (partition independence) and optimized queries (using partition pruning).
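Range partitioning and partition pruning can be illustrated without a database. The Python sketch below is a toy model, not how Oracle or HVS implements it: the range boundaries are invented, and each partition is just an in-memory list. It shows rows being routed by i_partition and a query that only scans the one partition that can hold the requested key.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of each range partition, keyed on i_partition.
# The boundary values are invented for illustration.
UPPER_BOUNDS = [10, 20, 30]


def partition_for(i_partition):
    """Return the index of the range partition that holds this key value."""
    index = bisect_right(UPPER_BOUNDS, i_partition)
    if index >= len(UPPER_BOUNDS):
        raise ValueError(f"i_partition {i_partition} is outside every range")
    return index


def build_partitions(rows):
    """Distribute rows (dicts with an 'i_partition' key) into their partitions."""
    partitions = [[] for _ in UPPER_BOUNDS]
    for row in rows:
        partitions[partition_for(row["i_partition"])].append(row)
    return partitions


def query(partitions, i_partition):
    """Partition pruning: scan only the one partition that can contain the key."""
    candidates = partitions[partition_for(i_partition)]
    return [row for row in candidates if row["i_partition"] == i_partition]


rows = [{"name": f"doc{n}", "i_partition": n % 30} for n in range(300)]
partitions = build_partitions(rows)
hits = query(partitions, 12)
print(len(partitions[1]), len(hits))  # 100 10
```

Here the query for i_partition 12 reads 100 rows instead of 300, which is the response-time benefit that pruning is after.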

To D6 or "Deep 6"

October 16th, 2009 by bhunton

Most of us in IT can think of many things we would rather do than upgrade systems and software; for example, take a nice trip to the dentist, or perhaps volunteer as the test subject in an audit training class for IRS agents.

If you finally have your Documentum 5.3 environment rolling along with good performance, then you probably don’t want to think about an upgrade, even though your current installation has been out of normal EMC support for a year. You may find yourself in Hamlet’s shoes, asking yourself whether to “Deep 6” the upgrade or to “take arms against a sea of trouble” and go for it. Wow! That really ravages the Soliloquy but you get my meaning.

I have been technically involved in Documentum upgrades since version 2. None have been trivial. EDMS 98 to 4i was tough because of the major changes in architecture. In 4i to 5, you had new WDK based clients, and the classes moved from dmcl to DFC. Web services were introduced. EBS still worked but it was in the gray area of support. What has happened to your custom Documentum applications? Did you have to build an interop so dmcl could still work in the new DFC world? Did you rewrite them? Maybe now is the time.

Fulltext Indexing switched from Verity to FAST. If you had millions of indexed content objects in Verity, you had to make a decision whether to blow the index away and start completely over, or to migrate it. Is your migration still running? Did you choose the mode unwisely? Oh, you use NAS instead of SAN? Ouch!

RightSite 4i went away too, leaving EMC customers with tens of thousands of broken anonymous links to repository content that required tedious labor to convert. Some of them have purchased Armedia’s Ligero which provides an easy solution to the problem of converting legacy RightSite links, as well as providing a very nice web content publishing/caching application.

The beat goes on, and we are well into the D6 product life cycle. The upgrade from Documentum 5 to 6 presents major challenges even in simple installations. Significant changes were introduced with version 5 that have come to fruition in D6. If you have a relatively simple Documentum installation, under version 5 you could probably ignore the global registry, ACS and BOF, even DFC properties up to a point. In D6 you have to pay close attention to them, because they are fully integrated. Their configuration can greatly affect your performance.

Then there is the move away from the simplicity of Tomcat, to WebLogic, to JBoss for the method server and ACS. How’s your heap? Then there are the new Documentum Archive (DAR) installations that replace the Docapp Installer. Then there is UCF, and, and…

And if all that is not enough to make your head spin like Linda Blair’s, there’s the sheer mass of the new installation binaries that must be downloaded from EMC then heaved to your host server. Do you have the additional storage and database space to handle it? Did I tell you that you need to go through repository sizing estimates again? Did I tell you that when it comes time to do the migration and upgrades you need to build significantly more time into your schedule to complete them?

Your time is up! Documentum 6 is well beyond first ship date. It provides great features, functions, and flexibility. It is a suite of over 100 products and features designed for medium to very large enterprises. You need experienced architects, engineers, and integrators to design and build your system and to plan your upgrade. You need Armedia to get you through it.

Copyright © 2002–2011, Armedia. All Rights Reserved.