Enterprise Content Management

Armedia Blog

Archive for the ‘Data Migration’ Category

Modifying SharePoint Document Content Types and Libraries Using the Client Model

September 22nd, 2011 by Tim Lisko

In a current eRoom to SharePoint migration project I wanted to preserve the “Date Created”, “Date Modified”, “Edited By”, and “Created By” fields in the eRoom documents. To do this I created a custom content type (in SharePoint 2010) based on the standard “Document” content type with four new fields to accept the migrated information. I also created a library template that uses this content type as well as the standard “Document” content type. I’ll explain why later in this article.

Once the documents are migrated I need to update the migrated list/library items. What I don’t want is to keep one set of “preserved” fields and another set of SharePoint fields. As you probably guessed, SharePoint sets its own author and editor fields to the individual doing the migration (or the impersonated member) and its created and modified dates to the date of migration.

Updating list items is a pretty easy thing to do in SharePoint’s Client Object Model. The following code accomplishes this task.

foreach (SP.ListItem item in lstItems)
{
    // process only items assigned to the migratedDoc content type
    // (assumes item.ContentType and migType were loaded beforehand)
    if (item.ContentType.Name == migType.Name)
    {
        row = listItems_dtbl.NewRow();
        row["List"] = list.Title;
        // the client model returns null (not DBNull) for empty fields
        title = (item["Title"] != null ? item["Title"].ToString() : "");
        if (title == "" && item["FileLeafRef"].ToString() != "")
        {   // set title from the file name if one does not exist
            title = System.IO.Path.GetFileNameWithoutExtension(item["FileLeafRef"].ToString());
            item["Title"] = title;
        }

        sEditor = (item["eEditor"] != null ? item["eEditor"].ToString() : "");
        sAuthor = (item["eAuthor"] != null ? item["eAuthor"].ToString() : "");

        try
        {
            // resolve the migrated editor; fails if the user is unknown
            editor = ctx.Web.EnsureUser(sEditor);
            ctx.ExecuteQuery();
        }
        catch
        {
            // fall back to a known account; resolved on the next ExecuteQuery
            editor = ctx.Web.EnsureUser("tlisko");
        }
        try
        {
            author = ctx.Web.EnsureUser(sAuthor);
            ctx.ExecuteQuery();
        }
        catch
        {
            author = ctx.Web.EnsureUser("tlisko");
        }
        item["Editor"] = editor;
        item["Author"] = author;
        item["Modified"] = item["eModified"];
        item["Created"] = item["eCreated"];

        // remove values
        item["eEditor"] = null;
        item["eAuthor"] = null;
        item["eCreated"] = null;
        item["eModified"] = null;

        // reassign the item to the standard "Document" content type
        item["ContentTypeId"] = docType.Id;
        item.Update();
    }
}
try
{
    ctx.ExecuteQuery();   // send all queued item updates
}
catch (Exception ex)
{
    // the original handler is not shown; log and rethrow as a placeholder
    Console.WriteLine("Update failed: " + ex.Message);
    throw;
}

“ctx” is my ClientContext, though that is probably self-evident. You may also note that I only process items whose content type is my custom content type “migratedDoc” which is instantiated as “migType.” Only the migrated documents will have this content type.

eRoom documents do not have Titles, so I set a title based on the file name. You can leave Title blank – it isn’t a required field. Continuing through the code you will notice that I check that the names of the editor and author exist in the LDAP (ctx.Web.EnsureUser(sAuthor)). The update of the list/library item will fail if you try to use an editor or author that doesn’t exist.
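The title fallback described above (use the file name, minus its extension, when no title exists) can be sketched on its own, outside SharePoint. This is a minimal illustration in Python rather than the C# used in this post; the function name `title_for` is hypothetical. It splits off only the final extension, so a name that happens to contain the extension text elsewhere is left intact.

```python
import os
from typing import Optional


def title_for(title: Optional[str], file_name: str) -> str:
    """Return the existing title, or the file name without its extension."""
    if title and title.strip():
        return title
    # splitext removes only the last ".ext" suffix, including the dot
    stem, _extension = os.path.splitext(file_name)
    return stem


print(title_for(None, "Budget 2011.xlsx"))
print(title_for("Plan", "plan-final.doc"))
```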

I don’t really want to have a custom document library – the SharePoint default document library is just fine. So to get the result I want I need to reassign all the migrated items to the standard “Document” type. A list item’s content type has a Name property. Unfortunately you can’t change it – it is read only. What you can change is the ContentTypeId, and changing that value changes the content type of the list item. In the code above “docType” is a local instance of the “Document” content type.

I said at the beginning of this article that I would explain why I wanted to keep the “Document” content type in addition to creating a custom content type. The reason is that you can’t assign a list item to a content type if that content type isn’t in the list. That’s also why you can’t move a document from one SharePoint library to another unless its content type is in the target library as well.

Next I want to remove my custom content type from the list altogether. But to do that, I have to make sure that all those extra fields that are in the custom content type are set to null – which I do before changing the content type. The order of resetting these fields and reassigning the content type doesn’t matter. What does matter is that if these extra fields are not set to null, you will be unable to remove the content type from the list.

After processing all the list items I can now remove the custom content type from the list. Easy, two lines of code:

migType.DeleteObject();
ctx.ExecuteQuery();

But that still doesn’t remove the custom fields from the list. More code is needed. Combined with the deletion of the custom content type, you have this:

migType.DeleteObject();
list.Fields.GetByTitle("eCreated").DeleteObject();
list.Fields.GetByTitle("eModified").DeleteObject();
list.Fields.GetByTitle("eAuthor").DeleteObject();
list.Fields.GetByTitle("eEditor").DeleteObject();
ctx.ExecuteQuery();

And that’s it! The library looks like the default document library with only the “Document” content type as desired.

I'm in eRoom, Get Me Out of Here!

July 27th, 2011 by cstephenson

[turns up amplifier, cranks up volume to 11, puts Huey Lewis and the News vinyl on record player....]
I start up Visual Studio 6.0 and settle down to some nostalgic programming with Visual Basic (VB) 6.  Ah, college memories come flooding back and strangely not many related to actual study.  I feel like I have stepped back in time – the processor has worked its speed up to 88 mph and now I am back using VB6.  I also remember learning Delphi Version 1.0 so that I could *help* a friend do their final year project. Good times.

So long, farewell, auf Wiedersehen, eRoom

June 11th, 2010 by cstephenson

A simple goal – “export, transform, load” – the destination is a matter of choice.

EMC eRoom is going away. It has been marked as End of Life (EOL), so what next? EMC Documentum offers two options: EMC Documentum Collaboration Services and EMC Documentum Centerstage. Armedia’s immediate goal is to support Collaboration Services, then Centerstage, but why stop there? Why limit a client’s choice?

Armedia’s eRoom migration story is in 3 acts (and yes, I have been listening to some test pieces that I used to play in my brass banding days – check out Year of the Dragon by Philip Sparke).

Act I – The Export

Getting the content out of eRoom into an understandable format. Of course, it’s not just the content; there is a large quantity of metadata in eRoom as well. Act I – The Export deals solely with interrogating eRoom and generating a document detailing everything about eRoom. From communities to files. From eRoom setup to databases – we mean everything. The result: a well-formed XML document.

Act II – The Transformation

As with any classic performance, after the captivating opening, Act II deals with getting to know the characters. In this case, the transformation gets to know the XML document and gains a deep understanding of the objects held within. The transformation is also responsible for generating a secondary XML document, which is formed to support ingestion into a new Content Management System (CMS) and/or collaboration system. Currently the supported transformation is for EMC Documentum Collaboration Services. Because of the utility’s flexible architecture this can easily be extended; it is simply a case of transforming XML.
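To make “simply a case of transforming XML” concrete, here is a minimal Python sketch of reading one XML document and emitting a second one shaped for a loader. Every element and attribute name in it (`eroom`, `file`, `load`, `document`, `property`, `creator`, `created`) is a hypothetical placeholder; the real export and load formats are not described in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical attributes to carry over from each exported file entry.
CARRIED_ATTRIBUTES = ("creator", "created")


def transform(export_xml: str) -> str:
    """Turn a (hypothetical) export document into a (hypothetical) load document."""
    source = ET.fromstring(export_xml)
    target = ET.Element("load")
    # Walk every <file> element, however deeply it is nested in the export.
    for item in source.iter("file"):
        doc = ET.SubElement(target, "document", name=item.get("name", ""))
        for attr in CARRIED_ATTRIBUTES:
            value = item.get(attr)
            if value is not None:
                ET.SubElement(doc, "property", name=attr).text = value
    return ET.tostring(target, encoding="unicode")


sample = '<eroom><room name="r1"><file name="a.doc" creator="tim"/></room></eroom>'
print(transform(sample))
```

Supporting another destination would then mean writing another function of the same shape against the same export document.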

Act III – The Load

The closing act is the build up to the dramatic climax which leaves the audience going “WOW!”.  eRoom Migration aims to achieve the same “WOW!”.  Now that the XML has been transformed you can sit back and let the load run automatically.  That’s it.  By using the ingestion engine of Caliente! loading all the content and metadata is simple.  Just let eRoom Migration take care of everything for you.  The only thing it does not do is say “WOW!” – we leave that to you.

Over the next few weeks I plan to talk in more detail about the approach taken and dig deeper into the 3 different pieces of the migration effort.  For those eRoom users, what do you see yourselves using in the near future?

Migrating CAD Drawings into Documentum

December 17th, 2009 by rrana

Anyone who has ever attempted a data migration knows that there is no such thing as ‘a smooth transfer of power’, as it were. Between learning both the legacy system and the new system, understanding business processes, data clean-up, data mapping, and deciding between existing tools and creating new ones, there is a lot to keep track of to ensure that what you have at the end matches what you started with.

Migrating CAD engineering drawings poses its own set of unique challenges. AutoCAD and MicroStation drawings can internally have references to other drawing files that exist within the content management system (or file system). When moving these files over to a new system, care must be taken to ensure these references are maintained.

Sword CADtop is a tool that provides users access to CAD drawings that are stored in a Documentum repository directly from within AutoCAD or MicroStation. It gives the user the ability to check in/out drawings, browse, search, view documents and attach reference drawings that exist in the docbase. CADtop maintains reference information by updating the links within the drawing and storing this information in a registered table.

CADtop also provides an import tool that can be used to migrate documents into the Documentum system. It handles importing of documents of multiple types, such as .pdfs, .docs, .tifs, etc. along with the drawing formats, .dwg and .dgn. It is a very simple to use command line tool, but it has its limitations. It can only handle importing up to around 5000 files at a time. However, this can be overcome by creating a batch script to automate the copying of sets of files to a temporary directory and running the import tool.
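The batching workaround can be sketched as follows, in Python rather than a batch script. This is only an illustration under assumptions: the 5000 figure is the approximate limit mentioned above, and `run_import` is a hypothetical hook in which you would invoke the actual import command line, whose syntax is not shown here.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Sequence

BATCH_LIMIT = 5000  # approximate per-run limit quoted above


def batches(files: Sequence[Path], size: int = BATCH_LIMIT) -> Iterator[List[Path]]:
    """Yield consecutive slices of `files`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(files), size):
        yield list(files[start:start + size])


def import_in_batches(source_dir: Path,
                      run_import: Callable[[Path], None],
                      size: int = BATCH_LIMIT) -> int:
    """Stage files batch by batch in a temporary directory and import each batch.

    run_import is a hypothetical hook that receives the staging directory.
    Returns the number of batches processed.
    """
    files = sorted(p for p in source_dir.iterdir() if p.is_file())
    count = 0
    for batch in batches(files, size):
        # The staging directory is deleted once the batch has been imported.
        with tempfile.TemporaryDirectory() as staging:
            for path in batch:
                shutil.copy2(path, Path(staging) / path.name)
            run_import(Path(staging))
        count += 1
    return count
```

Note that this sketch splits the files alphabetically and makes no attempt to keep a drawing and the files it references in the same batch.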

The import tool uses an XML configuration file to determine the object type, the folder path to import from and the cabinet/folder to import to. However, those are the only properties you can set. The import tool has no way to attach metadata from a database to the content files being imported. A way around this is to make sure each file has a unique name, import the files, and then use a DFC script to update the properties of all the objects in the docbase from an XML file (for example). This works well as long as all the filenames are unique in the first place, or the files are not drawings with references. If filenames are not unique, then you would have to rename the files, which means the filenames will no longer match the drawing’s internal reference links.
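Because this workaround depends on unique filenames, it is worth checking for clashes before importing anything. A minimal Python sketch, assuming the files sit in a directory tree on disk; the function name is hypothetical, and comparing names case-insensitively is an assumption you may want to change.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def find_duplicate_names(root: Path) -> Dict[str, List[Path]]:
    """Map each file name that occurs more than once under root to its paths.

    Names are compared case-insensitively, which is an assumption; drop the
    lower() call if case should be treated as significant.
    """
    by_name: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_name[path.name.lower()].append(path)
    # Keep only the names that were seen in more than one place.
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

An empty result means every filename is unique and can safely be used as the key for the later property update.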

The import tool provides you with a switch that lets you run it in attach reference mode. This allows you to pass an Excel file which consists of a table of parent object IDs, child object IDs and reference filenames, so that you can resolve references within drawing files after they’ve been imported into the docbase. This means you can use a tool such as Caliente (Shameless Plug™) to import the drawings into Documentum and update the properties, then create the Excel file with all the references and run the CADtop import tool to attach them.

The end result will be that when you open one of these drawings through AutoCAD, CADtop will automatically pull out all the references from the Documentum repository and display them, just as in the legacy system.

Copyright © 2002–2011, Armedia. All Rights Reserved.