Enterprise Content Management

Armedia Blog

Archive for the ‘Software Development’ Category

The CRASH Report

January 10th, 2012 by Scott Roth

Cast Software, a maker of software quality tools, released its second annual CRASH (Cast Report on Application Software Health) report in December. The report assessed the “health” of software applications worldwide by examining the source code of 745 applications (~365 million lines of code) from 160 different companies, spanning 10 industry sectors and 8 programming languages. The code examination flagged 1800 different types of development and architecture violations that compromise application “health” in 5 major categories. The categories were:

  • Robustness – The stability of an application and the likelihood of introducing defects when modifying it.
  • Performance – The efficiency of the software’s application layers.
  • Security – An application’s ability to prevent unauthorized intrusions.
  • Transferability – The ease with which an application can be transferred to a new maintenance team.
  • Changeability – An application’s ability to be easily and quickly modified.

Cast has drawn some interesting conclusions in their report. Here are a few I found notable.

  • The most secure applications seem to be large COBOL applications in the financial and insurance industries (that should be reassuring to everyone). The least secure applications were written in .NET. Yikes!
  • J2EE applications scored worst in performance, primarily because of misunderstood technologies and frameworks.  Another contributing reason could be the high degree of modularity inherent in J2EE applications.
  • Transferability scores for applications in the government sector were lower than in any other industry sector. Being a government contractor, this one strikes close to home. What conclusions or insights can be gleaned from this finding? One insight the report draws is that government agencies are spending ~73% (on average) of their IT budgets to maintain existing applications, more than any other industry sector. I ask you, where is the money in government IT contracting?
  • Transferability and Changeability scores were highest for applications developed using a classic waterfall style methodology, as opposed to an Agile methodology. Whoa! I didn’t see that one coming. (Cast found that the other three categories, Robustness, Performance and Security, were about equal between waterfall and Agile methodologies.) Perhaps it is because Agile projects are in a continual state of refactoring that they are never in an ideal state to be transferred.

All of the deficiencies identified in this report are termed “technical debt” and have an average cost (according to Cast) of $3.61 per line of code to repair — except for Java, which rings in at $5.42 per line of code. That’s a lot of money and consumes an enormous amount of IT budgets.  For the roughly 365 million lines of code used in the study, that’s $1.3 billion of technical debt.
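The arithmetic behind that total is easy to check. A minimal sketch in JavaScript, using only the two figures quoted above; the function name `technicalDebt` is made up for illustration and does not come from the report:

```javascript
// Estimated technical debt = lines of code x average repair cost per line.
function technicalDebt(linesOfCode, costPerLine) {
  return linesOfCode * costPerLine;
}

const studyLines = 365e6;   // ~365 million lines examined in the study
const averageCost = 3.61;   // average repair cost per line, in US dollars

const debt = technicalDebt(studyLines, averageCost);
console.log("$" + (debt / 1e9).toFixed(2) + " billion"); // prints "$1.32 billion"
```

The result rounds to the roughly $1.3 billion quoted in the post.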

In conclusion, let me quote from Cast’s own conclusion, which I think says it quite well: “The observations from these data suggest that development organizations are focused most heavily on Performance and Security in certain critical applications. Less attention appears to be focused on removing the Transferability and Changeability problems that increase the cost of ownership and reduce responsiveness to business needs. These results suggest that application developers are still mostly in reaction mode to the business rather than being proactive in addressing the long term causes of IT costs and geriatric applications.”

The 22-page executive summary can be downloaded from Cast here.
http://www.castsoftware.com/resources/resource/email-campaigns/cast-report-on-application-software-health?gad=HPH

The Fast/Good/Cheap Rule of Software Development

September 30th, 2011 by Scott Roth

Triangles have been a staple of mathematics, architecture and engineering for centuries. They have also become important in software development by way of a project management concept. You may have heard of the “Fast/Good/Cheap” rule. This rule uses a triangle to depict and constrain the attributes of a project that are usually in conflict during development: schedule, quality and budget. Ideally, a project should maximize speed (Fast), quality (Good), and efficiency (Cheap).

The rule of “Fast/Good/Cheap” states that a customer can choose only two of these attributes to maximize. The third, unchosen attribute, will naturally suffer. This is a physical constraint of a triangle, and a realistic constraint of a project.

If you consider the area of a triangle to be constant and represent the scope, or requirements of your software project, you can create numerous triangles to model the “Fast/Good/Cheap” rule. For example:

Remember, the area of the triangle (scope of the project) must remain constant. Therefore, a change to any one side of the triangle necessarily affects the other two. The implications are these:

  • Good & Fast = Expensive. High quality software will be produced very quickly to meet a customer’s time constraint. However, this approach will require additional staff, extended work hours, additional testing, etc. — all things that will drive up the cost of the project.
  • Good & Cheap = Slow. High quality software will be produced; however the project tempo will be slow. Other projects will take priority over this project and interrupt its schedule frequently.
  • Fast & Cheap = Poor Quality. The project will be done quickly and on a shoe string budget, and you will get what you pay for. Don’t expect all of the requirements to be met and expect some bugs and unpredictable behavior because the development team didn’t have the time or resources to thoroughly design or test the software.

In reality, everyone wants Good & Fast, but can’t afford it, so they settle for Fast & Cheap. Then one of two things happens:

  1.  Inferior, non-compliant, buggy software is delivered that the customer is unhappy with and your company gets a bad reputation for poor quality.
  2. You and your staff absorb all the cost and overtime required to produce quality software on an unrealistic schedule with an insufficient budget (because you are perfectionists). You are lauded as heroes, but the project actually cost you money instead of making you money.

Ideally, you strive to strike a balance among these competing constraints, although some projects will justifiably favor one attribute over another.

All complex systems are difficult to model, and the software development process is no exception. This very simple model can be used to easily and quickly demonstrate to customers how changes to any of a project’s constraints (Fast/Good/Cheap) will affect the other constraints of the project. I have used this model numerous times to demonstrate how these constraints relate and the consequences of changing or overemphasizing one of them. A picture is worth a thousand words, and customers usually get the “Fast/Good/Cheap” rule immediately once you draw the triangle. And best of all, beyond its obviousness, there is a lot of solid project management research and data to validate it. I will leave that reading to you.

Modifying SharePoint Document Content Types and Libraries Using the Client Model

September 22nd, 2011 by Tim Lisko

In a current eRoom to SharePoint migration project I wanted to preserve the “Date Created”, “Date Modified”, “Edited By”, and “Created By” fields in the eRoom documents. To do this I created a custom content type (in SharePoint 2010) based on the standard “Document” content type with four new fields to accept the migrated information. I also created a library template that uses this content type as well as the standard “Document” content type. I’ll explain why later in this article.

Once the documents are migrated I need to update the migrated list/library items. What I don’t want is to keep one set of “preserved” fields and another set of SharePoint fields. The SharePoint fields, as you probably guessed, record the individual doing the migration (or the impersonated member) as the author and editor, and the date of migration as the created and modified dates.

Updating list items is a pretty easy thing to do in SharePoint’s Client Object Model. The following code accomplishes this task.

foreach (SP.ListItem item in lstItems)
{
    //skip items not assigned to the migratedDoc document type
    if (item.ContentType.ToString() == migType.ToString())
    {
        row = listItems_dtbl.NewRow();
        row["List"] = list.Title;
        title = (!Convert.IsDBNull(item["Title"]) ? item["Title"].ToString() : "");
        if (title == "" && item["FileLeafRef"].ToString() != "")
        {   //set title if one does not exist
            title = item["FileLeafRef"].ToString();
            title = title.Replace(item["File_x0020_Type"].ToString(), "");
        }

        //names preserved from eRoom in the custom fields
        sEditor = (!Convert.IsDBNull(item["eEditor"]) ? item["eEditor"].ToString() : "");
        sAuthor = (!Convert.IsDBNull(item["eAuthor"]) ? item["eAuthor"].ToString() : "");

        //make sure the editor exists; fall back to a known account if not
        try
        {
            editor = ctx.Web.EnsureUser(sEditor);
            ctx.ExecuteQuery();
        }
        catch
        {
            editor = ctx.Web.EnsureUser("tlisko");
        }
        //same check for the author
        try
        {
            author = ctx.Web.EnsureUser(sAuthor);
            ctx.ExecuteQuery();
        }
        catch
        {
            author = ctx.Web.EnsureUser("tlisko");
        }
        //copy the preserved values into the SharePoint fields
        item["Editor"] = editor;
        item["Author"] = author;
        item["Modified"] = item["eModified"];
        item["Created"] = item["eCreated"];

        //remove values
        item["eEditor"] = null;
        item["eAuthor"] = null;
        item["eCreated"] = null;
        item["eModified"] = null;

        //reassign the item to the standard "Document" content type
        item["ContentTypeId"] = docType.Id;
        item.Update();
    }
}
try
{
    //send the queued item updates to the server
    ctx.ExecuteQuery();
}
catch
{
    //the original listing stops at the try block; handle or log the failure here as appropriate
    throw;
}

“ctx” is my ClientContext, though that is probably self-evident. You may also note that I only process items whose content type is my custom content type “migratedDoc” which is instantiated as “migType.” Only the migrated documents will have this content type.

eRoom documents do not have Titles, so I set a title based on the file name. You can leave Title blank – it isn’t a required field. Continuing through the code you will notice that I check that the names of the editor and author exist in the LDAP (ctx.Web.EnsureUser(sAuthor)). The update of the list/library item will fail if you try to use an editor or author that doesn’t exist.
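One detail of the title fallback is worth a second look. `Replace` removes every occurrence of the file-type string and, assuming `File_x0020_Type` holds the bare extension (for example "docx"), leaves the trailing dot behind. A minimal sketch of a stricter rule, written in JavaScript for brevity; `titleFromFileName` is a hypothetical helper and not part of the SharePoint API:

```javascript
// Derive a title from a file name by stripping only the final extension.
function titleFromFileName(fileName) {
  const lastDot = fileName.lastIndexOf(".");
  // No dot at all, or only a leading dot (".gitignore"): keep the name as is.
  if (lastDot <= 0) {
    return fileName;
  }
  return fileName.slice(0, lastDot);
}

console.log(titleFromFileName("docx-notes.docx"));  // prints "docx-notes"
console.log(titleFromFileName("Budget.2011.xlsx")); // prints "Budget.2011"
```

The same rule could be expressed in the C# loop with a last-index lookup on the file name; whether the extra care is needed depends on how the migrated files are named.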

I don’t really want to have a custom document library – the SharePoint default document library is just fine. So to get the result I want, I need to reassign all the migrated items to the standard “Document” type. List item content types have a Name property. Unfortunately you can’t change it – it is read-only. What you can change is the ContentTypeId, and changing that value changes the content type of the list item. In the code above, “docType” is a local instance of the “Document” content type.

I said at the beginning of this article that I would explain why I wanted to keep the “Document” content type in addition to creating a custom content type. The reason is that you can’t assign a list item to a content type if that content type isn’t in the list. That’s also why you can’t move a document from one SharePoint library to another unless its content type is in the target library as well.

Next I want to remove my custom content type from the list altogether. But to do that, I have to make sure that all those extra fields that are in the custom content type are set to null – which I do before changing the content type. The order of resetting these fields and reassigning the content type doesn’t matter. What does matter is that if these extra fields are not set to null, you will be unable to remove the content type from the list.

After processing all the list items I can now remove the custom content type from the list. Easy, two lines of code:

migType.DeleteObject();
ctx.ExecuteQuery();

But that still doesn’t remove the custom fields from the list. More code is needed. Incorporating the deletion of the custom content type, you have this:

migType.DeleteObject();
list.Fields.GetByTitle("eCreated").DeleteObject();
list.Fields.GetByTitle("eModified").DeleteObject();
list.Fields.GetByTitle("eAuthor").DeleteObject();
list.Fields.GetByTitle("eEditor").DeleteObject();
ctx.ExecuteQuery();

And that’s it! The library looks like the default document library with only the “Document” content type as desired.

Implementing Multiple Filters in LINQ Query from Visual Studio

September 22nd, 2011 by Tim Lisko

A best practice in any application making data calls is to push processing to the server and limit the amount of data that has to come back to the application.

I recently worked on a Windows application that uses the Microsoft SharePoint Client Object Model to manipulate lists and their elements. I needed to filter the lists for processing in the application to just the document libraries that were not hidden, contained at least one item, and were not the “Site Assets” or “Style Library.”

Clearly, this is a situation where filtering is desired. I found it easy to find examples for generating a list with one filter. For example:

var web = ctx.Web;
SP.ListCollection listcoll = web.Lists;

ctx.Load(listcoll,
lists => lists.Include(list => list.Title)
.Where(list => list.BaseTemplate == 101));
ctx.ExecuteQuery();

However, examples on implementing multiple filters such as I needed proved much more challenging. I finally found an example where the implementation used many “.Where” clauses which led to the following attempt.

...
ctx.Load(listcoll,
lists => lists.Include(list => list.Title)
.Where(list => list.BaseTemplate == 101)
.Where(list => list.ItemCount > 0)
.Where(list => list.Title != "Site Assets")
.Where(list => list.Title != "Style Library")
.Where(list => list.Hidden == false));

Unfortunately this would not even compile in Visual Studio 2010.

I decided to try implementing a more common syntax used in ‘if’ statements – using ‘&&’ to combine multiple conditions but within one .Where.

...
ctx.Load(listcoll,
lists => lists.Include(list => list.Title, list => list.ItemCount)
.Where(list => list.BaseTemplate == 101 &&
list.ItemCount > 0 &&
list.Title != "Site Assets" &&
list.Title != "Style Library" &&
list.Hidden == false));

YES – Success! Finally, listcoll had the libraries I needed.
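To see the combined condition in isolation, here is the same predicate applied to plain in-memory objects in JavaScript. This is only an illustration of the logic: the sample lists are invented, the property names are simplified, and in the real application the filtering still happens on the server through the query above.

```javascript
const DOCUMENT_LIBRARY = 101; // BaseTemplate value used in the query above
const EXCLUDED_TITLES = ["Site Assets", "Style Library"];

// One predicate that combines every condition with &&.
function isLibraryToProcess(list) {
  return list.baseTemplate === DOCUMENT_LIBRARY &&
    list.itemCount > 0 &&
    !EXCLUDED_TITLES.includes(list.title) &&
    list.hidden === false;
}

// Invented sample data, standing in for what the server holds.
const sampleLists = [
  { title: "Shared Documents", baseTemplate: 101, itemCount: 12, hidden: false },
  { title: "Site Assets", baseTemplate: 101, itemCount: 3, hidden: false },
  { title: "Empty Library", baseTemplate: 101, itemCount: 0, hidden: false },
  { title: "Tasks", baseTemplate: 107, itemCount: 5, hidden: false },
  { title: "Hidden Library", baseTemplate: 101, itemCount: 4, hidden: true },
];

const titles = sampleLists.filter(isLibraryToProcess).map((list) => list.title);
console.log(titles); // prints [ 'Shared Documents' ]
```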

I'm in eRoom, Get Me Out of Here!

July 27th, 2011 by cstephenson

[turns up amplifier, cranks up volume to 11, puts Huey Lewis and the News vinyl on record player....]
I start up Visual Studio 6.0 and settle down to some nostalgic programming with Visual Basic (VB) 6.  Ah, college memories come flooding back and strangely not many related to actual study.  I feel like I have stepped back in time – the processor has worked its speed up to 88 mph and now I am back using VB6.  I also remember learning Delphi Version 1.0 so that I could *help* a friend do their final year project. Good times.

(more…)

Top 25 Programming Errors

July 25th, 2011 by Scott Roth

The 2011 CWE/SANS Top 25 Most Dangerous Software Errors report was published by the SANS Institute and MITRE in June (cwe.mitre.org/top25).  The report leveraged the SANS Institute’s Top 20 attack vectors (www.sans.org/top20) and MITRE’s Common Weakness Enumeration (CWE) to develop a list of the most frequent and severe programming errors this year.  The report details each error, how it is implemented, what the danger is, and practical ideas for identifying and mitigating it.  The report is fascinating to read, both for its compilation and its technical content, and well worth your time even if you are an experienced developer.  The report also suggests it would be a valuable read for software project managers, software project customers, and educators.  I agree.

Not surprisingly, the top 25 most dangerous programming errors contain some well known programming mistakes that have been with us for years (decades, in fact).  For example, the first error discussed in the report is SQL Injection.  Using improperly escaped special characters in a SQL query, hackers can steal and/or hijack your data. Skeptical?  Ask Sony Pictures, PBS and MySQL.com; they were all victims of attacks this year enabled by this common programming mistake.  It is unfortunate, because with a little effort, this programming vulnerability can be easily mitigated.
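A minimal JavaScript sketch of the difference between splicing input into the SQL text and passing it as a parameter. The table and column names are invented, and the `?` placeholder is an assumption: placeholder syntax and the call that executes the query vary by database driver.

```javascript
// Vulnerable: user input is spliced directly into the SQL text.
function buildUnsafeQuery(userName) {
  return "SELECT * FROM users WHERE name = '" + userName + "'";
}

// Safer: the SQL text is fixed and the input travels separately as a parameter,
// to be handed to the driver's parameterized query call.
function buildParameterizedQuery(userName) {
  return { text: "SELECT * FROM users WHERE name = ?", values: [userName] };
}

const attack = "x' OR '1'='1";

// The injected quote closes the string literal and changes the WHERE clause.
console.log(buildUnsafeQuery(attack));
// prints: SELECT * FROM users WHERE name = 'x' OR '1'='1'

// The SQL text is unchanged; the attack string is just a value.
console.log(buildParameterizedQuery(attack));
```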

Also, still in the top 5 is the classic Buffer Overflow problem. This mistake allows hackers to inject more information into a field or variable than it can handle.  The resulting “overflow” can contain malicious code and grant hackers access to your system, plant viral code, or simply crash your system.  Buffer overflows have been around for decades.  Too bad, since these too can easily be corrected.

Other common mistakes include:  OS Command Injection, Cross-site Scripting, Hard Coded Credentials, Unrestricted File Uploads, bogus or flawed cryptography, and Open Redirects.  As detailed in the report, all of the errors are very simple to correct and prevent.  It would serve everyone well (developers, testers, designers, managers and users) to be aware of these errors, how to identify them, and most importantly, how to mitigate them.  Give it a read; it is well worth your time:  cwe.mitre.org/top25.

 

Using SP Designer 2010 to Add Custom Buttons

June 21st, 2011 by Tim Lisko

SharePoint Designer 2010 provides a great, quick way to add simple actions to your SharePoint application.

I have a project in which I wanted to add a couple of buttons that would allow the user to navigate away from a “Drop-Off” library to the libraries where files are directed by my content organizer rules. So, without a thought for SP Designer, I opened VS 2010 and commenced coding. The great thing about building your buttons in VS is the flexibility you have – but being so flexible means you have to build a lot even if you only want a little!

In this instance I only wanted my buttons in one library. Using VS2010 you start with an elements.xml file. Unfortunately the elements file can only point to a “type,” not a specific library or list. So, even though I only wanted to see my buttons in my “Drop Off” document library, all the other document libraries would also display the buttons. You can certainly get around that by omitting the type and coding the buttons in, or by setting their visibility, but I haven’t done that myself, so I was less interested in getting the coding right and figuring out where to insert it.

Enter SharePoint Designer 2010. It provides an easy interface for creating custom actions that aren’t complex. You can create buttons that will navigate to a form, initiate a workflow, or navigate to a URL. Even better, it only applies to the document library or list where you are adding the custom action – exactly what I wanted!

You can certainly add an image to your button – this was something I overlooked initially, and it was driving me crazy that I “couldn’t” add a button image! Well, of course you can add an image. Just move that scrollbar down on the side of the “Create Custom Action” screen to get to the “Advanced custom action options” – duh!

So, a couple limitations right off:

  • You can’t create tabs, groups, or context groups with SP Designer, so far as I can see.
  • SP Designer placed my buttons in the “Manage” group of the “Documents” tab. There does not appear to be any way to direct your buttons to a specific ribbon group or tab that you might prefer.

In this case the SP Designer provided all the functionality I needed and saved me time I would have spent coding, testing, and deploying!

New and improved method for extracting JavaScript from HTML

June 17th, 2011 by dmiller

A few weeks back, I wrote about using HTML5 custom data attributes as an enabling mechanism for removing all JavaScript from our HTML pages.

Turns out that approach has one significant drawback: HTML attributes are not suitable for storing arbitrary data.  Specifically, we need to store user-entered data and stringified JSON objects, both of which may contain apostrophes and quotes.  Basically, the attribute value is truncated at the first apostrophe or quote, which makes it impossible to parse the JSON:

<body data-titles='why isn't storing arbitrary data always the right thing to do?'>

If you see the color highlighting in the HTML sample above, the problem should be obvious: everything after the apostrophe in “isn’t” is truncated from the attribute value.

Yesterday I found out HTML5 introduces another new data format, one that is more general and much more suited to our needs: HTML5 microdata.  The microdata specification is still in flux; I am writing this blog post on June 17, 2011, and the latest working draft is dated May 25, only a few weeks ago!  Anyway, microdata offers a much richer data model, and also provides support for standard microdata types; schema.org defines microdata formats for books, movies, people, places, events, products, and many others.  (Plus it has a good writeup on microdata in general.)

But I’m getting a little ahead of myself.  I don’t need to define any microdata types (not yet, anyways).  I just need a new way to store some data on my HTML page, so my JavaScript can read it.

Using microdata, my example above looks more like this:

<div id="data" itemscope="true" style="display:none">
  <span itemprop="titles">why isn't storing arbitrary data always the right thing to do?</span>
</div>

The itemscope attribute on the div identifies the div as an item; items have properties (including nested items, but I haven’t used those yet).  The itemprop attribute in the span is the property key, and the text content of the span is the property value.

The awesome thing from my perspective is this: I can put anything in a span.  I’m no longer restricted to the attribute data model; I can put JSON or user-entered data or whichever in there. 
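One caveat: text inside the span is still parsed as HTML, so characters such as `<` and `&` need to be written as entities, while apostrophes and quotes can stay as they are. A hedged sketch of that escaping step in JavaScript; in this application the escaping would more likely happen on the server, and `escapeHtmlText` is a hypothetical helper name:

```javascript
// Escape the characters that are significant in HTML text content.
// "&" must be replaced first so the other entities are not double-escaped.
function escapeHtmlText(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Invented sample value containing an apostrophe, a tag and an ampersand.
const json = JSON.stringify({ title: "why isn't <b> & co. safe?" });
const html = '<span itemprop="titles">' + escapeHtmlText(json) + "</span>";
console.log(html);
```

When the browser parses the span it decodes the entities again, so the script reading the property value gets the original JSON string back.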

So I have these values in HTML, how do I get them from my JavaScript?  The microdata specification includes a DOM API, however no browser whatsoever currently implements this API!  That is how new and in-flux microdata is!

In my short career as a web developer, I have already found that jQuery is always the answer.  The jquery.microdata.js plugin extends jQuery with functions very close to the microdata spec.  (The only difference I see is that the plugin provides a properties function where the spec defines a properties array.)   Now I can read microdata in my JavaScript files like so:

myMicrodataItems = $(this).items();
myEncodedTitle = myMicrodataItems.properties("titles").itemValue();

The items function gets me a collection of all the items on the page, the properties function gives me access to a specific property, and the itemValue function gets me that property’s value.

With all this in place, I can encode arbitrary data in my HTML basically without restriction, and easily read it from JavaScript, requiring no JavaScript whatsoever on the HTML page.  And this is all using standard HTML5. And I’m ready to use microdata more extensively in our application; we may get a lot of mileage from the schema.org predefined types.

The only fly in the ointment (and it’s a pretty small fly) is that microdata is not ignored by the user agent; so I have to add the “display:none” style to my divs.  Otherwise the encoded data is user-visible.

JSP Pages, Do not get too used to c:out!

June 10th, 2011 by jhsu

I was always one to go ahead and use <c:out> to display model data in my JSP pages.  Never had a reason not to!  Well, I recently had a reason…

As I mentioned in my last post, I am working on a web application that uses several jQuery libraries – another one is AutoSuggest, a handy-dandy plugin for auto-completion.  The AutoSuggest library can have pre-populated data, but it expects a JSONArray type. Normally XMLHttpRequest form submits work great, and data is returned to my JSP page with no page refresh, but sometimes I need to use normal HTML form submits (e.g., for file uploads), where my data is returned as a ModelAndView to a JSP page (refresh needed).

I had trouble with the latter – displaying JSONArray data in my JSP page – because I was always setting the value to a JavaScript variable using <c:out>.  You can see below where I had some trouble.

OLD:

Option 1:

var existing_docApprovers = '<c:out value="${existing_docApprovers}" escapeXml="false" />';

Option 2:

var existing_docCases = "<c:out value='${existing_docCases}' escapeXml='false' />";

Problems –

  1. If I have single quotes around the <c:out> as in Option 1, this preserves the JSON with proper syntax around all properties/values.  Great – but not if there’s a ' anywhere in the JSON itself.  This results in a JavaScript parsing error.
  2. If I change it to double quotes around the <c:out> as in Option 2, same problem (this time with any " in the JSON itself).  This results in a JavaScript parsing error.
  3. If I change to escapeXml="true" in either Option 1 or 2, the special characters are HTML-escaped, but then it’s not valid JSON.

What I needed to do was use the bare EL statement to leave the contents exactly as is -> a JSONArray!

NEW:

// default JSON Array as empty
var existing_docApprovers = [];
<c:if test="${ (not empty existing_docApprovers) }">existing_docApprovers = ${existing_docApprovers};</c:if>

Of course, I also needed to set existing_docApprovers to an empty array, just in case ${existing_docApprovers} was null or undefined or just did not exist, or I would get a JavaScript error from setting a variable to nothing.
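The same "default to an empty array" guard can also be written on the JavaScript side. This sketch assumes the JSON reaches the script as a string (for example, read from an element on the page) rather than being written into the script by EL; `parseJsonArray` is a hypothetical helper, not part of AutoSuggest or jQuery:

```javascript
// Parse a JSON array from a string, falling back to an empty array when the
// value is missing, empty, malformed, or not an array.
function parseJsonArray(raw) {
  if (raw === null || raw === undefined || raw === "") {
    return [];
  }
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch (e) {
    return [];
  }
}

// Invented sample values.
console.log(parseJsonArray('[{"name":"O\'Brien","value":"7"}]').length); // prints 1
console.log(parseJsonArray("").length);                                   // prints 0
console.log(parseJsonArray("not json").length);                           // prints 0
```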

This O’Reilly article explains the problem I had very well – I’ll quote it here for emphasis:
“I’ve recently seen it suggested that JSP pages should replace all <c:out/> with ${…}. This could have serious side-effects if the content of the variables presented is not considered carefully with respect to escaping.”

 

 

A pattern for extracting JavaScript from HTML

June 3rd, 2011 by dmiller

When I wrote about separation of concerns in webapps, I said I would consider how to apply separation of concerns in my project.  This post is a progress report!  I have tried this pattern on several pages and so far, all is well.

Step 1 is obvious: just extract all the JavaScript embedded in the page into its own .js file, and then import the new .js file into the old page.  This part is easy.  The only challenging part might be any hard-coded event handlers embedded in your HTML.  Anywhere you have, e.g., <a href="…" onclick="clickMe()"/>, you have to remove such explicit event handlers, and add code like this to the .js file: myAnchor.onclick = clickMe.

Luckily our code doesn’t have anything like this, so Step 1 is purely mechanical (so far, anyway).

Step 2 is more interesting.  Obviously the new JavaScript file has to be completely static: you want it to be cached by the browser; that’s the whole point of what we’re doing.  So you can’t have any page-specific data in the JavaScript file.

Well, in our code base, the JavaScript needs some page-specific stuff.  The JavaScript has to know about the form field values, the ID numbers of the objects displayed on the form, and other odds and ends of data that may be different each time the page is viewed.

The approach I came up with is to encode such needed data in the HTML page, and then have the JavaScript look up the data when the page is done loading.  jQuery makes this dead easy; with jQuery it is very easy to manipulate the browser DOM.

So the only question is where to put this data on the HTML page.  I use HTML5 custom data attributes: attributes whose names start with “data-” can be attached to any HTML element and are ignored by the browser.  I chose to add these attributes to the HTML body element:


<body
data-contextPath='<%=request.getContextPath()%>'
data-docId='<c:out value="${docId}" escapeXml="false" />'
data-step='<c:out value="${step}" />'>


Then, in the JavaScript file, it is very easy using jQuery to read these values… Obviously this code has to run after the DOM is ready:

$(document).ready(function()
{
    // read the page-specific values encoded on the body element
    contextPath = $("body").attr("data-contextPath");
    docId = $("body").attr("data-docId");
    step = $("body").attr("data-step");
    // ....
});
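The three lookups can also be folded into one small helper. A sketch under stated assumptions: it works with any object that has a `getAttribute` method (a DOM element in the browser), `readPageData` is a made-up name, and the stand-in element and its values exist only so the example runs outside a browser.

```javascript
// Collect the named data- attributes of an element into a plain object.
function readPageData(element, names) {
  const data = {};
  names.forEach(function (name) {
    data[name] = element.getAttribute("data-" + name);
  });
  return data;
}

// Stand-in for document.body with invented values.
const fakeBody = {
  attributes: {
    "data-contextPath": "/myapp",
    "data-docId": "42",
    "data-step": "review",
  },
  getAttribute: function (name) {
    return name in this.attributes ? this.attributes[name] : null;
  },
};

const page = readPageData(fakeBody, ["contextPath", "docId", "step"]);
console.log(page.contextPath, page.docId, page.step); // prints "/myapp 42 review"
```

In the page itself the call would be `readPageData(document.body, [...])`, made after the DOM is ready.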

After applying these two steps, I have a clean separation. All my HTML is in an HTML page, and all my JavaScript is in a JavaScript file (and all my CSS is in a CSS file, but that was already true).

This approach may be old hat for you experienced web developers out there.  Hopefully this approach may help a few of you like me: an old server-side grunt tossed into the wild-and-wooly Web side of things!

Copyright © 2002–2011, Armedia. All Rights Reserved.