Friday, February 9, 2007

MSWeb: An Enterprise Intranet 2

The MSWeb team started out four years ago with a vision of the very broad but tricky area of taxonomies, and went to work figuring out how they could be built for use on the MSWeb portal. They tested and developed tools and vocabularies that improve content management as well as the searching and browsing of the MSWeb site.
This project is beginning to have an impact that goes far beyond the MSWeb site. Other major Microsoft intranet sites—those for human resources, finance, the library, and the information technology group—have begun to use some or all of the tools and taxonomies that were developed by the MSWeb team. And more than two dozen major sub-portals have implemented aspects of MSWeb’s search system. How has the MSWeb team succeeded at spreading its gospel through a huge organization like Microsoft, when similar efforts at smaller companies often fail?

The roots of MSWeb’s success are many. Let’s examine them.

Location, location, location
Because MSWeb is the company’s major intranet portal, just about everyone in the company uses it—94% of all Microsoft employees. The site is large and complex, providing the team with ample challenges and a test bed for trying out new solutions. Additionally, MSWeb’s enterprise-wide prominence has made for an excellent marketing opportunity for the team’s efforts and for information architecture in general.

Indeed, as a candidate site for an information architecture redesign, MSWeb is the ultimate low-hanging fruit: highly visible, frequently used by many in the company, rich in valuable content, important to management, and, finally, managed by an enlightened team that was aware of information architecture. You couldn’t ask for a better showcase for the value of good information architecture.

Helping where it hurts
Every information architecture project ultimately has two audiences: users and site managers/owners. It’s important to make both audiences happy, and the best way to do so is to fix what hurts.

The MSWeb team intentionally selected a major area—search—that would greatly benefit both users and managers, and designed its taxonomies to specifically improve search performance. Users’ experiences with searching were greatly improved through the integration of Best Bets into search results (more on Best Bets below). And the MSWeb team began to help site managers address search, sometimes by simply providing informal consulting, but also in more concrete ways such as providing a centrally managed crawling and indexing service. By encouraging units to develop resource records, the MSWeb team spawned the creation of a collection of content surrogates that references some of the most valuable content in the Microsoft intranet environment. And once these records were created, they made for great starting points for site crawling—robots simply followed the links embedded in the UCS’s records.

Just as the prominence of MSWeb gained exposure for the team’s efforts, the success of Best Bets validated the MSWeb approach. Both paved the way for improved collaboration between the MSWeb team and many other business units that were players in the Microsoft intranet environment.

Modular services
From the very start, the MSWeb team has looked for opportunities to develop its taxonomies and tools in a modular and therefore reusable fashion, and package them as services for the rest of the company. In fact, they’ve even branded their offerings as “Search and Taxonomies as a Service” (originally “Search as a Service”—and still referred to as SAS). The SAS console, displayed here, provides an excellent visualization of what SAS offers to its users.

The SAS console

The MSWeb team recognized that other business units would have a wide variety of needs, as well as existing tools on hand to address their own information architecture and content management challenges. They knew that no one could compel those business units to adopt 100% of the MSWeb approach. So the team designed SAS to be extremely modular, so Microsoft business units could take advantage of some services while passing on others.
For example, SAS offers access to MSWeb’s taxonomies through the MDR. Other units can manage and store their own taxonomies through the MDR as well, as long as they are willing to share their work. And to ensure quality in their taxonomies, those other business units can take advantage of taxonomy-related consulting services provided by SAS.

Different business units can access taxonomies from the MDR through the SAS console. Or, because the taxonomies are exportable in XML, units can develop their own interfaces, as did Microsoft’s library. This flexibility means that existing tools, homegrown or not, don’t need to be thrown out in favor of MSWeb’s version. Similarly, XML is used to export search results; this enables another unit’s site to leverage the records stored in UCS (assuming that their engine can accommodate XML). Even the MSWeb search interface is exportable, as it’s written using XSL.

As discussed earlier, metadata schema are extensible, in effect allowing different business units to create customized versions of any schema. Records created using those schema are reusable through a highly flexible subscription process. And last but not least, optional crawling and indexing services are also made available by SAS to its client business units.

All of this flexibility leads to a huge number of possible SAS service configurations. A Microsoft business unit could handle most of its information architecture and content management needs using everything SAS has to offer, or it could operate its own publishing system that only imports taxonomies from the MDR. Or it might choose to go it completely alone. The decision is up to that business unit, and is impacted by the factors of users, content, and context that guide all information architecture work.

In the case of HRWeb, Microsoft’s human resources portal, the decision was made to use most SAS services. SAS was used to:

  • Identify content for crawling and indexing for use in searching
  • Create a category label taxonomy for browsing
  • Create Best Bets specifically for use in the HRWeb portal
  • Classify those Best Bets using HRWeb’s category label taxonomy
  • Provide access to the SAS high-quality search engine
  • Export Best Bets search results to HRWeb’s site
Perhaps most importantly, HRWeb drew on the MSWeb team’s expertise through a consulting relationship. MSWeb staff taught HRWeb’s team how to develop category labels through user-centered design (UCD) techniques such as contextual inquiry. The HRWeb team was also instructed in the art and science of cataloging resource records using descriptive vocabularies and the shared metadata schema. The resulting HRWeb site is shown here.

Microsoft’s HR group is a full-fledged SAS “client,” using all of SAS’s services

Currently, most units have small web development–related teams and limited resources, and are just beginning to delve into the sticky topics of taxonomies, searching, and browsing. As they learn about SAS, they are generally quite glad to take advantage of the tools and expertise already developed by the MSWeb team. But as each unit’s expertise and budget for information architecture grows, it will likely want to take on more and more control. The flexibility of its service modules will ensure that SAS can be configured to keep up with those changes.

Different kinds of flexibility
Aside from a focus on taxonomies, the major components of MSWeb’s approach—the tools and a flexible, modular, and somewhat entrepreneurial service model—draw little from library science. And as noted earlier, the taxonomies themselves, not to mention MSWeb’s operating definition of the word “taxonomy,” do not adhere to an orthodox library science approach.

This is a different flexibility than the kind that drives the SAS approach. The MSWeb team has been driven by a philosophy built on a flexibility of mind. Although many team members have library science backgrounds, they have left their disciplinary baggage at the door in order to achieve buy-in and support from colleagues from different backgrounds and with different perspectives.

For example, few, if any, graphic designers get excited by the thought of developing taxonomies. But anyone will listen to an open-minded colleague describing a good approach to solving a big problem. Because the MSWeb team was willing to be flexible in its terminology and outlook, they could communicate their taxonomy-based solutions more effectively to colleagues and clients who might be turned off by “library talk.” One senior designer on the MSWeb team described his realization of the value of the taxonomy approach and its basis in UCD techniques as the moment he “drank the Kool-Aid.” From that point on, he bought into the approach 100 percent.

The team was also successful because it was flexibly designed—not just LIS people, but technologists, technical communicators, designers, and strategists. In addition to lending the team more credibility with outsiders, the team’s interdisciplinary nature meant that many ideas were explained, translated, and fought over before they were ever exposed to outsiders. Interdisciplinary perspectives lead, as always, to a better and more marketable set of services.

Company savings
The MSWeb team understands the need for baby steps in any significant information architecture project. They’ve spent years developing taxonomies and supporting tools to use on MSWeb. And they’ve taken a gradual approach to rolling them out as SAS services to other business units.

But it’s also important to note that within three months of launching SAS, nine sub-portals had already implemented SAS-based search on their sites. Two of those had created site-specific category label taxonomies to support browsing, and another was in the process of doing so. All leveraged the MSWeb Best Bets results as part of their own search systems.

Quick adoption of SAS represents success for the MSWeb team, but has much greater significance to Microsoft as a whole. Besides the benefits to users, which we’ll describe below, an incredible amount of labor has been saved. It’s estimated that SAS has saved 45 person-years in avoided work (the development effort, estimated at 5 person-years, multiplied by 9, the number of business units that didn’t have to reinvent the SAS wheel). These savings were achieved with no increase in the MSWeb team’s staffing levels, and what was developed for MSWeb has been completely reusable by other business units.

Benefits to Users
As Microsoft’s intranet environment matured in the mid-’90s, it began to suffer from the same afflictions as most enterprise intranets: too many clicks to get to desired information, difficult site-wide navigation, and the best documents buried deep within search results. And, as mentioned earlier, users and their champions began to ask for taxonomies to make these problems go away.

The MSWeb team’s response is a work in progress. Four years is a brief moment in the lifespan of a large company and its information systems. The team is taking an evolutionary approach, avoiding unrealistic goals of fixing all problems for everyone in a few years. In this way, there are no false expectations. But even in four years, many concrete benefits have been realized, and taxonomies are at the forefront of these improvements. With the category label taxonomy, for example, the labels are more representative and consistent, improving navigation within MSWeb and between Microsoft intranet sites.

Searching is also greatly improved. By encouraging resource record creation with UCS, MSWeb is able to identify valuable content in the intranet environment, and therefore can do a better job of crawling remote intranet content. Better crawling leads to more comprehensive indexing. Users are now querying indexes that represent both a much larger body of content and a higher-quality collection of content. More importantly, users’ queries are more powerful than before—they are able to take advantage of MSWeb’s descriptive vocabularies to reduce the ambiguity of individual search terms. Consider a search on “asp,” a very ambiguous term. During a search, the descriptive vocabularies stored in the MDR are automatically invoked to expand the search by including the different meanings (“Active Server Pages” and “application service providers”). These terms are also displayed as executable searches on the search results page to narrow or refine the search.
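The query expansion described above can be illustrated with a minimal sketch. The text does not describe how the MDR stores its descriptive vocabularies, so the dictionary and the function names below are assumptions made purely for illustration.

```python
# Hypothetical descriptive vocabulary: ambiguous term -> known meanings.
# The real MDR format is not described in the text; this dict is an assumption.
VOCABULARY = {
    "asp": ["Active Server Pages", "application service providers"],
}


def expand_query(query, vocabulary=VOCABULARY):
    """Return the original query followed by any meanings listed for it."""
    key = query.strip().lower()
    meanings = vocabulary.get(key, [])
    return [query.strip()] + list(meanings)


def refinement_links(query, vocabulary=VOCABULARY):
    """Return only the meanings, to be offered as narrower follow-up searches."""
    return expand_query(query, vocabulary)[1:]


expanded = expand_query("asp")
# Build one broadened search expression from the expanded terms.
print(" OR ".join(f'"{term}"' for term in expanded))
```

The same lookup serves both purposes mentioned in the text: broadening the initial search, and listing each meaning as a separate search the user can run to narrow the results.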

The MSWeb team has also helped pioneer a positive and increasingly common trend: “Best Bets.” These are search results that are the product of manual efforts. Often displayed before other, automatically generated results, Best Bets link a user to documents that a cataloger has determined to be highly relevant to the user’s initial search query. Best Bets are designed to address the “sweet spot” in searching, which consists of the few unique search queries that constitute the majority of all searches executed. Why not add value to the small number of frequently executed searches by adding Best Bets to their results?
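As a rough sketch of the hybrid approach, hand-picked results can be placed ahead of the automatically generated ones. The Best Bets table, the placeholder URLs, and the function name are hypothetical, since the text does not describe how MSWeb stores or merges its results.

```python
# Hypothetical Best Bets table: query -> URLs chosen by a cataloger.
BEST_BETS = {
    "asp": [
        "http://example.invalid/asp-overview",
        "http://example.invalid/asp-faq",
    ],
}


def merge_results(query, automatic_results, best_bets=BEST_BETS):
    """Put hand-picked Best Bets first, then automatic results not already listed."""
    picked = best_bets.get(query.strip().lower(), [])
    merged = [(url, "best bet") for url in picked]
    seen = set(picked)
    for url in automatic_results:
        if url not in seen:
            merged.append((url, "automatic"))
            seen.add(url)
    return merged


results = merge_results(
    "ASP",
    ["http://example.invalid/asp-faq", "http://example.invalid/other"],
)
for url, source in results:
    print(source, url)
```

Only queries that appear in the table get manual results, which matches the idea of spending cataloging effort on the small number of frequently executed searches.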

Best Bets search results are drawn directly from resource records created using UCS

The screenshot shows the results for the search query “asp” from the MSWeb intranet, and you’ll note that the first five are all Best Bets. The components of the search results (resource title, URL, description, and categories) are drawn from the metadata schema, as the query searched an index of the controlled vocabulary terms assigned to these Best Bet records when they were indexed with UCS.

The “View Query Logs” function is useful in determining popular queries

The MSWeb team uses a function provided as part of the SAS console to determine which searches merit Best Bet coverage. By invoking the console’s “View Query Logs” command and specifying a date range and collection, it’s possible to determine how many documents each query retrieved. If the “Where Query Returned” option is set to “0 Best Bets,” we can learn which of those high-retrieving queries do not have Best Bets associated with them, and create new Best Bets accordingly.
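The kind of log analysis described above might look like the following sketch. The log layout is an assumption, and the sketch ranks queries by how often they were submitted as a stand-in for popularity; the actual “View Query Logs” report is not specified in the text.

```python
from collections import Counter
from datetime import date

# Hypothetical log rows: (day, query, number of Best Bets returned).
QUERY_LOG = [
    (date(1999, 7, 1), "asp", 5),
    (date(1999, 7, 1), "benefits", 0),
    (date(1999, 7, 2), "Benefits", 0),
    (date(1999, 7, 2), "parking", 0),
    (date(1999, 8, 15), "parking", 0),
]


def best_bet_candidates(log, start, end, top_n=10):
    """Rank queries in [start, end] that returned zero Best Bets, most frequent first."""
    counts = Counter(
        query.strip().lower()
        for day, query, best_bets in log
        if start <= day <= end and best_bets == 0
    )
    return counts.most_common(top_n)


candidates = best_bet_candidates(QUERY_LOG, date(1999, 7, 1), date(1999, 7, 31))
for query, count in candidates:
    print(f"{query}: {count} searches without a Best Bet")
```

Queries at the top of this list are the natural candidates for new Best Bets.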

Best Bets are typically the most clicked-through documents

Another SAS console function is “View Metrics.” Its “Ranked Hit Clickthrough” option provides a graphic representation of how often documents at each rank in a particular query’s search results are being clicked through. Typically, the Best Bets, ranked at the top, have a far higher clickthrough than other documents.
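A clickthrough-by-rank figure of this kind could be computed along the following lines. The click data and the function name are invented for illustration; the text does not say how the console records or aggregates clicks.

```python
from collections import Counter

# Hypothetical click log for one query: the rank of each result that was clicked.
CLICKED_RANKS = [1, 1, 2, 1, 3, 1, 2, 7]


def clickthrough_by_rank(clicked_ranks):
    """Return {rank: share of all clicks} for one query's result list."""
    counts = Counter(clicked_ranks)
    total = sum(counts.values())
    return {rank: counts[rank] / total for rank in sorted(counts)}


shares = clickthrough_by_rank(CLICKED_RANKS)
for rank, share in shares.items():
    print(f"rank {rank}: {share:.0%} of clicks")
```

If Best Bets occupy the top ranks, a chart of these shares would show most clicks concentrated there.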

Best Bets seem to have increased search result clickthrough

So, does this hybrid approach, the combination of manually and automatically generated results, actually help users? It may be too early to tell, but the initial data is promising. Users are performing 18% fewer searches since Best Bets were implemented; this might suggest that the results of their initial searches are more successful, reducing the need to submit follow-up searches. And, as shown here, users are clicking through the top results’ links close to twice as much as they had before Best Bets were implemented. This may suggest that users are finding Best Bets results to be more relevant than automatically generated results.

Overall, the MSWeb team has attempted to measure the cumulative impact that better browsing, searching, and content have had on users. Performing a task analysis exercise both before and after a major redesign, the team was rewarded with some hopeful results in terms of success rate, time on task, and number of clicks. The following table displays the results of the task analysis. The version 3.0 results were recorded in February 1999, prior to the implementation of the taxonomy-driven approach, and the version 4.01 results were recorded in July 1999, after the implementation of the taxonomy-driven approach.

Measure             v.3.0 Average           v.4.01 Average          Change
Task Success Rate   68.30%                  79%                     +10.7%
Time on Task        3 minutes 26 seconds    3 minutes 10 seconds    -16 seconds
Number of Clicks    13                      5                       -8 clicks

Certainly, other factors may have had an impact on these numbers. But even if we discount them, there is still ample anecdotal evidence to demonstrate the value of the MSWeb team’s efforts.

What’s Next
The initial success of MSWeb’s approach is exciting, but it’s just the first step over the course of many years and phases to come. To some degree, the team expects continued growth in what’s currently in place: more resource records, more robust taxonomies, and more sites coming on board and utilizing an increasing array of SAS services and MSWeb consulting. But the MSWeb team also hopes to try out some interesting new plans in the not-too-distant future.

One exciting possibility is an increased role for other business units in the creation of an even more mature infrastructure to support enterprise-wide information architecture and content management. MSWeb isn’t looking to own this endeavor, but to move into a leadership role, with other units playing the role of partners. In this scenario, Microsoft will save money because its business units will engage in increased sharing of taxonomies and related tools and efforts. Additionally, a greater degree of awareness among content managers might result in more willingness to go along with future centralizing initiatives, such as requiring the registration of resources in order for them to be indexed for searching. This trade-off might make for a little more work on the part of content owners, but will result in improved searching for users, as well as much more efficient content management practices by establishing who’s responsible for what content, when it should be updated, and so on.

Even more exciting is the possibility of creating something of a Microsoft “semantic web” along the lines of what Tim Berners-Lee, creator of the Web, and others have recently proposed. A semantic web environment allows connections to be made automatically between related content objects. Some of the tools described in this chapter could be extended to support such automatic associations; for example, the taxonomies developed by different Microsoft business units could be “cross-walked,” meaning that relationships between similar terms or “nodes” in the taxonomies could be established. These relationships could go a long way toward improving search across Microsoft’s intranets, as content with different tags and similar content would be retrieved together. VocabMan and the SAS console already have built-in support for related tags, which will enable future cross-walking of taxonomies.
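To make the idea of cross-walking concrete, here is a small sketch in which nodes from different taxonomies are linked and a lookup collects everything connected to a given node. The taxonomy names, the terms, and the decision to treat links as symmetric and transitive are all assumptions; the text does not describe how VocabMan or the SAS console represent related tags.

```python
# Hypothetical crosswalk: pairs of (taxonomy, term) nodes judged to be equivalent.
CROSSWALK = [
    (("hr", "Benefits"), ("finance", "Employee Compensation")),
    (("finance", "Employee Compensation"), ("library", "Pay and Benefits")),
]


def build_index(pairs):
    """Build a symmetric lookup from each node to the nodes it is linked to."""
    index = {}
    for left, right in pairs:
        index.setdefault(left, set()).add(right)
        index.setdefault(right, set()).add(left)
    return index


def related_nodes(node, index):
    """Follow crosswalk links transitively to collect every connected node."""
    found, pending = set(), [node]
    while pending:
        current = pending.pop()
        for neighbour in index.get(current, ()):
            if neighbour not in found and neighbour != node:
                found.add(neighbour)
                pending.append(neighbour)
    return found


index = build_index(CROSSWALK)
print(sorted(related_nodes(("hr", "Benefits"), index)))
```

A search tagged with one unit’s term could then also retrieve content tagged with the linked terms from other units’ taxonomies.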

The concept of a semantic web offers much more potential. Alex Wade, Manager of Knowledge Access Services, sees a future where semantic objects—not physical documents—are the atoms that make up the MSWeb universe. He states: “We don’t draw many lines between objects today, and when we do, these are rarely delineated; now we’re moving to semantically derived relationships.” He’d like to see a semantic MSWeb provide access to people, places, and things that are connected by “strong rules” or relationships; once an initial set of rules is seeded, new rules can be inferred. This web of relationships could have a hugely beneficial impact in an intranet environment like Microsoft’s, where it’s often as important to find the right person as it is to find the right information. This transition requires a paradigm shift for information architects: as Alex suggests, we’ll need to “stop tagging documents and start drawing relationships between objects. Eventually they’ll have different types of hierarchical, associative, and equivalent relationships.”

MSWeb’s Achievement
Nothing that the MSWeb team did (whether considering the initial problem, coming up with an approach, or developing the tools and expertise to make it happen) can be described as revolutionary. Rather, these were rational steps taken to address complicated problems. So why discuss their work here?

Well, if you have ever worked in a large organization (or even many smaller ones), you know that what’s rational isn’t often what happens. The rational, the obvious, and the good often never make it off the drawing board, thanks to corporate strategies that change with the wind, extreme fluctuations in budgets, and, worst of all, the dreaded reorganization. And Microsoft isn’t immune to such problems; one MSWeb team member went through seven different managers and had three title changes in just five months.

The MSWeb team has developed some neat taxonomies and tools. But we’re recognizing the team for its most impressive achievement: successfully implementing a rational plan in a large, corporate environment. The team understood that only a holistic approach—one that accommodated content, users, and context—could make a difference. They also knew that enterprise-wide solutions require sufficient time—years, not months—to take hold.



http://www.boxesandarrows.com/view/msweb_an_enterprise_intranet_2