Tuesday, 20 October 2009
SharePoint Advanced Search Properties Don't Work - the skinny on Created By, Modified By, Author and more
If you've ever done much work with the Advanced Search web part it won't take you long until you discover that most of the default property searches simply don't work. As of writing this post there were dozens of very long forum threads describing the problem with no comprehensive fix in sight by Microsoft or anyone else. Perhaps until now...?
What's not Broken?
Despite the claims of many, the Size and (Created/Modified)Date properties are not completely broken. They simply require the correct format for input. It's also worth noting that the explicit value is required when using property searches. No wildcards or partials!
- Size - takes a value in bytes, so that 1000000 = 1 MB.
- Dates - require a xx/xx/xx or xx/xx/xxxx format. The order of the days/months will depend on your regional settings. If you're still getting no results, then check the metadata mappings and XSLT references below.
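If you're building these values in a script rather than typing them in, a minimal Python sketch of the two formats might look like this. The function names are my own, and the day-first default is an assumption; flip it to match your regional settings.

```python
from datetime import date


def megabytes_to_bytes(megabytes: float) -> int:
    """Convert decimal megabytes to the byte count the Size property expects."""
    return round(megabytes * 1_000_000)


def format_search_date(value: date, day_first: bool = True) -> str:
    """Format a date as dd/mm/yyyy or mm/dd/yyyy, depending on regional settings."""
    pattern = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return value.strftime(pattern)


print(megabytes_to_bytes(1))                                    # 1000000
print(format_search_date(date(2009, 10, 20)))                   # 20/10/2009
print(format_search_date(date(2009, 10, 20), day_first=False))  # 10/20/2009
```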
Although uncovering the correct format was a pain, it was nothing compared with what followed.
What is Broken and Why?
As many have discovered, the Created By, Last Modified By, Created Date, Last Modified Date and Author properties are all affected to some degree.
The root cause is improper mapping of crawled properties to their managed properties within Search Administration, and the problem carries all the way through to the XSLT for the Advanced Search and Search Core Results web parts.
The reasons behind these poor relationships become obvious shortly after you begin looking for a solution. To be frank, it also becomes obvious why most people gave up trying!
Thanks to...
Many thanks go to Anne Stenberg and her 6-part series entitled Mystery Solved - Crawled Properties in SharePoint.
In this series Anne patiently and painstakingly goes through every last property in each defined category, providing a description for many. I'm not entirely sure where she came by all this information but it proved invaluable when it came to identifying and testing the result of many changes to come in my metadata property mapping.
Please explain!
Using a combination of Anne's tables, the U2U CAML Query Builder feature, the ever useful SharePoint Manager, and the XSLT within the search web parts - it quickly becomes obvious that it's going to take more than a packet of off-the-shelf headache tablets to get through this.
Without going into too much detail - ignorance being bliss - let's take a look at something as simple as Author.
- We have a visible Author column whose internal name is _Author.
- A hidden Created By column whose internal name is Author.
- And a managed property called Author that seems to want to hedge its bets by trying to cover all these bases as well as a few more.
But that's just the beginning - Created By and Modified By searches will invariably return zero results and also have their fair share of possible mappings and hidden values. What the heck is Write anyway?? Apparently just another value for Modified Date...but more on that later. I'm sure anyone who's that interested can do their own research. I won't bore everyone else any further.
What's the fix already!?
OK, OK. Keep your propeller hat on.
After days of stuffing around, tweaking mappings, modifying web part properties and performing a full crawl each time(!) I have finally found - I think - a solution. At least, a number of searches - using Author, Created By and Last Modified By properties with the AND operator - all returned correct results.
It's also worth noting that this solution is not Office-centric and will work with any document type.
First, the Metadata
This assumes a good knowledge of Central Administration. If you require detailed steps they can be found elsewhere.
You can use all or some of the settings shown below but the only ones that really matter are the Mappings themselves. After you've added the crawled properties, be sure to click each one and check the "Include values for this property in the search index" checkbox, otherwise it won't get added to the index! In all cases I went with the default "Include values from all crawled properties mapped" option.
Also note that there are often TWO properties with exactly the same name - e.g. Office:4(Text). Picking the right one is essential and I have provided the Property Set IDs below where this is relevant.
And, remember, what follows is in no way Gospel - it's just what worked for me.
NB: Don't forget to run a Full Crawl after making these changes.
| Property Name | Type | May be deleted | Use in scopes | Mappings |
|---|---|---|---|---|
| Author | Text | No | Yes | _Author(Text), ows__Author(Text) |
| Created | Date and Time | Yes | No | Office:12(Date and Time), Basic:15(Date and Time) |
| CreatedBy | Text | Yes | No | Office:4(Text), ows_Created_x0020_By(Text) |
| LastModifiedTime | Date and Time | No | Yes | Basic:14(Date and Time), Basic:16(Date and Time), ows_Modified(Date and Time) |
| ModifiedBy | Text | Yes | Yes | Office:8(Text) |
- Office:12(Date and Time) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
- Basic:15(Date and Time) - b725f130-47ef-101a-a5f1-02608c9eebac
- Office:4(Text) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
- Office:8(Text) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
Advanced Search XSLT
The following go in PropertyDefs. There are many default values here, so I'm providing the full block. You'll then need to add the same 'Name' references to each ResultType in the order you prefer.
<PropertyDef Name="Author" DataType="text" DisplayName="Author"/>
<PropertyDef Name="Size" DataType="integer" DisplayName="Size"/>
<PropertyDef Name="Keywords" DataType="text" DisplayName="Keywords"/>
<PropertyDef Name="CreatedBy" DataType="text" DisplayName="Created By"/>
<PropertyDef Name="Created" DataType="datetime" DisplayName="Created Date"/>
<PropertyDef Name="ModifiedBy" DataType="text" DisplayName="Last Modified By"/>
<PropertyDef Name="LastModifiedTime" DataType="datetime" DisplayName="Last Modified Date"/>
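Typing the matching 'Name' references by hand for every ResultType gets tedious, so here is a rough Python sketch that derives them from the definitions. It assumes the web part's XML uses the PropertyDef / PropertyRef element names and a Name attribute with that casing, and the wrapping PropertyDefs element in the sample is only there to make the snippet parse on its own. Check the output against your own web part properties before pasting anything.

```python
import xml.etree.ElementTree as ET

# Sample input: the property definitions from this post (casing is an assumption).
PROPERTY_DEFS = """
<PropertyDefs>
  <PropertyDef Name="Author" DataType="text" DisplayName="Author"/>
  <PropertyDef Name="Size" DataType="integer" DisplayName="Size"/>
  <PropertyDef Name="Keywords" DataType="text" DisplayName="Keywords"/>
  <PropertyDef Name="CreatedBy" DataType="text" DisplayName="Created By"/>
  <PropertyDef Name="Created" DataType="datetime" DisplayName="Created Date"/>
  <PropertyDef Name="ModifiedBy" DataType="text" DisplayName="Last Modified By"/>
  <PropertyDef Name="LastModifiedTime" DataType="datetime" DisplayName="Last Modified Date"/>
</PropertyDefs>
"""


def property_refs(property_defs_xml: str) -> list:
    """Return one PropertyRef line per PropertyDef, in document order."""
    root = ET.fromstring(property_defs_xml)
    return [
        '<PropertyRef Name="{}" />'.format(definition.get("Name"))
        for definition in root.iter("PropertyDef")
    ]


for line in property_refs(PROPERTY_DEFS):
    print(line)
```

Reorder the printed lines to suit each ResultType; the order here simply follows the definitions.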
Search Core Results XSLT
Unless you're trying to provide custom results using some of the values described above you won't need to make any changes here. Quite frankly it's a little daunting but great things can be done - such as displaying Size, Author and a custom link to open the containing folder for each result. I'll probably leave this for another post as it's a topic in itself.
In conclusion...
So, hopefully, if you've done everything right and performed a full crawl, you should now be able to search using one or all of the properties we've discussed here.
One thing you may find still doesn't work is the "Does not equal" operator. You might also find that it's not described anywhere in the web part code but is managed by a separate core JavaScript file. I'm just not willing to look into this right now - or the reasons why "Contains" and "Doesn't contain" aren't available for partial search term querying. If anyone else has any ideas - performance notwithstanding - feel free to drop me a line.
I look forward to any further insight and feedback others might have and hope that all my hard work isn't undone with the next upgrade!
Labels: Metadata, MOSS 2007, Search
Thursday, 16 July 2009
Deleting unused SharePoint Content Types
It happens to every site collection admin at some stage. For whatever reason you're required to replace a defunct content type in your document library with a new one. Adding the new one is easy. But then you try to delete the old one and receive the terrifically informative "Content Type is still in use" error.
You've tried everything:
- Updated any files you could find with the new content type.
- Checked every file twice to make sure you didn't miss any.
- Managed Checked Out files to locate all those tricky 'hidden' docs that haven't been checked in yet and only exist in some funky temp/draft state! (You'll either need to take ownership of these as site collection admin, or email a list to the original authors to get them to check them in or delete them. I can't believe there is no option to delete these en masse!)
- Run custom CAML queries to make REALLY sure you didn't miss any (using the fabulous U2U SharePoint CAML Query feature - www.u2u.be/res/Software.aspx).
- Written a console app just to make ABSOLUTELY sure you didn't miss any.
- Emptied the recycling bin - both of them!
But still the error persists.
Just when I thought I'd exhausted all options it occurred to me that perhaps versions were the culprit. Looking at the version history for a few suspect documents confirmed that they were.
Running the following SQL query will find them all.
SELECT *
FROM AllUserData
WHERE (tp_DirName LIKE '%sitename/Shared Documents%')
AND ((tp_ContentType = 'myDodgyCT'))
ORDER BY tp_DirName
This should be self-explanatory, but tp_DirName is just the relative path from the domain to the library, and tp_ContentType is the explicit name of the content type to search on.
From there I just exported the results to Excel, filtered out duplicates and was left with a workable list of documents. Simply publish a final version of each (if required), then go to the Version History and delete all previous versions.
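If you'd rather skip Excel, the de-duplication step can be sketched in a few lines of Python. This assumes the query results were saved as CSV with tp_DirName and tp_LeafName columns; the sample rows and file names below are made up for illustration.

```python
import csv
import io

# Hypothetical export: one row per stored version, so documents repeat.
SAMPLE_EXPORT = """tp_DirName,tp_LeafName
sitename/Shared Documents,report.doc
sitename/Shared Documents,report.doc
sitename/Shared Documents,plan.xls
"""


def unique_documents(csv_text: str) -> list:
    """Return each document path once, preserving first-seen order."""
    seen = set()
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = "{}/{}".format(row["tp_DirName"], row["tp_LeafName"])
        if path not in seen:
            seen.add(path)
            documents.append(path)
    return documents


for document in unique_documents(SAMPLE_EXPORT):
    print(document)
```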
You could do all this in one step by turning off versioning for the library and then turning it on again. But this would delete versions for ALL documents. A good trick to remember when your site quota is reached.
Good luck!
Labels: Content Types, MOSS 2007
Monday, 15 June 2009
Renaming SharePoint Site Collections (the Secret of Managed Paths)
Everyone knows, or at least should know, that SharePoint sites (formally referred to as Webs) can be easily renamed with either stsadm -o renameweb or through the GUI.
But what about renaming site collections?
Well, the short answer to that question is no. Site collections cannot be renamed, at least not those on managed paths.
e.g. domain1.com/sites/sitecollection_name
Hosted site collections using host headers can be renamed using the stsadm -o renamesite command but it is limited to hosted sites only.
e.g. domain1.com -> domain2.com
Yes, another massive limitation of the current SharePoint version. But the real culprit - and what makes this all the more confounding - is managed paths. If you've never had any real experience with these little beasties then I envy you. But get ready to put your hard hat on because we're going in.
You can create a managed path for a site collection using one of two options - "Explicit inclusion" or "Wildcard inclusion" - but not both! The limitations of this will become obvious soon.
Let's say you want the following structure for your site collections:
domain1.com/groups
domain1.com/groups/team1
- where both groups and team1 are site collections.
This cannot be achieved conventionally. To create a site collection at the managed path of /groups, the path must be set to "Explicit inclusion" - meaning you cannot create site collections below this path.
And if you create the /groups path to use "Wildcard inclusion" then you can only create site collections below this path, i.e. domain1.com/groups/my_sitecollection. Make sense?
This means that site collections can have no distinct hierarchy. Not that this matters given that they are treated as separate entities and current version web parts are unable to query across site collections anyway!
So is there a workaround to both these limitations? I'm glad you asked. ;)
The following refers to explicitly renaming a single site collection which uses "Explicit inclusion" only. Depending on your needs you can use "Wildcard inclusion" for ensuing site collections.
To rename a site collection you need to:
- Back up your existing site collection using stsadm -o backup.
- Create an "Explicit inclusion" managed path, newpath, using the new name you want.
- Restore the site collection to the new URL using stsadm -o restore.
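For the command-line inclined, the three steps can be sketched as a small Python helper that only builds the stsadm command strings (it doesn't run anything). The URLs and backup file name are hypothetical placeholders. Note that stsadm's own name for the explicit path type is explicitinclusion.

```python
def rename_site_collection_commands(old_url: str, new_url: str, backup_file: str) -> list:
    """Build the backup / add path / restore commands, in the order to run them."""
    return [
        # 1. Back up the existing site collection.
        'stsadm -o backup -url "{}" -filename "{}"'.format(old_url, backup_file),
        # 2. Create the explicit managed path for the new name.
        'stsadm -o addpath -url "{}" -type explicitinclusion'.format(new_url),
        # 3. Restore the backup to the new URL.
        'stsadm -o restore -url "{}" -filename "{}"'.format(new_url, backup_file),
    ]


commands = rename_site_collection_commands(
    "http://domain1.com/oldpath",   # hypothetical existing site collection
    "http://domain1.com/newpath",   # hypothetical new URL
    r"C:\backups\oldpath.bak",      # hypothetical backup file
)
for command in commands:
    print(command)
```

Review the printed commands and run them by hand on the server; I'd test against a non-production farm first.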
To create a virtual URL hierarchy you can:
- Create an "Explicit inclusion" managed path of newpath/newsite.
- Create a new site collection using this path.
Yes, it's true! MOSS will allow you to create a managed path in this form, despite the explicit path already defined at /newpath.
Great isn't it? Now, whether this is intended or flawed behaviour, or a good idea or not is another matter entirely. ;)
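To make the rules a bit more concrete, here is a toy Python model of the behaviour described in this post. It is my own simplification, not SharePoint's actual path resolution: an explicit path hosts a site collection at exactly that path, and a wildcard path hosts site collections one level below it.

```python
def can_host_site_collection(url_path: str, explicit_paths: set, wildcard_paths: set) -> bool:
    """Toy check: may a site collection live at url_path under these managed paths?"""
    path = url_path.strip("/")
    if path in explicit_paths:
        return True
    # Wildcard paths only allow site collections directly beneath them.
    parent, _, leaf = path.rpartition("/")
    return bool(leaf) and parent in wildcard_paths


# /groups as an explicit path: the parent works, the child doesn't.
print(can_host_site_collection("groups", {"groups"}, set()))        # True
print(can_host_site_collection("groups/team1", {"groups"}, set()))  # False

# /groups as a wildcard path: the child works, the parent doesn't.
print(can_host_site_collection("groups", set(), {"groups"}))        # False
print(can_host_site_collection("groups/team1", set(), {"groups"}))  # True

# The workaround: two explicit paths, one nested under the other.
both = {"groups", "groups/team1"}
print(can_host_site_collection("groups", both, set()))              # True
print(can_host_site_collection("groups/team1", both, set()))        # True
```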
Either way - good luck! And may version 14 resolve these and many other issues we're plagued with.
Labels: Managed Paths, MOSS 2007, Site Collections
Monday, 17 November 2008
Making MOSS accessible: a lost cause?
As an accessibility evangelist and someone from an open source background who fell head first into this new world of SharePoint and .NET programming, I can tell you that this task is not for the faint-hearted. The effort required to make MOSS even remotely accessible (and I’m not just talking priority 1 and 2 checkpoints) is gargantuan.
To add to all the advice already given elsewhere, the most future-proof steps required to make your WCM sites accessible include:
- Specify an XHTML loose doctype in your custom master. Caveats:
  - You can’t use a strict doctype without losing functionality (because non-compliant code will not render as expected, if at all).
  - You can’t specify a doctype in your system master without losing valuable functionality (Datasheet View and Image Library functions notwithstanding).
- Rewrite all the layout code found in masters and layouts to be compliant. This is about the best start you’ve got. This includes:
  - Using custom CSS wherever possible. Delete built-in style calls and let your base HTML selector styles do the work.
  - Using custom ID selectors for layout to overwrite anything rendered by core.css. This also allows you to add all your skip link anchors and makes using a style switcher that much simpler.
  - Going as far as to remove the csslink tag in your master page if you’re really game and seeing what happens. I managed to reduce all native SharePoint CSS to 101 lines and call it last as an override.
- Don’t use the MS Minimal Master. It fails to include useful (if not essential) placeholders.
- Use Data Form Web Parts wherever you can.
  - Output for these can be controlled with XSL.
  - They don’t have to live within Web Part Zones.
- Visit Heather Solomon’s site and grab her minimal master (infinitely more usable). Her CSS cheat sheets can be invaluable too.
- Develop in Firefox first. Firebug is your new best friend. Then test in IE, use the Developer’s Toolbar, and add any ‘fixes’ to your CSS override.
- Having gone to all this trouble the last thing you need is for your content editors to spoil all your hard work by pasting content from Word!
  - Get Telerik or a similarly standards-compliant editor.
  - Provide a Writing For The Web content editor’s guide which includes simple steps on producing nice, clean, legible copy.
Even after employing all the recommendations made here, at the end of the day you still have to accept certain limitations and realities. We have managed to come out of this exercise with accessible master pages (and some layout pages) but there is little control over content that is rendered at run-time. Everything I’ve found either re-writes the rendered code after the fact or just helps to bloat it in another way.
Many of the controls that make up a page use seriously flawed legacy code. If only all web parts included the XSL editor!
Last time I looked the AKS did little more than add summary="layout" and slightly deplete the concentric ring of nested tables that make up a typical page. And while the Telerik editor _produces_ compliant code, it’s my understanding that it is still rewritten at page load by the ASP render class.
Best piece of advice? Just keep hammering MS on the SharePoint forums and hope that future versions will one day get there. I use a number of aliases. ;)
Labels: accessibility, MOSS 2007, SharePoint, standards
