Dilettante's Ball: April 2005

Saturday, April 30, 2005 As talented as Hemingway or Faulkner... If we're talking about the drinking

Well, it's official. I am a published author. Granted, I didn't do much; Dan wrote this paper almost singlehandedly, really. I put in a little more work than I did on this paper (this helped), although not much. Any of you who read my rather awkward white paper should immediately recognize that Dan is a much more inspired writer than I.

Still, I'm happy it's out and I'm extremely thankful to have been given the opportunity to work with this group. I was way out of my league.

Friday, April 29, 2005 RSS & Libraries: Ahab's great white whale

If the library blogosphere were your only source of current events within libraries, you would think that RSS was the new MARC record and that podcasting will alleviate the need for circ rules.

This is ridiculous, of course, but that's not to say that there isn't merit to RSS (podcasting seems silly to me). RSS (in the right context) is perfect for libraries. Elsevier has just begun RSS feeds for current awareness searches (this has some problems, as well, but they are not insurmountable). RSS feeds for new acquisitions have potential, as well (although for larger libraries this needs to be filtered a bit... Emory, for example, gets 600 new items a week... put that in your Live Bookmarks, mofo). I know there are libraries that make RSS feeds of their news and announcements. This seems a little self-indulgent (how many people really subscribe to these?), but if they aren't much extra work, why not? I have no stone to throw here, really; I have come up with "Wag the 'Blog", after all.

RSS's real potential lies in computer-to-computer communication, though. All this talk of setting up feeds for our users is mostly noise next to the primary goal, which should be getting our services and collections visible in other sites and interfaces like A9's OpenSearch. What we need right now is marketing and mindshare. Feeds of new items that have gone into the anthropology subject guide are nice and all, but it'd help a lot more if that feed appeared contextually off of a Google search or something.

Lorcan Dempsey makes note of the University of Michigan exporting their reserves lists via RSS to Sakai, their portal/courseware application. This is the perfect application of RSS and, I note, something I put into Reserves Direct almost 2 years ago. Granted, I didn't think this was anything revolutionary or even noteworthy at the time; I just needed a way to get the reserves lists into Blackboard and this was a technology my feeble brain could wrap itself around. Actually, the RSS feed was never even advertised. Since Blackboard 5 has no native RSS capability (neither does Bb 6, for that matter... at least not in the course interface), I used a Perl CGI (now ported to PHP) to write the RSS feed as JavaScript in the page. This was a hack-y solution, but it worked, and that was what was necessary.
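For the curious, the RSS-as-JavaScript trick looks roughly like the sketch below. This is not the Reserves Direct code (that was Perl, then PHP); it is a minimal Python illustration, and the function name rss_to_javascript and the unordered-list markup are assumptions made up for the example. The idea is simply that a script on the server reads the feed and emits document.write() calls, so a page that cannot consume RSS can still pull the list in with a script tag.

```python
import json
import xml.etree.ElementTree as ET
from html import escape


def rss_to_javascript(rss_xml: str) -> str:
    """Turn the items of an RSS 2.0 feed into document.write() calls.

    Hypothetical helper: the output is meant to be served as a script
    that a page includes with a <script src="..."> tag.
    """
    root = ET.fromstring(rss_xml)
    lines = ["document.write('<ul>');"]
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Escape for HTML first, then let json.dumps build a safe
        # JavaScript string literal around the markup.
        html = '<li><a href="{}">{}</a></li>'.format(
            escape(link, quote=True), escape(title)
        )
        lines.append("document.write({});".format(json.dumps(html)))
    lines.append("document.write('</ul>');")
    return "\n".join(lines)
```

Each feed item becomes one document.write() line, so the including page needs nothing beyond a script tag pointing at wherever this output is served.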

I guess what I'm drawn to here is the fact that the electronic reserves software that my employer has chosen to use doesn't support RSS. In fact, there is no alternative to its own interface. At Emory, I was present at a demo for said program and I left thinking, "huh... we can do that...". Two and a half to three years later, the commercial product still boasts the same feature set (which was indeed impressive -- three years ago) but has not evolved a bit. In the same time (and primarily because Jason White took over the main programming duties when I left) Reserves Direct has become an extremely powerful and extraordinarily flexible tool for getting the collection and services to where the user would want it. Namely, contextually among their other course items.

How could our vendors be so out of touch with something so simple and obvious?

Tuesday, April 26, 2005 I was meant for the stage... or Amazon or Google or something

Maybe it's the bachelor's degree in a joke major from a football factory school talking here... but why must libraries constantly overthink themselves into obsolescence?

Why must the only search that brings back relevant data be an "exact search"? And, even then, why can't that search include the leading article?

I realize metasearch isn't the solution to every problem. But it would solve a vast majority of them. And then I look at A9's OpenSearch and I think, "Jesus Christ, what are these momos in the NISO committee doing?"

That's, of course, not fair. OpenSearch will be nowhere near as sophisticated or as robust as NISO Metasearch. And that is, of course, exactly why it will exist everywhere but libraries and NISO Metasearch will exist nowhere but libraries.

Don't get me wrong. I love and embrace the rigidity of the MARC record. That being said, I want both authorities and friendly, loose and sloppy interfaces. I want Voyager and Amazon. I want Web of Knowledge and Google Scholar. And I want to be able to move around between simple and complex interfaces at will.

And as a developer, I want RSS and SRU, not OpenURL 1.0 and Z39.50. I like MODS. Hell, I like Dublin Core. I also like knowing that a good MARC record is living behind it.

Why can't libraries cater to the information semi-literate once in a while?

Saturday, April 16, 2005 Librarians and Lager: a bonding agent

Art Rhyno, Eric Lease Morgan, Jeremy Frumkin and I.

Dan Chudnov, Roy Tennant and I.

Todd Holbrook, Calvin Mah, John Durno and Mark Jordan put up with the 'Murrican Ass that I am.

William Wueppelman put up with that, too.

Jason White, Aaron Krowne, Will Young and I used to go for "free beer" at the Dogwood Brewery (R.I.P).

Bernardo Gomez.

Nathan & Michelle Robertson were good enough to let my drunk ass into their home.

Emily Patrick was, as well.

David Atkins (Pasquale), Cat Cochrane (Honda Civic Wagon thingy), Teresa Braden (apartment, among other places)... Anne Langley (Petron, wedding).

Megan Adams, you and your LSU ilk put up with the worst of my liver in New Orleans. Chicago, as well. No ilk there, though.

Ah, sweet beer. You, friend, are a uniter, not a divider. Maybe W should learn a thing or two from you.

Maybe it's the beer talking.

Oh the irony

So, we launched the WAGger on Wednesday afternoon... sort of. We created a couple of pages for it, and linked to it from a couple of places. A "soft launch", as they say. Still, it was a launch, nevertheless, so we needed to tell our public services librarians about it. We held a brownbag on Thursday at lunch to explain the fact that it was launched (soft launched) to help public services with the implementation.

Here's where the irony comes in. jake was down. But since jake runs on Tomcat, it was still "alive"; it just wasn't returning connections. So, despite my praise from the other day, jake wasn't allowing the WAGger to continue to SFX to display holdings.

Thankfully, the audience seemed to blow this off.

In the near future, expect to see a jake mirror at Georgia Tech.

Wednesday, April 13, 2005 "Evolution favours the pathetic"

--Art Rhyno

Tuesday, April 12, 2005 jake & CUFTS: studies in social frustration

Dan says jake could no longer be updated because of the heavy costs of maintenance.

Todd (or, rather, Mark) says CUFTS can no longer be free because of the heavy costs of maintenance.

They are both (all three of them) right, and it's incredibly sad. Knowledgebases are expensive to maintain.

I use both of these programs in WAG the Dog, despite the fact that I have access to SFX and make heavy use of the SFX API. jake is used in multiple places, and, regardless of the fact that the data has not been updated in years, is invaluable to the project. Before I bog down the SFX API with requests about an object that I know little about, I query jake to see if it is even a journal of some sort. Granted, exact title searches (especially on abbreviated titles) don't really exist in jake, and no journal that has been created in the last couple of years would be included (something I definitely need to address), but it filters out a lot of noise... especially from Google Scholar. Today I realized the value of jake for WAGnet, the resource advisement piece of WAG the Dog. jake is able to find a lot of things (albeit fuzzily) that SFX just isn't able to.

CUFTS is a really nice link resolver. Since SFU (or COPPUL, I'm not sure of the dynamics) has decided to charge for the knowledgebase (something I completely sympathize with), I have exported our live target data from SFX and imported it into CUFTS. Since I'm more interested in using CUFTS for an electronic holdings database rather than a link resolver, all of the links are OpenURLs pointing back to SFX. The nice part of CUFTS is that it gives me a link to the database level as well as the journal and article level. When it comes to resource advisement, this is great. I can query CUFTS for a particular ISSN, find the database it resides in, and present a link saying, "hey, this journal you found appears in XYZ Academic. You might find more related things by searching there".
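To make the "links are OpenURLs pointing back to SFX" bit concrete, here is a minimal sketch in Python. The resolver address SFX_BASE, the sid value and both function names are hypothetical, invented for illustration; only the sid, genre and issn keys come from the OpenURL 0.1 style of linking. Nothing here talks to CUFTS or SFX; it only builds the link and the advisement message.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your own SFX instance.
SFX_BASE = "http://sfx.example.edu/sfx_local"


def journal_openurl(issn: str, sid: str = "example:holdings") -> str:
    """Build an OpenURL 0.1-style journal link that points at the resolver."""
    params = {"sid": sid, "genre": "journal", "issn": issn}
    return SFX_BASE + "?" + urlencode(params)


def database_suggestion(journal_title: str, database_name: str) -> str:
    """Word the advisement message once the containing database is known."""
    return (
        "This journal you found, {}, appears in {}. "
        "You might find more related things by searching there."
    ).format(journal_title, database_name)
```

The database name would come from whatever holdings lookup is available; the sketch leaves that step out on purpose.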

The sad part about CUFTS moving to a subscription model is that I don't have access to the A&I holdings information of a particular database. Yes, this data is in the SFX kb, but our e-resources team isn't going to activate the "getAbstract" service in SFX. Without that, I am not sure I can export it. This would be a great boon for WAGnet, but at the moment I have no clue how to tap this potential.

So, instead, I have to work with old data from jake and figure out ways to link jake's database names to our own.

Watch the maintenance costs increase.

Monday, April 11, 2005 Who will police the police?

The inimitable Art Rhyno and I are working on another project together (W-G?), which consists of several parts:

Refine Art's mind-bending idea of providing a WebDav interface to the OPAC
Provide an OAI provider for Voyager ILSes (Hello? Endeavor?!)
Try to put a real and legitimate Z39.50 (and SRW/U) server (that supports exact searches, relations, etc.) on the catalog
Create a new, useful and user-friendly natural language query interface to the catalog.

Art (Mr. Cocoon) is doing the heavy lifting. He's created a webapp that exports the bib database from Voyager to MODS (and downloads it with wget to a mirror). It is apparently easy (for him, not me) to then transform this output to WebDav.

I only really need the first part (although the second part is very "wow-cool"). We'll take the exported MODS records, put OCLC's OAICat upon it for OAI. Then I'll start building a new interface for the OPAC. The databases stay in synch by wget traversing the cocoon output and checking the timestamps of when a record was modified. Very very neat.
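The timestamp check is the whole trick to keeping the mirror in synch, and it boils down to something like the sketch below. This is an assumption-laden illustration in Python, not what wget or Art's Cocoon webapp actually do internally: the function name records_to_refresh and the plain dictionaries of record id to last-modified time are made up for the example.

```python
from datetime import datetime
from typing import Dict, List


def records_to_refresh(
    remote: Dict[str, datetime], local: Dict[str, datetime]
) -> List[str]:
    """Return ids of records that are new or modified since the last copy.

    remote: record id -> last-modified time reported by the export
    local:  record id -> last-modified time of the mirrored copy
    """
    stale = [
        record_id
        for record_id, modified in remote.items()
        if record_id not in local or modified > local[record_id]
    ]
    return sorted(stale)
```

Anything the function returns gets fetched again; everything else is left alone, which is what keeps a nightly pass over a large catalog cheap.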

We're running into a few problems, though.

Sometimes there are typos/data entry errors on records.

This raises an interesting conundrum. There is often the notion that the catalog is the "authority" for the library. Indeed, this is usually true; however, it is impossible to completely eliminate human error.

When I was developing course/control, I sometimes had a problem with marcxml records not being valid. Almost every single time something was wrong with the MARC record. Of course, cataloging was always happy to fix it, but this shows chinks in our armor. If our metadata is wrong, who will get it right?

That being said, bad data (in this case) isn't the end of the world. I can live with a margin of error if the interface is even moderately better.

Friday, April 08, 2005 Thoughts on OpenURL Autodiscovery

The gcs-pcs list (no idea what it stands for) has been buzzing with activity since Eric Hellman released his "Latent OpenURLs in HTML" page to the world. It's not that Eric's idea is bad or wrong (the opposite, actually). It's just different.

Dan, Jeremy, Richard, Raymond and I have a paper coming out next month in Ariadne and it mainly comes from ideas that were hashed out on the gcs-pcs list. They are not perfect, and we make that point in the paper. In fact, I had mentioned to Dan that we hadn't brought up the issue of "OpenURL version" at all and that this could be problematic in the future.

Eric's format, although obviously informed by the ideas on the list, seemed to come largely out of the blue and without external input in the design process. Again, this isn't necessarily wrong either... I certainly have been known to run with something without asking others for input and then come back with something that may or may not be what the group as a whole expected.

The big difference is that I'm nobody and Eric's not. If he has people interested in this project, that's great. That's what somebodys are good for. And I also like the fact that he's not willing to wait around for a year to get a spec out (he is shooting for May 1st). I certainly don't want a NISO or ALA schedule.

Here's my problem:
OpenURL 1.0 is too freaking complicated to expect people to use.

It's not that I feel that something must be simplistic in order to be implemented or for people to "get it", it's just that there should be varying levels of entry into our collections. For SRW, there's SRU (still nothing I'm going to teach my mom in a day, but progress from Z39.50, certainly) and that should make it easier to link into our catalogs. If OpenURL seems too difficult for your average web hacker to use, they won't. And we'll be left on the sidelines with our little niche technology. Again.

My proposal is to have both formats supported (with the default being version 0.1) much like RSS currently does. State your version in your link and let the resolver work it out. This way, the people who use it can determine the easier way to implement OpenURLs on their site. Let the "market decide", as it were.
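Here is a rough sketch of what "let the resolver work it out" could look like, in Python. It assumes the usual conventions: an OpenURL 1.0 link in key/encoded-value form announces itself with url_ver=Z39.88-2004 and uses prefixed keys such as rft.issn or rft_val_fmt, while a 0.1 link uses bare keys like genre and issn. The function name detect_openurl_version is hypothetical, and a real resolver would need more care than this.

```python
from urllib.parse import parse_qs

# Key prefixes that only show up in OpenURL 1.0 key/encoded-value links.
V1_PREFIXES = ("rft.", "rft_", "rfr_", "rfe_", "ctx_", "url_")


def detect_openurl_version(query_string: str) -> str:
    """Guess the OpenURL version of a query string, defaulting to 0.1."""
    params = parse_qs(query_string)
    # An explicit version statement wins.
    if params.get("url_ver") == ["Z39.88-2004"]:
        return "1.0"
    # Otherwise fall back on the shape of the keys.
    if any(key.startswith(V1_PREFIXES) for key in params):
        return "1.0"
    # No version stated and no 1.0-style keys: treat it as 0.1.
    return "0.1"
```

With the default falling to 0.1, the average web hacker can keep writing short links, and anyone who needs the full 1.0 machinery just says so in the link.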

I do not know the membership of gcs-pcs, but Eric is the only link resolver developer that has weighed in. I'm curious how others feel about the 1.0 vs. 0.1 debate.

Countdown to WAGging

We're planning to (soft) launch WAG the Dog next week on the unsuspecting public. I am a little nervous about this. The WAGger is basically a proof-of-concept and a fun staging ground for developing new ideas. While Tech's user base may be able to roll with the punches of a less-than-stable system, I also don't want to put a bad taste in their mouths before the really useful features are introduced.

WAG the Dog is basically two parts: the WAGger, which parses the page you are looking at to see if there are things on there that could be localized (links that can be proxied, ISSNs, DOIs, etc.), and the (hopefully) soon-to-be-introduced "WAGnet", which takes what you're looking at and tries to find other useful and relevant items in the collection and present them to you. Currently, it takes the LCSH of whatever you're looking at (assuming there's anything to work with) and finds other electronic resources in the same subject heading, presents them, checks to see if they appear in a database, presents that, etc.
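The "parse the page for things that could be localized" step can be sketched as below. This is not the WAGger's actual code; it is a minimal Python illustration, and the names find_identifiers and issn_is_valid, along with the exact regular expressions, are assumptions. The ISSN check digit arithmetic (weights 8 down to 2, modulo 11, with X standing for 10) is the standard one, and it is what keeps random eight-digit strings from being treated as journals.

```python
import re
from typing import Dict, List

ISSN_RE = re.compile(r"\b\d{4}-\d{3}[\dXx]\b")
# Loose DOI pattern: registrant prefix, slash, then anything up to whitespace.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def issn_is_valid(issn: str) -> bool:
    """Check the ISSN check digit (weights 8..2, modulo 11, X means 10)."""
    digits = issn.replace("-", "").upper()
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


def find_identifiers(text: str) -> Dict[str, List[str]]:
    """Pull plausible ISSNs and DOIs out of a chunk of page text."""
    issns = sorted(
        {m.group(0).upper() for m in ISSN_RE.finditer(text)
         if issn_is_valid(m.group(0))}
    )
    # Trim punctuation that tends to trail a DOI in running text.
    dois = sorted({m.group(0).rstrip(".,;)") for m in DOI_RE.finditer(text)})
    return {"issn": issns, "doi": dois}
```

The checksum will not catch everything (some year ranges happen to pass it), so a lookup against a journal list is still worth doing before anything gets sent on to the resolver.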

Next week I hope to also have some sort of folksonomic linkage, as well, so if any of the resources appear, say, in del.icio.us or unalog, get the tags from there, and use those tags on sites like Connotea or CiteULike. The only problem here is that I have no idea if people are socially bookmarking homepages of scholarly content. I guess I'll find out.

Still, it will be interesting to see if the public finds WAG the Dog useful.
