Here's a NY Times article on antiquarian book bibliographies being used as entrepreneurial tools. Interesting use of metadata, no? Librarians can learn something from this. Compiling bibliographies is a service. A service which, in this case, has a niche market willing to spend on the long-tail, and can easily be automated. It's low-hanging fruit for savvy entrepreneurs gathering business opportunities.
Librarians compile bibliographies all the time. How do we sell these services? Using a word like megalist rather than bibliography would help, no doubt. Do we consider the needs of our particular niche market? Do we put those needs in context with the rest of the information universe? I think most librarians think they do. User needs assessment has a long and strong megalist in the LIS realm. Librarians pride themselves on creating collections and designing services based on data gained from studying user behavior, using many methodologies. Yet, use of traditional library services is declining. And, the NY Times article shows us that other people are providing the same services librarians do and
possibly doing it better. Obviously, user needs assessment is not enough. Because,
Users.don't.always.know.what.they.need.
AND
Past.behavior.may.not.necessarily.reflect.future behavior.
User needs assessment must be combined with other types of market research in order to create a value proposition. In plain English: what are you selling? And how are you going to sell it? I'm definitely including business literature in my professional development because I only think I can answer those questions for MPOW. My answers, right now, would only be opinions and conjecture. I need facts for decision-making. A business plan (or several) will help with figuring out which facts to gather to create the most successful new library services.
Labels: metadata
This week's muffin, by special request, Chocolate Chocolate Chip (veg but not vegan. I couldn't find vegan chocolate chips :-( ).
This week's movie: None. We're still catching up with Diane Hillmann's Metadata Standards and Applications.
FYI, No 4M next week. I'll be at ALA yawning my way through a CC:DA meeting.
Labels: 4M, metadata
This week's muffin (vegan as always): Orange Cranberry.
This week's metadata movie: None. We're catching up on the last 2 parts of Diane Hillmann's Metadata Standards and Applications. We got busy. What can we say?
You may have also noticed that the weekly 4M doesn't always happen weekly. You would be very observant. Sometimes stuff pops up. Like Memorial Day. Or, simply me being on work travel or vacation (went to Berkeley, it was great, thanks for asking).
Labels: 4M, metadata
This week's muffin (non-vegan, store bought):
Assorted. Would you bake if it was over 100F and you didn't have AC in the house? Exactly.
This week's movies:
Part 5 of Diane Hillmann's Metadata Standards & Applications: metadata interoperability and distribution
How to build the semantic web using Dublin Core
Labels: 4M, metadata
I have resigned from CC:DA. The meetings at this year's ALA Annual will be my last as a voting committee member. As a new manager I've had less time to deal with the deluge of detail. My attitude since formally resigning last winter is wake-me-when-it's-over. Once RDA is released I'll work with colleagues in my institution to determine whether or not we need to implement it.
Let the triumvirate figure out the business case.
Don't get me wrong. I think RDA is a good thing. There's a good conversation happening at the
Inquiring Librarian about RDA implementation and the LC statement. I agree with Jenn Riley
that RDA is overall a positive thing, and that it represents a necessary (although of course not perfect) step forward in the ongoing evolution of libraries
I'm also with Irvin Flack, who commented on Jenn's post
I want RDA to work but I've decided I'm going to wait for the full final draft before I try to read any more of it. I become too frustrated and confused. I can't afford to lose any more hair! I find myself wondering: why on Earth did they write that rule that way?
A-effing-men!
They wrote RDA by cutting and pasting wholesale portions of AACR2 then re-writing bits -- not a good way to create a whole new means of looking at content standards for cataloging, IMHO. It also introduced a lot of the consistency errors within the text. Then they re-arranged the ordering of the parts and only released certain parts at a time. I found it impossible to keep a cohesive mental model of the drafts. I look forward to the full release. I don't think I'll read it though. Life's too short. I resigned from CC:DA because I don't have time to faithfully review it and contribute to its development anymore. I'd love to follow it, but I need to be practical with my time and my health since I'm
beginning to have problems in that realm. Not to mention
the hernia risk.
I intend to test catalog some things using the electronic version of RDA when available. Let the print version die please! I realize that some small, less funded, libraries will still need to work from a print version hence the JSC's decision to stick with publishing both print+online. But couldn't we write it online and let the people with less money print out customized versions rather than writing it as if we still live in a print-centric world when it comes to "standards" for working with metadata? That could help with the cohesiveness issues in the text.
As a manager of a small cataloging and acquisitions operation I sometimes wonder just how relevant RDA is going to be in our future. I suspect not much. Sorry. I had to talk about the elephant in my room.
Shelf-ready monographs, umpteen thousand title electronic resource packages, open access eBooks, etc. mean that I'll be ingesting more records directly from publishers. And do publishers give a rat's ass about RDA? (see EDItEUR) As for legacy bib records in my OPAC, I predict that somebody will write a MARC/RDA translator and that we'll be automating the migration of records (if it proves necessary, which I believe it may not).
I suspect it will be better for MPOW to play the middle road. Wait until other libraries adopt RDA and see how they do. I've got other priorities right now. MPOW is a specialized research institution. Our metadata services are moving in the direction of assisting with the information management of resources created on campus. Sure we'll always order books and journals but that stuff is going to become more automated as time goes by. RDA is not on my radar as a skill set I need to be training people to have. Understanding metadata formats and interoperability is a bigger concern. Ditto metadata for digital preservation and data curation. I suspect repositories and reference will be our library's life blood. I believe
John Wilbanks was right when he said providing things like namespaces will be the bread and butter of the new-school library. We need to have the skills to do that type of thing or we risk diminished relevancy when our primary clientèle's needs are not being met. And yeah, we need to do the appropriate needs assessment to determine what we prioritize in terms of evolving the library.
I'm filing RDA under nice-to-be-aware-of but not worth following in detail anymore. But that's just me. Your mileage may vary.
Labels: cataloging, CC:DA, life1.0, metadata, RDA
I've been spending a lot of hours of the past two weeks doing a batch import of 5001 ebook records from Literature Online.
It's been quite the educational adventure for me. Although I've been a working librarian for 13 years, I've only been a "cataloger" for the past five. At my former job I only ever had to use cataloging, serials, and, rarely, the acquisitions modules of Innovative Interfaces Millennium. Within each module I only used some of the functions with any regularity.
When I signed on at MPOW, they accepted my caveat that I wasn't a Millennium maven. They were comfortable that I could RTFM, especially since our Innovative coordinator would be handling most of the Millennium sys-admin. I know enough to bootstrap myself. Which I've done. Painfully.
Given the small size of our team in the Metadata Services Group, I've had to take on some more complicated batch imports. I knew how to do a data exchange, no problem. What I didn't know was that globally editing a large bunch of bibliographic records would fill up the transaction file on the server and cause
the.entire.system to crash. And I mean crash. No circulation check-outs, no back-end processing, nothing, nada, zip.
I learned this after doing a global update prior to going to a 2 hour meeting. Guess who got called out of the meeting? It was a bit hairy until I could locate our Millennium coordinator who saved the day by doing a manual back-up of the system.
Huh? A back-up? WTF? Apparently the only way to access a system menu option to clear the transaction file is during the back up dialog. That is stupid. I hope there is some technical reason for this because I think it should be possible to send a command to a server to clear a file without having to back-up (somebody please correct me if I'm way misinformed here). Of course, we don't have command line access to our server. Innovative keeps a tight grip on that type of thing. I can understand why, they probably don't want people to have enough rope with which to hang themselves. Whatever. When we migrate to a different ILS, which is inevitable (there's only two kinds of librarians. Those who've done a migration and those who will), I will insist that our requirements list include full shell access to the system. I know that Millennium lets one use regular expressions but I'm under the impression that access to that is still controlled from the GUI.
〈rant〉 We shouldn't let vendors have so much control over our systems. I recognize that there are situations where it's good for vendors to hold the reins (like small operations with no staff skilled to do the sys-admin). But there is an opportunity cost to the nimbleness of a library that relies on the vendor.〈/rant〉
One could do the global updating of records more quickly and easily with shell access and the right skill set. But, how many cataloging librarians are well versed in regex apart from the code4lib folk? Um, yah. Right.
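For the regex-curious, here is a minimal sketch of what that kind of global update could look like outside the ILS. It is only an illustration, not something run against Millennium: it assumes the records have already been exported and broken into MarcEdit's mnemonic (.mrk) text format, and the sample records, the proxy prefix, and the function name `add_proxy_prefix` are all made up for the example.

```python
import re

# Made-up sample records in MarcEdit's mnemonic (.mrk) text layout:
# "=TAG", two spaces, two indicator characters, then the subfields.
SAMPLE = """\
=245  10$aExample title one$h[electronic resource]
=856  40$uhttp://example.org/lion/123

=245  10$aExample title two$h[electronic resource]
=856  40$uhttp://example.org/lion/456
"""

# Hypothetical proxy prefix; substitute whatever your institution uses.
PROXY = "http://proxy.example.edu/login?url="

# Match the start of an 856 field through "$u", then a URL that does not
# already begin with the proxy prefix.
URL_IN_856 = re.compile(
    r"^(=856  ..\$u)(?!" + re.escape(PROXY) + r")(\S+)",
    re.MULTILINE,
)


def add_proxy_prefix(mrk_text: str) -> str:
    """Prepend PROXY to every 856 $u URL that lacks it."""
    return URL_IN_856.sub(lambda m: m.group(1) + PROXY + m.group(2), mrk_text)


if __name__ == "__main__":
    print(add_proxy_prefix(SAMPLE))
```

Because the pattern skips URLs that already carry the prefix, running the edit twice by mistake should leave the records unchanged, which is handy for learn-as-you-go batch work.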
MarcEdit came to my rescue, once again (I heart Terry Reese). Sorry
Robert, I wanted to use MARC Magician but they were too slow sending me a password for a free trial.
Global updating via MarcEdit is rather painless, once you get the hang of it. Getting the hang of it took me a few hours of messing around, however. The real bitch was doing the data transfer. Word up to my fellow Millennium users - 'tis sometimes better to use Data Exchange natively in Millennium than use records transfer from within MarcEdit. It's the only way to easily make a review file of records transferred.
I had to do several batch imports/deletes
before I got it right.
The #*$@!!% frustrating thing is that there were some global updates that could only be done natively within Millennium. Each run filled up my transaction file ~30%. Three big globals a day and you crash. That sucks. What will we do when bigger bulk imports are needed? Five thousand records is nothing compared to the bulk ingests I foresee in our future (think GoogleBooks, etc.).
Naturally, I didn't want to keep interrupting our Innovative coordinator with requests to do a manual back-up. We have an automatic back-up each day at midnight. So each time the transaction file got 75-80% full I needed to stop for the day and await the magical file emptying before I could continue my learn-as-I-go batch work. Factor in that I had to redo each global a few times as I'd make newbie mistakes. You can understand why doing this took a few hours of my day for the past few weeks.
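The back-of-the-envelope math behind that daily limit can be sketched in a few lines. The numbers are the rough figures from this post (about 30% of the transaction file per big global, stopping at 75-80% full). The function name and the idea of treating each run as a fixed percentage are simplifying assumptions, not anything Millennium reports.

```python
def globals_before_stop(fill_pct_per_global: int, stop_at_pct: int, start_pct: int = 0) -> int:
    """Return how many global updates fit before the file would pass the stop threshold."""
    if fill_pct_per_global <= 0:
        raise ValueError("each global must use a positive share of the file")
    remaining = stop_at_pct - start_pct
    # Whole runs only: a partial global would still freeze records mid-update.
    return max(0, remaining // fill_pct_per_global)


if __name__ == "__main__":
    # Roughly 30% per global, stopping once the file is 75-80% full.
    print(globals_before_stop(30, 80))  # 2
    print(globals_before_stop(30, 75))  # 2
```

Under those assumptions the budget is about two big globals per day between midnight back-ups, which is why the redo-after-a-mistake runs ate so many days.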
The final insult is that if you fill the transaction file in the midst of doing a global, the records which aren't yet updated will freeze. Twice I had this happen. Innovative only allows one to "free records in use" individually. A batch free-ing must be requested via their support ticketing system. And it may take them a day or two to do it. FRUSTRATING!!
There has GOT to be a better way. Really.
Labels: ebooks, III, MarcEdit, metadata, MPOW, what i did today
For every two steps forward, there is the requisite step back.
Last week's two steps forward: the Rockefeller Press announcement (via Issues in Scholarly Communication) and the Harvard Law School joining the Harvard Faculty of Arts and Sciences in unanimously adopting an OA mandate (via same).
Last week's step back: Thomson-ISI puts restrictions on how authors use ResearcherID (via Disruptive Library Technology Jester).
Thomson-ISI isn't high up on my fave vendor list because of their abysmal treatment of ISSNs within Web of Knowledge (don't get me started on the difficulties I encounter administering links to WoK within SFX). To their credit they're working on that. But this ResearcherID thing makes it very obvious how they're developing their market -- they want to lock up author identifiers so only they can create web services with them. They've lost their monopoly on citation analysis now that Google, Scopus etc. are in the game. Makes me think that academic libraries better get on the ball with developing author identifier tools for their repositories and/or institutions. This is something I've been thinking about. I would love to make authority files for each faculty member and research group on campus and OpenID them or some such so that doing bibliographic citation analysis becomes more rationalized.
That's in keeping with a lively discussion the librarians at MPOW had with John Wilbanks of Science Commons during lunch last Monday. Wilbanks talked about the economic issues involved in creating and maintaining namespaces, largely who is to be responsible for long term funding and support. Wilbanks said he believes that this is the type of work where librarians will find their niche as the academy moves towards cyberinfrastructure/eScience what-have-you.
Maybe. There's a big gap between the idea of librarians doing server/database/webby stuff and the reality of the technology skills of librarians on the front lines. I sure as heck don't know how to install and configure a namespace server. There are research and commercial interests which are way ahead of us on providing those types of services. Why should a researcher go to his librarian for help with managing his online identity if ResearcherID-type services already exist?
I don't know how to bridge that gap when it comes to what type of things I should be working on as professional development. Is it worth the energy to bootstrap myself into managing the technology behind semantic-webified authorities? That takes not only time but day-to-day projects with which to practice skills.
And technical services librarians are enmeshed in economically unsustainable models of cataloging and electronic resource maintenance anyway. For example, I've had to fix all the records for Proceedings of the Royal Society at every single place I've ever worked (hey, 300 odd years of title changes and splits makes for hard slogging managing the 78Xs and OpenURLs). This is the forest in which we toil and the trees are fading from view.
My only means of dealing with it is to partner with public service librarians to liaise with researchers, do user needs assessment for cyberinfrastructure services that we're capable of developing and delivering, then develop pilot projects from which to learn the requisite skills.
I fear that this type of work is too little too late for academic librarians. Yet, what choice do we have other than to persevere?
Labels: metadata, namespaces, open access, scholarly communication
MSG's weekly meeting with movies and muffins.
This week's muffin (vegan as always):
Carrot with walnuts & raisins (adapted from The Joy of Cooking)
This week's movies:
Part 3 of Diane Hillmann's Metadata Standards & Applications: Relationship Models
Social bookmarking in plain English
Labels: 4M, metadata
FYI, I'll be attending the annual
Innovative Users Group meeting so I'll be in Washington, D.C. from the evening of 4/26 through 5/1. I'll be at the Hilton, the alternative conference hotel. If anybody wants to hang out over coffee or beer or whatnot feel free to email or tweet.
Speaking of tweets, I'm finding I'm liking twitter more than I anticipated. I'm still not convinced it's not a waste of time. While it's great fun, I haven't yet found anything of work-related value there beyond locating people I follow via blog anyway.
Labels: IUG, metadata
The Metadata Services Group continues its Monday morning meeting ritual of watching metadata related movies while enjoying home-baked muffins!
Today's muffin: Blueberry
Today's movies:
Tim Berners-Lee waxes enthusiastic about the Semantic Web
Metadata Standards and Applications Trainer Screencasts: Part 2: Approaches to Models of Metadata Creation, Storage, and Management
Labels: 4M, metadata
I got me a
twitter account. Whoo. You can follow me at infod1va if you're so inclined. I do this with great hesitation as I'm the type who can get sucked into bulletin boards or groups or lists to the detriment of my work. I took the plunge so I can follow the folks participating on the
semanticlibrary wiki, which was created by
Fiona Bradley of semanticlibrary.net. If you haven't caught this blog yet, you should. Only up since 11/2007 and already it's a "must read" in my aggregator.
The wiki exists to support the goal of putting together an online learning program for librarians who are interested in learning more about semantic web technologies. One of my big goals for 2008 is to gain more hands-on experience with the relevant technologies. Yes I can read XML but transforming it is beyond me. Most of my professional development in things technical is project-based. Meaning, I think of a useful application and then set out to build it. For example, I taught myself javascript back in the '90s by
creating an interactive tutorial on patent searching (please forgive the color scheme. It was the 90s, I was hanging out with ravers, what can I say?). I read a lot but it's not the same as doing something yourself.
There's a good story as to why I have to use infod1va rather than infodiva. I'll tell it when my lawyers give me the go-ahead.
Labels: metadata, semantic web, what i did today
Chris Rusbridge of Digital Curation Blog wishes for the
submission of an open x open x open x open paper to the
4th International Digital Curation Conference. Open as in open authorship, open data-input, open metadata-output, and open access, but not explicitly open source. But, I think it's pretty safe to presume that open source would be desirable too. The conference is, after all, about digital curation. Open source code is in the best interests of digital curationists. And "Radical Sharing" is a key topic of the conference.
It will be fun to watch this one develop should anybody choose to run with it. I wonder what tool one would use to do what Rusbridge suggests. I haven't had much time to play with such tools myself. I've had
Sophie installed on my Mac since the first release and have yet to write something with it. And then there is
CommentPress. My WordPress skills are so sad that I created this blog with Blogger (and yes, I know I could have used wordpress.com). It's too bad
OCS and
OJS don't seem to have any co-authoring tools, near as I can tell from skimming their executive summary documentation. I wonder how well a Sophie or CommentPress authored document would integrate with an OCS or OJS? If it were me writing, I'd probably have everybody use GoogleDocs. It's probably got the lowest barrier of entry in terms of needing tech-savvy to collaborate.
I also begin to wonder which tools are being used by researchers in subject disciplines to create collaboratively authored papers. I suspect that it's still MS Word or Adobe Acrobat and their commenting features or, for more technical disciplines, TeX and LaTeX or some other PostScript derived thingamabob. It would be interesting to do a local inventory of what people use at MPOW -- especially as we migrate to ePrints3 and try to figure out new services to develop in support of our researchers here.
Labels: dcc-2008, metadata, open access, open data, open scholarship, open source
I dislike meetings without a purpose. I'm also of the opinion that meetings which take longer than an hour are probably wasting time. That said, I also think that meetings should happen with more regularity than they are sometimes scheduled. Frequent informal interaction may help us (a) build collegiality and (b) get regular opportunities to share information and (c) learn a few new things. My colleagues within the Metadata Services Group and I have decided to experiment a bit to find which meeting model works best for us.
To that end, we've started meeting weekly, for roughly half an hour, to discuss department business and to do some shared professional development. That means movies! We interpret "movies" loosely and include podcasts/videocasts/screencasts etc. The common thread is that the subject needs to be related to our current or future work. We kicked off this meeting format last week by watching Arlington Heights Memorial Library's
Behind The Scenes - Technical Services (part of their awesome
LibVlog on youtube).
This week we begin exploring technology and filling in gaps in our knowledge about computing and networking
by looking at the inside of a computer. We also begin viewing Diane Hillmann's screen casts from the
LC Metadata Standards and Applications workshops.
After 6 months, we'll evaluate and figure out if we want to keep doing this.
Labels: 4M, metadata, what i did today
My new gig is still at the overwhelming-but-in-a-good-way phase. I've
got jobs open, btw, if you want to work with me. Once I'm a bit less short-staffed I'll be able to blog a bit more as previously promised. I've set myself a goal of posting at least once per week.
I've got a few ideas percolating plus a few longish posts which have been in draft since (eep!) last summer. I've been holding in my snarks about RDA. I've got long ignored notes from DCC to discuss. I've got a few stories about online identity management. And so on....
The big but is that I've been experiencing a lot of health issues. I've recently learned that my heart murmur may be getting worse and I've also got some thyroid funny business happening. Nothing to worry about most likely. It does mean a slew of tests and doctors' appointments, however. All my promises above are predicated on my ability to stay well.
Labels: dcc-2007, life1.0, metadata, RDA
I need to define the scope of this blog. Since I'm not a repository rat at MPOW, calling this blog
repositories for the rest of us seems a bit bungling. I considered changing the title but decided to keep it. I am still interested in repositories. Shift happens. Shift will continue to happen. There is no point in changing the blog title every time my career takes a turn.
The common theme of my professional life has been the intersection of people and digital collections. I have always worked with digital repositories. I still do. The title is malleable enough to handle the vagaries of my career path. Please consider the
"repositories for the" portion of the title to be referring to repositories in the broadest of senses. According to the Oxford English Dictionary, when "repository" is used as a noun it means, "a vessel, receptacle, chamber, etc., in which things are or may be placed, deposited, or stored" (2nd ed., 1989).
Repositories are containers, physical and virtual, in which you put things, stuff, and junk. Repositories is a general enough term for a blog that writes about library and information science stuff.
The
"...rest of us" within the blog title is a bit more difficult to justify. The phrase implies that there is some select group out there classed differently than "us." Who is the "us" in "...the rest of us?" Does having an "us" in the title necessarily mean there has to be a "them" to which "us" is compared and contrasted? Is there a binary opposition? I don't think it's that simple.
When I began this blog about a year ago I was a repository rat in the second wave of institutions creating IRs. The
"...rest of us" meant non-ARL institutions building repositories without big grants or dedicated repository staff. The "them" were early adopters. Now I'm proud to say that I work for "them." The "us" vs. "them" dichotomy doesn't work for me anymore. What group does this blog purport to serve now? Who do repositories exist for, if not for "us?"
I write to get stuff out of my head. To make ideas tangible. To say what I think. There's lots going on in the universe of libraries and librarianship. This blog is my effort to make sense of things I'm working on career-wise. My last blogging effort had more of a purpose -- to provide facts about
RFID in Libraries and my opinions regarding implementations. I don't think I've really found a purpose or a voice for this blog yet. I know it's going to evolve over time. I haven't been writing as much as I would like since starting the new gig. I have been overwhelmed (but in a good way!) with my first management level position.
I don't know who
"...the rest of us" are. It could be pretty much anybody. It could be those of us who seek explanation or instruction in how repositories function. It could be those of us who want to find, identify, select, and obtain information resources from repositories which were not created specifically for us.
"Scholarly communication" seems to refer only to the communication between scholars. The Internet is making academia more accessible to layman scholars. Maybe
"...the rest of us" is those of us who want to participate in learned discourse although we are not tenured faculty. I do know that
"...the rest of us" are folks who support the "Open" movements (access, data, source).
"...rest of us" is vague and uncertain. Librarianship is vague and uncertain so
"...rest of us" can stay. I like alliteration anyway.
Labels: metadata, scope
I'm going to pull a
Mark Lindner and give up on trying to comment on
the Futures report. There's been a huge-o explosion of happenings in the bibliographic wilderness this past week. It's hard enough for me to even get the most relevant-to-me stuff read.
I did attend the
3rd International Conference on Digital Curation and I'll try to summarize my copious notes. I'll be writing a trip report anyway, so it serves a dual purpose. In the meantime, check out what
Peter Murray-Rust and
Chris Rusbridge have to say about it.
Other things worth reviewing and commenting on which I probably won't:
- OAI-ORE alpha specification
- Yee's cataloging rules
- Zotero IA alliance
- Roy Tennant on the term "bibliographic control" (which I've always LOATHED ... it gives me mental images of leather-clad dominatrices demanding all the books be returned to a library)
What's with all this stuff coming out during this season anyway? Holy Toledo people! It's time for holidays. Stop blogging already and go spend time with your families! I wish I could, but the next draft of RDA is going to be released very soon and I need to have it under my belt prior to ALA Midwinter for CC:DA's discussion.
I'm starting to wonder if I even have time to blog at all... how the heck does everybody else manage it? Don't you have lives?
Labels: CC:DA, dcc-2007, LC, life1.0, metadata, Peter Murray-Rust, RDA
The
unedited web cast of the 11/13 meeting of the LC working group is available.
Karen Coyle does a nice summary.
Recommendation 4.2 re: RDA is particularly interesting esp. coming on the heels of the JSC's recent re-re-organization of the RDA structure.
4.2 Realize FRBR. The framework known as FRBR has great potential but so far is untested. It is being used as the basis for RDA, even though FRBR itself is not clearly understood. The working group recommends that no further work be done on RDA until there has been more investigation of FRBR and the basis it provides for bibliographic metadata. [Note: this recommendation is likely to change such that there will be specific recommendations relating to RDA; FRBR will be treated separately.]
The LC report has it dead.bang.on with that recommendation. The connections between FRBR and RDA weren't made explicit until the last revision of the RDA Prospectus and the release of the mapping this past June. The key phrase is "FRBR itself is not clearly understood."
Building a de-facto standard based upon a conceptual model which isn't clearly understood seems kind of bass-ackward. Is it realistic, however, to wait for FRBR to be better understood? We've had it for almost a decade. Let me play devil's advocate for a second. If a conceptual model is difficult to understand then maybe it's not a very good model? It's one argument for a do-over on writing RDA.
I think stopping the RDA process and re-starting from scratch from the conceptual model up would be ideal. Especially if when re-starting the process the JSC continued to consult with non-library related communities to discuss their conceptual models and come to a common, or at least complementary, model(s)
prior to coming up with specifications for the actual metadata. The problem is stopping the RDA process is never going to happen. The publication schedule is driving this train and JSC has no intention of deviating from the proposed 2009 release. Remember that JSC reports to the RDA Committee of Principles, which consists of reps from the national library organizations which have a vested interest in their publishing income. There is no economic incentive for the JSC to call a halt to things and re-think. It's unrealistic to expect that to change and crazy-making to those of us commenting on RDA to keep requesting it. The stop-the-presses option has been discussed more than once at CC:DA and we keep coming back to the same conclusion.
What to do then? As best we can. I am optimistic that RDA may, eventually, get things right. The JSC is getting better about working with other stakeholder groups. See, for example, the
DCMI/RDA task group working on a DC application profile for RDA and a controlled element vocabulary. I think it's conceivable that more partnerships will be developed and that RDA will evolve to be based upon a mutually understood conceptual framework -- most likely a more fully-understood FRBR. It will be interesting to see the full LC working group recommendations when they are released on 12/1. Maybe, given recommendation 4.2 they will suggest another RDA/?? working group on fully exploring the implications of FRBR as a conceptual model and integrating the conceptual models of other communities.
All I really know is that RDA will be an imperfect work-in-progress for quite a long time. We all have to accept that we will have a release-refine-release cycle and that we won't get perfection the first time out. Otherwise we'll wait forever.
Labels: cataloging, CC:DA, FRBR, JSC, LC, metadata, RDA
I'm heading to Boston tomorrow to visit the folks at Ex Libris for some training. Speaking of work travel, that reminds me that I've gotten the go-ahead to attend the
3rd International Conference on Digital Curation with my boss and another colleague.
I will share my conference notes, but perhaps not in raw form.
ALA Midwinter will be the next work travel after dcc-2007. Beyond that, I don't want to think that far ahead.
If you're in the Boston area and want to meet up for a cuppa joe (or better yet, a nice tasty microbrew), give me a holler. Same applies for those of you in D.C. in December.
Labels: data curation, dcc-2007, digital preservation, metadata
I've finally had a chance to read the current issue of D-Lib. If you're involved in science, engineering, or technology librarianship you should NOT miss Anna Gold's two-part article on Cyberinfrastructure, Data, and Libraries (full disclosure: Anna was my boss when we both worked at UCSD's Science and Engineering Library).
Anna gives the best summary of the evolution of eScience/Cyberinfrastructure that I've read. Her bibliography alone is worth the price of admission. Even better, she describes the unique position of academic libraries amongst Cyberinfrastructure stakeholders and provides excellent advice on the new skills librarians should master if they hope to play a role in this emerging area.
This is key. I've often spoken to high level managers in science libraries who think that librarians do not have a role to play in the cyberinfrastructure and that data curation is best done by domain experts. Anna makes the point that "domain expertise may also be needed to provide credible expert help with data management problems or tools." It is indeed true that librarians aren't equipped to deal with petascale storage and high performance computing. It is also true, however, that librarians are highly experienced with the assessment and selection functions inherent in developing "collections," developing standards and building communities of practice, applying metadata, preservation, managing licensing and access rights, and developing discovery services. Librarians know how to help people re-purpose information for multi- and interdisciplinary use. Anna also suggests that library funding models may assist in the ongoing quest for a workable business model for data curation since libraries are accustomed to getting money from a variety of sources.
Go read this. Do not pass go. Do not collect $200.
Labels: cyberinfrastructure, data curation, digital preservation, eScience, metadata, scholarly communication
Announced today by Cheri Folkner, CC:DA chair
I am happy to announce that Patty Hatch has agreed to serve as CC:DA's webmaster through ALA Annual 2010. Patty received her MLS from Simmons College and currently works as an educational technology & communications Specialist at Harvard. Previously she was a senior training librarian in Harvard University Library's Office for Information Systems.
Patty will be working with Christine Taylor of the ALCTS office in transitioning the CC:DA website from Penn State to ALA hosted servers once ALA is ready to host the web pages -- that may be after Midwinter rather than this fall. I am sure that Patty will keep us updated on the timetable and progress of the transition. I have also appointed Patty to CC:DA's TF on internal/external communication.
Although John Attig will be doing some webmaster duties until the transition is complete, my hope is that the transition will ease the workload for him so he won't have to be worrying about the CC:DA website while performing his ALA rep responsibilities. John's work as CC:DA webmaster has been tremendous -- CC:DA is in his debt.
Congratulations Patty! I look forward to working with you via the CC:DA Task Force on Internal and External Communication.
Labels: CC:DA, metadata
I finally have had the opportunity to use
MarcEdit (btw, recently updated to v.5.1!). The mission? Convert EAD to a collection level MARC and import to the ILS. 'Twas relatively simple and pain free.
I've spent a lot of time learning new-to-me things this past summer -- Palm OS, Mac OS and all the FOSSy software I've installed on it. In comparison, MarcEdit has wonderful ease-of-use and self-explanatoriness. Thanks Terry Reese!
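For the curious, here's a rough sketch of what an EAD-to-MARC crosswalk boils down to. To be clear, this is not MarcEdit's stylesheet or its API. It's a toy in Python, and the sample finding aid, the `CROSSWALK` table, and the function names are all my own assumptions for illustration. Real EAD usually carries namespaces and far more elements than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EAD snippet (no namespace), for illustration only.
SAMPLE_EAD = """\
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Jane Doe papers</unittitle>
      <unitdate>1950-1975</unitdate>
      <origination><persname>Doe, Jane</persname></origination>
      <physdesc><extent>12 linear feet</extent></physdesc>
      <abstract>Correspondence and research notes.</abstract>
    </did>
  </archdesc>
</ead>
"""

# Assumed crosswalk: (path under <did>, MARC tag, indicators, subfield code).
CROSSWALK = [
    ("origination/persname", "100", "1 ", "a"),
    ("unittitle", "245", "10", "a"),
    ("unitdate", "245", "10", "f"),
    ("physdesc/extent", "300", "  ", "a"),
    ("abstract", "520", "2 ", "a"),
]


def ead_to_marc_fields(ead_xml):
    """Return a list of (tag, indicators, [(code, value), ...]) tuples."""
    did = ET.fromstring(ead_xml).find("archdesc/did")
    if did is None:
        return []
    fields = {}  # tag -> (indicators, subfields); dicts keep insertion order
    for path, tag, indicators, code in CROSSWALK:
        element = did.find(path)
        if element is None:
            continue
        # Collapse whitespace so pretty-printed XML doesn't leak into the record.
        value = " ".join("".join(element.itertext()).split())
        if not value:
            continue
        fields.setdefault(tag, (indicators, []))[1].append((code, value))
    return [(tag, ind, subs) for tag, (ind, subs) in fields.items()]


def render(fields):
    """Render fields as readable text, showing blank indicators as '#'."""
    lines = []
    for tag, indicators, subfields in fields:
        body = "".join(f"${code}{value}" for code, value in subfields)
        lines.append(f"{tag} {indicators.replace(' ', '#')} {body}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(ead_to_marc_fields(SAMPLE_EAD)))
```

The point of keeping the mapping in a table is that the cataloging decisions (which element lands in which field) stay separate from the plumbing, so a local tweak means editing one row rather than the code.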
Labels: EAD, MARC, MarcEdit, metadata
The NSF and JISC had a joint invitational workshop on repositories in Phoenix, Arizona on April 17-19, 2007. The report, "
The Future of Scholarly Communication: Building the Infrastructure for Cyberscholarship " is now available from the workshop web site at
http://www.sis.pitt.edu/~repwkshop/
Labels: JISC, metadata, NSF, scholarly communication
Today I read the
AHDS Digital Moving Images and Sound Archiving Study.
I've been looking into metadata for the preservation of digital video/multimedia. It's an area of intense interest for me but I haven't had a work-related need to review it since I wrote, "OAIS, METS, MPEG-21 and archival values" for the Spring 2002 issue of The Moving Image. That means I haven't taken a good look at it since early 2001, given peer-review time lag.
I've had a request for consultation about archiving digital video, so I'm taking the opportunity to re-familiarize myself with the latest and greatest. The AHDS Digital Moving Images and Sound Archiving Study was released in August 2006 and it's the most current description of the state-of-the-art that I've found so far. Please do send me pointers to more recently available materials if you're in the know. I figure NDIIPP probably has some recent information too and I'll be sure to look there next.
The AHDS surveyed 92 individuals and organizations working in digital audiovisual preservation. Most respondents indicated that their focus was on practical work flow management issues like data capture, access/dissemination, metadata, and rights management. They conclude that preservation work is at an early stage of development. Much remains to be done.
The big take-aways I get from the article:
- there's no sustainable funding model in place
- organizations and institutions can't do digital preservation alone. There needs to be wider collaboration in order to achieve economies of scale
- there's no consensus on best practices and standards, especially regarding file formats and codecs (the report recommends JPEG 2000 and MXF as the best lossless format and wrapper for digital masters – but they do think there needs to be more research comparing METS and MPEG-21)
- more tools need to be built for automated metadata extraction
- technical obsolescence is still the greatest threat to long-term preservation of A/V materials.
These are pretty much the same conclusions that I took away from DigCCurr2007. No surprises.
It's a bit discouraging to see that the same big issues that existed in 2001 still exist today. It is encouraging, however, that big funding agencies like JISC and the NSF are on the case, and that smart folks like Reagan Moore are building the tools to assist with wide scale distributed collaboration on digital preservation. I suspect that digital video/multimedia preservation is going to be all about the Grid. That's the key research area to watch, I think.
Labels: digital preservation, digital video, metadata, what i did today
metadata.net doesn't list RDA as a resource discovery/description metadata initiative. To be fair, the site hasn't been updated for a year. Also, they have METS, MODS, and CIMI listed, as well as a link to IFLA's list of metadata initiatives, so they're not completely ignorant of librariana.
It's just sort of interesting to ponder the oversight as one considers the relevance of RDA outside of the library world.
metadata.net, fwiw, is run by the MAENAD project, which was (is?) a metadata research project run by the venerable Jane Hunter, a leading researcher in metadata issues for complex digital objects.
Is it an act of hubris to reach out to other communities with the expectation that they will leap to adopt our "standard"?
Labels: metadata, RDA
If you (a) understand cataloging and metadata and (b) have mad web skills then the ALA Cataloging Committee on Description and Access (a.k.a. CC:DA) needs you.
At ALA Annual, CC:DA voted to establish an official web master position. John Attig, our current web master, has become our representative to the JSC. If you are interested, contact me (laura.j.smart on gmail) or Cheri Folkner (cherifolkner at boisestate dot edu), chair of CC:DA.
The web master may not have to attend every ALA conference in order to do the job. This is an important gig because we are porting the current CC:DA web site to ALA managed servers and because we have a Task Force examining all aspects of CC:DA communication. We want the committee to make full use of emerging technologies to better reach our constituents.
Labels: cataloging, CC:DA, metadata
I finally got a moment to review the Peter Murray-Rust presentation. Unfortunately I couldn't do it using my MacBook and Firefox. It would crash my browser each time. It's probably got something to do with the ActiveX, but I haven't had time yet to troubleshoot.
So many apologies to any of my Mac-using audience. I'll post an update/work-around as I get to it. I'm off to Santa Barbara county today with my beautiful wife and daughter so it will need to wait. LTB=life trumps blogging.
Labels: data curation, etd, life1.0, metadata, Peter Murray-Rust
I knew I was going to love my new gig when I returned from the Underground Railroad tour to find out that my library, in partnership with campus Digital Media Services had invited Peter Murray-Rust to give a talk on campus on data driven science and digital repositories.
The Power of the Scientific eThesis is now publicly available for your screencasting pleasure.
Labels: data curation, etd, metadata, Peter Murray-Rust
I want to write about CC:DA comments on RDA Ch.6 and 7, but it will take me eons to pull together the issues. I started the new gig yesterday and have some thoughts akin to Karen's about starting a new job.
In the meantime, I want to point out SciVee, an incredibly cool tool for disseminating scientific information. It makes me incredibly happy to see SDSC's involvement. I first saw scientific visualizations back in 1996 when I was SDSC's librarian. At the time I was responsible for maintaining a bibliography of publications by the center's researchers. The "bib," as it was affectionately known, was kept for NSF reporting purposes. Big grants mean big accountability and the number of peer-reviewed publications is one measurement of a project's success. At the time I felt that it would be really cool to create metaworks of articles with their associated publications, presentations AND raw data. Those metaworks would need to be engendered via metadata for bibliographic families. Ten years later all of these disparate yet related materials are beginning to come together within tools scientists can use. That's just wicked rad!
Even though I'm glad to see this trend, it does make me ponder the role of cataloging and metadata services within academic libraries vs. public libraries. Our audiences are so different and the types of materials we deal with are so different that I wonder if it's a good idea to continue to hold alliances with rules such as RDA. As CC:DA reviews RDA, I'm continually reminded to think about small, rural, public libraries and the needs of those libraries with less funding. Can a standard for bibliographic description work for both the academic handling bibliographic families pulled together on-the-fly AND the small town librarian handling graphic novels?
I have my doubts.
Labels: CC:DA, metadata, RDA, SciVee, SDSC
Haworth recently announced the new
Journal of Library Metadata.
I feel a bit irritated every time a new LIS journal arrives on the scene which doesn't let authors retain full copyright. Haworth, to its credit, is a SHERPA/RoMEO "green publisher," meaning that authors can archive pre-prints and post-prints of the work if they meet certain conditions. In the case of Haworth, those conditions include: the archiving must be on the author's website or author's institutional web site, there should be notice of the publisher's copyright and a citation pointing readers to the published version of the article, and the server upon which the pre/post-print is archived must be non-profit.
Sounds OK. Articles from this new journal will be available, in some form, as Open Access so why am I irritated? I'm not fond of
Haworth's copyright transfer agreement. Authors transfer full copyright to Haworth and retain limited rights of re-use rather than authors retaining their copyright and licensing publication privileges to Haworth.
As a long term strategy, it's not optimal. Authors don't need to sign away full rights to publishers and they shouldn't. It's a nit-picky thing for me. Publishers need permission to make the article available, to archive it/re-purpose to different formats when necessary, etc. It's great that the publishers allow authors to retain rights. I just think that in the very long term, it's not a good practice to let the publishers have it all just because they let authors keep a manifestation of a work on their own server to do what they will.
It's really a question of how much one trusts publishers to share any profit they may make from your work in the long term.
At least the individual subscription price for the
Journal of Library Metadata is a reasonable $48. I still chafe at any type of reader fee for metadata research, given the interoperability issues that face the metadata community. Less affluent libraries should be able to access the research up-front without relying on the individual vagaries of personal archiving practice. Just because Haworth allows authors to archive their articles, doesn't mean that those authors will archive those articles.
Under current practice, the only guaranteed, timely access to the "published" work is via the journal. When there are barriers to that journal, it doesn't serve the LIS community. It's not an easy black/white issue. It does cost money to review and produce the final article and the journal publishers are providing a service. Somebody needs to foot the bill.
We cannot develop new economic models, however, if we continue, as a profession, to support the status quo. I haven't decided yet if I'll read the new journal or write for it. Depends on the content, I suppose. I'm inclined to avoid it, however, and continue patronizing freely available OA journals instead.
Labels: metadata, open access
Diane Hillmann comments on a May 11 post to NGC4LIB by Karen Coyle. Karen says "The problem that we see today in the library world is that when there is a standard that is rising up to the point of being useful and usable by many in our community, it isn't clear where to take it so that it can move from being a neat hack to being a community standard," and suggests that ALA is the obvious body to promote library interests, at least in theory.
Diane asks "given this standards reality check from Karen, what are the implications for us?"
I say the implication for ALA is that the Divisions need to coordinate better on standards. They need to speed up the official channels of communication between committees. The extreme busyness of people contributes to the lack of standards work being done. Nobody wants more work. The other part of it is that we're not making effective use of social tools to do the business of the association. We create more work for ourselves by not using the time-saving new tools. The difficulty is that learning the tools takes time+effort=more work. There's no incentive to change.
That's starting to change (hooray for ALA communities, wikis, and blogs despite their growing pains! hooray for hiring Jenny Levine!). But I still have trouble convincing people to give web-based conferencing a go. The reason it takes months for a committee to write a report is that, even with email, it takes time to send out a doc, get responses, compile responses, synthesize and summarize, check back in with committee members, then take necessary actions.
Come to think of it, a committee probably only recommends actions. Another problem is assigning responsibility for action and following through to make sure it's done!
As a task force chair, I'd much rather have one single real-time discussion with the task force members to gather all comments at once. It's faster. I'd like to spend less time volunteering please. Perhaps if we did better with the social tools, we'd do better with the standards work? ALA already has some channels in place for standards development. I give you the example of CC:DA.
I just submitted a preliminary report from the
CC:DA's Task Force on Internal and External communication. The TF reviewed
CC:DA's charge as well as
"Building international descriptive cataloging standards..." (the promotional "pamphlet" to explain to the masses just what-the-heck CC:DA does).
In the CC:DA charge section of the
"Building international.." document it says:
To develop official ALA positions on proposed international cataloging policies and standards pertaining to the committee’s area of responsibility and to advise the official ALA representative; or, if there is no official ALA representative, to act as the clearinghouse within ALA for review of these policies and standards and to serve as the formal liaison between ALA and the originating organizations.
Most of the committee scope described in "Building international..." is related to the development of AACR and interactions with the JSC. Yet it also says CC:DA's role is to develop official ALA positions on cataloging and related standards. The bullet point quoted above indicates, to me anyway, that CC:DA should be taking a proactive role in standards discussions within ALA. It also means we need to pay attention to the first two words, "to develop." The FBI calls that a clue, son. To develop implies taking action. (smile). I think this action needs to be both internal to ALA and external to other standards bodies. CC:DA has sucked at taking the external-to-ALA actions.
Take a look at the CC:DA roster, for example. Most of the external liaison members are from library or librarian associations. There weren't any non-library bodies represented until Diane Hillmann (for DCMI) and Curtiss Priest (for IEEE) were added.
The "Building international ... standards" document also says that CC:DA welcomes suggestions
*In applying standards for bibliographic control to new and emerging technologies
*In employing automated solutions to the development of descriptive cataloging records.
Yes, CC:DA welcomes suggestions but has really only been taking them from librariankind.
If CC:DA is supposed to do standards work, why hasn't it? The snark in me wants to say that it's because the minutiae of dealing with AACR and MARC take up all of CC:DA's time and probably a forest's worth of paper. To be fair, there is the "pertaining to the committee’s area of responsibility" clause in the "Building international ... standards" document. AACR really is the bulk of CC:DA's area of responsibility as per the written charge. I can understand how we could collectively miss following through on a wee little suggestion to develop positions for ALA beyond AACR/RDA. I don't think it excuses the neglect, however. At CC:DA meetings we really don't discuss standards much beyond AACR/RDA (if we consider that a standard).
Betty Landesman, ALA's NISO rep, gives us a report each Midwinter and Annual, and she announces NISO proposals/votes on the CC:DA email list which gives committee members the opportunity to respond. I've tried to review those and give Betty feedback, but I just couldn't. My life is f.u.l.l. And I have no idea if other CC:DA members, voting or non, give Betty any feedback either. My sense is that nobody does, but you'd have to ask Betty.
I bear some of the blame for this lack of attention to the standards proposals as a voting member of CC:DA. Diane hit the nail on the head when she said the work of standards development doesn't happen, "mostly because we already have busy lives and sometimes our institutions don’t support such activity very well. " The RDA publication process has CC:DA members in a mire of reading/thinking/responding work. Not an excuse for not paying attention to standards. Especially when I hold the radical view that ALA should insist on decoupling RDA development from the Committee of Principles' publication schedule. I can't very well argue that radical stance unless CC:DA members are willing to be proactive in their involvement with the other related standards work.
I think it means that we need to add more people to CC:DA in order to spread the work load around a bit more. I also think it means that the CC:DA TF on Communication really needs to come up with concrete, do-able, alternatives to CC:DA's current methods of disseminating information.
Labels: CC:DA, metadata, RDA, standards
Karen Coyle interviews Diane Hillmann about the outcomes of a recent meeting between the editor of RDA, some members of the Joint Steering Committee for the Revision of RDA,* and other stakeholders.
Diane Hillmann has been a tireless champion encouraging the JSC to work with other metadata communities to develop RDA.
The RDA/DCMI collaboration will include an RDA application profile for DC and a formal element vocabulary. A controlled yet extensible element vocabulary is necessary for describing carriers as per the revised Ch. 3 of RDA. No, I haven't forgotten to write up my notes on that, btw. With any luck, I should get to that today!
*note the new name! for those of you not in the know, the JSC used to be for the revision of AACR
Labels: DCMI, interoperability, metadata, RDA, standards