Academic Libraries: Under-Used & Under-Appreciated

I’m guilty. I often admit this when I meet librarians at conferences and workshops – I’m guilty of never using my librarians as a resource in my 13 years of higher ed, spread across seven academic institutions.  At the very impressive MBL-WHOI Library in Woods Hole, MA, there are quite a few friendly librarians who make their presence known to visitors.  They certainly offered to help me, but it never occurred to me that they might be useful beyond telling me on what floor I could find the journal Limnology and Oceanography.

In hindsight, I didn’t know any better.  Yes, we took the requisite library tour in grad school, and yes, I certainly used the libraries for research and access to books and journals, but no, I never talked to the librarians.  Why is this? I have a few theories:

Librarians are terrible at self-promotion.  Every time I meet a librarian, I’m awed and amazed by the vast quantities of knowledge they hold about all kinds of information.  But most of the librarians I’ve encountered are unwilling to own up to their vast skill set.  These humble folks assume scientists will come to them, completely underestimating the average academic’s stubbornness and propensity for self-sufficiency.  In my opinion, librarians should stake out the popular coffee spot on campus and wear sandwich boards saying things like “You have no idea how to do research” or “Five minutes with me can change your <research> life”.  Come on, librarians – toot your own horns!

Academics are trained to be self-sufficient.  Every grad student has probably gotten the talk from their advisor at some point in their grad education.  In my case the talk had phrases like these:

  • “You don’t have to ask me EVERY time you want to run down to the supply room”
  • “Which method do YOU think would work best?”
  • “How should I know how to dilute that acid? Go figure it out!”

It only takes a couple of brush-offs from your advisor before you realize that part of learning to be a scientist involves solving problems all by yourself.  This bodes well for future academic success, but does not allow us to entertain the idea that librarians might be helpful and save us oodles of time.

Google gives academics a false sense of security. Yes, I spend a lot of time Googling things.  Much of this Googling occurs while having a drink with friends – some hotly debated item of trivia comes up, which requires that we pull out our smart phones to find out who’s right (it’s usually me).  But Google can’t answer everything.  Yes, it’s wonderful for figuring out who that actor in that movie was, or for showing a latecomer the amazing honey badger video.  But Google is not necessarily the most efficient way to go about scholarly research.  Librarians know this – they have entire schools dedicated to figuring out how to deal with information.  The field of information science, which encompasses librarians, gives out graduate degrees in information.  Do you really think that you know more about research than someone with a grad degree in information??  Extremely unlikely.  Learn more about Information Science here.

Stereotype alert: there's a lot of knowledge hiding behind librarians' sensible shoes. From Flickr by Kingston Information & Library Service

This post does, in fact, relate to the DCXL project.  If you weren’t aware, the DCXL project is based out of California Digital Library.  It turns out that librarians are quite good at being stewards of scholarly communication; who better to help us navigate the tricky world of digital data curation than librarians?

This post was inspired by a great blog posted yesterday from CogSci Librarian: How Librarians Can Help in Real Life, at #Sci013, and more

Ontologies and Data

“Ontologies” is one of those words I hear people toss about in conversations about computing, programming, and development. I usually nod and smile, pretending I know exactly what the word means, and how it relates to scientific data. It took some vigorous Google searching and a great discussion with M. Schildhauer of NCEAS before I could say, with confidence, that I kind-of understand the concept of ontologies.

In case you are in the same situation I was a few months ago, allow me to enlighten you.  First, let’s start with the pre-computer era definition: ontology is the study of the nature of existing, the categories of being, and the relationships between these categories. Still not clear? Let’s let Wikipedia explain what the study of ontology entails:

Questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences.

I haven’t thought about the nature of existence since university-level philosophy courses, so this explanation makes my brain ache mildly. Remarkably, the computer science definition for ontology is slightly more tangible (and also sheds light on the descriptions above). In this field, an ontology is a set of concepts that represent the knowledge of a particular field of study (i.e. domain).  It also includes the relationships between the concepts.  Here are examples of some important consequences of a field having an ontology:

  • shared vocabulary and taxonomy
  • explicitly defined concepts
  • the relationships between different concepts

And Wikipedia provides an example that may help clarify things:

Particular meanings of terms applied to that domain are provided by domain ontology. For example the word card has many different meanings. An ontology about the domain of poker would model the “playing card” meaning of the word, while an ontology about the domain of computer hardware would model the “punched card” and “video card” meanings.
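For the programmers in the audience, here is a rough Python sketch of the “card” example. It is only a toy: real ontologies are usually written in dedicated formats rather than Python dictionaries, and the function names and the “is a” relationships below are my own assumptions, made up for illustration.

```python
# Toy, hypothetical ontologies: each domain maps a term to the concepts it
# denotes there, plus "is_a" links between concepts (the links are invented).
ONTOLOGIES = {
    "poker": {
        "terms": {"card": ["playing card"]},
        "is_a": {"playing card": "game piece"},
    },
    "computer hardware": {
        "terms": {"card": ["punched card", "video card"]},
        "is_a": {
            "punched card": "storage medium",
            "video card": "expansion card",
            "expansion card": "hardware component",
        },
    },
}


def meanings(term, domain):
    """Return the concepts a term denotes in a domain (empty list if unknown)."""
    return ONTOLOGIES.get(domain, {}).get("terms", {}).get(term, [])


def broader_concepts(concept, domain):
    """Follow is_a links upward and return the chain of broader concepts."""
    is_a = ONTOLOGIES[domain]["is_a"]
    chain = []
    while concept in is_a:
        concept = is_a[concept]
        chain.append(concept)
    return chain


if __name__ == "__main__":
    for domain in ONTOLOGIES:
        print(domain, "->", meanings("card", domain))
    print(broader_concepts("video card", "computer hardware"))
```

The same word resolves to different concepts depending on the domain, and the relationships let you walk from a specific concept to broader ones. That is roughly what a shared vocabulary plus explicit relationships buys you.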

An important point to make is how vital ontologies are in this era of international collaboration, data deluge, and digital data.  Take the field of genetics. What if every geneticist decided on their own way to describe genes, proteins, and sequences? Furthermore, what if they used words other than “genes”, “proteins”, and “sequences” to describe these things?  It would be incredibly difficult for the field to progress, since no one would be quite sure what anyone else was talking about in their research.  A Gene Ontology has been established within the community to prevent this scenario from taking place.

There is much more to ontologies than standard vocabularies, but this is certainly the easiest ontology concept to grasp.  In terms of the DCXL add-in, ontologies could be used to structure how Excel spreadsheets are formatted and coded to facilitate universal discoverability and usability.  It’s not likely that the first version of the add-in will be able to accommodate a wide range of ontologies (i.e. domain-specific vocabularies), but we hope that future versions might find ways to direct users to standards used in their field of interest.


A map of science from Ontology Explorer: Ontologies can be thought of as maps describing relationships.

NSF Panel Review of Data Management Plans

With the clarity of the New Year, I realized I broke a promise to you DCXL readers… in my post on data policies, I stated that my next post would be about the current state of data management plan evaluation on NSF panels.  Although it is a bit late, here’s that post.

My information is from a couple of different sources: a program officer or two at NSF, a few scientists who have served on panels for several different directorates, and some miscellaneous experts in data management plans.  In general, they all said about the same thing: we are in early days for data management plans as an NSF requirement, and the process is still evolving.  With that in mind, here are a few more specific pieces of information I gathered (note, these should be taken with a grain of salt since this is not the official position of NSF):


Just like Zach Morris' cell phone, data management plans are sure to evolve into something much fancier in a few years. From

  1. The NSF program officer who leads the panel sets the tone for DMP evaluation.  Scientists who serve on the proposal review panels generally are not experts in data management or archiving, and therefore are unsure what to look for in DMPs.
  2. The contents of a data management plan will not tank a proposal unless it is completely absent. Since no one is quite sure what should be in these DMPs, it’s tough to eliminate a good proposal on the basis of its DMP. Overall, DMPs are not currently a part of the merit review process.  One person said it very succinctly:

    PIs received a slap on the wrist if they had a good proposal with a bad DMP. If it was a bad proposal, the bad DMP was just another nail in the coffin.

  3. The panelists are merely trying to determine whether a DMP is “adequate”.  What does this mean? It generally boils down to two criteria: (1) Is the DMP present? and (2) Does the PI discuss how they will archive the data?  Even (2) is up for debate since proposals have made it to the top despite no clear plans for archival, e.g. no mention of where they will archive the data.
  4. Finally, there is buzz about some knowledgeable PIs using DMPs as a strategic tool.  Rather than considering this two-page requirement a burden, they use the DMP as part of their proposal’s narrative.  Food for thought.