The Data Lineup for #ESA2013

Why am I excited about Minneapolis? Potential Prince sightings, of course!

Why am I excited about Minneapolis? Potential Prince sightings, of course! From

In less than  week, the Ecological Society of America’s 2013 Meeting will commence in Minneapolis, MN. There will be zillions of talks and posters on topics ranging from microbes to biomes, along with special sessions on education, outreach, and citizen science. So why am I going?

For starters, I’m a marine ecologist by training, and this is an excuse to meet up with old friends. But of course the bigger draw is to educate my ecological colleagues about all things data: data management planning, open data, data stewardship, archiving and sharing data, et cetera et cetera. Here I provide a rundown of must-see talks, sessions, and workshops related to data. Many of these are tied to the DataONE group and the rOpenSci folks; see DataONE’s activities and rOpenSci’s activities. Follow the full ESA meeting on Twitter at #ESA2013. See you in Minneapolis!

Sunday August 4th

0800-1130 / WK8: Managing Ecological Data for Effective Use and Re-use: A Workshop for Early Career Scientists

For this 3.5 hour workshop, I’ll be part of a DataONE team that includes Amber Budden (DataONE Community Engagement Director), Bill Michener (DataONE PI), Viv Hutchison (USGS), and Tammy Beaty (ORNL). This will be a hands-on workshop for researchers interested in learning about how to better plan for, collect, describe, and preserve their datasets.

1200-1700 / WK15: Conducting Open Science Using R and DataONE: A Hands-on Primer (Open Format)

Matt Jones from NCEAS/DataONE will be assisted by Karthik Ram (UC Berkeley & rOpenSci), Carl Boettiger (UC Davis & rOpenSci), and Mark Schildhauer (NCEAS) to highlight the use of open software tools for conducting open science in ecology, focusing on the interplay between R and DataONE.

Monday August 5th

1015-1130 / SS2: Creating Effective Data Management Plans for Ecological Research

Amber, Bill and I join forces again to talk about how to create data management plans (like those now required by the NSF) using the free online DMPTool. This session is only 1.25 hours long, but we will allow ample time for questions and testing out the tool.

1130-1315 / WK27: Tools for Creating Ecological Metadata: Introduction to Morpho and DataUp

Matt Jones and I will be introducing two free, open-source software tools that can help ecologists describe their datasets with standard metadata. The Morpho tool can be used to locally manage data and upload it to data repositories. The DataUp tool helps researchers not only create metadata, but check for potential problems in their dataset that might inhibit reuse, and upload data to the ONEShare repository.

Tuesday August 6th

0800-1000 / IGN2: Sharing Makes Science Better

This two-hour session organized by Sandra Chung of NEON is composed of 5-minute long “ignite” talks, which guarantees you won’t nod off. The topics look pretty great, and the crackerjack list of presenters includes Ethan White, Ben Morris, Amber Budden, Matt Jones,  Ed Hart, Scott Chamberlain, and Chris Lortie.

1330-1700 / COS41: Education: Research And Assessment

In my presentation at 1410, “The fractured lab notebook: Undergraduates are not learning ecological data management at top US institutions”, I’ll give a brief talk on results from my recent open-access publication with Stephanie Hampton on data management education.

2000-2200 / SS19: Open Science and Ecology

Karthik Ram and I are getting together with Scott Chamberlain (Simon Fraser University & rOpenSci), Carl Boettiger, and Russell Neches (UC Davis) to lead a discussion about open science. Topics will include open data, open workflows and notebooks, open source software, and open hardware.

2000-2200 / SS15: DataNet: Demonstrations of Data Discovery, Access, and Sharing Tools

Amber Budden will demo and discuss DataONE alongside folks from other DataNet projects like the Data Conservancy, SEAD, and Terra Populus.

It’s Time for Better Project Metrics

I’m involved in lots of projects, based at many institutions, with multiple funders and oodles of people involved. Each of these projects has requirements for reporting metrics that are used to prove the project is successful. Here, I want to argue that many of these metrics are arbitrary, and in some cases misleading. I’m not sure what the solution is – but I am anxious for a discussion to start about reporting requirements for funders and institutions, metrics for success, and how we measure a project’s impact.

What are the current requirements for projects to assess success? The most common request is for text-based reports – which are reminiscent of junior high book reports. My colleague here at the CDL, John Kunze, has been working for the UC in some capacity for a long time. If anyone is familiar with the bureaucratic frustrations of metrics, it’s John. Recently he brought me a sticky-note with an acronym he’s hoping will catch on:

SNωωRF: Stuff nobody wants to write, read, or fund

The two lower-case omegas, which translate to “w” for the acronym, represent the letter “O” to facilitate pronunciation –i.e.,  “snorf”. He was prompted to invent this catchy acronym after writing up a report for a collaborative project we work on, based in Europe. After writing the report, he was told it “needed to be longer by two or three pages”. The necessary content was there in the short version – but it wasn’t long enough to look thorough. Clearly brevity is not something that’s rewarded in project reporting.

Which orange dot is bigger? Overall impressions differ from what the measurements say. Measuring and comparing projects doesn't always reflect success. From

Which orange dot is bigger? Overall impressions differ from what the measurements say. Project metrics doesn’t always reflect success. From

Outside of text-based reports, there are other reports and metrics that higher-ups like: number of website hits, number of collaborations, number of conferences attended, number of partners/institutions involved, et cetera. A really successful project can look weak in all these ways. Similarly, a crap project can look quite successful based on the metrics listed. So if there is not a clear correlation between metrics used for project success, and actual project success, why do we measure them?

So what’s the alternative? The simplest alternative – not measuring/reporting metrics – is probably not going to fly with funders, institutions, or organizations. In fact, metrics play an important role. They allow for comparisons among projects, provide targets to strive for, and allow project members to assess progress. Perhaps rather than defaulting to the standard reporting requirements, funders and institutions could instead take some time to consider what success means for a particular project, and customize the metrics based on that.

In the space I operate (data sharing, data management, open science, scholarly publishing etc.) project success is best assessed by whether the project has (1) resulted in new conversations, debates and dialogue, and/or (2) changed the way science is done. Examples of successful projects based on this definition: figshare, ImpactStory, PeerJ, IPython Notebook, and basically anything funded by the Alfred P. Sloan Foundation. Many of these would also pass the success test based on more traditional metrics, but not necessarily. I will avoid making enemies by listing projects that I deem unsuccessful, despite their passing the test based on traditional metrics.

The altmetrics movement is focused on reviewing researcher and research impact in new, interesting ways (see my blog posts on the topic here and here). What would this altmetrics movement look like in terms of projects? I’m not sure, but I know that its time has come.