Oceanographers: Why So Shy?

Last week I attended the TOS/ASLO/AGU Ocean Sciences 2012 Meeting in Salt Lake City. (If you are a DCXL blog regular, you know I was also at the Personal Digital Archiving 2012 Conference last week: my ears were bleeding by Friday night!)  These two conferences were starkly different in many ways.  Ocean Sciences had about 4,000 attendees, while PDA was closer to 100.  Ocean Sciences had concurrent sessions, plenaries, and workshops, while PDA had only one room where all of the speakers presented.  Although both offered refreshments during breaks, PDA’s coffee and treats far surpassed those provided at the Salt Palace.  But the most interesting difference? The incorporation of social media into the conference.

There are some amazing blogs out there for ocean scientists: Deep Sea News and SeaMonster come to mind immediately.  There is also a plethora of active tweeters and bloggers in the ocean sciences community, including @labroides, @jebyrnes (and his blog), @MiriamGoldste, @RockyRohde, @JohnFBruno, @kzelnio, @SFriedScientist, @rejectedbanana, @DrCraigMc, @rmacpherson, and @Dr_Bik.  (I’m sure I’ve left some great ones out; feel free to tweet me and let me know: @carlystrasser.)

That being said, ocean scientists stink at social media if OS 2012 was any indication.

First, the Ocean Sciences Meeting did not declare a hash tag – this is the first major conference I’ve been to in a while that didn’t do so.  What does this mean?  Those of us who were trying to communicate about OS 2012 via Twitter were not able to converge under a single hash tag until Tuesday (#oceans2012). Perhaps that isn’t such a big deal, since there were only a dozen Tweeters at the conference.  That number is unusually low for a conference of this size: at AGU 2011 in December, I would hazard a guess that there were more like 200 Tweeters. Food for thought.

Second, I heard from @MiriamGoldste that there was actual, audible clapping when disparaging comments were made about social media in one of the presentations. For shame, oceanographers!  You should take advantage of the tools offered to you; even if you don’t use social media yourself, you should at least recognize its growing importance in science (read some of the linked articles below).

Now for PDA 2012. A hash tag was declared (#pda12) and about two dozen active tweeters were off and running.  We had dialogues during the conference, helped answer each other’s questions, commented on speakers’ major conclusions, and generally kept those who couldn’t attend the conference in person abreast of the goings-on.  Combine that with real-time blogging of the meeting, and you had a recipe for being connected whether you were sitting in a pew at the Internet Archive or not.  Links were tweeted to newly-posted slides, and generally there was a buzz about the conference.

So listen up, OS 2012 attendees: You are being left in the dust by other scientists who have embraced social media.  I know what you are thinking: “I don’t have time to do all of that stuff!”  One of the conference tweets says it best:

More information…

Read this great post from Scientific American on Social Media for Scientists

COMPASS: Communication partnership for science and the sea. I attended a COMPASS workshop two years ago at NCEAS and was swayed by the lovely Liz Neeley that social media was not only worth my time, but it could advance my career (read “Highly tweeted articles were 11x more likely to be cited” from The Atlantic).

Generally all of the resources on the Social Media For Scientists wikispace

Social Media for Scientists Recap from American Fisheries Society blog

As for how social media relates to the DCXL project, isn’t it obvious? I’ve been collecting feedback straight from potential DCXL users using social media.  Because I have tapped into these networks, the DCXL project’s outcomes are likely to be useful for a large contingent of our target audience.

It seems that oceanographers are stuck in the olden days of communication. For those keeping count, that's TWO DCXL blog references to Zach Morris' cell phone. From www.funnyordie.com


Archiving Your Life: PDA 2012 Meeting

I’m currently sitting in a church.  No, I’m not being disrespectful and blogging while at church.  Technically, I’m in a former church, in the Richmond District of San Francisco.  The Internet Archive bought an old church and turned it into an amazing space for their operation, as well as for meetings like the 2012 Personal Digital Archiving Meeting I’m currently attending.

I wasn’t sure what “personal digital archiving” meant, exactly, before I heard about this conference.  It turns out the concept is very familiar to me.  It’s basically thinking about how to preserve your life’s digital content – photos, emails, writings, files, scanned images, etc.  The concept of archiving personal materials is a very hot topic right now.  Think about Facebook, Storify, iCloud, WordPress, and Flickr, to name a few.  As a scientist, I actually think of my data as personal digital files: they represent a very long period of my life, after all.  So I’m at this meeting talking a bit about DCXL, and also learning a lot about some amazing new stuff.  Here are a few interesting tidbits:

Cowbird: This is a place for people to tell stories, rather than just archive their lives.  According to the founder (who is attending this conference), Cowbird is about the experience of life, as opposed to merely curating life. For an amazing, moving example of how Cowbird works, check this out: First Love

The Brain: Very cool, free software that helps you organize links, definitions, notes, etc.  The idea is that it works just like your brain: it makes connections and creates networks to provide meaning to each link.  Play with it a bit and you will be hooked.

Pinboard: Technically, I already knew about Pinboard. But the founder of the bookmarking system gave a great talk, so I’m including it here.  Pinboard has been described as how the bookmarking service Delicious used to work, before it stopped working well.  For a very small fee (~$10) you can store your bookmarks, tag them, and even save copies of the web pages as they were when you viewed them. This comes in particularly handy if you use a website for research and it might mysteriously disappear without warning.  My favorite thing about Pinboard is that it isn’t mucked up with ads and other visual distractions.

The church meant for worship of all things digital: The Internet Archive. From Flickr by evan_carroll

What’s the Deal with .xlsx?

A few years back, Microsoft Excel started automatically saving my spreadsheet files with the extension .xlsx.  I first noticed it when I got a new laptop for my postdoc at University of Alberta.  Suddenly, I had to be cognizant of the fact that if I left Excel to its own devices, the spreadsheets I generated would not be readable on my home computer equipped with an older version of Excel.

First, let’s cover exactly what that extra “x” is for. The additional “x” in Excel file extensions stands for XML.  XML is Extensible Markup Language, which is a markup language useful for data, databases, and data-related applications.  The file type .xlsx is a combination of XML architecture and ZIP compression for size reduction.  Here’s a succinct summary from mrexcel.com:

If you’ve ever looked at the “View Source” view of a webpage in Notepad, you are familiar with the structure of XML. While HTML allows for certain tags, like TABLE, BODY, TR, TD, XML allows for any tags. You can make up any sort of a tag to describe your data.

You can also check out Microsoft’s description of XML in Excel.  What all of this means is that .xlsx files are more generalized and easier to use with web-based applications.  It’s a good thing!
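To make the "XML plus ZIP" idea concrete, here is a minimal Python sketch that treats a workbook as an ordinary ZIP archive and peeks at the XML inside.  It uses only the standard library.  The function names and the tiny stand-in archive are my own assumptions for illustration: the stand-in is not a valid Excel workbook, just two small XML files zipped together so the example runs without a real spreadsheet on hand.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_parts(xlsx_file):
    """Return the names of the files packed inside an .xlsx-style ZIP archive."""
    with zipfile.ZipFile(xlsx_file) as archive:
        return archive.namelist()


def read_root_tag(xlsx_file, part_name):
    """Parse one XML part from the archive and return the tag of its root element."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(part_name))
    return root.tag


def make_stand_in_workbook():
    """Build a hypothetical, simplified stand-in archive in memory.

    This is NOT a valid Excel file. A real workbook contains more parts,
    and its XML elements carry namespaces.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("xl/workbook.xml", "<workbook><sheets/></workbook>")
        archive.writestr(
            "xl/worksheets/sheet1.xml", "<worksheet><sheetData/></worksheet>"
        )
    return buffer


if __name__ == "__main__":
    workbook = make_stand_in_workbook()
    for name in list_parts(workbook):
        print(name, "->", read_root_tag(workbook, name))
```

To try this on one of your own spreadsheets, pass its file path to list_parts in place of the in-memory stand-in.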

beatles album cover

Just like John and Paul, XML and Excel come together to make beautiful things happen.

You might be asking yourself why I’m writing about .xlsx.  Isn’t this an old issue that folks have figured out by now?  The answer to that is yes and no. Many of the scientists I have spoken with over the last few months are entrenched in their current Excel version, and have major complaints about moving to newer versions.  Excel 2003 (2004 for Mac), which predates the .xlsx file type, is still heavily used among some groups. Other scientists have moved on to later versions of Excel, but still have colleagues, advisors, or collaborators who use older versions and therefore cannot open the .xlsx file type.  So while many scientists can tell you they have noticed the new extension on their Excel files, they don’t understand the underlying changes.

Of course, you can tell Excel to generate and save files in the old .xls format by going to “Excel Options… Save” and changing your settings so files are saved as .xls:

Or on a Mac, the “Preferences… Compatibility” menu:

Google Refine: An Interesting Take on Data Organization

“A powerful tool for working with messy data.”  This is the tag line for Google Refine, a web-based application that can be used to manipulate and clean up data sets.  Google acquired Freebase Gridworks (originally developed by Metaweb Technologies, Inc.) back in 2010 and re-branded the application as Google Refine.

I certainly don’t claim to be an expert on exactly how Google Refine works, but it has great potential.  You download the application, which works through a browser.  The idea is that you upload your spreadsheet or download it from the web from within Google Refine.  You can then manipulate your data, remove duplicates, rename cell entries in bulk, etc.  The underlying code is available and it appears that developers are encouraged to participate.  Alternatively, if you are generally fearful of code, Google Refine “protects users from all that nasty command line stuff,” as my smart friend Karthik says.
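For readers who do like code, here is a rough idea of what two of those cleanup steps (renaming cell entries in bulk and removing duplicates) amount to.  This is a plain-Python sketch, not Google Refine's own code or API; the function names, column names, and sample rows are all made up for illustration.

```python
def rename_in_bulk(rows, column, replacements):
    """Return new rows with each value in `column` swapped for its replacement.

    Values with no entry in `replacements` are left unchanged.
    """
    return [
        {**row, column: replacements.get(row[column], row[column])}
        for row in rows
    ]


def remove_duplicates(rows):
    """Return rows with exact duplicates dropped, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data: the same species entered three different ways.
rows = [
    {"species": "M. edulis", "site": "A"},
    {"species": "Mytilus edulis", "site": "A"},
    {"species": "mytilus edulis", "site": "B"},
]

# Hypothetical mapping from messy spellings to one preferred spelling.
replacements = {
    "M. edulis": "Mytilus edulis",
    "mytilus edulis": "Mytilus edulis",
}

cleaned = remove_duplicates(rename_in_bulk(rows, "species", replacements))

if __name__ == "__main__":
    for row in cleaned:
        print(row)
```

Renaming comes first on purpose: the first two rows only become duplicates once their spellings agree.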

The trajectory of the DCXL project is still in flux, but I can say with certainty that Google Refine is a pretty great web-based application we can aspire to learn from in the course of our development.  Just yesterday the blog iPhylo had a great post about using Google Refine along with taxonomic databases.  This is one of the features we would like to incorporate into the DCXL project, so it’s great to hear that others have been hammering away at the problem of linking controlled vocabularies and data sets.

Want to know a bit more? Here’s Google’s blog entry about Google Refine.  FlowingData also posted a blog entry about Google Refine, which is where I first heard of it.  Freebase (which appears to be some iteration of Metaweb Technologies Inc.) has a Twitter feed that mentions Google Refine quite a bit at @fbase.

And in keeping with the organization theme of this post, here are some links to one of my latest artist crushes: Ursus Wehrli.  He’s the embodiment of organization, in beautiful art form.  One of his photographs is below, but check out his TED Talk, this Visual News post about him, or Google image search him for more amazing visuals.

If you love organizing as much as I do, check out the artist Ursus Wehrli. He tidies up in amazing, artsy ways. From Flickr by Lawrence