And the data are

There will be many more posts about my primary research arena (computational astrophysics) over the next few years. For a few months, however, I am pursuing a slightly different track, doing a project in pure observational astronomy between finishing my Master’s and beginning my PhD. Actually seeing stars and galaxies on a daily basis has been refreshing, and I’m working with some exciting data from the Hubble Space Telescope. I’ll make a few short and sweet posts about this work over the next few weeks, this being the first of the series.

I am lucky to have some extremely talented mentors, Professor Carollo and Dr. Cameron at ETH Zurich. I had a short meeting with both of them yesterday. Professor Carollo, having worked with astronomical data for her entire career, had some solid, simple advice that applies to any sort of data-driven scientific endeavor: really get to know your data by looking in detail at 5-10 well-chosen samples.

Over the course of the last month, I’ve been working with a data set containing almost 1500 galaxies. Sure, I’ve looked at a haphazard smattering, especially in the course of testing and refining my analysis scripts, but did I get to know any of the galaxies as if it were a friend or next-door neighbor? As of yesterday, no. So I spent today doing just that, which ended up being fun if laborious work (requiring an invocation of the Pomodoro technique to keep me on track).

I first created a spreadsheet with the most salient attributes, such as the appearance of the galaxies in different observed wavelengths, then picked the most massive galaxies in my sample, and finally proceeded to fill the spreadsheet with such quaint entries as “a bit banana like”, “pancakesque”, “neighbors visible in this band!”, etc. I still have a few more galaxies to get to know tomorrow, but judging from this data point, Know Your Data is certainly a very good maxim to add to one’s repertoire: I already have some new ideas to improve my analysis methods.
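For anyone who wants to try the same exercise, here is a minimal sketch of how the spreadsheet could be set up in Python. It is not my actual script: the toy catalog, the column names (`id`, `stellar_mass`), the band labels, and both helper functions are made up for illustration. It simply picks the most massive entries and writes a CSV with one empty notes column per band, ready to be filled in by eye.

```python
import csv
import io


def pick_most_massive(catalog, n=10, mass_key="stellar_mass"):
    """Return the n catalog rows with the largest mass, most massive first."""
    return sorted(catalog, key=lambda row: float(row[mass_key]), reverse=True)[:n]


def write_notes_template(galaxies, out, bands=("band_1", "band_2"),
                         id_key="id", mass_key="stellar_mass"):
    """Write one row per galaxy, with an empty notes column for each band."""
    fields = [id_key, mass_key] + [f"notes_{band}" for band in bands]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for galaxy in galaxies:
        row = {id_key: galaxy[id_key], mass_key: galaxy[mass_key]}
        row.update({f"notes_{band}": "" for band in bands})
        writer.writerow(row)


# Toy catalog with invented values (hypothetical IDs and masses).
catalog = [
    {"id": "G1", "stellar_mass": "3.2e10"},
    {"id": "G2", "stellar_mass": "1.1e11"},
    {"id": "G3", "stellar_mass": "8.5e9"},
    {"id": "G4", "stellar_mass": "6.0e10"},
]

top = pick_most_massive(catalog, n=2)

# Write to an in-memory buffer here; a real run would open a .csv file instead.
buffer = io.StringIO()
write_notes_template(top, buffer)
print(buffer.getvalue())
```

The notes columns are deliberately free text, since the whole point is to write down whatever strikes you about each galaxy, bananas and pancakes included.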
