Reflections on an internship

Our latest superstar intern, Max, has just left us to go back to his last year of university at Queen Mary University of London.

https://twitter.com/AnnaPawlicka/status/505408829627326464

He’s been working with us on server automation and on separating out our code base into better modules – heavy work, and with a steep learning curve, but important work for the company. Happily, he’s done a brilliant job. We asked him to write up some thoughts on how he found it, and what he did and didn’t like – and we thought you’d like to read them too.

From the start, I felt very included. Even as an intern I was never left out of anything, and that was my first big impression of the team at Mastodon C. I remember in my first week, when everyone was moving to the meeting room for democake, I said that I had nothing to show. But Anna explained that didn’t matter, and I could just talk about what I had learned.

During my internship, I was given the opportunity to work on many different and interesting tasks, from initially learning the basics of Clojure and getting the chance to do some ClojureScript too, all the way to creating virtual machines to allow certain tasks to be run locally for testing. I learned a new way of thinking with Clojure, which was different from all of my previous OO experience, and using virtual machines locally to run servers was completely new to me too. In the end I managed to conquer setting up a virtual FTP server, which took me far too long!

A lot of my work over the summer involved writing tests for existing code. I had to check whether given inputs would match a schema. However, the team wanted to test these in massive numbers, not just with a couple of hard-coded tests. So I got to learn about generative testing using Clojure’s test.check library. Through this I got the chance to write my own small Clojure library. The team wanted to generate test data from existing schemas built with the Prismatic Schema library. After a lot of researching I couldn’t find anything that would do this well enough, so I had a go at creating my own. It is completely open source and can be found (and added to!) on GitHub. Doing this showed me how friendly and helpful the Clojure community is: when I had problems I could ask questions on specific Google Groups or the more general London Clojurians, as well as various IRC channels.

Now that it has finished, I definitely miss everyone at Mastodon C and I would happily work there again if I get the chance. Thank you so much for a fun, interesting and rewarding time!

Thank you Fran for being a great boss, who I could talk to about any questions I had. Thank you Bruce, for making me into a person who paredit is for. Thank you Neale for helping me with all of my Git mishaps and showing off ridiculous Emacs commands. Thank you Anna for coercing me into going to Clojure Dojos.

Thanks very much Max for spending time with us – it’s been a great experience, and we hope to get you back someday.

Open Health Data platform is launched

We had the big launch this week for the Open Health Data Platform, at Shoreditch Village Hall. It seemed like the audience agreed with us that there was plenty of good stuff to be done with open health data.

We’ve built the platform on behalf of Connected Digital Economy Catapult, along with Error Creative Studio who did the work to make it beautiful. The Platform is really a collection of examples, how-tos, and case studies for using open health data to build useful analytics and applications: everything there, including both code and design, is open source and open access, and intended for people to reuse and remix for their own purposes.

The Catapult are now really keen to take this forward with more examples, and to support health tech entrepreneurs as well as bigger organisations and the public and voluntary sectors to do good things with this data, so if you’re interested in the area you should definitely stay in touch.

 

Network analysis to find ‘innovators’

We’re midway through some work with Nesta to systematically find ‘innovators’ in the UK technology scene. It’s been a really interesting project so far, and we’re looking forward to launching the work more widely in 2014.

The system we’re building is collecting and joining together data from multiple sources about software developers and what they work on, the idea being to spot innovative people, innovative companies, and to understand the tech innovation landscape better than we can just using official information. “Innovative” is a pretty subjective term, but we’ve been exploring ways of identifying innovative individuals and companies by analysing their apparent importance and influence in their professional networks.

Our sources are pretty varied, and aren’t usually used in official tracking. Namely, we’re finding data from:

  • GitHub, a popular code sharing service which many programmers use to store, track, and share their projects, both public and private
  • Stack Overflow, a question and answer site for technical issues
  • Twitter, where a lot of social chat goes on, and
  • OpenCorporates, the open database of the corporate world, which pulls together the official public data available on companies

We’re still working out what the final interface will look like – and would be really interested to hear any thoughts on what would be most useful – but we expect it to be a web-based way to identify and explore innovators and their relationships by region, specialty, and maybe other factors as well. We hope that this will give a way to escape the ‘filter bubble’ of known innovators and start to spot those people and companies who are slightly under the official radar at present.

Here are five of the interesting things we’ve found so far:

1.    As in lots of other social networks, the number of followers of UK GitHub users follows the Pareto (80/20) principle, where 80% of followers are watching just 20% of the total users. This is handy for us, since it means there are a few central users and innovators who are genuinely influential.

2.    Big network analysis is way more computationally intensive than you might guess. Because everybody can be linked to everybody else, there are lots of potential relationships to analyse: even with only 1,000 users, there are around 1,000,000 potential relationships (1,000 × 1,000), so the sums get very big very fast. In fact, there are about 15,000 UK ‘innovators’, plus their friends, who we want to look at. We’ve been working with modern open source tools including Neo4j and Gephi, which are pretty good, but we’re still stretching the limits of what’s practical.

3.    Living in our own filter bubble at a technology company in Shoreditch, it’s easy to imagine that all the innovators work at small technology companies in Shoreditch. In fact, that’s off the mark: we find lots of people working for big corporations, for organisations like the BBC, and for universities.

4.    On the other hand, our filter bubble isn’t too bad: in our random sample of innovators, we also found a few people who we knew and who Nesta knew personally, so we’re pretty sure that we are hitting the right networks of people.

5.    Innovative people seem to have side projects. We’d initially assumed that we could link people with companies just by looking at their websites. In fact, a lot of the innovators we identified have their own personal websites as well as corporate identities – they’re connected to multiple projects and not just their main employer.
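
As a rough illustration of points 1 and 2, here is a minimal Python sketch. It is not the code behind the project: the function names and the toy follower counts are made up for illustration. It counts how many potential relationships exist among a given number of users, and measures what share of all followers belongs to the most-followed 20% of users.

```python
from math import comb


def ordered_pairs(n: int) -> int:
    """Every user paired with every user (themselves included): n * n."""
    return n * n


def distinct_pairs(n: int) -> int:
    """Pairs of two different users, each pair counted once: n choose 2."""
    return comb(n, 2)


def top_share(follower_counts, top_fraction=0.2):
    """Share of all followers held by the top `top_fraction` of users."""
    ranked = sorted(follower_counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))  # size of the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


if __name__ == "__main__":
    for users in (1_000, 15_000):
        print(users, ordered_pairs(users), distinct_pairs(users))

    # Toy data only: one heavily followed user and four lightly followed ones.
    print(top_share([80, 5, 5, 5, 5]))
```

For 1,000 users this gives 1,000,000 pairings (or 499,500 if each pair of different people is counted once), and for 15,000 users it gives 225,000,000, which is why the sums get big so quickly.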

(cross posted from http://www.nesta.org.uk/blogs/)

What and why is #democake?

We use an agile approach in all the work we do here at Mastodon C.

Part of our agile technology development is breaking work up into short, set-length iterations, with clear deadlines at the end of each, which helps us to clarify what we’re doing right now, to plan effectively, and also to have a rhythm of regular checkpoints where we can look up from our keyboards, review progress, and respond to anything that’s changed.

Internally, we run 1-week iterations, which end on a Tuesday afternoon. Then, every Wednesday, we take 20 minutes to prioritise what we want to do over the next week. This gives us a chance to make sure we’re still working on the right things and not forgetting to do important stuff that might be buried by the urgent, but less important things that appear daily.

At the end of the iteration we all show off what we’ve achieved, relax, and bask in a little glory before heading back to work, refreshed and ready to tear into the next iteration. To help us with that celebration, every Tuesday afternoon we also have demo cake.

The heavy responsibility of demo cake acquisition this week fell to Merici. She did well.
