Dublin’s Data Science Guild

by Humberto Corona - 15 May 2018

How to establish and evolve your data science community

At Zalando, we have many guilds: self-organized groups of people who share interests. The topics, scope, size, and ways of organizing the guilds vary. We have technical guilds like the web or API guilds, local and artistic guilds like the knitting guild in Helsinki, and some guilds that support the growth of people in certain job families, like the Data Science Guild.

For more than two years, I have been co-organizing the Data Science Guild in our tech hub in Dublin, creating a place to share data science knowledge and best practices, and creating a framework that allows the guild to evolve and grow autonomously.

I like to think of guilds like teams or products. This philosophy helps you build the kind of framework you need to run or be part of a guild. It gives the guild a reason to exist, and it can potentially tell you when it is time to pivot or move on to the next thing. When we started the Data Science Guild, it was not perfect (and it isn’t perfect now), but as we were “releasing” a new product, it was important to see if the product was viable: Do people find our talks valuable? Can we guarantee to have content for 70% of the 52 weeks in a year?

Initially, we had really good feedback. Everyone was interested in giving talks, attendance was high, and as expected, a lot of people had ideas on how to make the guild better! A few months after the initial positivity, we started to see some challenges that needed to be addressed if the guild was going to survive. We needed a structure to maintain a constant flow of content (talks, discussions, etc.), and we needed to scale the organization by creating a sense of collective ownership.

Once we saw, after the initial ramp-up, that there was both a need for the Data Science Guild and value in having it, we started operating within the Zalando radical agility framework. We established our mission, “Sharing Data Science (DS) knowledge and experience to expand our DS expertise.” We devised and measured Objectives and Key Results (OKRs), and implemented the collective ownership model.

We focused on three types of content. First, we have internal talks led by any of our data scientists, in which they present a topic in depth. It could be a new library they are using, how to bring a model into production, or the results of the latest A/B test the team ran. This format is also very useful for dry runs of conference talks. Second, we have a “learning club,” a smaller and less formal setting in which we discuss a recent scientific paper, or watch and discuss a video lecture. Finally, we invite researchers from universities to present their work. For now, we invite PhD students, who benefit from getting feedback from our data scientists in different teams and from seeing new opportunities to apply their work in different contexts.

When we implemented the collective ownership model, we iterated a few times. The idea was to give our community the opportunity to shape how the guild works, and to avoid bottlenecks or too few people shouldering too much of the work. At first, we had one person who owned each of the topics: one in charge of the speaker lineup every week, one sending the invites, and one taking care of the budget. Worth saying: that didn’t work. It required a lot of alignment between individuals, which added unnecessary overhead that no one enjoyed.

We settled on a much smaller model, where we have fewer contributors who are part of small committees for half-year periods (aligned with how we set and evaluate our OKRs). The structure looks like this:

  • Our content team designs and maintains a content portfolio that reflects our OKRs. They plan the topics, invite speakers, book rooms and send agenda updates.
  • Our audiovisual (AV) committee is a group of volunteers who know how to operate our not-so-easy-to-use AV system for streaming and recording presentations. Lately, we also have support from IT for this topic, which eases some of the burden.
  • Our social committee is in charge of coordinating the communication with the Data Science Guild in Berlin and running our social events (this involves selecting and buying a mountain of cakes and sweets).

When running the Data Science Guild, the most important aspect to consider is communication. We depend on our colleagues to give talks, lead discussions or invite speakers for our external event series, and I found that people are much more willing to say “yes” to participating when the request comes in person. Face to face, our guild member can spend time explaining exactly what is required, answer any questions the potential speaker has, and set a date in the calendar for the talk. Of course, we then need to give attendees enough notice so they can plan to attend; nothing is worse than spending time preparing a talk to which no one shows up! Finally, we also communicate the evaluation of our objectives and key results to our stakeholders and a wider internal audience in our monthly meeting; that helps form the image and reputation of the guild inside our office.

Being part of the Data Science Guild has been a wonderful experience. There have been plenty of internal and external successes, and more importantly, we created a place where we data scientists come together beyond our day-to-day teams. In the last two years, almost all of us have presented at least once (some much more), we have co-organized and presented our work at two company-wide data science conferences, invited half a dozen PhD students to give talks at Zalando, and built a reputation for openly sharing knowledge and best practices in the data science community in Dublin, resulting in our members being shortlisted for two DatSci Awards in 2017. I believe Zalando is a “good neighbor”: a company where everyone can make a positive impact in their community, whether that is with their team, a guild, or the whole company.

If you want to help the guild model grow, we have community manager positions, and if you see yourself being a regular contributor to our talks, you might consider applying for our data science or research engineering positions.
