04/01/2018

The engineering team at Smartology consists of software and systems engineers and the CTO, all based in Southwark in London. We are responsible for all back-end systems, including the machine learning algorithms used for content matching and brand safety, the real-time bidding systems for programmatic ad-serving, and the ad server itself.

The team’s experience ranges from graduates who joined the company straight from university to senior developers with 20+ years’ experience. As an agile team, we prioritise business value and flexibility over strict rules and processes. We work in short, one-week sprints and have a daily stand-up to share progress. We have a retrospective once every two weeks and a very short planning meeting once per week. We have a short backlog grooming session most mornings where we discuss upcoming tasks and estimate their sizes as a team. Apart from this, meetings are rare, meaning that we average approximately 35 hours of real engineering work every week.

 

Technologies

We use the following technologies daily:

  • Java
  • Spock
  • Gradle
  • Jenkins
  • MySQL

Almost all of our infrastructure is provided by AWS:

  • EC2 – Hosts our older applications as well as the performance-critical ad server.
  • S3 – Storage of the assets that are used in our adverts (images, JavaScript) as well as long-term backups.
  • Lambda – Read about how we use Lambdas in a blog post by Jason Marden.
  • API Gateway – Used in combination with Lambdas to create serverless APIs.
  • EMR – Runs our machine learning algorithms against the millions of publisher articles we hold.

Other AWS products we use include CloudFormation, CloudFront, DynamoDB, ElastiCache, ELB, Kinesis, RDS, Route 53, SES, SNS, SQS, and VPC. Our log files are pushed to Logz.io, a cloud-hosted ELK-based log management solution, which is becoming ever more essential as we move towards serverless applications.

 

Challenges

There are a number of challenges we face:

  • Scraping, semantically profiling, and storing an average of 50,000 publisher articles per day from over 50 publishers.
  • Responding to thousands of bid requests per second, each within 40 milliseconds. If we miss this deadline, our bids will be ignored and we will serve no adverts.
  • Serving millions of impressions per day, while accurately tracking user engagements.
  • Fine-tuning our machine learning algorithms to ensure that we display relevant content, while protecting our clients’ reputation by filtering articles on sensitive topics.
  • Managing infrastructure across three continents, with the ability to scale quickly to meet demand, all while keeping our costs as low as possible.
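
To illustrate the 40 millisecond bid deadline in the list above, here is a minimal Java sketch of a time-boxed bid computation. It is not our production bidder: the class name `DeadlineBidder`, the method `bidWithin`, and the representation of a bid as a `Double` are all assumptions made for the example. The idea is simply that a bid that cannot be computed in time is treated as "no bid" rather than sent late.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical sketch: names and types are illustrative, not a real bidder API. */
public class DeadlineBidder {
    // Response budget taken from the text: 40 milliseconds per bid request.
    static final long DEADLINE_MILLIS = 40;

    /**
     * Runs the supplied bid computation asynchronously and waits at most
     * deadlineMillis for it. Returns an empty Optional (no bid) if the
     * computation times out, fails, or the waiting thread is interrupted.
     */
    public static Optional<Double> bidWithin(Supplier<Double> computeBid, long deadlineMillis) {
        CompletableFuture<Double> future = CompletableFuture.supplyAsync(computeBid);
        try {
            return Optional.ofNullable(future.get(deadlineMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Marks the future as cancelled; this does not interrupt the running task.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } catch (ExecutionException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A made-up bid price, used only to exercise the method.
        Optional<Double> bid = bidWithin(() -> 1.25, DEADLINE_MILLIS);
        System.out.println(bid.isPresent() ? "Bid: " + bid.get() : "No bid");
    }
}
```

In a real system the deadline would also need to cover network time and serialisation, so the budget available to the computation itself would be smaller than 40 milliseconds.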

Despite these challenges, we have a strong focus on quality. We use test-driven development wherever possible and ensure that all code is covered by unit and integration tests. We generally work in pairs, which we have found to be more effective at preventing bugs than code reviews. Our ad server undergoes performance testing before any production release; if it does not pass the performance test, the release does not happen.

Our deployment process is completely automated, so we aim to deploy to production as often as possible; we made 66 test and 41 production deployments in the past month. We avoid production deployments after 2pm and on Fridays. This gives us plenty of time to spot any problems and minimises the risk of anyone needing to work late or at the weekend.
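
The deployment rule above is simple enough to express in code. The following Java sketch shows one way such a check could look; the class `DeployWindow` and the method `isAllowed` are hypothetical names for the example, not part of our pipeline. It assumes that a deployment at exactly 2pm counts as too late, and it encodes only the two rules stated above.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;

/** Hypothetical guard for a deployment pipeline; names are illustrative only. */
public class DeployWindow {
    // Cut-off taken from the text: no production deployments after 2pm.
    private static final LocalTime CUT_OFF = LocalTime.of(14, 0);

    /**
     * Returns true if a production deployment may start at the given local time.
     * Assumption: exactly 14:00 is treated as past the cut-off.
     */
    public static boolean isAllowed(LocalDateTime now) {
        boolean isFriday = now.getDayOfWeek() == DayOfWeek.FRIDAY;
        boolean beforeCutOff = now.toLocalTime().isBefore(CUT_OFF);
        return !isFriday && beforeCutOff;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        System.out.println(isAllowed(now) ? "Deployment allowed" : "Deployment blocked");
    }
}
```

A check like this could run as the first step of a production deployment job, so that the rule is enforced by the pipeline rather than by memory.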

 

Projects

The major projects we have worked on this year are:

  • Brand safety R&D – Improvements to our machine learning algorithms that prevent ads being served next to blacklisted content. Our biggest challenges have been collecting enough labelled training data efficiently and keeping it sufficiently up to date that our classifiers can recognise newly emerging topics.
  • AdX integration – This was our first programmatic integration.
  • Index Exchange integration – This allowed us to run programmatic campaigns on more publishers than before.
  • New tracking pipeline – We can now track impressions and clicks in near real-time. The entire stack was designed with scalability in mind, meaning we can now process hundreds of millions of impressions per day.
  • API Gateway migration – Many of our existing APIs run on their own EC2 servers. Moving these to API Gateway saves money, both in the direct cost of the servers and in the effort needed to maintain them.
  • As well as these, we are constantly working to improve our internal reporting dashboard and admin console.

 

 

2018

Some of the work for 2018 will include:

  • More programmatic exchange integrations – This will allow us to work with even more publishers.
  • Admin console 2.0 – This will streamline the process of setting up campaigns.
  • Bidding algorithm improvements – We will be optimising our bidding algorithms to offer more flexibility in our campaign delivery.

 

We’re always looking to hire talented engineers. If you are interested in joining the team, please send your CV and a covering letter to careers@smartology.net.
