Untangled Development

Backend Engineer Interview Script


Dwight Schrute, Assistant to the Regional Manager. Dunder Mifflin Inc. Credit: NBC.

Last month I wrote a post with Django-specific interview questions. That post is about verifying a candidate’s depth of knowledge of Django.

This post is about the less specific questions I ask a candidate before proceeding to the Django-specific ones, if I proceed at all. If the candidate does not have any Django-specific experience, what’s the point?

These questions do not look for specific answers. Rather, the aim is to open up a discussion, so the questions take the form of how or why rather than what.

And no, “How do you describe yourself?” is not one of these questions. If you ever get that question, the only correct answer is the one given by Dwight Schrute, as shown above.

The questions in this article revolve around:

  • improving web application response time
  • designing authentication
  • handling asynchronous work / task queues
  • HTTP REST API design

1. Server response time

Q: In the context of a dynamic “data driven” web application, how would you decrease a page’s response time?

Asking this should allow the candidate to come up with multiple answers. Among the common answers you should note the below for further discussion:

  • caching
  • database query optimisation

Caching

Caching provides a form of “low hanging fruit” to speed up serving a response in web applications.

Q: Where can you cache results in a Web application?

Allow the candidate to go into the multiple levels of caching a Web application offers.

Brownie points for mentioning:

  • caching HTTP responses via HTTP headers, e.g. Cache-Control
  • caching response via web server
  • server-side in memory caching

The Cache-Control HTTP header allows caching resources at various points on the network. See the reference docs on HTTP caching.
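
As a rough illustration of header-based caching, the sketch below attaches a Cache-Control header to a response in a bare WSGI app. The names cache_control and app, and the 300-second max-age, are assumptions made for this example and do not come from any framework.

```python
def cache_control(max_age, public=True):
    """Build a Cache-Control header value such as 'public, max-age=300'."""
    scope = "public" if public else "private"
    return f"{scope}, max-age={max_age}"


def app(environ, start_response):
    """Minimal WSGI app whose response may be cached for five minutes."""
    body = b"<h1>Hello</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Assumed policy: any cache (browser, CDN, proxy) may keep this for 300 s.
        ("Cache-Control", cache_control(300)),
    ]
    start_response("200 OK", headers)
    return [body]
```

A good follow-up with the candidate is when "private" would be the better choice, for example for pages containing user-specific data.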

Web server caching caches resources on the server end of the network. The Nginx cache is a fine example.

Server-side caching using an in-memory cache such as Memcached or Redis is another option.
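
To make the idea concrete, here is a minimal sketch of server-side caching with expiry. A plain dictionary stands in for Memcached or Redis, and the TTLCache class with its method names is hypothetical, written only for this example.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Memcached/Redis)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value if still fresh, else compute and store it."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. right after the underlying data changes."""
        self._store.pop(key, None)
```

The two levers shown here, a time-to-live and explicit invalidation, are the ones worth probing when the conversation turns to freshness.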

Caching can also be enabled at the “database layer”, through “views” or outright database denormalisation. Ideally we do not go there during this interview.

Caching, though, comes with its own challenges.

Q: What are the tradeoffs of caching?

Discuss expiration or “freshness” methods for any of the caching methods the candidate brought up in answer to the previous question.

Q: When would serving stale data be incorrect?

Some scenarios are better suited to caching. Think Wikipedia. Others are not. Think live football scores. How would you optimise the latter?

Segue the discussion into query optimisation.

Database Query Optimisation

Q: What makes a query “slow”?

Any query whose result can be retrieved faster.

Brownie points if the candidate mentions redundant outer joins. A Visual Explanation of SQL Joins by Jeff Atwood provides a great refresher on this topic.

Once the candidate mentions indexes, the question is:

Q: How to decide which columns to index?

Brownie points for mentioning:

  • Column selectivity. For example, a gender column with only a few distinct values is not a good candidate for an index, whereas a unique id column is.
  • The field type. For instance, a numeric integer type is index-friendly. A “text” field type is not, because it results in a large index which takes computing resources to scan.
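
Selectivity can be estimated as the number of distinct values divided by the number of rows. The sketch below measures it on an in-memory SQLite table. The users table, its columns, and the selectivity helper are made up for illustration.

```python
import sqlite3


def selectivity(conn, table, column):
    """Return distinct values / total rows for a column (1.0 means fully unique)."""
    # Identifiers are interpolated directly, so pass trusted names only.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO users (id, gender) VALUES (?, ?)",
    [(i, "f" if i % 2 else "m") for i in range(1, 1001)],
)

id_selectivity = selectivity(conn, "users", "id")          # 1000 distinct / 1000 rows
gender_selectivity = selectivity(conn, "users", "gender")  # 2 distinct / 1000 rows
```

A value close to 1 suggests an index can narrow a lookup to very few rows. A value close to 0 suggests the index would still leave a large share of the table to scan.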

2. Authentication

Q: We have a central monolith web app where user data is stored and managed, and we have several microservices that need to authenticate those users. How would you go about it?

At the time of writing, JWT is the way to go.

Still, any answer involving “tokens” and token-based authentication should be good. Such an approach not only authenticates requests to the microservices above; it is also a great way to authenticate frontend requests.

3. Task queues and async work

Q: Our users need to generate reports on the fly. But an individual report takes several minutes to generate. How would you structure this in a web application?

Allow the candidate to distinguish between the synchronous and the async parts of this feature.

One such way would be:

  • Web request queues task to generate report (synchronous).
  • Report generation task writes file to storage, e.g. AWS S3 (async).
  • Once the task has written the file to storage, it sends out an email with a link to the file (async).

This should give a nice opener to discuss task queues with the candidate.

A lot has been said and written about task queues. I have written an article myself: Do you need a task queue in your web app?

Allow the candidate to explain how the task queue works in this context:

Q: What are the basic components for async task processing in a web context?

The answer should cover these basic building blocks:

  • the web server process “producing” tasks
  • the task queue component managing the tasks’ lifecycle
  • workers “consuming” or “executing” those tasks
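
The three building blocks above can be sketched in a single process with the standard library. Here queue.Queue stands in for a real broker and a dictionary stands in for file storage. All function names are hypothetical, and a real deployment would run workers as separate processes.

```python
import queue
import threading

task_queue = queue.Queue()  # stands in for the broker / task queue
results = {}                # stands in for file storage such as S3


def generate_report(report_id):
    """Placeholder for the slow report generation."""
    return f"report {report_id} contents"


def handle_request(report_id):
    """Producer: runs in the web process and returns immediately."""
    task_queue.put(report_id)
    return {"status": 202, "report_id": report_id}


def worker():
    """Consumer: pulls tasks off the queue and executes them."""
    while True:
        report_id = task_queue.get()
        if report_id is None:  # sentinel value: shut the worker down
            task_queue.task_done()
            break
        results[report_id] = generate_report(report_id)
        task_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
response = handle_request("r-1")  # the web request finishes here
task_queue.join()                 # demo only: wait for the worker to catch up
task_queue.put(None)
thread.join()
```

The point to draw out is that handle_request returns before the report exists. The 202 status tells the client the work was accepted but not yet completed.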

Brownie points for getting the basics right.

Do you think the candidate has several years’ experience dealing with queues? You can dig deeper. How?

David Yanacek at AWS has written on strategies to deal with queue backlog scenarios. This provides some excellent talking points:

In a queue-based system, when processing stops but messages keep arriving, the message debt can accumulate into a large backlog, driving up processing time. Work can be completed too late for the results to be useful, essentially causing the availability hit that queueing was meant to guard against.

Putting it another way, a queue-based system has two modes of operation, or bimodal behavior. When there is no backlog in the queue, the system’s latency is low, and the system is in fast mode. But if a failure or unexpected load pattern causes the arrival rate to exceed the processing rate, it quickly flips into a more sinister operating mode. In this mode, the end-to-end latency grows higher and higher, and it can take a great deal of time to work through the backlog to return to the fast mode.

The above should motivate the next question:

Q: How do you make sure that tasks are being executed quicker than they are being queued?

There is no one correct answer. The point is to let the candidate explore alternatives with you, which is what you’d be doing together if the candidate goes on to join the team.
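
One way to ground the conversation is some back-of-the-envelope arithmetic on arrival rate versus processing rate. The sketch below uses a deliberately simplified model with constant rates and no bursts. The function names and the headroom factor are assumptions for illustration.

```python
import math


def backlog_after(seconds, arrival_rate, service_rate, workers, initial_backlog=0):
    """Queue length after `seconds`, with rates in tasks/second (per worker for service)."""
    net_rate = arrival_rate - service_rate * workers
    return max(0.0, initial_backlog + net_rate * seconds)


def workers_needed(arrival_rate, service_rate, headroom=1.2):
    """Smallest worker count whose capacity is at least arrivals times `headroom`."""
    return math.ceil(arrival_rate * headroom / service_rate)
```

For example, with 10 tasks arriving per second and 4 workers handling 2 tasks per second each, backlog_after(600, 10, 2, 4) reports 1200 queued tasks after ten minutes. Scaling to 6 workers keeps the queue empty in this model.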

Brownie points if the candidate mentions making tasks idempotent, i.e. having the task achieve the same result whether it runs once or x times. This removes a lot of headaches in managing a task queue.

Explore with the candidate how they would go about achieving this.
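
As one possible direction, the sketch below makes the report task from earlier safe to rerun: the output file name is derived from the task's inputs, and the email is sent only once per file. The in-memory dictionary, list and set are stand-ins for real storage and a database, and all names are hypothetical.

```python
storage = {}        # stands in for file storage, keyed by a deterministic name
sent_emails = []    # stands in for an outbound email log
_notified = set()   # report keys we have already emailed about


def report_key(user_id, report_date):
    """Deterministic key: the same request always maps to the same file."""
    return f"reports/{user_id}/{report_date}.csv"


def generate_report_task(user_id, report_date):
    """Safe to run more than once: reruns overwrite the same file and skip the email."""
    key = report_key(user_id, report_date)
    storage[key] = f"report for {user_id} on {report_date}"  # overwrite, never append
    if key not in _notified:
        sent_emails.append((user_id, key))
        _notified.add(key)
    return key
```

In a real system the "already notified" check would need to be atomic, for instance backed by a unique constraint in the database, so that two workers running the same task at once cannot both send the email.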

4. REST API design

Most, if not all, web applications provide some form of REST API backend component. A “rich” frontend consumes this REST API. Such a rich frontend is usually implemented with the “hot” JS framework of the time. At the time of writing, the hottest are arguably React and VueJS.

Candidates submit some code as part of the recruitment process. This coding assignment has the candidates add functionality to an existing REST API, i.e. in the form of new URIs.

Based on the submitted assignment, discuss the choices they made in designing URLs:

  • Endpoint names. Hint: use nouns rather than verbs. And name collections with plural nouns.
  • Endpoint ability to allow filtering, sorting, and pagination.
  • Request and response encoding. Hint: use JSON.
  • Proper usage of HTTP methods or “verbs”.
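
To anchor the discussion on filtering, sorting and pagination, here is a small sketch that parses those options from a collection URL. The parameter names (sort, page, page_size), the leading "-" for descending order, and the allowed sort fields are conventions assumed for this example, not something the assignment prescribes.

```python
from urllib.parse import parse_qs, urlsplit

ALLOWED_SORT_FIELDS = {"title", "created"}  # hypothetical whitelist
MAX_PAGE_SIZE = 100
RESERVED = {"sort", "page", "page_size"}


def parse_list_query(url):
    """Extract filter, sort and pagination options from a collection URL."""
    params = parse_qs(urlsplit(url).query)

    def first(name, default):
        return params.get(name, [default])[0]

    sort = first("sort", "created")
    descending = sort.startswith("-")
    field = sort.lstrip("-")
    if field not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"cannot sort by {field!r}")

    # Clamp pagination so a client cannot request an unbounded page.
    page = max(1, int(first("page", "1")))
    page_size = min(MAX_PAGE_SIZE, max(1, int(first("page_size", "20"))))
    filters = {k: v[0] for k, v in params.items() if k not in RESERVED}
    return {
        "filters": filters,
        "sort": field,
        "descending": descending,
        "page": page,
        "page_size": page_size,
    }
```

A request such as GET /articles?author=kim&sort=-title&page=2 keeps the endpoint a plural noun and pushes everything else into the query string, which is the shape to look for in the candidate's submission.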

Couple of excellent resources on REST API design best practices:

To Conclude

The “brownie points” above pinpoint some “good answers”. If the candidate does not mention those, it’s not a problem.

We are humans, not search engines. We forget stuff. Especially during interviews.

Keep in mind the candidate can answer questions well even when the answer is not the one you expect.

You can use the comments box below to let me know:

  • about anything I’m missing
  • how I can improve this interview script

May the interviews you’re involved in lead to a better life!
