👾
Interview Journey Episode 1

In this article, I will share my experience with a home assignment, my thoughts, and some useful tips.

Introduction

Recently, I realized that my current company offers limited opportunities for technical growth. Most of the work is already done, and there are no new projects, prototypes, or migrations planned soon. Staying at the same company with the same scope of tasks could slow down my growth as a Data Engineer. So, this article marks the beginning of my interview journey in 2025. I received a home assignment from a company (I can't share the name 🤐). In my opinion, the home assignment format is highly convenient for candidates. Unlike live coding interviews, it allows you to think through the solution, refine it, and work at your own pace without stress.

General Information

I had 4 days to complete this home assignment, which was enough time to develop a simple solution that meets all requirements. Let's start with the task description, which was well written and detailed. In short, I needed to write code that processes geospatial data and then optimize its performance.

ℹ️ Develop a preprocessing strategy in Python to fetch and stitch orthophotos based on geospatial queries. Deliver these images resized to 256x256 pixels. Additionally, optimize the performance and functionality, benchmarking the throughput. Document the setup and the idea behind it.
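The "benchmarking the throughput" part of the task can be approached with a very small timing harness. The sketch below is only an illustration under assumptions: `measure_throughput` and `fake_fetch_tile` are hypothetical names, the query layout `(lat, lon, radius)` is assumed, and the fake function merely simulates work instead of fetching and stitching real orthophotos.

```python
import time


def measure_throughput(func, queries, repeats=3):
    """Return the best observed throughput of func, in queries per second."""
    best_elapsed = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for query in queries:
            func(query)
        best_elapsed = min(best_elapsed, time.perf_counter() - start)
    # Guard against a zero reading on very coarse timers.
    return len(queries) / max(best_elapsed, 1e-9)


def fake_fetch_tile(query):
    """Hypothetical stand-in for the real fetch-stitch-resize step."""
    lat, lon, radius = query  # assumed query layout, for illustration only
    return sum(i * i for i in range(1000))  # simulate some work


if __name__ == "__main__":
    queries = [(48.14, 11.57, 100.0)] * 200
    rate = measure_throughput(fake_fetch_tile, queries)
    print(f"{rate:.0f} queries/s")
```

Taking the best of several repeats reduces noise from other processes on the machine. Swapping `fake_fetch_tile` for the real pipeline function gives a number you can compare before and after each optimization.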

Hint №1: Read the Task Multiple Times

No, seriously! This recommendation is simple but absolutely crucial. I suggest you:

💡 Read the task carefully until you have concrete questions for the company.

If a company gives you a week or more for a home assignment, the task is probably not that easy to implement. And if you don't have any questions, you are almost certainly missing something. Moreover, asking questions shows the interviewers that you take the task seriously and are genuinely interested.

For example, after analyzing the task description, I realized that the provided coordinates were in EPSG:4326, not EPSG:25832. This mattered because interpreting coordinates in the wrong coordinate reference system leads to incorrect results.
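A cheap way to catch this kind of mix-up early is a range check on incoming coordinates. The sketch below is a heuristic under assumptions, not a real CRS detector: the function names are hypothetical, and it only relies on the fact that EPSG:4326 values are degrees (longitude within ±180, latitude within ±90), while projected systems such as EPSG:25832 use metres and typically produce much larger numbers.

```python
def looks_like_degrees(x, y):
    """Return True if (x, y) fits the lon/lat value range of EPSG:4326."""
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0


def require_degrees(x, y):
    """Fail fast when a coordinate pair is outside the degree range."""
    if not looks_like_degrees(x, y):
        raise ValueError(f"({x}, {y}) is not in degrees; is it a projected CRS?")
    return x, y


if __name__ == "__main__":
    print(looks_like_degrees(11.57, 48.14))           # lon/lat in degrees
    print(looks_like_degrees(691000.0, 5335000.0))    # illustrative metre-scale values
```

The check cannot prove which CRS the data uses, since small projected values would also pass. It only flags the obvious case where metre-scale numbers arrive where degrees are expected. The actual conversion between systems is better delegated to a dedicated library such as pyproj.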

Hint №2: Explore the Data

The second step is understanding the data, especially if you are a Data Scientist, Data Engineer, or Data Analyst. The dataset can provide insights that influence your approach to solving the task. In my case, after analyzing the input data (a large JP2 image file with multiple raster bands), I asked the company about the output image format. Should I send the image as RGB? Should I retain the original format? Should I apply image transformations? These were key questions I could only answer after inspecting the data.

Hint №3: Plan a Simple and an Enhanced Version of Your Solution

I strongly recommend creating a structured plan with both essential and optional tasks. If you aim for a perfect plan with too many mandatory tasks, you might never complete the assignment.

Here’s an example of how I structured my plan:

1. [X] Read the task definition carefully (2+ times to ensure clarity) and write it down in my own words. (Mandatory)
2. [X] Verify that reading the image is not required to determine latitude and longitude. (Mandatory)
3. [ ] Find a way to use the prompt with a constant `radius`. (Optional)
4. [X] Understand the data (read and visually inspect it). (Mandatory)
5. [X] Research geospatial data handling, GIS, and Coordinate Reference Systems (CRS). (Mandatory)
6. [X] Implement a basic pipeline first (_time_ is more critical than _optimization_ at this stage). (Mandatory)
7. [ ] Add tests. (Optional)
8. [ ] Use the `click` library for a more user-friendly CLI. (Optional)
9. [X] Document the process. (Mandatory)
10. [ ] Draft slides for presentation. (Mandatory)
11. [X] Optimize performance. (Optional)
12. [X] Investigate ways to process images without reading the entire file. (Optional)
13. [X] Update documentation. (Optional)
14. [X] Conduct performance analysis (partially completed). (Optional)
15. [X] Submit the assignment. (Mandatory)

💡 To prioritize tasks, ask yourself: Can I complete the assignment without this step? If the answer is no, it's a high-priority task.

For instance, I marked documentation as mandatory. Why? Because while you can run an application without documentation, future users (or even your future self) will struggle to understand it. Try opening a pet project you haven’t touched in a year. I am sure you will be confused about your own code!

Moreover, poor documentation is a red flag for reviewers. It signals a lack of consideration for teammates and maintainability.

Hint №4: Document Your Work While Developing

Since documentation is essential, when should you write it?

You can document everything at the end or while developing. I recommend documenting during development to avoid rushed, incomplete explanations at the last minute. People often underestimate how long documentation takes, and as a result even a working solution ends up looking incomplete.

Hint №5: Create a Pre-Built Environment Template

I only realized this after finishing the assignment, but it’s a great time-saver.

If you’re applying for a specific role (e.g., Data Engineer) and frequently use the same tech stack (Python, PySpark, Hadoop, etc.), consider preparing a GitHub template with:

  • A predefined environment
  • Linting tools
  • Tests
  • Containers (e.g., for PySpark, Hadoop, Postgres, etc.)
  • An initial README structure

Having this setup ready allows you to jump straight into analyzing the data and solving the problem. For example, in a Python project you can use flake8, black, isort, and mypy. These tools reduce cognitive load by enforcing code consistency, letting you focus on business logic.

Hint №6: Be Strict About Code Consistency

I learned this lesson the hard way.

I introduced a frustrating bug by accidentally swapping latitude and longitude in different parts of the code. This cost me 3 hours of debugging.

💡 Establish naming conventions and force yourself to follow them.

Hint №7: Ask Your Friends for a Review

Getting feedback from others allows you to see your solution from a different perspective. I was lucky to receive a review from someone with experience in image processing. They pointed out weak spots in my solution that I hadn’t considered.

For example, they noted that using bilinear interpolation (Image.BILINEAR) for downscaling can introduce artifacts and a jigsaw puzzle effect, and that bilinear works better for upscaling. For downscaling, Image.LANCZOS (Pillow) or cv2.INTER_AREA (OpenCV) are better choices.
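The intuition behind this advice can be shown with a toy 1D example. The sketch below does not call Pillow or OpenCV. It assumes a simplified two-tap linear interpolation with no prefiltering and compares it with plain area averaging. Both function names are hypothetical, and real library implementations differ in their details.

```python
def downscale_linear(signal, out_len):
    """Two-tap linear interpolation at each output sample centre (no prefilter)."""
    scale = len(signal) / out_len
    last = len(signal) - 1
    out = []
    for i in range(out_len):
        # Centre of output sample i in source coordinates, clamped to the signal.
        pos = min(max((i + 0.5) * scale - 0.5, 0.0), float(last))
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        out.append((1 - frac) * signal[left] + frac * signal[right])
    return out


def downscale_area(signal, out_len):
    """Average all source samples covered by each output sample."""
    factor = len(signal) // out_len  # assumes an integer scale factor
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(out_len)
    ]


if __name__ == "__main__":
    stripes = [0, 255, 255, 0] * 4  # half dark, half bright
    print(downscale_linear(stripes, 4))
    print(downscale_area(stripes, 4))
```

With this striped input, the two-tap version only ever looks at the two bright samples nearest each output centre, so the result is fully bright even though half of the source is dark. Area averaging takes every source sample into account and returns the mid-gray one would expect. Filters designed for downscaling apply the same idea of looking at all the source pixels that contribute to an output pixel.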

This was a detail I wouldn’t have noticed unless I had deeply explored image processing. A fresh pair of eyes—especially from someone with domain expertise—can reveal blind spots and significantly improve your approach.

Conclusion

This home assignment, from a domain outside my usual work, was a refreshing break from my current job routine. I not only learned more about geospatial data but also reinforced software development best practices.

Last Updated: 19 Jan 2025