Senior Software Engineer, Performance

Xonai
UK - Remote - Europe 🇪🇺

About Xonai

Apache Spark powers data infrastructure worldwide, but at petabyte scale, performance bottlenecks translate directly into massive infrastructure costs. At Xonai, we're solving that with a novel engine purpose-built to dramatically accelerate Spark jobs at their core, without requiring data teams to change how they work.

We've raised $4.5M in seed funding to build a best-in-class data infrastructure optimisation engine for the AI era.

About the Role

Optimising data infrastructure at scale means delivering breakthrough acceleration to what ultimately drives up costs: the software that data engineering teams deploy at scale. As a Senior Software Engineer in this role, you will work with the founding team to implement a next-generation accelerator for Apache Spark, the most widely used Big Data processing engine at petabyte scale. Working at the intersection of compilers and Big Data analytics, you'll drive state-of-the-art implementations of algorithms and techniques that span the entire software stack, from SQL pushdown to enhancements in low-level C++ data processing APIs and beyond. Your contributions to our core IP will directly impact data processing infrastructure that transforms petabytes of data every day wherever Xonai is deployed.

Responsibilities

  • Own SQL-acceleration projects end to end, from planning through implementation to measurable outcomes
  • Implement algorithms (targeting a custom DSL) for complex SQL functions within a Scala codebase
  • Drive performance optimisations for Big Data processing algorithms, primarily in Java but also in C++
  • Research and develop greenfield projects at the intersection of Big Data analytics and compilers

You may be a good fit if you:

  • Have 5+ years of software engineering experience working in large performance-driven codebases
  • Have strong experience with statically typed compiled languages (in particular C++, Java and Scala)
  • Can navigate large codebases that plumb low-level APIs into high-level operations
  • Have experience deploying or benchmarking data processing software to clusters in the cloud
  • Leave your comfort zone to tackle challenges across a multi-language software stack
  • Solve challenging problems independently and know when to pull others in

Strong candidates may have:

  • Entrepreneurial spirit and previous experience in early-stage startups
  • Experience contributing to popular open-source projects in the domain of data processing
  • Experience or familiarity with the implementation of large-scale query engines
  • Experience interfacing with OS or systems-level APIs with nuanced compatibility characteristics

Representative projects:

  • Implement (in a custom DSL) a major SQL operator (e.g. Window aggregate) or smaller SQL expressions
  • Migrate a complex code-generated algorithm in a custom DSL to a C++ equivalent
  • Invoke Java code from C++ via JNI to execute a SQL function for compatibility reasons
  • Implement a variant of a repartitioning algorithm optimised for edge-case application configurations
  • Refactor existing code generation algorithms to accommodate improvements to a custom DSL
  • Improve GitHub Actions workflows to accommodate new build options or new comparative benchmarks

Our Commitment

We are highly committed to creating new transformative technologies that deliver unique benefits to our customers. We understand that developing a best-in-class product requires a diverse team of intelligent, passionate and curious people who bring new perspectives. We take great pride in being an equal opportunity employer and we encourage everyone to apply.
