Degree in Computer Science, Engineering, Mathematics, or equivalent experience
5+ years of relevant professional experience
Ability to write clean, maintainable, scalable, and robust code in an object-oriented language (e.g., Python, Scala, Java) in a professional setting
Familiarity with distributed computing frameworks (e.g., Spark, Dask), cloud platforms (e.g., AWS, Azure, GCP), containerization, and analytics libraries (e.g., pandas, NumPy, matplotlib)
Proven experience building data pipelines in production for advanced analytics use cases
Experience working across structured, semi-structured and unstructured data
While we advocate using the right tech for the right task, we often use the following technologies: Python, PySpark, the PyData stack, SQL, Airflow, Databricks, Kedro (our own open-source data pipelining framework), Dask/RAPIDS, container technologies such as Docker and Kubernetes, and cloud platforms such as AWS, GCP, and Azure, among others
Exceptional time management to meet your responsibilities in a complex and largely autonomous work environment
Willingness to travel
Strong verbal and written communication skills in English and in either Spanish or Portuguese, with the ability to adjust your style to suit different perspectives and seniority levels