A Data Engineer compiles and transforms data into a format that is useful for analysis. Data Engineers are software engineers who design and build systems that integrate data from a variety of sources, typically using data lakes or other repositories. They write complex queries to make data clean and accessible, with the goal of optimising the performance and integration of big data ecosystems. Key tools include SQL, Scala, Spark, Kafka, Hadoop, Python and Java.