Hi, sharing this role for a data engineer on a team working on Giga, a global initiative to connect every school to the Internet and every young person to information, opportunity, and choice. An open source background is a good fit for this role: the team relies on several open source projects to solve compelling challenges in connectivity, and publishes its own work under open licenses by default.
This opening is not in my team, so I am not qualified to give detailed answers. But I saw the opening go up and thought it could be an interesting opportunity for someone here.
Full details are below:
The Data Engineer will be the first hire in a small team charged with scaling the systems infrastructure of Giga, an initiative to connect every school in the world to the Internet and every young person to information, opportunity, and choice. This position will have considerable autonomy to:
- Maintain and expand our data warehouse/lake with information from multiple sources, using new open source tools
- Maintain ETL pipelines to those sources with appropriate data integrity and governance
- Document processes and maintain data dictionaries for internal and external contributors
Ultimately, Giga’s Data Engineer will be an evangelist for how data can help our entire team meet and exceed Giga’s goal: connecting every school, and every community, to the internet by 2030.
Your main responsibilities will be…
- Identify and maintain ETL data pipelines for data sources such as Quality of Service (QoS), geospatial, population, and infrastructure data.
- Build and document data tools and models, then create testing plans to monitor data integrity and quality.
- Collaborate with technical stakeholders in designing system data models, data integrations and automated data management solutions.
- Work with key stakeholders to expand the data warehouse/lake to support new data investigations.
- Collaborate with the data science team to implement machine learning models.
- Collaborate with software engineering counterparts to ensure clean and efficient data modelling that is responsive to product (Project Connect and Blockchain), country office, partner and customer needs.
- Work closely with Information and Communication Technology Division (ICTD) on data lake infrastructure and ETL data pipeline management.
- Be the team’s subject matter expert for appropriate data governance and in-market compliance, e.g. GDPR and SOC 2 standards.
- Create automated testing and monitoring systems to ensure data quality and timely updates.
- Build and maintain documentation to ensure data accessibility to key stakeholders including non-functional requirements such as exception/error handling, performance optimization and code management.
You will need:
- A first university degree (Bachelor’s degree) in Computer Science or Information Technology.
- Minimum 3 years of relevant professional experience in Data Engineering.
- Advanced knowledge of Azure data warehousing tools. Knowledge of competing cloud computing technologies such as AWS and GCS is an asset.
- Experience with ETL scheduling, automation and orchestration tools such as Airflow and Azure Data Factory in a production setting.
- Experience with and knowledge of database management systems, e.g. Postgres, Snowflake, SQL Server
- Experience with scalable data quality pipelines and profiling.
- Experience with database scripting and development in Python, Java, PowerShell and Bash.
- Experience performing root cause analysis on data and identifying opportunities for improvement.
- Experience working with Agile / Lean practices and teams.
- Proven ability to be self-driven to support the data needs derived from multiple teams, systems, and products.
- Proven ability to train stakeholders to advance their data discovery using the tools you have created.
- Fluency in English is required.
- Demonstrated success and passion for the end-to-end creation of a scalable data environment, including reporting and visualization using the data environment, ETL processes, and training technical stakeholders to maximize adoption.
- A maker at heart who enjoys being in the code while also uncovering new opportunities, choosing new methods and tools when they are right for the task.
- Enjoyment of collaborating within a small, globally remote team (Data Science, Development, Product and UNICEF ICTD), paired with the ability to work independently and proactively to get things done.
- A habit of looking for gaps and opportunities in existing data: creating data product specifications and automations, and working with cross-functional teams to improve Giga’s data outputs.
- A drive to learn: a motivated problem solver who wants to help tackle the new and interesting challenges we encounter as a leader in global school connectivity.
- Exceptional communication skills and a desire to share your knowledge with clarity, patience, and empathy.
It’s an asset if you have:
- Experience with collecting, maintaining, and improving the quality of data at a global scale
- Experience with ISPs, MNOs and/or other connectivity providers.
- Experience with system configuration using a variety of formats and protocols, including JSON, REST, and XML
- Experience and understanding of work in the international development sector
- Experience in data analytics system design, and the ability to use business intelligence platforms (Tableau or similar) for both production and ad hoc exploratory data analysis in support of stakeholders.
- Developing country work experience and/or familiarity with emergency contexts.
- Knowledge of another official UN language (Arabic, Chinese, French, Russian or Spanish) or a local language
Launched in 2019 as a joint initiative between UNICEF and ITU, Giga has set the ambitious goal to connect every school in the world to the internet.
Half of the world’s population has no regular access to the Internet. Millions of children leave school without any digital skills, making it much more difficult for them to thrive and contribute to local and global economies. This has created a digital divide between those who are connected and those who are not, a divide that has become even wider during the Covid-19 pandemic. UNICEF and ITU have therefore joined forces to create Giga, an initiative to connect every school in the world to the Internet and address this new form of inequality.
Giga focuses on connecting schools so that children and young people have access to information, opportunity, and choice. It also uses schools as anchor points for their surrounding communities: if you connect the school, you can also connect local businesses and services. This creates opportunities for service providers to generate revenue from paying users, making connectivity more sustainable. A recent report by the Economist Intelligence Unit found that a 10% increase in school connectivity can increase effective years of schooling by 0.6% and increase GDP per capita by 1.1%.
You cannot fix a problem unless you can see it, so the first step is to map schools and their connectivity levels. Giga uses machine learning to scan satellite images and identify schools. These are then marked by coloured dots on an open-source map: green where there is good connectivity (over 5 Mb/s); amber where it is limited; and red where there is no connectivity at all. The project has already mapped over 980,000 schools in 40 countries, including several which were previously unknown to governments.
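The dot-colouring described above amounts to a simple threshold classification. Here is a minimal sketch in Python; the function name is hypothetical, and only the 5 Mb/s "good" threshold comes from the description, so treating any non-zero measurement below it as "limited" is an assumption for illustration:

```python
def classify_connectivity(speed_mbps):
    """Map a school's measured download speed (Mb/s) to a map-dot colour.

    Only the 5 Mb/s "good" threshold comes from Giga's description;
    treating any non-zero speed below that as "limited" is an assumption.
    """
    if speed_mbps is None or speed_mbps <= 0:
        return "red"      # no connectivity at all
    if speed_mbps > 5:
        return "green"    # good connectivity (over 5 Mb/s)
    return "amber"        # limited connectivity

# A few hypothetical school measurements
for speed in (0, 2.5, 12):
    print(speed, "->", classify_connectivity(speed))
```

In practice the per-school speeds would come from the QoS pipelines mentioned in the responsibilities above, but the mapping rule itself is this small.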