The Center for New Data

VP of Data Engineering

Remote (USA only). Published 3 weeks ago.

The Center for New Data is dedicated to measuring and addressing disparities and inequities in society. Key to the Center's success in tackling pressing social problems is bringing together industry data scientists and data engineers; socially conscious tech companies and impact organizations; and data journalists and academic researchers to use data for good. Examples of recent projects:

  • Observing Democracy program focused on measuring barriers to equitable voting access, profiled by The New York Times and cited extensively by Stacey Abrams and the NAACP.
  • Covid Alliance program monitoring the implications of superspreader events, profiled by the Washington Post and leveraged by public health departments across the country.
  • A new program in development focused on measuring social determinants of health, in partnership with HBCU research institutions.

Nature of Work & the Ideal Candidate

The ideal candidate will be able to organize design sessions with team leaders and ensure projects are scoped and resourced for success. This role will make high-level decisions and help scale analysis, data pipelines, and data visualization efforts at the speed necessary to make timely impacts on current issues.

Oversight will include organizing and resourcing cross-functional teams in coordination with our director of partnerships and volunteer sourcing partners. You will support development of the technology roadmap in collaboration with the executive director. Experience running a cross-functional engineering organization will be helpful.

Big data engineering, ETL processing, and machine learning experience will be important: the organization holds over 2 trillion rows of data, so scalability matters. You will apply cloud-based databases like Snowflake, dimensional modeling, and ETL best practices to manage the heart of the Center for New Data's data platform.

Strategic, Technical & Managerial Responsibilities

  • The position will be influential, as it is designed to carry on the responsibilities currently managed by our organization's founder and interim CTO.
  • Supporting technical roadmap development and managing cross-team dependencies.
  • Assessing the resourcing requirements of the engineering teams and working with the volunteer coordinator to ensure proper resourcing and support.
  • Designing technical architecture in line with strategic planning (supported by industry-leading data engineering architect volunteers). Guiding teams in anticipation of future use cases and helping them make design decisions that minimize the cost of future changes.
  • Working with leadership on strategic organizational plans. Consistently operating at a strategic level and influencing decisions to achieve organizational alignment on major goals.
  • Critically reviewing technical operations and programs, ensuring work is appropriately broken down and prioritized across the organization.
  • Managing and collaborating with highly skilled volunteers to ensure smooth implementation of projects and operation of the platform.

Desired Leadership Qualities and Behavior

The VP of Data Engineering role will be looked to for leadership in fostering our organization's young culture in the following ways:

  • Mentoring across the organization in an open, respectful, flexible, empathetic manner.
  • Communicating in a clear, concise, effective, and audience-oriented way, ensuring teammates actively listen to others and are understood.
  • Facilitating ample documentation and knowledge sharing across the organization.
  • Working collaboratively to ensure technical processes align with organizational goals.
  • Appetite for clean, maintainable, and tested code and ample documentation.
  • A passion to shape the world through your work.

Preferred Qualifications

The Center for New Data is dedicated to building a leadership team whose background reflects the diversity of our work. The ideal candidate brings professional experience including:

  • 2+ years of data engineering management experience, ideally with distributed datasets.
  • 3+ years of experience in engineering management.
  • Strong communication and collaboration skills, with a proven track record of success working with fast-paced Product, Back-End, DevOps, Data Science, and Security teams.

The ideal candidate is well versed in standard data engineering and large-system design tools and practices, such as:

  • Airflow, Pachyderm, or an equivalent modern data scheduling and execution framework, with an emphasis on best practices in Python and SQL on distributed datastores.
  • Standard SQL, including advanced aspects like query optimization, indexing, common table expressions, window functions, materialized views, and the merge statement.
  • Design patterns in ETL (e.g. dimensional modeling, insert vs truncate).
  • Snowflake, Redshift, BigQuery, or similar large-scale distributed datastore experience.
  • Cloud-heavy environments like AWS that leverage S3 and distributed file systems.
  • Familiarity with Databricks, Spark, and PySpark is a plus!