Cloud Data Engineer
at Homesite Insurance
Homesite Insurance was founded in 1997 and was one of the first companies to enable customers to purchase home insurance directly online, during a single visit. Since then, we've continued to innovate rapidly to meet the needs of our customers and their changing expectations.
One thing that's stayed the same since our founding: our commitment to our customers, partners and employees.
Join us on our journey as we continue to grow into a powerful contender in the field of insurance.
We’re looking for a Data Engineer to help us transform our data systems and architecture to support greater variety, volume, and velocity of data and data sources. You might be a good fit if:
You enjoy extracting data from a variety of sources and finding ways to connect them and make them suitable for use in software systems and for the development of models and algorithms.
You enjoy interacting with new database systems and learning new data technologies, and are interested in developing your knowledge of new tools and techniques.
You are interested in automating data engineering efforts to minimize human interaction and optimize data quality.
You have an interest in developing your knowledge of practical data science techniques and technologies in addition to your data engineering knowledge and experience.
This role requires comprehensive data engineering skills and is not a SQL developer role, though SQL is a required skill.
We’re looking for an experienced data engineer to help us:
Build and maintain serverless data ingestion and refresh pipelines at terabyte scale using AWS cloud services - AWS Glue, Amazon Redshift, Amazon S3, Amazon Athena, Amazon DynamoDB, and others
Incorporate new data sources from external vendors using flat files, APIs, web-scraping, and databases.
Maintain and provide support for the existing data pipelines using Python, Glue, Spark, and SQL
Develop and enhance the database architecture of the new analytic data environment, including recommending optimal choices between relational, columnar, and document databases based on requirements
Identify and deploy appropriate file formats for data ingestion into various storage and/or compute services via Glue for multiple use cases
Develop real-time/near real-time data ingestion from web and web service logs from Splunk
Maintain existing processes and develop new methods to match external data sources to Homesite data using exact and fuzzy methods
Implement and use machine learning based data wrangling tools like Trifacta to cleanse and reshape 3rd party data to make it suitable for use.
Develop and implement tests to ensure data quality across all integrated data sources.
Serve as an internal subject matter expert and coach, training team members in the use of distributed computing frameworks for data analysis and modeling, including AWS services and Apache projects
Master’s degree in Computer Science, Engineering, or equivalent work experience
Two to four years’ experience working with datasets with hundreds of millions of rows using a variety of technologies
Intermediate to expert level programming experience in Python and SQL in Windows and Mac/Linux environments
Intermediate level experience working with distributed computing frameworks, especially Spark
Intermediate level experience working with relational databases including PostgreSQL and Microsoft SQL Server
Experience working with contemporary data file formats like Apache Parquet and Avro, and columnar databases like Redshift
Experience working with distributed SQL query engines like Presto and Athena
Experience with Amazon Web Services including Redshift, S3, Kinesis, Glue, and DynamoDB
Experience analyzing data for data quality and supporting the use of data in an enterprise setting.
Nice to have:
Some experience working with clustering and classification models
Some experience working with Trifacta
Some experience working with Google Analytics
Some familiarity working with RDF and SPARQL and some experience working with graph databases
Experience with enterprise search engine systems including Elasticsearch and Apache Solr
Homesite is an insurance company that's big on technology. Finding faster and smarter methods of improving how people buy insurance is our jam. Our crew is made up of talented and passionate professionals who aren't afraid to push the envelope. When you work at Homesite, you'll have the opportunity to pursue your creative ideas in an environment that welcomes them.
Join our team as we shake up the world of insurance!