Full Time Minnesota Big Data Job
At phData, we believe that we are witnessing a new age of digitization and decision automation. To compete, even traditional companies must deliver machine learning and analytics-enabled data platforms and data products. Examples include a medical device manufacturer building machine learning models to guide therapies or a global industrial manufacturer using deep learning models to identify product defects. This means that the data problems that once applied only to Google and Facebook now apply to a filter manufacturer, a medical device manufacturer, or a healthcare provider.
phData matches machine learning and analytics to the business goals of the world's largest and most successful companies. phData is one of the fastest-growing companies in the U.S. and demand for our services and software has skyrocketed. This has resulted in quality growth and an expanded presence at our company headquarters located in Downtown Minneapolis, as well as in Bangalore, India, and across the U.S. We’ve been recognized as one of the Best Places to Work in Minneapolis for three consecutive years and we were listed as the 48th fastest growing private company in the U.S. on the Inc. 5000 list.
We pride ourselves on offering employees phenomenal growth and learning opportunities in addition to competitive compensation, health insurance, generous PTO and excellent perks including extensive training and paid certifications.
As a Sr. Data Engineer on our Data Engineering team, your responsibilities include:
- Integrate data from a variety of sources (data warehouses, data marts) using on-premises or cloud-based (AWS) data structures; identify new and existing data sources
- Develop, implement and optimize streaming, data lake, and analytics big data solutions
- Create and execute testing strategies including unit, integration, and full end-to-end tests of data pipelines
- Recommend Kudu, HBase, HDFS, and relational databases based on their strengths
- Utilize ETL processes to build data repositories; integrate data into Hadoop data lake using Sqoop (batch ingest), Kafka (streaming), Spark, Hive or Impala (transformation)
- Adapt and learn new technologies in a quickly changing field
- Be creative; evaluate and recommend big data technologies to solve problems and create solutions
- Recommend and implement best tools to ensure optimized data performance; perform Data Analysis utilizing Spark, Hive, and Impala
- Work on a variety of internal and open source projects and tools
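To give a flavor of the testing and ETL responsibilities above, here is a minimal sketch of unit testing a pipeline transformation in plain Python. The record shape and function names are hypothetical illustrations, not phData code; in production the same logic would typically run inside a Spark or Hive job at cluster scale.

```python
# Hypothetical example: one transformation step from a batch ETL pipeline,
# written as a pure function so it can be unit tested without a cluster.

def clean_record(record):
    """Normalize a raw source record before loading it into the data lake."""
    return {
        "id": int(record["id"]),
        "name": record["name"].strip().lower(),
        "amount": round(float(record["amount"]), 2),
    }

def run_pipeline(raw_records):
    """Apply the transformation, dropping records that fail validation."""
    cleaned = []
    for raw in raw_records:
        try:
            cleaned.append(clean_record(raw))
        except (KeyError, ValueError):
            # In production, bad records would go to a dead-letter queue.
            continue
    return cleaned

# Unit test for the transformation, runnable locally.
raw = [
    {"id": "1", "name": "  Acme ", "amount": "10.50"},
    {"id": "bad", "name": "x", "amount": "1"},  # invalid id, dropped
]
result = run_pipeline(raw)
assert result == [{"id": 1, "name": "acme", "amount": 10.5}]
```

Keeping the transformation separate from the ingestion and load steps is what makes unit tests like this possible; integration and end-to-end tests then cover the pipeline as a whole.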
Qualifications:
- Previous experience as a Software Engineer, Data Engineer, or Data Analyst
- Solid programming experience in Python, Java, Scala, or a comparable programming language
- Production experience in core Hadoop technologies including HDFS, Hive and YARN
- Hands-on experience with one or more ecosystem products/languages such as HBase, Spark, Impala, Solr, or Kudu
- Strong working knowledge of SQL and the ability to write, debug, and optimize distributed SQL queries
- Excellent communication skills; previous experience working with internal or external customers
- Strong analytical abilities; ability to translate business requirements and use cases into a Hadoop solution, including ingestion from many data sources, ETL processing, data access and consumption, and custom analytics
- Four-year Bachelor's degree in Computer Science or a related field
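As a taste of the distributed SQL skills listed above, here is a small runnable sketch. The table, column names, and data are invented for illustration, and sqlite stands in for a local demo; against Hive or Impala the same statement would be distributed across the cluster.

```python
import sqlite3

# Hypothetical example of the kind of aggregate query a data engineer
# writes, debugs, and optimizes; sqlite is used so it runs locally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, event_type TEXT, amount REAL);
    INSERT INTO events VALUES (1, 'purchase', 20.0), (1, 'purchase', 5.0),
                              (2, 'refund', -3.0), (2, 'purchase', 7.5);
""")

# Total purchase amount per user. On a distributed engine, the filter and
# partial aggregation would be pushed down to the data nodes.
rows = conn.execute("""
    SELECT user_id, SUM(amount) AS total
    FROM events
    WHERE event_type = 'purchase'
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

assert rows == [(1, 25.0), (2, 7.5)]
```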
If this sounds like you, apply directly at phData Careers or send me a note at firstname.lastname@example.org with any of the following, and I will make an introduction for you: a summary, your resume, or links to your LinkedIn or GitHub.
If not, click “Say Hello” and introduce yourself, and we can talk about what you are looking for.