Hadoop Developer Resume
Los Angeles, CA
SUMMARY
- Over 10 years of progressive IT experience, including 3+ years in Big Data technologies, namely Hadoop, Spark, and MongoDB, in the Banking & Financial (Credit Card) and Insurance (Personal and Commercial) domains
- Primary experience in the Hadoop, Spark, and MongoDB frameworks
- Worked on the Cloudera Hadoop distribution platform
- Proficient in building applications using Hadoop components (HDFS, MapReduce, Hive, Pig, HBase, Mahout, Avro, Sqoop, Oozie) and the Spark framework in Scala
- Experience in importing and exporting data between HDFS and Relational Database Systems (RDBMS) using Sqoop
- Involved in data pre-processing (cleaning, transformation, integration) in HDFS to obtain better accuracy during computation
- Expertise in optimizing network traffic across the Hadoop cluster using combiners and custom partitioners, and in performing secondary sorting as needed (see the partitioner sketch after this list)
- Vast experience in Hive Query Language (HiveQL) for developing a data abstraction layer over data in HDFS and for building BI reports
- Possess a good understanding of Massively Parallel Processing (MPP) databases and their data storage and processing methodologies
- Good experience in developing MapReduce programs in Java to process text, SequenceFile, XML, and JSON formatted files that align with industry data-transfer standards
- Experience in writing custom counters for analyzing data and in testing MapReduce jobs using the MRUnit framework
- Extensive experience with the NoSQL document-oriented data store MongoDB; possess a good understanding of CRUD, indexing, aggregation, replication, sharding, and GridFS using both the WiredTiger and MMAPv1 storage engines (see the MongoDB sketch after this list)
- Collaborate with architects and DBAs to design data models and define query patterns that need optimization in MongoDB
- Possess a good understanding of data modelling techniques in MongoDB, involving both manual references and DBRefs
- Experienced in extending Hive and Pig core functionality by developing custom UDFs in Java (see the UDF sketch after this list)
- Good hands-on experience developing transformations and actions over single and pair RDDs in Scala to perform in-memory computing in the Spark framework (see the RDD sketch after this list)
- Experience in ingesting near-real-time streaming data into HDFS using Spark Streaming
- Developed coordinator workflow jobs in Oozie based on time and data availability
- Hands-on experience in all phases of the Software Development Life Cycle (SDLC); experience in leading technical teams and handling multiple complex projects
- Strong experience working both in a team and individually; strong interpersonal skills and very strong technical, debugging, analytical, and problem-solving skills; proven ability to quickly learn and adapt to new technologies
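A minimal sketch of the custom-partitioner and secondary-sort pattern mentioned above, expressed here with Spark's repartitionAndSortWithinPartitions in Scala rather than raw MapReduce; the composite key, paths, and partition count are illustrative assumptions, not project code:

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical composite key: group by policyId, order by eventTime within each group
class PolicyPartitioner(partitions: Int) extends Partitioner {
  override def numPartitions: Int = partitions
  override def getPartition(key: Any): Int = key match {
    case (policyId: String, _) =>
      val raw = policyId.hashCode % numPartitions
      if (raw < 0) raw + numPartitions else raw
    case _ => 0
  }
}

object SecondarySort {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SecondarySort"))

    // Input lines assumed to be: policyId,eventTime,payload
    val events = sc.textFile("hdfs:///data/events")
      .map(_.split(","))
      .map(f => ((f(0), f(1).toLong), f(2)))

    // All records for a policy land in one partition, sorted by time within it
    val sorted = events.repartitionAndSortWithinPartitions(new PolicyPartitioner(8))
    sorted.saveAsTextFile("hdfs:///data/events-sorted")

    sc.stop()
  }
}
```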
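The MongoDB sketch below shows the CRUD, indexing, and aggregation operations referenced above, using the official mongo-scala-driver; the host, database, collection, and field names are hypothetical:

```scala
import org.mongodb.scala._
import org.mongodb.scala.model.Accumulators._
import org.mongodb.scala.model.Aggregates._
import org.mongodb.scala.model.Indexes._
import scala.concurrent.Await
import scala.concurrent.duration._

object MongoSketch {
  def main(args: Array[String]): Unit = {
    val client = MongoClient("mongodb://localhost:27017") // hypothetical host
    val coll: MongoCollection[Document] =
      client.getDatabase("insurance").getCollection("policies") // hypothetical names

    // CRUD: insert a document
    Await.result(
      coll.insertOne(Document("policyId" -> "P-100", "state" -> "CA", "premium" -> 1200)).toFuture(),
      10.seconds)

    // Indexing: secondary index supporting lookups by policyId
    Await.result(coll.createIndex(ascending("policyId")).toFuture(), 10.seconds)

    // Aggregation: total premium per state
    val totals = Await.result(
      coll.aggregate(Seq(group("$state", sum("total", "$premium")))).toFuture(),
      10.seconds)
    totals.foreach(println)

    client.close()
  }
}
```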
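Hive UDFs of the kind described above extend org.apache.hadoop.hive.ql.exec.UDF and expose an evaluate method. The UDF sketch below renders that pattern in Scala for consistency with the other examples (the bullet's production UDFs were in Java); the masking function itself is hypothetical:

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF that masks all but the last four characters of a value
class MaskValue extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    val masked = if (s.length <= 4) s else "*" * (s.length - 4) + s.takeRight(4)
    new Text(masked)
  }
}
```

Once packaged into a jar, such a function would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_value AS 'MaskValue'.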
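The RDD sketch below illustrates the transformation/action workflow over pair RDDs described above; the input path and record layout are assumptions for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ClaimTotals {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ClaimTotals"))

    // Hypothetical input: CSV lines of policyId,claimAmount
    val lines = sc.textFile("hdfs:///data/claims/input")

    // Transformations: build a pair RDD keyed by policy id
    val amounts = lines
      .map(_.split(","))
      .filter(_.length == 2)
      .map(f => (f(0), f(1).toDouble))

    // Pair-RDD transformation: in-memory aggregation per key
    val totals = amounts.reduceByKey(_ + _)

    // Action: materialize the result back to HDFS
    totals.saveAsTextFile("hdfs:///data/claims/totals")

    sc.stop()
  }
}
```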
TECHNICAL SKILLS
Big Data Ecosystem: Cloudera Hadoop Stack (MapReduce, HiveQL, Pig, HBase, Mahout, YARN, Oozie, ZooKeeper)
Spark Ecosystem: Scala, YARN, Spark SQL, Spark Streaming
Data Ingestion: Sqoop, Flume, Kafka
Programming Languages: Java, Scala, Pig Latin, SQL, T-SQL, PL/SQL, R
Web Technologies: HTML, CSS, JavaScript
J2EE Components: Servlets, JSP, Web Services (RESTful), Apache Tomcat
Frameworks: Java Spring, Hadoop, Spark
Databases: MySQL, Microsoft SQL Server, DB2
NoSQL Data Stores: MongoDB, HBase, Cassandra
Operating Systems: Windows 98/NT/XP/Vista/7/10, Linux, Unix
Methodologies: Agile Methodology, Rapid Application Development, Waterfall Model
IDE/Testing Tools: Eclipse, IntelliJ IDEA
PROFESSIONAL EXPERIENCE
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Involved in designing and developing the process framework and in supporting data migration into the Hadoop system
- Installed, configured and supported Hadoop MapReduce and HDFS
- Involved in data collection from several disparate sources, performed data aggregation and ingestion into HDFS
- Data included web logs, policy quotes, policy claims, and marketing details from IMS DB and DB2 systems, plus third-party data obtained from several external sources (CLUE, MVR, A-PLUS, etc.)
- Processed real-time web log data using Spark Streaming (see the streaming sketch after this list)
- Developed DStreams to capture near-real-time (NRT) data using Flume and Kafka and performed data ingestion into HDFS
- Involved in Architectural Discussions during design phases to review the Data Modelling Patterns
- Developed MapReduce programs in Java for cleansing, standardizing, transforming, and integrating data
- Developed managed and external tables in Hive depending on data criticality and availability
- Created static and dynamic partitions and buckets in Hive to meet business needs (see the Hive sketch after this list)
- Involved in importing and exporting data between RDBMS and HDFS using Sqoop
- Designed and developed custom partitioners, comparators, and sort comparators to achieve the desired results in MapReduce computation
- Developed and managed Oozie coordinator jobs that execute MapReduce jobs sequentially where complex computation is required (see the coordinator sketch after this list)
- Maintained Hadoop Clusters for Development/Staging/Production
- Integrated Big Data technologies into the overall architecture
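The streaming sketch below illustrates the kind of Spark Streaming ingestion described above. A socket source stands in for the Flume/Kafka channel, since the Kafka receiver API differs across Spark versions; the endpoint and paths are hypothetical:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WebLogIngest {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("WebLogIngest"), Seconds(30))

    // Socket source used as a stand-in for the Flume/Kafka channel
    val logs = ssc.socketTextStream("loghost", 9999) // hypothetical endpoint

    // Drop empty lines, then land each 30-second micro-batch in HDFS
    logs.filter(_.nonEmpty).saveAsTextFiles("hdfs:///data/weblogs/batch")

    ssc.start()
    ssc.awaitTermination()
  }
}
```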
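The Hive sketch below shows external vs. managed tables and dynamic partitioning of the kind described above, issued through Spark's Hive support so the examples stay in Scala; the same statements run directly in Hive. Table names, columns, and paths are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object HiveTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("HiveTables").enableHiveSupport().getOrCreate()

    // External table: Hive tracks only metadata; the data outlives DROP TABLE
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
        |  claim_id STRING, amount DOUBLE, claim_date STRING, state STRING)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |LOCATION 'hdfs:///data/claims/raw'""".stripMargin)

    // Managed table partitioned by state, loaded via dynamic partitioning
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """CREATE TABLE IF NOT EXISTS claims_by_state (
        |  claim_id STRING, amount DOUBLE, claim_date STRING)
        |PARTITIONED BY (state STRING)""".stripMargin)
    spark.sql(
      """INSERT OVERWRITE TABLE claims_by_state PARTITION (state)
        |SELECT claim_id, amount, claim_date, state FROM claims_raw""".stripMargin)

    spark.stop()
  }
}
```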
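The coordinator sketch below is a minimal Oozie coordinator config of the time- and data-availability-triggered kind described above, in Oozie's native XML; the app name, dates, and paths are hypothetical:

```xml
<coordinator-app name="daily-claims" frequency="${coord:days(1)}"
                 start="2016-01-01T00:00Z" end="2016-12-31T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- The workflow fires only when the day's input directory exists -->
    <dataset name="claims" frequency="${coord:days(1)}"
             initial-instance="2016-01-01T00:00Z" timezone="UTC">
      <uri-template>hdfs:///data/claims/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="claimsReady" dataset="claims">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>hdfs:///apps/oozie/claims-workflow</app-path>
    </workflow>
  </action>
</coordinator-app>
```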
Confidential, Hartford, CT
Java Developer/Business Analyst
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC), such as requirements gathering, data modelling, analysis, architecture design, and development for the project
- Designed and developed application code to solve complex problems and meet business requirements
- Developed the system GUI using JSP, with client-side validation performed in JavaScript
- Built and accessed the Oracle database using JDBC (see the JDBC sketch after this list)
- Developed stored procedures in PL/SQL on the database side
- Developed web pages for the Policy Processing Application on the new platform for different lines of business using HTML/DHTML/CSS
- Deployed the application using the Apache Tomcat web server
- Used MVC architecture to build the application
- Performed all levels of testing, including unit, integration, regression, system, and UAT, to ensure applications perform error-free and according to business specifications when promoted to production
- Provided planning, organization, and control while leading a small team of programmers to ensure that high-quality solutions were developed to meet business needs
- Mentored programmers to help them progress in their professional development and optimize performance
- Actively involved in determining stakeholder impacts once business intents were finalized
- Prepared use-case and process-flow diagrams to better represent the problem
- Prepared Business Requirements Documents (BRDs) and Functional Requirements Documents (FRDs)
- Peer-reviewed BRDs and FRDs to ensure the business intent was captured accurately
- Obtained necessary sign-offs from SMEs and business partners once all artifacts were ready
- Walked the development and quality teams through the requirements and acted as a liaison between business users and the IT development team
- Designed test scenarios and wrote test scripts based on application flow
- Coordinated with the QA team in preparing test plans and test strategies for the application
- Tracked, updated, and monitored defects and change-management control requests in HP Quality Center
- Maintained the issues log and the Requirements Traceability Matrix
- Developed an effective knowledge-transfer process and documentation
- Provided end-user training as and when needed
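The JDBC sketch below shows the JDBC-to-Oracle access and PL/SQL stored-procedure call pattern described above, written in Scala for consistency with the other examples (the original application used Java). The connection details and procedure are hypothetical:

```scala
import java.sql.{CallableStatement, Connection, DriverManager, Types}

object PolicyDao {
  def main(args: Array[String]): Unit = {
    // Hypothetical Oracle connection; the ojdbc driver jar must be on the classpath
    val conn: Connection = DriverManager.getConnection(
      "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "secret")
    try {
      // Call a hypothetical PL/SQL stored procedure returning a premium for a policy id
      val call: CallableStatement = conn.prepareCall("{call get_premium(?, ?)}")
      call.setString(1, "P-100")
      call.registerOutParameter(2, Types.NUMERIC)
      call.execute()
      println(s"Premium: ${call.getBigDecimal(2)}")
      call.close()
    } finally {
      conn.close()
    }
  }
}
```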
Confidential, Miami, FL
Responsibilities:
- Analyzed, designed, developed, and executed custom application features and functions
- Designed, coded, tested, debugged, and documented programs, working proficiently at the top technical level across all areas of application programming
- Evaluated, designed, coded, and tested improvements to complex modules as required
- Developed system parameters and interfaces for complex components
- Designed and developed application code to technical and functional programming standards
- Provided primary assistance with installing application releases into production under given direction
- Coordinated and participated in structured peer reviews and walkthroughs
- Planned and implemented all required process steps as defined in the methodologies
- Developed operational documents for the application
- Provided application and technical support as necessary and in a timely manner