Trainee Data Scientist

Job Description

Superport-IT is inviting applications for its Trainee Data Scientist program, designed for passionate, early-career individuals with a Data Science background. This role provides a launchpad into real-world projects involving data analysis, machine learning, and business insights.

You’ll work closely with our Data Science team, learning and contributing to meaningful analytics tasks and models. This is a remote internship with certification upon completion and potential opportunities for full-time employment based on performance.

Roles & Responsibilities

  • Assist in data collection, cleaning, and preprocessing using Python or R
  • Work on exploratory data analysis (EDA) and interpret patterns and trends
  • Support the development of machine learning models and data-driven solutions
  • Collaborate with senior data scientists and domain experts on real-world problems
  • Present insights and findings using data visualization tools (e.g., Matplotlib, Seaborn, Power BI)
  • Document and maintain code repositories using Git/GitHub
  • Contribute to data science research, internal dashboards, and project reports
  • Participate in team meetings, reviews, and training sessions

Eligibility Criteria

  • Must have a degree in Data Science, AI, Machine Learning, or a related field
  • Fresh graduates, final-year students, or career-switchers with a strong foundation in data science
  • Must pass the Superport-IT Data Science Entrance Exam to qualify for the role
  • Proficiency in Python, Pandas, NumPy, scikit-learn, and basic statistics
  • Familiarity with Jupyter Notebooks, version control (Git), and cloud platforms is a plus
  • Good communication, problem-solving, and analytical thinking skills
  • Self-motivated and eager to learn in a remote setup

Job Category: Data Science & Analytics
Job Type: Full Time
Job Location: Remote
Positions Opened: 5


Data Science Foundation Track

Course Curriculum: Your Journey from Novice to Data Scientist

Our curriculum is designed to take you step by step through the world of data science. Each module builds on the last, pairing core theory with hands-on projects so that you learn by doing.

Module 1: Kickstart & Python Fundamentals (Week 1-2)

Objective: Build the foundational launchpad for your data science journey.

1.1 Introduction to Data Science:
  • What are Data Science, AI, and Machine Learning?
  • Roles & Responsibilities: Data Analyst vs. Data Scientist vs. ML Engineer.
  • The Data Science Lifecycle: From Business Problem to Deployed Solution.
1.2 Python Programming Essentials:
  • Setting up Your Environment: Anaconda, Jupyter Notebooks & VS Code.
  • Python Basics: Variables, Data Types, Operators, and Control Flow (Loops & Conditionals).
  • Core Data Structures: Lists, Tuples, Dictionaries, and Sets.
  • Functions and Object-Oriented Programming (OOP) Concepts.
  • Project 1: Create a simple command-line application (e.g., a calculator or a text-based game) to solidify Python programming logic.
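
As one possible starting point for Project 1, the sketch below combines a dictionary, a function, conditionals, and a loop into a small calculator. The names `OPERATIONS`, `calculate`, and `main` are illustrative choices, not part of the official project brief.

```python
import operator

# Map each supported symbol to the function that implements it.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(expression: str) -> float:
    """Evaluate a string of the form '<number> <operator> <number>'."""
    parts = expression.split()
    if len(parts) != 3:
        raise ValueError("expected: <number> <operator> <number>")
    left, symbol, right = parts
    if symbol not in OPERATIONS:
        raise ValueError(f"unsupported operator: {symbol}")
    return OPERATIONS[symbol](float(left), float(right))


def main() -> None:
    """Read expressions in a loop until the user types 'quit'."""
    while True:
        line = input("calc> ").strip()
        if line.lower() == "quit":
            break
        try:
            print(calculate(line))
        except (ValueError, ZeroDivisionError) as error:
            print(f"error: {error}")
```

Calling `main()` starts the interactive prompt; `calculate("7 / 2")` can also be called directly, which makes the logic easy to test without typing input.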

Module 2: Data Analysis with NumPy & Pandas (Week 3-4)

Objective: Learn to manipulate, clean, and analyze complex datasets with industry-standard libraries.

2.1 Numerical Computing with NumPy:
  • Introduction to NumPy Arrays and their advantages over Python Lists.
  • Array Indexing, Slicing, and Mathematical Operations.
  • Statistical functions and Linear Algebra basics.
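
To make the topics above concrete, here is a minimal sketch of array arithmetic, slicing, boolean indexing, and a small matrix product. The sales figures are invented for illustration.

```python
import numpy as np

# Hypothetical daily sales figures, used only for illustration.
sales = np.array([120.0, 95.5, 130.25, 110.0, 99.75])

# Vectorized arithmetic: apply a 10% discount to every element at once,
# with no explicit Python loop.
discounted = sales * 0.9

# Slicing selects the first three elements.
first_three = sales[:3]

# Boolean indexing keeps only the values above the mean.
above_average = sales[sales > sales.mean()]

# Basic statistics and a small linear-algebra operation.
mean_sales = sales.mean()
std_sales = sales.std()
matrix = np.array([[2.0, 0.0], [0.0, 3.0]])
vector = np.array([1.0, 4.0])
product = matrix @ vector  # matrix-vector product
```

With a plain Python list, the discount step would need a loop or a list comprehension; operating on the whole array at once is the main practical advantage the module refers to.
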
2.2 Data Manipulation with Pandas:
  • Introduction to Pandas Series and DataFrames.
  • Importing Data: Reading from CSV, Excel, and other file formats.
  • Data Cleaning: Handling Missing Values, Duplicates, and Inconsistent Data.
  • Indexing, Filtering, and Sorting DataFrames (iloc, loc).
  • Grouping and Aggregation with groupby().
  • Project 2: Take a messy, real-world dataset (e.g., sales data) and perform a full data cleaning and initial analysis to uncover key insights.
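
The cleaning and aggregation steps in this module might look like the sketch below. The dataset, its column names, and the decision to treat missing unit counts as zero are all assumptions made for illustration; a real project would justify each cleaning rule from the data.

```python
import pandas as pd

# Hypothetical messy sales records, invented for illustration.
raw = pd.DataFrame(
    {
        "region": ["North", "north ", "South", "South", None, "North"],
        "units": [10, 10, 5, None, 7, 3],
        "price": [2.5, 2.5, 4.0, 4.0, 1.0, 2.5],
    }
)

cleaned = (
    raw.dropna(subset=["region"])  # drop rows with no region
    .assign(region=lambda df: df["region"].str.strip().str.title())  # normalize labels
    .drop_duplicates()  # remove rows that are now exact duplicates
    .fillna({"units": 0})  # assumption: a missing unit count means zero
)

# Derive revenue per row, then aggregate it by region.
cleaned["revenue"] = cleaned["units"] * cleaned["price"]
revenue_by_region = cleaned.groupby("region")["revenue"].sum()
```

Note that the order of steps matters: normalizing the region labels before `drop_duplicates()` is what lets the "north " row be recognized as a duplicate of the "North" row.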

Module 3: Database Management with SQL (Week 5)

Objective: Master the art of querying databases to extract exactly the data you need.

3.1 Relational Database Fundamentals:
  • Understanding Tables, Primary Keys, and Foreign Keys.
3.2 Essential SQL Queries:
  • SELECT, FROM, WHERE for data retrieval.
  • GROUP BY, HAVING, ORDER BY for data aggregation and sorting.
3.3 Advanced SQL:
  • Joining multiple tables (INNER, LEFT, and FULL OUTER JOINs).
  • Subqueries and Common Table Expressions (CTEs).
  • Project 3: Answer complex business questions by writing SQL queries against a sample relational database (e.g., an e-commerce store database).
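
A query of the kind Project 3 asks for can be tried without installing a database server by using Python's built-in `sqlite3` module. The schema and rows below are a hypothetical e-commerce example, not the course's actual sample database.

```python
import sqlite3

# In-memory database with a hypothetical e-commerce schema (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0);
    """
)

# The CTE totals orders per customer; the LEFT JOIN keeps customers
# who have no orders, and COALESCE turns their NULL total into 0.
query = """
WITH order_totals AS (
    SELECT customer_id, SUM(amount) AS order_sum
    FROM orders
    GROUP BY customer_id
)
SELECT c.name, COALESCE(t.order_sum, 0) AS total_spent
FROM customers AS c
LEFT JOIN order_totals AS t ON t.customer_id = c.id
ORDER BY total_spent DESC, c.name;
"""
rows = conn.execute(query).fetchall()
conn.close()
```

Replacing `LEFT JOIN` with `INNER JOIN` would drop the customer who has placed no orders, which is a quick way to see the difference between the two join types.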

Module 4: Data Visualization & Storytelling (Week 6)

Objective: Transform raw data into compelling visual stories that drive business decisions.

4.1 Principles of Effective Visualization:
  • Choosing the right chart for your data.
  • The art of storytelling with data.
4.2 Visualization with Matplotlib & Seaborn:
  • Creating basic plots with Matplotlib (Line, Bar, Scatter).
  • Building advanced statistical plots with Seaborn (Heatmaps, Box Plots, Violin Plots).
  • Customizing plots for professional reports.
4.3 Interactive Dashboards:
  • Introduction to interactive plotting with libraries like Plotly.
  • Project 4: Create a multi-chart dashboard to present the findings from your Project 2 dataset, telling a clear story about the insights you discovered.
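
A multi-chart figure like the one Project 4 describes can be sketched with Matplotlib alone. The revenue numbers and titles below are invented, and the figure is rendered off-screen into a memory buffer so the script runs without a display; passing a filename to `savefig` instead would write the image to disk.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt

# Hypothetical monthly revenue figures, invented for illustration.
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [12.0, 15.5, 14.0, 18.25]

# One figure with two side-by-side charts of the same data.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

ax_line.plot(months, revenue, marker="o")
ax_line.set_title("Revenue trend")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Revenue (thousands)")

ax_bar.bar(months, revenue)
ax_bar.set_title("Revenue by month")
ax_bar.set_xlabel("Month")

fig.suptitle("Monthly revenue overview")
fig.tight_layout()

# Write the PNG into memory; use a path such as "overview.png" to save a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```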

Module 5: Essential Statistics & Probability (Week 7)

Objective: Understand the statistical concepts that form the backbone of all data science models.

5.1 Descriptive Statistics:
  • Measures of Central Tendency (Mean, Median, Mode).
  • Measures of Dispersion (Variance, Standard Deviation).
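
These measures can be computed with Python's built-in `statistics` module, as in the short sketch below. The scores are made up, and the population formulas (`pvariance`, `pstdev`) are used here as an assumption; `variance` and `stdev` give the sample versions, which divide by n - 1 instead of n.

```python
import statistics

# Hypothetical exam scores, invented for illustration.
scores = [70, 85, 85, 90, 100]

# Measures of central tendency.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Measures of dispersion (population versions).
population_variance = statistics.pvariance(scores)
population_std = statistics.pstdev(scores)

# Sample variance divides by n - 1, so it is slightly larger.
sample_variance = statistics.variance(scores)
```
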
5.2 Probability Distributions:
  • Understanding Normal, Binomial, and Poisson distributions.
5.3 Inferential Statistics:
  • Hypothesis Testing, P-values, and Confidence Intervals.
  • Introduction to A/B Testing for business decision-making.
  • Project 5: Analyze the results of a sample A/B test to determine if a change to a website resulted in a statistically significant improvement.
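
One common way to analyze an A/B test on conversion rates is a two-proportion z-test, sketched below using only the standard library. The visitor and conversion counts are hypothetical, the function name is an illustrative choice, and the 0.05 significance threshold is a conventional assumption rather than a course requirement.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both conversion rates are equal."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under H0 both groups share one rate, estimated from the pooled data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: 200 of 2000 visitors converted on the control page,
# 250 of 2000 on the variant.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
significant = p_value < 0.05
```

The test relies on a normal approximation, so it is best suited to samples large enough that each group has a reasonable number of conversions and non-conversions.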

Module 6: Machine Learning Fundamentals (Week 8-9)

Objective: Build and evaluate your first predictive models.

6.1 Introduction to Machine Learning:
  • Supervised vs. Unsupervised vs. Reinforcement Learning.
  • The Train-Test Split and Model Evaluation Metrics.
6.2 Regression Models:
  • Linear Regression: Predicting continuous values (e.g., house prices).
6.3 Classification Models:
  • Logistic Regression: Predicting binary outcomes (e.g., customer churn).
  • K-Nearest Neighbors (KNN) and Decision Trees.
  • Project 6: Build two models: one to predict house prices from a real estate dataset and another to predict customer churn from a telecom dataset.
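
The train-test workflow from this module can be sketched with scikit-learn as follows. Synthetic data from `make_classification` stands in for a churn dataset here, and the parameter values (sample count, split ratio, random seed) are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for a churn dataset.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=4,
    n_redundant=2,
    n_clusters_per_class=1,
    random_state=42,
)

# Hold out 20% of the rows; stratify keeps the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit on the training rows only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on rows the model has never seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
```

Accuracy is only one of several evaluation metrics; for imbalanced problems such as churn, precision and recall are often more informative.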

Module 7: Advanced Machine Learning & Capstone Project (Week 10-12)

Objective: Tackle complex problems with advanced algorithms and complete a portfolio-worthy project.

7.1 Advanced Techniques:
  • Ensemble Methods: Random Forests and Gradient Boosting (XGBoost).
  • Unsupervised Learning: K-Means Clustering for customer segmentation.
  • Introduction to Feature Engineering.
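
As a small illustration of customer segmentation, the sketch below clusters synthetic customers with scikit-learn's `KMeans`. The two features (annual spend and visits per month), the group centers, and the choice of two clusters are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers described by (annual spend, visits per month).
rng = np.random.default_rng(0)
low_spenders = rng.normal(loc=[20.0, 2.0], scale=1.0, size=(50, 2))
high_spenders = rng.normal(loc=[80.0, 10.0], scale=1.0, size=(50, 2))
customers = np.vstack([low_spenders, high_spenders])

# Group the customers into two segments; no labels are provided.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segments = kmeans.fit_predict(customers)

# Each row of cluster_centers_ summarizes one segment's typical customer.
centers = kmeans.cluster_centers_
```

On real data the features usually sit on very different scales, so standardizing them before clustering (a simple form of feature engineering) tends to matter.
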
7.2 Capstone Project:
  • Choose from a selection of real-world business problems.
  • Apply the entire data science lifecycle: Data Acquisition, Cleaning, EDA, Modeling, and Interpretation.
  • Present your findings and methodology in a final report.
7.3 Introduction to Deployment:
  • Learn how to build a simple, interactive web app for your model using Streamlit.

Module 8: Career Accelerator Workshop (Exclusive to the ₹58,000 Program)

Objective: Polish your professional profile and prepare for the job market.

8.1 Building Your Professional Brand:
  • Crafting a high-impact, keyword-optimized CV/Resume.
  • Optimizing your LinkedIn profile to attract recruiters.
8.2 Acing the Interview:
  • Technical Interview Preparation (SQL, Python, ML Theory).
  • Live Mock Interviews with personalized feedback.
8.3 Internship & Placement:
  • Onboarding for your guaranteed internship.
  • Workshops on applying abroad (SOP/LOR writing) and job application strategies.