Synkoc AI/ML Internship · Week 1 · Lesson 1 of 11
Python for
AI & Machine Learning
Master the complete Python foundation — Variables, Loops, Functions & Data Structures. Every concept connects directly to real ML code.
📦 Variables
🔁 Loops
⚙️ Functions
🗂️ Data Structures
🧑‍💻
Synkoc Instructor
AI/ML Professional · Bangalore
⏱ ~60 minutes
🟢 Beginner Friendly
By the end of this lesson, you will be able to...
📦
Create & use Variables
Store any data type. Understand why every ML model's weights, accuracy and labels are stored in variables.
🔁
Write Loops that process data
Iterate over entire datasets automatically. Understand how ML training loops over every example in every epoch.
⚙️
Build reusable Functions
Write clean, reusable logic with def & return. Every sklearn algorithm is a function you call with data.
🗂️
Organise with Data Structures
Use Lists, Dicts, Tuples, Sets. Direct predecessors to NumPy arrays and Pandas DataFrames in Week 2.
Chapter 1 of 4
01
Variables
The most fundamental concept in all of programming. Every ML model's weights, accuracy, and labels live in variables.
What is a Variable?
A variable is a named container that stores a value in memory. Give it a name, assign a value — Python handles the rest.
📦
name = value
The equals sign means assignment — store the right-side value under the left-side name. Any time Python sees that name, it retrieves the stored value from memory. You can reassign any time.
student_name = "Priya" # String — text in quotes
exam_score = 94.5 # Float — decimal number
batch_size = 32 # Integer — whole number
is_trained = False # Boolean — True or False
⚡ML Connection: learning_rate = 0.001 · epochs = 100 · accuracy = 0.956 · model_name = "RandomForest" — every ML project config lives in variables exactly like these.
The 4 Core Data Types
Python auto-detects type from the value you assign. These 4 types cover 95% of everything you store in an ML project:
🔢
Integer (int)
Whole numbers. For epoch counts, batch sizes, neuron counts, tree counts.
epochs = 100
batch_size = 32
n_trees = 200
ML: epochs, layers, trees
📏
Float (float)
Decimals. For accuracy %, learning rates, model weights, probabilities.
accuracy = 0.956
lr = 0.001
dropout = 0.2
ML: accuracy, loss, weights
📝
String (str)
Text in quotes. For class labels, file paths, feature names, model names.
label = "spam"
file = "data.csv"
model = "SVM"
ML: labels, paths, names
✅
Boolean (bool)
True or False only. For flags, conditions, binary classification outputs.
verbose = True
is_trained = False
use_gpu = True
ML: flags, binary outputs
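You can watch the auto-detection happen with the built-in type() function. A quick sketch using values like the ones in the cards above:

```python
# type() reveals the class Python inferred from each assignment
epochs = 100        # no type declaration needed
accuracy = 0.956
label = "spam"
verbose = True

print(type(epochs).__name__)    # int
print(type(accuracy).__name__)  # float
print(type(label).__name__)     # str
print(type(verbose).__name__)   # bool
```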
variables_demo.py● LIVE
# ── ML Project Config ────────────────────────
project_name = "Synkoc ML Internship"   # String
learning_rate = 0.001                   # Float
epochs = 100                            # Integer
verbose = True                          # Boolean

print(f"Project: {project_name}")
print(f"LR: {learning_rate} | Epochs: {epochs}")
epochs = 200                            # Reassignment — update any time
print(f"Updated epochs: {epochs}")      # 200
4 types in one config block: String, Float, Integer, Boolean. f-strings embed values inline. Note the reassignment of epochs near the end; updating a variable mid-run is the same mechanism learning rate decay uses during ML training.
Variables in Real ML Projects
Every professional ML project starts with a config block. Every setting has a descriptive variable name — change one variable to update the entire project.
ml_project_config.py
# ── Project Config ──────────────────────────────
project_name = "Synkoc Student Pass/Fail Predictor"
dataset_path = "data/students.csv"
target_col = "passed"        # What we are predicting

# ── Model Hyperparameters ───────────────────────
learning_rate = 0.001        # How fast the model learns
epochs = 100                 # Training rounds
test_size = 0.2              # 20% held back for testing
random_state = 42            # Seed for reproducibility
verbose = True               # Print training progress
💡
Professional Tip
Always use descriptive names like learning_rate not just lr. Change one variable → entire project updates. Every ML team at every company follows this pattern.
Chapter 2 of 4
02
Loops
Repeat actions over data without writing the same code thousands of times. The engine behind every ML training process ever built.
What is a Loop?
A loop says: "For every item in this collection — do this action." Write your logic once. Python repeats it automatically for every item.
🔁
for item in collection:
Three parts: the for keyword, a variable name that holds the current item, and the collection. Colon ends the line. Indented code below runs once per item — automatically, for every item, start to end.
scores = [78, 92, 65, 88, 71]
for score in scores:
    print(f"Processing: {score}")
# Visits: 78 → 92 → 65 → 88 → 71
⚡ML Connection: Training on 10,000 records × 50 epochs = 500,000 loop iterations. The for loop handles every single one automatically — you write the logic once.
loops_demo.py● LIVE
scores = [78, 92, 65, 88, 71]

for score in scores:
    print(f"Score: {score}")

passes = 0
for s in scores:
    if s >= 70:
        passes += 1
print(f"Pass rate: {passes/len(scores)*100:.0f}%")  # 80%
for score in scores: visits 78, 92, 65, 88, 71 automatically. The counter pattern in the second loop is exactly how accuracy_score() works inside sklearn.
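To make that sklearn connection concrete, here is a minimal sketch of the computation accuracy_score() performs: the same counter pattern, comparing predictions against true labels (the data here is made up for illustration, and this is not sklearn's actual source code):

```python
y_true = ["pass", "fail", "pass", "pass"]  # actual labels
y_pred = ["pass", "fail", "fail", "pass"]  # model's guesses

correct = 0
for i in range(len(y_true)):
    if y_true[i] == y_pred[i]:   # count every matching prediction
        correct += 1
print(f"Accuracy: {correct/len(y_true):.2f}")  # 0.75
```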
The for Loop — Every Part Explained
Six parts. Every part matters. Missing any one causes an error immediately.
💡 Real Life Analogy — The Delivery Driver
A driver has 100 addresses. Picks up package 1, drives to address 1, delivers, returns. Address 2 — same. Address 3 — same. Identical action for every address in order until the list is empty. That is a for loop. The list is your collection. Each address is one item. The delivery action is your loop body. Python is the infinitely patient driver — never skips, never gets tired.
for_loop_anatomy.py
1for student in class_list:
2 print(student) # runs once per student
3# Part 1: "for" → keyword that starts the loop
4# Part 2: "student" → YOUR variable (holds current item)
5# Part 3: "in" → connects variable to collection
6# Part 4: "class_list" → the collection to loop through
7# Part 5: ":" → colon — NEVER forget this!
8# Part 6: 4 spaces → indentation = inside the loop
⚠️
Most Common Beginner Mistakes
Forgetting the : gives SyntaxError. Forgetting the 4-space indent gives IndentationError. Python uses indentation as actual syntax — not just style.
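A correct loop next to the two classic omissions; the broken variants are shown as comments so the file still runs (a sketch, with illustrative data):

```python
scores = [78, 92, 65]

for score in scores:   # colon present — OK
    print(score)       # 4-space indent — OK

# for score in scores      <- missing ":"    -> SyntaxError
# for score in scores:
# print(score)              <- missing indent -> IndentationError
```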
range() — Loops with a Counter
When you need to repeat something a fixed number of times — not over a list — use range(). This is how every neural network epoch loop is written.
🔢
range(n) — count 0 to n-1
Gives 0, 1, 2 ... n-1. Standard form for epoch training loops in ML.
for epoch in range(5):
    print(f"Epoch {epoch+1}/5")
# Epoch 1/5 ... Epoch 5/5
📈
range(start, stop)
Numbers from start up to (not including) stop. Use when tracking index positions.
scores = [85, 72, 91, 68]
for i in range(len(scores)):
    print(f"Student {i+1}: {scores[i]}")
⚡ML Connection: for epoch in range(100): is the standard training loop. 100 epochs = 100 complete passes through your training data. The epoch variable tracks progress for printing, learning rate decay, and saving checkpoints.
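The callout above mentions using the epoch counter for learning rate decay. A minimal sketch of the idea (the 0.9 decay factor and starting rate are illustrative choices, not fixed ML constants):

```python
base_lr = 0.01
for epoch in range(5):
    lr = base_lr * (0.9 ** epoch)   # shrink the learning rate each epoch
    print(f"Epoch {epoch+1}: lr={lr:.5f}")
```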
The while Loop — Repeat Until Done
Repeats as long as its condition is True. Stops the moment it becomes False. Use when you don't know how many iterations are needed in advance.
🔄
while condition: → run. False → stop.
Checks the condition before every iteration. Critical: your loop body must eventually make the condition False — otherwise the loop never ends and your program hangs (an infinite loop).
accuracy = 0.50
while accuracy < 0.90:
    accuracy += 0.08
    print(f"Training... acc={accuracy:.2f}")
print("Target reached!")
⚡ML Connection: Early stopping uses this pattern — keep training while validation loss is still improving. Stop when it plateaus. You don't know if this takes 10 or 50 epochs.
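A sketch of that early-stopping idea: loop while the validation loss keeps improving, stop the moment it plateaus (the loss history below is invented for illustration):

```python
val_losses = [0.90, 0.70, 0.55, 0.48, 0.47, 0.47, 0.48]  # fake per-epoch history
best = float("inf")
epoch = 0
while epoch < len(val_losses) and val_losses[epoch] < best:
    best = val_losses[epoch]   # still improving — keep training
    epoch += 1
print(f"Stopped after {epoch} epochs, best loss {best}")
# Stops at epoch 5 — the loss plateaus at 0.47
```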
Nested Loops & Loop + if — The ML Training Pattern
A loop inside a loop is nested. This is the exact structure running inside every neural network training call ever built.
🔁
Loop + if/else — Filter data
Process only items meeting a condition. This is how you filter a dataset — keep only fraud rows, only adult records.
data = [85, 42, 91, 55, 78]
passed, failed = [], []
for score in data:
    if score >= 60:
        passed.append(score)
    else:
        failed.append(score)
⚙️
Nested Loop — Deep Learning Training
Outer loop: epochs. Inner loop: batches. This IS what model.fit() runs inside Keras every time you train a neural network.
for epoch in range(epochs):
    print(f"=== Epoch {epoch+1} ===")
    for batch in batches:
        loss = train_step(batch)
        print(f"  loss: {loss:.4f}")
💡
This Is the Core of Every Neural Network
Every Keras model.fit() runs this nested loop internally. Outer: for epoch in range(100). Inner: for batch in dataloader. Inside: forward pass → loss → backprop → update weights. You will write this in Week 4 Deep Learning.
loops_real_ml.py● LIVE
dataset = [{"name":"Priya","score":85},{"name":"Raj","score":52},{"name":"Kavya","score":91}]

# Loop 1: compute average
total = 0
for s in dataset:
    total += s["score"]
print(f"Avg: {total/len(dataset):.1f}")  # 76.0

# Loop 2: filter passed students
passed = [s for s in dataset if s["score"] >= 60]
print(f"Passed: {len(passed)}/{len(dataset)}")  # 2/3

# Loop 3: epoch training simulation
for epoch in range(3):
    loss = 1.0 - (epoch * 0.3)
    print(f"Epoch {epoch+1}/3 loss={loss:.1f}")
3 patterns every ML engineer uses daily. Loop to compute mean = what df.mean() does. List comprehension filter = what df[df.score>=60] does. Epoch loop = what model.fit() runs inside.
The Loop Analogy
Synkoc Instructor Analogy
"Imagine a Synkoc instructor marking 30 exam papers. She picks up paper 1, grades it, puts it down. Paper 2 — same process. Paper 3 — same. She repeats the identical action for every paper until done. That is a for loop. The pile of papers is your list. Each paper is one item. The grading action is your loop body. Python is the instructor — infinitely patient, never skipping, completing every iteration without mistakes."
🤖
In Real ML Training
10,000 records × 100 epochs = 1,000,000 iterations. The for loop — the exact same one you are learning right now — handles all of it. TensorFlow and PyTorch training loops are built on this exact concept.
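You can verify the multiplication yourself with a tiny nested loop that counts its own iterations (numbers scaled down from the 10,000 × 100 example):

```python
records = list(range(10))   # 10 fake records
epochs = 5

iterations = 0
for epoch in range(epochs):
    for record in records:
        iterations += 1     # one "training step" per record per epoch

print(iterations)           # 50 — epochs × records, exactly like ML training
```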
Chapter 3 of 4
03
Functions
Write code once, use it a thousand times. Every sklearn algorithm — LinearRegression, KMeans, RandomForest — is a function you call with your data.
What is a Function?
A function is a named, reusable block of code. Define once with def. Call from anywhere with any data. return sends the result back.
⚙️
def function_name(parameters):
Four parts: def starts it, a descriptive name, parameters (placeholder names for inputs), and return. Call it by writing the name with actual values — called arguments.
def calculate_accuracy(correct, total):
    return (correct / total) * 100

result = calculate_accuracy(87, 100)
print(f"Accuracy: {result}%")  # → 87.0%
⚡ML Connection: model.fit(X, y) · model.predict(X_test) · accuracy_score(y_true, y_pred) — you already call functions every time you use sklearn. Now you write your own.
functions_demo.py● LIVE
def compute_average(scores):
    """Returns the mean of a list"""
    return sum(scores) / len(scores)

avg = compute_average([85, 92, 78, 96])
print(f"Average: {avg:.2f}")  # 87.75

def grade(score, threshold=60):
    return "PASS" if score >= threshold else "FAIL"

print(grade(75))      # PASS
print(grade(75, 80))  # FAIL
def defines the function. return sends result back. Default parameter threshold=60 works exactly like RandomForestClassifier(n_estimators=100) in sklearn.
Function Anatomy — Every Part Explained
Six components. Each has a specific job. Understand each one and you can read any function in any ML library.
💡 Real Life Analogy — The ATM
ATM built once. Name on front: WITHDRAW CASH. You insert card and type amount — inputs. Machine runs internal logic. Gives cash and receipt — return values. You never need to know how it works. Same inputs, same outputs, every time. A Python function works identically. Define once, call from anywhere with any inputs, get back the correct output every single time.
function_anatomy.py
#   ↓ def keyword    ↓ function name   ↓ parameters
def calculate_bmi(weight_kg, height_m):
    """Calculate Body Mass Index for health ML models"""
    bmi = weight_kg / (height_m ** 2)
    return round(bmi, 2)  # ← return sends result back

result = calculate_bmi(70, 1.75)  # call with arguments
print(result)  # → 22.86
💡
Parameter vs Argument
Parameter = placeholder in the definition (weight_kg). Argument = actual value when calling (70). Python substitutes 70 wherever weight_kg appears in the body.
Default Parameters & Multiple Returns
Two features used in every sklearn function. Default params make functions flexible. Multiple returns let one function give back several values at once.
🎯
Default Parameters
Give a parameter a default value. Caller overrides or leaves it. Exactly how sklearn works — most parameters have sensible defaults you rarely need to change.
def train_model(data, epochs=100, lr=0.001):
    pass

train_model(data)             # uses defaults
train_model(data, epochs=50)  # override one
🔘
Multiple Return Values
Return several values separated by commas. Caller unpacks into separate variables. Exactly how train_test_split works in sklearn.
def split_data(data, ratio=0.8):
    n = int(len(data) * ratio)
    return data[:n], data[n:]

# Unpack both return values
train, test = split_data(dataset)
⚡ML Connection: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2) — default parameter and 4 return values unpacked in one line. This is the exact pattern above.
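That 4-value pattern can be sketched by hand with both features and labels. The same unpacking shape as train_test_split, though split_xy is a made-up name here, not a sklearn function (and unlike the real thing, this toy version does no shuffling):

```python
def split_xy(X, y, test_size=0.2):
    """Return X_train, X_test, y_train, y_test — 4 values at once."""
    n = int(len(X) * (1 - test_size))
    return X[:n], X[n:], y[:n], y[n:]

X = [[1], [2], [3], [4], [5]]
y = [0, 0, 1, 1, 1]
X_train, X_test, y_train, y_test = split_xy(X, y)
print(len(X_train), len(X_test))  # 4 1
```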
Variable Scope — Inside vs Outside
Scope means where a variable can be seen. Write pure functions for reliable, predictable ML code.
📌
Local = inside function only. Global = visible everywhere.
Local variables disappear after return. Two functions can both use a variable named result without any conflict. Best practice: write pure functions — only use parameters as inputs. Same inputs always give same outputs. Every sklearn algorithm is pure.
score = 95  # GLOBAL — visible everywhere

def check_pass(s):
    threshold = 60  # LOCAL — only inside here
    return s >= threshold

passed = check_pass(score)  # True
# print(threshold)  ← NameError! threshold is local only
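To see why pure functions matter, here is a small sketch contrasting the two styles (illustrative function names). The impure version reads hidden global state, so the same call can give different answers after that state changes:

```python
threshold = 60                       # global state

def check_impure(score):
    return score >= threshold        # reads a global — result can change

def check_pure(score, threshold=60):
    return score >= threshold        # everything it needs is a parameter

print(check_pure(75))    # True — same inputs, same output, always
threshold = 80
print(check_impure(75))  # False — same argument, different answer!
```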
functions_ml_pipeline.py● LIVE
def load_data(filepath):
    """Load and return dataset as list of dicts"""
    return [{"name":"Priya","score":85},{"name":"Raj","score":52},{"name":"Kavya","score":91}]

def normalise(values):
    """Scale to 0-1 range — same idea as MinMaxScaler"""
    mn, mx = min(values), max(values)
    return [(v-mn)/(mx-mn) for v in values]

def predict(score, threshold=0.5):
    return "PASS" if score >= threshold else "FAIL"

data = load_data("students.csv")
scores = [s["score"] for s in data]
normed = normalise(scores)
for i, s in enumerate(data):
    print(f"{s['name']}: {normed[i]:.2f} → {predict(normed[i])}")
Real 3-function ML pipeline: load_data → normalise (= MinMaxScaler, which scales values to the 0-1 range) → predict (= classifier). In Week 3 you replace these with sklearn equivalents. The pipeline structure stays identical.
Chapter 4 of 4
04
Data Structures
Organise collections of data — the containers that hold your entire dataset before feeding it into any ML model.
The 4 Core Data Structures
When a single variable is not enough, use a structure. These 4 are the direct foundation of NumPy arrays and Pandas DataFrames in Week 2:
📋
List [ ]
Ordered, changeable, allows duplicates. Access by index from 0. The most-used structure in all of data science.
scores = [85, 92, 78, 96]
labels = ["pass","fail","pass"]
🗂️
Dictionary { key: value }
Key-value pairs — access by name. One dictionary = one complete row of your ML dataset with feature names mapped to values.
student = {"name":"Rahul",
"score":91, "passed":True}
🔒
Tuple ( )
Like a list but immutable — cannot be changed. Use for fixed shapes and configs that must never be accidentally modified.
img_shape = (224, 224, 3)
split_ratio = (0.8, 0.2)
⚡
Set { }
Unordered, unique values only — duplicates auto-removed. Pass 10,000 labels in, get only unique class names back.
classes = {"cat","dog","bird"}
unique = set(all_labels)
data_structures_demo.py● LIVE
scores = [85, 92, 78]                    # List
student = {"name":"Priya", "score":94}   # Dict
shape = (28, 28, 1)                      # Tuple
labels = {"spam", "ham"}                 # Set

print(scores[0])        # 85 (index from 0)
print(student["name"])  # Priya (by key)
print(shape[0])         # 28 (immutable)
print(len(labels))      # 2 (unique only)

# List of dicts = dataset = what Pandas DataFrame IS
dataset = [{"name":"Priya","score":94}, {"name":"Raj","score":72}]
List index from 0 · Dict by name · Tuple immutable · Set unique only. List of dicts at bottom = exactly what a Pandas DataFrame is internally.
Lists In Depth — Most Used Structure in Data Science
Every feature column is a list. Every prediction sequence is a list. Every training batch is a list. Master this completely.
💡 Real Life Analogy — The Marks Register
Teacher's marks register: scores in order, row 1 = student 1, row 2 = student 2. Find by row number. Add, remove, update — changeable. A Python list is this register. Each item has a numbered index starting at 0. Order preserved exactly as inserted.
lists_complete.py
scores = [85, 92, 78, 96, 71]
print(scores[0])    # 85 (first — index 0)
print(scores[-1])   # 71 (last)
print(scores[1:3])  # [92, 78] (slice)
scores.append(88)   # add to end
scores.sort()       # sort ascending
print(len(scores))  # 6 (count)
print(sum(scores))  # total
print(max(scores))  # highest
print(min(scores))  # lowest
⚡Pandas connection: df["score"].tolist() converts a column to a list. len(df) works like len(list), and df["score"].max() like max(list). Every list skill here transfers directly to Pandas columns.
Dictionaries In Depth — One Row = One Dictionary
One dict = one complete record. A list of dicts = a dataset. This is exactly what a Pandas DataFrame is internally.
💡 Real Life Analogy — The Aadhaar Card
Aadhaar card = dictionary. Named fields: name, date of birth, address, number — each with a value. You find info by field name, not by position. "Give me the name" not "give me item 3". Dictionary access: by key, not index. Every row of a dataset is this card.
dictionaries_complete.py
student = {"name":"Priya", "age":21, "score":94.5, "passed":True}
print(student["name"])   # Priya
print(student["score"])  # 94.5
student["grade"] = "A"   # add new key
student["score"] = 96.0  # update existing

# List of dicts = dataset = what Pandas DataFrame IS
dataset = [
    {"name":"Priya", "score":94.5},
    {"name":"Raj",   "score":72.0},
]
⚡Pandas connection: df.iloc[0] returns first row as a dict-like object. df.to_dict("records") returns exactly a list of dicts. Understanding dicts means understanding Pandas internally.
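To make "what a DataFrame is internally" concrete, here is a sketch converting a list of row-dicts (records) into a dict of column-lists, the column-oriented layout a DataFrame actually stores. to_columns is a made-up helper for illustration, not a Pandas function:

```python
dataset = [{"name": "Priya", "score": 94.5},
           {"name": "Raj",   "score": 72.0}]

def to_columns(records):
    """Flip a list of row-dicts into a dict of column-lists."""
    return {key: [row[key] for row in records] for key in records[0]}

columns = to_columns(dataset)
print(columns["name"])   # ['Priya', 'Raj']
print(columns["score"])  # [94.5, 72.0]
```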
Tuples & Sets — When to Use Each
Tuples for values that must never change. Sets for instantly finding unique values. Both have specific ML jobs.
🔒
Tuple ( ) — Lock it in place
Round brackets. Immutable after creation. Use for values fixed by design — image dimensions, model input shapes that must never change.
# Image dimensions — never change
img_shape = (224, 224, 3)
# Keras model input shape
input_shape = (784,) # MNIST
# Try to change → TypeError!
# img_shape[0] = 128 ← BLOCKED
📑
Set { } — Unique values only
Curly brackets, values only. Removes ALL duplicates automatically. Find unique class labels instantly.
labels = ["spam","ham","spam",
"ham","spam","ham"]
classes = set(labels)
print(classes) # {"spam","ham"}
n = len(classes) # 2
print("spam" in classes) # True
💡
Decision Rule — Which Structure to Use
Ordered changeable collection → List. Access by name → Dict. Fixed immutable values → Tuple. Unique values only → Set. Pandas DataFrame = optimised list of dicts. NumPy array = optimised list of numbers.
all_4_structures_ml.py● LIVE
# All 4 structures in one ML pipeline
config = {"model":"RF", "trees":100}    # DICT
dataset = [{"age":22,"label":"no"},
           {"age":35,"label":"yes"},
           {"age":28,"label":"yes"}]    # LIST OF DICTS

ages = [r["age"] for r in dataset]      # LIST
labels = [r["label"] for r in dataset]
classes = set(labels)                   # SET
shape = (len(ages), 1)                  # TUPLE
print(f"Dataset: {shape} | Classes: {classes}")
All 4 in one pipeline. Dict for config. List of dicts = dataset (the same shape pd.read_csv() produces). List comprehension = df["age"].tolist(). Set for unique classes. Tuple for shape. You now understand the data shapes behind everyday Pandas and sklearn operations.
All 4 Pillars Together — Complete Program
student_analyzer.pyComplete Program
# DATA STRUCTURE — list of dicts, one per student
students = [{"name":"Priya", "scores":[85,92,78,96]},
            {"name":"Rahul", "scores":[70,65,80,75]},
            {"name":"Anjali","scores":[95,98,92,97]}]

# FUNCTION — takes any list, returns average
def compute_average(scores):
    return sum(scores) / len(scores)

# LOOP — process every student automatically
for s in students:
    name = s["name"]                    # VARIABLE
    avg = compute_average(s["scores"])  # FUNCTION CALL
    print(f"{name}: Average = {avg:.1f}")
This program uses all 4 pillars: a list of dicts holds the data · a function computes averages · a loop processes every student · variables store name and avg. This is the exact pattern used in real ML data pipelines.
Lesson Summary
You have completed the Python foundation. Here is what you can now do in every ML project:
📦
Variables
Store any data type with a meaningful name. Configure ML projects professionally. Know int, float, str, bool and when to use each.
🔁
Loops
Iterate over any list with a for loop. Combine with if/else. Understand that ML training is a massive nested loop over data and epochs.
⚙️
Functions
Define reusable logic with def and return. Use parameters and defaults. Understand that every sklearn call is a function like the ones you now write.
🗂️
Data Structures
Use Lists, Dicts, Tuples, Sets. These are the direct foundation of NumPy arrays and Pandas DataFrames you will use in Week 2.
🚀
Python Complete!
Foundation mastered. Open the Practical Lab to write real code across 5 tasks. Complete the lab, then take the Quiz. Then — Statistics for Data Science.
✅ Video — Done
✏️ Practical Lab — Next
❓ Quiz — After Lab