Thursday, April 30, 2026

Why encoding='utf-8' is More Than Just a Bug Fix

We’ve all been there. You’re building a RAG pipeline or trying to load a fresh dataset into a notebook. You run f.read() and suddenly—Boom.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2...

For years, as a Mechanical Engineer by training, my response was purely “patch-work.” I’d Google the error, find a StackOverflow thread telling me to add encoding='utf-8' to my code, and move on once the red text disappeared. It worked like magic.

But recently, while digging into speech recognition and pattern books, I realized I was treating a masterpiece of optimization like a simple bug fix. I never formally studied encodings, but once you look at the “hardware” logic behind them, it’s a beautiful story of solving a global data traffic jam.

The 7-Bit Prison: A Relic of the 1960s

Back in the 1960s, computing was a “walled garden.” Everything was built around ASCII — a system designed primarily for English teletype machines. To save every precious bit of expensive memory, ASCII only used 7 bits of a byte, keeping that 8th bit permanently locked at zero.

It was lean, sure, but it was incredibly narrow-minded. If you wanted to write a Spanish phrase like Señor or use the Devanagari script (the beautiful characters behind Hindi and Sanskrit), ASCII simply didn’t have the “slots” to hold them. You were literally trapped in a 128-character world (code points 0 through 127).

The “Global ID” Spreadsheet

To break out of that prison, we created Unicode.

The best way to visualize Unicode isn’t as a file format, but as a massive, abstract spreadsheet. In this spreadsheet, every character in human history — from ancient Sumerian cuneiform to that “Grinning Face” emoji on your phone — is assigned a unique ID called a Code Point.

For example, the letter a is assigned the ID U+0061. It’s a beautiful, inclusive library that now supports over 150,000 characters and 168 different scripts. But as an engineer, the first thing I thought was: “Wait. If we have over a million potential IDs, how do we save them without making every simple text file four times larger?”
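To make the “spreadsheet” idea concrete, here is a minimal check using only Python’s built-in ord and chr (a small illustration, not part of the original experiment):

```python
# A code point is just an integer ID; Python exposes it via ord() and chr().
code_point = ord("a")
print(f"U+{code_point:04X}")  # prints: U+0061

# The mapping is reversible: the ID takes you back to the character.
print(chr(0x0061))  # prints: a
```

Everything else — bytes, files, encodings — is just a question of how these integer IDs get serialized.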

UTF-8: The “Chameleon” of Engineering

This is where the engineering gets truly clever.

If we used a fixed-width system like UTF-32 (4 bytes for every single character), a simple English sentence would be 75% “wasted” zeros. It would be like using a semi-truck to deliver a single envelope.

UTF-8 is the solution — a variable-length masterpiece. It’s essentially a chameleon that changes its size based on the character it’s carrying:

  • For the “Standard” stuff: It uses just 1 byte for the first 128 characters, making it perfectly backward-compatible with those old ASCII systems.
  • For the “Global” stuff: When it hits scripts like Hindi, Arabic, or Spanish, it smoothly expands to 2 or 3 bytes.
  • For the “Fun” stuff: Emojis and rare symbols take up 4 bytes.

But the real “Mechanical Engineer” in me loves the Self-Synchronization feature. In older systems, if one byte got corrupted, the whole file turned into gibberish. UTF-8 is built so that if a byte fails, the system can just “skip” and find the start of the next character by looking at the first few bits. It’s resilient, it’s efficient, and it’s why your RAG pipelines actually work when you feed them multilingual data.
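As a hedged sketch of that self-synchronization idea: every UTF-8 continuation byte carries the bit pattern 10xxxxxx, and no lead byte does, so a decoder that lands mid-character can scan forward to the next lead byte and resume. The helper below is my own illustration, not a real decoder:

```python
def find_next_char_start(data: bytes, pos: int) -> int:
    """Scan forward from pos to the next byte that is NOT a UTF-8
    continuation byte (bit pattern 10xxxxxx), i.e. the start of the
    next character. This is the self-synchronization property."""
    while pos < len(data) and (data[pos] & 0b1100_0000) == 0b1000_0000:
        pos += 1
    return pos

data = "नमस्ते".encode("utf-8")  # 6 code points, 3 bytes each
# Suppose a decoder lands on byte index 1 (mid-character): it can skip
# the remaining continuation bytes and resync at index 3, the start of
# the next character. Only one character is lost, not the whole stream.
print(find_next_char_start(data, 1))  # prints: 3
```

In a fixed-width legacy encoding, there is no such marker, so a single lost byte can shift every character after it.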

The Proof in the Bytes: A Python Experiment

I ran a quick script to compare how these encodings “weigh” the same information.

def check_encoding_impact(text):
    print(f"\nTarget Text: {text}")
    print("-" * 30)
    for enc in ['ascii', 'utf-8', 'utf-16', 'utf-32']:
        try:
            encoded_data = text.encode(enc)
            print(f"{enc.upper():<8} | {len(encoded_data)} bytes")
        except UnicodeEncodeError:
            print(f"{enc.upper():<8} | not supported")

# Test 1: Standard English
check_encoding_impact("Hello")

# Test 2: Multilingual (Hindi)
check_encoding_impact("नमस्ते")

# Test 3: The Emoji Tax
check_encoding_impact("RAG 🤖")
Output:

Target Text: Hello
------------------------------
ASCII    | 5 bytes
UTF-8    | 5 bytes
UTF-16   | 12 bytes
UTF-32   | 24 bytes

Target Text: नमस्ते
------------------------------
ASCII    | not supported
UTF-8    | 18 bytes
UTF-16   | 14 bytes
UTF-32   | 28 bytes

Target Text: RAG 🤖
------------------------------
ASCII    | not supported
UTF-8    | 8 bytes
UTF-16   | 14 bytes
UTF-32   | 24 bytes

import os

# One combined string with three lines of text:
# English + Spanish + Chinese
multilingual_content = (
    "Hello, How are yu ? Have a great day ahead !\n"  # English
    "¡Hola! ¿Cómo estás? ¡Que tengas un gran día!\n"  # Spanish
    "你好,你好嗎?祝你有美好的一天!"                     # Chinese
)

def save_and_measure_global(text):
    formats = ['utf-8', 'utf-16', 'utf-32']

    print("--- Global Multilingual Test ---")
    print(f"Total Characters (including newlines): {len(text)}")
    print(f"{'Encoding':<10} | {'File Size (Bytes)':<18}")
    print("-" * 35)

    for fmt in formats:
        filename = f"global_test_{fmt}.txt"

        # Writing the combined string to disk
        with open(filename, 'w', encoding=fmt) as f:
            f.write(text)

        # Measuring the physical footprint
        file_size = os.path.getsize(filename)
        print(f"{fmt.upper():<10} | {file_size:<18}")

        # Clean up
        os.remove(filename)

if __name__ == "__main__":
    save_and_measure_global(multilingual_content)
Output:

--- Global Multilingual Test ---
Total Characters (including newlines): 106
Encoding   | File Size (Bytes)
-----------------------------------
UTF-8      | 146
UTF-16     | 218
UTF-32     | 436

This experiment moves the discussion from theory to measurable bytes on disk, showing that encoding isn’t just about avoiding “broken” characters; it’s also a storage decision. The UTF-8 vs. UTF-16 debate usually favors the former for English-heavy data, but the results show the multi-byte “signaling tax” flipping the balance toward UTF-16 for non-Latin scripts like Hindi (14 bytes vs. 18 for नमस्ते). Modern encodings are a game of strategic tradeoffs: choosing the right one can cut your storage and memory overhead by roughly 30%, depending on the linguistic footprint of your global RAG pipeline or localized application.

Reflection: The Hidden Infrastructure

As engineers, we often obsess over the “sexy” stuff — LLM parameters and vector databases. But all of it rests on these tiny, byte-level protocols.

There is no such thing as a “plain text file” without an encoding. Every time you specify encoding='utf-8', you aren't just fixing an error; you are participating in a global standard that allows an engineer in Bangalore, a developer in Seattle, and a machine from 1965 to understand each other.

It’s a reminder that sometimes, the most elegant engineering is the stuff we never see — until it fails.

Sunday, April 19, 2026

This is for those who Feel Stuck

If you are currently pursuing a Master’s, a CFA, or something similar—that’s great. You already have a short-to-mid-term goal to work toward, and I wish you the best of luck!

This post is for those who don’t know what to aim for, or who feel a bit stuck and unable to move toward a goal.

Five or six years ago, I felt what we now call a "mid-life crisis." I assumed it was a one-time event that would just pass. Oh boy, I couldn’t have been more wrong. It keeps coming back—and it is perfectly fine if you feel stuck momentarily.

I would say: Go back to basics. Go back to your roots.

Often in our day-to-day work, we lose touch with the fundamentals we once studied. It is a great idea to revisit your old reference books or walk through your old code and repositories. I am sure there are topics you struggled with back then—maybe chapters you kept as "optional" just to pass the course. Go back to those chapters now. Prepare as if you were studying for college exams, or look into new developments in your core area of expertise. Reinforce your foundation with new technology.

This may not directly unlock your "next step," but it re-strengthens your foundation. When you eventually figure out a goal in a few weeks, that reinforced foundation will ensure you achieve it much faster. Being an engineer, I’ll use a personal example: Integrals. I used to solve complex integrals in my sleep; now, when I see one, I find myself wondering how to even approach it. Picking that back up is a powerful way to reset.

Think of Creed II (or the Rocky movies): when Adonis is lost, Rocky takes him to the desert, away from the noise, the ego, and the comfort. He makes him go back to basics to master what he once knew. It’s the same in Rocky IV. You cut off the distractions and return to the foundation. This is where you get rid of the doubt and the noise.



The Second Path: Learn something entirely new.

This is the opposite approach, inspired by the likes of Alex Hormozi: learn something totally unrelated to what you already know, to build new capabilities. This requires you to be ready to "suck" at something. You have to be willing to be bad at something for a long time before you can be great at it.

If you are a Doctor, maybe learn about AI. If you are an Engineer, learn about sales or philosophy.

A word of caution: When I say "learn something new," I don't mean a two-hour crash course or a quick YouTube tutorial. Learn from the base. If you are learning finance, pick up academic reference books or join a rigorous 6–9 month course. This deep dive might even land you on a new path you never considered.

In this fast-paced world, these ideas are "slow grinders." They require time, effort, and patience. However, both approaches help fire up your "learning neurons," which improves your thinking and helps you find clarity in the moments you feel most stuck.

This is for the people who want to play the long game. When you’re in a crisis, a pause—spent either returning to basics or learning something new without the pressure of a specific goal—will go a long way.

Trust me, it always pays off in the long run.

Sunday, March 8, 2026

𝗙𝗼𝗼𝗱 𝗳𝗼𝗿 𝗧𝗵𝗼𝘂𝗴𝗵𝘁 - 𝗦𝗮𝗻𝗱𝘆’𝘀 𝘁𝗮𝗸𝗲 𝗼𝗻 𝗟𝗟𝗠 - 𝗣𝗮𝗿𝘁 𝟮

If your near-and-dear one was having a health issue, who would you go to?

  • 𝘛𝘩𝘦 𝘣𝘦𝘴𝘵 𝘈𝘨𝘦𝘯𝘵𝘪𝘤 𝘈𝘐 𝘋𝘰𝘤𝘵𝘰𝘳 𝘰𝘶𝘵 𝘵𝘩𝘦𝘳𝘦

  • 𝘈 "𝘷𝘪𝘣𝘦-𝘤𝘰𝘥𝘦𝘥" 𝘋𝘰𝘤𝘵𝘰𝘳

  • 𝘈𝘯 𝘦𝘹𝘱𝘦𝘳𝘵, 𝘢𝘤𝘵𝘶𝘢𝘭 𝘥𝘰𝘤𝘵𝘰𝘳

  • 𝘈𝘯 𝘢𝘤𝘵𝘶𝘢𝘭 𝘥𝘰𝘤𝘵𝘰𝘳 𝘸𝘪𝘵𝘩 𝘈𝘐 𝘵𝘰𝘰𝘭𝘴 𝘢𝘵 𝘵𝘩𝘦𝘪𝘳 𝘥𝘪𝘴𝘱𝘰𝘴𝘢𝘭

One could say this is an extreme case and perhaps not worth a comparison, but I want to drop this here as food for thought.

I work in the AI/ML domain and I see its potential. I am all for change and adoption, but I am not yet fully bought into a 100% replacement.

Check out the article I wrote two years back on LLMs: Sandy’s take on LLM and RAG so Far


𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗗𝗲𝗯𝘁

I am somewhat starting to like the term "Technical Debt." Let me use myself and some of my colleagues as examples.

I have been using GitHub Copilot for almost six months now. In the early days, I would prompt it, refine it, and let it create entire repositories and solutions for me—my day-to-day tasks and everything else. Once in a while, things would go wrong, and it would take me days to fix them. Also, when I started looking closer, I realized there could be some unnecessary code blocks mixed in with some really smart ones.

After a learning curve, I now go function-by-function or block-by-block, and I have my own way of testing the accuracy of the outputs. The majority of my tasks involve processing large chunks of data and making inferences from them—sometimes processing and passing them to domain experts or top management for decision-making. In this case, I have to be triply sure of what I deliver, so I go step-by-step.

For sure, a week's worth of tasks can now be done in a day or two, and I can generate code that is much more scalable and reusable. The point is: I wouldn't let it run on "auto-pilot" for my entire task just yet.

𝗩𝗶𝗯𝗲 𝗖𝗼𝗱𝗶𝗻𝗴

"Vibe coding"—well, for sure, some have successfully done it. Having gone through it myself, I would classify those successes as 0.01% or even less. If you think about it, no matter the field or the task, you will always find some outliers—those who defy the norm. As for the majority (including myself), we either aren't sure what we are doing or need more practice with the tools.

I like to use the example of Excel a lot. Corporate employees know how to use Excel, but how much one achieves with it depends on their skills and the effort they took to master it. I remember in my Quantitative Finance course, I was doing heavy Python coding for some bond pricing, and the instructor—an expert—did it in Excel right then and there with us. It's the same with my cousin; in five minutes or so, he made an entire loan repayment and amortization sheet for me in Excel.

LLM tools and agents are getting much smarter and faster, but without knowing the basics of the task at hand, things might spiral out of control, and that "Technical Debt" would keep on growing.


𝗜𝘁 𝗶𝘀 𝗔𝗹𝗹 𝗔𝗯𝗼𝘂𝘁 𝗡𝗮𝗿𝗿𝗮𝘁𝗶𝘃𝗲𝘀

Apple, now Anthropic—even recent politics and whatnot—throughout history, it has always been about the narratives one sets and how fast people catch up to them. Yes, you then need a product to support it.

This reminds me of Freedom 251 (India). It was advertised as a smartphone for ₹251. It looked like a scam, but the narrative it set got tons of bookings based on that story alone.

𝗧𝗵𝗲 𝗘𝗰𝗼𝗻𝗼𝗺𝗶𝗰 𝗟𝗼𝗼𝗽

I read this somewhere—imagine AI and robots can do everything and take over most jobs. If people don't have jobs, they have no income to buy stuff—both essentials and non-essentials. If that happens, who will these AI companies sell to? Who will buy the robots?

It is said that an equilibrium will be reached, but one cannot expect everything to go "all-in" in an instant. The world runs on consumerism.

Lastly, in one of his interviews, I heard Jamie Dimon, JP Morgan Chase CEO, saying (based on what I recollect):

We have autonomous driving—does that mean you take 2 million drivers out of work and the next best job they have pays only $25,000 a year? No, you can do it gradually, or have the government pitch in to say, "No, you can't do that," or "Let us do it sensibly."


Anyway, don’t get me wrong—I am all for AI, the change, and the new ways of working, as well as the new skills and job openings that will be brought about by it.

Thinking that AI can do it all? I am still not sold on it.

Tuesday, March 3, 2026

Whose Fault is it ?

I had been thinking for a few days about writing this, and then I stumbled on a post by Kosuke Takeuchi about a tool he developed to generate traffic scenarios. Thanks a ton!





Now, first we see a cyclist and a car minding their own business, going in their proper lanes, when suddenly a HERO comes in on a motorcycle from the wrong way. Or, as the biker says, 'MY WAY or the HIGHWAY'.

Then I play out just two of the many scenarios, and what I want to know is: whose fault is it??


1. The cyclist, for riding a cycle on Indian roads?
2. The cyclist, for riding on the road and in the correct direction? Maybe the cyclist should have used the footpath; if bikers can do it, why not our cyclist?
3. The cyclist, for not riding in the rightmost lane?

4. The car, since the driver should have anticipated and left space for the cyclist?
5. The car, for whatever reason.

6. The Government/RTO, for not being able to enforce the traffic rules effectively?

7. No one to blame. It is just another day of city traffic, especially in India, or Bangalore!

Now, all you experts and noobs out there: who would you put the blame on if you were the judge dealing with this case?


Linkedin post -- drawtonomy_linkedIn post
Tool Link -- https://www.drawtonomy.com/

Saturday, January 3, 2026

Discovering FastF1: Telemetry, Tyre Strategy, and My First Steps with MCP

 I stumbled upon FastF1 only recently — and honestly, I’m still surprised I missed it for so long.

FastF1 is a Python library that exposes an incredible amount of Formula 1 data across an entire race weekend: practice, qualifying, and race sessions. Not just results and lap times, but also detailed telemetry — speed traces, throttle application, braking, tyre stints, compound usage, and much more.

It turns out FastF1 has been around for a while. But as they say, better late than never.

As someone who loves both Formula 1 and building things, discovering FastF1 immediately opened up a flood of ideas.


Why FastF1 feels special

What makes FastF1 exciting is not just the data volume, but the granularity.

Instead of asking:

  • “Who was faster?”

You can start asking:

  • Where was a driver faster?

  • Why did a lap work?

  • How did tyre choices shape race outcomes?

  • How do two qualifying laps differ by just hundredths of a second?

This moves F1 analysis away from headlines and into cause-and-effect.




Enter MCP: learning by building

Around the same time, I had been reading about MCP (Model Context Protocol) and wanted to understand it beyond theory. MCP, at a high level, is about exposing structured tools and data in a way that agents (or other clients) can call reliably.

Rather than learning MCP in isolation, I decided to combine both interests:

  • learn MCP properly

  • apply it to something I genuinely enjoy — Formula 1

So I started building a small F1 MCP server, backed by FastF1 data.

For now, this is very much a learning project — not a product — but it’s already been surprisingly rewarding.


The first two tools I built

At the moment, I’ve implemented just two core functions, keeping things intentionally simple.

1. Tyre strategy visualisation

The first tool generates a tyre strategy timeline for a given race, showing:

  • which compounds each driver used

  • how long each stint lasted

  • how strategies differed across the field

This makes race strategy immediately visual. Instead of reading pit-stop summaries, you can see how races unfolded strategically.
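The stint timeline boils down to grouping laps by driver, stint, and compound. Here is a minimal, self-contained sketch of that grouping in plain Python; in the actual tool the per-lap records come from FastF1's session.laps, and the dict-based record format below is just an assumption for illustration:

```python
from collections import Counter

def summarize_stints(laps):
    """Collapse per-lap records into one row per (driver, stint, compound),
    with the stint length in laps. `laps` is an iterable of dicts with
    'driver', 'stint', and 'compound' keys (a stand-in for FastF1 lap rows)."""
    counts = Counter((lap["driver"], lap["stint"], lap["compound"]) for lap in laps)
    return [
        {"driver": d, "stint": s, "compound": c, "laps": n}
        for (d, s, c), n in sorted(counts.items())
    ]

# Synthetic one-stop race for a single driver: 20 laps on mediums,
# then 31 laps on hards.
laps = (
    [{"driver": "VER", "stint": 1, "compound": "MEDIUM"}] * 20
    + [{"driver": "VER", "stint": 2, "compound": "HARD"}] * 31
)
for row in summarize_stints(laps):
    print(row)
```

Each summary row maps directly onto one bar segment of the strategy timeline.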

The image shows the tyre strategy for each driver; alongside it, in the GitHub Copilot chat, it surfaces further insights into the tyre strategy across the race.




2. Qualifying lap telemetry comparison

The second tool focuses on qualifying, comparing telemetry from the top drivers’ fastest laps.

It plots:

  • speed vs distance

  • throttle application

  • brake application

Side by side, this reveals exactly where time was gained or lost — often in places that don’t show up in sector times alone.
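The core of such a comparison, reduced to a sketch: turn each speed-vs-distance trace into cumulative time, then subtract. The traces below are synthetic stand-ins; in the actual tool the real samples come from FastF1 telemetry.

```python
def cumulative_time(distance_m, speed_kmh):
    """Approximate elapsed time (seconds) at each distance sample of a
    speed-vs-distance trace, using trapezoidal per-segment times."""
    t, total = [0.0], 0.0
    for i in range(1, len(distance_m)):
        seg = distance_m[i] - distance_m[i - 1]
        avg_ms = (speed_kmh[i] + speed_kmh[i - 1]) / 2 / 3.6  # km/h -> m/s
        total += seg / avg_ms
        t.append(total)
    return t

# Two hypothetical laps sampled at the same distances: driver B carries
# more speed through one corner, so the gap opens there and then holds.
dist = [0, 100, 200, 300, 400]
a = [200, 250, 120, 180, 240]   # speeds in km/h
b = [200, 250, 140, 180, 240]
delta = [ta - tb for ta, tb in zip(cumulative_time(dist, a), cumulative_time(dist, b))]
print([round(d, 3) for d in delta])  # positive = driver A is behind
```

Plotting that delta against distance shows exactly where on track the time was lost, which a single sector time would hide.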

You can compare the driving style of the top 3 drivers segment by segment, and in the chat window it shows details about the remaining 7 drivers, i.e., the top 10 qualifying results.



Why MCP fits nicely here

Wrapping these analyses as MCP tools felt natural.

Instead of scripts that only I run locally, MCP encourages thinking in terms of:

  • clear inputs (season, race, session, drivers)

  • predictable outputs (tables, plots, structured data)

This also opens the door to multiple interfaces later:

  • CLI tools

  • dashboards (Streamlit / web)

  • or even AI-driven queries on top of the same data

For now, though, the goal is simple: learn MCP by doing, not by reading specs.


What’s next

There’s a lot more I want to explore:

  • combining telemetry with race notes, penalties, and regulations

  • richer driver-to-driver comparisons

  • experimenting with live data once the 2026 season starts

  • exposing more race-weekend concepts as structured MCP tools

I’ll keep this project intentionally lightweight and exploratory.

If there’s a specific race, driver comparison, or kind of plot you’d like to see, feel free to suggest it — I’ll be iterating on this over the coming weeks purely for learning and fun.

GitHub link coming soon once things settle a bit.

Monday, December 15, 2025

Wrapping up the year on a High! - The Dancing Champs :D

 In March this year, I wrote Fighting the Inner Demons We Create Ourselves, where I mentioned how I started dancing again because of an office event and our little group.

After that event, life went on as usual — no new dance practice, no rehearsals. Then, in the middle of the year, at one of my colleague’s weddings, we decided to perform again… and I went with it.

I joined the group, and this time we even stayed together in a hotel a day before, so we had good focused time to practice and perfect the moves — and everything went well.


Practice Video :)


I enjoyed it. It was fun again. And it made me let go of that hesitation I used to carry around dancing.

Fast-forward to a few days back — I danced again, probably the last time for the year. And like everything else, I became a bit overconfident and didn’t practice enough at home (the homework I was supposed to do 😅).

It showed — I missed a few steps here and there.


Just highlighting the fact that I missed a few steps — but overall we synced and rocked :D


Prof - Sandy !


But the group, as a whole, did well… so well that we ended up being crowned the dance champs for that event!
Fine, fine — it wasn’t a grand show — but winning is winning, and for me, it was a pleasant surprise.

This time it wasn’t just dance — we also had a storyline woven into the performance. The team did a fantastic job bringing everything together. And for the next opportunity, whenever it comes, I’ll make sure to put in more effort and perfect the nitty-gritties.

Sharing some snaps and clips from the event below…



And again — a big THANK YOU to Ragi.
She has been my dance teacher throughout — a trained classical dancer, an expert in her art. It was a privilege to watch her solo performance last evening as well.

I always assumed it must be a cakewalk for her to dance with us to movie songs… and I still remember her saying,
“I have to unlearn my dance to learn this dance.”

Just shows — everyone has their own unique challenges to deal with before they deliver their best. 😊






We Know What’s Healthy. So Why Don’t We Do It?

The real reason we don’t follow healthy habits is not lack of knowledge — it’s lack of a big enough purpose.


Last Sunday, I went to a street-side book vendor and noticed a guy wearing a retro-style cap and sunglasses walk up as well. We had a brief chat, and he casually mentioned that he was buying books for his 90-year-old girlfriend.


(He might have been joking — but honestly, it didn’t feel that way.)

That one line sent my mind in so many directions.

  1. Someone at 90, still active and still reading.

  2. A girlfriend at that age — I mean, the relationship must have gone through so much. Both of them clearly invested in life and each other to make it till here.

  3. The sheer energy and freshness that guy carried.

Then, just last week, while travelling to Mangalore for a friend’s wedding, I happened to cross paths with a Kannada actress — Geetha Suratkal. I learnt who she was during our conversation. She might be in her 70s now.

The first thing I noticed — she wasn’t wearing reading glasses or any kind of specs.
The second — her energy. Happy, smiley, warm.
She mentioned she still travels for work.

She also shared that she’s a retired bank employee and even while working full-time, she used to do theatre. If you’ve seen Sapne vs Everyone, she reminded me of Prashanth — managing a day job while still nurturing passion on the side.

And then this morning, a thought hit me.

We — almost all of us — know what’s good for our health and happiness. For our body, mind, and overall well-being.
Yet, many of us struggle to follow it.
While for a select few, that way of living feels effortless — almost automatic.

Maybe the problem isn’t that we don’t know what to eat, how to sleep, or when to exercise.

Maybe we don’t need to force healthy habits at all.

Maybe what we really need is something meaningful to move towards.

Have a purpose. A goal. A reason.

When you’re training for a marathon, you automatically move towards healthier food, better sleep, and early mornings.
When you’re preparing for a competitive exam, distractions reduce on their own, stressors are filtered out, and you naturally prioritise focus, rest, and mental clarity.

Yesterday, I was listening to ideas from the book 10x Is Better Than 2x. One thought stood out — big goals shape behaviour.
When the dream is big enough, the path becomes clearer. And while walking that path, taking care of your body and mind stops feeling like a task — it just becomes part of the journey.