Shift-Left, Shift-Right: The Twin Strategies Powering Modern IT and Data Operations

In today’s always-on digital enterprises, downtime and performance issues come at a steep cost. The modern DevOps philosophy has redefined how organizations build, test, deploy, and manage software and data systems. Two terms, Shift-Left and Shift-Right, capture this evolution perfectly.

These approaches are not just technical buzzwords; they represent a cultural and operational transformation from reactive troubleshooting to proactive prevention and continuous improvement.

1. What Does “Shift-Left” Mean?

“Shift-Left” is all about moving quality and risk management earlier in the lifecycle, to the “left” of the traditional project timeline.

Historically, teams tested applications or validated data after development. By that stage, identifying and fixing issues became expensive and time-consuming.
Shift-Left reverses that by embedding testing, data validation, and quality assurance right from design and development.

Real-world example:

  • Microsoft uses Shift-Left practices by integrating automated unit tests and code analysis in its continuous integration (CI) pipeline. Each new feature or update is tested within minutes of being committed, drastically reducing post-release defects.
  • In a data engineering context, companies like Databricks and Snowflake promote Shift-Left Data Quality – validating schema, freshness, and business rules within the pipeline itself before data lands in analytics or AI systems.
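
To make the idea concrete, here is a minimal, hypothetical sketch of such a Shift-Left quality gate that a pipeline could run before loading data downstream. The column names, thresholds, and file path are illustrative assumptions, not any specific vendor’s API.

```python
import pandas as pd

# Hypothetical Shift-Left quality gate: validate schema, freshness, and a
# business rule BEFORE the data is allowed to land in analytics systems.
EXPECTED_COLUMNS = {"order_id", "customer_id", "order_ts", "amount"}
MAX_STALENESS_HOURS = 24  # illustrative freshness threshold

def validate_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable violations; an empty list means 'pass'."""
    issues = []

    # 1. Schema check: all expected columns must be present.
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        issues.append(f"Missing columns: {sorted(missing)}")
        return issues  # no point checking further without the schema

    # 2. Freshness check: the newest record must be recent enough
    #    (assumes naive timestamps in the same timezone as the runner).
    newest = pd.to_datetime(df["order_ts"]).max()
    age_hours = (pd.Timestamp.now() - newest).total_seconds() / 3600
    if age_hours > MAX_STALENESS_HOURS:
        issues.append(f"Data is stale: newest record is {age_hours:.1f}h old")

    # 3. Business rule: order amounts must be positive.
    if (df["amount"] <= 0).any():
        issues.append("Business rule violated: non-positive order amounts found")

    return issues

if __name__ == "__main__":
    df = pd.read_csv("orders.csv")  # assumed input extract
    problems = validate_orders(df)
    if problems:
        raise SystemExit("Shift-Left gate failed:\n" + "\n".join(problems))
    print("Quality gate passed; safe to load downstream.")
```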

Why it matters:

  • Reduces defects and rework
  • Improves developer productivity
  • Speeds up deployment cycles
  • Builds confidence in production releases

2. What Does “Shift-Right” Mean?

“Shift-Right” extends testing and validation after deployment, to the “right” of the timeline. It’s about ensuring systems continue to perform and evolve once they’re live in production.

Rather than assuming everything works perfectly after release, Shift-Right emphasizes continuous feedback, monitoring, and learning from real user behavior.

Real-world example:

  • Netflix uses Shift-Right principles through its famous Chaos Engineering practice. By intentionally disrupting production systems (e.g., shutting down random servers), it tests the resilience of its streaming platform in real-world conditions.
  • Airbnb runs canary deployments and A/B tests to validate new features with a subset of users in production before a global rollout – ensuring a smooth and data-driven experience.

Why it matters:

  • Improves reliability and resilience
  • Enables real-time performance optimization
  • Drives continuous learning from production data
  • Enhances customer experience through fast iteration

3. When Shift-Left Meets Shift-Right

In modern enterprises, Shift-Left and Shift-Right are not opposites – they’re complementary halves of a continuous delivery loop.

  • Shift-Left ensures things are built right.
  • Shift-Right ensures they continue to run right.

Together, they create a closed feedback system where insights from production feed back into design and development, creating a self-improving operational model.

Example synergy:

  • A global retailer might Shift-Left by embedding automated regression tests in its data pipelines.
  • It then Shifts-Right by using AI-based anomaly detection in production dashboards to monitor data drift, freshness, and latency (a minimal sketch follows this list).
  • Insights from production failures are looped back into early validation scripts, closing the quality loop.
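
As a sketch of that Shift-Right half, here is a hypothetical z-score check over pipeline latency; the metric history and the 3-sigma threshold are invented for illustration, not tied to any particular monitoring product.

```python
from statistics import mean, stdev

def detect_latency_anomaly(history_minutes: list[float], latest_minutes: float,
                           z_threshold: float = 3.0) -> bool:
    """Flag the latest pipeline latency if it deviates strongly from recent history."""
    mu = mean(history_minutes)
    sigma = stdev(history_minutes)
    if sigma == 0:
        return latest_minutes != mu  # any change from a flat baseline is suspicious
    z = (latest_minutes - mu) / sigma
    return abs(z) > z_threshold

# Illustrative usage: last 7 days of end-to-end pipeline latency (minutes).
history = [42.0, 40.5, 44.1, 41.2, 43.0, 39.8, 42.7]
today = 95.0  # hypothetical spike

if detect_latency_anomaly(history, today):
    print("Latency anomaly detected - alert the data platform team "
          "and feed the incident back into Shift-Left validation rules.")
```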

4. The AI & Automation Angle

Today, AI and AIOps (AI for IT Operations) are supercharging both shifts:

  • Shift-Left AI: Predictive code scanning, intelligent test generation, and synthetic data generation.
  • Shift-Right AI: Real-time anomaly detection, predictive incident management, and self-healing automation.

The result? Enterprises move from manual monitoring to autonomous operations, freeing up teams to focus on innovation instead of firefighting.

The future of enterprise IT and data operations isn’t about reacting to problems – it’s about preventing and learning from them continuously.
“Shift-Left” ensures quality is baked in early; “Shift-Right” ensures reliability is sustained over time.

Together, they represent the heart of a modern DevOps and DataOps culture — a loop of prevention, observation, and evolution.

Canary Deployment Explained: Reducing Production Risk in DevOps with Controlled Releases

Canary deployment is one of those DevOps terms that sounds abstract but is actually a very clever, real-world technique used by top tech companies like Netflix, Amazon, and Google.

Let’s unpack it in a clear, practical way.

What Is a Canary Deployment?

A canary deployment is a progressive rollout strategy where a new version of an application (or data pipeline, model, etc.) is released to a small subset of users or systems first, before deploying it to everyone.

The goal: Test in the real world, minimize risk, and catch issues early – without impacting all users.

Where the Name Comes From

The term comes from the old “canary in a coal mine” practice.

  • Miners used to carry a canary bird underground.
  • If dangerous gases were present, the bird would show distress first – warning miners before it was too late.

Similarly, in software deployment:

  • The “canary group” gets the new version first.
  • If problems occur (e.g., errors, latency spikes, or crashes), the rollout stops or rolls back.
  • If all looks good, the new version gradually reaches 100% of users.

How It Works in Practice

Here’s the step-by-step flow (a minimal automation sketch follows the list):

  1. Deploy new version (v2) to a small portion of traffic (say 5-10%).
  2. Monitor key metrics: performance, error rates, user engagement, latency, etc.
  3. Compare results between the canary version and the stable version (v1).
  4. If KPIs are healthy, automatically scale up rollout (20%, 50%, 100%).
  5. If issues arise, roll back instantly to the previous version.
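
Here is what that loop could look like as a simple rollout controller; the traffic steps, error-rate threshold, and the monitoring, traffic-shifting, and rollback functions are placeholders standing in for your real tooling.

```python
import time

ROLLOUT_STEPS = [5, 10, 20, 50, 100]   # percent of traffic sent to the canary
MAX_ERROR_DELTA = 0.01                 # canary may be at most 1pp worse than stable

def get_error_rate(version: str) -> float:
    """Placeholder: in practice this would query a monitoring system
    (Prometheus, Datadog, Azure Monitor, ...)."""
    raise NotImplementedError

def set_canary_traffic(percent: int) -> None:
    """Placeholder: in practice this would update the load balancer / service mesh."""
    raise NotImplementedError

def rollback() -> None:
    """Placeholder: revert all traffic to the stable version."""
    raise NotImplementedError

def run_canary_rollout(observe_seconds: int = 600) -> None:
    for percent in ROLLOUT_STEPS:
        set_canary_traffic(percent)
        time.sleep(observe_seconds)         # let real traffic hit the canary
        canary_err = get_error_rate("v2")
        stable_err = get_error_rate("v1")
        if canary_err - stable_err > MAX_ERROR_DELTA:
            rollback()                      # issues found: revert instantly
            print(f"Rollback at {percent}% traffic: canary error rate too high")
            return
        print(f"Canary healthy at {percent}% traffic, expanding rollout")
    print("Canary promoted to 100% of traffic")
```

In practice the comparison usually covers several KPIs (latency, error rate, engagement), but the promote-or-rollback decision loop is the same.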

Example Scenarios

1) Web / App Deployment

A global streaming platform like Netflix releases a new recommendation algorithm:

  • First, 5% of users in Canada get the new algorithm.
  • Netflix monitors playback time, user retention, and error logs.
  • If everything looks good, it expands to North America, then globally.

2) Data Pipeline or Analytics System

A retailer introduces a new real-time data ingestion flow:

  • It runs in parallel with the old batch flow for one region (the canary).
  • Teams compare data accuracy, latency, and system load.
  • After validation, the new pipeline fully replaces the old one.

Benefits

| Benefit | Description |
| --- | --- |
| Reduced Risk | Problems affect only a small user group initially |
| Faster Feedback | Real-world validation of performance & stability |
| Controlled Rollout | Gradual scaling based on metrics |
| Easy Rollback | Quick reversion to the stable version if issues occur |

Challenges

  • Requires strong observability and real-time monitoring tools (like Datadog, Prometheus, or Azure Monitor).
  • Needs automated rollback scripts and infrastructure-as-code setup.
  • Works best in containerized environments (Kubernetes, Docker, etc.) for version control and isolation.

In Summary

Canary deployment = “Release small, observe fast, scale safely.”

It’s a smart middle ground between risky full releases and overly cautious manual rollouts – ensuring continuous innovation with minimal disruption.

Databricks AI/BI: What It Is & Why Enterprises Should Care

In the world of data, modern enterprises wrestle with three big challenges: speed, accuracy, and usability. You want insights fast, you want them reliable, and you want non-technical people (execs, marketers, operations) to be able to get value without depending constantly on data engineers.

That’s where Databricks AI/BI comes in—a newer offering from Databricks that blends business intelligence with AI so that insights become more accessible, real-time, and trustworthy.

What is Databricks AI/BI?

Databricks AI/BI is a product suite that combines a low-code / no-code dashboarding environment with a conversational interface powered by AI. Key components include:

  • AI/BI Dashboards: Let users create interactive dashboards and visualizations, often using drag-and-drop or natural-language prompts. The dashboards integrate with Databricks’ SQL warehouses and the Photon engine for high performance.
  • Genie: A conversational, generative-AI interface where users can ask questions in natural language, get responses as visuals or SQL, dig deeper through follow-ups, get suggested visualizations, and more. It learns over time via usage and feedback.
  • Built on top of Unity Catalog, which handles governance, lineage, and permissions. This ensures that all dashboards and responses are trustworthy and auditable.
  • Native integration with Databricks’ data platform (SQL warehouses, Photon engine, etc.), so enterprises don’t need to extract data elsewhere for BI. This improves freshness, lowers duplication, and simplifies management.

Databricks Genie

AI/BI Genie uses a compound AI system rather than a single, monolithic AI model.

Matei Zaharia and Ali Ghodsi, two of the founders of Databricks, describe a compound AI system as one that “tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools.”

Use Cases: How Enterprises Are Using AI/BI

Here are some of the ways enterprises are applying it, or can apply it:

  1. Ad-hoc investigations of customer behaviour
    Business users (marketing, product) can use Genie to ask questions like “Which customer cohorts churned in the last quarter?” or “How did a campaign perform in region X vs Y?”, without waiting for engineers to build SQL pipelines.
  2. Operational dashboards for teams
    For operations, supply chain, finance, etc.: dashboards that update frequently, with interactive filtering and cross-visualization slicing, giving teams real-time monitoring.
  3. Reducing the BI backlog and bottlenecks
    When data teams are overwhelmed by requests for new dashboards, tools that let business users do more themselves free up engineering to focus on more strategic work (data pipelines, ML, etc.).
  4. Governance and compliance
    Enterprises in regulated industries (finance, healthcare, etc.) need traceability: where data came from, who used it, and what transformations it passed through. With Unity Catalog lineage and trusted assets in Databricks, AI/BI supports that.
  5. Data democratization
    Spreading data literacy: by lowering the barrier, a wider set of users can explore, ask questions, and derive insights. This builds a data culture.
  6. Integration with ML / AI workflows
    Because it’s on Databricks, it’s easier to connect dashboards and conversational insights with predictive models, bringing in forecasts, anomaly detection, etc., or even embedding BI into AI-powered apps.

Comparison

| Feature | Databricks AI/BI + Genie | Tableau Ask Data | Power BI (with Copilot / Q&A) |
| --- | --- | --- | --- |
| Parent Platform | Databricks Lakehouse (unified data, AI & BI) | Tableau / Salesforce ecosystem | Microsoft Fabric / Power Platform |
| Core Vision | Unify data, AI, and BI in one governed Lakehouse. BI happens where data lives. | Simplify visualization creation via natural language. | Infuse Copilot into all Microsoft tools — including BI — for everyday productivity. |
| AI Layer | Genie – a generative AI agent trained on enterprise data, governed by Unity Catalog. | Ask Data – NLP-based query translation for Tableau data sources. | Copilot / Q&A – GPT-powered natural language for Power BI datasets, integrated into Fabric. |
| Underlying Data Model | Databricks SQL Warehouse (Photon Engine) – operates directly on Lakehouse data (no extracts). | Extract-based (Hyper engine) or live connection to relational DBs. | Semantic Model / Tabular Dataset inside Power BI Service. |
| Governance | Strong – via Unity Catalog (data lineage, permissions, certified datasets). | Moderate – uses Tableau permissions and data source governance. | Strong – via Microsoft Purview + Fabric unified governance. |
| User Experience | Conversational (chat-style) + dashboard creation. Unified with AI/BI dashboards. | Type queries in Ask Data → generates visual. Embedded inside Tableau dashboards. | Ask natural language inside Power BI (Q&A) or use Copilot to auto-build visuals/reports. |
| Performance | Very high (Photon vectorized execution). Real-time queries on raw or curated data. | Depends on extract refresh or live connection. | Excellent on in-memory Tabular Models; limited by dataset size. |
| AI Customization | Uses enterprise metadata from Unity Catalog; can fine-tune prompts with context. | Limited NLP customization (no fine-tuning). | Some customization using “synonyms” and semantic model metadata. |
| Integration with ML/AI Models | Natively integrated (Lakehouse supports MLflow, feature store, LLMOps). | External ML integration (via Salesforce Einstein or Python). | Integrated via Microsoft Fabric + Azure ML. |
| Ideal User Persona | Enterprises already in the Databricks ecosystem (data engineers, analysts, PMs, CXOs). | Business analysts and Tableau users who want easier visual exploration. | Office 365 / Azure enterprises seeking seamless Copilot-powered analytics. |

Conclusion

Databricks AI/BI is a powerful step forward in the evolution of enterprise analytics. It blends BI and AI so that enterprises can move faster, more securely, and more democratically with their data.

All three tools represent the evolution of Business Intelligence toward “AI-Native BI.” But here’s the philosophical difference:

  • Tableau → still visualization-first, AI as a helper.
  • Power BI → productivity-first, AI as a co-pilot.
  • Databricks → data-first, AI as the core intelligence layer that unifies data, analytics, and governance.

For organizations that already use Databricks or are building a data lakehouse / unified analytics platform, AI/BI offers a way to deprecate some complex pipelines, reduce the BI backlog, and bring more teams into analytics, all while maintaining governance and performance.

References:

https://learn.microsoft.com/en-us/azure/databricks/genie

https://atlan.com/know/databricks/databricks-ai-bi-genie

Machine Learning Without Fear: The Simple Math You Really Need to Know

When you hear “Machine Learning,” you might imagine walls of equations and Greek letters — but here’s a secret:

The math behind ML isn’t scary — it’s just describing how we humans learn from patterns.

Let’s decode it together, step by step, using things you already understand.

1. Statistics — Learning from Past Experience

Imagine you run a small café.
Every day, you note:

  • How many people came in,
  • What they ordered,
  • What the weather was like.

After a few months, you can guess:

  • “Rainy days = more coffee orders”
  • “Weekends = more desserts”

That’s Statistics in a nutshell — using past data to make smart guesses about the future.

Key ideas (in café language)

| Concept | Simple Explanation | Why It Matters in ML |
| --- | --- | --- |
| Average (Mean) | The typical day at your café. | Models find the average behavior in data. |
| Variation | Some days are busier, some quieter. | Helps models know what’s “normal” or “unusual.” |
| Probability | “If it rains, there’s a 70% chance coffee sales go up.” | Used for making predictions under uncertainty. |
| Bayes’ Theorem | When you get new info (e.g., forecast says rain), you update your belief about sales. | Helps AI update its understanding as it gets new data. |
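
As a tiny worked example of the Bayes’ Theorem row above, here is a sketch of updating the belief that today will be a busy day once the forecast says rain; all the probabilities are made-up café numbers.

```python
# Prior belief: 30% of days are "busy" at the café.
p_busy = 0.30

# Likelihoods (assumed from past records):
p_rain_given_busy = 0.70      # it was raining on 70% of busy days
p_rain_given_not_busy = 0.20  # it was raining on 20% of quiet days

# Total probability of seeing rain on any given day.
p_rain = p_rain_given_busy * p_busy + p_rain_given_not_busy * (1 - p_busy)

# Bayes' Theorem: update the belief after seeing the rain forecast.
p_busy_given_rain = p_rain_given_busy * p_busy / p_rain
print(f"P(busy | rain) = {p_busy_given_rain:.2f}")   # 0.60 - belief doubled
```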

Real-world ML use:

  • Spam detection: “Emails containing words like ‘win’ or ‘offer’ have a 90% chance of being spam.”
  • Credit card fraud: “Unusual spending = possible fraud.”

2. Linear Algebra — Understanding Data as Tables

Let’s stick with your café.

Every customer can be described by numbers:

  • Age
  • Time of visit
  • Amount spent

If you record 100 customers, you now have a big table — 100 rows and 3 columns.

That’s a matrix.
And the way you manipulate, compare, or combine these tables? That’s Linear Algebra.

Key ideas (in real-world terms)

| Concept | Everyday Analogy | Why It Matters in ML |
| --- | --- | --- |
| Vector | A list of numbers (like each customer’s data). | One vector per customer, image, or product. |
| Matrix | A big table full of vectors (like your sales spreadsheet). | The main format for all data in ML. |
| Matrix Multiplication | Combining two tables — like linking customer orders with menu prices to find total sales. | Neural networks do this millions of times per second. |
| Dimensionality Reduction | If you have too many columns (like 100 features), you find the most important ones. | Speeds up ML models and removes noise. |
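
To ground the matrix-multiplication row above, here is a tiny NumPy sketch that links customer orders with menu prices to get each customer’s bill; the quantities and prices are invented café data.

```python
import numpy as np

# Each row is one customer; columns are quantities of [coffee, cake, sandwich].
orders = np.array([
    [2, 1, 0],   # customer A: 2 coffees, 1 cake
    [1, 0, 2],   # customer B: 1 coffee, 2 sandwiches
    [0, 3, 1],   # customer C: 3 cakes, 1 sandwich
])

# Price of each item.
prices = np.array([3.0, 4.5, 6.0])

# Matrix-vector multiplication gives every customer's total bill at once -
# the same kind of operation a neural network layer performs on its inputs.
bills = orders @ prices
print(bills)   # [10.5 15.  19.5]
```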

Real-world ML use:

  • In image recognition: Each image = a giant table of pixel numbers.
    The computer uses matrix math to detect shapes, edges, and faces.
    (Like combining Lego blocks to build a face piece by piece.)

3. Calculus — The Math of Improvement

Imagine your café prices are too high — people stop coming.
If they’re too low, you don’t make a profit.

So, you adjust slowly — a few rupees up or down each week — until you hit the sweet spot.

That’s what Calculus does in ML — it teaches the model how to adjust until it performs best.

Key ideas (in plain English)

| Concept | Analogy | Why It Matters in ML |
| --- | --- | --- |
| Derivative / Gradient | Think of it as your “profit slope.” If the slope is going up, keep going that way. If it’s going down, change direction. | Used to find which model parameters to tweak. |
| Gradient Descent | Like walking down a hill blindfolded — one small step at a time, feeling which way is downhill. | How models learn — by slowly reducing their “error.” |
| Backpropagation | When the model realizes it made a mistake, it walks back through the steps and adjusts everything. | How neural networks correct themselves. |
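
Here is a minimal gradient-style sketch of the idea: nudge the coffee price step by step along the profit slope until it stops improving. The demand curve and learning rate are toy assumptions.

```python
# Toy model: demand falls as price rises, so profit = price * demand has a sweet spot.
def profit(price: float) -> float:
    demand = 100 - 8 * price          # assumed linear demand curve
    return price * demand

def profit_slope(price: float) -> float:
    return 100 - 16 * price           # derivative of profit with respect to price

price = 2.0            # starting guess
learning_rate = 0.01   # size of each small step

for step in range(200):
    # Gradient ascent on profit (equivalently, descent on the loss = -profit).
    price += learning_rate * profit_slope(price)

print(f"Best price found: {price:.2f}")        # converges near 6.25
print(f"Profit at that price: {profit(price):.2f}")
```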

Real-world ML use:

  • When you train an AI to recognize cats, it guesses wrong at first.
    Then, calculus helps it slowly tweak its “thinking” until it gets better and better.

4. Probability — The Science of “How Likely”

Let’s say your café app tries to predict what a customer will order.

It might say:

  • 70% chance: Cappuccino
  • 20% chance: Latte
  • 10% chance: Croissant

The app doesn’t know for sure — it just predicts what’s most likely.
That’s probability — the core of how AI deals with uncertainty.

Real-world ML use:

  • Predicting the chance a patient has a disease based on symptoms.
  • Suggesting the next movie you’ll probably like.

5. Optimization — Finding the Best Possible Answer

Optimization is just a fancy word for fine-tuning decisions.

Like:

  • What’s the best coffee price?
  • What’s the fastest delivery route?
  • What’s the lowest error in prediction?

Machine Learning uses optimization to find the best set of parameters that make predictions most accurate.
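
As a sketch of that search-for-the-best idea, here is a brute-force optimization over candidate coffee prices using the same toy demand curve as before; in real ML the “parameters” are model weights and the “score” is prediction error, but the principle is identical.

```python
# Try many candidate prices and keep the one with the highest profit.
def profit(price: float) -> float:
    demand = 100 - 8 * price          # assumed demand curve (toy numbers)
    return price * demand

candidate_prices = [round(p * 0.25, 2) for p in range(1, 50)]   # 0.25 ... 12.25
best_price = max(candidate_prices, key=profit)

print(f"Best price: {best_price}, profit: {profit(best_price):.2f}")
# In ML, optimizers like gradient descent do this search far more efficiently,
# over millions of parameters instead of one price.
```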

Real-world ML use:

  • Uber uses optimization to match drivers and riders efficiently.
  • Airlines use it to plan routes that save fuel and time.

The Big Picture: How It All Connects

| Stage | What’s Happening | The Math Behind It |
| --- | --- | --- |
| Collecting Data | You record what’s happening | Statistics |
| Representing Data | You store it as rows and columns | Linear Algebra |
| Learning from Data | You tweak the model until it performs well | Calculus + Optimization |
| Making Predictions | You estimate what’s most likely | Probability |
| Evaluating | You check how good your guesses are | Statistics again! |

Final Analogy: The Learning Café

| Role | In Your Café | In ML |
| --- | --- | --- |
| Statistics | Studying what sells best | Understanding patterns |
| Linear Algebra | Organizing all your customer data | Representing data |
| Calculus | Adjusting prices and offers | Improving model accuracy |
| Probability | Guessing what customers might buy | Making predictions |
| Optimization | Finding best combo of price & menu | Fine-tuning model for best results |

In short:

Machine Learning is just a smart café — serving predictions instead of coffee!

It learns from data (customers), improves over time (adjusting recipes), and uses math as the recipe book that makes everything work smoothly.

Understanding Tribes, Guilds, Pods/Squads in Agile

When working with large enterprises, understanding the organizational constructs of scaled Agile delivery – Tribes, Guilds, Pods, ARTs, PI Planning, and more – is critical. These aren’t just buzzwords; they define how data, analytics, and product teams operate together at scale under frameworks like SAFe (Scaled Agile Framework) or the Spotify Model (which many organizations have blended).

Let’s unpack everything in simple, visual-friendly terms.

Big Picture: Why These Structures Exist

When Agile scaled beyond small software teams, companies realized:

  • One team can’t own end-to-end delivery for large systems.
  • But dozens of Agile teams working in silos = chaos.
  • Hence, Scaled Agile introduced structures that balance autonomy + alignment.

That’s where Tribes, Pods, Guilds, ARTs, Value Streams, and Chapters come in.

Key Organizational Constructs in SAFe + Spotify-style Agile

| Term | Origin | What It Means | Typical Use in D&A / Tech Organizations |
| --- | --- | --- | --- |
| Pod | Spotify model | A small, cross-functional team (6–10 people) focused on a single feature, domain, or use-case. | e.g., “Revenue Analytics Pod” with Data Engineer, BI Developer, Data Scientist, Product Owner. |
| Squad | Spotify model | Similar to a Pod — autonomous Agile team that delivers end-to-end functionality. | e.g., “Guest Personalization Squad” responsible for AI-driven recommendations. |
| Tribe | Spotify model | A collection of related Pods/Squads working on a common business domain. | e.g., “Customer 360 Tribe” managing all loyalty, guest data, and personalization products. |
| Chapter | Spotify model | A functional community across squads — ensures consistency in technical skills, tools, and best practices. | e.g., Data Engineering Chapter, BI Chapter, Data Science Chapter. |
| Guild | Spotify model | A community of interest that cuts across the org — informal learning or best-practice sharing group. | e.g., Cloud Cost Optimization Guild, AI Ethics Guild. |
| ART (Agile Release Train) | SAFe | A virtual organization (50–125 people) of multiple Agile teams aligned to a common mission & cadence (PI). | e.g., “D&A Platform ART” delivering all analytics platform capabilities. |
| Value Stream | SAFe | A higher-level grouping of ARTs focused on delivering a business outcome. | e.g., “Customer Experience Value Stream” containing ARTs for loyalty, personalization, and customer analytics. |
| PI (Program Increment) | SAFe | A fixed timebox (8–12 weeks) for ARTs to plan, execute, and deliver. | Enterprises do PI Planning quarterly across D&A initiatives. |
| RTE (Release Train Engineer) | SAFe | The chief scrum master of the ART — facilitates PI planning, removes impediments. | Coordinates between multiple pods/squads. |
| Product Owner (PO) | Agile | Owns the team backlog; defines user stories and acceptance criteria. | Often aligned with one pod/squad. |
| Product Manager (PM) | SAFe | Owns the program backlog (features/epics) and aligns with business outcomes. | Defines strategic direction for ART or Tribe. |
| Solution Train | SAFe | Coordinates multiple ARTs when the solution is large (enterprise-level). | e.g., Enterprises coordinating multiple ARTs for org-wide data modernization. |
| CoE (Center of Excellence) | Enterprise term | A centralized body for governance, standards, and enablement. | e.g., Data Governance CoE, AI/ML CoE, BI CoE. |

What Is Unique About the Spotify Model?

The Spotify model champions team autonomy, so that each team (or Squad) selects its own framework (e.g., Scrum, Kanban, or Scrumban). Squads are organized into Tribes and Guilds to help keep people aligned and cross-pollinate knowledge. For more details on this, I encourage you to read this article.

There is also one more useful resource on Scaling Agile @ Spotify.

Simplified Analogy

Think of a cruise ship 🙂

| Cruise Concept | Agile Equivalent |
| --- | --- |
| The Ship | The Value Stream (business goal) |
| Each Deck | An ART (Agile Release Train) – a functional area like Guest Analytics or Revenue Ops |
| Each Department on Deck | A Tribe (Marketing, Data, IT Ops) |
| Teams within Department | Pods/Squads working on features |
| Crew with Same Skill (Chefs, Engineers) | Chapters – same skill family |
| Community of Passion (Wine Enthusiasts) | Guilds – voluntary learning groups |
| Captain / Officers | RTE / Product Manager / Architects |

In a Data & Analytics Organization (Example Mapping)

| Agile Construct | D&A Example |
| --- | --- |
| Pod / Squad | Loyalty Analytics Pod building retention dashboards and models. |
| Tribe | Customer 360 Tribe uniting Data Engineering, Data Science, and BI pods. |
| Chapter | Data Quality Chapter ensuring consistent metrics, lineage, and governance. |
| Guild | AI Experimentation Guild sharing learnings across data scientists. |
| ART | D&A Platform ART orchestrating data ingestion, governance, and MLOps. |
| PI Planning | Quarterly sync for backlog prioritization and dependency resolution. |
| RTE / PM | Ensuring alignment between business priorities and data delivery roadmap. |

Summary

  • Pods/Squads → Smallest Agile unit delivering value.
  • Tribes → Group of pods delivering a shared outcome.
  • Chapters → Skill-based group ensuring quality & standards.
  • Guilds → Interest-based communities sharing best practices.
  • ARTs / Value Streams → SAFe structures aligning all of the above under a common business mission.
  • PI Planning → The synchronization event to plan and execute at scale.

Software Is Changing (Again) – The Dawn of Software 3.0

When Andrej Karpathy titled his recent keynote “Software Is Changing (Again),” it wasn’t just a nice slogan. It marks what he argues is a fundamental shift in the way we build, think about, and interact with software. Based on his talk at AI Startup School (June 2025), here’s what “Software 3.0” means, why it matters, and how you can prepare.

What Is Changing: The Three Eras

Karpathy frames software evolution in three eras:

| Era | What It Meant | Key Characteristics |
| --- | --- | --- |
| Software 1.0 | Traditional code-first era: developers write explicit instructions, rules, algorithms. Think C++, Java, manual logic. | Highly deterministic, rule-based; heavy human specification; hard to scale certain tasks, especially with unstructured or subtle data. |
| Software 2.0 | Rise of machine learning / neural nets: train models on data vs. hand-coding every condition. The model “learns” patterns. | Better handling of unstructured data (images, text), but still needs labeled data, training, testing, deployment. Not always interpretable. |
| Software 3.0 | The new shift: large language models (LLMs) + natural language / prompt-driven interfaces become first-class means of programming. “English as code,” vibe coding, agents, prompt/context engineering. | You describe what you want; software (via LLMs) helps shape it. More autonomy, more natural interfaces. Faster prototyping. But also new risks (hallucinations, brittleness, security, lack of memory) and a need for human oversight. |

Let’s break down what he means.

The “Old” World: Software 1.0

For the last 50+ years, we’ve lived in the era of what Karpathy calls “Software 1.0.” This is the software we all know and love (or love to hate). It’s built on a simple, deterministic principle:

  1. A human programmer writes explicit rules in a programming language like Python, C++, or Java.
  2. The computer compiles these rules into binary instructions.
  3. The CPU executes these instructions, producing a predictable output for a given input.

Think of a tax calculation function. The programmer defines the logic: if income > X, then tax = Y. It’s precise, debuggable, and entirely human-written. The programmer’s intellect is directly encoded into the logic. The problem? Its capabilities are limited by the programmer’s ability to foresee and explicitly code for every possible scenario. Teaching a computer to recognize a cat using Software 1.0 would require writing millions of lines of code describing edges, textures, and shapes—a nearly impossible task.

The Emerging World: Software 2.0

The new paradigm, “Software 2.0,” is a complete inversion of this process. Instead of writing the rules, we curate data and specify a goal.

  1. A human programmer (or, increasingly, an “AI Engineer”) gathers a large dataset (e.g., millions of images of cats and “not-cats”).
  2. They define a flexible neural network architecture—a blank slate capable of learning complex patterns.
  3. They specify a goal or a “loss function” (e.g., “minimize the number of incorrect cat identifications”).
  4. Using massive computational power (GPUs/TPUs), an optimization algorithm (like backpropagation) searches the vast space of possible neural network configurations to find one that best maps the inputs to the desired outputs.

The “code” of Software 2.0 isn’t a set of human-readable if/else statements. It’s the learned weights and parameters of the neural network—a massive matrix of numbers that is completely inscrutable to a human. We didn’t write it; we grew it from data.

As Karpathy famously put it, think of the neural network as the source code, and the process of training as “compiling” the data into an executable model.

The Rise of the “AI Engineer” and the LLM Operating System: Software 3.0

Karpathy’s most recent observations take this a step further with the explosion of Large Language Models (LLMs) like GPT-4. He describes the modern AI stack as a new kind of operating system.

In this analogy:

  • The LLM is the CPU—the core processor, but for cognitive tasks.
  • The context window is the RAM—the working memory.
  • Prompting is the programming—the primary way we “instruct” this new computer.
  • Tools and APIs (web search, code execution, calculators) are the peripherals and I/O.

This reframes the role of the “AI Engineer.” They are now orchestrating these powerful, pre-trained models, “programming” them through sophisticated prompting, retrieval-augmented generation (RAG), and fine-tuning to build complex applications. This is the practical, applied side of the Software 2.0 revolution that is currently creating a gold rush in the tech industry.

Core Themes from Karpathy’s Keynote

Here are some of the biggest insights:

  • Natural Language as Programming Interface: Instead of writing verbose code, developers (and increasingly non-developers) can prompt LLMs in English (or any human language) to generate code, UI, and workflows. Karpathy demos “MenuGen,” a vibe-coding app prototype, as an example of how quickly one can build via prompts.
  • LLMs as the New Platform / OS: Karpathy likens current LLMs to utilities or operating systems: infrastructure layers that provide default capabilities and can be built upon. Labs like OpenAI, Anthropic become “model fabs” producing foundational layers; people will build on top of them.
  • Vibe Coding & Prompt Engineering: He introduces / popularizes the idea of “vibe coding” — where the code itself feels less visible, you interact via prompts, edits, possibly via higher levels of abstraction. With that comes the need for better prompt or context engineering to reduce errors.
  • Jagged Intelligence: LLMs are powerful in some domains, weak in others. They may hallucinate, err at basic math, or make logically inconsistent decisions. Part of working well with this new paradigm is designing for those imperfections: human-in-the-loop review, verification, and testing.
  • Building Infrastructure for Agents: Karpathy argues that software needs to be architected so LLMs / agents can interact with it, consume documentation, knowledge bases, have memory/context, manage feedback loops. Things like llms.txt, agent-friendly docs, and file & knowledge storage that is easy for agents to read/interpret.

Final Thoughts: Is This Another Shift – Or the Same One Again?

Karpathy would say this is not just incremental – “Software is changing again” implies something qualitatively different. In many ways, Software 3.0 composes both the lessons of Software 1.0 (like performance, correctness, architectural rigor) and Software 2.0 (learning from data, dealing with unstructured inputs), but adds a layer where language, agents, and human-AI collaboration become central.

In a nutshell: we’re not just upgrading the tools; we’re redefining what software means.

Vibe Coding: The Future of Intuitive Human-AI Collaboration

In the last decade, coding has undergone multiple evolutions – from low-code to no-code platforms, and now, a new paradigm is emerging: Vibe Coding. Unlike traditional coding that demands syntax mastery, vibe coding focuses on intent-first interactions, where humans express their needs in natural language or even visual/gestural cues, and AI translates those “vibes” into functional code or workflows.

Vibe coding is the emerging practice of expressing your intent in natural language – then letting artificial intelligence (AI), typically a large language model (LLM), turn your request into real code. Instead of meticulously writing each line, users guide the AI through prompts and incremental feedback.

The phrase, popularized in 2025 by Andrej Karpathy, means you focus on the big-picture “vibes” of your project, while AI brings your app, script, or automation to life. Think of it as shifting from “telling the computer what to do line by line” to “expressing what you want to achieve, and letting AI figure out the how.”

What Exactly Is Vibe Coding?

Vibe coding is the practice of using natural, context-driven prompts to co-create software, analytics models, or workflows with AI. Instead of spending time memorizing frameworks, APIs, or libraries, you explain the outcome you want, and the system translates it into executable code.

It’s not just about speeding up development — it’s about democratizing problem-solving for everyone, not just developers.

Who Can Benefit from Vibe Coding?

1. Software Developers

  • Use Case: A full-stack developer wants to prototype a new feature for a web app. Instead of manually configuring routes, data models, and UI components, they describe:
    “Build me a login page with Google and Apple SSO, a dark theme toggle, and responsive design.”
  • Impact: Developers move from repetitive coding to higher-order design and architecture decisions.
  • Tools: GitHub Copilot, Replit, Cursor IDE.

2. Data Scientists

  • Use Case: A data scientist is exploring customer churn in retail. Instead of hand-coding all preprocessing, they vibe with the AI:
    “Clean this dataset, remove outliers, and generate the top 5 predictors of churn with SHAP explanations.”
  • Impact: Faster experimentation and less time lost in boilerplate tasks like data cleaning.
  • Tools: Jupyter Notebooks with AI assistants, Dataiku

3. Business Professionals (Non-Technical Users)

  • Use Case: A marketing manager needs a personalized email campaign targeting lapsed customers. Instead of calling IT or external agencies, they simply ask:
    “Create a 3-email reactivation journey for customers who haven’t purchased in 90 days, with subject lines optimized for open rates.”
  • Impact: Empowers business teams to execute data-driven campaigns without technical bottlenecks.
  • Tools: Jasper, Canva, HubSpot with AI assistants, ChatGPT plugins.

Case Study: Vanguard and the Webpage-Prototype Example of Vibe Coding

“Even financial giants like Vanguard are using vibe coding to prototype webpages — cutting design/prototyping time from ~two weeks to ~20 minutes.”

Vanguard’s Divisional Chief Information Officer for Financial Adviser Services (Wilkinson) described how Vanguard’s team (product + design + engineering) is using vibe coding to build new webpages more quickly.

They reported that a new webpage which used to take ~2 weeks to design and prototype now takes about 20 minutes via this vibe-coding process – a dramatic reduction in prototyping and design-handoff time.

The caveat: engineers are still very involved – particularly in defining boundaries and quality/security guardrails, and in ensuring that what the AI or product/design people produce makes sense and is safe and maintainable.

Why Vibe Coding Matters

  • Bridges the gap between technical and non-technical stakeholders.
  • Accelerates innovation by reducing time spent on repetitive, low-value tasks.
  • Fosters creativity, allowing people to focus on “what” they want instead of “how” to build it.
  • Democratizes AI/ML adoption, giving even small businesses the ability to leverage advanced tools.

Popular tools enabling vibe coding today:

  • Lovable: Full-stack web apps; “dream out loud, deploy in minutes.”
  • Bolt: Integrates with Figma, GitHub, Stripe; great for visual + technical users.
  • Cursor: Chat-based AI coding; integrates with local IDEs and version control.
  • Replit: Cloud IDE, easy deployment, collaborative.
  • Zapier Agents: No-code workflows automated by AI.

The Road Ahead

Vibe coding is not about replacing developers, analysts, or business strategists — it’s about elevating them. The people who thrive in this new era won’t just be coders; they’ll be designers of intent, skilled in articulating problems and curating AI-driven solutions.

In the future, asking “what’s the vibe?” may not just be slang — it might be the most powerful way to code.

From BOT to Co-Innovation: Emerging Client–Service Provider Operating Models in IT and Analytics

In today’s hyper-competitive business environment, IT, analytics, and data functions are no longer just support arms – they are core drivers of growth, innovation, and customer experience. As organizations seek to unlock value from technology and data at scale, the way they engage with external service providers is evolving rapidly.

Gone are the days when a single outsourcing contract sufficed. Instead, we’re seeing flexible, outcome-oriented, and co-ownership-driven operating models that deliver speed, scalability, and sustained impact.

This article explores some common, successful, and emerging operating models between enterprise clients and IT/Analytics/Data services firms, focusing on sustainability, strategic value, and growth potential for the vendor.

Established & Common Models

  1. Staff Augmentation:
    • How it Works: You provide individual skilled resources (Data Engineers, BI Analysts, ML Scientists) to fill specific gaps within the client’s team. Client manages day-to-day tasks.
    • Pros (Client): Quick access to skills, flexibility, lower perceived cost.
    • Pros (Vendor): Easy to sell, predictable FTE-based revenue.
    • Cons (Vendor): Low strategic value, commoditized, easily replaced, limited growth per client. Revenue = # of Resources.
    • When it Works: Short-term peaks, very specific niche skills, initial relationship building.
  2. Project-Based / Statement of Work (SOW):
    • How it Works: You deliver a defined project (e.g., “Build a Customer 360 Dashboard,” “Migrate Data Warehouse to Cloud”). Fixed scope, timeline, price (or T&M). Build-Operate-Transfer (BOT) model is one such example where you build the capability (people, processes, platforms), operate it for a fixed term, and then transfer it to the client.
    • Pros (Client): Clear deliverables, outcome-focused (for that project), controlled budget.
    • Pros (Vendor): Good for demonstrating capability, potential for follow-on work.
    • Cons (Vendor): Revenue stops at project end (“project cliff”), constant re-sales effort, scope creep risks, less embedded relationship. Revenue = Project Completion.
    • When it Works: Well-defined initiatives, proof-of-concepts (PoCs), specific technology implementations.
  3. Managed Services / Outsourcing:
    • How it Works: You take full responsibility for operating and improving a specific function or platform based on SLAs/KPIs (e.g., “Manage & Optimize Client’s Enterprise Data Platform,” “Run Analytics Support Desk”). Often priced per ticket/user/transaction or fixed fee.
    • Pros (Client): Predictable cost, risk transfer, access to specialized operational expertise, focus on core business.
    • Pros (Vendor): Steady, annuity-like revenue stream, deeper client integration, opportunity for continuous improvement upsells.
    • Cons (Vendor): Can become commoditized, intense SLA pressure, requires significant operational excellence. Revenue = Service Delivery.
    • When it Works: Mature, stable processes requiring ongoing maintenance & optimization (e.g., BI report production, data pipeline ops).

Strategic & High-Growth Models (Increasingly Common)

  1. Dedicated Teams / “Pods-as-a-Service” (Evolution of Staff Aug):
    • How it Works: You provide a pre-configured, cross-functional team (e.g., 1 Architect + 2 Engineers + 1 Analyst) working exclusively for the client, often embedded within their GCC. You manage the team’s HR/performance; the client directs the work.
    • Pros (Client): Scalable capacity, faster startup than hiring, retains control.
    • Pros (Vendor): Stronger stickiness than individual staff aug, predictable revenue (based on team size), acts as a “foot in the door” for broader work. Revenue = Team Size.
    • Emerging Twist: Outcome-Based Pods: Pricing linked partially to team output or value metrics (e.g., features delivered, data quality improvement).
  2. Center of Excellence (CoE) Partnership (Strategic):
    • How it Works: Jointly establish and operate a CoE within the client’s organization (often inside their GCC). You provide leadership, methodology, IP, specialized skills, and training, with a mix of your and client staff. A GCC could have multiple CoEs within it, and each client business unit can customize its operating model (e.g., BOT, BOTT). In BOTT (Build-Operate-Transform-Transfer), you add a transformation phase (modernization / automation) before transferring it to the client, to maximize value and maturity.
    • Pros (Client): Accelerated capability build, access to best practices/IP, innovation engine.
    • Pros (Vendor): Deep strategic partnership, high-value positioning (beyond delivery), revenue from retained expertise/IP/leadership roles, grows as CoE scope expands. Revenue = Strategic Partnership + Services.
    • Key for Growth: Positioned for all high-value work generated by the CoE.
  3. Value-Based / Outcome-Based Pricing:
    • How it Works: Fees tied directly to measurable business outcomes achieved (e.g., “% reduction in equipment maintenance downtime,” “$ increase in ancillary revenue per customer,” “hours saved in operations planning”). Often combined with another model (e.g., CoE or Managed Service).
    • Pros (Client): Aligns vendor incentives with client goals, reduces risk, pays for results.
    • Pros (Vendor): Commands premium pricing, demonstrates true value, transforms relationship into strategic partnership. Revenue = Client Success.
    • Challenges: Requires strong trust, robust measurement, shared risk.

Emerging & Innovative Models

  1. Product-Led Services / “IP-as-a-Service”:
    • How it Works: Bundle your proprietary analytics platforms, accelerators, or frameworks with the services to implement, customize, and operate them for the client (e.g., “Your Customer Churn Prediction SaaS Platform + Implementation & Managed Services”). Recurring license/subscription + services fees.
    • Pros (Client): Faster time-to-value, access to cutting-edge IP without full build.
    • Pros (Vendor): High differentiation, recurring revenue (licenses), strong lock-in (healthy, value-based). Revenue = IP + Services.
    • Emerging: Industry-Specific Data Products: Pre-built data models/analytics for client’s domain (e.g., predictive maintenance suite).
  2. Joint Innovation / Venture Model:
    • How it Works: Co-invest with the client to develop net-new data/AI products or capabilities. Share risks, costs, and rewards (e.g., IP ownership, revenue share). Often starts with a PoC funded jointly.
    • Pros (Client): Access to innovation without full internal investment, shared risk.
    • Pros (Vendor): Deepest possible partnership, potential for significant upside beyond fees, positions as true innovator.
    • Cons: High risk, complex legal/financial structures. Requires visionary clients.
  3. Ecosystem Orchestration:
    • How it Works: Position your firm as the “quarterback” managing multiple vendors/platforms (e.g., Snowflake, Databricks, AWS) within the client’s data/analytics landscape (e.g., you integrate cloud platforms, data providers, and niche AI vendors). Charge for integration, governance, and overall value realization.
    • Pros (Client): Simplified vendor management, ensures coherence, maximizes overall value.
    • Pros (Vendor): Highly strategic role, sticky at the architectural level. Revenue = Orchestration Premium.

Key Trends Shaping Successful Models

  1. Beyond Resources to Outcomes: Clients demand measurable business impact, not just FTEs or project completion.
  2. Co-Location & Integration: Successful vendors operate within client structures (like GCCs/CoEs), adopting their tools and governance.
  3. As-a-Service Mindset: Clients want consumption-based flexibility (scale up/down easily).
  4. IP & Innovation Premium: Vendors with unique, valuable IP command higher margins and loyalty.
  5. Risk/Reward Sharing: Willingness to tie fees to outcomes builds trust and strategic alignment.
  6. Focus on Enablement: Successful vendors actively transfer knowledge and build client capability.

The “right” operating model isn’t static – it evolves with the client’s business priorities, tech maturity, and market conditions. Successful partnerships in IT, analytics, and data are increasingly hybrid, combining elements from multiple models to balance speed, cost, flexibility, and innovation.

Forward-looking service providers are positioning themselves not just as vendors, but as strategic co-creators – integrated into the client’s ecosystem, jointly owning outcomes, and driving continuous transformation.

LLM, RAG, AI Agent & Agentic AI – Explained Simply with Use Cases

As AI continues to dominate tech conversations, several buzzwords have emerged – LLM, RAG, AI Agent, and Agentic AI. But what do they really mean, and how are they transforming industries?

This article demystifies these concepts, explains how they’re connected, and showcases real-world applications in business.

1. What Is an LLM (Large Language Model)?

A Large Language Model (LLM) is an AI model trained on massive text datasets to understand and generate human-like language.

Think: ChatGPT, Claude, Gemini, or Meta’s LLaMA. These models can write emails, summarize reports, answer questions, translate languages, and more.

Key Applications:

  • Customer support: Chatbots that understand and respond naturally
  • Marketing: Generating content, email copy, product descriptions
  • Legal: Drafting contracts or summarizing case laws
  • Healthcare: Medical coding, summarizing patient records

2. What Is RAG (Retrieval-Augmented Generation)?

RAG is a technique that improves LLMs by giving them access to real-time or external data.

LLMs like GPT-4 are trained on data only up to a certain cutoff point. What if you want to ask about today’s stock price or use your company’s internal documents?

RAG = LLM + Search Engine + Brain.

It retrieves relevant data from a knowledge source (like a database or PDFs) and then lets the LLM use that data to generate better, factual answers.
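
Here is a minimal, illustrative RAG sketch using TF-IDF retrieval over a few in-memory documents; the documents are invented and `ask_llm` is a placeholder for whichever LLM API you actually use.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 6pm, Monday through Friday.",
    "Premium plans include priority support and a dedicated account manager.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Step 1 of RAG: find the documents most relevant to the question."""
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(documents)
    q_vector = vectorizer.transform([question])
    scores = cosine_similarity(q_vector, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [documents[i] for i in top]

def ask_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (OpenAI, Anthropic, a local model, ...)."""
    raise NotImplementedError

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    # Step 2 of RAG: let the LLM answer grounded in the retrieved context.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ask_llm(prompt)
```

Production systems typically swap TF-IDF for embeddings and a vector database, but the retrieve-then-generate shape stays the same.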

Key Applications:

  • Enterprise Search: Ask a question, get answers from your company’s own documents
  • Financial Services: Summarize latest filings or regulatory changes
  • Customer Support: Dynamic FAQ bots that refer to live documentation
  • Healthcare: Generate answers using latest research or hospital guidelines

3. What Is an AI Agent?

An AI Agent is like an employee with a brain (LLM), memory (RAG), and hands (tools).

Unlike a chatbot that only replies, an AI Agent takes action—booking a meeting, updating a database, sending emails, placing orders, and more. It can follow multi-step logic to complete a task with minimal instructions.
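
To show the difference from a chat-only bot, here is a stripped-down, hypothetical agent loop: the LLM picks a tool and its arguments, the code executes it, and the result is fed back as memory until the goal is met. The tool stubs and `ask_llm` are illustrative assumptions.

```python
import json

def send_email(to: str, subject: str) -> str:
    return f"Email sent to {to} with subject '{subject}'"          # stub tool

def book_meeting(with_whom: str, when: str) -> str:
    return f"Meeting booked with {with_whom} at {when}"            # stub tool

TOOLS = {"send_email": send_email, "book_meeting": book_meeting}

def ask_llm(prompt: str) -> str:
    """Placeholder: a real LLM would return JSON like
    {"tool": "book_meeting", "args": {"with_whom": "Priya", "when": "Tue 3pm"}}
    or {"tool": "done", "args": {}} when the goal is achieved."""
    raise NotImplementedError

def run_agent(goal: str, max_steps: int = 5) -> None:
    history = f"Goal: {goal}"
    for _ in range(max_steps):
        decision = json.loads(ask_llm(history))                 # LLM = the "brain"
        if decision["tool"] == "done":
            break
        result = TOOLS[decision["tool"]](**decision["args"])    # tools = the "hands"
        history += f"\nDid: {decision['tool']} -> {result}"     # memory of what happened
```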

Key Applications:

  • Travel: Book your flight, hotel, and taxi – all with one prompt
  • HR: Automate onboarding workflows or employee helpdesk
  • IT: Auto-resolve tickets by diagnosing system issues
  • Retail: Reorder stock, answer queries, adjust prices autonomously

4. What Is Agentic AI?

Agentic AI is the next step in evolution. It refers to AI systems that show autonomy, memory, reflection, planning, and goal-setting – not just completing a single task but managing long-term objectives like a project manager.

While today’s AI agents follow rules, Agentic AI acts like a team member, learning from outcomes and adapting to achieve better results over time.

Key Applications:

  • Sales: An AI sales rep that plans outreach, revises tactics, and nurtures leads
  • Healthcare: Virtual health coach that tracks vitals, adjusts suggestions, and nudges you daily
  • Finance: AI wealth advisor that monitors markets, rebalances portfolios
  • Enterprise Productivity: Multi-agent teams that run and monitor full business workflows

Similarities & Differences

| Feature | LLM | RAG | AI Agent | Agentic AI |
| --- | --- | --- | --- | --- |
| Generates text | ✅ | ✅ | ✅ | ✅ |
| Accesses external data | ❌ (alone) | ✅ | ✅ | ✅ |
| Takes actions | ❌ | ❌ | ✅ | ✅ |
| Plans over time | ❌ | ❌ | Basic | ✅ (complex, reflective) |
| Has memory / feedback loop | ❌ | ❌ | Partial | ✅ (adaptive) |

I came across a simpler explanation written by Diwakar on LinkedIn –

Consider LLM → RAG → AI Agent → Agentic AI … as 4 very different types of friends planning your weekend getaway:

📌 LLM Friend – The “ideas” guy.
Always full of random suggestions, but doesn’t know you at all.
“Bro, go skydiving!” (You’re scared of heights.)

📌 RAG Friend – Knows your tastes and history.
Pulls up better, fresher plans based on what you’ve enjoyed before.
“Bro, let’s go to Goa- last time you enjoyed a lot!”

📌 AI Agent Friend – The one who gets things done.
Tickets? Done. Snacks? Done. Hotel? Done.
But you need to ask for each task (if you miss, he misses!)

📌 Agentic AI Friend – That Superman friend!
You just say “Yaar, is weekend masti karni hai” (“Dude, I want to have some fun this weekend”),
And boom! He surprises you with a perfectly planned trip, playlist, bookings, and even a cover story for your parents 😉

⚡ First two friends (LLM & RAG) = give ideas
⚡ Last two friends (AI Agent & Agentic AI) = execute them – with increasing level of autonomy

Here is another visualization published by Brij explaining how these four layers relate – not as competing technologies, but as an evolving intelligence architecture.

Conclusion: Why This Matters to You

These aren’t just technical terms – they’re shaping the future of work and industry:

  • Businesses are using LLMs to scale creativity and support
  • RAG systems turn chatbots into domain experts
  • AI Agents automate work across departments
  • And Agentic AI could someday run entire business units with minimal human input

The future of work isn’t human vs. AI—it’s human + AI agents working smarter, together.

The Smartest AI Models: IQ, Mensa Tests, and Human Context

AI models are constantly surprising us – but how smart are they, really?

A recent infographic from Visual Capitalist ranks 24 leading AI systems by their performance on the Mensa Norway IQ test, revealing that the best AI models now outperform the average human.

AI Intelligence, by the Numbers

Visual Capitalist’s analysis shows AI models scoring across categories:

  • “Highly intelligent” class (>130 IQ)
  • “Genius” level (>140 IQ) with the top performers
  • Models below 100 IQ still fall in average or above-average ranges

For context, the average adult human IQ is 100, with scores between 90–110 considered the norm.

Humans vs. Machines: A Real-World Anecdote

Imagine interviewing your colleague, who once aced her undergrad finals with flying colors – she might score around 120 IQ. She’s smart, quick-thinking, adaptable.

Now plug her into a Mensa Norway-style test. She does well but places below the top AI models.

That’s where the surprise comes in: these AI models answer complex reasoning puzzles in seconds, with more consistency than even the smartest human brains. They’re in that “genius” club – but wholly lacking human intuition, creativity, or emotion.

What This IQ Comparison Really Shows

| Insight | Why It Matters |
| --- | --- |
| AI excels at structured reasoning tests | But real-world intelligence requires more: creativity, ethics, emotional understanding. |
| AI IQ is a performance metric – not character | Models are powerful tools, not sentient beings. |
| Human + AI = unbeatable combo | Merging machine rigor with human intuition unlocks the best outcomes. |

Caveats: Why IQ Isn’t Everything

  • These AI models are trained on test formats – they’re not “thinking” or “understanding” in a human sense.
  • IQ tests don’t measure emotional intelligence, empathy, or domain-specific creativity.
  • A “genius-level” AI might ace logic puzzles, but still struggle with open-ended tasks or novel situations.

Key Takeaway

AI models are achieving IQ scores that place them alongside the brightest humans – surpassing 140 on standardized Mensa-style tests. But while they shine at structured reasoning, they remain tools, not people.

The real power lies in partnering with them – combining human creativity, ethics, and context with machine precision. That’s where true innovation happens.