Last Week Highlights: Python in Excel and the Next Stage of AI Democratization
5 Key Developments and Their Strategic Impact on Lean Data Professionals
Another week has passed, and the world of AI is buzzing with news. This issue covers everything from Python in Excel to the latest steps toward AI democratization. We'll also share our take on what these developments mean for data professionals who want to stay lean and focus on delivering data experiences rather than purely on the technical side. Let's dive in!
Python Makes Its Way to Excel
Anaconda and Microsoft have teamed up to integrate Python into Excel, shaking up the status quo of data manipulation in spreadsheets. For years, VBA has been the reluctant hero for Excel users whose needs go beyond pivot tables and simple formulas. The truth is that 99% of Excel users crave only clean datasets and a user-friendly interface for visualizing data; it's a niche group, roughly 1% of Excel specialists, who wade into the turbulent waters of VBA to make automation happen.
To comprehend the transformative potential of this development, consider this: Excel has an estimated user base of 750 million. Even if only 1% of those users adopt Python within Excel, that equates to an incredible 7.5 million people. It isn't merely about sidestepping the hurdles of VBA; it's a monumental leap toward democratizing data science and AI functionalities. Imagine companies developing Python libraries tailored for Excel, enabling users to invoke specialized functions like any other Excel formula. We're pioneering this approach at Naas with our low-code Python formulas, especially our Naas drivers, which let you call APIs in a single line of code and get a clean DataFrame back.
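To make this concrete, here's a minimal sketch of the kind of pandas code you might type into a Python-in-Excel cell. In Excel the data would come from a worksheet range via the `xl()` accessor; since that only exists inside Excel, this self-contained version builds the same DataFrame inline. The column names and figures are invented for illustration.

```python
import pandas as pd

# Inside Excel, a =PY() cell would typically start with something like
#   sales = xl("A1:C5", headers=True)
# Here we build an equivalent DataFrame inline so the sketch runs anywhere.
sales = pd.DataFrame({
    "region": ["EMEA", "EMEA", "APAC", "APAC"],
    "product": ["A", "B", "A", "B"],
    "revenue": [1200, 800, 950, 1100],
})

# The last expression in a =PY() cell is what spills back into the grid.
summary = sales.groupby("region", as_index=False)["revenue"].sum()
print(summary)
```

The appeal is exactly what the integration promises: one short expression replaces a pivot table or a VBA macro, and the result lands back in the spreadsheet like any other formula output.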
This integration marks a significant stride toward democratizing data and AI. Kudos to the Anaconda team for making this monumental shift possible.
Check out the release note on Anaconda's website.
Fine-Tuning GPT-3.5 Turbo and ChatGPT Enterprise
OpenAI recently unveiled its fine-tuning API for GPT-3.5 Turbo, and I recommend Sophia Yang's video, which walks through fine-tuning the model on bank customer-service data. OpenAI also announced ChatGPT Enterprise, a development that has received mixed reactions, as highlighted by Charles Demontigny:
Data Privacy: One highlighted feature is that your data isn't used for further model training. However, this has already been the case via the API, Playground, and even the Plus version, making this less of a breakthrough.
Advanced Features: The Enterprise version brings added availability for GPT-4 & Advanced Data Analysis (formerly known as Code Interpreter). While good to have, these feel more like extensions rather than revolutionary changes.
Security: Basic security features like SSO and domain verification have been introduced, but shouldn't these be standard offerings?
Context Tokens: OpenAI offers a GPT-4 version with a 32k-token context window, which feels like an artificial bottleneck next to Anthropic's Claude 2 and its 100k-token context window.
In summary, while ChatGPT Enterprise has its merits, it falls short on direct integrations with enterprise systems such as databases, email, and messaging platforms (which is exactly what we're working on with Naas V2).
The fine-tuning capabilities offer, nonetheless, an avenue to tailor large language models for specific tasks without the need for extensive data or computational resources. This means even small teams can derive valuable insights and automate tasks efficiently.
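As a sketch of how small that effort can be: OpenAI's fine-tuning API for gpt-3.5-turbo takes training data as a JSONL file where each line is a chat-format example. The snippet below prepares such a file; the bank-support examples are invented for illustration, echoing the customer-service use case above.

```python
import json

# Each training example is one JSON object with a "messages" list in the
# chat format (system / user / assistant), one object per line of the file.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise bank support assistant."},
        {"role": "user", "content": "How do I reset my card PIN?"},
        {"role": "assistant", "content": "In the app, go to Cards > Security > Reset PIN."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a concise bank support assistant."},
        {"role": "user", "content": "What is the wire transfer fee?"},
        {"role": "assistant", "content": "Domestic wires are $15; international wires are $35."},
    ]},
]

with open("bank_support.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# The file is then uploaded to OpenAI and a fine-tuning job is created
# against "gpt-3.5-turbo" via the fine-tuning API (requires an API key).
```

A few dozen curated examples like these, rather than a large dataset or any GPU time, is often enough to shift tone and format, which is why this is within reach of lean teams.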
More about: Fine-tuning GPT-3.5 and ChatGPT Enterprise
Code Llama: The New Code Model Prodigy
Another standout development this week comes from Meta with their Code Llama model, which is built on the Llama 2 framework. Not only does it support multiple programming languages, but it has also secured a top spot on the code model leaderboards.
For lean data teams, Code Llama presents a compelling tool for automating various coding tasks, from data cleaning to preprocessing. Fine-tuning will allow the model to be adapted to particular needs and coding styles, further elevating its utility.
For example, suppose you're a development consulting company that often reuses functions and patterns to build custom applications. You could fine-tune the model on your own methodologies. The real power lies in embedding the wisdom accumulated from past code into every new development, creating a virtuous cycle of continuous improvement.
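One concrete capability worth knowing about: Meta trained Code Llama for infilling, i.e. generating the code between a given prefix and suffix, using `<PRE>`, `<SUF>`, and `<MID>` sentinel tokens. The helper below sketches how such a prompt is assembled; the exact spacing can vary by tokenizer version, and the function name and example are mine, not from Meta's release.

```python
def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Build an infilling prompt asking the model to generate the code
    that belongs between `prefix` and `suffix` (Code Llama's PSM format)."""
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

# Example: ask the model to fill in the body of a utility function that a
# consulting team reuses across projects (hypothetical snippet).
prompt = build_infill_prompt(
    prefix="def clean_column_names(df):\n    ",
    suffix="\n    return df",
)
print(prompt)
```

This is the mechanism behind editor-style completion: the model sees the code before and after the cursor, not just a left-to-right prefix, which is exactly what makes it useful for working inside an existing codebase.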
The paper released by Meta about Code Llama's conception is insightful if you want to dive deeper.
Hugging Face's Big Week
Hugging Face has recently closed a staggering $235 million funding round at an eye-popping valuation of $4.5 billion. Major tech players like Google's Alphabet Inc., Amazon.com Inc., Nvidia Corp., Intel Corp., and Salesforce Inc. have all joined the funding frenzy, underlining Hugging Face's accelerating growth in the AI ecosystem. Kudos to the team for this monumental achievement!
What makes this funding round particularly noteworthy is its implication for the broader AI landscape. It signals a robust trend towards enterprise adoption of AI solutions, with a distinct emphasis on open-source models. In my conversations with companies formulating their AI strategies, it's becoming increasingly clear that open-source accessibility and explainable AI are non-negotiable prerequisites for adoption. We're just scratching the surface of the AI revolution, and the integration of AI into standard business workflows is on the horizon. The significance of Hugging Face's funding lies not just in the numbers but in what it forecasts for the future of AI in enterprise settings.
Read the Tech Crunch article about the funding.
That's it for this week! If you have thoughts, questions, or insights I missed, please add them in the comments.