AI and ML on the Command Line, Local LLMs, and How to Really Build Chatbots
Plus What We Learned from a Year of Building with LLMs and Other Events
Hi all! I’ve recently started a newsletter around all things data science, ML, and AI, primarily to keep track of interesting things in the space and what I’ve been up to. This is an experiment so please do let me know what you’d like to see here. There’s a lot to share this week so let’s jump right in.
AI and ML on the Command Line
I recently hosted a session with Simon Willison for Hamel Husain and Dan Becker’s Mastering LLMs: A Conference For Developers & Data Scientists. Simon’s talk, which I encourage you to watch, was all about
Using LLMs from the command line with his CLI utility llm,
Piping and automating with llm,
Exploring LLM conversations with his tool Datasette (llm conversations are automatically logged to a local SQLite database!),
Building RAG systems from the command line (see the quick sketch after this list).
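Here's a minimal sketch of the kind of workflow Simon demoed, using commands from the llm and Datasette docs (the file names, prompts, and docs folder are illustrative):

```bash
# Install the llm CLI and set an API key for the default OpenAI model
pip install llm
llm keys set openai

# Ask a one-off question
llm "Five quick ways to speed up a pandas pipeline"

# Pipe files into a prompt, Unix style (-s sets a system prompt)
cat script.py | llm -s "Explain what this code does"

# Prompts and responses are logged to a local SQLite database;
# explore them interactively with Datasette
pip install datasette
datasette "$(llm logs path)"

# Embeddings + similarity search: the building blocks of CLI RAG
# (assumes an embedding model is configured; use -m to pick one)
llm embed-multi docs --files docs '*.md'
llm similar docs -c "How do I get started?"
```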
You can watch it here and read about it here:
10 Brief Arguments for Local LLMs and AI
I love working with GenAI models and LLMs locally but couldn’t quite articulate why, so I sat down to reason through the benefits and wrote this short post. As it turns out,
key benefits include data privacy, performance, cost efficiency, customization, offline capabilities, learning opportunities, open-source support, scalability, ethical considerations, and autonomy.
Check out the post and let me know what resonates and what doesn’t!
I also talk through some fun options that will get you started with local LLMs:
Ollama is a great way to get started locally with SOTA models, such as Llama 3, Phi 3, Mistral, and Gemma (see the quick-start sketch after this list);
Simon’s llm CLI utility allows you to explore LLMs of all kinds from the command line and has all the fun mentioned above: it can be piped in true Unix fashion, logs to a local SQLite database, can be explored interactively with Datasette, can work with embeddings to build RAG apps, and more!
llamafile is a great project from Mozilla and Justine Tunney that is “an open source initiative that collapses all the complexity of a full-stack LLM chatbot down to a single file that runs on six operating systems”; it has a cool front-end GUI and lets you get up and running immediately with lots of models, including multimodal models such as LLaVA;
LM Studio is one of the more advanced GUIs I’ve seen for interacting with local LLMs: you can discover new models on the homepage, browse and download lots of models from Hugging Face, easily chat with models, and even chat with several simultaneously to compare responses, latency, and more;
Oobabooga’s text-generation-webui, which allows you to interact with local models (and others) through a web UI, with lots of fun stuff for fine-tuning, chatting, and so on.
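If you want to try this right now, here’s a minimal quick start, assuming you’ve installed Ollama and llm (the llm-ollama plugin and the model name are my picks for illustration):

```bash
# Pull a SOTA model and chat with it entirely locally
ollama pull llama3
ollama run llama3 "Why is the sky blue?"

# Optionally drive the same local model from Simon's llm CLI
# via the llm-ollama plugin (check `llm models` for the exact
# model id it registers)
llm install llm-ollama
llm -m llama3 "Why is the sky blue?"
```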
I use all of these tools a bunch and hope to demo some of them soon, and perhaps do some write-ups, so let me know on Twitter and/or LinkedIn if you’d find this useful, and I’ll likely get to it sooner!
How to Really Build Chatbots
We recently released a Vanishing Gradients podcast episode in which I had the pleasure of speaking with Alan Nichol, co-founder and CTO of Rasa, where they have been building conversational AI and chatbots for over a decade.
We covered a lot of ground, including
History of chatbots and conversational AI,
Use cases for conversational AI,
Impact of ChatGPT on the conversational AI industry,
Limitations of prompt-based LLMs for conversational AI,
Overrated and underrated aspects of LLMs,
Advice for getting started with LLMs and conversational AI, and
Demo of Rasa CALM (Conversational AI with Language Models).
One thing I found super insightful was Alan’s take on how we should be thinking about building conversational AI these days, and how to incorporate business logic into conversational AI instead of giving one big model a lot of freedom:
Check out the clip above and/or the whole conversation here (or on your app of choice). You can also watch the livestream here.
Saving the Conda Ecosystem with Rust
I also recently had the pleasure of catching up with Wolf Vollprecht, creator of Mamba, CEO of prefix.dev, and now creator of Pixi, about
package management and software supply chain challenges for data scientists and machine learning engineers,
the magic of making it all "just work" for developers across stacks and platforms (see the quick sketch after this list), and
the future of package management and accessibility for GenAI and foundation models.
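As a concrete taste of that "just work" magic, here’s a minimal Pixi workflow sketched from the Pixi docs (the project and package names are illustrative):

```bash
# Create a new project with a pixi.toml manifest
pixi init my-analysis
cd my-analysis

# Resolve and lock conda packages (fast, thanks to Rust)
pixi add python numpy pandas

# Run a command inside the project's locked environment
pixi run python -c "import numpy; print(numpy.__version__)"
```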
I strongly urge you to check out the video in the tweet below, because Wolf is doing work with package management that will set us all free!
You can also check out the entire fireside chat here:
What We Learned from a Year of Building with LLMs and Other Events
Later this week, I’ll be doing a livestreamed Vanishing Gradients recording with the affectionately named LLM mafia: Eugene Yan (Amazon), Bryan Bischof (Hex), Charles Frye (Modal), Hamel Husain (Parlance Labs), Jason Liu (Instructor), and Shreya Shankar (UC Berkeley).
Over the past year, these absolute legends have been building real-world applications on top of LLMs. Along the way, they have identified crucial, often neglected lessons for developing AI products.
They recently wrote a three-part report based on these learnings, and in this conversation they’ll share advice and lessons for anyone who wants to build products on top of LLMs, ranging from the tactical to the operational and strategic.
You can sign up for free here.
We’ve got several other exciting livestreams coming up, including
“Rethinking Data Science” with Vincent Warmerdam, a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning.
“Validating the Validators: GenAI, LLMs, and Humans-in-the-Loop” with Shreya Shankar, a researcher working at the intersection of human-computer interaction (HCI) and AI. Shreya's research focuses on building tools and interfaces that enable collaboration between humans and AI systems, particularly in the context of large language models (LLMs).
“What We Learned Teaching LLMs to 1,000s of Data Scientists” with Dan Becker and Hamel Husain. In this special livestreamed recording of Vanishing Gradients, I’ll speak with Hamel and Dan, the instructors of the immensely successful “Mastering LLMs: A Conference For Developers & Data Scientists,” as they share their experiences and insights from teaching over 2,000 students.
I’ll be announcing more livestreams, events, and podcasts soon, so subscribe to the Vanishing Gradients lu.ma calendar to stay up to date. Also subscribe to our YouTube channel, where we livestream, if that’s your thing!
That’s it for now. Please let me know what you’d like to hear more of, what you’d like to hear less of, and any other ways I can make this newsletter more relevant for you,
Hugo