In the previous post, I had a look at how I can integrate an LLM into my programming environment and use a prompt to produce and modify code, to find bugs and security issues, and to discuss options and fixes. I find the result stunning. So is AI-assisted coding a good or a bad thing? Maybe this is the wrong question to ask; it's like wondering whether programming in Python is a good or a bad thing compared to programming in assembly language. Let's dwell on this a bit.
Continue reading My AI Learning Journey – Part 11 – AI Assisted Coding – Good or Bad?

My AI Learning Journey – Part 10 – AI Assisted Coding – VSCodium and Continue

All right, I've touched on quite a few topics in my AI learning journey so far, and today it is time to have a look at a central piece of the whole exercise: AI-assisted programming.
Over the past years, I've used AI systems in my private programming projects a few times, but it has always been separate from the programming environment, and I had to copy and paste code at some point. While this sped up quite a number of things, it always felt quite limited. In the meantime, a number of plugins for programming environments have become available that enable LLMs to directly interact with the code. Also, I've heard that high-end LLMs can get an overview of the complete code base of a project and suggest and implement solutions across many source files at once. That all sounds quite nice, but I was not willing to give up the privacy of my tool chain, and I also didn't want any tool to create a personal profile of my activities. So I looked around a bit and came across a very nice tool chain that fits my privacy needs while giving me access to the latest LLM models for programming.
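To give an idea of what this looks like in practice, here is a rough sketch of a Continue model entry that points the plugin at an OpenAI-compatible endpoint of one's own choosing. The config file is typically found at ~/.continue/config.json, but the exact keys vary between Continue versions, and the model name and API key below are placeholders, so take this as illustrative only:

```json
{
  "models": [
    {
      "title": "Privacy-friendly coding model",
      "provider": "openai",
      "apiBase": "https://openrouter.ai/api/v1",
      "model": "qwen/qwen-2.5-coder-32b-instruct",
      "apiKey": "sk-or-PLACEHOLDER"
    }
  ]
}
```

With an entry like this, switching to a different model is a one-line change, and the code stays inside a tool chain I picked myself.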
Continue reading My AI Learning Journey – Part 10 – AI Assisted Coding – VSCodium and Continue

My AI Learning Journey – Part 9 – Local Documents and LLMs

In part 7, I had a look at how external search engines can be integrated into Open WebUI and combined with local LLMs to get external references when asking questions, instead of having to rely on the output of the LLM alone. This is referred to as Retrieval Augmented Generation (RAG). The same approach can also be used to combine local LLMs with local documents for search.
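As a rough illustration of the retrieval idea, here is a minimal Python sketch of document RAG against a local Ollama server. It assumes the ollama Python package is installed and that an embedding model (here nomic-embed-text) and a chat model (here llama3) have been pulled; the file name is a placeholder and the chunking is deliberately naive:

```python
# Minimal local-document RAG sketch: embed chunks, retrieve by cosine
# similarity, and stuff the best matches into the prompt.
import ollama  # assumes 'pip install ollama' and a running Ollama server

def embed(text):
    # nomic-embed-text is an assumption; any embedding model works
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

# Naive chunking: one paragraph per chunk (my_document.txt is a placeholder)
chunks = open("my_document.txt").read().split("\n\n")
index = [(chunk, embed(chunk)) for chunk in chunks]

question = "What does the document say about reverse proxies?"
q_vec = embed(question)

# Retrieve the three most similar chunks and hand them to the chat model
top = sorted(index, key=lambda c: cosine(q_vec, c[1]), reverse=True)[:3]
context = "\n---\n".join(chunk for chunk, _ in top)
answer = ollama.chat(model="llama3",
                     messages=[{"role": "user",
                                "content": f"Context:\n{context}\n\n"
                                           f"Question: {question}"}])
print(answer["message"]["content"])
```

Open WebUI does the same thing far more elaborately (vector database, overlap-aware chunking), but the principle is exactly this.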
Continue reading My AI Learning Journey – Part 9 – Local Documents and LLMs

My AI Learning Journey – Part 8 – OpenRouter – 350+ Models to Experiment With

After looking into local LLMs and Retrieval Augmented Generation (RAG) in the previous post, this post focuses on how to experiment with and use public LLMs in a privacy-friendly way. The basic problem: While one can anonymously ask a limited number of questions to public LLMs such as ChatGPT, Perplexity, etc., more in-depth use requires an account and often a monthly subscription. And with an account comes the loss of privacy. But there's an interesting solution:
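For a taste of how this looks from a script, here is a minimal sketch of a chat completion request against OpenRouter's OpenAI-compatible endpoint; the model name is just one example from their catalog, and the API key is a placeholder:

```python
# Minimal OpenRouter request sketch using its OpenAI-compatible API.
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer sk-or-PLACEHOLDER"},
    json={
        "model": "mistralai/mistral-small",  # example model id
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The same request with a different "model" string goes to a completely different LLM, which is what makes the 350+ models so easy to experiment with.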
Continue reading My AI Learning Journey – Part 8 – OpenRouter – 350+ Models to Experiment With

My AI Learning Journey – Part 7 – Combining LLMs with Web Search

For me, like for most people, the everyday default use case for LLMs is to get answers to questions, i.e. I use them as a kind of enhanced search engine. So far, my experiments with local LLMs have focused on getting answers out of what the LLMs have learnt during their one-time training phase, i.e. what is 'stored' in their weight parameters. While this works, there are two major shortcomings: First, information about current events is not available, as the training data ends at some point in the past. And second, there are no references to check whether the returned information is correct. In many cases it is, but there are also spectacular hallucinations that sound credible but are just plain false. This is why many online LLMs such as Perplexity combine their training with web search and give references in their output, so the information can be verified. This is what is referred to as Retrieval Augmented Generation (RAG). It's not a panacea, but it helps quite a bit. So my obvious next step: How can I get RAG working with my local Ollama and Open WebUI installation in a privacy-friendly way?
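The mechanics behind such a setup are quite simple in principle. Here is a rough Python sketch of the idea, assuming a local SearXNG instance with JSON output enabled at a placeholder URL and a local Ollama server with a pulled model; Open WebUI does essentially this, just more elaborately:

```python
# Web-search RAG sketch: fetch search snippets, prepend them to the
# prompt, and let the local model answer with references.
import requests
import ollama  # assumes a running Ollama server

question = "Who won the Nobel Prize in Physics this year?"

# Query SearXNG's JSON API; the instance URL is a placeholder
results = requests.get("http://localhost:8888/search",
                       params={"q": question, "format": "json"},
                       timeout=30).json()["results"][:5]

snippets = "\n".join(f"- {r['title']} ({r['url']}): {r.get('content', '')}"
                     for r in results)
prompt = (f"Answer the question using these search results and cite "
          f"the URLs you used.\n\n{snippets}\n\nQuestion: {question}")

answer = ollama.chat(model="llama3",
                     messages=[{"role": "user", "content": prompt}])
print(answer["message"]["content"])
```

The model still writes the answer, but it now has current material to work with and URLs it can cite, which addresses both shortcomings above.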
Continue reading My AI Learning Journey – Part 7 – Combining LLMs with Web Search

My AI Learning Journey – Part 6 – A Reverse Proxy for the LLM GUI
Welcome to part 6 of my LLM learning journey. Before I continue to explore the many features of Open WebUI (OWUI) in combination with Ollama, I wanted to do one other thing: By default, Open WebUI has an HTTP frontend and no HTTPS port, so a reverse proxy is required to use the service securely over the Internet. If the server on which OWUI runs can be reached over a public IP address, getting a reverse proxy with Let's Encrypt certificates up and running with docker compose is straightforward. Have a look at this post for the details on how to do that. In my case, the server I'm running OWUI and Ollama on does not have a public IP address, so I needed to look for something slightly different.
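For the public-IP case, a minimal sketch of such a setup with Caddy, which obtains and renews Let's Encrypt certificates on its own, could look like this. The domain is a placeholder, and I'm assuming OWUI runs as a container named open-webui on the same Docker network, listening on its default internal port 8080:

```yaml
# docker-compose.yml sketch: Caddy terminates HTTPS in front of OWUI
services:
  caddy:
    image: caddy:latest
    ports:
      - "80:80"    # needed for the ACME HTTP challenge
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
```

The accompanying Caddyfile is then just a few lines:

```
owui.example.com {
    reverse_proxy open-webui:8080
}
```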
Continue reading My AI Learning Journey – Part 6 – A Reverse Proxy for the LLM GUI

My AI Learning Journey – Part 5 – A GUI for the LLM at Home

In part 4 of this series, I installed Ollama on my server at home so I could download large language models (LLMs) and interact with them on the command line. Questions and answers on the shell are nice, but the next step was obviously to get a proper web-based user interface. And again, I started to look for options and pretty quickly came to the conclusion that Open WebUI is probably the thing to go for. I'll call it OWUI in this post. Their page on GitHub indicates that it is a huge and broadly supported project. After that, however, things become a bit opaque. There is no Wikipedia page on the project, and there are only somewhat indirect references to Open WebUI Inc. as the company behind it. But that is pretty much it. I'm not sure what I should make of this, but I decided to go ahead and have a closer look anyway.
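For reference, here is a sketch of how I understand a typical docker compose setup for OWUI next to an Ollama instance on the same host. The image tag, port mapping and the OLLAMA_BASE_URL variable follow the project's documentation at the time of writing, so check their README for the current state:

```yaml
# docker-compose.yml sketch for Open WebUI talking to Ollama on the host
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"   # UI then reachable on http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
    extra_hosts:
      - "host.docker.internal:host-gateway"   # lets the container reach the host
    volumes:
      - open-webui:/app/backend/data
volumes:
  open-webui:
```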
Continue reading My AI Learning Journey – Part 5 – A GUI for the LLM at Home

My AI Learning Journey – Part 4 – GPU over CPU Speed
In part 3 of this series, I set up my headless local LLM execution environment with Ollama, so I'm ready for further experiments. One of the questions I wanted to answer with this setup is how much faster LLMs run on a GPU vs. the CPU. Part of every LLM one can download is a configuration file that defines how the LLM is set up for execution. One parameter in this file defines how many of the neuron layers of the LLM are to be executed on the GPU and how many of them should run on the CPU. This is what is called 'layer offloading', and splitting up the work between the CPU and GPU can be useful when GPU memory is smaller than what is necessary to run the LLM on the GPU alone.
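In Ollama, this split can be influenced with the num_gpu parameter, which sets the number of layers to offload to the GPU. A minimal sketch, assuming a llama3 base model has already been pulled:

```
# Modelfile sketch: derive a variant that offloads only 20 layers to the GPU
FROM llama3
PARAMETER num_gpu 20
```

Something like 'ollama create llama3-split -f Modelfile' then creates the variant, and comparing its tokens-per-second output across different num_gpu values is exactly the kind of experiment this post is about.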
Continue reading My AI Learning Journey – Part 4 – GPU over CPU Speed

My AI Learning Journey – Part 3 – LLMs at Home

Finally, finally, some hands-on stuff with AI in a blog post. After discussing some theory in part 1 and part 2 of this series, the first thing I wanted to explore was how I can run Large Language Models (LLMs) locally instead of using them in the cloud. After looking around a bit, I saw two potential options for me: Ollama and LM Studio. As I wanted a solution I could run on a server without a GUI, I decided to go for Ollama, as it is an open-source, MIT-licensed command-line solution to download and run LLMs. LM Studio is probably a good choice, too, but it is centered around a graphical user interface, which is not what I wanted for my initial 'raw' experiments.
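Getting started with it is essentially a two-command affair on the shell; the model name below is just one example from Ollama's model library:

```sh
# Download a model once, then chat with it on the command line
ollama pull llama3
ollama run llama3 "Explain in two sentences what a token is"
```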
Continue reading My AI Learning Journey – Part 3 – LLMs at Home

My AI Learning Journey – Part 2 – How Do LLMs Work?
Before getting to the practical aspects of my AI learning journey, I set out to understand how Large Language Models (LLMs) work, what the future potential of the technology might be, and what kinds of dangers might await us if things get out of hand. There is a lot of material out there on all of these topics, and the first thing I wanted to find was an explanation of how LLMs work in less than 10 minutes.
The 10 Minute Intro
Here's a good video on the topic by Matt Penny on YouTube. He has lots of videos on AI topics that have more than 10k views and a subscriber base of 17k at the time of writing. One of his videos has even been viewed more than 200k times. And yet, this particular video on the basics, posted 9 months ago, has fewer than 500 views. 200k vs. 500 views: I think that says a lot about where the interest of 99.99% of the population lies. Anyway, if you want to understand the basics, including what a 'token' is (a word that is used all the time when talking about LLMs), how LLMs work, and how much a question that is run through them costs, then this is your place to go.
Long story short, i.e. my personal elevator pitch: All LLMs we use today generate output text based on input text. They do so based on a previous massive learning phase, in which an LLM has learnt which words follow each other in a large text corpus. Large in this context means the better part of the text that can be found on the Internet today. An LLM doesn't understand the output it produces; it doesn't reason, it doesn't think. It applies the previously learnt statistical knowledge of which words follow which.
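To make the 'statistical next word' idea tangible, here is a toy Python sketch of the principle: count which word follows which in a tiny corpus, then generate text by repeatedly sampling the next word from those counts. Real LLMs work on tokens and use deep neural networks instead of a simple lookup table, but the generate-one-piece-at-a-time loop is the same:

```python
# Toy next-word generator: the crudest possible 'language model'.
import random
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

# 'Training': count which word follows which in the corpus
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

# 'Inference': repeatedly sample the next word from the learnt counts
word, output = "the", ["the"]
for _ in range(10):
    candidates = follows[word]
    word = random.choices(list(candidates),
                          weights=list(candidates.values()))[0]
    output.append(word)
print(" ".join(output))
```

The output is grammatical-looking nonsense, which nicely illustrates both the power and the limits of pure next-word statistics; the step from here to an LLM is 'only' replacing the counting table with a neural network trained on a large chunk of the Internet.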
Continue reading My AI Learning Journey – Part 2 – How Do LLMs Work?