LLM Assistant for Markdown Documents

fizzle@quokk.au · 1 month ago

fizzle@quokk.au · 29 days ago

I’m out of my depth here but trying to piece this together.

If I understand correctly, the first component of this workflow is to use an inference API (huggingface or similar) to convert each file from your notes into semantic vectors and store them in chromadb, ready to be used in future prompts.

Are you using any software to do that or have you written some code to load the files from disk, call the API, and store the response?

29 days ago

So my notes are just a directory of thousands of MD files. I wrote some code that watches the files in this dir to see when anything changes and when it does it will do the following:

Splits the file into chunks with some overlap on each side, something like 300-token chunks with a 25-token overlap. This is done by loading the model's tokeniser via the huggingface python library and using the huggingface chunker (this happens locally).
Sends each chunk to my local ollama instance (just another local docker container), which converts it to a semantic vector.
Deletes all semantic vectors in chromadb for that file and creates new entries for the updated file.
If /sydney is contained within the file, sends a message to the matrix chat as the user saying “read <filename> and follow the instructions provided by the /sydney command”. The agent manager will then get this message and pass it off to an agent to handle. All this happens locally.
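To make the steps above concrete, here is a rough sketch of the re-index logic, not the actual code. Everything named here is made up for illustration: `chunk_tokens` and `reindex_file` are hypothetical helpers, whitespace splitting stands in for the real model tokeniser, a plain dict stands in for chromadb, and `embed` is whatever function calls the embedding model.

```python
from typing import Callable


def chunk_tokens(tokens: list[str], size: int = 300, overlap: int = 25) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the file
    return chunks


def reindex_file(
    path: str,
    text: str,
    store: dict[str, list[dict]],         # assumption: dict standing in for chromadb
    embed: Callable[[str], list[float]],  # assumption: wraps the call to the embedding model
    size: int = 300,
    overlap: int = 25,
) -> bool:
    """Replace all stored vectors for `path`; return True if the file contains /sydney."""
    tokens = text.split()  # assumption: whitespace split instead of the model tokeniser
    store.pop(path, None)  # delete every stale entry for this file first
    store[path] = [
        {"text": " ".join(chunk), "vector": embed(" ".join(chunk))}
        for chunk in chunk_tokens(tokens, size, overlap)
    ]
    return "/sydney" in text
```

The return value is what would trigger the matrix message. Deleting before inserting keeps the store from accumulating chunks from older versions of the same file.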

My ai agent is a separate component (just another docker container, with the notes dir mounted as a volume) using pi, which uses an LLM via a remote api (openrouter). I have a custom tool for that agent where the agent can write a text search that returns the top n most semantically similar chunks of text (along with some metadata, notably the filename and line numbers where each chunk came from). The vectors are never seen by the LLM; they exist purely for the search ranking. The agent also has file editing capabilities, so it can then go read that file or modify that file like any coding agent. The agent also has a tool to send messages via matrix.
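The search tool boils down to ranking stored chunks against the query vector and returning only text and metadata. A minimal sketch, assuming cosine similarity as the ranking (the real ranking is whatever chromadb is configured to use) and a hypothetical entry layout with `filename`, `lines`, `text` and `vector` keys:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query_vec: list[float], entries: list[dict], n: int = 5) -> list[dict]:
    """Return the n most similar entries, with the vectors stripped out."""
    ranked = sorted(entries, key=lambda e: cosine(query_vec, e["vector"]), reverse=True)
    # The LLM only ever sees text and metadata, never the vectors themselves.
    return [{k: v for k, v in e.items() if k != "vector"} for e in ranked[:n]]
```

The query text would be embedded with the same model as the chunks before calling `top_n`, otherwise the similarity scores are meaningless.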

I have a service that watches a specific matrix chat and, if a message is received, does 1 of 2 things: Option 1: if an agent is already running it will pass the message into the existing agent as a user message. Option 2: if no agent is running it will start a new agent instance and pass the message into the agent as the user message. This agent manager service uses the same docker image that runs the agent, and runs in the same docker container. When the agent finishes running, it takes the final agent output and sends that to the matrix chat as the agent's matrix user.
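The two options amount to a small dispatch rule. A rough sketch of that rule only, with hypothetical `Agent` and `AgentManager` classes and a `send` callback standing in for the real agent process and the matrix client:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Agent:
    """Stand-in for a running agent instance (hypothetical)."""
    inbox: list[str] = field(default_factory=list)
    running: bool = True


class AgentManager:
    def __init__(self) -> None:
        self.agent: Optional[Agent] = None
        self.started = 0  # how many agent instances have been created

    def on_message(self, text: str) -> Agent:
        """Route a chat message to the running agent, or start a new one."""
        if self.agent is None or not self.agent.running:
            self.agent = Agent()  # Option 2: no agent running, start one
            self.started += 1
        self.agent.inbox.append(text)  # Option 1 and 2: deliver as a user message
        return self.agent

    def on_finished(self, final_output: str, send: Callable[[str], None]) -> None:
        """Mark the agent as done and post its final output to the chat."""
        if self.agent is not None:
            self.agent.running = False
        send(final_output)  # assumption: `send` posts as the agent's matrix user
```

For example, two messages in a row land in the same agent's inbox, and a message arriving after `on_finished` starts a fresh instance.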

I got an agent to write all this code, so it's probably dodgy as shit with all sorts of security holes, hence I haven’t published it on github (security through obscurity etc etc lol).

I also have a searxng instance running accessible to the agent via MCP. And I have a chrome MCP allowing the agent to do things from inside a virtual chrome browser.