I’m having a play about with YaCy today. I thought it would be great to be able to search through my paperless-ngx documents. Is it doable? Has anyone got it working?
You can run a search on Paperless and get JSON output from http://192.168.1.34:8001/api/documents/?query=test but YaCy can’t crawl Paperless?
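For anyone wanting to poke at that endpoint from a script, here is a minimal Python sketch. The URL and the `query` parameter are from the post above; the token-based `Authorization` header and the response shape (a `results` list whose entries carry `id` and `title`) are assumptions on my part, so check them against your own instance. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://192.168.1.34:8001"  # address taken from the post above


def build_search_url(base_url, query):
    """Build the documents search URL with a properly encoded query string."""
    return f"{base_url}/api/documents/?" + urllib.parse.urlencode({"query": query})


def extract_hits(payload):
    """Reduce a decoded JSON response to id/title pairs.

    Assumption: the response has a "results" list of documents,
    each with "id" and "title" keys.
    """
    return [
        {"id": doc.get("id"), "title": doc.get("title", "")}
        for doc in payload.get("results", [])
    ]


def search(base_url, query, token):
    """Run a search against the API and return the reduced hits.

    Assumption: the instance accepts an "Authorization: Token ..." header.
    """
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_hits(json.load(response))
```

Calling `search(BASE_URL, "test", "<your token>")` should then give you a short list of matching documents, if the assumptions above hold for your setup.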
Maybe you could add it to SearXNG as its own engine?
That seems like it could work. I saw SearXNG has a template engine to use. Just need to figure out how to use it with Paperless :)
I’d be curious how well it works if you try it. I kind of want to, but I’m not sure how I feel about letting something unauthenticated (SearXNG) access my Paperless instance with some personal docs in it.
I have it working!! Once I get the configs tidied up I’ll share them :)
OH!! With SearXNG, not YaCy.
YaCy indexes HTTP content, so if your documents are all reachable via an HTTP interface they can be indexed.
Paperless stores document content as plain text. Maybe one could write a small web server or extend Paperless to serve an RSS feed that could be consumed by YaCy.
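A rough Python sketch of the RSS idea, just to show the shape of it. It only builds the feed text from a list of documents; fetching them from Paperless and serving the result over HTTP are left out. The `id`, `title` and `content` keys and the `/documents/<id>/` link pattern are assumptions for illustration, not something confirmed in this thread, and `build_rss` is a made-up name.

```python
import xml.etree.ElementTree as ET


def build_rss(docs, base_url, feed_title="Paperless documents"):
    """Render a list of document dicts as an RSS 2.0 feed string.

    Assumption: each doc has "id", and optionally "title" and "content".
    The item link pattern below is a placeholder; adjust it to whatever
    URL your instance actually exposes for a document.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Plain-text document content"

    for doc in docs:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = doc.get("title", "")
        ET.SubElement(item, "link").text = f"{base_url}/documents/{doc['id']}/"
        # ElementTree escapes special characters in the text for us.
        ET.SubElement(item, "description").text = doc.get("content", "")

    return ET.tostring(rss, encoding="unicode")
```

Whether YaCy can reach the feed and the linked pages still depends on how authentication is handled, which ties back to the concern raised earlier in the thread.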