this post was submitted on 07 Jan 2025
3 points (100.0% liked)

I'm checking out various "personal knowledge management" tools in a sandbox to see if any of them would be an upgrade over my ragtag collection of text file-based notes.

First candidate is #Logseq, supposedly "privacy-first".

How #privacy friendly is something based on Electron (aka Chrome)? Debatable, but then they also do this:

  1. Have "Send usage data" on by default
  2. Start with an example page that embeds a YouTube video, and accepts all cookies

tcpdump and mitmproxy go wild when starting the program.

Shows that the "Send usage data and diagnostics to Logseq" setting is enabled by default.
Shows the services being contacted by Logseq over HTTPS right after starting it for the first time.  Hosts that are being contacted: www.youtube.com, googleads.g.doubleclick.net, jnn-pa-googleapis.com, play.google.com, app.posthog.com, o416451.ingest.sentry.io
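
For anyone repeating this kind of check, here is a minimal sketch of how a list of contacted hosts could be sorted by likely purpose. The host list is copied from the capture above; the suffix-to-purpose map and the names `PURPOSE_BY_SUFFIX`, `classify` and `group_hosts` are my own assumptions for illustration, not something Logseq, tcpdump or mitmproxy provides.

```python
from collections import defaultdict

# Hosts as listed in the capture description above.
OBSERVED_HOSTS = [
    "www.youtube.com",
    "googleads.g.doubleclick.net",
    "jnn-pa-googleapis.com",
    "play.google.com",
    "app.posthog.com",
    "o416451.ingest.sentry.io",
]

# Hypothetical mapping from domain suffix to a rough purpose label.
# These labels are an assumption made for this sketch.
PURPOSE_BY_SUFFIX = {
    "youtube.com": "embedded video",
    "doubleclick.net": "advertising",
    "posthog.com": "product analytics",
    "sentry.io": "error reporting",
    "google.com": "Google services",
}


def classify(host):
    """Return the purpose label of the first matching suffix, else 'unclassified'."""
    for suffix, purpose in PURPOSE_BY_SUFFIX.items():
        if host == suffix or host.endswith("." + suffix):
            return purpose
    return "unclassified"


def group_hosts(hosts):
    """Group hosts by purpose label, preserving the input order within each group."""
    groups = defaultdict(list)
    for host in hosts:
        groups[classify(host)].append(host)
    return dict(groups)


if __name__ == "__main__":
    for purpose, hosts in group_hosts(OBSERVED_HOSTS).items():
        print(f"{purpose}: {', '.join(hosts)}")
```

Anything that ends up under "unclassified" is worth a closer look by hand, since a suffix map like this only knows the domains you put into it.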

[–] [email protected] 2 points 3 weeks ago (6 children)

Next up is #Obsidian, a tool I'm hesitant to consider because of the developers' view on open source. Hence, the source is not available except the obfuscated JavaScript that's run by Electron.

Despite that, Obsidian itself only does a version check (which can be disabled) and starts in "restricted mode" by default, which disallows third-party plugins (but does still embed external content when asked to.)

There's some phoning home by Chrome but far less than with Logseq.

Color me surprised.

The program defaults to "restricted mode."  "Would you like to exit Restricted Mode to enable community plugins?   We strongly recommend making backups of your data before doing so."

[–] [email protected] 2 points 3 weeks ago (1 children)

When installing plugins all bets are off.

Loading dependencies from CDNs, doing its own version checks, showing a YouTube video on install: the most popular Obsidian plugin (Excalidraw) does it all without asking.

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] ooh hey this thread seems really useful, any plans to check out https://anytype.io/ been eying it up as a replacement for notion on my personal projects.

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] Thanks!

Yes, Anytype is next. I played around with it yesterday (without monitoring it) but its complexity was both alluring and also a reason to check other tools, despite my (initial) distrust of them.

[–] [email protected] 1 points 3 weeks ago

@[email protected] haha yeah that makes sense look forward to reading your thoughts on it.

[–] [email protected] 2 points 3 weeks ago (1 children)

Candidate number 3, #Anytype, is a whole different beast conceptually. More than a Markdown editor, it's a database consisting of all kinds of document "objects" and templates (Notion-like, I'm told).

I don't have enough characters (500 is the limit on this instance...) to describe my surprise and disappointment about the difference between how they present themselves versus reality, so this will be multiple posts.

The attached pictures are a collage of my expectations for Anytype.

1/n

On the left: "Enjoy true privacy"  On the right: "Nobody can see what's in your vault, except for you  Local, on-device encryption. Only you have encryption keys"

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago) (2 children)

Reality: everything you do in the program is being tracked and there is *no opt-out*.

The program records all your actions and sends them every few minutes to Amplitude, a commercial analytics company.

Deep down in the documentation this is mentioned, but there is no consent or even a mention in the program itself or in the privacy policy.

It also communicates constantly with a few AWS EC2 instances, presumably the IPFS nodes it uses to back up your (encrypted) vault of documents.

2/n

[–] [email protected] 3 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

So all your actions are being logged, fortunately (because who knows at this point) without the actual contents of what you type.

But everything else is there: did you add a page, did you click around, did you add some paragraphs of text. All neatly ordered, timestamped, and identified with a user and session ID.

There's also data about the machine you're using the app on.

Of course, being an Electron app, it also has Chrome phoning home. And there's a version check that cannot be disabled.

3/n

[–] [email protected] 3 points 3 weeks ago (1 children)

That there is no opt-out for this nor a consent dialog or even a warning is unacceptable in my view.

For a company that likes to talk about trust they sure have no idea about how to gain it.

4/4

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Tested the fourth PKM: #SiYuan (https://b3log.org/siyuan/), which is pretty similar to Anytype feature-wise.

It's also a product that starts off with saying that it's "privacy-first", supported by what might be the world's shortest privacy policy, which clearly states: "Does not collect user personal information and usage data."

Unfortunately, the Google Analytics and Google Tag Manager scripts that are loaded on start are not mentioned anywhere. No warning, no consent question, on by default.

1/n

[–] [email protected] 2 points 3 weeks ago (3 children)

What data is being collected? Mostly details about your machine: OS (name, kernel version), CPU architecture, screen resolution, a unique identifier, but also what's in the title bar of the program window, which can be problematic.

You see, the title of the note you had open when you quit the program last is also in the title bar, which might contain personal information like someone's name, or the name of an illness you have that you are taking notes about.
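
To check a capture for this yourself, a minimal sketch follows. It assumes the title travels in the `dt` ("document title") query parameter that Google Analytics hits commonly use; I have not confirmed that SiYuan's requests use exactly this parameter. The example URL and its values are made up for illustration and are not taken from a real capture, and `document_title` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def document_title(request_url):
    """Return the decoded value of the 'dt' query parameter, or None if absent."""
    query = parse_qs(urlsplit(request_url).query)
    titles = query.get("dt")
    return titles[0] if titles else None


# Made-up example request, NOT a real capture.
EXAMPLE_URL = (
    "https://www.google-analytics.com/g/collect"
    "?v=2&dt=Notes%20on%20my%20diagnosis&sr=1920x1080"
)

if __name__ == "__main__":
    print(document_title(EXAMPLE_URL))
```

Running something like this over the request URLs exported from mitmproxy makes it easy to see whether any note title ends up in the analytics traffic.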

2/n

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago) (2 children)

You might feel I'm nitpicking about a possible edge case here, but you are promised privacy.

Without sniffing the network traffic, or going through the source code, you have no idea that your note titles are being sent to Google Analytics. Even the opt-out toggle tells you that no user data is collected.

It's another example of a company (they sell premium services) using "privacy-first" as a buzzword instead of living by it as a guiding principle.

At least there is an opt-out, I guess

3/3

[–] [email protected] 2 points 3 weeks ago (1 children)

Ok then, number 5: the desktop version of #TiddlyWiki, #TiddlyDesktop.

The Chromium wrapper isn't as old as the wiki web software itself but still goes back to 2014.

Standard Chrome traffic and... a lot of calls to googleapis.com. Why? Because it calls the Google spell check API with everything you enter.

All your text is being sent to Google.

I couldn't turn it off, and on top of that a dummy API key is used, so the API returns an error, meaning the functionality is completely useless.

A screenshot of the page editor of TiddlyWiki in TiddlyDesktop.  The contents of the page read: "Secrets"  "So I've disabled ""network activity"", surely it won't pass my biggest secrets on to Google, right?  ...  Right?"  It also shows that the "network activity" option has been disabled   (I've also tested it with the option enabled, restarting the program, etc. Google's API was still being contacted)
Shows the contents of one of the calls to the Google spell check API.  The payload of the call contains the following JSON:  {   "text": "So I've disabled \"network activity\", surely it won't pass my biggest secrets on to Google, right?\n\n...\n\nRight?",   "language": "en",   "originCountry": "USA" }
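
A minimal sketch of how such a captured request body can be checked for note contents. The JSON is the payload quoted above; the function name `leaked_fields` and the idea of comparing against the known note text are my own assumptions for illustration, not part of TiddlyDesktop or mitmproxy.

```python
import json

# Request body as quoted in the capture above (raw string keeps the JSON escapes).
CAPTURED_BODY = r'''{
  "text": "So I've disabled \"network activity\", surely it won't pass my biggest secrets on to Google, right?\n\n...\n\nRight?",
  "language": "en",
  "originCountry": "USA"
}'''

# The text that was typed into the test page.
NOTE_TEXT = (
    'So I\'ve disabled "network activity", surely it won\'t pass '
    "my biggest secrets on to Google, right?\n\n...\n\nRight?"
)


def leaked_fields(body, secret):
    """Return the names of top-level string fields in the JSON body that contain the secret."""
    payload = json.loads(body)
    return [
        key
        for key, value in payload.items()
        if isinstance(value, str) and secret in value
    ]


if __name__ == "__main__":
    print(leaked_fields(CAPTURED_BODY, NOTE_TEXT))
```

This only looks at top-level string fields, which is enough for the payload shown here; a nested payload would need a recursive walk.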

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] I use the Node.js version of Tiddlywiki and see no such traffic. I don't use TiddlyDesktop, though, so can't comment on that.

[–] [email protected] 1 points 3 weeks ago

@[email protected] This is definitely TiddlyDesktop only. It was added because people were missing the spell checking that their browser normally does (https://github.com/TiddlyWiki/TiddlyDesktop/issues/32)

[–] [email protected] 2 points 3 weeks ago

Correction: it is mentioned in a privacy policy, but not the first one you get to. You have to click through to the second privacy policy.

https://anytype.io/app_privacy

[–] [email protected] 1 points 3 weeks ago

Funnily enough, when it comes to source from other people the developers do see the value of open source.

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] Huge fan of open source, but I do use Obsidian as my main notes tool these days. It's so pretty, just works, and while the core tooling isn't open, I have peace of mind that I can leave any time and move to any other text/markdown based tool.

That's a big win over other polished note-taking tools like Evernote, for instance.

I'd love to see open tools like Joplin get to the level of visual appeal Obsidian has.

[–] [email protected] 1 points 3 weeks ago

@[email protected] That's definitely a big plus for Obsidian (and the current version of Logseq.)

Anytype hides everything away in a database blob that can be somewhat exported, but when doing it in Markdown format the "relation" metadata (think Dataview) is lost, whereas with Obsidian, Dataview's metadata is just there in the Markdown.

Despite the misgivings I had about Obsidian it's looking like a very good option indeed.

[–] [email protected] 0 points 3 weeks ago (1 children)

@[email protected] I use Obsidian fairly regularly. The advantage is that your data's all markdown files on your own disk. If Obsidian for some reason becomes sketchy (which I doubt will happen), I can move on to another app.

The plugins are great and are probably what drives Obsidian for the most part if you want more than just a note-taking app.

[–] [email protected] 0 points 3 weeks ago (2 children)

@trinsec Plain-files-on-disk certainly is a big advantage compared to Anytype (and possibly the next version of Logseq), where everything is stored in a database blob.

Anytype "objects" are exportable as Markdown (but with loss of metadata) or as a Protobuf-parseable packet but I didn't find any CLI tool to do that in an automated way. So something I need to consider in my choice.

I'm pleasantly surprised by Obsidian so far, just need to keep an eye on the background activity of plugins.

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] @[email protected] fyi : https://community.anytype.io/t/concerns-about-the-current-allegedly-severe-limitations-of-the-export-function/25258/4

"We are transitioning to a new storage foundation based on SQL, where all objects will be stored as JSON. This format is highly standard, making interoperability much easier. Our upcoming API will also be based on this structure."

[–] [email protected] 1 points 3 weeks ago

@projetslibres_[email protected] @[email protected] That's good to hear, I hope they'll just go with SQLite.

Also nice that there's an API planned to interact with the notes, because I was thinking of how you'd get a quick note in from, for example, the CLI.

[–] [email protected] 1 points 3 weeks ago (1 children)

@[email protected] @[email protected] thank you for your very informative thread !

have you heard about https://b3log.org/siyuan/ ?

[–] [email protected] 1 points 3 weeks ago (1 children)

@projetslibres_[email protected] @[email protected] Thanks!

I hadn't heard of it but it looks pretty good. A bit like Anytype with its templates and relations, but with contents just stored as plain files on disk and hopefully with less tracking.

I will give it a try tomorrow, thanks :)

[–] [email protected] 1 points 3 weeks ago

@[email protected] @[email protected] I'd just heard about it recently (I'm an Anytype user) and wanted to give it a try also.