this post was submitted on 31 Jan 2025

351 points (94.4% liked)

Proton's biased article on Deepseek (lemmy.ml)

submitted 4 months ago by [email protected] to c/[email protected]

Article: https://proton.me/blog/deepseek

Calls it "Deepsneak", failing to make it clear that the reason people love Deepseek is that you can download it and run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.

I can't speak for Proton, but the last couple weeks are showing some very clear biases coming out.

[–] [email protected] 105 points 4 months ago (2 children)

DeepSeek is open source, meaning you can modify code on your own app to create an independent, and more secure, version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed. However, using DeepSeek in its current form, as it exists today, hosted in China, comes with serious risks for anyone concerned about their most sensitive, private information.

Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws, meaning that the Chinese government can demand access at any time.

What???? Whoever wrote this sounds like he has 0 understanding of how it works. There is no "more privacy-friendly version" left to develop; the models are already out and you can run the entire model 100% locally. That's as privacy-friendly as it gets.

"Any model trained or operated on DeepSeek's servers is still subject to Chinese data laws"

Operated, yes. Trained, no. The model is MIT licensed, so China has nothing on you when you run it yourself. I expect better from a company whose whole business is built on privacy.

[–] [email protected] 32 points 4 months ago (1 children)

To be fair, most people can't actually self-host Deepseek, but there already are other providers offering API access to it.

[–] [email protected] 31 points 4 months ago (2 children)

There are plenty of step-by-step guides to run Deepseek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than other current alternatives.

That's about as openly available to self host as you can get without a 1-button installer.

[–] [email protected] 15 points 4 months ago (2 children)

You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.

[–] [email protected] 9 points 4 months ago

A server-grade CPU with a lot of RAM and memory bandwidth would work reasonably well, and cost "only" ~$10k rather than $100k+...
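For a rough sense of why the hardware bill gets that large, here is a back-of-the-envelope sketch. It assumes the published figure of about 671 billion total parameters for the full R1 and counts weights only (no KV cache, activations, or runtime overhead), so real requirements are higher. The 80 GB per-device figure and the function names are illustrative assumptions, not anything from the thread.

```javascript
// Rough weight-memory estimate: parameters x bytes per parameter.
// Weights only; KV cache, activations and runtime overhead are ignored.
const BYTES_PER_PARAM = { fp16: 2, fp8: 1, q4: 0.5 };

function weightMemoryGB(paramCount, precision) {
  const bytes = BYTES_PER_PARAM[precision];
  if (bytes === undefined) {
    throw new Error(`unknown precision: ${precision}`);
  }
  return (paramCount * bytes) / 1e9; // decimal gigabytes
}

function devicesNeeded(memoryGB, perDeviceGB) {
  return Math.ceil(memoryGB / perDeviceGB);
}

const R1_PARAMS = 671e9;  // published total parameter count of the full R1
const GPU_MEMORY_GB = 80; // assumption: one 80 GB accelerator per device

for (const precision of Object.keys(BYTES_PER_PARAM)) {
  const gb = weightMemoryGB(R1_PARAMS, precision);
  const count = devicesNeeded(gb, GPU_MEMORY_GB);
  console.log(`${precision}: ~${gb.toFixed(0)} GB of weights, at least ${count} x ${GPU_MEMORY_GB} GB devices`);
}
```

The same arithmetic applies to system RAM, which is why a CPU box with several hundred GB of memory is the cheaper route if you can live with lower speed.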

[–] [email protected] 0 points 4 months ago

I saw posts about people running it well enough for testing purposes on an NVMe.

[–] [email protected] 5 points 4 months ago* (last edited 4 months ago) (1 children)

Those are not deepseek R1. They are unrelated models like llama3 from Meta or Qwen from Alibaba "distilled" by deepseek.

Distillation is a common way to make a smaller model smarter by training it on a larger one's outputs.
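To make "distilled" a bit more concrete, here is a minimal sketch of the classic soft-target distillation loss: the student is pushed to match the teacher's temperature-softened output distribution. This is the textbook formulation and not necessarily DeepSeek's recipe; as far as I know, the R1 distills were produced by fine-tuning the smaller models on R1-generated samples rather than by matching logits. The function names and the default temperature are assumptions for illustration.

```javascript
// Softmax with temperature: a higher temperature gives a softer distribution.
function softmax(logits, temperature = 1) {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// KL(teacher || student) over the softened distributions.
// This is the soft-target term; real training usually adds a hard-label term too.
function distillationLoss(teacherLogits, studentLogits, temperature = 2) {
  if (teacherLogits.length !== studentLogits.length) {
    throw new Error('teacher and student must score the same vocabulary');
  }
  const p = softmax(teacherLogits, temperature);
  const q = softmax(studentLogits, temperature);
  let kl = 0;
  for (let i = 0; i < p.length; i++) {
    if (p[i] > 0) kl += p[i] * Math.log(p[i] / q[i]);
  }
  return kl * temperature * temperature; // conventional T^2 scaling
}

// Toy logits over a three-token vocabulary (made-up numbers).
console.log(distillationLoss([2.0, 1.0, 0.1], [1.5, 1.2, 0.3]));
```

The loss is zero when the student already reproduces the teacher's distribution and grows as the two diverge.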

Ollama should have never labelled them deepseek:8B/32B. Way too many people misunderstood that.

[–] [email protected] 3 points 4 months ago (1 children)

I'm running deepseek-r1:14b-qwen-distill-fp16 locally and it produces really good results I find. Like yeah it's a reduced version of the online one, but it's still far better than anything else I've tried running locally.
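For anyone who wants to script against a setup like this, a minimal sketch follows. It assumes the model is served through Ollama on its default local port (the tag above looks like an Ollama tag, but the comment doesn't say so) and uses Ollama's `/api/generate` endpoint with streaming turned off. The helper names `buildGenerateRequest` and `askLocalModel` are made up for this example.

```javascript
// Assumption: an Ollama server is listening on its default local port.
const OLLAMA_URL = 'http://localhost:11434/api/generate';

// Build the fetch options for a single non-streaming completion request.
function buildGenerateRequest(model, prompt) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// Send the prompt to the local server and return the generated text.
// Nothing leaves the machine as long as OLLAMA_URL points at localhost.
async function askLocalModel(model, prompt) {
  const res = await fetch(OLLAMA_URL, buildGenerateRequest(model, prompt));
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = await res.json();
  return data.response; // full generated text when stream is false
}
```

With the server running, `askLocalModel('deepseek-r1:14b-qwen-distill-fp16', 'Explain model distillation in two sentences.').then(console.log)` would print the reply. It needs Node 18+ or a browser for the global `fetch`.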

[–] [email protected] 1 points 4 months ago* (last edited 4 months ago) (1 children)

Have you compared it with the regular qwen? It was also very good

[–] [email protected] 1 points 4 months ago (1 children)

The main difference is speed and memory usage. Qwen is a full-sized, high-parameter model while qwen-distill is a smaller model created using knowledge distillation to mimic qwen's outputs. If you have the resources to run qwen fast then I'd just go with that.

[–] [email protected] 1 points 4 months ago (1 children)

I think you're confusing the two. I'm talking about the regular qwen before it was finetuned by deep seek, not the regular deepseek

[–] [email protected] 1 points 4 months ago (1 children)

I haven't actually used that one, but doesn't the same point apply here too? The whole point of DeepSeek is in distillation that makes runtime requirements smaller.

[–] [email protected] 1 points 4 months ago (1 children)

No cause I was already running regular (non-deepseek) qwen 14B, admittedly a heavily quantized and uncensored version, so I was just curious if it would be any better

[–] [email protected] 1 points 4 months ago

The big difference with R1 at least is that it is trained using reinforcement learning, and it gives you an explanation of how it arrived at a solution. I find the thinking bit is really fascinating cause it really reads like a train of thought. Here's an example of what it outputs when I ask it to write a react component with a text input and a button:

First, I'll start by setting up the basic structure of a React component. I'll import React and useState since we'll need state management for the input value.

Next, I'll create a functional component called TextInputButton. Inside this component, I'll use the useState hook to manage the text input's value. The initial state can be an empty string.

Then, I'll define a handler function, handleChange, that updates the state whenever the user types into the input. This function will take the event object and set the new value using setTextInputValue.

I also need a function to handle the button click. For now, I'll use handleClick with an alert message as a placeholder. The actual functionality can be added later based on requirements.

In the return statement, I'll structure the JSX. I'll include an input element of type text, binding its value to the state variable and attaching the handleChange function to the onChange event. This ensures that the state always reflects the current input value.

Below the input, I'll add a button element. On clicking this button, it will trigger handleClick. I'll also give it a class for styling purposes, like 'button' which can be styled with CSS if needed.

Finally, I'll export the component so it can be used elsewhere in the application. I'll make sure to include comments explaining each part for clarity.

Here's a simple React component with a text input and a button:

import React from 'react';
import { useState } from 'react';

function TextInputButton() {
  const [text, setText] = useState('');

  const handleChange = (event) => {
    setText(event.target.value);
  };

  const handleClick = () => {
    // Add your click handler logic here
    alert('Button clicked!');
  };

  return (
    <div>
      <input
        type="text"
        value={text}
        onChange={handleChange}
        placeholder="Enter text..."
      />
      <button onClick={handleClick}>Submit</button>
    </div>
  );
}

export default TextInputButton;

This component:

Uses useState to manage the input value
Has a handleChange function that updates the state when the user types
Has a handleClick function that currently just shows an alert (you can add your own logic here)
Renders an input field and a submit button

You can use this component in your application like:

<TextInputButton />

The input value is controlled through the text state variable, and you can access it using text. When the button is clicked, the handleClick function will execute whatever logic you need.
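If you want to handle the thinking part and the final answer separately in your own tooling, something like the sketch below works. It assumes the raw output wraps the reasoning in `<think>...</think>` tags, which is how R1-style models typically emit it when run locally; the paste above doesn't show the tags, so treat that as an assumption. `splitReasoning` is a hypothetical helper name.

```javascript
// Split raw model output into the reasoning trace and the final answer.
// Assumes at most one <think>...</think> block at the start of the output.
function splitReasoning(output) {
  const match = output.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    // No reasoning block found: treat everything as the answer.
    return { reasoning: '', answer: output.trim() };
  }
  return {
    reasoning: match[1].trim(),
    answer: output.slice(match.index + match[0].length).trim(),
  };
}

const raw = "<think>First, I'll set up the component.</think>\nHere's a simple React component.";
const { reasoning, answer } = splitReasoning(raw);
console.log('reasoning:', reasoning);
console.log('answer:', answer);
```

Keeping the two apart makes it easy to log the train of thought while only showing the answer to users.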

[–] [email protected] 4 points 4 months ago

I think they mean privacy friendly version of the infrastructure could be developed.