this post was submitted on 02 Nov 2024
268 points (98.2% liked)

Python

6416 readers
4 users here now

Welcome to the Python community on the programming.dev Lemmy instance!

📅 Events

PastNovember 2023

October 2023

July 2023

August 2023

September 2023

🐍 Python project:
💓 Python Community:
✨ Python Ecosystem:
🌌 Fediverse
Communities
Projects
Feeds

founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 3 points 3 weeks ago (1 children)

As for data science using Python, something tells me that this has to do with memory heap capacities. I'm not sure about Python's max memory heap, but Javascript through Node.js seems to have only 512MB. I've been using Node.js to deal with big datasets and my most recent experimentation stumbled across the need of loading 100 million numbers to the RAM: while my PC has a fair amount of physical RAM (12GB) and a great part of it was available, it'll simply error when filling an array. I needed an additional parameter, --max-old-space-size, so Node.js could deal with such amount of data. I didn't try the same task with Python because I'm used to Javascript (yet I'm done some things in Python), but I wonder how much memory can Python hold until an error like "out of memory" happens, because ML models (for example, those hosted and served in HuggingFace) loads training weights with dozens of GBs

[–] [email protected] 1 points 2 weeks ago* (last edited 2 weeks ago)

I wonder how much memory can Python hold until an error like “out of memory” happens, because ML models (for example, those hosted and served in HuggingFace) loads training weights with dozens of GBs

All the stuff that's LLM and the actual "serious" python libraries are implemented in C/C++ and only made accessible via python.

Which doesn't directly answer the question of what the maximum is, in those cases, but it should be obvious that C/C++ have some good ways to deal with memory.

You can still do "traditional" memory management in python, or "memory aware programming" like, e.g. not trying to read a file in one piece, but reading and processing line by line.

And using C from python is actually very easy and convenient with ctypes. https://docs.python.org/3/library/ctypes.html