I'm finding myself with a couple of really big databases, and my PC is throwing memory errors, so I'm moving the project to Polars and learning on the way. I'd like to read about your experiences: how you made the switch, what frustrated you, and what you found good. (I'm still getting used to the syntax, but I'm loving how fast it reads the databases.)

top 8 comments
[–] [email protected] 6 points 3 days ago (1 children)

Polars has essentially replaced Pandas for me. It is MUCH faster (in part due to lazy queries) and uses much less RAM, especially if the query can be streamed. While the syntax takes a bit of getting used to at first, it lets me express a lot more without resorting to apply with custom Python functions.
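
For illustration, a minimal sketch of the kind of lazy, streamed query described above; the file and column names are made up:

```python
import polars as pl

# scan_csv builds a lazy query plan instead of loading the file into RAM
lazy = (
    pl.scan_csv("big_file.csv")          # hypothetical file
    .filter(pl.col("amount") > 0)
    .group_by("category")
    .agg(pl.col("amount").sum().alias("total"))
)

# streaming executes the plan in batches, keeping peak memory low
df = lazy.collect(streaming=True)  # newer releases: collect(engine="streaming")
```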

My biggest gripe is that the error messages are significantly less readable due to the amount of noise: the stack trace into the query executor does not help locate my logic error, and the stringified query does not tell me where in the query things went wrong...

[–] [email protected] 1 points 2 days ago (1 children)

I had to move away from apply a while ago because it was extremely slow, and started using masks and vectorized operations. That's actually what's been a roadblock for me right now: I can't find a way to make it work (I used to do df.loc[mask, 'column'], but df.with_columns(pl.when(mask).then()...) is not working as expected).
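
In case it helps, a sketch of the usual translation (the mask and transformation here are hypothetical); a common pitfall is omitting .otherwise(), which leaves every unmasked row null:

```python
import polars as pl

df = pl.DataFrame({"column": ["1", "2", "A3", "4"]})
mask = pl.col("column").str.contains(r"^\d+$")  # rows that are pure digits

# Pandas: df.loc[mask, "column"] = <new value>
# Polars: rebuild the whole column, passing unmasked rows through otherwise()
df = df.with_columns(
    pl.when(mask)
    .then(pl.col("column").str.zfill(5))  # illustrative transformation
    .otherwise(pl.col("column"))          # without this, unmasked rows become null
    .alias("column")
)
```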

[–] [email protected] 1 points 2 days ago (1 children)

It is unclear to me what you are trying to accomplish; do you want to update the elements where the mask is true?

[–] [email protected] 1 points 2 days ago

There's this categorical column of integers that has some exceptional cases where letters are included. I need to process the column, except for those exceptional cases, to format it. But I just found out what was giving me a problem: pandas imported it as a mixed type, while Polars imported it as a string, respecting the original, correct formatting.
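
If it's useful, a minimal sketch of pinning that dtype at read time so it never depends on inference (the file and column names are made up):

```python
import polars as pl

# Force the mostly-numeric column to be read as strings, so values
# like "123" and "12A" both survive with their original formatting
df = pl.read_csv(
    "data.csv",
    schema_overrides={"code": pl.String},
)
```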

[–] [email protected] 3 points 3 days ago (1 children)

I thought I’d be using Polars more, but in the end, professionally, when I have to process large amounts of data I won’t be doing that on my computer but on a Hadoop cluster via PySpark, which also has a very non-Pythonic syntax. For smaller stuff Pandas is just more convenient.
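
For comparison, a minimal PySpark sketch of the same kind of expression-based aggregation, assuming a local SparkSession and hypothetical file/column names:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Expression-based API, broadly similar in shape to a Polars query
df = spark.read.csv("big_file.csv", header=True, inferSchema=True)
out = (
    df.filter(F.col("amount") > 0)
      .groupBy("category")
      .agg(F.sum("amount").alias("total"))
)
out.show()
```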

[–] [email protected] 2 points 2 days ago

My company is moving to Databricks, which I know uses PySpark, though I've never used it; I guess eventually I'm going to have to learn it too.

[–] gigachad 2 points 3 days ago

Nope. I am working with geodata, so I need GeoPandas for my work. Sadly, there is no serious alternative so far. If, in the future, that changes, I am absolutely on board with giving Polars a try.

[–] [email protected] 1 points 3 days ago

I moved from pandas.

That's it, there is no Polars. It's been great!